
HoodieCommitArchiveLog OOM's if the number of commits/cleans is too large #364

Closed
n3nash opened this issue Mar 23, 2018 · 5 comments

@n3nash
Contributor

n3nash commented Mar 23, 2018

If a job fails to archive for 1-2 days, (numberOfCommits - minCommits) can grow much larger than the (maxCommits - minCommits) batch that archival was intended to handle. Add logic in HoodieCommitArchiveLog to cap the number of commits to archive as follows: min((numberOfCommits - minCommits), (maxCommits - minCommits)).

@jianxu @kaushikd49 FYI
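The proposed cap can be sketched as below. This is a minimal illustration, not Hudi's actual implementation; the method and parameter names are hypothetical:

```java
// Hypothetical sketch of the proposed cap: bound the number of instants
// archived in one pass so a multi-day backlog cannot exhaust driver memory.
public final class ArchiveCapExample {
    static int instantsToArchive(int numberOfInstants, int minInstantsToKeep, int maxInstantsToKeep) {
        int backlog = numberOfInstants - minInstantsToKeep;        // everything beyond the retained window
        int intendedBatch = maxInstantsToKeep - minInstantsToKeep; // batch size archival was designed for
        // clamp at zero in case there is nothing eligible to archive yet
        return Math.max(0, Math.min(backlog, intendedBatch));
    }

    public static void main(String[] args) {
        // normal case: backlog fits the intended batch
        System.out.println(instantsToArchive(35, 20, 30));  // prints 10
        // after days of failed runs: backlog of 480 is capped at 10
        System.out.println(instantsToArchive(500, 20, 30)); // prints 10
    }
}
```

With the cap, a backlog of any size is drained in bounded batches across successive runs instead of one unbounded pass.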

@jianxu
Contributor

jianxu commented Mar 23, 2018

Thanks @n3nash! Just one additional note: the cap should apply to both clean instants and commit instants.

@bvaradar
Contributor

bvaradar commented Mar 23, 2018

@n3nash, @jianxu: Can you link a stack-trace or heap-dump from when the OOM occurred?

@n3nash
Contributor Author

n3nash commented Mar 28, 2018

Caused by: java.lang.OutOfMemoryError: Java heap space
	at java.util.Arrays.copyOf(Arrays.java:3236)
	at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:118)
	at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
	at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:153)
	at java.io.DataOutputStream.write(DataOutputStream.java:107)
	at java.io.FilterOutputStream.write(FilterOutputStream.java:97)
	at com.uber.hoodie.common.table.log.block.HoodieAvroDataBlock.lambda$getBytes$0(HoodieAvroDataBlock.java:92)
	at com.uber.hoodie.common.table.log.block.HoodieAvroDataBlock$$Lambda$100/1958503496.accept(Unknown Source)
	at java.util.ArrayList.forEach(ArrayList.java:1249)
	at com.uber.hoodie.common.table.log.block.HoodieAvroDataBlock.getBytes(HoodieAvroDataBlock.java:79)
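
The trace above points at a buffer-everything pattern: all records for the archive block are serialized into a single in-memory ByteArrayOutputStream, whose backing array doubles on each grow() via Arrays.copyOf (the top frames). A simplified sketch of that pattern (illustrative only, not the actual HoodieAvroDataBlock code):

```java
// Illustrative sketch of the failure pattern in the trace above
// (not Hudi code): serializing the whole backlog into one in-memory
// buffer makes heap usage scale with the number of archived instants.
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.List;

public class BufferAllExample {
    static byte[] getBytes(List<byte[]> serializedRecords) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        DataOutputStream dos = new DataOutputStream(baos);
        for (byte[] record : serializedRecords) {
            dos.writeInt(record.length); // 4-byte length prefix per record
            dos.write(record);           // the entire backlog accumulates on the heap here
        }
        dos.flush();
        return baos.toByteArray();       // plus one full copy of the buffer
    }
}
```

With an uncapped backlog, the list of records (and the doubling buffer) is unbounded, which is why capping the batch size bounds peak memory.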

@vinothchandar
Member

Not sure I want to rush into capping yet. Can we get into why archival failed for 1-2 days? How much memory was used, and how much was needed?

@n3nash
Copy link
Contributor Author

n3nash commented Apr 4, 2018

We know why that happened: our jobs were failing for an unrelated reason. Ideally we shouldn't run into this issue, but the concern is that when we do, archival ends up taking a huge amount of memory on the driver (12G) and even then fails, so this is worth exploring.

vinishjail97 pushed a commit to vinishjail97/hudi that referenced this issue Dec 15, 2023
…full read of initial commit (apache#364)

Co-authored-by: harshal <harshal.j.patil@gmail.com>
Co-authored-by: Lokesh Lingarajan <lokeshlingarajan@Lokeshs-MacBook-Pro.local>