HoodieCommitArchiveLog OOM's if the number of commits/cleans is too large #364
Comments
Thanks @n3nash! Just one additional note: the cap should apply to both commits and cleans.
Not sure I want to rush into capping yet. Can we dig into why the archival failed for 1-2 days? How much memory was used, and how much was needed?
We know why that happened: our jobs were failing for an unrelated reason. Ideally we shouldn't run into this situation at all, but the concern is that when we do, archival ends up consuming a huge amount of memory on the driver (12G) and still fails, so this is worth exploring.
If a job fails to archive for 1-2 days, (numberOfCommits - minCommits) can grow far larger than the (maxCommits - minCommits) commits a single archival run is intended to handle. Add logic to HoodieCommitArchiveLog to cap the number of commits archived in one run to min((numberOfCommits - minCommits), (maxCommits - minCommits)).
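The proposed cap can be sketched as a small helper. This is a minimal illustration of the formula above, not Hudi's actual API; the method and class names here are hypothetical, and only the min/max config semantics come from the issue description.

```java
// Hypothetical sketch of the proposed archival cap (names are illustrative,
// not HoodieCommitArchiveLog's real interface).
public class ArchiveCapSketch {
    static int commitsToArchive(int numberOfCommits, int minCommits, int maxCommits) {
        // Keep at least minCommits on the active timeline, and never archive
        // more than (maxCommits - minCommits) instants in a single run, even
        // if a multi-day backlog has accumulated.
        return Math.min(numberOfCommits - minCommits, maxCommits - minCommits);
    }

    public static void main(String[] args) {
        // Normal case: 35 commits with minCommits=20, maxCommits=30 -> archive 10.
        System.out.println(commitsToArchive(35, 20, 30));
        // Backlog case after days of failed archival: 500 commits -> still capped at 10.
        System.out.println(commitsToArchive(500, 20, 30));
    }
}
```

Capping per run keeps driver memory bounded; the backlog drains over several runs instead of being loaded all at once.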
@jianxu @kaushikd49 FYI