
fix(#590): Catalog data file gets broken after compaction #592

Merged · 3 commits merged into master on Jun 1, 2024

Conversation

@novoj (Collaborator) commented on Jun 1, 2024

It seems that when the catalog file exceeds the threshold size (100 MB by default) and is compacted into a new file, it somehow gets corrupted. The previous file is not deleted but partially overwritten, and the new file is corrupted as well. This was discovered by analyzing file remnants where the original file was much smaller than the 100 MB it should have been and its contents were completely corrupted. It is likely that different tasks keep writing to corrupted (already replaced) versions of the file. We need to investigate this issue and write a more complex integration test for this scenario.
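To make the suspected race easier to reason about, below is a minimal, self-contained sketch in plain Java NIO (not evitaDB code; the class name, file names, and record layout are hypothetical, and the behaviour described in the comments assumes a POSIX file system). It shows how a task holding a stale file handle can keep writing after a compaction has swapped the catalog file, producing exactly the kind of undersized, partially written remnants described above.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration of the suspected race, not evitaDB code.
public class StaleHandleAfterCompaction {

    public static void main(String[] args) throws IOException {
        Path catalog = Files.createTempFile("catalog", ".dat");

        // Task A opens the catalog once and caches the channel, as a long-lived writer might.
        FileChannel staleWriter = FileChannel.open(catalog, StandardOpenOption.WRITE);
        staleWriter.write(ByteBuffer.wrap("record-1|".getBytes()), 0);

        // "Compaction": the live records are rewritten into a new file which then
        // replaces the original catalog file.
        Path compacted = catalog.resolveSibling("catalog.compacted");
        Files.write(compacted, "record-1|".getBytes());
        Files.move(compacted, catalog, StandardCopyOption.REPLACE_EXISTING);

        // Task A is never told about the swap and keeps appending through the stale
        // channel. On POSIX systems this writes into the now-unlinked old file, so the
        // data is silently lost; depending on timing and platform it can also land in a
        // half-replaced file, matching the corrupted remnants seen in the issue.
        staleWriter.write(ByteBuffer.wrap("record-2|".getBytes()), 9);
        staleWriter.close();

        // The compacted catalog never contains record-2.
        System.out.println(Files.readString(catalog));
    }
}
```

An integration test for this scenario could follow the same shape: push the catalog over the 100 MB threshold so compaction triggers while another task still holds a handle to the old file, then verify the resulting file's size and contents after the compaction completes.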

novoj added 3 commits on June 1, 2024 at 22:13 (each commit reuses the PR description above as its commit message)
@novoj merged commit 4f890a4 into master on Jun 1, 2024
1 check failed