Make partition deletion resilient against oversize #2431

Merged: tobim merged 16 commits into master from story/sc-35575/transform-oversized-partitions on Jul 20, 2022

@tobim tobim commented Jul 14, 2022

We now reingest data from oversized partitions on startup.

📝 Checklist

  • All user-facing changes have changelog entries.
  • The changes are reflected on vast.io, if necessary.
  • The PR description contains instructions for the reviewer, if necessary.

🎯 Review Instructions

Testing instructions: You can manually increase the size of a partition file beyond the FlatBuffers (fbs) size limit, then start VAST:

vast -N import zeek < libvast_test/artifacts/logs/zeek/conn.log
dd if=/dev/zero bs=1G seek=3 count=0 of=vast.db/index/<the_uuid>
vast start &
vast flush

After that sequence the database should be fully repaired.
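
To check which files are actually over the limit before and after the repair, something like the following sketch may help. It assumes the relevant limit is the FlatBuffers maximum buffer size of 2^31 - 1 bytes; whether VAST applies exactly that threshold is an assumption here, and `list_oversized` and `FBS_MAX_SIZE` are hypothetical names, not part of VAST.

```shell
# Assumed limit: FlatBuffers buffers are capped just below 2 GiB (2^31 - 1 bytes).
FBS_MAX_SIZE=2147483647

# list_oversized DIR [LIMIT]: print regular files under DIR larger than LIMIT bytes.
# Hypothetical helper for illustration; LIMIT defaults to FBS_MAX_SIZE.
list_oversized() {
  dir=$1
  limit=${2:-$FBS_MAX_SIZE}
  # POSIX find: "-size +Nc" matches files whose size exceeds N bytes.
  find "$dir" -type f -size +"${limit}c"
}
```

Running `list_oversized vast.db/index` after the `dd` step should print the enlarged partition file. If the repair works as described, the same call should print nothing once `vast flush` has returned.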

@tobim tobim added the bug Incorrect behavior label Jul 14, 2022
@tobim tobim requested a review from lava July 14, 2022 12:31
@tobim tobim force-pushed the story/sc-35575/transform-oversized-partitions branch 2 times, most recently from 05f8967 to 283cc84 Compare July 15, 2022 15:56
@tobim tobim marked this pull request as ready for review July 15, 2022 16:08

tobim commented Jul 15, 2022

This still needs a fix for the index statistics.

@lava lava left a comment

Looks good to me from a high-level perspective: when we attempt to erase a corrupted partition, we try to guess the store file and erase that; and when we start VAST and encounter an oversized partition, we try to fix the DB state by reimporting the data.

It should actually be possible to create a unit test for this by creating a regular partition, adjusting the seek position to >= 2G, and verifying that the data is queryable after the index has restarted.

@tobim tobim force-pushed the story/sc-35575/transform-oversized-partitions branch 2 times, most recently from 6c3d7a9 to c7f6a6f Compare July 19, 2022 08:54
@tobim tobim force-pushed the story/sc-35575/transform-oversized-partitions branch from c7f6a6f to 9cef2a9 Compare July 19, 2022 12:22
@tobim tobim enabled auto-merge July 19, 2022 12:31
@tobim tobim force-pushed the story/sc-35575/transform-oversized-partitions branch from 361d664 to 30a3dd7 Compare July 19, 2022 13:50
@tobim tobim force-pushed the story/sc-35575/transform-oversized-partitions branch from a80a0a4 to 2262218 Compare July 20, 2022 12:55
@tobim tobim force-pushed the story/sc-35575/transform-oversized-partitions branch from 5b0d70d to b40cf5b Compare July 20, 2022 14:44
@tobim tobim merged commit 56d626e into master Jul 20, 2022
@tobim tobim deleted the story/sc-35575/transform-oversized-partitions branch July 20, 2022 16:18