Fix use-after-free bug in indexer state #896
Conversation
This should actually be covered by the protocol between the index and the indexers. The indexers send a
I'm a bit concerned about this comment in index.cpp:171 (it seems I can't add comments to lines that were not changed in this PR):
// Flush all unpersisted partitions. This only writes the meta state of
// each partition. For actually writing the contents of each INDEXER we
// need to rely on messaging.
If I'm understanding this correctly, it seems to suggest that indexers are supposed to continue writing after the index has shut down?
Also, it would be nice to write a test for this, but I'm not sure how to construct one.
FYI: I can locally reproduce the issue with `python integration/integration.py -d it --app build/ci/bin/vast -t "Conn log counting"`. The current change does not fix the problem.
@tobim I added a missing |
215bf80 to 337d6b8
@tenzir/backend a review would be very welcome.
When receiving an `exit` message in the index, we flush the meta state of all partitions but don't wait until all indexers are finished. This can cause memory corruption, because the indexers hold pointers to `measurement` structs that are stored inside the `partition` objects, which are destroyed along with the index. It could also lead to data loss if an indexer handles a batch after the index has already been destroyed.
I'll wait a bit with the rebase push so you can verify if your comments are addressed.
I resolved every item but one. From my side this is ready to be merged, but I cannot reproduce the issue locally, which means I also cannot test it.
The code makes sense to me. Since I cannot test this, please test it again on your end before merging to verify (since you're the only one who managed to reproduce it locally).
Looks good to me, apart from some minor things below. I couldn't test it locally, but for a CI issue a green CI run will probably be good enough anyway. (I also can't approve because I'm the PR creator.)
@@ -11,6 +11,9 @@ Every entry has a category for which we use the following visual abbreviations:

## Unreleased

- 🐞 A use after free bug would sometimes crash the node while it was shutting
  down. [#896](https://github.com/tenzir/vast/pull/896)
nit: The bug didn't crash the node, it was our ASan instrumentation that did ;)
Technically true, but that's just luck; the memory location could have been given back to the OS. Also, I don't want to get too technical in the changelog.
VAST_ASSERT(*it != nullptr);
if (buffered(**it) == 0u) {
  // ... either removing them directly if the buffers are empty,
  // meaning all table slices have been forwarded to the indexers,...
Is this part an optimization, or necessary for correctness? I was wondering why we need to distinguish between empty/non-empty here and in `unregister_partition()`, but not in `register_partition()` below.
I believe both instances are purely defensive; I don't want to rely on `emit_batches_impl()` to clean up partitions that are already done from its point of view (although it currently does).
NOTE: I did not see any mechanism to ensure that index and indexer are destroyed in the correct order, but if one exists it would probably be better to fix that mechanism rather than introducing `shared_ptr` here.