HBASE-26938 Compaction failures after StoreFileTracker integration (branch-2, branch-2.5) #4334
Conversation
Force-pushed from ab7ff35 to 79617ad
🎊 +1 overall
This message was automatically generated.
…ranch-2, branch-2.5)
- Throw an IllegalStateException for the "Writer exists when it should not" case in Compactor.
- If a store is already compacting, do not select it for any additional concurrent compaction.
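The two guards described above could be sketched roughly as follows. This is a hypothetical simplification, not the actual HBase API: `StoreShim`, `selectCompaction`, and `compact` are illustrative stand-ins for the real `HStore`/`Compactor` machinery.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of the two guards in the commit message above.
class StoreShim {
    // Tracks whether a compaction is already running against this store.
    private final AtomicBoolean compacting = new AtomicBoolean(false);
    // Stands in for the Compactor's reused writer field.
    private Object writer = null;

    /** Returns true if selected; false if the store is already compacting. */
    boolean selectCompaction() {
        // Guard 2: never select a store that is already compacting.
        return compacting.compareAndSet(false, true);
    }

    void compact() {
        // Guard 1: the reused Compactor must not find a leftover writer.
        if (writer != null) {
            throw new IllegalStateException("Writer exists when it should not");
        }
        writer = new Object();   // open the writer, write compacted cells ...
        writer = null;           // ... then commit and clear it
        compacting.set(false);   // allow the next selection
    }
}

public class CompactionGuardDemo {
    public static void main(String[] args) {
        StoreShim store = new StoreShim();
        System.out.println(store.selectCompaction()); // true: first selection wins
        System.out.println(store.selectCompaction()); // false: concurrent attempt rejected
        store.compact();
        System.out.println(store.selectCompaction()); // true: selectable again after compaction
    }
}
```

With this shape, the `IllegalStateException` becomes a backstop: if the selection guard works, the exception should never fire.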
Force-pushed from 79617ad to 08f8a6d
🎊 +1 overall
This message was automatically generated.
💔 -1 overall
This message was automatically generated.
In the last precommit result
🎊 +1 overall
This message was automatically generated.
As mentioned on HBASE-26938, @Apache9 suggested an alternative approach that I accept, and I will update this PR soon.
Opened a PR for master, see #4338. Closing this one; I will cherry-pick the master commit back after it merges.
One `Compactor` instance is reused for the lifetime of a store, and it has a `writer` field that is at issue here. More than one compaction cannot be concurrently selected and executed against a given store, or else readers and writers of the `writer` field will encounter multithreaded correctness problems. Yet I am seeing concurrent selection and execution of compaction activity against the store in the test scenario.

In the test scenario I increased the small and large compaction thread pools to 10 and 5 threads, respectively, raised the blocking-files threshold from its default to 24, and the store is flushing furiously. Operation under these conditions used to be reliable, but perhaps only by an accidental serialization of compaction activity prior to the SFT changes.
With this change in place, reliability and performance under the test scenario return to the previous baseline for DEFAULT SFT. No ERRORs.
Suggestions of alternative approaches to fixing this are welcome too.