HBASE-24075: Fix a race between master shutdown and metrics (re)init #1394
|
🎊 +1 overall
This message was automatically generated. |
saintstack
left a comment
Good.
|
Perhaps related: I wonder if getFromCacheOrFetch should check whether the master is stopped and, if so, NOT try to fetch the cluster ID from the FS. Another issue? |
I fixed it. I wanted to see how the metrics change alone affects the patch, hence I marked it as a draft. The latest version includes this change, which makes getClusterId() a no-op if the server is shutting down. |
JMXCacheBuster resets the metrics state at various points in time, and these resets can race with a master shutdown. While the master is tearing down, metrics initialization can touch unsafe state, for example invalidated FS objects. To avoid this, this patch makes getMetrics() a no-op when the master is either stopped or in the process of shutting down. It also makes getClusterId() a no-op when the server is shutting down.

Simulating this race in a test is tricky, but with the patch applied I no longer see the long stack traces from the JIRA locally.

Signed-off-by: Michael Stack <stack@apache.org>
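The guard described above can be illustrated with a minimal, self-contained sketch. This is not the code from the patch: `ServerState`, `GuardedMetricsSource`, and the `fsReads` counter are hypothetical names invented for illustration, and the real HBase classes and method signatures differ. It only shows the general shape of making `getMetrics()` and `getClusterId()` no-ops once shutdown has begun.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for the master's lifecycle flag; not an HBase class. */
final class ServerState {
    private final AtomicBoolean stopping = new AtomicBoolean(false);

    /** Called once when shutdown starts; the flag is never reset. */
    void beginShutdown() {
        stopping.set(true);
    }

    boolean isStoppingOrStopped() {
        return stopping.get();
    }
}

/** Hypothetical metrics source that skips work once shutdown has begun. */
final class GuardedMetricsSource {
    private final ServerState state;
    // Counts simulated reads of state that becomes unsafe during teardown.
    private int fsReads = 0;

    GuardedMetricsSource(ServerState state) {
        this.state = state;
    }

    /** Returns null (a no-op) during shutdown instead of touching the FS. */
    String getClusterId() {
        if (state.isStoppingOrStopped()) {
            return null;
        }
        fsReads++; // stands in for a fetch from the file system
        return "cluster-1"; // illustrative value only
    }

    /** Returns an empty list (a no-op) when the server is stopping or stopped. */
    List<String> getMetrics() {
        List<String> out = new ArrayList<>();
        if (state.isStoppingOrStopped()) {
            return out;
        }
        out.add("clusterId=" + getClusterId());
        return out;
    }

    int fsReads() {
        return fsReads;
    }
}
```

In this sketch, a metrics reset that arrives after `beginShutdown()` returns immediately and never reaches the simulated file-system read. A check-then-act guard like this narrows the race window rather than closing it entirely, since shutdown can still begin between the check and the read.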