fixes clearing suspension for offline tables #4295
There was code in the manager that seemed to have the intent of clearing suspension markers in tablets for an offline table. This code was not working or tested. This commit fixes this code and adds a test that validates that suspension markers are removed when a table is taken offline. fixes apache#3314
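The intent described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the manager code changed in this PR: the class `OfflineSuspensionSketch`, the type `TabletEntry`, and the method `clearSuspensions` are hypothetical names introduced for illustration, and a suspension marker is modeled as a nullable server string.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, not Accumulo code: shows suspension markers being
// dropped once a table is offline.
final class OfflineSuspensionSketch {

  // Hypothetical stand-in for a tablet's metadata entry.
  static final class TabletEntry {
    final String extent;
    String suspendedServer; // null means the tablet has no suspension marker

    TabletEntry(String extent, String suspendedServer) {
      this.extent = extent;
      this.suspendedServer = suspendedServer;
    }
  }

  // Clears suspension markers only when the table is offline.
  // Returns the number of markers that were removed.
  static int clearSuspensions(List<TabletEntry> tablets, boolean tableOffline) {
    if (!tableOffline) {
      return 0; // online tables keep their markers so the suspend timeout still applies
    }
    int cleared = 0;
    for (TabletEntry tablet : tablets) {
      if (tablet.suspendedServer != null) {
        tablet.suspendedServer = null;
        cleared++;
      }
    }
    return cleared;
  }

  public static void main(String[] args) {
    List<TabletEntry> tablets = new ArrayList<>();
    tablets.add(new TabletEntry("tablet-a", "server1:9997"));
    tablets.add(new TabletEntry("tablet-b", null));
    System.out.println("cleared while online: " + clearSuspensions(tablets, false));
    System.out.println("cleared while offline: " + clearSuspensions(tablets, true));
  }
}
```

In this model an online table is left untouched, which mirrors the point made later in the conversation that the suspend behavior for online tables was not changed.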
With these changes, what is the suspend state when:
In both of those cases, when the tablet(s) are reloaded, is the suspend state still set and subject to the timeout?
Yeah, I think so, and the existing test was checking for this. I added the following line to the test after tservers were killed to be sure that tablets were in the suspended state.
          ds = TabletLocations.retrieve(ctx, tableName);
          log.info("Waiting for suspended {}", ds.suspended);
        }
      } else if (action == AfterSuspendAction.RESUME) {
The code in this else block was not changed except for indentation and being placed in an else block.
      private class ShutdownTserverKiller implements TServerKiller {

        @Override
        public void eliminateTabletServers(ClientContext ctx, TabletLocations locs, int count)
The code in this method was copied without change from an existing lambda in the test. The copy was made so the code could be reused.
      private class CrashTserverKiller implements TServerKiller {

        @Override
        public void eliminateTabletServers(ClientContext ctx, TabletLocations locs, int count)
The code in this method was also copied without change from an existing lambda in the test.
The changes look okay, as long as they preserve the behavior that the suspend duration can be used for rolling restarts, with tservers either stopped by command or by a kill (crash).
Mainly, with the suspend delay set to some value, when a tserver is stopped by command or is killed, the tablets assigned to it will not be reassigned until the suspend duration expires. When a tserver with suspended tablets comes back online, the suspended tablets are reassigned to that same server.
As long as this holds, I approve the changes.
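The rule described in this review can be sketched as a small decision function. This is a toy model under stated assumptions, not the manager's assignment logic: the class `SuspendRuleSketch`, the enum `Decision`, and the method `decide` are hypothetical, times are plain milliseconds, and the choice to treat an expired suspension as assignable to any server is an assumption of the sketch.

```java
// Hypothetical model of the suspend rule described in the review; not Accumulo code.
final class SuspendRuleSketch {

  enum Decision { WAIT, ASSIGN_TO_PREVIOUS_SERVER, ASSIGN_ANYWHERE }

  // suspendedAtMillis: when the tablet was suspended
  // suspendDurationMillis: configured suspend delay
  // nowMillis: current time
  // previousServerOnline: whether the server that last hosted the tablet is back
  static Decision decide(long suspendedAtMillis, long suspendDurationMillis, long nowMillis,
      boolean previousServerOnline) {
    boolean expired = nowMillis - suspendedAtMillis >= suspendDurationMillis;
    if (expired) {
      // Assumption: after the suspend duration the tablet is treated as unassigned.
      return Decision.ASSIGN_ANYWHERE;
    }
    if (previousServerOnline) {
      // The server returned within the window, so the tablet goes back to it.
      return Decision.ASSIGN_TO_PREVIOUS_SERVER;
    }
    // Still inside the window and the server is absent: hold off on reassignment.
    return Decision.WAIT;
  }

  public static void main(String[] args) {
    System.out.println(decide(0L, 60_000L, 10_000L, false));
    System.out.println(decide(0L, 60_000L, 10_000L, true));
    System.out.println(decide(0L, 60_000L, 60_000L, false));
  }
}
```

Whether a tserver was stopped by command or crashed makes no difference in this model, which matches the expectation that both cases behave the same during a rolling restart.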
I did not change that behavior in this PR.