Rename lhmn_ tables to lhma_ to avoid IBP stalls #41
Merged
added 3 commits
July 31, 2018 16:19
The amount of behavior implemented in the base Lhm module was excessive. The code there was intentionally kept terse to limit how much lived in the module. By extracting it to its own class, we can be more expressive, which will make future refactoring easier.
We're about to change the behavior of the current cleanup and I'd like to have more explicit tests about exactly what will be executed.
In the next commit, we'll need to be able to generate timestamps so let's extract this logic first.
When an LHM worker fails, `cleanup_current_run` must be called to remove the triggers and "new" tables (which start with lhmn_). The previous behavior was to drop the table immediately. However, if this is an active table, the InnoDB buffer pool (IBP) can be full of pages related to this "lhmn_" table. When it is dropped, this forces IBP to clear those pages and can cause MySQL to become unresponsive. By instead renaming this table with the archive prefix (lhma_), we can let the buffer unload the relevant pages over time, and then later safely drop the archive tables as part of regular scheduled maintenance.
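As a rough illustration of the rename-instead-of-drop idea, here is a minimal sketch that only builds the SQL strings a cleanup might run. The helper names (`archive_timestamp`, `archive_name_for`, `cleanup_statements`), the timestamp layout, and the trigger name in the usage note are assumptions made for this example, not necessarily what the PR implements.

```ruby
# Hypothetical sketch: none of these helpers are taken from the LHM codebase.
NEW_PREFIX = "lhmn_"      # prefix of the in-progress "new" table
ARCHIVE_PREFIX = "lhma_"  # prefix of archived tables

# Format a time as an underscore-separated UTC stamp with milliseconds.
# The exact layout is an assumption for illustration.
def archive_timestamp(time)
  time.getutc.strftime("%Y_%m_%d_%H_%M_%S_%L")
end

# Derive the archive table name from the "new" table name.
def archive_name_for(new_table, time)
  origin = new_table.delete_prefix(NEW_PREFIX)
  "#{ARCHIVE_PREFIX}#{archive_timestamp(time)}_#{origin}"
end

# Build the statements for a failed run: drop the triggers, then rename
# the "new" table so its pages can age out of the buffer pool on their own.
def cleanup_statements(origin, triggers, time)
  new_table = "#{NEW_PREFIX}#{origin}"
  drops = triggers.map { |name| "DROP TRIGGER IF EXISTS `#{name}`" }
  drops + ["RENAME TABLE `#{new_table}` TO `#{archive_name_for(new_table, time)}`"]
end
```

For example, `cleanup_statements("users", ["lhmt_ins_users"], Time.now)` returns one `DROP TRIGGER` statement followed by a single `RENAME TABLE` statement. Passing the time in as an argument, rather than calling `Time.now` inside the helper, keeps the generated names deterministic in tests.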
7326422 to
a9a4349
Compare
insom
reviewed
Aug 1, 2018
def all_triggers_for_origin
  @all_triggers_for_origin ||= connection.select_values("show triggers like '%#{origin_table_name}'").collect do |trigger|
    trigger.respond_to?(:trigger) ? trigger.trigger : trigger
I realize this isn't your code but wtf? I wonder if this is Mysql vs Mysql2 stuff.
Author
@jordanwheeler do you happen to know which is the mysql2 syntax?
insom
approved these changes
Aug 1, 2018
insom
left a comment
Nice to follow commit by commit so I can see how the factoring took shape. 🚢
  @time = time
end

def to_s
jordanwheeler
approved these changes
Aug 1, 2018
jordanwheeler
left a comment
i'd feel better if the timestamp stuff was tested better, but it's the same code which you haven't actually changed, and testing it nicely there would likely require timecop or something, which is a lot of effort for such a small change.
i just thought i'd mention that. i like what you've done here 👍