Rename lhmn_ tables to lhma_ to avoid IBP stalls #41

Merged
bbuchalter merged 4 commits into master from rename_instead_of_drop_tables on Aug 1, 2018

@bbuchalter

When an LHM worker fails, cleanup_current_run must be called to remove
the triggers and "new" tables (which start with lhmn_). The previous
behavior was to drop the table immediately. However, if this is an active
table, the InnoDB buffer pool (IBP) can be full of pages belonging to this
"lhmn_" table. Dropping it forces the IBP to evict those pages, which can
cause MySQL to become unresponsive.

By instead renaming the table with the archive prefix (lhma_), we can let
the buffer pool unload the relevant pages over time and later safely drop
the archive tables as part of regular scheduled maintenance.
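
A minimal sketch of the rename-instead-of-drop idea, for illustration only. The helper names (`archive_name_for`, `rename_statement`) and the timestamp format are assumptions and are not taken from this PR's diff; the sketch only builds the SQL string and does not touch a database.

```ruby
# Hypothetical helper: derive an archive name for an LHM "new" table.
# The lhmn_/lhma_ prefixes come from the PR description; the timestamp
# layout is an assumption made for this sketch.
def archive_name_for(table_name, time = Time.now.utc)
  suffix = table_name.sub(/\Alhmn_/, "")
  stamp = time.strftime("%Y_%m_%d_%H_%M_%S")
  "lhma_#{stamp}_#{suffix}"
end

# Hypothetical helper: build a RENAME TABLE statement in place of a
# DROP TABLE statement, so the table's pages can age out of the buffer pool.
def rename_statement(table_name, time = Time.now.utc)
  "RENAME TABLE `#{table_name}` TO `#{archive_name_for(table_name, time)}`"
end

puts rename_statement("lhmn_users", Time.utc(2018, 8, 1, 17, 6, 0))
```

The archive tables would then be dropped later by scheduled maintenance, as described above.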

Brian Buchalter added 3 commits July 31, 2018 16:19
The amount of behavior implemented in the base Lhm module was excessive.
The code written there was intentionally made terse to try and limit
the amount of code written there. By extracting it to its own class,
we can be more expressive, which will make future refactoring easier.

We're about to change the behavior of the current cleanup
and I'd like to have more explicit tests about exactly what
will be executed.

In the next commit, we'll need to be able to generate timestamps
so let's extract this logic first.
@bbuchalter changed the title from "Rename lhmn_ tables to lhmna_ to avoid IBP stalls" to "Rename lhmn_ tables to lhma_ to avoid IBP stalls" on Aug 1, 2018
When an LHM worker fails, `cleanup_current_run` must be called
to remove the triggers and "new" tables (which start with lhmn_).
The previous behavior was to drop the table
immediately. However, if this is an active table, the InnoDB buffer pool
(IBP) can be full of pages belonging to this "lhmn_" table. Dropping it
forces the IBP to evict those pages, which can cause MySQL to become
unresponsive.

By instead renaming the table with the archive prefix (lhma_),
we can let the buffer pool unload the relevant pages over time and
later safely drop the archive tables as part of regular scheduled
maintenance.
@bbuchalter force-pushed the rename_instead_of_drop_tables branch from 7326422 to a9a4349 on August 1, 2018 17:06

def all_triggers_for_origin
  @all_triggers_for_origin ||= connection.select_values("show triggers like '%#{origin_table_name}'").collect do |trigger|
    trigger.respond_to?(:trigger) ? trigger.trigger : trigger

I realize this isn't your code but wtf? I wonder if this is Mysql vs Mysql2 stuff.

@bbuchalter (author):

That's likely the cause.

that is exactly what it is, yes.

@bbuchalter (author):

@jordanwheeler do you happen to know which is the mysql2 syntax?
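
The thread does not settle which adapter returns which shape, so the following is only a sketch of what the ternary in the snippet above does: it accepts either a row object exposing a `trigger` reader or a plain string, and returns the trigger name in both cases. `TriggerRow` and `trigger_name` are hypothetical names introduced for this example, and the trigger names are illustrative.

```ruby
# Hypothetical stand-in for a row object that exposes a #trigger reader.
TriggerRow = Struct.new(:trigger)

# Return the trigger name whether the adapter hands back a row object
# or a plain string.
def trigger_name(value)
  value.respond_to?(:trigger) ? value.trigger : value
end

rows = [TriggerRow.new("lhmt_ins_users"), "lhmt_del_users"]
names = rows.map { |row| trigger_name(row) }
puts names.inspect
```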

@insom left a comment

Nice to follow commit by commit so I can see how the factoring took shape. 🚢

Comment thread lib/lhm/timestamp.rb
    @time = time
  end

  def to_s

pretty fancy there guy

@jordanwheeler left a comment

i'd feel better if the timestamp stuff was tested better, but it's the same code which you haven't actually changed, and testing it nicely there would likely require timecop or something, which is a lot of effort for such a small change.

i just thought i'd mention that. i like what you've done here 👍
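
One possible way to test timestamp formatting without timecop, sketched under assumptions: if the class receives its time through the constructor (as the `@time = time` line in the snippet above suggests), a test can pass a fixed `Time` directly. `SketchTimestamp` and its format string are hypothetical and are not the contents of lib/lhm/timestamp.rb.

```ruby
# Hypothetical class mirroring the shape of the reviewed snippet:
# the time is injected, so no clock stubbing is needed in tests.
class SketchTimestamp
  def initialize(time)
    @time = time
  end

  # Format string is an assumption made for this sketch.
  def to_s
    @time.utc.strftime("%Y_%m_%d_%H_%M_%S")
  end
end

fixed = Time.utc(2018, 8, 1, 19, 57, 0)
stamp = SketchTimestamp.new(fixed).to_s
puts stamp
```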

@bbuchalter merged commit 2bed67e into master on Aug 1, 2018
@bbuchalter deleted the rename_instead_of_drop_tables branch on August 1, 2018 19:57