Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is there a way to defer index creation until everything else is done? #615

Closed
cmartell-at-ocp opened this issue Mar 18, 2022 · 7 comments · Fixed by #697
Closed

Is there a way to defer index creation until everything else is done? #615

cmartell-at-ocp opened this issue Mar 18, 2022 · 7 comments · Fixed by #697

Comments

@cmartell-at-ocp
Copy link

Is there a way to defer index creation until all the records are inserted? I am trying to restore roughly ~11 TB (when uncompressed), and have some massive tables.

I'm currently running into the issue that if I run the restore without --innodb-optimize-keys it takes waay too long, but with the --innodb-optimize-keys option, once one of the tables finishes loading it immediately starts creating the index while the rest of the tables are still loading data. The sever can't handle this and shortly after I start seeing "MySQL has gone away" errors during the load.

Thanks for the utilities, they've been a lifesaver!

@davidducos
Copy link
Member

Hi @cmartell-skillz,
We could change the behavior and do something like this:
--innodb-optimize-keys=[AFTER_IMPORT_PER_TABLE|AFTER_IMPORT_ALL_TABLES]
Is that what you want?

@davidducos davidducos added this to the Release 0.12.5-1 milestone Mar 18, 2022
@cmartell-at-ocp
Copy link
Author

@davidducos yeah, something like that would be amazing!

For context, we are running into a ton of issues trying to migrate from RDS to K8s PXC as the pod keeps crashing for x or y reason, and when myloader stops the load due to "MySQL has gone away" there is no resume functionality, so we've been having to destroy the cluster and restart the import.

So far, we've been having good success with --innodb-optimize-keys as the load is way faster than without it, until a big table is done loading (let's say, at around 33600 files out of 71000) and its index starts getting created. Then all other threads try to continue loading data for other tables, and the server slows down to a crawl until it stops responding and the pod restarts.

We're hoping deferring the index creation after importing all tables might help, as even if the pod crashes, we know we can just re-do the index creation.

@davidducos
Copy link
Member

@cmartell-skillz,
I get the point. Meanwhile, do you know that you can edit the -schema files and delete the indexes on some tables? Then you can add them manually.

@cmartell-at-ocp
Copy link
Author

@davidducos yup, that's the plan for this weekend. Currently taking a fresh backup and will report back how that goes. Thank you!

@davidducos
Copy link
Member

I think that will be nice to have:
--innodb-optimize-keys=[AFTER_IMPORT_PER_TABLE|AFTER_IMPORT_ALL_TABLES|NO_EXECUTE]
NO_EXECUTE: writes the ALTER statement in the log but do not execute it.

@matthewlenz
Copy link

@davidducos AFTER_IMPORT_ALL_TABLES is being ignored. It starts creating indexes as soon as there are no more tables to create on a given thread. In my case I had 3 very large tables still loading (--load-data) and 4 threads. Once all the smaller tables were created the unoccupied thread started creating indexes. It didn't wait until all the other threads had finished loading data.

@matthewlenz
Copy link

Here is how I was using myloader:

myloader -d test_reload -B dbname --innodb-optimize-keys AFTER_IMPORT_ALL_TABLES -v 3 --max-threads-for-index-creation 1 --skip-definer -L test_reload.log

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants