Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance issue for the MySQLSchema doEmpty() call #2123

Closed
QuidamAzerty opened this issue Aug 29, 2018 · 6 comments
Closed

Performance issue for the MySQLSchema doEmpty() call #2123

QuidamAzerty opened this issue Aug 29, 2018 · 6 comments

Comments

@QuidamAzerty
Copy link

@QuidamAzerty QuidamAzerty commented Aug 29, 2018

Which version and edition of Flyway are you using?

5.0.7.
But it's irrelevant as it's still in the last version of the file

Which client are you using? (Command-line, Java API, Maven plugin, Gradle plugin)

Java via Maven plugin

Which database are you using (type & version)?

MySQL 5.6 and 5.7

Which operating system are you using?

Linux. I don't put more info on that because it's (a priori) not relevant here.

Usage context

We have 1086 schema. In these schema, ~1000 have the same structure with 845 tables and views.
So our information_schema is a little heavy.

What did you do?

flyway.migrate() on a non Flyway migrated yet db (and so with baselineOnMigrate to true).

What did you expect to see?

Relatively fast first migration.

What did you see instead?

Relatively slow first migration.

Why ?

The db has not been migrated with Flyway yet. So schemaHistory is empty.
And, by following the Flyway process from here, a schema.empty() is called, which call the doEmpty() of MySQLSchema (in our case).
But the doEmpty() is done via the information_schema. And takes approximately 1min to do this method in our case.

So ? What is the question ?

Why did you not use the SHOW methods ?
They are much more efficient than go through the information_schema.
And another question would be is this performance problem can also exist on the other Schema children ?

Thanks for the reading and have a great day !

@axelfontaine
Copy link
Contributor

@axelfontaine axelfontaine commented Aug 29, 2018

I don't believe using SHOW will improve performance as that will most likely use the information_schema behind the scenes anyway.

What definitely should be done though is replacing those SELECT COUNT(*) with either EXISTS or SELECT 1 ... LIMIT 1.

@QuidamAzerty
Copy link
Author

@QuidamAzerty QuidamAzerty commented Aug 29, 2018

Seems legit.

If you wish, I can try the query beforehand.
What takes time on the current query is the Select count(*) from information_schema.TRIGGERS Where TRIGGER_SCHEMA=?

@axelfontaine
Copy link
Contributor

@axelfontaine axelfontaine commented Aug 30, 2018

Yes, could you try:

Select 
(Select 1 from information_schema.TABLES Where TABLE_SCHEMA=? limit 1) + 
(Select 1 from information_schema.VIEWS Where TABLE_SCHEMA=? limit 1) + 
(Select 1 from information_schema.TABLE_CONSTRAINTS Where TABLE_SCHEMA=? limit 1) + 
(Select 1 from information_schema.EVENTS Where EVENT_SCHEMA=? limit 1) + 
(Select 1 from information_schema.TRIGGERS Where TRIGGER_SCHEMA=? limit 1) + 
(Select 1 from information_schema.ROUTINES Where ROUTINE_SCHEMA=? limit 1)

Don't forget to replace ? with your actual schema name.

@QuidamAzerty
Copy link
Author

@QuidamAzerty QuidamAzerty commented Aug 30, 2018

The limit does not change the performance.
The query stay still in a checking permissions state for a long time when I execute it in production environment with a readonly account.
In a local environment, I don't have consistent change between the two queries (with and without the limit 1).
EDIT: forgot word that changes sentence meaning: "not" --"

axelfontaine added a commit to flyway/flywaydb.org that referenced this issue Sep 20, 2018
@axelfontaine
Copy link
Contributor

@axelfontaine axelfontaine commented Sep 20, 2018

This has now been fixed.

The new query is

Select SUM(found) from(
       (Select 1 as found from information_schema.TABLES Where TABLE_SCHEMA='flyway_db' limit 1) union all
       (Select 1 as found from information_schema.VIEWS Where TABLE_SCHEMA='flyway_db' limit 1) union all
       (Select 1 as found from information_schema.TABLE_CONSTRAINTS Where TABLE_SCHEMA='flyway_db' limit 1) union all
       (Select 1 as found from information_schema.EVENTS Where EVENT_SCHEMA='flyway_db' limit 1) union all
       (Select 1 as found from information_schema.TRIGGERS Where TRIGGER_SCHEMA='flyway_db' limit 1) union all
       (Select 1 as found from information_schema.ROUTINES Where ROUTINE_SCHEMA='flyway_db' limit 1)) as all_found;
@golbi
Copy link

@golbi golbi commented Nov 6, 2019

@axelfontaine I hit this issue on MySQL 5.7 using a Flyway version that contains the above mentioned fix. I believe it is not really solving the problem due to one specific "feature" of information_schema in mysql 5.x. As you can see here https://dev.mysql.com/doc/refman/5.7/en/information-schema-optimization.html (see the table there) the problem is around TRIGGERS query. With where on TRIGGER_SCHEMA, it scans all databases (confirmed with EXPLAIN). E.g. in my case it takes 45s, while I have 0 triggers, but around 2k DBs.
Using EVENT_OBJECT_SCHEMA in where helps but I'm not sure if this is what you want to check here(?) SHOW TRIGGERS is also fast.

juliahayward added a commit that referenced this issue Apr 15, 2020
…y-triggers-where

Fix slow doEmpty by using EVENT_OBJECT_SCHEMA (#2123)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
3 participants
You can’t perform that action at this time.