Cache classpath scanning results between migrate runs #1466

Closed
jshayward opened this issue Nov 14, 2016 · 1 comment


@jshayward jshayward commented Nov 14, 2016

What version of Flyway are you using?

4.0.3

Which client are you using? (Command-line, Java API, Maven plugin, Gradle plugin, SBT plugin, ANT tasks)

Java API

What database are you using (type & version)?

H2 for testing. SQL Server for application

What operating system are you using?

Linux and Windows

What did you do?

(Please include the content causing the issue, any relevant configuration settings, and the command you ran)

I have to apply migrations to 15,000+ databases on a regular basis. I have an existing custom app (not using Flyway) that submits migrations to these databases using a configurable number of threads. In production, I routinely run 1,000 threads (100 servers at a time, each processing 10 database migrations in parallel). This is a lot of threads, but they are generally I/O-bound, waiting for SQL Server to process scripts.
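A minimal sketch of that kind of fan-out, assuming a fixed-size thread pool and one task per database. `ParallelMigrator`, `migrateAll`, and the `migrateOne` callback are hypothetical names used only for illustration; they are not part of my app or of Flyway.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

/** Hypothetical helper: runs one migration task per database on a bounded pool. */
final class ParallelMigrator {
    private ParallelMigrator() {}

    /**
     * Calls migrateOne for every JDBC URL using at most `threads` worker threads.
     * Returns the URLs whose migration threw an exception.
     */
    static List<String> migrateAll(List<String> jdbcUrls, int threads, Consumer<String> migrateOne)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<String> failed = new ArrayList<>();
        try {
            // Submit everything first so the pool stays busy while we wait.
            List<Future<?>> futures = new ArrayList<>();
            for (String url : jdbcUrls) {
                futures.add(pool.submit(() -> migrateOne.accept(url)));
            }
            // Collect results in submission order; a failed task surfaces here.
            for (int i = 0; i < futures.size(); i++) {
                try {
                    futures.get(i).get();
                } catch (ExecutionException e) {
                    failed.add(jdbcUrls.get(i));
                }
            }
        } finally {
            pool.shutdown();
        }
        return failed;
    }
}
```

In a Flyway-based version, the callback would presumably build a Flyway instance for the given database and call `.migrate()` on it, which is the part that triggers the scan on every call.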

I am attempting to rewrite the app using Flyway and I have run into a performance issue. I want to store my migrations inside the app's .jar file, and each instance of Flyway reads the list of resource names from that file. This causes a lot of activity for each thread and reduces performance.

Internally, Flyway creates a resource name cache, but it appears that this cache is flushed and reloaded each time .migrate() is called.

For example, I created a sample migration for 15,000 local H2 databases. When I ran the migration using my app, with 50 threads running Flyway in parallel, it took 16 minutes to complete. My profiler showed that the majority of time was spent in 'org.flywaydb.core.internal.util.scanner.classpath.JarFileClassPathLocationScanner'.

I made a small change to Flyway to create a static cache in JarFileClassPathLocationScanner to see what the performance impact would be. Running the same test with this modified version of Flyway only took 1 minute 30 seconds.
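To illustrate the idea (this is not the actual change I made to Flyway), here is a minimal sketch of a static, thread-safe cache of jar resource names. `CachedJarScanner`, `findResourceNames`, and the `SCANS` counter are hypothetical names; only the `java.util.jar` and `java.util.concurrent` classes are real APIs. It assumes the jar does not change while the process is running.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Collections;
import java.util.Enumeration;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

/** Hypothetical helper, not Flyway code: caches resource names per (jar, location). */
final class CachedJarScanner {
    // Shared by all threads; the key is "<jar path>!<location prefix>".
    private static final Map<String, Set<String>> CACHE = new ConcurrentHashMap<>();
    // Counts real jar scans so the effect of the cache can be observed.
    static final AtomicInteger SCANS = new AtomicInteger();

    private CachedJarScanner() {}

    /** Returns the names of all entries under `location`, scanning the jar at most once per key. */
    static Set<String> findResourceNames(String jarPath, String location) {
        return CACHE.computeIfAbsent(jarPath + "!" + location, key -> scan(jarPath, location));
    }

    private static Set<String> scan(String jarPath, String location) {
        SCANS.incrementAndGet();
        Set<String> names = new TreeSet<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                // Keep files under the location; skip directory entries.
                if (name.startsWith(location) && !name.endsWith("/")) {
                    names.add(name);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Collections.unmodifiableSet(names);
    }
}
```

With a cache like this, only the first thread to ask for a given jar and location pays for the scan; later callers get the same immutable set back. A static cache never expires, which is one reason this is probably not the right long-term design.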

I'm sure the change I made isn't the correct long-term solution, but I was able to identify the performance bottleneck for my particular application. This isn't a show-stopper for me, but many of my migrations currently take less than 15 minutes, so this will likely double my deployment time.

What did you expect to see?

See above

What did you see instead?

See above

@axelfontaine changed the title from "Performance when running multiple Flyway instances in parallel" to "Cache classpath scanning results between migrate runs" Jan 27, 2017
Member

@axelfontaine axelfontaine commented Jan 27, 2017

Thanks for the suggestion. While we may not be able to do this for the filesystem, it should indeed be safe to cache the results of the scanning of jar files across migrate runs.

@axelfontaine axelfontaine added this to the Flyway 5.1.0 milestone Nov 27, 2017
juliahayward added a commit that referenced this issue Nov 1, 2019
@alextercete alextercete added the r: fixed label Nov 5, 2019