Cache classpath scanning results between migrate runs #1466
What version of Flyway are you using?
Which client are you using? (Command-line, Java API, Maven plugin, Gradle plugin, SBT plugin, ANT tasks)
What database are you using (type & version)?
H2 for testing. SQL Server for application
What operating system are you using?
Linux and Windows
What did you do?
(Please include the content causing the issue, any relevant configuration settings, and the command you ran)
I have to apply migrations to 15,000+ databases on a regular basis. I have an existing custom app (not using Flyway) that submits migrations to these databases using a configurable number of threads. In production, I routinely run 1,000 threads (100 servers at a time, each processing 10 database migrations in parallel). This is a lot of threads, but they are generally I/O bound, waiting for SQL Server to process scripts.
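A minimal sketch of the threading model described above may help frame the problem. The class name `ParallelMigrator` and the `migrateOne` callback are hypothetical, not taken from the actual app; in a Flyway-based version the callback would presumably configure a Flyway instance for the given JDBC URL and call `.migrate()` on it.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

/** Hypothetical sketch: run one migration task per database on a fixed-size pool. */
public class ParallelMigrator {

    /**
     * Submits migrateOne for every JDBC URL, waits for all tasks to finish,
     * and returns how many of them threw a RuntimeException.
     */
    public static int migrateAll(List<String> jdbcUrls, int threads, Consumer<String> migrateOne)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger failures = new AtomicInteger();
        for (String url : jdbcUrls) {
            pool.submit(() -> {
                try {
                    migrateOne.accept(url);
                } catch (RuntimeException e) {
                    // One failed database should not stop the remaining migrations.
                    failures.incrementAndGet();
                }
            });
        }
        pool.shutdown();                           // accept no new tasks
        pool.awaitTermination(1, TimeUnit.DAYS);   // block until queued tasks complete
        return failures.get();
    }
}
```

Because the tasks are mostly waiting on the database, the pool size can be much larger than the number of CPU cores; any per-task CPU work, such as rescanning a .jar file, is what then limits throughput.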
I am attempting to rewrite the app using Flyway and I have run into a performance issue. I want to store my migrations inside the app's .jar file, but each instance of Flyway reads the list of resource names from this .jar file. This causes a lot of activity on every thread and reduces performance.
Internally, Flyway creates a resource name cache, but it appears that this cache is flushed and reloaded each time .migrate() is called.
For example, I created a sample migration for 15,000 local H2 databases. When I ran the migration using my app, with 50 threads running Flyway in parallel, it took 16 minutes to complete. My profiler showed that the majority of time was spent in 'org.flywaydb.core.internal.util.scanner.classpath.JarFileClassPathLocationScanner'.
I made a small change to Flyway to create a static cache in JarFileClassPathLocationScanner to see what the performance impact would be. Running the same test with this modified version of Flyway took only 1 minute 30 seconds, roughly 10 times faster.
I'm sure the change I made isn't the correct long-term solution, but it let me identify the performance bottleneck for my particular application. This isn't a show-stopper for my use of Flyway, but many of my migrations currently take less than 15 minutes, so this will likely double my deployment time.
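The idea behind the static cache can be sketched roughly as follows. This is not the actual patch and not Flyway's code: `CachedResourceScanner` and the `jarScanner` function are hypothetical stand-ins for the real scanner, assuming the scan result for a given .jar location does not change while the JVM is running.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/** Hypothetical sketch of a JVM-wide cache of resource names, keyed by jar location. */
public class CachedResourceScanner {

    // Static, so every scanner instance (and every thread) shares one cache.
    private static final ConcurrentMap<String, Set<String>> CACHE = new ConcurrentHashMap<>();

    // Stand-in for the expensive step that opens the jar and lists its entries.
    private final Function<String, Set<String>> jarScanner;

    public CachedResourceScanner(Function<String, Set<String>> jarScanner) {
        this.jarScanner = jarScanner;
    }

    /** Returns the cached names, scanning the jar only on the first request for a location. */
    public Set<String> findResourceNames(String jarLocation) {
        // computeIfAbsent runs the scan at most once per key, even under concurrent calls.
        return CACHE.computeIfAbsent(jarLocation, jarScanner);
    }
}
```

The trade-off is that a static cache is never invalidated, so it would return stale results if the .jar contents could change during the life of the process.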
What did you expect to see?
What did you see instead?