When a LocalEntityManagerFactory bean is configured with a bootstrap executor, there's a race between application context refresh and native entity manager factory bootstrapping. This makes things awkward for any logic that wants to run once bootstrapping has completed, for example something that relies on Hibernate DDL processing.
I've attached a small example that hopefully illustrates the problem. When imported into your IDE and run, EmfBootstrappingRaceApplication should throw an exception when executing the SELECT query using JdbcTemplate. The failure does not occur if factory.setBootstrapExecutor(executor); is commented out. If the artificial delay introduced by the async executor is removed, the failure occurs only intermittently.
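For readers without the attachment, the ordering problem can be sketched with the JDK alone. This is not the attached application and uses no Spring or JPA types: the class name, the tableCreated flag (standing in for Hibernate DDL processing) and the latch (standing in for the artificial delay) are all made up for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical, JDK-only simulation of the race; no Spring APIs are involved.
class BootstrapRaceSketch {

    /** Returns true if the simulated query ran before the simulated DDL had completed. */
    static boolean queryRunsBeforeSchemaExists(boolean useBootstrapExecutor) {
        AtomicBoolean tableCreated = new AtomicBoolean(false);
        CountDownLatch refreshFinished = new CountDownLatch(1);
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Runnable bootstrap = () -> {
                if (useBootstrapExecutor) {
                    // Artificial delay: hold bootstrapping back until refresh has finished.
                    try {
                        refreshFinished.await();
                    } catch (InterruptedException ex) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
                tableCreated.set(true); // stands in for DDL processing
            };

            CompletableFuture<Void> bootstrapping;
            if (useBootstrapExecutor) {
                // Bootstrapping is handed off to a background thread.
                bootstrapping = CompletableFuture.runAsync(bootstrap, executor);
            } else {
                // Bootstrapping happens inline, as part of "refresh".
                bootstrap.run();
                bootstrapping = CompletableFuture.completedFuture(null);
            }

            // "Refresh" is done; plain JDBC-style code now looks at the schema.
            boolean tableVisible = tableCreated.get();
            refreshFinished.countDown();

            bootstrapping.join(); // let the background work finish before shutting down
            return !tableVisible;
        } finally {
            executor.shutdown();
        }
    }
}
```

With the executor in place the simulated query always sees a missing table, because the delay forces bootstrapping to finish after refresh. Without the executor the table is always there.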
This is by design: the application and its web endpoints can already be up and listening while the persistence provider is still bootstrapping, blocking only once a request comes in that actually needs to access the persistence provider.
That said, I can see the issue with post-bootstrapping logic here. I guess we could try to attach a callback to the async bootstrap thread there, or we could have a mode of bootstrapping where we effectively wait and block at the end of the refresh phase.
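As a rough illustration of those two options, here is a JDK-only sketch. It is not a proposed framework API: the class, the blockAtEndOfRefresh flag and the event names are hypothetical, and a CompletableFuture stands in for the async bootstrap thread.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical, JDK-only sketch of the two options; no Spring APIs are involved.
class PostBootstrapOptionsSketch {

    /** Runs a simulated refresh and returns the events in the order they happened. */
    static List<String> run(boolean blockAtEndOfRefresh) {
        List<String> events = new CopyOnWriteArrayList<>();
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            CompletableFuture<Void> bootstrapping = CompletableFuture
                    // Simulated async bootstrap on the bootstrap executor.
                    .runAsync(() -> events.add("bootstrapped"), executor)
                    // Option 1: a callback attached to the async bootstrap work.
                    .thenRun(() -> events.add("callback"));

            if (blockAtEndOfRefresh) {
                // Option 2: wait at the end of the refresh phase until bootstrapping is done.
                bootstrapping.join();
            }
            events.add("refreshed");

            bootstrapping.join(); // make sure the log is complete before returning
            return events;
        } finally {
            executor.shutdown();
        }
    }
}
```

In the blocking mode the order is always bootstrapped, callback, refreshed. In the non-blocking mode the callback still runs after bootstrapping, but "refreshed" may appear anywhere relative to the other two, which is the unpredictability being discussed.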
Ah, I see. And that will of course work fine if you only access the database via the entity manager. It remains problematic for applications that aren't purely using JPA and are using JdbcTemplate, jOOQ, or whatever as well.
The area of Boot that led me to investigate this was DataSource initialisation. Users can provide data scripts that populate the database once EntityManagerFactory bootstrapping has completed. We're currently detecting the completion of bootstrapping by decorating the JpaVendorAdapter and (ab)using the postProcessEntityManagerFactory callback. This works, but doesn't feel particularly clean. An official callback for the completion of bootstrapping would be useful.
As things stand, even with our (ab)use of the postProcessEntityManagerFactory callback, we can still end up with refresh completing while the database is in an unknown state. I think there'd definitely be benefit in an option that blocks right at the end of refresh until bootstrapping has completed. This should allow anything not going through JPA to access the database once it has reached a predictable state.