Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Circular reference in JPA, DataSource, initializers #13042

Closed
dsyer opened this issue May 3, 2018 · 41 comments
Closed

Circular reference in JPA, DataSource, initializers #13042

dsyer opened this issue May 3, 2018 · 41 comments
Assignees
Labels
status: noteworthy A noteworthy issue to call out in the release notes theme: datasource Issues relating to data sources type: bug A general bug
Milestone

Comments

@dsyer
Copy link
Member

dsyer commented May 3, 2018

This was originally reported by a Spring Cloud user (someone trying to upgrade the Petclinic microservice edition to Spring Boot 2.0 - so it works with 1.5): spring-cloud/spring-cloud-commons#355.

I'm still a bit mystified as to why it happens, but the evidence is pointing to something in the DataSourceInitializer and its configuration (there's a BeanPostProcessor that forcibly instantiates the DataSourceInitializerInvoker when it sees a DataSource).

Here's a really small app that blows up in the same way (with no Spring Cloud dependencies). You need a schema.sql and a data.sql to make it fail, and you need to set spring.datasource.initialization-mode=embedded (or always) to trigger the initialization. Dependencies are just h2 and the JPA starter.

@SpringBootApplication
public class BeanApplication {

	@Bean
	public static BeanFactoryPostProcessor fooScope() {
		return beanFactory -> {
			beanFactory.registerScope("foo", new SimpleThreadScope());
		};
	}

	@Bean
	@Scope("foo")
	@ConfigurationProperties(prefix = "spring.datasource.hikari")
	public HikariDataSource dataSource(DataSourceProperties properties) {
		HikariDataSource dataSource = properties.initializeDataSourceBuilder()
				.type(HikariDataSource.class).build();
		return dataSource;
	}

	public static void main(String[] args) {
		SpringApplication.run(BeanApplication.class, args);
	}

}

If you (exclude= {HibernateJpaAutoConfiguration.class, JpaRepositoriesAutoConfiguration.class}) the problem goes away. Here's some code in github: https://github.com/scratches/datasource-cycle.

While I was investigating the Cloud issue I also managed to find some combinations of auto configurations that led to other cycles (not involving JPA, but still with a scoped data source), so there is more than one way to tickle this.

@spring-projects-issues spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged label May 3, 2018
@snicoll
Copy link
Member

snicoll commented May 3, 2018

there's a BeanPostProcessor that forcibly instantiates the DataSourceInitializerInvoker when it sees a DataSource

There is one in 1.5.x as well si I am not sure how that's relevant to the description.

I guess we need to figure out what has changed with regards to the scope thing.

@dsyer
Copy link
Member Author

dsyer commented May 3, 2018

This fails with 1.5 (in a different way):

@SpringBootApplication
public class BeanApplication {

	@Bean
	public static BeanFactoryPostProcessor fooScope() {
		return beanFactory -> {
			beanFactory.registerScope("foo", new SimpleThreadScope());
		};
	}

	@Bean
	@Scope("foo")
	@ConfigurationProperties(prefix = "spring.datasource")
	public DataSource dataSource(DataSourceProperties properties) {
		return DataSourceBuilder.create().url(properties.determineUrl())
				.type(org.apache.tomcat.jdbc.pool.DataSource.class).build();
	}

	public static void main(String[] args) {
		SpringApplication.run(BeanApplication.class, args);
	}

}

It's on a branch 1.5.x in the same repo.

@wilkinsona wilkinsona self-assigned this May 3, 2018
@wilkinsona
Copy link
Member

wilkinsona commented May 3, 2018

The 1.5.x version fails because the DataSourceInitializer ends up with this.dataSource being null. This happens because this.applicationContext.getBeanNamesForType(DataSource.class, false, false) returns an empty array. The array is empty because the dataSource bean definition doesn't match. The bean definition doesn't match because it's scoped which causes org.springframework.beans.factory.support.AbstractBeanFactory.isSingleton(String) to return false. This matters because of this logic in Framework.

As far as I can tell, there's no circular reference problem in this case. The problem is that calling this.applicationContext.getBeanNamesForType(DataSource.class, false, false), as DataSourceInitializer does, will never be able to find a DataSource that's in a scope. You can see this by removing the data.sql and schema.sql scripts from the 1.5.x branch of the sample and changing its main method to the following:

public static void main(String[] args) {
    String[] dataSources = SpringApplication.run(BeanApplication.class, args).getBeanNamesForType(DataSource.class, false, false);
    System.out.println(Arrays.toString(dataSources));
}

It will output an empty array.

@wilkinsona
Copy link
Member

The 2.0.x version fails because of a difference in how singleton beans and prototype beans are handled when they are requested while still be created. In the singleton case, there's some fallback logic that allows the singleton that's in creation to be retrieved. In the prototype case, there's no such fallback and a BeanCurrentlyInCreationException is thrown.

@wilkinsona
Copy link
Member

wilkinsona commented May 3, 2018

In SPR-16783, @jhoeller said that "the recommendation is to never access an object within its own post-processing phase to begin with". This is exactly what DataSourceInitializerPostProcessor does when it gets the DataSourceInitializer (1.5) or DataSourceInitializerInvoker (2.0) beans. Putting aside Framework's inconsistent handling of retrieving a singleton that's in creation vs retrieving a prototype that's in creation, we probably need to find another approach anyway.

@dsyer
Copy link
Member Author

dsyer commented May 3, 2018

The "singleton" in question isn't the same as the default scope for beans (I think). It's "raw" singletons (the ones that are registered directly as instances, not bean definitions). The DataSource isn't in that category. I'm not sure how that helps, though.

@wilkinsona
Copy link
Member

The "singleton" in question isn't the same as the default scope for beans (I think). It's "raw" singletons (the ones that are registered directly as instances, not bean definitions).

I'm pretty sure it's the same. When I remove the @Scope and step through this code, the dataSource bean is marked as being in creation but it's still accessible.

@dsyer
Copy link
Member Author

dsyer commented May 3, 2018

OK, maybe the raw singletons get cached in the same place as the bean definition ones.

I added some non-hibernate samples to my repro project, here: https://github.com/scratches/datasource-cycle/tree/master/src/main/java/org/springframework/boot/autoconfigure/jdbc. They are easier to analyse (fewer beans), but the exception is slightly different.

@wilkinsona
Copy link
Member

I added some non-hibernate samples to my repro project

Thanks. BrokenApplication exhibits the same problem as I described above, failing because the dataSource prototype is in creation when an attempt is made to retrieve it.

I'm intrigued by the other two as they look like they should fail in the same way. Time to do some digging.

@wilkinsona
Copy link
Member

wilkinsona commented May 3, 2018

NotBrokenApplication is not broken because of an ordering change in the configuration (DataSourceInitializationConfiguration going before DataSourceJmxConfiguration). This ordering change means that afterPropertiesSet on DataSourceInitializerInvoker is called outside of the post-processing of the DataSource. This means that the attempt to retrieve the DataSource while it's being created is avoided.

@wilkinsona
Copy link
Member

AnotherNotBrokenApplication appears to not be broken for the same reason as NotBrokenApplication. Updating the ordering of the imported configuration classes so that DataSourceJmxConfiguration goes before DataSourceInitializationConfiguration causes it to fail due to a cycle.

@wilkinsona
Copy link
Member

wilkinsona commented May 3, 2018

When the 2.0.x version of BeanApplication fails, it does so because the early retrieval of LoadTimeWeaverAware beans has triggered the creation and post-processing of the DataSource bean before DataSourceInitializerInvoker has been created because the LocalContainerEntityManagerFactoryBean is load time weaver-aware and depends on the DataSource. This leads to DataSourceInitializerInvoker bean created during the post-processing of the DataSource bean.

The problem can be avoided by making DataSourceInitializerInvoker implement LoadTimeWeaverAware. Due to auto-configuration ordering, it's defined before the definition of the LocalContainerEntityManagerFactoryBean so, if it's LoadTimeWeaverAware, it's created and initialised first which allows it to retrieve the DataSource when it isn't already under creation.

@spencergibb
Copy link
Member

Thanks for all the research @wilkinsona and @dsyer

@dsyer
Copy link
Member Author

dsyer commented May 3, 2018

if it's LoadTimeWeaverAware, it's created and initialised first

That would fix the issue in Spring Cloud because the other BrokenApplication cycle can be broken (for reasons I do not understand) just by declaring the Scope as a @Bean directly instead of using a BeanDefinitionRegistry.

@wilkinsona
Copy link
Member

I've yet to convince myself that implementing LoadTimeWeaverAware is the right thing to do. It causes the DataSource to be initialized very early in all applications, irrespective of whether or not they use JPA. On the other hand, if it works in a JPA app, it should probably work elsewhere too. It just feels wrong to be deliberately causing early initialisation of the DataSource… It also feels like a bit of a hack as we don't actually care about the LoadTimeWeaver. However, as far as I can tell, there's no less hacky way for us to trigger things early.

@dsyer
Copy link
Member Author

dsyer commented May 4, 2018

Tough decision. I suppose we could have 2 identical invokers, one that is LoadTimeWeaverAware and one that is not, and only create a bean for the first one when JPA is active. With a base class doing all the work it wouldn't be completely horrible.

@wilkinsona wilkinsona added the for: team-attention An issue we'd like other members of the team to review label May 10, 2018
@kentoj
Copy link

kentoj commented May 10, 2018

Could this be due to a change introduced by the Hikari connection pooler?
If I use Apache DBCP2 with spring.datasource.type: org.apache.commons.dbcp2.BasicDataSource and have the spring.datasource.initialization-mode: always then everything starts up fine and both the scripts are run properly. However, if I use the default of Hikari then things break.

@wilkinsona
Copy link
Member

That's very interesting. Thank you, @kentoj. Moving to BasicDataSource appears to help in some scenarios but not in others. Using 2.0.2.RELEASE, this fails:

@SpringBootApplication
public class BeanApplication {

	@Bean
	public static BeanFactoryPostProcessor fooScope() {
		return beanFactory -> {
			beanFactory.registerScope("foo", new SimpleThreadScope());
		};
	}

	@Bean
	@Scope("foo")
	@ConfigurationProperties(prefix = "spring.datasource.dbcp2")
	public BasicDataSource dataSource(DataSourceProperties properties) {
		BasicDataSource dataSource = properties.initializeDataSourceBuilder()
				.type(BasicDataSource.class).build();
		return dataSource;
	}

	public static void main(String[] args) {
		SpringApplication.run(BeanApplication.class, args);
	}

}

But this does not:

@SpringBootConfiguration
@EnableConfigurationProperties(DataSourceProperties.class)
@Import({ DataSourceConfiguration.Dbcp2.class, DataSourceJmxConfiguration.class,
		DataSourceInitializationConfiguration.class})
public class BrokenApplication {

	@Component
	protected static class RefreshScopeConfiguration
			implements BeanDefinitionRegistryPostProcessor {

		@Override
		public void postProcessBeanFactory(ConfigurableListableBeanFactory beanFactory)
				throws BeansException {
		}

		@Override
		public void postProcessBeanDefinitionRegistry(BeanDefinitionRegistry registry)
				throws BeansException {
			registry.registerBeanDefinition("fooScope",
					BeanDefinitionBuilder.genericBeanDefinition(FooScope.class)
							.setRole(BeanDefinition.ROLE_INFRASTRUCTURE)
							.getBeanDefinition());
		}
	}

	@Bean
	@Scope("foo")
	@ConfigurationProperties(prefix = "spring.datasource.dbcp2")
	public BasicDataSource dataSource(DataSourceProperties properties) {
		BasicDataSource dataSource = properties.initializeDataSourceBuilder()
				.type(BasicDataSource.class).build();
		return dataSource;
	}

	public static void main(String[] args) {
		SpringApplication.run(BrokenApplication.class, args);
	}

}

class FooScope implements org.springframework.beans.factory.config.Scope, BeanFactoryPostProcessor {

	@Override
	public Object get(String name, ObjectFactory<?> objectFactory) {
		return objectFactory.getObject();
	}

	@Override
	public Object remove(String name) {
		return null;
	}

	@Override
	public void registerDestructionCallback(String name, Runnable callback) {
	}

	@Override
	public Object resolveContextualObject(String key) {
		return null;
	}

	@Override
	public String getConversationId() {
		return null;
	}

	@Override
	public void postProcessBeanFactory(ConfigurableListableBeanFactory beanFactory)
			throws BeansException {
		beanFactory.registerScope("foo", this);
	}

}

@wilkinsona
Copy link
Member

wilkinsona commented May 11, 2018

BrokenApplication is no longer broken with DBCP2 as the switch has been enough to prevent the DataSource from being retrieved while its prototype is in creation. I don't yet know why.

BeanApplication continues to be broken due to the special handling of LoadTimeWeaverAware beans described above.

@snicoll
Copy link
Member

snicoll commented May 12, 2018

For the record, a workaround has been implemented in Spring Cloud for this, see spring-cloud/spring-cloud-commons@0e259ea

@wilkinsona
Copy link
Member

I've opened a Framework issue so that we can address this without resorting to the LoadTimeWeaverAware hack. This may mean that fixing this ends up being a duplicate of #9528, but let's keep this issue open for now so it doesn't fall through the cracks.

@wilkinsona wilkinsona added type: bug A general bug and removed for: team-attention An issue we'd like other members of the team to review status: waiting-for-triage An issue we've not yet triaged labels May 14, 2018
@wilkinsona wilkinsona added this to the Backlog milestone May 14, 2018
@xcbeyond
Copy link

xcbeyond commented Sep 3, 2018

@dsyer @wilkinsona Have you solved this problem? if not,i can only user spring cloud 1.5,is that it?

@wilkinsona
Copy link
Member

@xcbeyond As noted above, a workaround has been implemented in Spring Cloud.

@wilkinsona
Copy link
Member

wilkinsona commented Jan 17, 2019

the blurring of properties used to create the DataSource bean and properties that are used purely for initialisation

We'd like to fix this by introducing an init sub-group to collect together the properties that are specifically to do with initialisation.

differences in when initialisation will actually create a whole new DataSource and when it will reuse one, etc.

This occurs when a specific username and/or password is provided for the initialisation of the DataSource. When it occurs, a new DataSource is created with the default settings. This can lead to a pool with many connections being created purely for initialisation. We'd like to address this by tuning any pool that's created specifically for initialisation to have a more sensible number of connections (probably 1).

moving to a declarative @InitializeDataSource-based approach

This approach would, even when replacing a bean, would allow us to know if the user wanted the datasource to be initialised. The annotation could have an attribute that specifics the prefix for the properties that configure datasource initialisation. These same properties could be used to apply the initialisation to an auto-configured test database.

@samo-esure
Copy link

I'm able to work around this issue by specifying management.server.port different than the server.port

@klopfdreh
Copy link

klopfdreh commented May 31, 2019

Hey all,

we are also facing this issue and found out that beside the already mentioned information spring-boot 2.0 is using org.springframework.boot.actuate.jdbc.DataSourceHealthIndicator which tries to autowire the javax.sql.Datasource map which contains the datasource that is currently not being marked as initialized. As a workaround to not change the management port to a different one you could also consider to deactivate the db health indicator.

management.health.db.enabled=false

Afterwards you can define your own HealthChecks with the org.springframework.boot.actuate.health.HealthIndicator interface and test your datasources by your own.

Edit: The HealthIndicator is also not working because of injection of the datasources.

Hope to see the issue fixed soon.

@wilkinsona
Copy link
Member

@klopfdreh Thanks for the information. This is the first time that Actuator has apparently been involved in this issue. Could you share a sample that reproduces the problem you are describing? I'd like to see if it's a variant of one of the problems described above, or another case that we haven't seen before.

@klopfdreh
Copy link

Hi @wilkinsona - As mentioned in the original issue description and the mentioned link: spring-cloud/spring-cloud-commons#355
you can see that the stack trace is showing up the actuator endpoints stack strace entries - e.g.:

org.springframework.boot.actuate.autoconfigure.jdbc.DataSourceHealthIndicatorAutoConfiguration.getValidationQuery(DataSourceHealthIndicatorAutoConfiguration.java:115)

and the following lines.

Sadly I can't provide the source code but I just wanted to give the hint about the actuator.

@wilkinsona
Copy link
Member

@klopfdreh What version of Spring Cloud are you using? That part of the problem should have been fixed already by spring-cloud/spring-cloud-commons@0e259ea.

@klopfdreh
Copy link

klopfdreh commented May 31, 2019

@wilkinsona - no spring cloud, but spring-boot 2.1.x, but there the actuator endpoint is also affected. Could be that this is a project related issue. If I have more information I can share I am going to do so.

@wilkinsona
Copy link
Member

wilkinsona commented May 31, 2019

Thanks. Are you doing anything with your own custom scopes or prototype beans? IIRC, the various examples listed above all require one or the other for the problem to occur. You may be facing the same problem as described in this issue, or it may be something else. We've yet to see Actuator be the cause of the problem and I suspect it's not in your case and that what you're seeing is just a symptom.

Unfortunately, we won't be able to do any more than speculate without having some code to look at so fingers-crossed that you're able to share something. It doesn't have to be your exact code, just enough to reproduce the problem that you're seeing.

@klopfdreh
Copy link

klopfdreh commented May 31, 2019

I just tried to update to spring-boot 2.1.5 but facing the same error.

I just had a look in the initialization of the custom DataSource bean. The bean itself is created within a class using @Bean and checking against the same bean name (method name) with @ConditionalOnMissingBean. The class itself is a @Configuration with @ConditionalOnProperty.

In this case I guess the default scope is used (singleton) and it is not a prototype bean.

All in all there is no specific scope handling introduced.

I know that you can't do anything more without stacktrace, but I just wanted to provide those information about the actuator endpoint and that this is not only occuring in spring-cloud.

This is only an assumption, but is there any specific handling for datasources in kind of bean creation? This error seems to occure only with an additional DataSource provided as bean and in this particular scenario with the actuator endpoint, because @samo-esure also mentioned to change the managament port works (this also worked for me) - also the spring-cloud issue listed this in the stack. Also to disable the health endpoint or the db indicator worked.

So I assume that this is an issue around the DataSource injection within the actuator implementation or an issue related to the startup of this spring module.

@klopfdreh
Copy link

klopfdreh commented May 31, 2019

@wilkinsona - We found the issue and it is really hard to explain in detail. Basically it was an issue that the health endpoint was causing the DataSource bean to be created way earlier than without. Because of that shift in the order of the bean creation the configuration class in which the datasource was created was invoked a bit earlier and a setter (also autowired with required false) was resolved. Finally this setter was causing the circular dependency.

So the error was missleading and might also misslead in other scenarios to the assumption that the actuator health endpoint is the issue.

So my advice would be to ask devs facing this issue - for those who not defined custom scopes - to have a look in the context at which a custom datasource is created and search for circular dependencies in that area (especially with optional beans).

@jebeaudet
Copy link
Contributor

jebeaudet commented Jun 12, 2020

We just ran into this while trying to use org.springframework.jdbc.datasource.lookup.AbstractRoutingDataSource with some custom logic to which datasource each query will hit. I've put up a small sample here.

There are 2 configurations classes, ActualConditionsWeRanInto and DeadSimpleReproducingConditions, just enable any one of them to reproduce the problem.

The key here I believe is the @Primary annotation, the DataSourceInitializerInvoker gets invoked on the first DataSource created (not the primary one) and tries to create the primary instance in the ObjectProvider<Datasource>#getIfUnique method invocation, creating the cycle.

Our fix for now is to disable the DataSourceAutoConfiguration class but I'd rather avoid that, anything else we could do? We don't use initialization and the spring.datasource.initializationMode property gets checked after the problem so setting it to NEVER has no impact.

@wilkinsona wilkinsona added the theme: datasource Issues relating to data sources label Feb 10, 2021
@wilkinsona wilkinsona modified the milestones: 2.x, 2.5.x Feb 17, 2021
@wilkinsona wilkinsona self-assigned this Feb 17, 2021
@wilkinsona wilkinsona modified the milestones: 2.5.x, 2.5.0-M2 Feb 17, 2021
@wilkinsona wilkinsona added the status: noteworthy A noteworthy issue to call out in the release notes label Feb 17, 2021
@wilkinsona
Copy link
Member

This is noteworthy as the release notes should mention the new spring.datasource.initialization-order that controls whether initialization is performed before or after JPA.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: noteworthy A noteworthy issue to call out in the release notes theme: datasource Issues relating to data sources type: bug A general bug
Projects
None yet
Development

No branches or pull requests