Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CUSTCOM-135 Fixed race condition on ConfigProviderResolver initialization #4480

Merged
merged 2 commits into from Feb 11, 2020
Merged

CUSTCOM-135 Fixed race condition on ConfigProviderResolver initialization #4480

merged 2 commits into from Feb 11, 2020

Conversation

dmatej
Copy link
Contributor

@dmatej dmatej commented Feb 10, 2020

  • instane() is synchronized
  • setInstance is NOT synchronized, so it ignores any locks
  • preferring @Inject over ConfigProvider in HealthCheckService
  • using @ContractsProvided to allow both @Inject target types
  • more logging, but only when configured
  • tested in extremely slowed TestContainers docker container

Important Info

Removed the timeout and countdouwn latch, because

  • it is more simple to diagnose deadlock without timeout
  • deadlock should never happen - if it happened, something is broken again => timeout is not a way to get out of it
  • this PR fixed the cause of timeout
  • this PR still allows inverted (incorrect) order of initialization.

Testing

New tests

Not pushed reproducer in PAYARA-4176 branch (not production quality, but stacktraces in logs visible)

Testing Performed

…tion

- instane() is synchronized
- setInstance is NOT synchronized, so it ignores any locks
- preferring @Inject over ConfigProvider in HealthCheckService
- using @ContractsProvided to allow both @Inject target types
- more logging, but only when configured
- tested in extremely slowed TestContainers docker container
@dmatej dmatej self-assigned this Feb 10, 2020
if (resolver != null && resolver != this) {
return resolver;
}
Thread.yield();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.yield() is a hint, which means that control isn't necessarily passed on to another Thread. A sleep with very small amount should be used.

Copy link
Contributor Author

@dmatej dmatej Feb 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used it on Linux+Intel/AMD and on Solaris/SPARC, even old versions, and it always worked (lower CPU load because it is not so aggressive as while loop, but fast, because it does not sleep).
Also it is only few time blocks after startup, so it does not slow down the Payara Startup. Sleep is unnecessary overkill, because I don't need to sleep running thread, I only tell the scheduler "at this moment I have nothing to do, but if other threads have some work, they can".
https://www.geeksforgeeks.org/java-concurrency-yield-sleep-and-join-methods/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instance() also has synchronized block, so there's another yield point there, so I believe this might be correct.

But still, I trust CountDownLatch more, but might be overkill for 1 producer to 1 consumer scenario.

// this will block possible concurrent access.
synchronized (ConfigProviderResolver.class) {
LOG.log(Level.CONFIG, "Setting global ConfigProviderResolver instance to {0}", this);
ConfigProviderResolver.setInstance(this);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is executed too early if you ask me. The current instance is set as value BEFORE the instance is completely setup as HK2 service. So MP config calls can get routed to this class before HK2 is finished processing, resulting in errors (NPE since the @Inject isn't processed?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, I was a bit too agressive and this is an antipattern (providing this before finishing constructor). But ... if we don't have other possibility, isn't the @PostConstruct method same situation?
From what I tested, @Inject will not cause NPE here, because HK2 will block any injection until it finishes PostConstruct method (constructor already finished). Bigger problem is programmatic usage of Globals class.

Copy link
Contributor

@pdudits pdudits Feb 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Semantics of postconstruct guarantees that fields are set. The situation we're dealing with here is that someone triggered mp config API before this service was initialised. Postconstruct is therefore almost proper time as it is considered initialized after the method exits

Injections of this class via globals is indeed wrong. But it is used by config sources, and those are created by this service. Plain constructor parameter would be better here.

Copy link
Contributor

@pdudits pdudits left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving registration to @PostConstruct will resolve all my objections.

if (resolver != null && resolver != this) {
return resolver;
}
Thread.yield();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instance() also has synchronized block, so there's another yield point there, so I believe this might be correct.

But still, I trust CountDownLatch more, but might be overkill for 1 producer to 1 consumer scenario.

- safer, all fields should be set here.
@dmatej
Copy link
Contributor Author

dmatej commented Feb 10, 2020

Jenkins test please

@dmatej dmatej merged commit 7ecfc39 into payara:master Feb 11, 2020
@dmatej dmatej deleted the CUSTCOM-135-startup-with-heavy-load branch February 11, 2020 17:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants