Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix deadlock and couple more problems in DefaultConnectionPool #699

Merged
merged 4 commits into from May 1, 2021

Conversation

stIncMale
Copy link
Member

@stIncMale stIncMale commented Apr 25, 2021

I discovered 3 problems in DefaultConnectionPool and ConcurrentPool introduced by me in #685.

  1. [deadlock] See this commit message for the details.
  2. [not expecting an exception where it may happen] Removing from OpenConcurrencyLimiter.desiredConnectionSlots despite not putting anything there previously. See this commit message for the details.
  3. [double release on exception] The method ConcurrentPool.ensureMinSize explicitly requires a caller to release resources if initialize throws an exception, but then releases the permit for newItem itself, which leads to double release. This was by far the hardest problem to investigate and find the cause. See this commit message for more details.

Commits in this PR can be reviewed one by one.

JAVA-3927

This commit fixes two problems in the `OpenConcurrencyLimiter`:
- deadlock in `DefaultConnectionPool`;
- potential removal from `OpenConcurrencyLimiter.desiredConnectionSlots` when its empty.

Deadlock

a) Before introducing OpenConcurrencyLimiter, "AsyncGetter" thread was used only
to do blocking `ConcurrentPool.get`.
b) After, I started to additionally use "AsyncGetter" to do
blocking `OpenConcurrencyLimiter.waitUntilOpenPermitAvailable`.

As a result, we may have a thread that gets the last connection (`maxSize` is reached)
from `ConcurrentPool` and submits `waitUntilOpenPermitAvailable` to "AsyncGetter".
Concurrently with this happening, the "AsyncGetter" tries to get from `ConcurrentPool`
and is blocked because there are no more connections available. In such an execution,
`waitUntilOpenPermitAvailable` cannot be completed by "AsyncGetter" because
"AsyncGetter" is blocked doing a different task, which itself cannot be completed.

a solution is to do `ConcurrentPool.get` and `waitUntilOpenPermitAvailable`
in different threads. This way these two different kinds of tasks
will not block each other by waiting in the same queue to be done by a single thread.

Potential removal from `OpenConcurrencyLimiter.desiredConnectionSlots` when its empty.

If `acquirePermitOrGetAvailableOpenedConnection` is called with `true` as `tryGetAvailable`,
and `getPooledConnectionImmediately` throws an exception,
then `expressDesireToGetAvailableConnection` is not called
but `giveUpOnTryingToGetAvailableConnection` is still called, which is incorrect.

JAVA-3928
@stIncMale stIncMale requested review from rozza and jyemin April 25, 2021 14:06
private static Stream<Arguments> concurrentUsageArguments() {
return Stream.of(
Arguments.of(0, 1, 8, true, false, 0.02f),
Arguments.of(0, 1, 8, false, true, 0.02f),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of all 5 parameter sets, only this one reproduced the deadlock.

@@ -710,7 +698,7 @@ public Prune shouldPrune(final UsageTrackingInternalConnection usageTrackingConn
}

/**
* Package-private methods are thread-safe,
* Package-access methods are thread-safe,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"package-private" is not a term the JLS uses, while "package-access" is.

@@ -874,7 +864,7 @@ private PooledConnection acquirePermitOrGetAvailableOpenedConnection(final boole
return availableConnection;
} finally {
try {
if (tryGetAvailable && availableConnection == null) {//the desired connection slot has not yet been removed
if (expressedDesireToGetAvailableConnection && availableConnection == null) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fixes the second problem I discovered while implementing pausable pool. getPooledConnectionImmediately there may throw an exception, which in turn may result in this code calling giveUpOnTryingToGetAvailableConnection despite expressDesireToGetAvailableConnection not being called.

This is an example of how one should try not to rely if possible on methods not throwing exceptions.

*/
private long awaitNanos(final long timeoutNanos) {
private long awaitNanos(final Condition condition, final long timeoutNanos) throws MongoInterruptedException {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change makes the behavior of this method consistent with Timeout and allows to see explicitly what condition it waits for on the calling side.

@stIncMale stIncMale changed the title Fix deadlock in DefaultConnectionPool Fix deadlock couple more problems in DefaultConnectionPool Apr 27, 2021
On one hand `ConcurrentPool.ensureMinSize` tries to not require the caller to do
what can be done by the method itself, e.g., it releases the connection to the pool itself.
On the other hand, always releasing in `ensureMinSize` may lead
to releasing a permit for the same connection twice, thus not respecting the `maxSize`.

It is not easy (requires an additional knob and logic)
to prevent the caller (`DefaultConnectionPool`) from releasing a connection
when initialization fails. It is, therefore, seems better to mandate that the caller
releases a connection if initialization fails.

JAVA-3927
@stIncMale stIncMale changed the title Fix deadlock couple more problems in DefaultConnectionPool Fix deadlock and couple more problems in DefaultConnectionPool Apr 27, 2021
Copy link
Member

@rozza rozza left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question but LGTM

@jyemin jyemin self-requested a review April 30, 2021 19:05
Copy link
Contributor

@jyemin jyemin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@stIncMale stIncMale merged commit 73e3085 into mongodb:master May 1, 2021
@stIncMale stIncMale deleted the JAVA-3927 branch May 1, 2021 16:18
ispringer pushed a commit to evergage/mongo-java-driver that referenced this pull request May 16, 2023
…godb#699)

This commit fixes three problems in `DefaultConnectionPool`/`ConcurrentPool`:
- deadlock in `DefaultConnectionPool`;
- unnecessary and harmful removal from `OpenConcurrencyLimiter.desiredConnectionSlots`;
- double release in `DefaultConnectionPool`/`ConcurrentPool`.


Deadlock in `DefaultConnectionPool`

a) Before introducing OpenConcurrencyLimiter, "AsyncGetter" thread was used only
to do blocking `ConcurrentPool.get`.
b) After, I started to additionally use "AsyncGetter" to do
blocking `OpenConcurrencyLimiter.waitUntilOpenPermitAvailable`.

As a result, we may have a thread that gets the last connection (`maxSize` is reached)
from `ConcurrentPool` and submits `waitUntilOpenPermitAvailable` to "AsyncGetter".
Concurrently with this happening, the "AsyncGetter" tries to get from `ConcurrentPool`
and is blocked because there are no more connections available. In such an execution,
`waitUntilOpenPermitAvailable` cannot be completed by "AsyncGetter" because
"AsyncGetter" is blocked doing a different task, which itself cannot be completed.

A solution is to do `ConcurrentPool.get` and `waitUntilOpenPermitAvailable`
in different threads. This way these two different kinds of tasks
will not block each other by waiting in the same queue to be done by a single thread.


Potential unnecessary and harmful removal from `OpenConcurrencyLimiter.desiredConnectionSlots`

If `acquirePermitOrGetAvailableOpenedConnection` is called with `true` as `tryGetAvailable`,
and `getPooledConnectionImmediately` throws an exception,
then `expressDesireToGetAvailableConnection` is not called
but `giveUpOnTryingToGetAvailableConnection` is still called, which is incorrect.


Double release in `DefaultConnectionPool`/`ConcurrentPool`

On one hand `ConcurrentPool.ensureMinSize` tries to not require the caller to do
what can be done by the method itself, e.g., it releases the connection to the pool itself.
On the other hand, always releasing in `ensureMinSize` may lead
to releasing a permit for the same connection twice, thus not respecting the `maxSize`.

It is not easy (requires an additional knob and logic)
to prevent the caller (`DefaultConnectionPool`) from releasing a connection
when initialization fails. It is, therefore, seems better to mandate that the caller
releases a connection if initialization fails.

JAVA-3927
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants