Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: closes pool maintainer on invalidation #784

Merged
merged 5 commits into from Jan 17, 2021

Conversation

thiagotnunes
Copy link
Contributor

@thiagotnunes thiagotnunes commented Jan 7, 2021

When the session pool is marked as invalid, we immediately close the pool maintainer in order to keep it from trying to replenish the pool. This way we prevent useless batch create sessions requests.

Fixes #768

When the session pool is marked as invalid, we immediately close the
pool maintainer in order to keep it from trying to replinish the pool.
This way we prevent useless batch create sessions requests.
@thiagotnunes thiagotnunes requested a review from as a code owner Jan 7, 2021
@google-cla google-cla bot added the cla: yes label Jan 7, 2021
@product-auto-label product-auto-label bot added the api: spanner label Jan 7, 2021
@thiagotnunes
Copy link
Contributor Author

@thiagotnunes thiagotnunes commented Jan 7, 2021

This fixes the following problem

  1. When a resource not found error occurs we set the exception in the session pool here.
  2. Because the exception was set in (1), the session pool becomes invalid as seen here.
  3. Since the session pool is invalid, a new one will be created if we try to get the database client (see here and here).
  4. On the invalid database client we delay closing it and buffer it to be closed when the spanner client is closed (see here).
  5. Because the invalid database client is not closed, the underlying pool maintainer is never closed (this method is not called).
  6. Since the method in (5) is never called the pool maintainer will try to keep maintaining the sessions (as seen here) for the invalid databases.
  7. The pool maintainer will try to keep creating sessions, even for the invalid session pools.
  8. Since we are repeatedly calling getDatabaseClient and creating new session pools, we end up accumulating more and more pool maintainers which are always invalid and will keep trying to create new sessions.
  9. The invalid session pool maintainers build up making the getDatabaseClient slower and slower.

@codecov
Copy link

@codecov codecov bot commented Jan 7, 2021

Codecov Report

Merging #784 (04ffcf5) into master (beeac09) will increase coverage by 0.05%.
The diff coverage is 93.33%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master     #784      +/-   ##
============================================
+ Coverage     85.01%   85.06%   +0.05%     
- Complexity     2561     2564       +3     
============================================
  Files           143      143              
  Lines         14005    14037      +32     
  Branches       1337     1344       +7     
============================================
+ Hits          11906    11941      +35     
+ Misses         1537     1533       -4     
- Partials        562      563       +1     
Impacted Files Coverage Δ Complexity Δ
...ain/java/com/google/cloud/spanner/SessionPool.java 89.24% <93.33%> (+0.42%) 73.00 <0.00> (+2.00)
...ain/java/com/google/cloud/spanner/SessionImpl.java 85.38% <0.00%> (-0.51%) 30.00% <0.00%> (ø%)
...va/com/google/cloud/spanner/AbstractResultSet.java 83.27% <0.00%> (-0.22%) 28.00% <0.00%> (ø%)
...ava/com/google/cloud/spanner/v1/SpannerClient.java 82.05% <0.00%> (ø) 63.00% <0.00%> (ø%)
...gle/cloud/spanner/v1/stub/SpannerStubSettings.java 94.53% <0.00%> (ø) 26.00% <0.00%> (ø%)
...spanner/admin/database/v1/DatabaseAdminClient.java 83.22% <0.00%> (ø) 100.00% <0.00%> (ø%)
...spanner/admin/instance/v1/InstanceAdminClient.java 79.14% <0.00%> (ø) 56.00% <0.00%> (ø%)
...in/database/v1/stub/DatabaseAdminStubSettings.java 93.54% <0.00%> (ø) 32.00% <0.00%> (ø%)
...in/instance/v1/stub/InstanceAdminStubSettings.java 93.61% <0.00%> (ø) 23.00% <0.00%> (ø%)
...om/google/cloud/spanner/TransactionRunnerImpl.java 84.86% <0.00%> (+0.16%) 9.00% <0.00%> (ø%)
... and 3 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update beeac09...04ffcf5. Read the comment docs.

Copy link
Contributor

@olavloite olavloite left a comment

Nice catch!

@olavloite
Copy link
Contributor

@olavloite olavloite commented Jan 7, 2021

@thiagotnunes I just thought of one small catch with this:

  1. When the session pool is being closed we keep track of the number of sessions and other resources that need to be closed.
  2. This is set here while executing the close method. This number also includes one for the session pool maintainer.
  3. The session pool maintainer (and other resources) will decrement the pendingClosures count when they actually close.
  4. The ApiFuture that is returned by SessionPool#close() method will be done when pendingClosures reach zero. That does not happen if the pool maintainer has already been closed, so the SessionPool is never reported as closed.

The above is probably also the reason that a number of the tests were cancelled, because they never finished (still waiting for the session pool to close).

The solution would be to only conditionally do +1 in step 2 above for the pool maintainer. If it has already been closed because the session pool is invalid, then we should not wait for it to close.

Copy link
Contributor

@olavloite olavloite left a comment

(Sorry for the flip-flopping, see my previous comment)

@thiagotnunes
Copy link
Contributor Author

@thiagotnunes thiagotnunes commented Jan 7, 2021

@olavloite Thanks for spotting this, will work on the fix.

thiagotnunes added 2 commits Jan 7, 2021
When closing the pool, only waits for the pool maintainer to close if it
has not been closed before.
Makes sure to close the pool maintainer only if it has not been closed
already. Also before returning to the caller, makes sure to mark the
closing as complete if there are no pending closures.
@thiagotnunes
Copy link
Contributor Author

@thiagotnunes thiagotnunes commented Jan 11, 2021

@olavloite Fixed the implementation, now we only close the poolMaintainer if it has not been closed before. Other than that, I had to add a check at the end of the closeAsync method to complete the closing future, if there is nothing left to be closed (in the previous implementation that was handled by the PoolMaintainer.close, since it called decrementPendingClosures).

@thiagotnunes
Copy link
Contributor Author

@thiagotnunes thiagotnunes commented Jan 14, 2021

Some tests seem to be getting stuck eventually, I think I introduced a bug somewhere. Will mark it as do not merge until I had the time to investigate.

@thiagotnunes thiagotnunes added the do not merge label Jan 14, 2021
Verifies that the pool maintainer is not closed before closing it. Also
moves the check of pendingClosures into the synchronized block to make
sure no stale reads are made.
@thiagotnunes thiagotnunes removed the do not merge label Jan 14, 2021
@thiagotnunes thiagotnunes merged commit d122ed9 into googleapis:master Jan 17, 2021
20 checks passed
@thiagotnunes thiagotnunes deleted the non-existing-database branch Jan 17, 2021
thiagotnunes added a commit that referenced this issue May 6, 2021
* fix: closes pool maintainer on invalidation

When the session pool is marked as invalid, we immediately close the
pool maintainer in order to keep it from trying to replinish the pool.
This way we prevent useless batch create sessions requests.

* fix: checks for pool maintainer closed status

When closing the pool, only waits for the pool maintainer to close if it
has not been closed before.

* fix: only closes pool maintainer if not closed

Makes sure to close the pool maintainer only if it has not been closed
already. Also before returning to the caller, makes sure to mark the
closing as complete if there are no pending closures.

* fix: avoids npe when closing pool maintainer

* fix: checks pool maintainer is not closed on close

Verifies that the pool maintainer is not closed before closing it. Also
moves the check of pendingClosures into the synchronized block to make
sure no stale reads are made.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: spanner cla: yes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants