Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Address concurrency issues in DequeRecycler close #41695

Closed
wants to merge 1 commit into from

Conversation

jaymode
Copy link
Member

@jaymode jaymode commented Apr 30, 2019

This change addresses some concurrency issues that can occur when
closing a DequeRecycler. The first issue that is addressed is a
ConcurrentModificationException that can occur when using a locked
DequeRecycler that is backed by an ArrayDeque. The ArrayDeque is not
thread safe and requires external synchronization. In most cases the
locked DequeRecycler handles this correctly but closing the
DequeRecycler is not protected by the lock, which can lead to the
ConcurrentModificationException if other threads are calling the obtain
method. Additionally, the DequeRecycler close method used an iterator
to go over all entries in the Deque and then later cleared them. The
close method now uses pollFirst in a loop to empty the Deque and still
execute the destroy method.

Finally, the ConcurrentDequeRecycler had an overzealous assertion in
the close method that the size of the Deque is equivalent to the
externally tracked size. This assertion is not always going to be true
due to the nature of the implementation. There is no lock guarding both
the deque and the size value, so there is always a chance that the two
could be wrong depending on ongoing requests. This assertion has been
removed and a comment has been added that mentions there can be some
discrepancies between the actual size of the deque and the externally
tracked size.

These issues may have always been present, but I believe the changes in
#39317 has made their presence known.

Closes #41683

This change addresses some concurrency issues that can occur when
closing a DequeRecycler. The first issue that is addressed is a
ConcurrentModificationException that can occur when using a locked
DequeRecycler that is backed by an ArrayDeque. The ArrayDeque is not
thread safe and requires external synchronization. In most cases the
locked DequeRecycler handles this correctly but closing the
DequeRecycler is not protected by the lock, which can lead to the
ConcurrentModificationException if other threads are calling the obtain
method. Additionally, the DequeRecycler close method used an iterator
to go over all entries in the Deque and then later cleared them. The
close method now uses pollFirst in a loop to empty the Deque and still
execute the destroy method.

Finally, the ConcurrentDequeRecycler had an overzealous assertion in
the close method that the size of the Deque is equivalent to the
externally tracked size. This assertion is not always going to be true
due to the nature of the implementation. There is no lock guarding both
the deque and the size value, so there is always a chance that the two
could be wrong depending on ongoing requests. This assertion has been
removed and a comment has been added that mentions there can be some
discrepancies between the actual size of the deque and the externally
tracked size.

These issues may have always been present, but I believe the changes in

Closes elastic#41683
@jaymode jaymode added >non-issue :Core/Infra/Core Core issues without another label v8.0.0 v7.2.0 labels Apr 30, 2019
@jaymode jaymode requested a review from jpountz April 30, 2019 16:30
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra

@jpountz
Copy link
Contributor

jpountz commented May 7, 2019

Thanks @jaymode for looking into it. I think #39317 introduced problems indeed by closing the recycler before making sure that there are no more ongoing tasks in the threadpool. I tried to think about how we could make sure that close is only called after all tasks have run but I'm now wondering whether it would be simpler to just remove close() from Recycler and PageCacheRecycler, and removing the call to PageCacheRecycler#close from Node#close? The only things that these close() calls are releasing is memory, which would be reclaimed anyway by the garbage collector since there is no point in keeping references to a closed node.

@jaymode
Copy link
Member Author

jaymode commented May 7, 2019

@jpountz that makes sense to me. I'll close this PR and open another to remove the close method

@jaymode jaymode closed this May 7, 2019
jaymode added a commit to jaymode/elasticsearch that referenced this pull request May 8, 2019
The changes in elastic#39317 brought to light some concurrency issues in the
close method of Recyclers as we do not wait for threads running in the
threadpool to be finished prior to the closing of the PageCacheRecycler
and the Recyclers that are used internally. elastic#41695 was opened to
address the concurrent close issues but upon review, the closing of
these classes is not really needed as the instances should be become
available for garbage collection once there is no longer a reference to
the closed node.

Closes elastic#41683
jaymode added a commit that referenced this pull request May 10, 2019
The changes in #39317 brought to light some concurrency issues in the
close method of Recyclers as we do not wait for threads running in the
threadpool to be finished prior to the closing of the PageCacheRecycler
and the Recyclers that are used internally. #41695 was opened to
address the concurrent close issues but upon review, the closing of
these classes is not really needed as the instances should be become
available for garbage collection once there is no longer a reference to
the closed node.

Closes #41683
jaymode added a commit that referenced this pull request May 10, 2019
The changes in #39317 brought to light some concurrency issues in the
close method of Recyclers as we do not wait for threads running in the
threadpool to be finished prior to the closing of the PageCacheRecycler
and the Recyclers that are used internally. #41695 was opened to
address the concurrent close issues but upon review, the closing of
these classes is not really needed as the instances should be become
available for garbage collection once there is no longer a reference to
the closed node.

Closes #41683
gurkankaymak pushed a commit to gurkankaymak/elasticsearch that referenced this pull request May 27, 2019
The changes in elastic#39317 brought to light some concurrency issues in the
close method of Recyclers as we do not wait for threads running in the
threadpool to be finished prior to the closing of the PageCacheRecycler
and the Recyclers that are used internally. elastic#41695 was opened to
address the concurrent close issues but upon review, the closing of
these classes is not really needed as the instances should be become
available for garbage collection once there is no longer a reference to
the closed node.

Closes elastic#41683
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Core/Infra/Core Core issues without another label >non-issue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CI] ConcurrentModificationException in DequeRecycler.close
3 participants