Inc store reference before refresh #28656

Merged
merged 1 commit into from Feb 13, 2018

Conversation

@jimczi
Member

jimczi commented Feb 13, 2018

If a tragic event happens while we are refreshing a searcher/reader, the engine can open new files on a store that is already closed.
For instance the following CI job failed because a refresh was concurrently called on a failing shard:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+oracle-java10-periodic/84
This change increments the ref count of the store during a refresh in order to postpone the closing after a tragic event.
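The idea can be illustrated with a minimal, self-contained sketch. This is not the code from the commit: `RefCountedStore`, its methods, and the `refresh(Runnable)` helper are hypothetical stand-ins, assumed here only to show how holding a reference for the duration of a refresh defers the actual close until the refresh has finished.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a ref-counted store; not the Elasticsearch Store class. */
final class RefCountedStore {
    // Starts at 1: the reference held by the owner (the engine in this sketch).
    private final AtomicInteger refCount = new AtomicInteger(1);
    private volatile boolean closed = false;

    /** Takes a reference, or fails if the last reference was already released. */
    void incRef() {
        while (true) {
            int current = refCount.get();
            if (current <= 0) {
                throw new IllegalStateException("store is already closed");
            }
            if (refCount.compareAndSet(current, current + 1)) {
                return;
            }
        }
    }

    /** Drops a reference; whoever drops the last one closes the store. */
    void decRef() {
        if (refCount.decrementAndGet() == 0) {
            closed = true; // a real store would release its files here
        }
    }

    boolean isClosed() {
        return closed;
    }

    /** Runs a refresh while holding a reference, so a concurrent close is postponed. */
    void refresh(Runnable openNewReader) {
        incRef();
        try {
            openNewReader.run();
        } finally {
            decRef();
        }
    }
}
```

In this sketch, if the owner calls `decRef()` while `refresh` is still running, the count only drops to 1, so the store stays open until the `finally` block releases the last reference.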

If a tragic event happens while we are refreshing a searcher/reader, the engine can open new files on a store that is already closed.
For instance the following CI job failed because a merge was concurrently called on a failing shard:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+oracle-java10-periodic/84
This change increments the ref count of the store during a refresh in order to postpone the closing after a tragic event.
@jimczi jimczi requested review from jpountz and s1monw Feb 13, 2018
@s1monw
s1monw approved these changes Feb 13, 2018
Contributor

s1monw left a comment

LGTM good catch

@jimczi jimczi merged commit 3b9f530 into elastic:master Feb 13, 2018
2 checks passed
CLA Commit author is a member of Elasticsearch
elasticsearch-ci Build finished.
@jimczi jimczi deleted the jimczi:refresh_inc_store branch Feb 13, 2018
jimczi added a commit that referenced this pull request Feb 13, 2018
@bleskes

Member

bleskes commented Feb 13, 2018

great catch. I think it's trappy that acquiring a lock isn't enough to keep things alive. I think this kind of failure can happen in many other ways. I'm wondering if we should increment the store count every time we issue a lock and decrement it when the lock is freed. Just a thought. This change is great.
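A rough sketch of what that suggestion could look like, purely as an illustration: `PinningLock` and `Releasable` below are hypothetical names, and the reference count is kept inside the lock class only to keep the example self-contained. Nothing here reflects how Elasticsearch actually issues its locks.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch: every lock holder also pins a ref-counted resource. */
final class PinningLock {
    /** Handle returned by acquire(); closing it frees the lock and the reference. */
    interface Releasable extends AutoCloseable {
        @Override
        void close(); // narrowed so callers need no checked-exception handling
    }

    private final ReentrantLock lock = new ReentrantLock();
    // Starts at 1: the reference held by the owner of the resource.
    private final AtomicInteger refCount = new AtomicInteger(1);

    /** Pins the resource first, then takes the lock. */
    Releasable acquire() {
        incRef();
        lock.lock();
        return () -> {
            lock.unlock();
            decRef();
        };
    }

    void incRef() {
        while (true) {
            int current = refCount.get();
            if (current <= 0) {
                throw new IllegalStateException("resource is already closed");
            }
            if (refCount.compareAndSet(current, current + 1)) {
                return;
            }
        }
    }

    void decRef() {
        refCount.decrementAndGet(); // a real resource would clean up at zero
    }

    boolean isClosed() {
        return refCount.get() == 0;
    }
}
```

With this shape, a caller that writes `try (PinningLock.Releasable r = pinned.acquire()) { ... }` keeps the resource alive for as long as it holds the lock, without having to remember a separate `incRef`/`decRef` pair.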

6 participants