couch_mrview_changes_since_tests gen_server failing with unknown_info #649
Comments
No, this is different. It just looks the same because it happens while getting the view state, and the cause is also a race condition. In the previous case we were crashing because the old index_server had exited or was already gone, so the exceptions there were not the same as the one here. On the other hand, why do we have a compaction happening during a test? I thought the new compaction daemon was supposed to be turned off for eunit runs?
I think this is because of #652 in the eunit test - but do we want to keep this issue to track fixing the race condition? Keep in mind that once the PR is merged, we won't easily reproduce this bug anymore.
Closing because PR#654 landed. Will reopen if it recurs.
Sadly, it recurred already. https://builds.apache.org/blue/organizations/jenkins/CouchDB/detail/master/64/pipeline/49/
The Makefile shows <0.22474.1> as the pid that is being monitored:
couch.log shows the compaction in the test, but a strange exit:
Looking into #649 I realized there's a pretty terrible race condition if an index is compacted quickly followed by an index update. Since we don't check the index updater message it would be possible for us to swap out a compaction change, followed by immediately resetting to the new state from the index updater. This would be bad as we'd possibly end up with a situation where our long lived index would be operating on a file that no longer existed on disk.
This was encountered during the test suite runs on Travis. It turns out that when we restart the indexer it's possible to already have the 'EXIT' message in our mailbox. When we do, we'll then crash with an unknown_info error since our updater pid was changed during the restart. This change simply filters any 'EXIT' message from the old updater out of the mailbox before restarting the new index updater. Fixes #649
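For readers unfamiliar with the idiom, the mailbox filtering described above can be sketched with a selective receive. This is only an illustration under stated assumptions, not the actual CouchDB change: the module name `flush_exit_sketch` and the function `flush_old_exit/1` are hypothetical, and the sketch assumes the calling process traps exits so that a linked updater's death arrives as an `{'EXIT', Pid, Reason}` message.

```erlang
%% Hypothetical sketch, not the code from the CouchDB commit.
-module(flush_exit_sketch).
-export([flush_old_exit/1]).

%% Remove one pending 'EXIT' message from OldPid, if it is already queued.
%% OldPid is bound, so the pattern matches only messages from that pid;
%% 'EXIT' messages from any other process stay in the mailbox.
%% `after 0` makes the receive return immediately when nothing matches.
flush_old_exit(OldPid) ->
    receive
        {'EXIT', OldPid, _Reason} -> flushed
    after 0 ->
        none
    end.
```

Calling something like this on the old updater pid before starting the new updater would mean a stale 'EXIT' from the old pid never reaches a handler that only recognizes the new pid.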
Possible recurrence of the #548 issue:
https://builds.apache.org/blue/organizations/jenkins/CouchDB/detail/master/57/pipeline/50
Searching on <0.22793.1> in the couch.log turned up:
@eiri This feels a lot like what we ran into previously.
/cc @kocolosk @janl @davisp