Crash during rapid resharding #728
Comments
There's a core file for this crash in ~jdoliner/issue_728 on newton.
Working on this, by the way.
Come talk to me about this. I have a few things you could do to prevent #727 from getting in your way too much.
Keep running into hanging issues while trying to reproduce this. Even after your suggestions, I've got a secondary with its reactors stuck in
You had a cluster when you first ran into this issue, right? How many replicas for the table?
Yeah, a cluster of 3 machines. No replicas, meaning just the master.
I have never once observed this crash, not sure what else I can do.
Aha, finally got this error today, based off of the issue_727 branch, so this is unfortunately still around.
Wow, could this be another long-hidden bug in the snapshotting code? I feel for you guys :-|
Moving this to 1.5.x since there is no way we can track this down in time for 1.5.
@Tryneus -- what's the state of this issue?
No progress on this, still have not seen this again despite tons of testing. Until we have a consistent way to reproduce this, I can't really expect us to track it down.
Moving to 1.7.x
Just talked to Marc. This hasn't shown up in a while and we can't fix it until we have a consistent way to reproduce it. Moving to backlog for now.
Just noticed that this is probably exactly the problem I'm seeing here: #1380 (there was a second bug in that issue which is already fixed). I have a set of data files for a 3-node cluster with which this can be easily reproduced. I'm currently looking into this. Assigning to myself.
The changes in 17ed59b made the problem much more difficult to reproduce (which is good, because the bug is now relatively rare), but I still saw it popping up twice since then. Investigating further...
So we somehow have the following situation (in chronological order): Write transaction t1 acquires its first block (I assume the superblock). This initializes its version id. The snapshotting logic, it turns out, was designed under the assumption that a write transaction can never bypass another one. This seems to be violated somewhere, which might be a bug, or maybe it is by design?
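A minimal, self-contained sketch of that assumption, with hypothetical types and names throughout (this is not the actual cache code): each write transaction is assigned a version id when it acquires its first block, and the snapshotting logic assumes every block then sees strictly increasing versions. A block acquired out of order, like the stat block, breaks that assumption and trips the assert.

```cpp
#include <cassert>
#include <cstdint>
#include <iostream>

// Hypothetical, simplified model -- not RethinkDB's cache code.
struct block_t {
    uint64_t last_writer_version = 0;
};

struct write_txn_t {
    uint64_t version = 0;  // initialized when the first block is acquired
};

uint64_t next_version = 1;

// The transaction's version id is assigned on its *first* acquisition
// (typically the superblock).
void acquire(write_txn_t *txn, block_t *blk) {
    if (txn->version == 0) {
        txn->version = next_version++;
    }
    // Assumption baked into the snapshotting logic: a write transaction never
    // bypasses another one, so each block sees strictly increasing versions.
    assert(txn->version > blk->last_writer_version);
    blk->last_writer_version = txn->version;
}

int main() {
    block_t superblock, stat_block;
    write_txn_t t1, t2;

    acquire(&t1, &superblock);  // t1 gets version 1
    acquire(&t2, &superblock);  // t2 gets version 2
    acquire(&t2, &stat_block);  // stat block acquired out of order by t2
    acquire(&t1, &stat_block);  // t1 (version 1) now reaches a block last
                                // written by version 2 -> the assert fires
    std::cout << "not reached when assertions are enabled\n";
}
```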
There is strong evidence that our snapshotting logic cannot cope with the stats block, which we acquire in parallel_traversal.cc on line 480. It might be possible that our snapshotting code could actually handle such use cases with minor modifications, but I'm not sure. One solution would be to start a new transaction just to update the stat block each time we have to do so; we would lose the atomicity guarantees though, which is probably not very nice. The other alternative is to acquire the stat block once, and to only ever acquire it while we still hold the superblock. We would then hold on to the lock while the parallel traversal is proceeding.
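A rough sketch of what the second alternative could look like, using plain std::mutex stand-ins rather than the real superblock/stat-block acquisition types (all names here are made up): the stat block lock is taken while the superblock is still held, and kept for the whole traversal, so acquisitions of it can never be reordered relative to the superblock.

```cpp
#include <iostream>
#include <mutex>

std::mutex superblock_lock;   // stands in for the superblock acquisition
std::mutex stat_block_lock;   // stands in for the stat block acquisition

void write_with_parallel_traversal() {
    std::unique_lock<std::mutex> superblock(superblock_lock);
    // Take the stat block while the superblock is still held, so the ordering
    // of stat block acquisitions matches the superblock ordering.
    std::unique_lock<std::mutex> stat_block(stat_block_lock);
    superblock.unlock();  // the traversal no longer needs the superblock

    // ... parallel traversal runs here, updating stats under stat_block ...
    std::cout << "traversal done, stats updated atomically\n";

    // stat_block is released at scope exit; holding it for the whole
    // traversal costs concurrency but preserves ordering and atomicity.
}

int main() {
    write_with_parallel_traversal();
}
```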
I can confirm that this is a block that we do acquire out of order, which is by design. I don't think either of your solutions will work, for the reasons you give. How hard would it be to relax the constraint? Another option would be to use the same synchronization method for the stat block as we do for the secondary index block. One thing I'm not sure of is how we handle the stat block when we do an erase_range; it's been a while since I touched this code. That could be kind of hard to do using sindex_tokens for synchronization.
Thanks for chiming in. I'm going to have a deeper look at what it takes to expand the snapshotting logic today.
So it turns out that this can be worked around in the cache rather easily. However, with the implementation I have in mind, snapshotting semantics would be violated for the affected blocks. Let's look at an example with three transactions. First let's say that everything happens in order, and we do not run into the conflict that this issue is about:
Now let's swap steps 4 and 5. Our current implementation has undefined behavior in this case. The adapted one would do the following:
@jdoliner Would such a behavior be acceptable for the stats block? For a snapshotted read transaction, the stats block would (occasionally) appear inconsistent with the rest of the btree.
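For illustration only (made-up structures, not the change in aee406d): under the relaxed semantics a snapshotted read would still see the btree blocks at its snapshot version, but could observe a newer version of the stat block, which is the inconsistency being asked about.

```cpp
#include <cstdint>
#include <iostream>

// Hypothetical block versions as seen by the cache.
struct block_versions_t {
    uint64_t btree_block;
    uint64_t stat_block;
};

int main() {
    uint64_t snapshot_version = 2;    // version a read txn snapshots at
    block_versions_t seen{2, 3};      // a later write already hit the stat block

    // Strict semantics: every block the read sees is at version <= 2.
    // Relaxed semantics: the stat block alone may be served at version 3, so
    // it can look "ahead of" the rest of the btree within one snapshotted read.
    std::cout << "snapshot: " << snapshot_version
              << ", btree block seen: " << seen.btree_block
              << ", stat block seen: " << seen.stat_block << "\n";
}
```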
This is the proposed change, by the way: aee406d It does solve the crash (well, it removes the assertion, actually, but at the same time adds well-defined behavior for cases where it used to be violated).
Short answer: yes, those semantics are fine. Longer answer:
The fix is in code review 978.
This is in next as of commit c42fbc9.
... and also cherry-picked into v1.10.x as of d582a0d.
I uncovered the following crash while rapidly resharding:
To reproduce:
This issue is post-secondary-indexes, so they're a likely candidate, but I currently have no hypothesis for how they could be involved. I had no secondary indexes created at the time I hit this, so I can't think of a reason they'd be spawning parallel traversals. It seems more likely that it's the backfill parallel traversal that's erroring. But who knows.