Reset logunit server #1052
Conversation
Hm. Can you clarify why healing can only be done if the server is reset? Can't the healing node take an intersection of the current log state and its state?
Overall, looks fine, but I wonder if (1) we really need to reset to heal a node, and (2) if we should find a way to "protect" this API.
Maybe make it only accessible from a special administrative epoch?
@Override
public void reset() {
extra space?
Fixed.
Changes Unknown when pulling b3befa0 on zalokhan:resetLogunit into CorfuDB:master.
@no2chem There can be a case where the head of the chain (and also the primary sequencer) was ahead of the others and crashed (address 100). The backup sequencer is bootstrapped with a token from the maximum address seen by the remaining log units (address 50). Now clients can write a different set of data to the new chain for addresses 50-100. The crashed head then tries to recover. The intersection of the log addresses from the healing node and the current state will be 0-100 but will contain inconsistent data. Let me know if I understood and answered your question correctly.
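To make the divergence concrete, here is a small, hypothetical Java illustration (not CorfuDB code; the address ranges and payloads are made up to mirror the example above). It shows that the healed head and the current chain overlap on addresses 51-100 yet hold different data there, so intersecting address ranges alone is unsafe.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; these maps stand in for per-address log state.
public class DivergenceExample {
    public static void main(String[] args) {
        // Log of the crashed head before it failed: it had written addresses 0-100.
        Map<Long, String> crashedHead = new HashMap<>();
        for (long addr = 0; addr <= 100; addr++) {
            crashedHead.put(addr, "old-" + addr);
        }

        // Log of the surviving chain: it only saw up to address 50, and the
        // backup sequencer re-issued tokens 51-100 to clients writing new data.
        Map<Long, String> survivingChain = new HashMap<>();
        for (long addr = 0; addr <= 50; addr++) {
            survivingChain.put(addr, "old-" + addr);
        }
        for (long addr = 51; addr <= 100; addr++) {
            survivingChain.put(addr, "new-" + addr);
        }

        // The address sets intersect on 0-100, but the payloads diverge at 51,
        // so keeping the healing node's entries for the intersection is unsafe.
        for (long addr = 0; addr <= 100; addr++) {
            if (!survivingChain.get(addr).equals(crashedHead.get(addr))) {
                System.out.println("First conflicting address: " + addr); // prints 51
                break;
            }
        }
    }
}
```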
b3befa0 to c40ecc9 (Compare)
Ok, that makes much more sense now. But wouldn't a partial reset be better than a complete reset? I guess that's just an optimization (drop only 50 to the tail), but it seems quite important for performance. We can do that in a separate PR.
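For reference, a partial reset along those lines might look something like the hypothetical sketch below (it assumes an in-memory, address-ordered map as the log store; the class and field names are illustrative, not the actual log unit implementation). The hard part, as the following comments discuss, is determining the trim point.

```java
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical sketch of a partial reset: drop only addresses >= trimPoint.
public class PartialResetSketch {
    private final ConcurrentNavigableMap<Long, byte[]> addressSpace =
            new ConcurrentSkipListMap<>();

    /** Drops every entry from {@code trimPoint} (inclusive) to the tail of the log. */
    public void resetFrom(long trimPoint) {
        // e.g. resetFrom(51) keeps addresses 0-50 and drops 51 through the tail.
        addressSpace.tailMap(trimPoint, true).clear();
    }
}
```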
But how do you determine up to which address we should drop the log entries? You would have to do a deep reading of all the entries to figure out where the log stream branches off. I guess this would require access to the deserializers and would not be feasible.
Changes Unknown when pulling c40ecc9 on zalokhan:resetLogunit into CorfuDB:master.
I see. The problem is the lack of a lease. If the previous sequencer had a lease of, say, 10k entries, then this problem would go away.
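A rough sketch of the lease idea (hypothetical, not an existing sequencer API; the names and lease size are assumptions): if the primary sequencer holds a lease on a fixed block of addresses, the backup sequencer can skip past the entire leased block when it bootstraps, so no address the crashed head may have written is ever re-issued.

```java
// Hypothetical sketch of a leased sequencer bootstrap; not CorfuDB code.
public class LeasedSequencerBootstrap {
    private static final long LEASE_SIZE = 10_000; // e.g. a 10k-entry lease, as suggested above

    private long nextToken;

    /**
     * @param maxAddressSeenByLogUnits highest address the surviving log units report (e.g. 50)
     * @param leaseStartOfFailedPrimary first address of the failed primary's current lease
     */
    public LeasedSequencerBootstrap(long maxAddressSeenByLogUnits, long leaseStartOfFailedPrimary) {
        // Start past the whole leased range, not just past what the log units saw,
        // so addresses the crashed head may have written are never handed out again.
        this.nextToken = Math.max(maxAddressSeenByLogUnits + 1,
                                  leaseStartOfFailedPrimary + LEASE_SIZE);
    }

    public synchronized long nextToken() {
        return nextToken++;
    }
}
```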
@no2chem don't merge right away, there are other reviewers looking at this PR.
I think resetLogUnit should block all other calls; right now, writes can interleave with a reset, which can result in a dirty state. One way of achieving this is by setting the logging unit's state to not ready and then doing the reset.
@Maithem To achieve this, should the reset call go through the batchwriter? This can ensure synchronization.
@Maithem sorry about the merge. But concurrency shouldn't be an issue here. The logunit epoch should be sealed while this is happening.
@no2chem No worries. Let's consider the system as a whole; I think in general we need to minimize the work that needs to happen between a seal and layout changes. In the case of chain replication, I think this is fine since the whole chain has to be ready for writes to go through, but in the case of a quorum, why block the whole system if it can still accept requests?
@zalokhan Readers are not blocked by the writer thread, so I don't think that would work.
In the case of a quorum we probably need leases, I suspect. Either way, you're hopefully reconfiguring in order to reset, so a seal should occur...
I'm not saying we shouldn't do a seal; I'm saying the work after the seal should be minimized. Ok, we can think more about the quorum case later. I think there is another issue: consider the case where the batch writer is writing while a reset occurs. This is a race condition that would leave the logging unit in a bad state. I think before the reset happens, we either need to wait for all writes to succeed, or just cancel all pending writes. Essentially, the LU pipelines need to be flushed before a reset is issued. Moreover, I think this is a dangerous operation and the API should be protected somehow.
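As a rough illustration of the "flush the pipeline, then reset" ordering, and of gating the call on the sealed epoch, here is a hypothetical sketch; the class, field, and method names are made up and this is not the actual batch writer or log unit server code.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch: drain pending writes and check the epoch before resetting.
public class GuardedReset {
    private final ConcurrentLinkedQueue<CompletableFuture<Void>> pendingWrites =
            new ConcurrentLinkedQueue<>();
    private volatile long serverEpoch;

    /** Resets only if the request carries the current (sealed) epoch, after flushing writes. */
    public void reset(long requestEpoch) {
        if (requestEpoch != serverEpoch) {
            throw new IllegalStateException("Stale epoch, refusing reset: " + requestEpoch);
        }
        // Flush the pipeline: wait for every outstanding write to complete first.
        CompletableFuture<Void> pending;
        while ((pending = pendingWrites.poll()) != null) {
            pending.join();
        }
        clearDataAndPersistedState(); // the actual reset work happens only after the flush
    }

    private void clearDataAndPersistedState() {
        // Placeholder for dropping the in-memory cache and deleting on-disk segments.
    }
}
```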
Overview
Description: Resets the log unit server by clearing all data and persisted state.
Why should this be merged: This is a requirement to heal failed nodes.
The healing nodes need to be added back to the chain, and this can only be done once the state on the log unit server of the healing node is reset.
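For context, "clearing all data and persisted state" might look roughly like the following hypothetical sketch; the cache field, directory location, and class name are assumptions for illustration, not the actual server implementation.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a log unit reset: wipe the in-memory cache and on-disk segments.
public class LogUnitResetSketch {
    private final Map<Long, byte[]> dataCache = new ConcurrentHashMap<>();
    private final Path logDir = Paths.get("/tmp/corfu-log"); // assumed log directory

    public void reset() throws IOException {
        dataCache.clear(); // drop all cached entries
        try (DirectoryStream<Path> segments = Files.newDirectoryStream(logDir)) {
            for (Path segment : segments) {
                Files.deleteIfExists(segment); // remove persisted log segments
            }
        }
    }
}
```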
Checklist (Definition of Done):