New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wait for Replication and Indexing to complete before becoming leader #3187
Conversation
e68aca5
to
3e46122
Compare
|
src/EventStore.Core/Services/VNode/LastIndexableRecordFinder.cs
Outdated
Show resolved
Hide resolved
8d87ee8
to
9616d38
Compare
force pushed to single commit, full history is on the |
4edcd36
to
56e5b0e
Compare
src/EventStore.Core/Services/Storage/ReaderIndex/EventFilter.cs
Outdated
Show resolved
Hide resolved
2f24900
to
ec96d27
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great work 👍
i've reviewed and tested some scenarios, just a small suggestion:
[13173,14,11:30:56.838,INF] "InaugurationManager" in state (PreLeader, WaitingForConditions): Replication "DONE": 681,693,696/681,693,033. Remaining: -663.
[13173,14,11:30:56.838,INF] "InaugurationManager" in state (PreLeader, WaitingForConditions): Index "DONE": 681,693,321/681,693,033. Remaining: -288.
for the case when it's done can we hide the progress? (i.e hide : 681,693,321/681,693,033. Remaining: -288
) otherwise it seems a bit strange to see a negative value for Remaining
- This change ensures that before becoming leader we successfully write an epoch and get it replicated. Thus demonstrating that we are in a fit state to become leader. - Therefore the log is successfully replicated before we transition to leader. - We also wait for the records to be indexed before we transition to leader, otherwise incorrect event numbers can be written because the latest events are not known to the index or IndexWriter - The InaugurationManager is reponsible for making sure the conditions are met for transitioning to leader. - In the PreLeader state we now accept replication subscriptions and monitor for quorum, transitioning to Unknown if a quorum is not maintained (exactly as we did in the leader state before) Additionally, whenever we write an epoch we immediately write an $epoch-information event to the $epoch-information stream. This allows the InaugurationManager to know where the index checkpoint is supposed to reach. As a bonus this also gives the users a way to get details about the epochs. These events will not be present in an unfiltered $all read, for backwards compatibilty, but can be read directly from their stream.
ec96d27
to
057d398
Compare
yeah you're right it is a bit surprising. ive made it show 0 when there is nothing remaining, instead of negative |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚨 @timothycoleman Failed to create cherry Pick PR due to error:
RequestError [HttpError]: Merge conflict
at /home/runner/work/_actions/EventStore/Automations/master/cherry-pick-pr-for-label/node_modules/@octokit/request/dist-node/index.js:66:23
at processTicksAndRejections (internal/process/task_queues.js:93:5) {
name: 'HttpError',
status: 409,
headers: {
'access-control-allow-origin': '*',
'access-control-expose-headers': 'ETag, Link, Location, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Used, X-RateLimit-Resource, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval, X-GitHub-Media-Type, Deprecation, Sunset',
connection: 'close',
'content-length': '110',
'content-security-policy': "default-src 'none'",
'content-type': 'application/json; charset=utf-8',
date: 'Fri, 01 Oct 2021 08:54:23 GMT',
'referrer-policy': 'origin-when-cross-origin, strict-origin-when-cross-origin',
server: 'GitHub.com',
'strict-transport-security': 'max-age=31536000; includeSubdomains; preload',
vary: 'Accept-Encoding, Accept, X-Requested-With',
'x-content-type-options': 'nosniff',
'x-frame-options': 'deny',
'x-github-media-type': 'github.v3; format=json',
'x-github-request-id': '040B:086B:CF574F:1F39C8B:6156CCBF',
'x-ratelimit-limit': '1000',
'x-ratelimit-remaining': '990',
'x-ratelimit-reset': '1633082062',
'x-ratelimit-resource': 'core',
'x-ratelimit-used': '10',
'x-xss-protection': '0'
},
request: {
method: 'POST',
url: 'https://api.github.com/repos/EventStore/EventStore/merges',
headers: {
accept: 'application/vnd.github.v3+json',
'user-agent': 'octokit-core.js/3.3.2 Node.js/12.13.1 (linux; x64)',
authorization: 'token [REDACTED]',
'content-type': 'application/json; charset=utf-8'
},
body: '{"base":"cherry-pick-cherry-pick/3187/preleader-replication-master-256d5652-e8f5-46ca-a9d1-5fcfa56ea17b","commit_message":"Merge 057d39859ae46b41a86b4d86c290f781cc00c809 into cherry-pick-cherry-pick/3187/preleader-replication-master-256d5652-e8f5-46ca-a9d1-5fcfa56ea17b [skip ci]\\n\\n\\nskip-checks: true\\n","head":"057d39859ae46b41a86b4d86c290f781cc00c809"}',
request: { agent: [Agent], hook: [Function: bound bound register] }
},
documentation_url: 'https://docs.github.com/rest/reference/repos#merge-a-branch'
}
🚨👉 Check https://github.com/EventStore/EventStore/actions/runs/1294387899
Fixed: race condition where a node becomes leader and immediately writes to streams that have recently been written to but not indexed, resulting in events with incorrect numbers in their streams.
This change ensures that before becoming leader we successfully write an epoch and get it replicated. Thus demonstrating that we are in a fit state to become leader.
Therefore the log is successfully replicated before we transition to leader.
We also wait for the records to be indexed before we transition to leader, otherwise incorrect event numbers can be written because the latest events are not known to the index or IndexWriter
The InaugurationManager is reponsible for making sure the conditions are met for transitioning to leader.
In the PreLeader state we now accept replication subscriptions and monitor for quorum, transitioning to Unknown if a quorum is not maintained (exactly as we did in the leader state before)
Works as a rolling upgrade from 20.10.0