Use current epoch (epoch-0) when starting up projections instead of (epoch-1) #1460

Merged
merged 1 commit into from Oct 13, 2017

@shaan1337
Member

shaan1337 commented Oct 11, 2017

The epoch id is used to create the $projections-$control-{guid} streams which should be unique per projection run.

Previously, the prior epoch id (the one obtained before the latest elections: epoch-1) was used. Under certain circumstances, however, two nodes can end up using the same epoch id, which causes projections to behave abnormally (they get stuck on start up).

Example:

Node 1 is master with Epoch E1
Nodes 2 and 3 update their epoch state to E1
Node 3 is stopped
Node 2 becomes master and updates epoch to E2
Node 1 updates epoch state to E2 (Node 3 doesn't update because it's stopped)
Node 2 is stopped
Node 3 is started
Because StorageChaser (which updates the epoch) and Elections run in parallel, Node 3 can become master while its epoch state is still E1, and it then uses Epoch E1 when starting up projections
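
The sequence above can be replayed in a small model. This is only an illustrative Python sketch, not the project's code: `Node`, `become_master_old`, `become_master_new` and the string ids `E1`..`E3` are hypothetical, and the model assumes that under the old behaviour a newly elected master names its control stream after the epoch it knew before the election.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Node:
    name: str
    last_known_epoch: Optional[str] = None  # epoch state as seen by this node


def control_stream(epoch_id: str) -> str:
    # The epoch id is used to build the $projections-$control-{guid} stream name.
    return f"$projections-$control-{epoch_id}"


def become_master_old(node: Node, new_epoch: str) -> str:
    # Modelled old behaviour (assumption): use the epoch known before the election.
    used = node.last_known_epoch
    node.last_known_epoch = new_epoch
    return control_stream(used)


def become_master_new(node: Node, new_epoch: str) -> str:
    # Modelled new behaviour: use the epoch this master has just written.
    node.last_known_epoch = new_epoch
    return control_stream(new_epoch)


def run_scenario(become_master: Callable[[Node, str], str]) -> Tuple[str, str]:
    n1, n2, n3 = Node("node1"), Node("node2"), Node("node3")
    # Node 1 is master with E1; nodes 2 and 3 update their epoch state to E1.
    for n in (n1, n2, n3):
        n.last_known_epoch = "E1"
    # Node 3 is stopped (it keeps E1). Node 2 becomes master and writes E2.
    stream_node2 = become_master(n2, "E2")
    n1.last_known_epoch = "E2"
    # Node 2 is stopped, node 3 is started and wins the election
    # before its epoch state has caught up.
    stream_node3 = become_master(n3, "E3")
    return stream_node2, stream_node3


old = run_scenario(become_master_old)
new = run_scenario(become_master_new)
print("old:", old, "collision:", old[0] == old[1])
print("new:", new, "collision:", new[0] == new[1])
```

In this model the old behaviour gives both runs the same control stream name, while using the freshly written epoch keeps the names distinct.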

Resolution - Start projections only when we are master and the epoch has been written; that is, use epoch-0 instead of epoch-1.

  • Added a new message: SystemMessage.EpochWritten
  • Changed and improved the start up logic in ProjectionManager and ProjectionCoreCoordinator
  • Removed EpochId from BecomeMaster / BecomeSlave messages since we now obtain it from EpochWritten
  • Fixed tests, removing the epoch id parameter from BecomeSlave/BecomeMaster
  • Fixed tests to obtain the epoch id from SystemMessage.EpochWritten
  • Fixed tests to send the EpochWritten message before projections start
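
The start-up rule described above (start only once the node is master and its epoch has been written) might be sketched as follows. This is a minimal Python illustration under stated assumptions, not the actual ProjectionManager or ProjectionCoreCoordinator logic: the class `ProjectionStartGate` and its methods are hypothetical, and resetting the state on becoming slave is a design choice of this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProjectionStartGate:
    """Hypothetical gate: starts projections only when master AND epoch written."""
    is_master: bool = False
    epoch_id: Optional[str] = None       # taken from the EpochWritten message
    running_epoch: Optional[str] = None  # epoch of the current projection run
    started: List[str] = field(default_factory=list)

    def on_become_master(self) -> None:
        # BecomeMaster no longer carries an epoch id in this model.
        self.is_master = True
        self._try_start()

    def on_become_slave(self) -> None:
        # Assumption of this sketch: forget the epoch so a stale id is never reused.
        self.is_master = False
        self.epoch_id = None
        self.running_epoch = None

    def on_epoch_written(self, epoch_id: str) -> None:
        self.epoch_id = epoch_id
        self._try_start()

    def _try_start(self) -> None:
        ready = self.is_master and self.epoch_id is not None
        if ready and self.running_epoch != self.epoch_id:
            self.running_epoch = self.epoch_id
            self.started.append(f"$projections-$control-{self.epoch_id}")


gate = ProjectionStartGate()
gate.on_become_master()      # master, but no epoch written yet: nothing starts
gate.on_epoch_written("E2")  # both conditions hold: projections start
print(gate.started)
```

Handling the two messages symmetrically means the gate does not depend on the order in which they arrive.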
The epoch id is used to create the $projections-$control-{guid} streams which should be unique per projection run.

Currently, the previous epoch id (the one obtained before the latest elections: epoch-1) is used, but under certain circumstances it is possible for two nodes to end up using the same id.

Resolution - Start Projections only when we are master and the epoch has been written (Use epoch-0 instead of epoch-1):

Change the start up logic in ProjectionManager and ProjectionCoreCoordinator
Added a new message: SystemMessage.EpochWritten
Removed EpochId from BecomeMaster / BecomeSlave messages since we are now obtaining it from EpochWritten
Fix tests, removing epoch id parameter from BecomeSlave/BecomeMaster
Fix tests to obtain the epoch id from SystemMessage.EpochWritten

@hayley-jean hayley-jean merged commit 5a08df0 into release-v4.0.4 Oct 13, 2017

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
wercker/build-mono4 Wercker pipeline passed

@hayley-jean hayley-jean deleted the fix-epoch-late-update branch Oct 13, 2017
