Skip PRRL renewal on UNASSIGNED_SEQ_NO #44019

@DaveCTurner DaveCTurner commented Jul 5, 2019

Today when renewing PRRLs we assert that any invalid "backwards" renewals must
be because we are recovering the shard. In fact it's also possible to have
`checkpointState.globalCheckpoint == SequenceNumbers.UNASSIGNED_SEQ_NO` on a
tracked shard copy if the primary was just promoted and hasn't yet received
checkpoints from all of its peers.

This commit weakens the assertion to match.
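
To illustrate the shape of the change, here is a minimal, self-contained sketch. It is not the actual Elasticsearch code: the class `PrrlRenewalSketch`, the `CopyState` type, the `shouldRenew` helper and the "global checkpoint + 1" renewal target are all hypothetical simplifications, and the constant is a local stand-in for `SequenceNumbers.UNASSIGNED_SEQ_NO`. The idea is that a "backwards" renewal is skipped, and the only thing still asserted is that the copy's global checkpoint is unknown, whether or not the copy is tracked.

```java
// Hypothetical sketch only; names and structure are assumptions, not the real tracker code.
class PrrlRenewalSketch {
    // Local stand-in for SequenceNumbers.UNASSIGNED_SEQ_NO (value assumed for this sketch).
    static final long UNASSIGNED_SEQ_NO = -2L;

    /** Simplified, hypothetical view of what the primary knows about one shard copy. */
    static final class CopyState {
        final long globalCheckpoint;
        final boolean tracked;

        CopyState(long globalCheckpoint, boolean tracked) {
            this.globalCheckpoint = globalCheckpoint;
            this.tracked = tracked;
        }
    }

    /**
     * Returns true if the lease should be renewed, false if the renewal is skipped.
     * A renewal is "backwards" when it would retain a lower sequence number than
     * the lease currently retains.
     */
    static boolean shouldRenew(long currentlyRetainedSeqNo, CopyState state) {
        // Assumed renewal target: one past the copy's known global checkpoint.
        final long newRetainedSeqNo = state.globalCheckpoint + 1L;
        if (currentlyRetainedSeqNo <= newRetainedSeqNo) {
            return true;
        }
        // Weakened check: a backwards renewal is acceptable whenever the global
        // checkpoint is unknown, even on a tracked copy (for example just after
        // the primary was promoted). Note that state.tracked is deliberately
        // not consulted here.
        if (state.globalCheckpoint != UNASSIGNED_SEQ_NO) {
            throw new AssertionError("cannot renew lease retaining " + currentlyRetainedSeqNo
                + " with global checkpoint " + state.globalCheckpoint);
        }
        return false;
    }

    public static void main(String[] args) {
        // A tracked copy whose global checkpoint is not yet known: skip, do not fail.
        CopyState justPromoted = new CopyState(UNASSIGNED_SEQ_NO, true);
        System.out.println("renew? " + shouldRenew(5L, justPromoted));
    }
}
```

In this sketch the explicit `AssertionError` stands in for a Java `assert`, so the check fires even when assertions are disabled.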

Caught by a [failure of the full cluster restart
tests](https://scans.gradle.com/s/5lllzgqtuegty/console-log#L8605)

Relates elastic#41536
@DaveCTurner DaveCTurner added the >test and :Distributed/Recovery labels Jul 5, 2019
@DaveCTurner DaveCTurner requested a review from ywelsch July 5, 2019 15:48
@elasticmachine

Pinging @elastic/es-distributed

@ywelsch ywelsch left a comment

LGTM

@DaveCTurner DaveCTurner merged commit da3c901 into elastic:peer-recovery-retention-leases Jul 5, 2019
@DaveCTurner DaveCTurner deleted the 2019-07-05-prrls-may-not-know-gcps-at-renewal-time branch July 5, 2019 18:59
DaveCTurner added a commit that referenced this pull request Jul 5, 2019