Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LogStorageAppender should use the LeaderRole from correct term #6346

Closed
deepthidevaki opened this issue Feb 16, 2021 · 3 comments · Fixed by #6392 or #6426
Closed

LogStorageAppender should use the LeaderRole from correct term #6346

deepthidevaki opened this issue Feb 16, 2021 · 3 comments · Fixed by #6392 or #6426
Assignees
Labels
kind/bug Categorizes an issue or PR as a bug scope/broker Marks an issue or PR to appear in the broker section of the changelog severity/high Marks a bug as having a noticeable impact on the user with no known workaround

Comments

@deepthidevaki
Copy link
Contributor

deepthidevaki commented Feb 16, 2021

When LogStorageAppender appends a new record, it gets the current LeaderRole object for each append. Since the role transition in raft and zeebe happens asynchronously, when the role on raft has stepdown to follower, LogStorageAppender may be still running. Then the raft role can become a leader again at a newer term. This means that after a step down (and then again becoming a leader), the appender may try to append with a LeaderRole object at a newer term, while the LogStorageAppender is from a previous term.
https://github.com/zeebe-io/zeebe/blob/e76d9256080e4e3331cfdd786210780e9b00ea35/logstreams/src/main/java/io/zeebe/logstreams/storage/atomix/AtomixLogStorage.java#L49

If it was using the old LeaderRole object to append, it would not be able to append because the role is already closed. It would be better to get the LeaderRole object once, immediately after the role transition to leader. It would prevent inconsistencies as observed in
Ref: #6316 (comment)_

@deepthidevaki deepthidevaki added the kind/bug Categorizes an issue or PR as a bug label Feb 16, 2021
@deepthidevaki deepthidevaki added severity/low Marks a bug as having little to no noticeable impact for the user Impact: Data scope/broker Marks an issue or PR to appear in the broker section of the changelog and removed Status: Needs Triage labels Feb 16, 2021
@npepinpe
Copy link
Member

Generally I would like to separate the appender part from the leader role itself - if it helps here then we should do that first.

@npepinpe npepinpe added severity/high Marks a bug as having a noticeable impact on the user with no known workaround Status: Ready and removed severity/low Marks a bug as having little to no noticeable impact for the user Status: Needs Priority labels Feb 16, 2021
@Zelldon
Copy link
Member

Zelldon commented Feb 17, 2021

@deepthidevaki do you have time tomorrow to discuss this?

@Zelldon Zelldon self-assigned this Feb 17, 2021
@deepthidevaki
Copy link
Contributor Author

Yes.

@Zelldon Zelldon mentioned this issue Feb 19, 2021
9 tasks
@ghost ghost closed this as completed in 5871e47 Feb 23, 2021
@ghost ghost closed this as completed in #6392 Feb 23, 2021
ghost pushed a commit that referenced this issue Feb 24, 2021
6427: [Backport 0.25] Use always new appender r=Zelldon a=Zelldon

## Description

Backports #6392 

> Previous if the leader close transition took to long and the broker become leader again it could happen that the new appender was used by the old log storage appender. This is now prevented by not using a supplier. It requests the appender once on leader transition and always uses the same object.
>
> @deepthidevaki you mentioned that you want to prevent stuff based on different term. I think it is already something checked in the ZeebePartition regarding that. If this is not enough for your we can create a follow up issue.
<!-- Please explain the changes you made here. -->

There were some changes necessary since here the journal changes are not available, which means we need to use the ZeebeIndexMapping etc.

## Related issues

<!-- Which issues are closed by this PR or are related -->

closes #6346 

## Definition of Done

_Not all items need to be done depending on the issue and the pull request._

Code changes:
* [ ] The changes are backwards compatibility with previous versions
* [ ] If it fixes a bug then PRs are created to [backport](https://github.com/zeebe-io/zeebe/compare/stable/0.24...develop?expand=1&template=backport_template.md&title=[Backport%200.24]) the fix to the last two minor versions. You can trigger a backport by assigning labels (e.g. `backport stable/0.25`) to the PR, in case that fails you need to create backports manually.

Testing:
* [ ] There are unit/integration tests that verify all acceptance criterias of the issue
* [ ] New tests are written to ensure backwards compatibility with further versions
* [ ] The behavior is tested manually
* [ ] The change has been verified by a QA run
* [ ] The impact of the changes is verified by a benchmark 

Documentation: 
* [ ] The documentation is updated (e.g. BPMN reference, configuration, examples, get-started guides, etc.)
* [ ] New content is added to the [release announcement](https://drive.google.com/drive/u/0/folders/1DTIeswnEEq-NggJ25rm2BsDjcCQpDape)


Co-authored-by: Christopher Zell <zelldon91@googlemail.com>
ghost pushed a commit that referenced this issue Feb 24, 2021
6426: [Backport 0.26] Use always new appender r=Zelldon a=Zelldon

## Description

Backports #6392 

> Previous if the leader close transition took to long and the broker become leader again it could happen that the new appender was used by the old log storage appender. This is now prevented by not using a supplier. It requests the appender once on leader transition and always uses the same object.
>
> @deepthidevaki you mentioned that you want to prevent stuff based on different term. I think it is already something checked in the ZeebePartition regarding that. If this is not enough for your we can create a follow up issue.
<!-- Please explain the changes you made here. -->

There were some changes necessary since here the journal changes are not available, which means we need to use the ZeebeIndexMapping etc.

## Related issues

<!-- Which issues are closed by this PR or are related -->

closes #6346 

## Definition of Done

_Not all items need to be done depending on the issue and the pull request._

Code changes:
* [ ] The changes are backwards compatibility with previous versions
* [ ] If it fixes a bug then PRs are created to [backport](https://github.com/zeebe-io/zeebe/compare/stable/0.24...develop?expand=1&template=backport_template.md&title=[Backport%200.24]) the fix to the last two minor versions. You can trigger a backport by assigning labels (e.g. `backport stable/0.25`) to the PR, in case that fails you need to create backports manually.

Testing:
* [ ] There are unit/integration tests that verify all acceptance criterias of the issue
* [ ] New tests are written to ensure backwards compatibility with further versions
* [ ] The behavior is tested manually
* [ ] The change has been verified by a QA run
* [ ] The impact of the changes is verified by a benchmark 

Documentation: 
* [ ] The documentation is updated (e.g. BPMN reference, configuration, examples, get-started guides, etc.)
* [ ] New content is added to the [release announcement](https://drive.google.com/drive/u/0/folders/1DTIeswnEEq-NggJ25rm2BsDjcCQpDape)


Co-authored-by: Christopher Zell <zelldon91@googlemail.com>
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes an issue or PR as a bug scope/broker Marks an issue or PR to appear in the broker section of the changelog severity/high Marks a bug as having a noticeable impact on the user with no known workaround
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants