Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[backport - sort of - 1.2] Prevent multiple stream processors - fix and narrow test #8101

Merged

Conversation

pihme
Copy link
Contributor

@pihme pihme commented Nov 1, 2021

Description

Changes the transition logic as follows:

  • preparation/cleanup is done for all steps (not just the steps started by the last iteration)
  • preparation/cleanup is done in the context of the next transition, not in the context of the last transition.

As a consequence, preparation/cleanup will be executed more often than transitionTo. This can also be seen in the log for the new test case. Essentially, when a transition is "skipped" then only it's transitionTo is skipped, but the cleanup is executed anyway. I think one could improve that by making the cleanup react to the cancel signal, but I want to be conservative here. Also, multiple cleanup calls should be fast, because if a cleanup succeeds it sets e.g. the stream processor to null in the context, and any subsequent call will do nothing if it finds no stream processor.

Previously:

  • The old transition did clean up the steps that were started by it
  • The cleanup assumed that the transitionTo will immediately follow, but this was not a given. The transitionTo might be cancelled, and might eventually transition to a completely different role.
  • So in essence, it did prepare for a role that maybe never came.

Related issues

closes #8044
subset of #8059
Essentially the same changes as #8062 (develop branch)

Review Hints

This PR is a subset of #8059. It contains the fix and commits related to the first round of review comments.
It does not contain the property based test, which is still the object of further discussion.

The purpose of this PR is to separate consensus from ongoing discussions.

The property based test has been run and in my mind validated the fix. Last week I also ran it with chains of six consecutive transitions and found no inconsistency. This gives me the confidence to merge the fix into the stable branch.

Please also indicate whether you think we need to add the property based test to stable/1.2. In my mind, it will be sufficient to introduce it in develop.

Definition of Done

Code changes:

  • The changes are backwards compatibility with previous versions
  • If it fixes a bug then PRs are created to backport the fix to the last two minor versions. You can trigger a backport by assigning labels (e.g. backport stable/0.25) to the PR, in case that fails you need to create backports manually.

Testing:

  • There are unit/integration tests that verify all acceptance criterias of the issue
  • New tests are written to ensure backwards compatibility with further versions
  • The behavior is tested manually
  • The change has been verified by a QA run
  • The impact of the changes is verified by a benchmark

Documentation:

  • The documentation is updated (e.g. BPMN reference, configuration, examples, get-started guides, etc.)
  • New content is added to the release announcement

Copy link
Member

@Zelldon Zelldon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you mentioned you will not follow my request I don't know what else I should review here (#8059 (comment) )

The purpose of this PR is to separate consensus from ongoing discussions.

The property based test has been run and in my mind validated the fix. Last week I also ran it with chains of six consecutive transitions and found no inconsistency. This gives me the confidence to merge the fix into the stable branch.

Please also indicate whether you think we need to add the property based test to stable/1.2. In my mind, it will be sufficient to introduce it in develop.

You know my opinion. I would like to have it on stable and close to the dev. Since we continue working on dev it might happen that dev property tests are green, but the property test in stable are not right? But you or @deepthidevaki might have an different opinion.

@deepthidevaki
Copy link
Contributor

IMO adding property test to the stable branch is a good idea. But it would be ok to do this as part of a separate PR because we are still discussing about the tests on the other PR.

Copy link
Contributor

@deepthidevaki deepthidevaki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.
Do we want to toggle the transition feature flag back to using the new one?

One small thing - This PR is mostly the backport of the fix done in develop, right? If yes, I would suggest to change the title as such and may be link the original PR in the PR description.

@pihme
Copy link
Contributor Author

pihme commented Nov 1, 2021

Do we want to toggle the transition feature flag back to using the new one?

Good question. @Zelldon @deepthidevaki @npepinpe What do you think?

@pihme pihme changed the title [1.2] Prevent multiple stream processors - fix and narrow test [backport - sort of - 1.2] Prevent multiple stream processors - fix and narrow test Nov 1, 2021
@pihme pihme requested a review from Zelldon November 1, 2021 15:33
@Zelldon Zelldon removed their request for review November 1, 2021 15:59
@pihme pihme mentioned this pull request Nov 1, 2021
9 tasks
With this commit the stream processor is always set in the transition context, regardless of whether it could
be started successfully or not. The main benefit is that the reference to the newly created stream processor
is always stored in the context, and so it can be found there and closed.

Previously, if the opening of the stream processor failed, it was in an undefined state - potentially running
and potentially registered in ActorScheduler which holds a permanent reference to it.
@pihme pihme force-pushed the 8044-prevent-multiple-stream-processors-fix-and-narrow-test branch from b544ef8 to c6cc35f Compare November 1, 2021 18:30
@pihme
Copy link
Contributor Author

pihme commented Nov 1, 2021

bors merge

ghost pushed a commit that referenced this pull request Nov 1, 2021
8101: [backport - sort of - 1.2] Prevent multiple stream processors - fix and narrow test r=pihme a=pihme

## Description

Changes the transition logic as follows:
- preparation/cleanup is done for all steps (not just the steps started by the last iteration)
- preparation/cleanup is done in the context of the next transition, not in the context of the last transition.

As a consequence, preparation/cleanup will be executed more often than transitionTo. This can also be seen in the log for the new test case. Essentially, when a transition is "skipped" then only it's transitionTo is skipped, but the cleanup is executed anyway. I think one could improve that by making the cleanup react to the cancel signal, but I want to be conservative here. Also, multiple cleanup calls should be fast, because if a cleanup succeeds it sets e.g. the stream processor to null in the context, and any subsequent call will do nothing if it finds no stream processor.

Previously:
- The old transition did clean up the steps that were started by it
- The cleanup assumed that the transitionTo will immediately follow, but this was not a given. The transitionTo might be cancelled, and might eventually transition to a completely different role.
- So in essence, it did prepare for a role that maybe never came.


## Related issues

closes #8044
subset of #8059 
Essentially the same changes as #8062 (develop branch)

## Review Hints
This PR is a subset of #8059. It contains the fix and commits related to the first round of review comments. 
It does not contain the property based test, which is still the object of further discussion.

The purpose of this PR is to separate consensus from ongoing discussions.

The property based test has been run and in my mind validated the fix. Last week I also ran it with chains of six consecutive transitions and found no inconsistency. This gives me the confidence to merge the fix into the stable branch.

Please also indicate whether you think we need to add the property based test to `stable/1.2`. In my mind, it will be sufficient to introduce it in develop.



Co-authored-by: pihme <pihme@users.noreply.github.com>
@ghost
Copy link

ghost commented Nov 1, 2021

Build failed:

Reason: Flaky test

@npepinpe
Copy link
Member

npepinpe commented Nov 1, 2021

We definitely want to enable it on develop. As for 1.2, how confident do we feel about it at the moment? I would defer to the judgement of those who were involved at the moment - however if you want, I can take a deeper look at the PR and the test coverage and give a more detailed opinion.

@pihme
Copy link
Contributor Author

pihme commented Nov 1, 2021

bors retry

ghost pushed a commit that referenced this pull request Nov 1, 2021
8101: [backport - sort of - 1.2] Prevent multiple stream processors - fix and narrow test r=pihme a=pihme

## Description

Changes the transition logic as follows:
- preparation/cleanup is done for all steps (not just the steps started by the last iteration)
- preparation/cleanup is done in the context of the next transition, not in the context of the last transition.

As a consequence, preparation/cleanup will be executed more often than transitionTo. This can also be seen in the log for the new test case. Essentially, when a transition is "skipped" then only it's transitionTo is skipped, but the cleanup is executed anyway. I think one could improve that by making the cleanup react to the cancel signal, but I want to be conservative here. Also, multiple cleanup calls should be fast, because if a cleanup succeeds it sets e.g. the stream processor to null in the context, and any subsequent call will do nothing if it finds no stream processor.

Previously:
- The old transition did clean up the steps that were started by it
- The cleanup assumed that the transitionTo will immediately follow, but this was not a given. The transitionTo might be cancelled, and might eventually transition to a completely different role.
- So in essence, it did prepare for a role that maybe never came.


## Related issues

closes #8044
subset of #8059 
Essentially the same changes as #8062 (develop branch)

## Review Hints
This PR is a subset of #8059. It contains the fix and commits related to the first round of review comments. 
It does not contain the property based test, which is still the object of further discussion.

The purpose of this PR is to separate consensus from ongoing discussions.

The property based test has been run and in my mind validated the fix. Last week I also ran it with chains of six consecutive transitions and found no inconsistency. This gives me the confidence to merge the fix into the stable branch.

Please also indicate whether you think we need to add the property based test to `stable/1.2`. In my mind, it will be sufficient to introduce it in develop.



Co-authored-by: pihme <pihme@users.noreply.github.com>
@pihme
Copy link
Contributor Author

pihme commented Nov 1, 2021

We definitely want to enable it on develop. As for 1.2, how confident do we feel about it at the moment? I would defer to the judgement of those who were involved at the moment - however if you want, I can take a deeper look at the PR and the test coverage and give a more detailed opinion.

On develop the old code has been removed for a month now, so there is no other option right now.

I feel confident to switch to the new code in stable/1.2.

@ghost
Copy link

ghost commented Nov 1, 2021

Build failed:

@pihme
Copy link
Contributor Author

pihme commented Nov 1, 2021

bors retry

@ghost
Copy link

ghost commented Nov 1, 2021

Build succeeded:

@ghost ghost merged commit dfbeff9 into stable/1.2 Nov 1, 2021
@ghost ghost deleted the 8044-prevent-multiple-stream-processors-fix-and-narrow-test branch November 1, 2021 21:04
@pihme pihme mentioned this pull request Nov 2, 2021
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants