Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DRAFT: ProcessSchedulingService no extra submit #10390

Closed
wants to merge 3 commits into from

Conversation

Zelldon
Copy link
Member

@Zelldon Zelldon commented Sep 16, 2022

Description

@npepinpe I need your input here

I removed the additional submit, but it turns out to be needed for the tests, since Scheduling timers from another Actor is not allowed somehow (?).

The timer will be added to the timer queue which is part of the ActorThread. Tbh I see not really an issue here, even if this is another thread. When the timer is due the ActorTask will be enqueued and run from some ActorThread 🤷 Maybe I oversee something. Do you think we can remove the restriction?

Related issues

closes #10291

Definition of Done

Not all items need to be done depending on the issue and the pull request.

Code changes:

  • The changes are backwards compatibility with previous versions
  • If it fixes a bug then PRs are created to backport the fix to the last two minor versions. You can trigger a backport by assigning labels (e.g. backport stable/1.3) to the PR, in case that fails you need to create backports manually.

Testing:

  • There are unit/integration tests that verify all acceptance criterias of the issue
  • New tests are written to ensure backwards compatibility with further versions
  • The behavior is tested manually
  • The change has been verified by a QA run
  • The impact of the changes is verified by a benchmark

Documentation:

  • The documentation is updated (e.g. BPMN reference, configuration, examples, get-started guides, etc.)
  • New content is added to the release announcement
  • If the PR changes how BPMN processes are validated (e.g. support new BPMN element) then the Camunda modeling team should be informed to adjust the BPMN linting.

Please refer to our review guidelines.

@@ -136,7 +136,7 @@ public ActorFuture<Void> call(final Runnable action) {
* @return
*/
public ScheduledTimer runDelayed(final Duration delay, final Runnable runnable) {
ensureCalledFromWithinActor("runDelayed(...)");
// ensureCalledFromWithinActor("runDelayed(...)");
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The timer will be added to the timer queue which is part of the ActorThread. Tbh I see not really an issue here, even if this is another thread. When the timer is due the ActorTask will be enqueued and run from some ActorThread 🤷 Maybe I oversee something. Do you think we can remove the restriction?

@npepinpe or @saig0 maybe you have an idea?

Copy link
Member

@npepinpe npepinpe Sep 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC the DeadlineTimerWheel (which the ActorTimerQueue extends) is not thread safe, so modifying it from a different actor may cause race conditions.

It's a little unclear how the check that the current thread's job is, well, us ("us", as in, this actor) - meaning I guess we check if the thread is currently owned by us, so no other concurrent access to the timer queue can be done. I guess that's the reasoning, and why it's OK then to use a non-thread safe queue?

A little like how it's OK to prepend a job to the fast lane when the same check is true (the fast lane being a non-thread safe queue).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah good point so it is about the DeadlineTimerWheel, why we have to use our thread. So one possibility would be to make it thread safe? 🤔

Here btw is the check https://github.com/camunda/zeebe/blob/main/scheduler/src/main/java/io/camunda/zeebe/scheduler/ActorControl.java#L363-L377 where we verify whether it is our OWN actor 🙈

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's the one 😄 To me it's more confusing - I mean I know what it does, but I don't think it's obvious the first time you read it what it guarantees 😄

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know about making the actor timer queue thread safe. I guess there's no real harm other than maybe performance, but I wasn't there when this was decided and it's not documented, so maybe we'll find out there was a good reason 😄

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@npepinpe another idea. What do you think of we keep the additional submit, to keep it thread-safe, but create an own actor for the PRocessingScheduleService. This might reduce the potential delays (since the actor only has to schedule and execute timers). This would also solve other issues, like blocking processing actor when having many timers or timeouts. If you agree then I would create another PR where I migrate the ProcessingScheduleService to an own actor with own

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, so upside is that it's not blocking the processing actor anymore (which is nice), downside is that there is some communication between both actors (in the form of the abort condition, the batch writer, etc.) which probably gets more complicated. I don't think it's too crazy for it to have its own writer tbh, and maybe even own abort condition (which is controlled when closing the whole stream processor), so maybe it's not too bad. OTOH I think it's a good idea

@github-actions
Copy link
Contributor

Test Results

   547 files   -    312     547 suites   - 312   10m 34s ⏱️ - 1h 41m 26s
3 744 tests  - 3 082  3 738 ✔️  - 3 077  6 💤  - 5  0 ±0 
3 905 runs   - 3 105  3 899 ✔️  - 3 100  6 💤  - 5  0 ±0 

Results for commit 222f027. ± Comparison against base commit 9649e4d.

zeebe-bors-camunda bot added a commit that referenced this pull request Sep 21, 2022
10428: Remove the extra submit from ProcessingScheduleServiceImpl r=Zelldon a=Zelldon

## Description


And another round! 😁 After #10414 and #10390 
I think I found a good way to replace the additional Actor#submit from the production code and move it to where it is necessary in tests.

As described here #10414 (comment) it is not easily possible to create a separate Actor for the ScheduleService, due to concurrency issues and further work we would need to do here.

## Details

In this PR I iterate over the ProcessingScheduleServiceImpl and remove the StreamProcessorContext from the constructor.
The service gets all necessary dependencies injected on the constructor (except the actor), mostly as suppliers. This allows to lazy load dependencies and reduce dependencies for tests. It allowed to write some more unit-test like tests.

The service gets an open method to initialize the service with the actor control and create its own writer. This allows us to no longer share the writer with the StreamProcessor, since this might lead to issues especially if we now run in between (during processing phases).

Due to the later initializing of the writers we reduce resources on followers, if the service is never scheduled we don't need to create a writer.

When the StreamProcessor is in the closing phase it will close the scheduled service as well.
<!-- Please explain the changes you made here. -->

## Related issues

<!-- Which issues are closed by this PR or are related -->

closes #10414
closes #10390 
closes #10291



Co-authored-by: Christopher Zell <zelldon91@googlemail.com>
zeebe-bors-camunda bot added a commit that referenced this pull request Sep 21, 2022
10418: Add monitor backup API endpoint r=npepinpe a=npepinpe

## Description

This PR extends the backup management endpoint to add a monitoring API under `GET /{id}` (e.g. if the actuator is at `http://localhost:9600/actuator/backups`, then `GET http://localhost:9600/actuator/backups/{id}`).

At the moment we use the internal types provided by the broker client as the response types, but in the future this will be replaced by the OpenAPI generated types. As such, I didn't think it was necessary to create user facing types and write some mapping code, since we will scrape it and replace it. However, I understand that we don't know when that will done, so I'm open to doing it anyway for safety.

The acceptance tests were updated (and renamed to better reflect what they do), and as they overlap with the old `BackupIT` tests, those were deleted.

## Related issues

closes #9902 



10428: Remove the extra submit from ProcessingScheduleServiceImpl r=Zelldon a=Zelldon

## Description


And another round! 😁 After #10414 and #10390 
I think I found a good way to replace the additional Actor#submit from the production code and move it to where it is necessary in tests.

As described here #10414 (comment) it is not easily possible to create a separate Actor for the ScheduleService, due to concurrency issues and further work we would need to do here.

## Details

In this PR I iterate over the ProcessingScheduleServiceImpl and remove the StreamProcessorContext from the constructor.
The service gets all necessary dependencies injected on the constructor (except the actor), mostly as suppliers. This allows to lazy load dependencies and reduce dependencies for tests. It allowed to write some more unit-test like tests.

The service gets an open method to initialize the service with the actor control and create its own writer. This allows us to no longer share the writer with the StreamProcessor, since this might lead to issues especially if we now run in between (during processing phases).

Due to the later initializing of the writers we reduce resources on followers, if the service is never scheduled we don't need to create a writer.

When the StreamProcessor is in the closing phase it will close the scheduled service as well.
<!-- Please explain the changes you made here. -->

## Related issues

<!-- Which issues are closed by this PR or are related -->

closes #10414
closes #10390 
closes #10291



Co-authored-by: Nicolas Pepin-Perreault <nicolas.pepin-perreault@camunda.com>
Co-authored-by: Christopher Zell <zelldon91@googlemail.com>
@Zelldon Zelldon deleted the zell-no-extra-submit branch March 28, 2024 15:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remove the additional submit
2 participants