LogStorageAppender Actor occupies an Actor Thread forever due to a full backpressure queue #8540
Labels
- area/performance: Marks an issue as performance related
- kind/bug: Categorizes an issue or PR as a bug
- scope/broker: Marks an issue or PR to appear in the broker section of the changelog
- support: Marks an issue as related to a customer support request
- version:1.3.2: Marks an issue as being completely or in parts released in 1.3.2
ghost pushed a commit that referenced this issue on Jan 19, 2022:
8582: fix(log/appender): yield thread when experiencing backpressure (r=romansmirnov, a=romansmirnov)

Description: Yield the thread when the log storage appender experiences backpressure while trying to append the fragments to the log storage. That way, the actual actor task (the log storage appender) is resubmitted to the working queue, and the actor thread is released to execute other actor tasks.

Related issues: closes #8540

8605: fix(log/stream): ensure the appender future always gets completed (r=romansmirnov, a=romansmirnov)

Description:
- Handles any kind of thrown `Throwable`s in the `LogStream` actor, so that the appender future gets completed exceptionally.
- Handles the situation where, while opening the appender, the `LogStream` actor is supposed to be closed. In this situation, the appender future gets completed exceptionally as well.

Related issues: closes #7992

8615: deps(maven): bump value from 2.8.9-ea-1 to 2.9.0 (r=npepinpe, a=dependabot[bot])

Bumps [value](https://github.com/immutables/immutables) from 2.8.9-ea-1 to 2.9.0.

Co-authored-by: Roman <roman.smirnov@camunda.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
ghost pushed two more commits that referenced this issue on Jan 20, 2022.
This issue was closed.
Describe the bug

The `LogStorageAppender` subscribes to the write buffer (i.e., the `Dispatcher`) to read from it and to append the available fragments to the `LogStorage`:

https://github.com/camunda-cloud/zeebe/blob/73e5c7be9f453e30b5aebc07aae322ee5f82b11e/logstreams/src/main/java/io/camunda/zeebe/logstreams/impl/log/LogStorageAppender.java#L151-L154
However, before appending to the `LogStorage`, the `LogStorageAppender` tries to acquire a "token" from a limiter (i.e., the backpressure queue), and only if a token was acquired does it append to the `LogStorage` (and the read fragments are marked as read in the write buffer). Otherwise, if no token was acquired, the fragments are not appended to the `LogStorage` (and the fragments remain marked as not read in the write buffer):

https://github.com/camunda-cloud/zeebe/blob/73e5c7be9f453e30b5aebc07aae322ee5f82b11e/logstreams/src/main/java/io/camunda/zeebe/logstreams/impl/log/LogStorageAppender.java#L119-L136
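The token-then-append logic above can be sketched as follows. This is a minimal illustration with hypothetical names: the real `LogStorageAppender` uses Zeebe's append limiter, not a `Semaphore`; the `Semaphore` here only stands in for the token source.

```java
import java.util.concurrent.Semaphore;

// Minimal sketch of the token-then-append logic (hypothetical names).
public class AppendSketch {
  private final Semaphore limiter; // stand-in for the backpressure limiter
  int appended = 0;                // stand-in counter for LogStorage appends

  AppendSketch(int tokens) {
    limiter = new Semaphore(tokens);
  }

  // Returns true when the fragment was appended (and could be marked as
  // read in the write buffer); false leaves it "not read" for a retry.
  boolean tryAppend(String fragment) {
    if (!limiter.tryAcquire()) {
      return false; // backpressure queue full: no token, nothing appended
    }
    appended++; // token acquired: append to the log storage
    return true;
  }

  public static void main(String[] args) {
    AppendSketch sketch = new AppendSketch(2);
    System.out.println(sketch.tryAppend("a")); // true
    System.out.println(sketch.tryAppend("b")); // true
    System.out.println(sketch.tryAppend("c")); // false: out of tokens
  }
}
```

The key point for this bug is the `false` branch: the fragment stays unread in the write buffer, so the same work is immediately offered to the actor again.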
Whenever fragments are available on the write buffer, the `LogStorageAppender` Actor is submitted to the broker thread group's task queue so that a broker thread can eventually execute it. When executed, the Actor checks whether fragments are available on the write buffer:

https://github.com/camunda-cloud/zeebe/blob/73e5c7be9f453e30b5aebc07aae322ee5f82b11e/util/src/main/java/io/camunda/zeebe/util/sched/channel/ChannelConsumerCondition.java#L43-L48

If fragments are available, it executes the corresponding Actor Job (i.e., reads from the write buffer and tries to append to the `LogStorage`). The Actor Job keeps being executed as long as `ChannelConsumerCondition#poll()` returns true.

In a scenario where the appender's backpressure queue is full (i.e., no token can be acquired from the limiter), the Actor Thread keeps executing the Actor Job because `ChannelConsumerCondition#poll()` still returns true. As long as the backpressure queue is not emptied, the Actor Thread will continue executing that job forever.

For example, this can happen in the following case:

- An Actor Thread executes the `LogStorageAppender`'s Actor Job, i.e., it reads from the write buffer and tries to append to the `LogStorage` (`ChannelConsumerCondition#poll()` always returns true, but no token can be acquired).
- A leader change happens, and the Actor Task to transition to follower is submitted to the task queue of the same Actor Thread the `LogStorageAppender` is already in.

As a consequence, the submitted Actor Task to transition to follower on the Zeebe application layer won't be executed, because the Actor Thread is already occupied by the `LogStorageAppender` Actor, which never releases the Actor Thread.

Note: This may also happen in other scenarios in which no leader change happens.
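The spin described above can be demonstrated with a bounded simulation. All names here are hypothetical stand-ins for the real Zeebe classes: `poll()` keeps returning true while unread fragments exist, but a full backpressure queue makes every append attempt fail, so no fragment is ever marked as read and the loop would run forever; a cap makes it terminate for the demonstration.

```java
// Bounded simulation of the Actor Thread spin (hypothetical names).
public class BusyLoopSketch {
  static boolean bufferHasFragments = true;    // fragments stay "not read"
  static final boolean tokenAvailable = false; // backpressure queue is full

  static boolean poll() {      // stand-in for ChannelConsumerCondition#poll()
    return bufferHasFragments;
  }

  static boolean tryAppend() { // stand-in for the token-guarded append
    return tokenAvailable;
  }

  // Runs the actor-job loop up to `cap` iterations; returns how many ran.
  static int run(int cap) {
    int iterations = 0;
    while (poll() && iterations < cap) {
      if (tryAppend()) {
        bufferHasFragments = false; // would mark the fragment as read
      }
      iterations++;
    }
    return iterations;
  }

  public static void main(String[] args) {
    // Every iteration fails to acquire a token, so the cap is always hit:
    System.out.println(BusyLoopSketch.run(1_000)); // 1000
  }
}
```

Without the cap, this is exactly the "occupies an Actor Thread forever" behavior: the loop never makes progress and never yields.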
What is the impact of that issue?

In the worst case, it can happen that all Actor Threads are occupied by such Actors (but for different partitions). This results in:
Expected behavior

The `LogStorageAppender` Actor releases the Actor Thread so that other Actor Jobs can be executed.

Possible Solutions

- Make `ChannelConsumerCondition#poll()` take the backpressure queue into account.

Environment:
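The fix that was eventually merged (per the commit message quoted earlier) yields the thread on backpressure so the actor task is resubmitted to the work queue. A minimal sketch of that idea, where `ActorControl` is a hypothetical stand-in rather than Zeebe's real scheduler API:

```java
// Sketch of the yield-on-backpressure idea (hypothetical names).
public class YieldSketch {
  interface ActorControl {
    void yieldThread(); // resubmit the current task and release the thread
  }

  int appends = 0; // stand-in counter for successful LogStorage appends

  void appendBlock(boolean tokenAcquired, ActorControl actor) {
    if (!tokenAcquired) {
      actor.yieldThread(); // backpressure: give the Actor Thread back
      return;              // instead of spinning on the same job
    }
    appends++; // token acquired: append to the log storage
  }

  public static void main(String[] args) {
    YieldSketch sketch = new YieldSketch();
    ActorControl actor = () -> System.out.println("yielded");
    sketch.appendBlock(false, actor); // backpressure: prints "yielded"
    sketch.appendBlock(true, actor);  // token acquired: appends
    System.out.println(sketch.appends); // 1
  }
}
```

The design point: yielding keeps the job pending (the fragments are still unread), but other Actor Tasks, such as a transition to follower, get a chance to run in between retries.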
related to https://jira.camunda.com/browse/SUPPORT-11966