Introduce a flow control to fix exponential backoff behaviour for CP subsystem [HZ-2702] #25055

Merged
merged 3 commits into from Jul 25, 2023

@arodionov arodionov commented Jul 21, 2023

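Added flowControlSequenceNumber to the Append/InstallSnapshot requests and their responses so that each response can be matched to the request that triggered it.
This allows the backoff to be reset only by the response to the corresponding request, not by a late response from a previous round.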
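
For illustration, the matching could look roughly like the sketch below. This is not the code from this PR: `FollowerState`, `onRequestSent`, `onResponseReceived` and the backoff cap are hypothetical names and values chosen for the example; only `flowControlSequenceNumber` comes from the description above.

```java
public class FlowControlSketch {

    /** Hypothetical per-follower state kept by the leader; not Hazelcast's actual class. */
    static final class FollowerState {
        private static final int MAX_BACKOFF_ROUNDS = 8; // assumed cap, for illustration only

        private long flowControlSequenceNumber; // tag of the most recent request sent
        private int backoffRound;               // rounds left to wait; 0 means a request may be sent
        private int nextBackoffRound = 1;       // doubles after every request sent, up to the cap

        /** Arms the backoff and returns the tag to attach to the outgoing request. */
        long onRequestSent() {
            backoffRound = nextBackoffRound;
            nextBackoffRound = Math.min(nextBackoffRound * 2, MAX_BACKOFF_ROUNDS);
            return ++flowControlSequenceNumber;
        }

        /** Resets the backoff only if the response carries the tag of the latest request. */
        boolean onResponseReceived(long responseSequenceNumber) {
            if (responseSequenceNumber != flowControlSequenceNumber) {
                return false; // stale response from an earlier round: keep backing off
            }
            backoffRound = 0;
            nextBackoffRound = 1;
            return true;
        }

        boolean isBackoffActive() {
            return backoffRound > 0;
        }
    }

    public static void main(String[] args) {
        FollowerState follower = new FollowerState();
        long first = follower.onRequestSent();   // round 1, response is delayed
        long second = follower.onRequestSent();  // round 2, sent after the backoff elapsed

        System.out.println(follower.onResponseReceived(first));  // false: late response is ignored
        System.out.println(follower.isBackoffActive());          // true: backoff still armed
        System.out.println(follower.onResponseReceived(second)); // true: matching response
        System.out.println(follower.isBackoffActive());          // false: backoff reset
    }
}
```

Without the tag, the late response to the first request would reset the backoff that was armed for the second one, which is the behaviour reported in #24958.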

Fixes #24958

Breaking changes (list specific methods/types/messages):

  • AppendRequest, InstallSnapshot, AppendSuccessResponse, AppendFailureResponse

Checklist:

  • Labels (Team:, Type:, Source:, Module:) and Milestone set
  • Label Add to Release Notes or Not Release Notes content set
  • Request reviewers if possible
  • Send backports/forwardports if fix needs to be applied to past/future releases
  • New public APIs have @Nonnull/@Nullable annotations
  • New public APIs have @since tags in Javadoc

@hz-devops-test

The job Hazelcast-pr-builder of your PR failed. (Hazelcast internal details: build log, artifacts).
Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

@arodionov
Contributor Author

run-lab-run

@arodionov
Contributor Author

run-ee-tests

…subsystem [HZ-2702]

Introduce a flowControlSequenceNumber into Append/InstallSnapshot requests and responses to match each response to its request.
This allows the backoff to be reset only by the response to the corresponding request.

Fix hazelcast#24958
@hz-devops-test

The job Hazelcast-pr-builder-ee-tests of your PR failed. (Hazelcast internal details: build log, artifacts).
Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

@arodionov
Contributor Author

run-ee-tests

@arodionov arodionov marked this pull request as ready for review July 21, 2023 17:12
@hz-devops-test

The job Hazelcast-pr-builder-ee-tests of your PR failed. (Hazelcast internal details: build log, artifacts).
Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

@arodionov
Contributor Author

run-ee-tests

@hz-devops-test

The job Hazelcast-pr-builder-ee-tests of your PR failed. (Hazelcast internal details: build log, artifacts).
Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

@arodionov
Contributor Author

run-ee-tests

@hz-devops-test

The job Hazelcast-pr-builder-ee-tests of your PR failed. (Hazelcast internal details: build log, artifacts).
Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log file
--------------------------
-------TEST FAILURE-------
--------------------------
[INFO] Results:
[INFO] 
[ERROR] Failures: 
[ERROR]   FailoverTest.testFailover_clientDoesNotTryMemberListAfterSwitch:330->HazelcastTestSupport.assertTrueAllTheTime:1143->lambda$testFailover_clientDoesNotTryMemberListAfterSwitch$6:331 expected:<3> but was:<4>
[ERROR] Errors: 
[ERROR]   PartitionCompactorTest.test_partition_compactor_runs_on_owner_and_backup:82 ? TestTimedOut test timed out after 300000 milliseconds
[INFO] 
[ERROR] Tests run: 11175, Failures: 1, Errors: 1, Skipped: 108
[INFO] 

[ERROR] There are test failures.

@arodionov
Contributor Author

run-ee-tests

@hz-devops-test

The job Hazelcast-pr-builder-ee-tests of your PR failed. (Hazelcast internal details: build log, artifacts).
Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

@arodionov
Contributor Author

run-ee-tests

@arodionov
Contributor Author

run-ee-tests

@hz-devops-test

The job Hazelcast-pr-builder-ee-tests of your PR failed. (Hazelcast internal details: build log, artifacts).
Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Collaborator

@vbekiaris vbekiaris left a comment

nice job!

NOTICE (outdated, resolved)
@hz-devops-test

The job Hazelcast-pr-builder of your PR failed. (Hazelcast internal details: build log, artifacts).
Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log file
--------------------------
-------TEST FAILURE-------
--------------------------
[INFO] Results:
[INFO] 
[ERROR] Failures: 
[ERROR]   MemberJmxMetricsTest.testNoMBeanLeak:64 expected:<0> but was:<1>
[INFO] 
[ERROR] Tests run: 4659, Failures: 1, Errors: 0, Skipped: 11
[INFO] 
[WARNING] Corrupted channel by directly writing to native stream in forked JVM 8. See FAQ web page and the dump file /home/jenkins/jenkins_slave/workspace/Hazelcast-pr-builder_3/hazelcast/target/surefire-reports/2023-07-25T11-10-53_278-jvmRun8.dumpstream

[ERROR] There are test failures.

@arodionov
Contributor Author

run-lab-run

@arodionov arodionov merged commit 8e3ffce into hazelcast:master Jul 25, 2023
8 checks passed
arodionov added a commit to arodionov/hazelcast that referenced this pull request Jul 25, 2023
…subsystem [HZ-2702] [5.3.z] (hazelcast#25055)

Added `flowControlSequenceNumber` to Append/InstallSnapshot requests and
responses to perform matching between them.
This allows the backoff to be reset only by the response to the corresponding request.

Fixes hazelcast#24958

Breaking changes (list specific methods/types/messages):
* `AppendRequest`, `InstallSnapshot`, `AppendSuccessResponse`,`AppendFailureResponse`

(cherry picked from commit 8e3ffce)
arodionov added a commit that referenced this pull request Jul 26, 2023
…subsystem [HZ-2702] [5.3.z] (#25055) (#25074)

Added `flowControlSequenceNumber` to Append/InstallSnapshot requests and
responses to perform matching between them.
This allows the backoff to be reset only by the response to the corresponding request.

Fixes #24958

Breaking changes (list specific methods/types/messages):
* `AppendRequest`, `InstallSnapshot`,
`AppendSuccessResponse`,`AppendFailureResponse`

(cherry picked from commit 8e3ffce)

Backport of: #25055
Development

Successfully merging this pull request may close these issues.

Exponential backoff resets by the AppendResponse from the previous round AppendRequest
4 participants