Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Demote local data member #24617

Merged
merged 30 commits into from Oct 19, 2023

Conversation

tommyk-gears
Copy link
Contributor

Introduces com.hazelcast.cluster.Cluster#demoteLocalDataMember method which transitions the local member from a data/normal member to a lite member. The most dramatic effect might be that a cluster can now transition from a state where it holds data to a state where it does not hold any data, but the cluster is still "alive" (i.e. when demoting the last data member).

The implementation is mostly a blend of member promotion and member shutdown.

Checklist:

  • Labels (Team:, Type:, Source:, Module:) and Milestone set
  • Label Add to Release Notes or Not Release Notes content set
  • Request reviewers if possible
  • Send backports/forwardports if fix needs to be applied to past/future releases
  • New public APIs have @Nonnull/@Nullable annotations
  • New public APIs have @since tags in Javadoc

@tommyk-gears tommyk-gears requested a review from a team as a code owner May 22, 2023 14:04
@hz-devops-test hz-devops-test added the Source: Community PR or issue was opened by a community user label May 22, 2023
@devOpsHazelcast
Copy link
Collaborator

devOpsHazelcast commented May 22, 2023

CLA assistant check
All committers have signed the CLA.

@devOpsHazelcast
Copy link
Collaborator

Can one of the admins verify this patch?

1 similar comment
@devOpsHazelcast
Copy link
Collaborator

Can one of the admins verify this patch?

@vbekiaris
Copy link
Collaborator

run-lab-run

Copy link
Collaborator

@vbekiaris vbekiaris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tommyk-gears thanks a lot for the contribution. I am not sure we should allow removing all data members from the cluster after initial partition arrangement is done, but I didn't want to delay posting my comments further, so here they are. Let me know your thoughts around that.

thanks once more for the contribution!

@srknzl
Copy link
Member

srknzl commented Sep 7, 2023

run-lab-run

@hz-devops-test
Copy link

The job Hazelcast-pr-builder of your PR failed. (Hazelcast internal details: build log, artifacts).
Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log file
---------ERRORS-----------
--------------------------
[ERROR] /home/jenkins/jenkins_slave/workspace/Hazelcast-pr-builder/hazelcast/src/main/java/com/hazelcast/internal/cluster/impl/MembersViewResponse.java:1: Line does not match expected header line of '/*'. [Header]
--------------------------
[ERROR] /home/jenkins/jenkins_slave/workspace/Hazelcast-pr-builder/hazelcast/src/main/java/com/hazelcast/internal/partition/impl/InternalPartitionServiceImpl.java:614: Line is longer than 130 characters (found 137). [LineLength]
--------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-checkstyle-plugin:3.3.0:checkstyle (default) on project hazelcast: An error has occurred in Checkstyle report generation. Failed during checkstyle execution: There are 2 errors reported by Checkstyle 9.3 with /home/jenkins/jenkins_slave/workspace/Hazelcast-pr-builder/checkstyle/checkstyle.xml ruleset. -> [Help 1]
--------------------------
[ERROR] 
--------------------------
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
--------------------------
[ERROR] 
--------------------------
[ERROR] For more information about the errors and possible solutions, please read the following articles:
--------------------------
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
--------------------------
[ERROR] 
--------------------------
[ERROR] After correcting the problems, you can resume the build with the command
--------------------------
[ERROR]   mvn  -rf :hazelcast
--------------------------

@vbekiaris
Copy link
Collaborator

run-lab-run

@hz-devops-test
Copy link

The job Hazelcast-pr-builder of your PR failed. (Hazelcast internal details: build log, artifacts).
Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log file
---------ERRORS-----------
--------------------------
[ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 5.899 s <<< FAILURE! -- in com.hazelcast.HazelcastCompletableFutureAsyncUsageTest
--------------------------
[ERROR] com.hazelcast.HazelcastCompletableFutureAsyncUsageTest.noClassUsesCompletableFuture -- Time elapsed: 5.898 s <<< FAILURE!
--------------------------
[ERROR] Failures: 
--------------------------
[ERROR]   HazelcastCompletableFutureAsyncUsageTest.noClassUsesCompletableFuture:40 Architecture Violation [Priority: MEDIUM] - Rule 'classes should use only CompletableFuture async methods with explicit executor service' was violated (1 times):
--------------------------
[ERROR] Tests run: 4574, Failures: 1, Errors: 0, Skipped: 17
--------------------------
[ERROR] There are test failures.
--------------------------
[ERROR] Tests run: 17, Failures: 1, Errors: 0, Skipped: 1, Time elapsed: 192.6 s <<< FAILURE! -- in com.hazelcast.jet.kinesis.KinesisIntegrationTest
--------------------------
[ERROR] com.hazelcast.jet.kinesis.KinesisIntegrationTest.dynamicStream_1Shard_splitsDuringData -- Time elapsed: 126.5 s <<< FAILURE!
--------------------------
[ERROR] Failures: 
--------------------------
[ERROR]   KinesisIntegrationTest.dynamicStream_1Shard_splitsDuringData:324->dynamicStream_splitsDuringData:354->assertMessages:637->AbstractKinesisTest.assertMessages:163->HazelcastTestSupport.assertTrueEventually:1269->HazelcastTestSupport.assertTrueEventually:1165->AbstractKinesisTest.lambda$assertMessages$3:180 Messages for key 243 differ!
--------------------------
[ERROR] Tests run: 26, Failures: 1, Errors: 0, Skipped: 3
--------------------------
[ERROR] There are test failures.
--------------------------
--------------------------
-------TEST FAILURE-------
--------------------------
[INFO] Results:
[INFO] 
[ERROR] Failures: 
[ERROR]   HazelcastCompletableFutureAsyncUsageTest.noClassUsesCompletableFuture:40 Architecture Violation [Priority: MEDIUM] - Rule 'classes should use only CompletableFuture async methods with explicit executor service' was violated (1 times):
com.hazelcast.internal.partition.impl.InternalPartitionServiceImpl.checkClusterPartitionRuntimeStates():624 calls CompletableFuture.thenCompose
[INFO] 
[ERROR] Tests run: 4574, Failures: 1, Errors: 0, Skipped: 17
[INFO] 

[ERROR] There are test failures.

Copy link
Contributor

@JamesHazelcast JamesHazelcast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your PR @tommyk-gears, it's very well presented! Just a few minor comments from me, nothing major, although I do share Vassilis' concerns surrounding the demotion of the last member of a cluster to Lite (as discussed here).

Once we've reached some clarity on that (pending Vassilis' decision), I think we'll basically be ready for approval 👍

@JamesHazelcast
Copy link
Contributor

JamesHazelcast commented Sep 28, 2023

Hi @tommyk-gears, I've discussed with Vassilis and we both agree that a Hazelcast cluster without any data members does not make much sense, so we think it's best if we throw an IllegalStateException when an attempt is made to demote the last data member of a cluster. Let us know what you think 👍

This PR is currently failing one of our archunit tests that enforces an executor to be defined explicitly for non-async method calls in CompletableFuture - so switching thenCompose(x ->x) to thenComposeAsync(x -> x, internalAsyncExecutor) should make it happy 😄

- Remove Versioned interface from Demote(Request|Response)Operation
- Reorder properties in ClusterProperty to have newly added property last
@tommyk-gears
Copy link
Contributor Author

Hi @tommyk-gears, I've discussed with Vassilis and we both agree that a Hazelcast cluster without any data members does not make much sense, so we think it's best if we throw an IllegalStateException when an attempt is made to demote the last data member of a cluster. Let us know what you think 👍

This PR is currently failing one of our archunit tests that enforces an executor to be defined explicitly for non-async method calls in CompletableFuture - so switching thenCompose(x ->x) to thenComposeAsync(x -> x, internalAsyncExecutor) should make it happy 😄

Awesome, thanks for good feedback! I changed the implementation to throw IllegalStateException when trying to demote the last data member, and also reverted the code changes that were made to support demotion of last member.
Also fixed that thenCompose-issue.

@JamesHazelcast
Copy link
Contributor

run-lab-run

@hz-devops-test
Copy link

The job Hazelcast-pr-builder of your PR failed. (Hazelcast internal details: build log, artifacts).
Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log file
---------ERRORS-----------
--------------------------
[ERROR] /home/jenkins/jenkins_slave/workspace/Hazelcast-pr-builder_2/hazelcast/src/main/java/com/hazelcast/internal/partition/operation/DemoteRequestOperation.java:31:8: Unused import - com.hazelcast.nio.serialization.impl.Versioned. [UnusedImports]
--------------------------
[ERROR] /home/jenkins/jenkins_slave/workspace/Hazelcast-pr-builder_2/hazelcast/src/main/java/com/hazelcast/internal/partition/operation/DemoteResponseOperation.java:29:8: Unused import - com.hazelcast.nio.serialization.impl.Versioned. [UnusedImports]
--------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-checkstyle-plugin:3.3.0:checkstyle (default) on project hazelcast: An error has occurred in Checkstyle report generation. Failed during checkstyle execution: There are 2 errors reported by Checkstyle 9.3 with /home/jenkins/jenkins_slave/workspace/Hazelcast-pr-builder_2/checkstyle/checkstyle.xml ruleset. -> [Help 1]
--------------------------
[ERROR] 
--------------------------
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
--------------------------
[ERROR] 
--------------------------
[ERROR] For more information about the errors and possible solutions, please read the following articles:
--------------------------
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
--------------------------
[ERROR] 
--------------------------
[ERROR] After correcting the problems, you can resume the build with the command
--------------------------
[ERROR]   mvn  -rf :hazelcast
--------------------------

@JamesHazelcast
Copy link
Contributor

run-lab-run

Copy link
Contributor

@JamesHazelcast JamesHazelcast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the contribution @tommyk-gears!

@JamesHazelcast
Copy link
Contributor

Hi @hazelcast/apis, could someone please review? TIA!

Copy link
Collaborator

@vbekiaris vbekiaris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for contributing this enhancement @tommyk-gears!

@vbekiaris vbekiaris merged commit e5e13fd into hazelcast:master Oct 19, 2023
3 checks passed
@vbekiaris
Copy link
Collaborator

thank you @tommyk-gears for your contribution 🏅

Fly-Style pushed a commit to Fly-Style/hazelcast that referenced this pull request Oct 31, 2023
Introduces Cluster#demoteLocalDataMember method
which transitions the local member from a data/normal member to a lite
member.
The implementation is mostly a blend of member promotion and member
shutdown.

Co-authored-by: Vassilis Bekiaris <vbekiaris@gmail.com>
Co-authored-by: James Holgate <130981049+JamesHazelcast@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants