[#956] refactor: Changes the Boolean flag that determines whether a Node is healthy to a state #959

Merged
merged 1 commit into from
Jun 22, 2023

Conversation

yl09099
Contributor

@yl09099 yl09099 commented Jun 20, 2023

What changes were proposed in this pull request?

Represent ServerNode health as a server state (unhealthy) instead of the original Boolean flag.

Why are the changes needed?

The unhealthy condition should not be tracked separately from the other server states.

Fix: #956

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Existing unit tests

@@ -251,7 +251,6 @@ message ShuffleServerHeartBeatRequest {
   int64 availableMemory = 4;
   int32 eventNumInFlush = 5;
   repeated string tags = 6;
-  google.protobuf.BoolValue isHealthy = 7;
Contributor

@jerqi jerqi Jun 20, 2023

Don't modify the field. You can add a `// deprecated` comment instead. Modifying it would be an incompatible change. Although we haven't released version 1.0 yet and don't need to guarantee compatibility, I still hope we can reduce breaking changes like this.

Contributor Author

Okay, I'll restore it.

@jerqi jerqi requested a review from zuston June 20, 2023 12:26
@jerqi jerqi changed the title from "[#956][Improvement] Changes the Boolean flag that determines whether a Node is healthy to a state" to "[#956] refactor: Changes the Boolean flag that determines whether a Node is healthy to a state" Jun 20, 2023
ShuffleServerMetrics.gaugeIsHealthy.set(1);
return;
}
}
ShuffleServerMetrics.gaugeIsHealthy.set(0);
isHealthy.set(true);
if (shuffleServer.getServerStatus() == ServerStatus.DECOMMISSIONING) {
Contributor

Why do we need this?

Contributor Author

In ShuffleServer, once decommission is called while many applications are still running, the ServerStatus stays DECOMMISSIONING, and the health-check thread must not change that ServerStatus.

Contributor

Why do we assign DECOMMISSIONING to it if its original value is already DECOMMISSIONING? It seems that we don't change the value.

Contributor Author

Sorry, I made a mistake.
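
The exchange above turns on the health-check thread leaving a DECOMMISSIONING server alone. One way to get that behavior, sketched here under assumptions, is to let the thread make only conditional transitions with `AtomicReference.compareAndSet`. The `ACTIVE` and `UNHEALTHY` names and the `applyCheckResult` helper are hypothetical; only `DECOMMISSIONING` appears in this thread, and the merged patch may handle this differently.

```java
import java.util.concurrent.atomic.AtomicReference;

class HealthTransitionSketch {
  // Hypothetical subset of states; only DECOMMISSIONING is named in the review.
  enum ServerStatus { ACTIVE, UNHEALTHY, DECOMMISSIONING }

  /**
   * Applies one health-check result. Each transition is conditional on the
   * current value, so a state the health checker does not own is never overwritten.
   */
  static void applyCheckResult(AtomicReference<ServerStatus> status, boolean healthy) {
    if (healthy) {
      // Recover only from UNHEALTHY.
      status.compareAndSet(ServerStatus.UNHEALTHY, ServerStatus.ACTIVE);
    } else {
      // Degrade only from ACTIVE (an assumption of this sketch).
      status.compareAndSet(ServerStatus.ACTIVE, ServerStatus.UNHEALTHY);
    }
  }

  public static void main(String[] args) {
    AtomicReference<ServerStatus> status =
        new AtomicReference<>(ServerStatus.DECOMMISSIONING);
    applyCheckResult(status, true);
    System.out.println(status.get()); // prints DECOMMISSIONING

    status.set(ServerStatus.ACTIVE);
    applyCheckResult(status, false);
    System.out.println(status.get()); // prints UNHEALTHY
  }
}
```

Whether a failing check should be allowed to move a DECOMMISSIONING server to an unhealthy state is a design decision this sketch does not settle; here the decommission state simply wins.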

@@ -226,7 +226,7 @@ private void initialization() throws Exception {
   if (healthCheckEnable) {
     List<Checker> builtInCheckers = Lists.newArrayList();
     builtInCheckers.add(storageManager.getStorageChecker());
-    healthCheck = new HealthCheck(isHealthy, shuffleServerConf, builtInCheckers);
+    healthCheck = new HealthCheck(this, shuffleServerConf, builtInCheckers);
Contributor

You can pass ServerStatus as parameter. We can use AtomicReference as the type of ServerStatus.

Contributor Author

Let me revise it.
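
A minimal sketch of the suggestion above: the server owns an `AtomicReference<ServerStatus>` and hands that same reference to the health check, so the health check does not need a pointer to the whole server. `HealthCheckSketch`, the `BooleanSupplier` checkers, and the `ACTIVE`/`UNHEALTHY` values are stand-ins invented for illustration, not Uniffle's real classes.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.BooleanSupplier;

class SharedStatusSketch {
  // Hypothetical subset of states for illustration.
  enum ServerStatus { ACTIVE, UNHEALTHY, DECOMMISSIONING }

  /** Stand-in for HealthCheck: it receives the shared reference, not the server. */
  static final class HealthCheckSketch {
    private final AtomicReference<ServerStatus> serverStatus;
    private final List<BooleanSupplier> checkers;

    HealthCheckSketch(
        AtomicReference<ServerStatus> serverStatus,
        List<BooleanSupplier> checkers) {
      this.serverStatus = serverStatus;
      this.checkers = checkers;
    }

    /** Evaluates the checkers, stopping at the first failure, and publishes the result. */
    void runOnce() {
      boolean healthy = checkers.stream().allMatch(BooleanSupplier::getAsBoolean);
      if (healthy) {
        serverStatus.compareAndSet(ServerStatus.UNHEALTHY, ServerStatus.ACTIVE);
      } else {
        serverStatus.compareAndSet(ServerStatus.ACTIVE, ServerStatus.UNHEALTHY);
      }
    }
  }

  public static void main(String[] args) {
    // The owner creates the reference and keeps reading it after handing it over.
    AtomicReference<ServerStatus> status = new AtomicReference<>(ServerStatus.ACTIVE);
    BooleanSupplier failingChecker = () -> false;
    List<BooleanSupplier> checkers = Collections.singletonList(failingChecker);

    new HealthCheckSketch(status, checkers).runOnce();
    System.out.println(status.get()); // prints UNHEALTHY
  }
}
```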

   private final long checkIntervalMs;
   private final Thread thread;
   private volatile boolean isStop = false;
   private List<Checker> checkers = Lists.newArrayList();

-  public HealthCheck(AtomicBoolean isHealthy, ShuffleServerConf conf, List<Checker> buildInCheckers) {
-    this.isHealthy = isHealthy;
+  public HealthCheck(AtomicReference<ServerStatus> serverStatus,
Contributor

@jerqi jerqi Jun 21, 2023

Could we use a style consistent with other places?

public HealthCheck(
    AtomicReference<ServerStatus> serverStatus,
    ShuffleServerConf conf,
    List<Checker> buildInCheckers) {

Contributor

@yl09099 Could you address this comment?

Contributor

This indent should be 4.

Contributor Author

Sorry, it has been changed.

boolean isHealthy = true;
if (request.hasIsHealthy()) {
isHealthy = request.getIsHealthy().getValue();
/**
* Compatible with isHealthy version
Contributor

Suggested change
-  * Compatible with isHealthy version
+  * Compatible with older version
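
The compatibility comment concerns heartbeats from servers that still send only the legacy `isHealthy` flag. Below is a sketch of such a fallback, assuming hypothetical `ACTIVE` and `UNHEALTHY` status values and a hypothetical `resolveStatus` helper in which `null` stands for "field not set". The precedence between the two fields is an assumption of this sketch, while the healthy default mirrors the `boolean isHealthy = true` shown in the diff.

```java
class HeartbeatCompatSketch {
  // Hypothetical subset of states for illustration.
  enum ServerStatus { ACTIVE, UNHEALTHY, DECOMMISSIONING }

  /**
   * Picks the status to record for a heartbeat.
   *
   * @param reportedStatus status sent by a newer server, or null if absent
   * @param legacyIsHealthy Boolean flag sent by an older server, or null if absent
   */
  static ServerStatus resolveStatus(ServerStatus reportedStatus, Boolean legacyIsHealthy) {
    if (reportedStatus != null) {
      // Assumption: an explicit status takes precedence over the legacy flag.
      return reportedStatus;
    }
    if (legacyIsHealthy != null) {
      // Older servers only know healthy vs. unhealthy.
      return legacyIsHealthy ? ServerStatus.ACTIVE : ServerStatus.UNHEALTHY;
    }
    // Neither field set: treat the server as healthy, like the old default.
    return ServerStatus.ACTIVE;
  }

  public static void main(String[] args) {
    System.out.println(resolveStatus(null, false)); // prints UNHEALTHY
    System.out.println(resolveStatus(ServerStatus.DECOMMISSIONING, true)); // prints DECOMMISSIONING
    System.out.println(resolveStatus(null, null)); // prints ACTIVE
  }
}
```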

@@ -1,38 +1,26 @@
/*
Contributor

Why do we remove the license header?

Contributor Author

I moved this file and forgot to add the header. Let me add it back.

Contributor

@jerqi jerqi left a comment

LGTM, thanks @yl09099

@jerqi jerqi merged commit 0e24225 into apache:master Jun 22, 2023
27 checks passed
Development

Successfully merging this pull request may close these issues.

[Improvement] Changes the Boolean flag that determines whether a Node is healthy to a state