Skip to content

Fix empty host list for instance stoppable response#1237

Merged
huizhilu merged 2 commits intoapache:masterfrom
huizhilu:stoppable
Aug 28, 2020
Merged

Fix empty host list for instance stoppable response#1237
huizhilu merged 2 commits intoapache:masterfrom
huizhilu:stoppable

Conversation

@huizhilu
Copy link
Contributor

@huizhilu huizhilu commented Aug 8, 2020

Issues

  • My PR addresses the following Helix issues and references them in the PR description:

Fixed #1232

Description

  • Here are some details about my PR, including screenshots of any UI changes:

Health check stoppable responds with empty hostname.

This PR adds error message of instance check to API response so users have a clear idea of the reason if PERSIST_INTERMEDIATE_ASSIGNMENT is not turned on.

curl -XPOST http://localhost:8100/admin/v2/namespaces/namespacce/clusters/cluster/instances\?command\=stoppable -d ' {"instances": ["instance"], "selection_base": "zone_based", "max_instance": "2", "customized_values": "{}"}' -H "Content-Type: application/json"
{
  "instance_stoppable_parallel" : [ ],
  "instance_not_stoppable_with_reasons" : {
    "instance" : [ "HELIX:INVALID_CONFIG" ]
  }
}

Server log will have the reason logged: Cluster config PERSIST_INTERMEDIATE_ASSIGNMENT is not turned on, which is required for instance stability check.

Tests

  • The following tests are written for this issue:

  • testBatchGetInstancesStoppableChecksWhenException()

  • The following is the result of the "mvn test" command on the appropriate module:

[INFO] Tests run: 166, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 163.995 s - in TestSuite
[INFO]
[INFO] Results:
[INFO]
[INFO] Tests run: 166, Failures: 0, Errors: 0, Skipped: 0
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  02:49 min
[INFO] Finished at: 2020-08-07T21:27:36-07:00
[INFO] ------------------------------------------------------------------------

Documentation (Optional)

  • In case of new functionality, my PR adds documentation in the following wiki page:

(Link the GitHub wiki you added)

Commits

  • My commits all reference appropriate Apache Helix GitHub issues in their subject lines. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Code Quality

  • My diff has been formatted using helix-style.xml
    (helix-style-intellij.xml if IntelliJ IDE is used)

@huizhilu huizhilu requested a review from junkaixue August 8, 2020 04:20
@huizhilu huizhilu self-assigned this Aug 9, 2020
@huizhilu huizhilu changed the title Add error message of instance check to API response Add error message of instance check to stoppable API response Aug 10, 2020
Copy link
Contributor

@junkaixue junkaixue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error message for server side is clear to find out the fail reason. The more important is let user to understand what is the real fail reason. Do we have that if persist assignment turned off, at least, we gave user info EV and IS does not match.

@huizhilu
Copy link
Contributor Author

Error message for server side is clear to find out the fail reason. The more important is let user to understand what is the real fail reason. Do we have that if persist assignment turned off, at least, we gave user info EV and IS does not match.

@dasahcc This PR is to provide the info to let users know the reason. The root cause is exception/error is not returned as response. This PR adds the exception/error message to not stoppable reason in the return response.
https://github.com/apache/helix/pull/1237/files#diff-c98c0908866d29b64f625e799c198d15R224-R227

The result will be what is the description

curl -XPOST http://localhost:8100/admin/v2/namespaces/namespacce/clusters/cluster/instances\?command\=stoppable -d ' {"instances": ["instance"], "selection_base": "zone_based", "max_instance": "2", "customized_values": "{}"}' -H "Content-Type: application/json"
{
  "instance_stoppable_parallel" : [ ],
  "instance_not_stoppable_with_reasons" : {
    "instance" : [ "HELIX:Instance stability check needs persist assignment cluster config turned on: PERSIST_INTERMEDIATE_ASSIGNMENT = true" ]
  }
}

@junkaixue junkaixue closed this Aug 10, 2020
@junkaixue junkaixue reopened this Aug 10, 2020
@huizhilu huizhilu force-pushed the stoppable branch 3 times, most recently from 77e2956 to 9381e50 Compare August 22, 2020 08:28
@huizhilu
Copy link
Contributor Author

huizhilu commented Aug 22, 2020

Error message for server side is clear to find out the fail reason. The more important is let user to understand what is the real fail reason. Do we have that if persist assignment turned off, at least, we gave user info EV and IS does not match.

@dasahcc I think it is more appropriate to put this config check in INVALID_CONFIG health check, which is the first check in health status check. If configs are invalid, we don't need to do remaining checks like INSTANCE_NOT_STABLE.

The response should be like this

curl -XPOST http://localhost:8100/admin/v2/namespaces/namespacce/clusters/cluster/instances\?command\=stoppable -d ' {"instances": ["instance"], "selection_base": "zone_based", "max_instance": "2", "customized_values": "{}"}' -H "Content-Type: application/json"
{
  "instance_stoppable_parallel" : [ ],
  "instance_not_stoppable_with_reasons" : {
    "instance" : [ "HELIX:INVALID_CONFIG" ]
  }
}

I actually would like to add more info for users to know why. But with current implementation, it is not easy/straightforward to add the message. Maybe ` "HELIX:INVALID_CONFIG:Cluster config PERSIST_INTERMEDIATE_ASSIGNMENT is not turned on"

I think the better one would be a map response for each instance:

{
  "instance_stoppable_parallel" : [ ],
  "instance_not_stoppable_with_reasons" : [{
    "instance" : "hostname",
    "health-check": [ "HELIX:INVALID_CONFIG" ],
    "message": "Cluster config PERSIST_INTERMEDIATE_ASSIGNMENT is not turned on"
  }]
}

@huizhilu huizhilu changed the title Add error message of instance check to stoppable API response Add config check to instance stoppable API response Aug 22, 2020
@huizhilu huizhilu changed the title Add config check to instance stoppable API response Fix empty host list for instance stoppable response Aug 24, 2020
@huizhilu
Copy link
Contributor Author

Thanks, @dasahcc @jiajunwang, for the review.

This PR is ready to be merged, approved by @dasahcc

@huizhilu huizhilu merged commit 54a96c0 into apache:master Aug 28, 2020
@huizhilu huizhilu deleted the stoppable branch August 28, 2020 19:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Stoppable rest response is empty because of instance check exception

3 participants