HDDS-1937. Acceptance tests fail if scm webui shows invalid json #1256
I think it's simpler to add || true after jq:
fi \
| jq -r '.beans[0].NodeCount[] | select(.key=="HEALTHY") | .value' \
|| true
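To illustrate why the trailing `|| true` helps, here is a minimal sketch that runs without jq or docker-compose. `fake_jq` and `count_guarded` are hypothetical stand-ins (they are not part of testlib.sh); `fake_jq` simply fails on input that does not start with `{`, roughly as jq fails on an HTML error page.

```shell
# Hypothetical stand-in for the jq filter so the sketch runs without jq or
# docker-compose: it fails on anything that does not start with '{'.
fake_jq() {
  input=$(cat)
  case "${input}" in
    '{'*) echo "3" ;;
    *) echo "parse error: Invalid numeric literal" >&2; return 1 ;;
  esac
}

# Same shape as the suggestion: the trailing `|| true` applies to the whole
# pipeline, so a parse failure no longer yields a non-zero exit status.
count_guarded() {
  printf '%s\n' "$1" | fake_jq || true
}

# Run under errexit in a subshell: the command after the failed parse is
# still reached because count_guarded returns 0.
result=$(set -e; count_guarded '<html>Error</html>' 2>/dev/null; echo "still running")
echo "${result}"
```

With the guard in place the function prints nothing on a parse failure, which a retrying caller can treat as "no healthy nodes reported yet".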
💔 -1 overall
This message was automatically generated. |
Yes, it's also possible. But my feeling is that
|
True, but I have two concerns about the new approach, not sure how real in practice they are:
Agreed. Maybe it should be set in the caller scripts ( |
Ok. Let's fix this problem with the suggested |
/retest |
Thanks @elek for making this change.
/retest |
+1, Thanks for the review @adoroszlai . |
Committed to the trunk. Thanks for the contribution @elek |
The acceptance test of a nightly build failed with the following error:
{code}
Creating ozonesecure_datanode_3 ...
Creating ozonesecure_kdc_1 ... done
Creating ozonesecure_om_1 ... done
Creating ozonesecure_scm_1 ... done
Creating ozonesecure_datanode_3 ... done
Creating ozonesecure_kms_1 ... done
Creating ozonesecure_s3g_1 ... done
Creating ozonesecure_datanode_2 ... done
Creating ozonesecure_datanode_1 ... done
parse error: Invalid numeric literal at line 2, column 0
{code}
https://raw.githubusercontent.com/elek/ozone-ci/master/byscane/byscane-nightly-5b87q/acceptance/output.log
The problem is in the script which checks the number of available datanodes.
If the HTTP endpoint of the SCM has already started but is not ready yet, it may return a simple HTML error message instead of JSON, which jq cannot parse.
In testlib.sh:
{code}
# Fetch the SCM JMX output (via Kerberos when security is enabled) and extract the HEALTHY node count.
if [[ "${SECURITY_ENABLED}" == 'true' ]]; then
  docker-compose -f "${compose_file}" exec -T scm bash -c "kinit -k HTTP/scm@EXAMPLE.COM -t /etc/security/keytabs/HTTP.keytab && curl --negotiate -u : -s '${jmx_url}'"
else
  docker-compose -f "${compose_file}" exec -T scm curl -s "${jmx_url}"
fi \
  | jq -r '.beans[0].NodeCount[] | select(.key=="HEALTHY") | .value'
{code}
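As a rough illustration of the failure mode, the sketch below only hands the response body to a parser when it looks like a JSON object. This is an assumption-laden alternative, not what testlib.sh does: `looks_like_json` and `parse_if_json` are hypothetical helpers, the check is deliberately crude (it only inspects the first character), and the parser command is passed in as an argument so the example runs without jq.

```shell
# Hypothetical guard: treat the body as JSON only if it starts with '{'.
# This is a crude check for illustration, not a real JSON validator.
looks_like_json() {
  case "$1" in
    '{'*) return 0 ;;
    *) return 1 ;;
  esac
}

# Pipe the body to the given parser command only when it looks like JSON;
# otherwise print nothing, so the caller sees "no count yet" instead of a
# parse error. In testlib.sh the parser would be the jq filter.
parse_if_json() {
  body=$1
  shift
  if looks_like_json "${body}"; then
    printf '%s\n' "${body}" | "$@"
  fi
}

# Usage with `cat` as a placeholder parser:
parse_if_json '<html><body>Error</body></html>' cat   # prints nothing
parse_if_json '{"beans": []}' cat                     # prints the JSON body
```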
One possible fix is to adjust the error handling (set +e / set -e) per method instead of using a generic set -e at the beginning. This would provide more predictable behavior. In our case count_datanode should never fail, as the caller method, wait_for_datanodes, can retry anyway.
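The retry idea can be sketched as follows: the caller treats a failed count as "not ready yet" and tries again rather than aborting. The names `flaky_count` and `wait_for_count` are hypothetical and only loosely modeled on the functions mentioned above; the probe fails twice on purpose to simulate an SCM endpoint that is still starting.

```shell
# Hypothetical probe: fails twice (as if the endpoint were not ready yet),
# then reports 3 healthy nodes. The attempt counter lives in a temp file
# because the probe is called inside a command substitution (a subshell).
attempts_file=$(mktemp)
echo 0 > "${attempts_file}"

flaky_count() {
  n=$(cat "${attempts_file}")
  n=$((n + 1))
  echo "${n}" > "${attempts_file}"
  [ "${n}" -ge 3 ] || return 1
  echo 3
}

# Retry until the probe reports the expected count or attempts run out.
# A failed probe is mapped to an empty count instead of a fatal error,
# so this also works when the calling script runs under `set -e`.
wait_for_count() {
  expected=$1
  max_attempts=$2
  i=0
  while [ "${i}" -lt "${max_attempts}" ]; do
    count=$(flaky_count) || count=""
    if [ "${count}" = "${expected}" ]; then
      return 0
    fi
    i=$((i + 1))
    # A real script would sleep here between attempts.
  done
  return 1
}

wait_for_count 3 5 && echo "datanodes ready"
```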
See: https://issues.apache.org/jira/browse/HDDS-1937