ci(test): exit 77 when tools missing to distinguish skip from pass #3055
bitflicker64 wants to merge 2 commits
All three startup test scripts previously exited 0 when required tools (lsof, curl, java) were not found. This is indistinguishable from a passing test run — CI shows green even though no tests ran. Change skip exits to 77 (conventional skip code) and update the CI workflow steps to treat exit 77 as a visible skip notice rather than a failure. Flagged as non-blocking follow-up in reviews of apache#3044 and apache#3047. Related to: apache#3043
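As a rough illustration of the skip convention described above, a startup script could check its prerequisites up front and return 77 when one is missing. The helper name `require_tools` and the `SKIP_EXIT` variable are assumptions made for this sketch; they are not taken from the actual test scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not copied from the HugeGraph test scripts.
SKIP_EXIT=77   # conventional "skipped" exit code

# Return 77 if any named tool is not on PATH, 0 otherwise.
require_tools() {
    local tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "SKIP: required tool not found: $tool" >&2
            return "$SKIP_EXIT"
        fi
    done
    return 0
}

# A script would typically call this before any test logic, for example:
#   require_tools lsof curl java || exit $?
```

With this shape, a caller can tell "nothing ran" (77) apart from "everything passed" (0) by looking only at the exit status.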
@dosu review?
I don't perform code reviews (assessing PRs for bugs, design, style, performance, or security). However, I can help in other ways: for example, I can research how the existing CI workflows or test scripts are structured, answer questions about the conventions used in the repository, or help with documentation updates related to this change. Would any of those be useful? To reply, just mention @dosu.
imbajin left a comment
Blocking: yes. Summary: The startup test wrapper still reports skipped suites as successful GitHub Actions steps. Evidence: static review of the workflow wrapper and local dummy-dist runs where the scripts returned 77.
Latest-head CI has visible failures; please also check dependency-check / build-commons (11): 🔗 https://github.com/apache/hugegraph/actions/runs/27120670959/job/80036825383
    set -e
    if [ $EXIT -eq 77 ]; then
      echo "::notice::Startup tests skipped — required tools not available"
      exit 0
Evidence: this wrapper turns the test script 77 into exit 0 after emitting a notice. I reproduced the script side with a dummy dist and restricted PATH: all three startup scripts return 77 when lsof is unavailable, but the workflow wrapper converts that case to a successful step. In GitHub Actions, ::notice:: only creates an annotation; the final exit 0 still marks the startup-test step as success.
Impact: the PR makes the logs clearer, but the check remains green exactly like a real passing startup suite. That keeps the original ambiguity for maintainers who inspect only the check result.
Requested fix: split this into a preflight step that writes can_run=false and a specific skip_reason to $GITHUB_OUTPUT, then guard the real startup-test step with if: steps.<preflight>.outputs.can_run == 'true'. Add a separate notice/summary step for the skip path. While doing that, keep the reason specific so the store ulimit -n < 1024 case is not reported as missing tools.
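The preflight idea in the requested fix could look roughly like the following shell body for a workflow `run:` step. This is only a sketch under assumptions: the function names `preflight_reason` and `write_preflight_outputs` are hypothetical, the fallback to `/dev/stdout` exists only so the snippet can be tried outside the Actions runner, and the 1024 threshold is the value quoted in the review.

```shell
#!/usr/bin/env bash
# Hypothetical preflight body; names are illustrative, not from the PR.
MIN_NOFILE=1024   # threshold quoted in the review for the Store suite

# Print a specific skip reason, or nothing when every prerequisite is met.
preflight_reason() {
    local tool nofile
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing tool: $tool"
            return 0
        fi
    done
    nofile=$(ulimit -n)
    if [ "$nofile" != "unlimited" ] && [ "$nofile" -lt "$MIN_NOFILE" ]; then
        echo "ulimit -n is $nofile (store requires >= $MIN_NOFILE)"
    fi
    return 0
}

# Append can_run and skip_reason as key=value lines for later steps.
# GITHUB_OUTPUT is provided by the Actions runner.
write_preflight_outputs() {
    local out="${GITHUB_OUTPUT:-/dev/stdout}" reason
    reason=$(preflight_reason "$@")
    if [ -z "$reason" ]; then
        echo "can_run=true" >> "$out"
    else
        echo "can_run=false" >> "$out"
        echo "skip_reason=$reason" >> "$out"
    fi
}

# Example call inside the step: write_preflight_outputs lsof curl java
```

Keeping the reason as a separate output means the missing-tool case and the low `ulimit -n` case stay distinguishable in whatever step reports the skip.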
Switched to the preflight/guard pattern as requested. The preflight step writes can_run and skip_reason to $GITHUB_OUTPUT; the test step is guarded by if: steps.<id>.outputs.can_run == 'true' so it shows as grey Skipped (not green Success) when prerequisites are missing. The Store preflight distinguishes 'missing tool: <name>' from 'ulimit -n is <N> (store requires >= 1024)' with separate skip_reason values. Pushing now.
Replace the exit-77 + set+e wrapper with a preflight step that writes can_run and skip_reason to GITHUB_OUTPUT. The test step is guarded by if: steps.<id>.outputs.can_run == 'true', so it shows as grey Skipped (not green Success) when prerequisites are missing. The Store preflight distinguishes 'missing tool: <name>' from 'ulimit -n is <N> (store requires >= 1024)' with separate skip_reason values as requested in review. Addresses blocking review feedback from imbajin on apache#3055. Related to: apache#3043
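For the separate skip-path step mentioned in the review, one possible shell body is sketched below. It assumes the skip reason is handed to the step as an argument (for instance through the step's `env` from the preflight output); the function name `report_skip` and the annotation title are made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical skip-path reporter; not taken from the PR's workflow files.

# Emit a notice annotation and, when the runner provides a summary file,
# append the same reason to the job summary.
report_skip() {
    local reason="${1:-unknown}"
    echo "::notice title=Startup tests skipped::$reason"
    if [ -n "${GITHUB_STEP_SUMMARY:-}" ]; then
        echo "Startup tests skipped: $reason" >> "$GITHUB_STEP_SUMMARY"
    fi
}

# Example call inside the step: report_skip "$SKIP_REASON"
```

Because this step only reports and the real test step is guarded separately, the annotation never changes the test step's own status.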
Purpose of the PR
The three startup test scripts (test-start-hugegraph.sh, test-start-hugegraph-pd.sh, test-start-hugegraph-store.sh) exit 0 when required tools (lsof, curl, java) are not found. Exit code 0 is indistinguishable from a passing test run: CI shows green even though no tests ran.
Main Changes
Test scripts:
- exit 0 → exit 77 in all tool-missing skip paths
- exit 0 → exit 77 in the Store ulimit -n skip path
CI workflows (server-ci.yml, pd-store-ci.yml):
- Wrap the test run in set +e / exit code capture
- Treat exit 77 as a visible ::notice:: annotation (not a failure)
Verifying these changes
- Remove lsof from the runner path → CI step shows ::notice::... skipped instead of silently passing
- Script exits 0, step passes as before
- Script exits 1, step fails as before
Does this PR potentially affect the following parts?
Documentation Status
Doc - No Need