-
Notifications
You must be signed in to change notification settings - Fork 86
PR to analyze failures based on the Tigers PR. #2038
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: oadp-dev
Are you sure you want to change the base?
Conversation
Original PR: openshift#2034 Author: Tiger Kaovilai <tkaovila@redhat.com> Date: Fri Nov 28 09:17:23 2025 -0500 refactor: update runtime permissions in failure analysis documentation and scripts - Replace `--allowedTools` with `--add-dir` for granting directory access in the analysis script. - Enhance documentation to clarify the use of `--add-dir` and `--allowedTools` for bypassing sandbox CWD restrictions. - Ensure consistent usage of CLI flags across the `analyze_failures.sh` script and design documentation. These changes improve clarity and functionality in the failure analysis process, ensuring proper access to necessary directories during runtime. Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
WalkthroughThese changes integrate Claude AI into the E2E testing pipeline for automated failure analysis. A new permissions configuration is added, documentation describes Claude integration requirements and workflows, the Dockerfile installs Claude CLI and Node.js dependencies, the Makefile's test-e2e target orchestrates failure analysis post-run with CI-specific credential setup, and a new analysis script preprocesses artifacts and invokes Claude via Vertex AI. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Areas requiring extra attention:
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: mpryc The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (5)
tests/e2e/scripts/analyze_failures.sh (2)
56-72: Tighten large‑log preprocessing to avoid duplication and follow shell best practicesTwo points worth adjusting here:
- Avoid masking
get_file_sizeexit status (SC2155)
Declaring and assigningfile_sizein one line can hide a failure fromget_file_size. Prefer:- local file_size=$(get_file_size "$log_file") + local file_size + file_size=$(get_file_size "$log_file")
- Prevent double‑processing must‑gather logs
The firstfind "${ARTIFACT_DIR}" ...walk will already include${ARTIFACT_DIR}/must-gather/**, and then you loop over${ARTIFACT_DIR}/must-gatheragain. That can duplicate work and inflate token usage for big runs. One simple fix is to excludemust-gatherfrom the first search:- done < <(find "${ARTIFACT_DIR}" -maxdepth 4 -name "*.log" -type f 2>/dev/null | while read f; do + done < <(find "${ARTIFACT_DIR}" -maxdepth 4 -name "*.log" -type f ! -path "${ARTIFACT_DIR}/must-gather/*" 2>/dev/null | while read f; doThis keeps per‑test logs and must‑gather logs logically separated in the two loops and avoids redundant subagent calls.
Also applies to: 119-167
21-22: Make exit‑code handling and temp‑file cleanup more robustThe failure‑analysis behavior is correct, but a couple of small tweaks will harden the script:
- Initialize and quote
EXIT_CODEdefensivelyRight now
EXIT_CODE=$1and laterif [ $EXIT_CODE -ne 0 ]; thenandexit $EXIT_CODE. This is fine when the caller always passes a numeric code, but it is easy to make the script more bullet‑proof:-EXIT_CODE=$1 +EXIT_CODE=${1:-0} ... -if [ $EXIT_CODE -ne 0 ]; then +if [ "$EXIT_CODE" -ne 0 ]; then ... -exit $EXIT_CODE +exit "$EXIT_CODE"
- Use a quoted single‑quoted trap to avoid premature expansion (SC2064)
- TEMP_OUTPUT=$(mktemp) - trap "rm -f $TEMP_OUTPUT" EXIT + TEMP_OUTPUT=$(mktemp) + trap 'rm -f "$TEMP_OUTPUT"' EXITThis ensures the trap always refers to the actual temp file path and behaves correctly even if the variable were changed later.
Also applies to: 169-181, 188-193, 365-367, 372-373, 432-432
docs/design/claude-prow-failure-analysis_design.md (1)
629-648: Updatetest-e2eMakefile snippet to match the current implementationThe Makefile example here has diverged from the real
test-e2etarget (flag names, provider/credentials wiring, and CI‑only Claude setup). For someone copying this snippet, that mismatch will be confusing and could break local runs.Recommend regenerating this section from the live
Makefile(including theEXIT_CODEhandling andanalyze_failures.shcall) so the design doc remains an accurate reference.build/ci-Dockerfile (1)
8-22: CI image wiring is correct; consider pinning Node/Claude versionsThe changes correctly:
- Ensure
tests/e2e/scripts/analyze_failures.shis executable for CI.- Switch kubectl installation to a multi‑arch flow using
TARGETARCH.- Add Node.js and the global
@anthropic-ai/claude-codeCLI needed by the analysis script.To improve reproducibility of CI images, consider pinning:
- A specific Node.js major/minor (via a fixed NodeSource channel or explicit package version).
- A specific
@anthropic-ai/claude-codeversion (e.g.npm install -g @anthropic-ai/claude-code@<version>).That will make it easier to audit and debug CI behavior over time when dependencies evolve.
Makefile (1)
860-886: EXIT_CODE capture and Claude invocation look good; optionally initialize EXIT_CODE explicitlyThe updated
test-e2etarget correctly:
- Runs ginkgo with the extended args (including
$(HCP_EXTERNAL_ARGS)).- Captures the test exit status into
EXIT_CODEvia... || EXIT_CODE=$$?.- In OpenShift CI, wires up Vertex AI env vars from
/var/run/oadp-credentials/*and invokesanalyze_failures.shwith the captured code.- Exits with the original test result so Prow status is preserved.
For extra robustness against odd environments where
EXIT_CODEmight already be set, you could initialize it in this recipe shell before running ginkgo:test-e2e: test-e2e-setup install-ginkgo EXIT_CODE=0; \ ginkgo run ... || EXIT_CODE=$$?; \ ... exit $$EXIT_CODEThe current behavior is still correct in normal CI usage; this is just a defensive hardening.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (6)
.claude/config.json(1 hunks)CLAUDE.md(1 hunks)Makefile(1 hunks)build/ci-Dockerfile(1 hunks)docs/design/claude-prow-failure-analysis_design.md(1 hunks)tests/e2e/scripts/analyze_failures.sh(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.
Files:
build/ci-DockerfileCLAUDE.mdtests/e2e/scripts/analyze_failures.shMakefiledocs/design/claude-prow-failure-analysis_design.md
🪛 LanguageTool
docs/design/claude-prow-failure-analysis_design.md
[style] ~12-~12: Consider a different adjective to strengthen your wording.
Context: ...analysis is time-consuming and requires deep domain knowledge of Velero, CSI snapsho...
(DEEP_PROFOUND)
[grammar] ~26-~26: Use a hyphen to join words.
Context: ...n't block test result reporting) ## Non Goals - Live cluster diagnostics during...
(QB_NEW_EN_HYPHEN)
[grammar] ~956-~956: Ensure spelling is correct
Context: ...ly more complex plumbing Decision: Chose Option B for better error isolation and...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
[grammar] ~1075-~1075: Ensure spelling is correct
Context: ...original test code - Vertex AI timeout (>10min): Script logs timeout, preserves test r...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
[style] ~1256-~1256: As an alternative to the over-used intensifier ‘very’, consider replacing this phrase.
Context: ...ifact Sets Question: How to handle very large must-gather archives or many per-test l...
(EN_WEAK_ADJECTIVE)
[style] ~1259-~1259: As an alternative to the over-used intensifier ‘very’, consider replacing this phrase.
Context: ...Claude Code CLI may truncate or fail on very large inputs - Some test runs generate extens...
(EN_WEAK_ADJECTIVE)
🪛 markdownlint-cli2 (0.18.1)
docs/design/claude-prow-failure-analysis_design.md
434-434: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
517-517: Headings must start at the beginning of the line
(MD023, heading-start-left)
527-527: Headings must start at the beginning of the line
(MD023, heading-start-left)
531-531: Headings must start at the beginning of the line
(MD023, heading-start-left)
532-532: Headings must start at the beginning of the line
(MD023, heading-start-left)
533-533: Headings must start at the beginning of the line
(MD023, heading-start-left)
534-534: Headings must start at the beginning of the line
(MD023, heading-start-left)
566-566: Code block style
Expected: fenced; Actual: indented
(MD046, code-block-style)
596-596: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
623-623: Heading levels should only increment by one level at a time
Expected: h2; Actual: h3
(MD001, heading-increment)
632-632: Hard tabs
Column: 1
(MD010, no-hard-tabs)
633-633: Hard tabs
Column: 1
(MD010, no-hard-tabs)
634-634: Hard tabs
Column: 1
(MD010, no-hard-tabs)
635-635: Hard tabs
Column: 1
(MD010, no-hard-tabs)
636-636: Hard tabs
Column: 1
(MD010, no-hard-tabs)
637-637: Hard tabs
Column: 1
(MD010, no-hard-tabs)
638-638: Hard tabs
Column: 1
(MD010, no-hard-tabs)
639-639: Hard tabs
Column: 1
(MD010, no-hard-tabs)
640-640: Hard tabs
Column: 1
(MD010, no-hard-tabs)
641-641: Hard tabs
Column: 1
(MD010, no-hard-tabs)
642-642: Hard tabs
Column: 1
(MD010, no-hard-tabs)
643-643: Hard tabs
Column: 1
(MD010, no-hard-tabs)
644-644: Hard tabs
Column: 1
(MD010, no-hard-tabs)
645-645: Hard tabs
Column: 1
(MD010, no-hard-tabs)
646-646: Hard tabs
Column: 1
(MD010, no-hard-tabs)
647-647: Hard tabs
Column: 1
(MD010, no-hard-tabs)
757-757: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
788-788: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
821-821: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
850-850: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
873-873: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🪛 Shellcheck (0.11.0)
tests/e2e/scripts/analyze_failures.sh
[warning] 62-62: Declare and assign separately to avoid masking return values.
(SC2155)
[warning] 366-366: Use single quotes, otherwise this expands now rather than when signalled.
(SC2064)
🔇 Additional comments (3)
.claude/config.json (1)
1-45: Permissions policy matches intended read‑only, sandboxed usageAllow/deny lists are coherent with the log‑analysis use case and correctly block destructive and networked actions (rm, curl/wget, kubectl apply/delete, docker, git push, WebSearch/WebFetch). No issues from a security/CI safety standpoint.
tests/e2e/scripts/analyze_failures.sh (1)
27-44: Overall flow, safety checks, and redaction look solidThe script’s structure is sound:
- It gates analysis on a non‑zero test exit code and explicit CLI/Vertex configuration checks.
- It keeps analysis read‑only and bounded via
timeout(60s per subagent, 600s overall) and early exits that preserve the original test result.- All user‑visible Claude output is passed through
redact_secrets, which substantially reduces credential‑leakage risk in the generated markdown.No blocking issues here from a correctness or security perspective.
Also applies to: 169-181, 368-424
CLAUDE.md (1)
1-239: Claude integration overview is consistent with the implementationThe description of how Claude hooks into Prow (artifacts used, where
claude-failure-analysis.mdappears, and the opt‑out/credential behavior) matches the bash script and Makefile wiring. This should be a useful high‑level entry point for maintainers.
|
/test 4.20-e2e-test-aws |
|
/test all |
|
@mpryc: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Original PR: #2034
Why the changes were made
How to test the changes made