claude: add skill for running local kurtosis testnets#20876
Conversation
There was a problem hiding this comment.
Pull request overview
Adds a new Claude skill document to guide running and iterating on local Kurtosis ethereum-package testnets against a locally built test/erigon:current image, mirroring the repo’s test-kurtosis-assertoor CI flow.
Changes:
- Introduces a new
kurtosis-testskill with end-to-end local workflow steps (build image → run enclave → monitor → triage → rebuild/rerun loop). - Documents CI-aligned
ethereum-packagebranch pinning guidance and troubleshooting tips.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
yperbasis
left a comment
There was a problem hiding this comment.
Blocking issues
-
The image-freshness check is GNU-only (won't run on macOS). SKILL.md:84
img_ts=$(... | xargs -I{} date -d {} +%s ...)
On BSD date (macOS), -d is "set DST flag", not "parse this string" — verified locally just now: date -d '2025-01-01' +%s → date: illegal option -- d. CLAUDE.md is explicit: "all
scripts and shell code… must be cross platform, for macOS, Windows and Linux." Fix is one Go-template hop:
img_ts=$(docker image inspect test/erigon:current --format '{{.Created.Unix}}' 2>/dev/null || echo 0) -
The freshness check ignores uncommitted changes. SKILL.md:85
src_ts=$(git log -1 --format=%ct)
This compares to the last commit's timestamp. The skill's whole premise is the fix → rebuild → rerun loop where the user (or the skill itself, via Edit) is making local source edits —
those won't be in git log yet, so the check will skip the rebuild and silently run the stale image. Either always rebuild and lean on BuildKit's layer cache, or compare against find
. -name '*.go' -newer -print -quit. Always-rebuild is simpler and matches CI.
Real bugs (medium)
-
Block-progress monitor never honors duration. SKILL.md:120-138 — the while true loop only exits on stall (deadline exceeded). The Inputs table at SKILL.md:24 promises duration=Nm
is "the wall-clock window the monitor watches before declaring stable if no failures trip," but no duration variable appears anywhere in the body. Add either:
end=$(( $(date +%s) + duration_secs ))
while [ "$(date +%s)" -lt "$end" ]; do ...
or drop duration from the inputs. -
The service-name regex won't match kurtosis enclave inspect output. SKILL.md:101, 144
grep -oE '^el-[0-9]+-erigon-[a-z]+'
That output is columnar — UUID, then service name, indented. The ^ anchor will fail; the script will silently set EL_SERVICE="", and every subsequent kurtosis port print "$ENCLAVE" ""
returns nothing, then EL_RPC_PORT is empty, and curl quietly hits port 0. Either drop the anchor (grep -oE 'el-[0-9]+-erigon-[a-z]+') or pipe through awk '$2 ~ /^el-/ {print $2}'.
Same fix for the cross-client log loop at SKILL.md:144.
Worth verifying with the actual CLI
-
kurtosis service exec -- syntax. SKILL.md:200, 202 — newer kurtosis CLI versions take the command as a single quoted string (kurtosis service exec
"ls -la /jwt/"), not -- . Worth confirming against the version pinned for this skill. -
kurtosis service logs --follow=false. SKILL.md:140, 148 — kurtosis service logs is one-shot by default; --follow is what enables tailing. --follow=false may parse fine but is
redundant. Drop the flag.
Smaller stuff
- Inputs duration/auto/max-attempts are listed but never parsed in the bash. SKILL.md:18-30 + body. Fine if the model is the parser, but say so explicitly.
- Triage suggestion eth_call for payload replay. SKILL.md:172 — eth_call runs one tx against committed state; for full payload replay you want debug_traceBlockByNumber /
debug_traceBlockByHash. Trim the eth_call mention. - Style: the skill is noticeably wordier than hive-test/SKILL.md and launch-bal-devnet-2/SKILL.md. Lines like "The user's rule: challenge other implementations against erigon — figure
out who is in the wrong" read like AI commentary. The same point is made by the triage table; the editorializing prose can come out without losing information.
- Inline the config -> ethereum-package branch mapping as a real `case` block (previously referenced an undefined `get_package_ref_from_config_path`). - Add caplin-assertoor.io to the mapping and explicitly note caplin suites must use the erigontech fork. - Reconcile GpuConfig guidance: the 1.18.1 CLI requirement only applies to ethereum-package branches that include the GpuConfig built-in; the pinned branches in the mapping (5.0.1 / 6.1.0 / erigontech fork) all predate it. - Update troubleshooting row to point at the pinned versions instead of the contradictory "<= 6.0.0". - Reorder kurtosis run args to put the package ref immediately after `run`, matching the convention in kurtosis docs. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
@yperbasis addressed 5 accepted (substantive issues fixed):
4 challenged (Copilot hallucinations):
1 partially challenged:
|
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 6 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| ```bash | ||
| img_ts=$(docker image inspect test/erigon:current --format '{{.Created}}' 2>/dev/null \ | ||
| | xargs -I{} date -d {} +%s 2>/dev/null || echo 0) |
There was a problem hiding this comment.
The “skip rebuild if image is fresh” snippet uses date -d to parse the Docker image Created timestamp. date -d is GNU-date specific and will fail on macOS/BSD environments, which makes the snippet non-runnable for many local setups. Consider either (a) documenting the GNU date/coreutils requirement for this optimization, or (b) replacing the parsing with a more portable approach (e.g., python3 -c ...), or (c) dropping the optimization and always rebuilding.
| | xargs -I{} date -d {} +%s 2>/dev/null || echo 0) | |
| | python3 -c 'import sys; from datetime import datetime, timezone; s = sys.stdin.read().strip(); print(int(datetime.fromisoformat(s.replace("Z", "+00:00")).astimezone(timezone.utc).timestamp()) if s else 0)' \ | |
| 2>/dev/null || echo 0) |
| ## Monitor | ||
|
|
||
| Three checks run on a polling loop until either (a) `duration` elapses, or (b) any | ||
| failure trips. All three are captured in the run history. | ||
|
|
There was a problem hiding this comment.
This section says the three checks run on a polling loop until either duration elapses or a failure trips, but the only provided loop is the block-height loop and it has no overall duration cutoff (it runs indefinitely unless stalled). Either update the example to include a duration end-time check, or clarify that duration handling is implemented outside these snippets so readers don’t assume the shown loop matches the described behavior.
| -H 'Content-Type: application/json' \ | ||
| -d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' \ | ||
| | jq -r '.result') | ||
| height=$(printf '%d\n' "$height_hex") |
There was a problem hiding this comment.
In the block-height snippet, if the RPC isn’t reachable yet (or jq returns null), height_hex may be empty/null, and printf '%d' will emit an “invalid number” warning and treat it as 0, potentially misclassifying the situation as a stall. Consider explicitly validating that .result matches ^0x[0-9a-fA-F]+$ (and treating anything else as an immediate failure with a helpful message) before converting to an integer.
| height=$(printf '%d\n' "$height_hex") | |
| case "$height_hex" in | |
| 0x[0-9a-fA-F]*) | |
| height=$((16#${height_hex#0x})) | |
| ;; | |
| *) | |
| echo "FAIL: invalid eth_blockNumber result from EL RPC on port ${EL_RPC_PORT}: ${height_hex:-<empty>}" | |
| break | |
| ;; | |
| esac |
|
|
||
| | Argument | Default | Notes | | ||
| |---|---|---| | ||
| | `$1` config path | required | Path to an `ethereum-package` YAML (same schema as `.github/workflows/kurtosis/*.io`). If omitted, list those files and ask the user to pick. | |
There was a problem hiding this comment.
The “If $1 is omitted, list those files” guidance points at .github/workflows/kurtosis/*.io, but that directory contains both ethereum-package args files (participants/participants_matrix) and assertoor playbooks (id/tasks). Listing all *.io makes it easy to pick a non-ethereum-package file that kurtosis run --args-file can’t consume. Consider filtering the suggested list (e.g., only files that contain participants:/participants_matrix: or match *-assertoor.io plus known suites), or explicitly warning that assertoor playbooks in that folder are not valid --args-file inputs.
| | `$1` config path | required | Path to an `ethereum-package` YAML (same schema as `.github/workflows/kurtosis/*.io`). If omitted, list those files and ask the user to pick. | | |
| | `$1` config path | required | Path to an `ethereum-package` YAML. Valid examples live among `.github/workflows/kurtosis/*.io`, but that directory also contains assertoor playbooks that are **not** valid `kurtosis run --args-file` inputs. If omitted, list only the `.io` files that look like ethereum-package args files (for example, files containing `participants:` or `participants_matrix:`) and ask the user to pick from that filtered set. | |
| ## Prerequisites | ||
|
|
||
| 1. **Docker** running: `docker info >/dev/null` should succeed. | ||
| 2. **Kurtosis CLI** installed: `kurtosis version`. Install from | ||
| https://docs.kurtosis.com/install if missing. CLI **≥ 1.18.1** is only needed when | ||
| the `--package@branch` you run includes the `GpuConfig` Starlark built-in (i.e. |
There was a problem hiding this comment.
The prerequisites list doesn’t mention jq (and curl), but multiple later snippets rely on them (jq -r '.result', parsing assertoor API, etc.). Add them as explicit prerequisites (or add a short “requires curl+jq” note) so users don’t hit confusing runtime failures when copy/pasting.
| 3. **Erigon source tree** at the cwd: `Makefile` exists and `go.mod` contains | ||
| `module github.com/erigontech/erigon`. | ||
| 4. **Fork detection** — skim the YAML for fork-epoch keys | ||
| (`{cancun,prague,osaka,fulu,gloas,amsterdam}_fork_epoch`) to know which fork is |
There was a problem hiding this comment.
The fork-epoch key list omits keys that appear in the repo’s kurtosis configs (e.g. electra_fork_epoch in pectra.io, deneb_fork_epoch in caplin configs). This can cause incorrect fork identification when following these steps. Consider expanding the list to cover the fork keys used in .github/workflows/kurtosis/*.io (at least deneb/electra in addition to the newer ones).
| (`{cancun,prague,osaka,fulu,gloas,amsterdam}_fork_epoch`) to know which fork is | |
| (`{deneb,cancun,electra,prague,osaka,fulu,gloas,amsterdam}_fork_epoch`) to know which fork is |
Blocking issues: - Drop the GNU-only `date -d` freshness check; always rebuild instead. BuildKit's layer cache makes the no-op fast and this picks up uncommitted source edits which a `git log` check would miss. Real bugs: - Block-progress monitor now honors `duration_secs`: the loop exits on STABLE after the full window or STALL on no progress. - Switch service-name extraction from `grep -oE '^...'` (which never matches columnar `enclave inspect` output) to awk on field 2. CLI usage: - `kurtosis service exec` takes the command as a single positional arg (multi-word quoted), not `-- <cmd>`. Verified via `--help`. - Drop redundant `--follow=false` from `kurtosis service logs` (one-shot by default). Smaller: - Note that the model parses Inputs args into shell vars. - Replace `eth_call` with `debug_traceBlockByNumber/Hash` for payload replay in the triage table (eth_call runs against committed state). - Trim the "user's rule" editorial prose; the triage table carries the same information. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- RPC validation in monitor: guard `printf '%d'` against non-hex responses (RPC unreachable, jq null), so the loop logs "RPC unreachable" and stays in the stall window without spamming printf "invalid number" warnings or falsely incrementing height. - Add `curl` and `jq` to prerequisites — both are used by the monitor and triage snippets. - Fix fork-epoch key list: configs use CL-side names (`deneb`/`electra`/`fulu`/`gloas`_fork_epoch), not the EL names the previous list had (`cancun`/`prague`/`osaka`/`amsterdam`). Map back to EL names so users still recognise the fork. - Sharpen `$1` config-path guidance: `.github/workflows/kurtosis/` contains both ethereum-package args files and assertoor playbooks. Direct users to `grep -lE '^participants(_matrix)?:'` to filter to valid args files only. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
yperbasis
left a comment
There was a problem hiding this comment.
Real bugs (medium):
- Stall detection can trip before the first block. SKILL.md monitor section. stall_deadline is initialized to start + stall_window (e.g. T+36s for a 12s slot), and is only refreshed
once height > prev. If the genesis-to-first-block gap exceeds 3 × seconds_per_slot — likely on cold-start configs with genesis_delay, e.g. glamsterdam.io has genesis_delay: 20 on a 6s
slot, so first block can land near T+26..32s, dangerously close to the T+18s stall window — a false STALL fires. Fix: gate the stall check on observed progress:
if [ "$prev" -gt 0 ] && [ "$(date +%s)" -gt "$stall_deadline" ]; then
...
fi - Both STALL and STABLE can print. Same block. After break from the STALL branch, the post-loop if [ "$(date +%s)" -ge "$end" ] can also be true if the stall happened to land in the
last poll window. Use a flag (stable=true; … stable=false; break) so you print exactly one outcome.
Doc inconsistencies (low):
- Cleanup section advertises trap but doesn't set one. The header reads Run on success, failure, and signal interruption (trap): but the snippet has no trap '... cleanup ...' EXIT INT
TERM. Either drop the (trap) mention or actually show the trap installation. - Check A pass criterion description ("advances ≥1 within every 2 × seconds_per_slot window") doesn't quite match the code (stall_window = 3 × slot). The 2× is the poll cadence, the
3× is the failure threshold — easy to clarify.
Nits:
- slot_secs=$(grep -E '^\s*seconds_per_slot:' …) works (verified \s parses on macOS BSD grep against the actual YAMLs in .github/workflows/kurtosis/), so no portability fix needed
there. But it would silently break if a YAML quotes the value (seconds_per_slot: "12"); none of the current configs do, so this is a future-proofing note, not a bug. - caplin-assertoor.io is in the mapping table but not in the CI matrix — the author mentioned this is intentional (skill is broader than CI). Fine to keep, just worth knowing.
Real bugs: - Don't trip stall before first block. Configs with `genesis_delay` > `stall_window` (e.g. glamsterdam.io: 20s delay vs 18s window at 6s slots) would fire a false STALL during the pre-genesis gap. Gate the stall check on `prev > 0` so only real post-genesis stalls fire. - Single outcome guarantee. Track outcome in a variable so STALL and STABLE can't both print on the same iteration when the stall lands in the last poll window. Adds NO_PROGRESS as a third explicit outcome for "chain never started" (validators down, etc.) which the previous code would have mislabelled as STABLE. Doc: - Drop the misleading `(trap)` mention from Cleanup — the snippet does not install a trap; the model executes cleanup as a final step. - Fix Check A pass criterion: `2 ×` is poll cadence, `3 ×` is the stall window / failure threshold. Previous text said advance must happen within `2 × seconds_per_slot`, conflating the two. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
addressed |
No description provided.