claude: add skill for running local kurtosis testnets by taratorio · Pull Request #20876 · erigontech/erigon

taratorio · 2026-04-28T12:51:36Z

No description provided.

Copilot

Pull request overview

Adds a new Claude skill document to guide running and iterating on local Kurtosis ethereum-package testnets against a locally built test/erigon:current image, mirroring the repo’s test-kurtosis-assertoor CI flow.

Changes:

Introduces a new kurtosis-test skill with end-to-end local workflow steps (build image → run enclave → monitor → triage → rebuild/rerun loop).
Documents CI-aligned ethereum-package branch pinning guidance and troubleshooting tips.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

yperbasis

Blocking issues

The image-freshness check is GNU-only (won't run on macOS). SKILL.md:84
img_ts=$(... | xargs -I{} date -d {} +%s ...)
On BSD date (macOS), -d is "set DST flag", not "parse this string" — verified locally just now: date -d '2025-01-01' +%s → date: illegal option -- d. CLAUDE.md is explicit: "all
scripts and shell code… must be cross platform, for macOS, Windows and Linux." Fix is one Go-template hop:
img_ts=$(docker image inspect test/erigon:current --format '{{.Created.Unix}}' 2>/dev/null || echo 0)
The freshness check ignores uncommitted changes. SKILL.md:85
src_ts=$(git log -1 --format=%ct)
This compares to the last commit's timestamp. The skill's whole premise is the fix → rebuild → rerun loop where the user (or the skill itself, via Edit) is making local source edits —
those won't be in git log yet, so the check will skip the rebuild and silently run the stale image. Either always rebuild and lean on BuildKit's layer cache, or compare against find
. -name '*.go' -newer -print -quit. Always-rebuild is simpler and matches CI.

Real bugs (medium)

Block-progress monitor never honors duration. SKILL.md:120-138 — the while true loop only exits on stall (deadline exceeded). The Inputs table at SKILL.md:24 promises duration=Nm
is "the wall-clock window the monitor watches before declaring stable if no failures trip," but no duration variable appears anywhere in the body. Add either:
end=$(( $(date +%s) + duration_secs ))
while [ "$(date +%s)" -lt "$end" ]; do ...
or drop duration from the inputs.
The service-name regex won't match kurtosis enclave inspect output. SKILL.md:101, 144
grep -oE '^el-[0-9]+-erigon-[a-z]+'
That output is columnar — UUID, then service name, indented. The ^ anchor will fail; the script will silently set EL_SERVICE="", and every subsequent kurtosis port print "$ENCLAVE" ""
returns nothing, then EL_RPC_PORT is empty, and curl quietly hits port 0. Either drop the anchor (grep -oE 'el-[0-9]+-erigon-[a-z]+') or pipe through awk '$2 ~ /^el-/ {print $2}'.
Same fix for the cross-client log loop at SKILL.md:144.

Worth verifying with the actual CLI

kurtosis service exec -- syntax. SKILL.md:200, 202 — newer kurtosis CLI versions take the command as a single quoted string (kurtosis service exec
"ls -la /jwt/"), not -- . Worth confirming against the version pinned for this skill.
kurtosis service logs --follow=false. SKILL.md:140, 148 — kurtosis service logs is one-shot by default; --follow is what enables tailing. --follow=false may parse fine but is
redundant. Drop the flag.

Smaller stuff

Inputs duration/auto/max-attempts are listed but never parsed in the bash. SKILL.md:18-30 + body. Fine if the model is the parser, but say so explicitly.
Triage suggestion eth_call for payload replay. SKILL.md:172 — eth_call runs one tx against committed state; for full payload replay you want debug_traceBlockByNumber /
debug_traceBlockByHash. Trim the eth_call mention.
Style: the skill is noticeably wordier than hive-test/SKILL.md and launch-bal-devnet-2/SKILL.md. Lines like "The user's rule: challenge other implementations against erigon — figure
out who is in the wrong" read like AI commentary. The same point is made by the triage table; the editorializing prose can come out without losing information.

- Inline the config -> ethereum-package branch mapping as a real `case` block (previously referenced an undefined `get_package_ref_from_config_path`). - Add caplin-assertoor.io to the mapping and explicitly note caplin suites must use the erigontech fork. - Reconcile GpuConfig guidance: the 1.18.1 CLI requirement only applies to ethereum-package branches that include the GpuConfig built-in; the pinned branches in the mapping (5.0.1 / 6.1.0 / erigontech fork) all predate it. - Update troubleshooting row to point at the pinned versions instead of the contradictory "<= 6.0.0". - Reorder kurtosis run args to put the package ref immediately after `run`, matching the convention in kurtosis docs. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

taratorio · 2026-04-28T16:24:10Z

@yperbasis addressed

5 accepted (substantive issues fixed):

Inlined the case "$(basename "$CONFIG")" block — was previously calling an
undefined get_package_ref_from_config_path.
Added caplin-assertoor.io to both the mapping table and the case block (was
missing — would have hit the 5.0.1 default fallback).
Reconciled the GpuConfig contradiction in Prerequisites (1.18.1 only needed for
branches that include the built-in; the pinned branches all predate it).
Reconciled the GpuConfig contradiction in Troubleshooting (was saying "≤ 6.0.0"
while the rest of the doc pins to 6.1.0).
Reordered kurtosis run args to put the package ref immediately after run, per
kurtosis-docs convention.

4 challenged (Copilot hallucinations):

Four separate comments claimed tables used ||-prefixed rows. They don't — every
table in the file uses standard single-pipe markdown. Verified with grep -n '^||'
SKILL.md returning nothing. Pushed back on each with the verification.

1 partially challenged:

The kurtosis run ordering comment claimed wrong order would cause "CLI parse
errors". I accepted the reorder for stylistic consistency but flagged that the
reasoning is incorrect — kurtosis is Cobra-based and accepts flags in any order.

Copilot

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 6 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-29T03:28:18Z

+
+```bash
+img_ts=$(docker image inspect test/erigon:current --format '{{.Created}}' 2>/dev/null \
+         | xargs -I{} date -d {} +%s 2>/dev/null || echo 0)


The “skip rebuild if image is fresh” snippet uses date -d to parse the Docker image Created timestamp. date -d is GNU-date specific and will fail on macOS/BSD environments, which makes the snippet non-runnable for many local setups. Consider either (a) documenting the GNU date/coreutils requirement for this optimization, or (b) replacing the parsing with a more portable approach (e.g., python3 -c ...), or (c) dropping the optimization and always rebuilding.

Suggested change

| xargs -I{} date -d {} +%s 2>/dev/null || echo 0)

| python3 -c 'import sys; from datetime import datetime, timezone; s = sys.stdin.read().strip(); print(int(datetime.fromisoformat(s.replace("Z", "+00:00")).astimezone(timezone.utc).timestamp()) if s else 0)' \

2>/dev/null || echo 0)

Copilot · 2026-04-29T03:28:18Z

+## Monitor
+
+Three checks run on a polling loop until either (a) `duration` elapses, or (b) any
+failure trips. All three are captured in the run history.
+


This section says the three checks run on a polling loop until either duration elapses or a failure trips, but the only provided loop is the block-height loop and it has no overall duration cutoff (it runs indefinitely unless stalled). Either update the example to include a duration end-time check, or clarify that duration handling is implemented outside these snippets so readers don’t assume the shown loop matches the described behavior.

Copilot · 2026-04-29T03:28:19Z

+    -H 'Content-Type: application/json' \
+    -d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' \
+    | jq -r '.result')
+  height=$(printf '%d\n' "$height_hex")


In the block-height snippet, if the RPC isn’t reachable yet (or jq returns null), height_hex may be empty/null, and printf '%d' will emit an “invalid number” warning and treat it as 0, potentially misclassifying the situation as a stall. Consider explicitly validating that .result matches ^0x[0-9a-fA-F]+$ (and treating anything else as an immediate failure with a helpful message) before converting to an integer.

Suggested change

height=$(printf '%d\n' "$height_hex")

case "$height_hex" in

0x[0-9a-fA-F]*)

height=$((16#${height_hex#0x}))

;;

*)

echo "FAIL: invalid eth_blockNumber result from EL RPC on port ${EL_RPC_PORT}: ${height_hex:-<empty>}"

break

;;

esac

Copilot · 2026-04-29T03:28:19Z

+
+| Argument | Default | Notes |
+|---|---|---|
+| `$1` config path | required | Path to an `ethereum-package` YAML (same schema as `.github/workflows/kurtosis/*.io`). If omitted, list those files and ask the user to pick. |


The “If $1 is omitted, list those files” guidance points at .github/workflows/kurtosis/*.io, but that directory contains both ethereum-package args files (participants/participants_matrix) and assertoor playbooks (id/tasks). Listing all *.io makes it easy to pick a non-ethereum-package file that kurtosis run --args-file can’t consume. Consider filtering the suggested list (e.g., only files that contain participants:/participants_matrix: or match *-assertoor.io plus known suites), or explicitly warning that assertoor playbooks in that folder are not valid --args-file inputs.

Suggested change

| `$1` config path | required | Path to an `ethereum-package` YAML (same schema as `.github/workflows/kurtosis/*.io`). If omitted, list those files and ask the user to pick. |

| `$1` config path | required | Path to an `ethereum-package` YAML. Valid examples live among `.github/workflows/kurtosis/*.io`, but that directory also contains assertoor playbooks that are **not** valid `kurtosis run --args-file` inputs. If omitted, list only the `.io` files that look like ethereum-package args files (for example, files containing `participants:` or `participants_matrix:`) and ask the user to pick from that filtered set. |

Copilot · 2026-04-29T03:28:19Z

+## Prerequisites
+
+1. **Docker** running: `docker info >/dev/null` should succeed.
+2. **Kurtosis CLI** installed: `kurtosis version`. Install from
+   https://docs.kurtosis.com/install if missing. CLI **≥ 1.18.1** is only needed when
+   the `--package@branch` you run includes the `GpuConfig` Starlark built-in (i.e.


The prerequisites list doesn’t mention jq (and curl), but multiple later snippets rely on them (jq -r '.result', parsing assertoor API, etc.). Add them as explicit prerequisites (or add a short “requires curl+jq” note) so users don’t hit confusing runtime failures when copy/pasting.

Copilot · 2026-04-29T03:28:19Z

+3. **Erigon source tree** at the cwd: `Makefile` exists and `go.mod` contains
+   `module github.com/erigontech/erigon`.
+4. **Fork detection** — skim the YAML for fork-epoch keys
+   (`{cancun,prague,osaka,fulu,gloas,amsterdam}_fork_epoch`) to know which fork is


The fork-epoch key list omits keys that appear in the repo’s kurtosis configs (e.g. electra_fork_epoch in pectra.io, deneb_fork_epoch in caplin configs). This can cause incorrect fork identification when following these steps. Consider expanding the list to cover the fork keys used in .github/workflows/kurtosis/*.io (at least deneb/electra in addition to the newer ones).

Suggested change

(`{cancun,prague,osaka,fulu,gloas,amsterdam}_fork_epoch`) to know which fork is

(`{deneb,cancun,electra,prague,osaka,fulu,gloas,amsterdam}_fork_epoch`) to know which fork is

Blocking issues: - Drop the GNU-only `date -d` freshness check; always rebuild instead. BuildKit's layer cache makes the no-op fast and this picks up uncommitted source edits which a `git log` check would miss. Real bugs: - Block-progress monitor now honors `duration_secs`: the loop exits on STABLE after the full window or STALL on no progress. - Switch service-name extraction from `grep -oE '^...'` (which never matches columnar `enclave inspect` output) to awk on field 2. CLI usage: - `kurtosis service exec` takes the command as a single positional arg (multi-word quoted), not `-- <cmd>`. Verified via `--help`. - Drop redundant `--follow=false` from `kurtosis service logs` (one-shot by default). Smaller: - Note that the model parses Inputs args into shell vars. - Replace `eth_call` with `debug_traceBlockByNumber/Hash` for payload replay in the triage table (eth_call runs against committed state). - Trim the "user's rule" editorial prose; the triage table carries the same information. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

- RPC validation in monitor: guard `printf '%d'` against non-hex responses (RPC unreachable, jq null), so the loop logs "RPC unreachable" and stays in the stall window without spamming printf "invalid number" warnings or falsely incrementing height. - Add `curl` and `jq` to prerequisites — both are used by the monitor and triage snippets. - Fix fork-epoch key list: configs use CL-side names (`deneb`/`electra`/`fulu`/`gloas`_fork_epoch), not the EL names the previous list had (`cancun`/`prague`/`osaka`/`amsterdam`). Map back to EL names so users still recognise the fork. - Sharpen `$1` config-path guidance: `.github/workflows/kurtosis/` contains both ethereum-package args files and assertoor playbooks. Direct users to `grep -lE '^participants(_matrix)?:'` to filter to valid args files only. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

yperbasis

Real bugs (medium):

Stall detection can trip before the first block. SKILL.md monitor section. stall_deadline is initialized to start + stall_window (e.g. T+36s for a 12s slot), and is only refreshed
once height > prev. If the genesis-to-first-block gap exceeds 3 × seconds_per_slot — likely on cold-start configs with genesis_delay, e.g. glamsterdam.io has genesis_delay: 20 on a 6s
slot, so first block can land near T+26..32s, dangerously close to the T+18s stall window — a false STALL fires. Fix: gate the stall check on observed progress:
if [ "$prev" -gt 0 ] && [ "$(date +%s)" -gt "$stall_deadline" ]; then
...
fi
Both STALL and STABLE can print. Same block. After break from the STALL branch, the post-loop if [ "$(date +%s)" -ge "$end" ] can also be true if the stall happened to land in the
last poll window. Use a flag (stable=true; … stable=false; break) so you print exactly one outcome.

Doc inconsistencies (low):

Cleanup section advertises trap but doesn't set one. The header reads Run on success, failure, and signal interruption (trap): but the snippet has no trap '... cleanup ...' EXIT INT
TERM. Either drop the (trap) mention or actually show the trap installation.
Check A pass criterion description ("advances ≥1 within every 2 × seconds_per_slot window") doesn't quite match the code (stall_window = 3 × slot). The 2× is the poll cadence, the
3× is the failure threshold — easy to clarify.

Nits:

slot_secs=$(grep -E '^\s*seconds_per_slot:' …) works (verified \s parses on macOS BSD grep against the actual YAMLs in .github/workflows/kurtosis/), so no portability fix needed
there. But it would silently break if a YAML quotes the value (seconds_per_slot: "12"); none of the current configs do, so this is a future-proofing note, not a bug.
caplin-assertoor.io is in the mapping table but not in the CI matrix — the author mentioned this is intentional (skill is broader than CI). Fine to keep, just worth knowing.

Real bugs: - Don't trip stall before first block. Configs with `genesis_delay` > `stall_window` (e.g. glamsterdam.io: 20s delay vs 18s window at 6s slots) would fire a false STALL during the pre-genesis gap. Gate the stall check on `prev > 0` so only real post-genesis stalls fire. - Single outcome guarantee. Track outcome in a variable so STALL and STABLE can't both print on the same iteration when the stall lands in the last poll window. Adds NO_PROGRESS as a third explicit outcome for "chain never started" (validators down, etc.) which the previous code would have mislabelled as STABLE. Doc: - Drop the misleading `(trap)` mention from Cleanup — the snippet does not install a trap; the model executes cleanup as a final step. - Fix Check A pass criterion: `2 ×` is poll cadence, `3 ×` is the stall window / failure threshold. Previous text said advance must happen within `2 × seconds_per_slot`, conflating the two. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

taratorio · 2026-04-29T14:48:10Z

Real bugs (medium):

Stall detection can trip before the first block. SKILL.md monitor section. stall_deadline is initialized to start + stall_window (e.g. T+36s for a 12s slot), and is only refreshed
once height > prev. If the genesis-to-first-block gap exceeds 3 × seconds_per_slot — likely on cold-start configs with genesis_delay, e.g. glamsterdam.io has genesis_delay: 20 on a 6s
slot, so first block can land near T+26..32s, dangerously close to the T+18s stall window — a false STALL fires. Fix: gate the stall check on observed progress:
if [ "$prev" -gt 0 ] && [ "$(date +%s)" -gt "$stall_deadline" ]; then
...
fi

Both STALL and STABLE can print. Same block. After break from the STALL branch, the post-loop if [ "$(date +%s)" -ge "$end" ] can also be true if the stall happened to land in the
last poll window. Use a flag (stable=true; … stable=false; break) so you print exactly one outcome.

Doc inconsistencies (low):

Cleanup section advertises trap but doesn't set one. The header reads Run on success, failure, and signal interruption (trap): but the snippet has no trap '... cleanup ...' EXIT INT
TERM. Either drop the (trap) mention or actually show the trap installation.

Check A pass criterion description ("advances ≥1 within every 2 × seconds_per_slot window") doesn't quite match the code (stall_window = 3 × slot). The 2× is the poll cadence, the
3× is the failure threshold — easy to clarify.

Nits:

slot_secs=$(grep -E '^\s*seconds_per_slot:' …) works (verified \s parses on macOS BSD grep against the actual YAMLs in .github/workflows/kurtosis/), so no portability fix needed
there. But it would silently break if a YAML quotes the value (seconds_per_slot: "12"); none of the current configs do, so this is a future-proofing note, not a bug.

caplin-assertoor.io is in the mapping table but not in the CI matrix — the author mentioned this is intentional (skill is broader than CI). Fine to keep, just worth knowing.

addressed

claude: add skill for running local kurtosis testnets

375be35

taratorio requested review from mh0lt and yperbasis April 28, 2026 12:51

taratorio requested review from AskAlexSharov and Giulio2002 as code owners April 28, 2026 12:51

yperbasis requested a review from Copilot April 28, 2026 13:12

Copilot started reviewing on behalf of yperbasis April 28, 2026 13:13 View session

Copilot AI reviewed Apr 28, 2026

View reviewed changes

yperbasis requested changes Apr 28, 2026

View reviewed changes

AskAlexSharov approved these changes Apr 29, 2026

View reviewed changes

AskAlexSharov requested a review from Copilot April 29, 2026 03:25

Copilot started reviewing on behalf of AskAlexSharov April 29, 2026 03:25 View session

Copilot AI reviewed Apr 29, 2026

View reviewed changes

taratorio and others added 3 commits April 29, 2026 12:58

Merge branch 'main' of github.com:erigontech/erigon into kurtosis_skill

0888077

yperbasis approved these changes Apr 29, 2026

View reviewed changes

taratorio enabled auto-merge April 29, 2026 14:48

taratorio added this pull request to the merge queue Apr 29, 2026

Merged via the queue into main with commit cc38945 Apr 29, 2026
37 checks passed

taratorio deleted the kurtosis_skill branch April 29, 2026 15:58

	\| xargs -I{} date -d {} +%s 2>/dev/null \|\| echo 0)
	\| python3 -c 'import sys; from datetime import datetime, timezone; s = sys.stdin.read().strip(); print(int(datetime.fromisoformat(s.replace("Z", "+00:00")).astimezone(timezone.utc).timestamp()) if s else 0)' \
	2>/dev/null \|\| echo 0)

-  height=$(printf '%d\n' "$height_hex")
+  case "$height_hex" in
+x[0-9a-fA-F]*)
+      height=$((16#${height_hex#0x}))
+      ;;
+    *)
+      echo "FAIL: invalid eth_blockNumber result from EL RPC on port ${EL_RPC_PORT}: ${height_hex:-<empty>}"
+      break
+      ;;
+  esac

	(`{cancun,prague,osaka,fulu,gloas,amsterdam}_fork_epoch`) to know which fork is
	(`{deneb,cancun,electra,prague,osaka,fulu,gloas,amsterdam}_fork_epoch`) to know which fork is

Conversation

taratorio commented Apr 28, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

yperbasis left a comment

Choose a reason for hiding this comment

Uh oh!

taratorio commented Apr 28, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

yperbasis left a comment

Choose a reason for hiding this comment

Uh oh!

taratorio commented Apr 29, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants