feat(simulator): v0.5.0 — BTC scenarios + firmware matrix + conventional-commits auto-tag#5

Merged
TaprootFreak merged 6 commits into DFXswiss:develop from joshuakrueger-dfx:joshua/v0.5.0-bitcoin-and-matrix on May 18, 2026

Conversation

@joshuakrueger-dfx
Contributor

@joshuakrueger-dfx joshuakrueger-dfx commented May 17, 2026

Summary

Grows the simulator baseline from 9 to 14 scenarios, adds full Bitcoin coverage, pins the simulator's deterministic identity, exercises the actual EIP-155 multi-byte v path, introduces a firmware-matrix runner, and replaces the patch-only auto-tag bumper with a tested Conventional-Commits engine.

New scenarios

  1. root_fingerprint_deterministic — pins simulator BIP-32 fingerprint to 0x4c00739d. Acts as the canonical "seed drifted" signal so downstream pinned-output failures aren't N derived symptoms of one root cause.
  2. eth_sign_legacy_polygon_multibyte_v — actually drives ETHSign(chainId=137) to exercise EIP-155 multi-byte v (quirk CC-5). The pre-existing eth_address_polygon_multibyte_v only queried an address, which is chain-id-independent and never tested the v-byte path.
  3. btc_xpub_zpub_mainnet — BIP-84 native-segwit ZPUB shape (zpub prefix + base58 length envelope).
  4. btc_address_p2wpkh_mainnet — bech32 bc1q… at m/84'/0'/0'/0/0.
  5. btc_address_p2tr_taproot — bech32m bc1p… at m/86'/0'/0'/0/0 (distinct firmware codepath: BIP-341 key tweak).
  6. btc_sign_message_mainnet — 64-byte R||S, recId 0..3, 65-byte Electrum envelope with header 31..34.
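
The multi-byte v path in scenario 2 follows from the EIP-155 formula for legacy transactions, v = recId + chainId*2 + 35. A minimal sketch in Go, assuming nothing about the testkit's own code (the function name `eip155V` is hypothetical), shows why chainId=137 cannot fit v into a single byte while chainId=1 can:

```go
package main

import (
	"fmt"
	"math/big"
)

// eip155V returns the legacy-transaction v value defined by EIP-155:
// v = recId + chainId*2 + 35, where recId is 0 or 1.
// The name is illustrative and not part of the testkit API.
func eip155V(chainID uint64, recID uint8) *big.Int {
	v := new(big.Int).SetUint64(chainID)
	v.Mul(v, big.NewInt(2))
	v.Add(v, big.NewInt(35+int64(recID)))
	return v
}

func main() {
	for _, chainID := range []uint64{1, 137} {
		for _, recID := range []uint8{0, 1} {
			v := eip155V(chainID, recID)
			// Bytes() is the minimal big-endian encoding; its length
			// shows whether v still fits into one byte.
			fmt.Printf("chainId=%d recId=%d v=%s bytes=%x (%d byte(s))\n",
				chainID, recID, v, v.Bytes(), len(v.Bytes()))
		}
	}
}
```

For Polygon (chainId=137) v is 309 or 310, which needs two bytes; an address query never computes v at all, which is why only a signing scenario reaches this path.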

New testkit API

  • simulator.LaunchVersion(cacheDir, name) + ErrSimulatorNotFound to pin a specific embedded build.
  • simulator.Connect(inst, ConnectOptions{HandshakeTimeout, Logger}) — shared bring-up helper used by both the CLI and the integration test, replacing the per-callsite Noise+channel-hash dance.
  • bitbox-simulator-check --firmware <name|all> flag and MatrixReport wire format. Single-firmware runs remain shape-compatible (MatrixReport.Reports[0]).

New CI

  • go-simulator-matrix job drives the 14-scenario baseline against all 8 embedded firmwares (v9.19.0 → v9.26.1) in parallel on every push. Catches regressions that only surface on older firmwares still in the production tail — the BitBox02 only auto-updates when the user opens the BitBoxApp.
  • TestSimulatorBaselineScenarios integration test runs the full baseline (was: Launch-only smoke test).

Conventional-Commits auto-tag

  • New go/cmd/release-version tool (31 table-driven tests) parses every commit subject + body between the last tag and HEAD as Conventional Commits and picks the highest bump: feat! / BREAKING CHANGE: → MAJOR, feat: → MINOR, everything else → PATCH.
  • auto-tag.yaml replaces the hardcoded PATCH+1 with the tool. Per-commit breakdown is surfaced as a CI log group so reviewers see exactly which commit drove the bump.
  • Practical effect: when this PR merges through develop → main, the auto-tagger sees the feat(simulator): and feat(release): commits and emits v0.5.0 automatically, with no manual tag intervention needed.
  • CONTRIBUTING.md "Releases" rewritten with the new policy: commit-prefix → bump table, local preview command, manual-release escape hatch for hotfixes.

Action surface

  • bitbox-simulator composite gains firmware: input.
  • Slash-trigger template parses firmware=X and ref=Y modifiers + fail shorthand.
  • Composite-action defaults bumped to v0.5.0; workflow-templates bumped to match.

Hygiene

  • TS FakePairedBitBox: proxy ignores symbol-keyed lookups and returns undefined for then/catch/finally, so awaiting the proxy doesn't infect the chain as thenable. New clearCalls() resets the recorded call log mid-flight.

Docs

  • CHANGELOG entries for v0.5.0 plus backfill for v0.3.2 → v0.4.6 (previously missing).
  • ONBOARDING gains §6 covering the 14 baseline scenarios, matrix mode, slash trigger, and what the simulator validates vs. doesn't (transport still needs a real device).

Test plan

  • CI: 14 jobs green (6 baseline + 8 firmware-matrix)
  • Self-audit on testkit source: 0 findings
  • go test ./cmd/release-version/... — 31/31 green
  • Local preview: go run ./cmd/release-version --base v0.4.6 --report correctly picks v0.5.0 from the actual commit range
  • Verified live against the personal-account testkit's CI — all 14 scenarios pass against every embedded firmware version

…narios

- BtcXpubZpubMainnet: BIP-84 native segwit zpub shape
- BtcAddressP2WPKHMainnet: bech32 bc1q P2WPKH derivation
- BtcAddressP2TRTaproot: bech32m bc1p P2TR derivation
- BtcSignMessageMainnet: 64-byte sig + 65-byte electrum envelope
- RootFingerprintDeterministic: pins 4c00739d (upstream fixture seed)
- EthSignLegacyPolygonMultiByteV: actually exercises CC-5 v-byte path
  (the existing chainId=137 address probe never did — addresses don't
  depend on chainId)
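
The shape checked by BtcSignMessageMainnet (64-byte signature, recId 0..3, 65-byte envelope with header 31..34) can be sketched as below. This is an illustration under assumptions, not the testkit's implementation: the function name is hypothetical, and the header is taken to be 31 + recId, which matches the stated range and the classic Bitcoin signed-message convention of 27 plus 4 for a compressed key.

```go
package main

import (
	"errors"
	"fmt"
)

// electrumEnvelope builds header || R || S (65 bytes) from a 64-byte
// compact signature and a recovery id. Assumption: header = 31 + recId,
// i.e. the 31..34 range named in the scenario description.
func electrumEnvelope(sig []byte, recID byte) ([]byte, error) {
	if len(sig) != 64 {
		return nil, errors.New("signature must be 64 bytes (R||S)")
	}
	if recID > 3 {
		return nil, errors.New("recId must be in 0..3")
	}
	out := make([]byte, 0, 65)
	out = append(out, 31+recID)
	out = append(out, sig...)
	return out, nil
}

func main() {
	sig := make([]byte, 64) // placeholder bytes, not a real signature
	env, err := electrumEnvelope(sig, 2)
	if err != nil {
		panic(err)
	}
	fmt.Printf("len=%d header=%d\n", len(env), env[0])
}
```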

Plus simulator.Connect helper (extracted from cmd/bitbox-simulator-check)
so the integration test, CLI, and any future consumer share the exact
Noise XX + channel-hash-verify bring-up. Integration test now runs the
full BaselineScenarios set on every push, surfacing any firmware drift
or scenario regression at testkit CI time instead of consumer time.

Fake TS proxy: add clearCalls, ignore symbol-keyed lookups, return
undefined for then/catch/finally so awaiting the proxy does not infect
chains as thenable. quirks.test.ts now reads quirks.json directly to
stay self-consistent across releases instead of needing a hardcoded
count bump every time.
The umlaut-rejection scenario's payload had three literal non-ASCII bytes
(ü, ß, ü) in a raw-string Go const, which the audit's quirk-E1 regex
flagged as a critical finding when self-auditing the testkit. Encoding
them as JSON \u00fc / \u00df escapes keeps the source pure ASCII while
the JSON parser inside the BitBox SDK still resolves them to the exact
same UTF-8 bytes a literal "ü" would produce, so the scenario still
exercises the firmware reject path.
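
The equivalence this commit relies on can be checked with the standard library alone. The sketch below uses a hypothetical payload and field name (they are not the scenario's real fixture): a Go raw string leaves `\u00fc` as six ASCII characters, and `encoding/json` decodes it to the same UTF-8 bytes as a literal "ü".

```go
package main

import (
	"encoding/json"
	"fmt"
)

// asciiPayload is a raw-string constant, so Go does not interpret the
// \u escapes; the source bytes stay pure ASCII. Payload and field name
// are illustrative only.
const asciiPayload = `{"name":"M\u00fcller Stra\u00dfe"}`

// decodeName lets the JSON parser resolve the escapes to UTF-8.
func decodeName(payload string) (string, error) {
	var v struct {
		Name string `json:"name"`
	}
	if err := json.Unmarshal([]byte(payload), &v); err != nil {
		return "", err
	}
	return v.Name, nil
}

// isASCII reports whether every byte of s is below 0x80.
func isASCII(s string) bool {
	for i := 0; i < len(s); i++ {
		if s[i] >= 0x80 {
			return false
		}
	}
	return true
}

func main() {
	name, err := decodeName(asciiPayload)
	if err != nil {
		panic(err)
	}
	fmt.Printf("source ASCII-only: %v, decoded: %q, decoded bytes: % x\n",
		isASCII(asciiPayload), name, name)
}
```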

Removes 3 false-positive critical findings from the testkit's own
action-selftest job, and from any consumer who ever decides to point
their bitbox-audit at the testkit source tree.
…RDING

- bitbox-simulator-check gains --firmware <name|all>; LaunchVersion +
  ErrSimulatorNotFound let any caller pin a specific embedded build.
- New CI job go-simulator-matrix drives the 14-scenario baseline against
  all 8 embedded firmwares (v9.19.0 → v9.26.1) in parallel on every push.
  Catches regressions that only surface on older firmwares still in the
  production tail — BitBox02 only auto-updates when the user opens the
  BitBoxApp.
- bitbox-simulator composite action exposes firmware: input; slash
  template parses firmware=X and ref=Y modifiers + 'fail' shorthand.
- Composite action defaults: bitbox-audit testkit-ref v0.2.0 → v0.5.0,
  bitbox-simulator v0.4.2 → v0.5.0. Workflow-templates bumped to match.
- CHANGELOG backfilled for every version between v0.3.1 and v0.4.4
  (previously only the v0.1.0/v0.2.0/v0.3.0/v0.3.1 entries existed), plus
  the new v0.5.0 entry.
- ONBOARDING gains a §6 simulator section covering the 14 baseline
  scenarios, matrix mode, slash trigger, and what the simulator
  validates vs. doesn't (transport still needs a real device).
- Drop the JSON \u-escape workaround in scenarios.go; the audit-skip-file
  marker TaprootFreak added in PR DFXswiss#2 is the right per-file opt-out for
  intentional non-ASCII test fixtures, and matches the pattern already
  used in core/guards/*.go.
- Backfill CHANGELOG entries for v0.4.5 (Go module rename) and v0.4.6
  (auto-tag + auto-release-pr + audit-skip-file). v0.5.0 entry now points
  at the DFXswiss release URL and references test.yaml (not test.yml).
- ONBOARDING simulator example and ts/src/index.ts JSDoc now reference
  DFXswiss/bitbox-testkit consistently (ts/package.json was already at
  @DFXswiss).
Replaces the hardcoded PATCH+1 logic in .github/workflows/auto-tag.yaml
with a small testable Go tool at go/cmd/release-version. The tool reads
every commit subject + body between the last release tag and HEAD,
parses them as Conventional Commits 1.0, and picks the highest bump:

  feat! / <type>! / BREAKING CHANGE: footer  -> MAJOR
  feat:                                       -> MINOR
  fix:, perf:, refactor:, revert:             -> PATCH
  chore:, ci:, docs:, test:, style:, build:   -> PATCH
  non-conventional subjects                   -> PATCH + warning

A single feat! anywhere in the range promotes the whole release to a
major bump; a single feat: promotes to minor. The aggregator is
pure: 31 table-driven tests in main_test.go lock every classification
arm + the SemVer math + the report shape consumers parse.
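
The classification table and the "highest bump wins" rule can be sketched as follows. This is a simplified illustration, not the code in go/cmd/release-version: all type and function names are hypothetical, the subject regex is an assumption, and the warning for non-conventional subjects is omitted.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Bump is ordered so that a larger value always wins during aggregation.
type Bump int

const (
	Patch Bump = iota
	Minor
	Major
)

// subjectRE is a simplified Conventional Commits subject pattern:
// type, optional (scope), optional "!", then ": description".
var subjectRE = regexp.MustCompile(`^(\w+)(\([^)]*\))?(!)?: .+`)

// classify maps one commit (subject + body) to the bump it votes for.
func classify(subject, body string) Bump {
	for _, line := range strings.Split(body, "\n") {
		if strings.HasPrefix(line, "BREAKING CHANGE:") {
			return Major
		}
	}
	m := subjectRE.FindStringSubmatch(subject)
	switch {
	case m == nil:
		return Patch // non-conventional subject
	case m[3] == "!":
		return Major
	case m[1] == "feat":
		return Minor
	default:
		return Patch
	}
}

// Commit holds the two parts of a commit message the classifier reads.
type Commit struct{ Subject, Body string }

// highest returns the largest bump any commit in the range votes for.
func highest(commits []Commit) Bump {
	b := Patch
	for _, c := range commits {
		if v := classify(c.Subject, c.Body); v > b {
			b = v
		}
	}
	return b
}

// next applies the bump to a version and formats the new tag.
func next(major, minor, patch int, b Bump) string {
	switch b {
	case Major:
		return fmt.Sprintf("v%d.0.0", major+1)
	case Minor:
		return fmt.Sprintf("v%d.%d.0", major, minor+1)
	default:
		return fmt.Sprintf("v%d.%d.%d", major, minor, patch+1)
	}
}

func main() {
	commits := []Commit{ // illustrative subjects, not the PR's real log
		{Subject: "feat(simulator): add BTC scenarios"},
		{Subject: "fix: handle empty range"},
		{Subject: "update readme"},
	}
	fmt.Println(next(0, 4, 6, highest(commits)))
}
```

With v0.4.6 as the base, one feat: commit in the range is enough to yield v0.5.0, while a range of only fix: or chore: commits would yield v0.4.7.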

The auto-tag workflow now surfaces the per-commit breakdown as a CI
group so reviewers can see exactly which commit voted which way, and
short-circuits cleanly (exit code 4) when the range is empty.

CONTRIBUTING.md "Releases" rewritten with the new policy: a
commit-message -> bump table, the local preview command, and the
manual-release escape hatch for hotfixes.

Practical effect for v0.5.0: the feat(simulator): commit in this PR
will cause the auto-tagger to emit v0.5.0 (not v0.4.7) when the
develop -> main release PR merges, with no manual tag intervention.
@joshuakrueger-dfx joshuakrueger-dfx changed the title from "v0.5.0: BTC scenarios + firmware matrix + RootFingerprint pin" to the current title on May 17, 2026
@joshuakrueger-dfx joshuakrueger-dfx changed the base branch from develop to main May 17, 2026 14:55
@TaprootFreak TaprootFreak changed the base branch from main to develop May 18, 2026 15:20
@TaprootFreak
Contributor

Maintainer review

Joshua is unreachable, so I carried out the review and the maintainer edit myself.

Maintainer edit applied

PR base switched from main → develop. The original base violated our convention (develop has been the default branch since PR #3, enforced by ruleset). Joshua's own commit 90abf93 (align v0.5.0 with develop conventions) shows that he knew the current state; presumably the --base main default kicked in when the PR was created. CI stays green (mergeStateStatus: CLEAN).

Local validation (before base switch)

| Check | Result |
| --- | --- |
| ./scripts/sync-quirks.sh --check | ok |
| go vet ./... | clean |
| go test -race -timeout 60s ./... | 10/10 packages pass (incl. 31/31 release-version tests) |
| npm test (ts) | 40/40 pass |
| python3 yaml.safe_load × 9 workflows | 9/9 ok |
| Audit self-test against testkit source | 0 findings |
| GitHub CI (post base switch) | 14/14 jobs pass (6 base + 8 firmware matrix) |

Code review result

Strong, belongs in the stack:

  • release-version tool: Conventional Commits parser with 31 table-driven tests, a clear exit-code scheme (0/2/3/4), range-aware with an empty-range short-circuit. A strict improvement over our PATCH+1 workflow from PR #3 (chore(ci): align with bitbox_flutter - develop default, auto-tag, auto-release-pr). Its first use, when this PR merges, will correctly derive v0.5.0 from the feat: commits.
  • go-simulator-matrix job: 8 firmware versions (v9.19.0 → v9.26.1) in parallel on every push. Catches regressions that only appear on older firmware in the production tail, which matters because the BitBox02 only auto-updates when the BitBoxApp is started.
  • 90abf93 align v0.5.0 with develop conventions: proactive adoption of our conventions:
    • JSON \u-escape workaround in scenarios.go replaced by our audit-skip-file marker (matches the core/guards/*.go pattern from PR #2, fix(ci): unbreak self-test on quirk-def file + auto-track registry count)
    • CHANGELOG backfill for v0.4.5 (Go module rename) + v0.4.6 (auto-tag + auto-release-pr + audit-skip-file), giving a clean history
    • test.yaml (not .yml) and DFXswiss/bitbox-testkit paths applied consistently
  • Dual tag is preserved (vX.Y.Z + go/vX.Y.Z) in the new auto-tag.yaml, respecting the Go submodule resolver convention from CONTRIBUTING.
  • Idempotency: the tag-existence check refuses to overwrite (with a clear error message). Empty-range exit code 4 → no empty re-tag.
  • TS proxy hygiene: the FakePairedBitBox proxy ignores symbol-keyed lookups and returns undefined for then/catch/finally → await proxy no longer infects the chain as thenable. A subtle but important fix.

Minor observations (not blockers):

  • test.yaml now triggers 14 jobs on every PR. The matrix jobs run in parallel at ~15s each, which is acceptable, but should be reviewed at some point as the firmware list grows.
  • After merge: consumer repos pinned to testkit-ref: v0.4.x (realunit-app, dfx-wallet; see memory) should be bumped to v0.5.0 as a follow-up, since the action defaults now recommend v0.5.0.

Recommendation

Merge-ready. The PR is at the level we want for the BitBox stack: from v0.5.0 on, the Conventional Commits auto-tag will drive the release mechanics correctly, and the firmware matrix gives us regression coverage across the real production distribution.

CHANGELOG had 13 release links pointing at
github.com/joshuakrueger-dfx/bitbox-testkit/releases/tag/vX.Y.Z, but
that account no longer hosts releases — every linked page 404s. The
v0.4.5 entry also pointed at DFXswiss for a release that doesn't
exist yet. All historical release links now point at
DFXswiss/bitbox-testkit consistently; the actual GitHub-Release
backfill for v0.3.2 → v0.4.5 is a separate maintenance task and
doesn't gate the v0.5.0 cut.

auto-tag.yaml now uses `git push --atomic` for the vX.Y.Z + go/vX.Y.Z
pair. Without it, a partial push (server-side ref protection trip,
network blip on the second ref) could leave the repo with one tag
present and the other missing — and the next auto-tag run would fail
the "tag exists" check while consumers' `go install ...@vX.Y.Z` would
still 404 on the missing submodule tag. The --atomic flag tells the
server to apply both updates as a single transaction or neither.
@TaprootFreak
Contributor

Deeper critical review: two additional inconsistencies fixed

A more critical pass surfaced two real consistency problems, which I fixed right away (commit 791f3db).

1. CHANGELOG: 13 broken release links → consolidated onto the DFXswiss mirror

The CHANGELOG contained:

  • 12× links to github.com/joshuakrueger-dfx/bitbox-testkit/releases/tag/vX.Y.Z (v0.3.2..v0.4.4), all 404 because this account no longer hosts releases
  • DFXswiss/bitbox-testkit/releases/tag/v0.4.5, also 404 because no release exists there (the DFXswiss mirror has all tags, but so far only the v0.4.6 release)

Sed replace to DFXswiss/bitbox-testkit/releases/tag/ for all entries. One Joshua reference remains (line 41 of the v0.4.5 description: "renamed from github.com/joshuakrueger-dfx/bitbox-testkit to …"; historically correct, should stay).

Follow-up: tags v0.3.2..v0.4.5 need retroactive GitHub Releases on the DFXswiss mirror so that the links actually work. Not a blocker for v0.5.0; separate maintenance task.

2. auto-tag.yaml: the double tag push was not atomic

# Before:
git push origin "$NEW_TAG" "$GO_TAG"
# After:
git push --atomic origin "$NEW_TAG" "$GO_TAG"

Without --atomic the server performs two separate ref updates. If the second one fails (ref protection, network glitch), the repo is left with vX.Y.Z but without go/vX.Y.Z:

  • The next auto-tag run fails at the git rev-parse "$NEW_TAG" "tag exists" check (exit 1) → release pipeline blocked
  • Consumers' go install github.com/DFXswiss/bitbox-testkit/go/cmd/...@vX.Y.Z fails because the submodule path tag is missing
  • A maintainer has to clean up manually

--atomic makes the server apply both refs as one transaction, or neither. Hardening for the pathological case.

Other critically reviewed spots: all ok or deliberately left as is

  • release-version tool: 8 test functions with 44 table-driven sub-cases, clear exit-code semantics (0/2/3/4), --no-merges filters merge commits correctly, the \w type pattern is more permissive than the spec but harmless, idempotency via the git rev-parse existence check ✓
  • SemVer 0.x → 1.0.0 transition: feat!: on v0.4.6 would jump straight to v1.0.0 (instead of v0.5.0, as SemVer would recommend for 0.x). A deliberate design decision in the tool: the maintainer controls whether !: is used, and it is not acute for the current codebase.
  • simulator matrix concurrency: fail-fast: false + identical cache key for all 8 matrix jobs → 1 cache, 8 readers, no race.
  • go run instead of go install in the matrix: every job iteration compiles. The build cache via actions/setup-go@v5 cache:true makes that negligible.
  • Slash-template fail keyword: sets failOnFindings = 'true', but the default is 'true' anyway → effectively a no-op. Left as Joshua wrote it (his slash-UI design, no functional regression).
  • TS proxy hygiene: then/catch/finally/Symbol protection with three dedicated tests in ts/test/fake.test.ts ✓
  • connect.go helper: extracts Noise XX + ChannelHash verify from the CLI into a shared package; clean refactor ✓
  • MatrixReport: shape-compatible for N=1 and N=many; ExitCode is the max() of the per-firmware codes ✓
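
The MatrixReport behaviour described in the last item can be sketched like this. Only the names MatrixReport, Reports and ExitCode appear in this PR; the Report type, its Firmware field, the per-report ExitCode field, and the constructor are assumptions made for illustration, and the real wire format may differ.

```go
package main

import "fmt"

// Report is a hypothetical per-firmware result; its fields are assumed.
type Report struct {
	Firmware string
	ExitCode int
}

// MatrixReport wraps one Report per firmware. A single-firmware run is the
// N=1 case, so consumers can keep reading Reports[0].
type MatrixReport struct {
	Reports  []Report
	ExitCode int
}

// newMatrixReport sets the overall exit code to the maximum of the
// per-firmware codes, so one failing firmware fails the whole run.
func newMatrixReport(reports []Report) MatrixReport {
	m := MatrixReport{Reports: reports}
	for _, r := range reports {
		if r.ExitCode > m.ExitCode {
			m.ExitCode = r.ExitCode
		}
	}
	return m
}

func main() {
	m := newMatrixReport([]Report{
		{Firmware: "v9.19.0", ExitCode: 0},
		{Firmware: "v9.26.1", ExitCode: 2}, // illustrative failure code
	})
	fmt.Printf("firmwares=%d overall exit=%d first=%s\n",
		len(m.Reports), m.ExitCode, m.Reports[0].Firmware)
}
```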

Local re-validation post-edit

| Check | Result |
| --- | --- |
| YAML lint (9 files) | 9/9 ok |
| Go race tests | all packages pass |
| TS tests | 40/40 |
| audit self-test | 0 findings |

CI on 791f3db is running.

@TaprootFreak TaprootFreak merged commit 5eeaa55 into DFXswiss:develop May 18, 2026
14 checks passed
2 participants