
tools/twamp/signed: agent-side challenge-response inbound probing #3737

Merged: ben-dz merged 7 commits into main from bdz/challenge-response-inbound-agent, May 21, 2026
@ben-dz ben-dz commented May 20, 2026

Summary of Changes

  • Adds an opt-in nonce handshake to signed-TWAMP inbound probing. The reflector now issues a fresh 8-byte random nonce on every Probe 0 and places it in Reply0.SinceLastRxNs (previously always 0). On Probe 1, if the sender echoed that nonce in Sec || Frac and re-signed the packet, the reflector sets bit 7 of Reply1.NumOffsets (Challenged flag) — cryptographic proof the sender actually received Reply 0 before sending Probe 1, defeating the pre-emit-Probe-1 attack on SinceLastRxNs.
  • Extends ReplyPacket with a Challenged field encoded as the top bit of byte 212 (lower 7 bits remain the offset count 0–5). The flag is covered by the Ed25519 signature, so a MITM cannot flip it.
  • Backwards compatible in both directions: legacy senders never echo the nonce, so bit 7 stays 0 and the existing NumOffsets ≤ 5 unmarshal check still passes; legacy senders also ignore Reply0.SinceLastRxNs, so populating it with a nonce is invisible to them.
  • Updates RFC16 to document the new wire-format semantics (a new "Challenge-Response Inbound Probing" subsection under Detailed Design, plus byte-table entries for Sec/Frac/SinceLastRxNs/NumOffsets). Folds in the wallet-bound-probing use case that was previously in Future Work.
  • Agent-side only. The follow-up PR adds a --challenged flag to geoprobe-target-sender to actually exercise the new flow; until then the reflector still answers ordinary unchallenged probes exactly as before.
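
The byte-212 packing described in the bullets above can be sketched as follows. This is a minimal illustration, not the code in packet.go: the constant names are the ones this PR mentions, their values are inferred from the `0x80` / `0x7F` masks asserted in the tests, and `maxOffsets`, `encodeNumOffsets`, and `decodeNumOffsets` are hypothetical names introduced here.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	numOffsetsChallengedBit = 0x80 // bit 7: Challenged flag (value assumed from the test masks)
	numOffsetsCountMask     = 0x7F // bits 0-6: offset count (value assumed from the test masks)
	maxOffsets              = 5    // hypothetical name for the documented 0-5 count limit
)

var errCount = errors.New("offset count out of range")

// encodeNumOffsets packs the offset count and the Challenged flag into one byte.
func encodeNumOffsets(count uint8, challenged bool) (byte, error) {
	if count > maxOffsets {
		return 0, errCount
	}
	b := count
	if challenged {
		b |= numOffsetsChallengedBit
	}
	return b, nil
}

// decodeNumOffsets splits the byte back into the count and the flag.
func decodeNumOffsets(b byte) (count uint8, challenged bool, err error) {
	count = b & numOffsetsCountMask
	if count > maxOffsets {
		return 0, false, errCount
	}
	return count, b&numOffsetsChallengedBit != 0, nil
}

func main() {
	b, _ := encodeNumOffsets(3, true)
	fmt.Printf("0x%02x\n", b) // prints 0x83

	count, challenged, _ := decodeNumOffsets(b)
	fmt.Println(count, challenged) // prints 3 true
}
```

An unchallenged reply never sets bit 7, so its raw byte stays in the 0–5 range and a decoder that only checks `NumOffsets ≤ 5` on the whole byte still accepts it.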

Diff Breakdown

| Category   | Files | Lines (+/-) | Net  |
|------------|-------|-------------|------|
| Core logic |     2 | +61 / -11   |  +50 |
| Tests      |     3 | +328 / -27  | +301 |
| Docs       |     1 | +24 / -15   |   +9 |
| Total      |     6 | +413 / -53  | +360 |

Test-heavy: 73% of added lines are reflector / packet-codec tests covering the new behavior and the cohabiting bit-packed wire-format byte.

Key files (click to expand)
  • tools/twamp/pkg/signed/reflector_linux.go — adds nonce to per-sender state, generates a fresh nonce on each Probe 0, verifies the echoed nonce on Probe 1 (with state.nonce != 0 guard so a crypto/rand failure doesn't spuriously authenticate), clears the nonce on pair completion / rate-limit reset / stale-pair reset so the documented "0 outside a pair" invariant holds unconditionally
  • tools/twamp/pkg/signed/reflector_test.go — adds tests for non-zero nonce in Reply 0, distinct nonces across pairs (gated by `testing.Short`), Reply 1 challenged-on-match, Reply 1 unchallenged-on-mismatch; updates pre-existing assertions that expected Reply 0 SinceLastRxNs to be 0
  • tools/twamp/pkg/signed/packet_test.go — adds round-trip tests for `Challenged=true`, `Challenged=false`, and the combined case (challenged + non-zero offset count) so the bit-cohabits-with-count path is exercised
  • tools/twamp/pkg/signed/packet.go — adds `Challenged bool` to `ReplyPacket`, named `numOffsetsChallengedBit`/`numOffsetsCountMask` constants, encodes/decodes the flag as the top bit of byte 212, threads `challenged bool` through `NewReplyPacket`
  • rfcs/rfc16-geolocation-verification.md — new Challenge-Response Inbound Probing subsection (flow, why-unchallenged-is-permitted, backwards-compat, wallet-bound use case); byte-table entries updated for `Sec`/`Frac`/`SinceLastRxNs`/`NumOffsets`; resolves a stale "Reply 0 SinceLastRxNs is 0" note in the flow narrative; removes the obsolete Future Work entry that this PR implements
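
The per-sender nonce lifecycle described for reflector_linux.go can be sketched roughly as below. This is an illustration under stated assumptions, not the reflector's code: `senderState`, `newNonce`, `onProbe0`, and `onProbe1` are hypothetical names, the big-endian byte order is an assumption, and signature verification of Probe 1 is assumed to have happened before `onProbe1` is called.

```go
package main

import (
	"crypto/rand"
	"encoding/binary"
	"fmt"
)

// senderState is a hypothetical stand-in for the reflector's per-sender state.
type senderState struct {
	nonce uint64 // 0 means "no pair in flight" or "nonce generation failed"
}

// newNonce draws 8 random bytes and returns 0 if crypto/rand fails,
// so that the nonce != 0 guard in onProbe1 refuses to authenticate.
func newNonce() uint64 {
	var b [8]byte
	if _, err := rand.Read(b[:]); err != nil {
		return 0
	}
	return binary.BigEndian.Uint64(b[:]) // byte order is an assumption
}

// onProbe0 issues a fresh nonce; the reflector would place it in Reply0.SinceLastRxNs.
func (s *senderState) onProbe0() uint64 {
	s.nonce = newNonce()
	return s.nonce
}

// onProbe1 reports whether the echoed Sec || Frac matches the issued nonce,
// then clears the nonce because the pair is complete.
func (s *senderState) onProbe1(sec, frac uint32) bool {
	echoed := uint64(sec)<<32 | uint64(frac)
	challenged := s.nonce != 0 && echoed == s.nonce
	s.nonce = 0
	return challenged
}

func main() {
	s := &senderState{}
	n := s.onProbe0()
	fmt.Println("echoed nonce accepted:", s.onProbe1(uint32(n>>32), uint32(n)))
	// The nonce was cleared above, so an all-zero echo cannot authenticate.
	fmt.Println("zero echo accepted:", s.onProbe1(0, 0))
}
```

The `s.nonce != 0` guard matters in two cases: a failed random read, and a sender that echoes 0 because it never saw a nonce. In both, a match on 0 must not set the Challenged flag.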

Testing Verification

  • `go test -count=1 ./tools/twamp/pkg/signed/...` green (20s); `./controlplane/telemetry/cmd/geoprobe-agent/...` green
  • `golangci-lint run` on both packages: 0 issues
  • New unit tests directly assert the wire-format byte (`buf[212]&0x80`, `buf[212]&0x7F`) so the bit-packing contract is tested at the byte level
  • End-to-end reflector tests (in-process UDP via the existing Linux harness) exercise both the challenged and unchallenged Probe 1 paths against a real reflector goroutine

Related: RFC16 — see the new "Challenge-Response Inbound Probing" subsection under Detailed Design.
Related Stacked PR: #3738

@ben-dz ben-dz marked this pull request as ready for review May 20, 2026 19:53
@ben-dz ben-dz requested a review from nikw9944 May 20, 2026 19:53
@ben-dz ben-dz force-pushed the bdz/challenge-response-inbound-agent branch from d825351 to 9fdc6c4 Compare May 21, 2026 12:51
@nikw9944

Claude:

The new tests mix t.Fatalf and require.True/require.NotZero for failures that other tests in the same file express via assert.* + require.NoError. Examples:

  • reflector_test.go:587 uses t.Fatalf for nonce equality, while reflector_test.go:637–642 uses t.Fatalf for Challenged checks — but reflector_test.go:691 uses assert.False for the same kind of check.
  • packet_test.go:578–579 uses require.NotZero / require.Zero on raw bytes; nearby packet_test.go tests use assert.Equal(byte(0), …).

Functionally identical; stylistically inconsistent. Worth a pass to align on require.* for "must hold or test is meaningless" and assert.* for "we want to keep checking after a failure."

Comment thread tools/twamp/pkg/signed/packet.go Outdated
Comment thread tools/twamp/pkg/signed/packet.go
Comment thread tools/twamp/pkg/signed/reflector_linux.go Outdated
- ProbePacket Sec/Frac doc comments mention the challenged Probe 1
  nonce-echo semantic.
- state.nonce=0 comment clarifies that lastTxTime is intentionally
  preserved (overwritten on the next Reply Tx).
- New tests standardize on require.* for setup-critical checks
  (NoError, Len for buffers indexed below) and assert.* for behavior
  assertions, replacing four t.Fatalf bail-outs and several
  require.NotZero/Zero/True/False/Equal on round-trip outcomes.
@ben-dz ben-dz merged commit 9210830 into main May 21, 2026
33 checks passed
@ben-dz ben-dz deleted the bdz/challenge-response-inbound-agent branch May 21, 2026 17:13
ben-dz added a commit that referenced this pull request May 21, 2026
…#3738)

> Stacked on #3737 (RFC16 + reflector). Merge the base PR first.

## Summary of Changes
- Adds an opt-in `--challenged` flag to `geoprobe-target-sender`
(default `false`). When set, the sender extracts the nonce from
`Reply0.SinceLastRxNs`, writes it into `Probe1.Sec || Frac`, signs Probe
1, and only then transmits — proving to the reflector (PR #3737) that it
actually received Reply 0 before issuing Probe 1.
- Default behavior (`--challenged=false`) is **byte-identical on the
wire** to the pre-PR2 sender. The existing `ProbePair` body is moved
verbatim into a private `probePairUnchallenged`; a new
`probePairChallenged` adds the deferred-sign path; the exported
`ProbePair` dispatches on a constructor flag.
- Threads `Challenged` through the existing telemetry: startup
`slog.Info`, per-pair `slog.Debug`, JSON output (with a `"challenged":
bool` field that is intentionally **not** `omitempty` so downstream
consumers always see the mode), and the human-readable text output.
- Trade-off baked into the new path: signing Probe 1 after Reply 0
receipt inflates `Reply1.SinceLastRxNs` by the sender's compute time, in
exchange for cryptographic proof of Reply 0 receipt. This is the
documented RFC16 trade-off — TEE senders should stay unchallenged;
bare-metal senders use `--challenged`.
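
The no-`omitempty` choice for the JSON field can be illustrated with a small sketch. The struct below is a trimmed, hypothetical version of `probeOutput` (the `Seq` field and the `probeOutputOmit` variant are invented for the comparison); only the `"challenged"` key and its lack of `omitempty` come from this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// probeOutput is a reduced, hypothetical version of the sender's JSON struct.
type probeOutput struct {
	Seq        uint32 `json:"seq"`        // hypothetical field, for illustration only
	Challenged bool   `json:"challenged"` // no omitempty: false is still serialized
}

// probeOutputOmit shows the regression the tests guard against.
type probeOutputOmit struct {
	Seq        uint32 `json:"seq"`
	Challenged bool   `json:"challenged,omitempty"` // false would be dropped
}

// mustJSON marshals v and panics on error, which cannot occur for these types.
func mustJSON(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(mustJSON(probeOutput{Seq: 1}))     // prints {"seq":1,"challenged":false}
	fmt.Println(mustJSON(probeOutputOmit{Seq: 1})) // prints {"seq":1}
}
```

With `omitempty`, a consumer could not tell "unchallenged" apart from "field not reported", which is why the false case gets its own test.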

## Stacking & rollout
- During the rollout window (new sender against an old reflector that
doesn't issue nonces), `Reply0.SinceLastRxNs == 0`, the sender echoes 0
in `Sec/Frac`, and the agent's `state.nonce != 0` guard keeps
`Reply1.Challenged == false`. Sender doesn't crash and the operator sees
`challenged=false` in the per-pair log, indistinguishable from "agent
hasn't been upgraded yet" — which is exactly the desired signal.

## Diff Breakdown (vs PR1 base branch)
| Category     | Files | Lines (+/-) | Net  |
|--------------|-------|-------------|------|
| Core logic   |     5 | +97 / -8    |  +89 |
| Tests        |     2 | +122 / -6   | +116 |
| **Total**    |     7 | +218 / -13  | +205 |

Test-heavy. The core-logic diff is dominated by the new
`probePairChallenged` method (deferred-sign flow) and the
`newChallengedProbePacket` helper; everything else is plumbing
(constructor signature, dispatcher, CLI flag, telemetry surfacing).

<details>
<summary>Key files (click to expand)</summary>

- `tools/twamp/pkg/signed/sender_linux.go` — renames existing
`ProbePair` body to `probePairUnchallenged` (verbatim, no logic change),
adds `probePairChallenged` using `sendAndRecv` for both probes, adds a
4-line `ProbePair` dispatcher that branches on the new `challenged`
field of `LinuxSender`
- `tools/twamp/pkg/signed/sender_test.go` — two new subtests inside
`TestSender_Linux`: `ProbePair challenged mode` (asserts
`Reply1.Challenged == true` and non-zero timing fields against a real
reflector goroutine) and `ProbePair unchallenged mode is still
unchallenged` (asserts the byte-identical legacy path against the same
reflector); 5 existing call sites updated with `false`
- `tools/twamp/pkg/signed/packet.go` — new unexported
`newChallengedProbePacket(seq, signer, nonce)` helper that takes the
nonce as a parameter (does NOT generate one — that's the reflector's job
in PR #3737), sets `Sec = uint32(nonce >> 32)`, `Frac = uint32(nonce &
0xFFFFFFFF)`, and signs the 44-byte payload
- `controlplane/telemetry/cmd/geoprobe-target-sender/main_test.go` —
`TestProbeOutput_ChallengedFieldRoundTrip` (true case: round-trip plus
raw `"challenged":true` byte assertion) and
`TestProbeOutput_ChallengedFalseStillSerialized` (false case: asserts
`"challenged":false` appears so an accidental `omitempty` regression
fails fast)
- `controlplane/telemetry/cmd/geoprobe-target-sender/main.go` —
`--challenged` flag declaration, threaded through `signed.NewSender`;
new `Challenged bool` field on the `probeOutput` JSON struct (no
`omitempty`); challenged status added to startup log, per-pair
`slog.Debug`, JSON output, and text output ("Challenged Inbound:" line)
- `tools/twamp/pkg/signed/sender.go` — `NewSender` constructor gains
trailing `challenged bool` parameter
- `tools/twamp/pkg/signed/stub_fallback.go` — non-Linux `NewLinuxSender`
signature updated to match

</details>
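
The deferred-sign step can be sketched as below. This is a simplified illustration, not `newChallengedProbePacket` itself: the 12-byte payload layout, the big-endian encoding, and the names `challengedProbe`, `splitNonce`, `newChallengedProbe`, and `roundTrip` are all assumptions made for the example (the real packet signs a 44-byte payload whose layout is not shown in this PR description). Only the `Sec`/`Frac` split expressions are taken from the text above.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"encoding/binary"
	"fmt"
)

// challengedProbe is a hypothetical, reduced probe packet.
type challengedProbe struct {
	Seq, Sec, Frac uint32
	Sig            []byte
}

// splitNonce maps the 64-bit nonce onto Sec (high half) and Frac (low half).
func splitNonce(nonce uint64) (sec, frac uint32) {
	return uint32(nonce >> 32), uint32(nonce & 0xFFFFFFFF)
}

// payload serializes the signed fields; this layout is an assumption.
func (p *challengedProbe) payload() []byte {
	buf := make([]byte, 12)
	binary.BigEndian.PutUint32(buf[0:4], p.Seq)
	binary.BigEndian.PutUint32(buf[4:8], p.Sec)
	binary.BigEndian.PutUint32(buf[8:12], p.Frac)
	return buf
}

// newChallengedProbe takes the nonce as input (it never generates one),
// writes it into Sec || Frac, and only then signs.
func newChallengedProbe(seq uint32, priv ed25519.PrivateKey, nonce uint64) *challengedProbe {
	sec, frac := splitNonce(nonce)
	p := &challengedProbe{Seq: seq, Sec: sec, Frac: frac}
	p.Sig = ed25519.Sign(priv, p.payload())
	return p
}

// roundTrip signs a probe, verifies it, then flips one nonce bit and verifies again.
func roundTrip(nonce uint64) (valid, tamperedValid bool, err error) {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return false, false, err
	}
	p := newChallengedProbe(1, priv, nonce)
	valid = ed25519.Verify(pub, p.payload(), p.Sig)
	p.Frac ^= 1 // tamper with the echoed nonce after signing
	tamperedValid = ed25519.Verify(pub, p.payload(), p.Sig)
	return valid, tamperedValid, nil
}

func main() {
	// In the real flow the nonce would be read from Reply0.SinceLastRxNs.
	valid, tampered, err := roundTrip(0x0123456789ABCDEF)
	if err != nil {
		panic(err)
	}
	fmt.Println(valid, tampered) // prints true false
}
```

Because the echoed nonce sits inside the signed payload, Probe 1 cannot be signed until Reply 0 has arrived, which is where the extra sender compute time in `Reply1.SinceLastRxNs` comes from.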

## Testing Verification
- `go test -count=1 ./tools/twamp/pkg/signed/...` green (20s);
`./controlplane/telemetry/cmd/geoprobe-target-sender/...` green
- `golangci-lint run` on both packages: 0 issues
- Sender-side challenged-mode test exercises the full Probe 0 → Reply 0
→ extract-nonce → sign-Probe-1 → Reply 1 round-trip against a real
in-process Linux reflector goroutine, asserting `Reply1.Challenged ==
true`. The unchallenged-mode test exercises the same harness with
`challenged=false` and asserts `Reply1.Challenged == false`, confirming
default behavior is preserved.
- JSON serialization is tested for both `Challenged=true` and
`Challenged=false` so the no-`omitempty` choice is regression-protected.

Related: [RFC16 "Challenge-Response Inbound
Probing"](rfcs/rfc16-geolocation-verification.md) (added in PR #3737) —
see the "Why unchallenged inbound is still permitted" paragraph for the
latency trade-off rationale.