feat: parametrise .ad replay scripts by kacper-mikolajczak · Pull Request #433 · callstackincubator/agent-device

kacper-mikolajczak · 2026-04-23T08:26:33Z

Resolves: #432

Summary

Add ${VAR} substitution to .ad replay scripts so flows can be reused across app variants, tuned per environment, deduplicated within a file, and driven from CI. Values in .ad scripts were previously literals only: a long label=A || label=B || ... selector repeated across several steps, or wait 1000 / wait 500 scattered across a tree, could not be named once and reused.

Resolution runs at dispatch time in invokeReplayAction. SessionAction keeps raw tokens, so writeReplayScript round-trips unchanged and the recorder stays literal-only. Precedence is high to low: CLI -e > AD_* shell env > file env > built-ins.
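The precedence order above can be sketched as a layered merge. This is a minimal illustration, not the code from this PR: the `ReplayVarLayers` shape and the `buildReplayScope` name are hypothetical, and it assumes each layer has already been parsed into a plain key/value map.

```typescript
// Hypothetical shape: each layer is a plain key/value map.
type VarMap = Record<string, string>;

interface ReplayVarLayers {
  builtins: VarMap; // e.g. AD_PLATFORM, AD_SESSION
  fileEnv: VarMap; // `env KEY=VALUE` directives from the script header
  shellEnv: VarMap; // imported from the caller's shell environment
  cliEnv: VarMap; // repeated `-e KEY=VALUE` flags
}

// Later spreads win, so layers are listed from lowest to highest precedence.
function buildReplayScope(layers: ReplayVarLayers): VarMap {
  return {
    ...layers.builtins,
    ...layers.fileEnv,
    ...layers.shellEnv,
    ...layers.cliEnv,
  };
}

const scope = buildReplayScope({
  builtins: { AD_PLATFORM: "ios" },
  fileEnv: { WAIT_LONG: "1000", SETTINGS: "label=Settings || label=Preferences" },
  shellEnv: { WAIT_LONG: "750" },
  cliEnv: { WAIT_LONG: "500" },
});
```

Here `WAIT_LONG` resolves to the CLI value, while `SETTINGS` and `AD_PLATFORM` fall through from the lower layers because nothing above them overrides those keys.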

Key behaviours:

env KEY=VALUE directives in the script header; quoted values are supported for content with spaces or OR-chained selectors.
${VAR} interpolation in positionals and string flag values, ${VAR:-default} fallback, \${ escape.
CLI -e KEY=VALUE (repeatable) on replay and test.
Shell env auto-import: variables prefixed AD_* are imported with the prefix stripped.
Built-ins: AD_PLATFORM, AD_SESSION, AD_FILENAME, AD_DEVICE, AD_ARTIFACTS.
Unresolved ${X} throws with file:line.
replay -u is rejected when the script has env entries, because the rewrite would drop them. Noted as a v1 limitation.
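The interpolation rules in this list (${VAR}, ${VAR:-default}, the \${ escape, and a file:line error for unresolved names) can be sketched in a few lines. The function below is an illustrative stand-in, not the resolver added in session-replay-vars.ts: the `interpolate` name, the `SourceLocation` type, and the exact error wording are assumptions, and it assumes the fallback applies only when the variable is unset.

```typescript
type VarMap = Record<string, string>;

interface SourceLocation {
  file: string;
  line: number;
}

// Matches an escaped `\${`, or `${NAME}` / `${NAME:-default}`.
const TOKEN_RE = /\\\$\{|\$\{([A-Z_][A-Z0-9_]*)(?::-([^}]*))?\}/g;

function interpolate(input: string, scope: VarMap, loc: SourceLocation): string {
  return input.replace(TOKEN_RE, (match: string, name?: string, fallback?: string) => {
    // Escape: `\${` is emitted as a literal `${` and nothing is substituted.
    if (match === "\\${") return "${";
    const key = name as string;
    const value = scope[key];
    if (value !== undefined) return value;
    // Assumption: the fallback is used only when the variable is unset.
    if (fallback !== undefined) return fallback;
    throw new Error(`${loc.file}:${loc.line}: unresolved variable \${${key}}`);
  });
}
```

With a scope of `{ WAIT_LONG: "1000" }`, the input `wait ${WAIT_LONG}` becomes `wait 1000`, `wait ${WAIT_SHORT:-250}` becomes `wait 250`, and `wait ${MISSING}` throws an error prefixed with the script path and line number.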

Before, the same selector and timing repeated inline:

tap label=Settings || label=Preferences || label=More || label=Account
wait 1000
tap label=Settings || label=Preferences || label=More || label=Account
wait 1000
tap label=Settings || label=Preferences || label=More || label=Account
wait 500

After, named once and reused:

env SETTINGS="label=Settings || label=Preferences || label=More || label=Account"
env WAIT_LONG=1000
env WAIT_SHORT=500

tap ${SETTINGS}
wait ${WAIT_LONG}
tap ${SETTINGS}
wait ${WAIT_LONG}
tap ${SETTINGS}
wait ${WAIT_SHORT}

CI or per-variant override without editing the file:

ad replay flow.ad -e SETTINGS="label=Settings" -e WAIT_LONG=500
AD_SETTINGS="label=Settings" ad replay flow.ad

Scope intentionally excluded from v1: includes / sub-scripts, conditionals, loops, computed expressions, cross-file shared vars.

Touched-file count: 10. Scope stayed within the replay script / dispatch path plus CLI plumbing and docs.

Files:

new: src/daemon/handlers/session-replay-vars.ts, src/daemon/handlers/__tests__/session-replay-vars.test.ts
edited: src/cli/commands/generic.ts, src/client-normalizers.ts, src/client-types.ts, src/client.ts, src/daemon/handlers/session-replay-runtime.ts, src/daemon/handlers/session-replay-script.ts, src/utils/command-schema.ts, website/docs/docs/replay-e2e.md

Validation

pnpm format
pnpm check:quick
pnpm vitest run src/daemon/handlers/__tests__/session-replay-vars.test.ts
pnpm test:smoke
pnpm build
git diff --check
Manual ad replay on a sample .ad using env directives, ${VAR:-default} fallback, \${ escape, and an unresolved ${X} to verify the file:line error.
Manual ad replay -e K=V ... and AD_K=V ad replay ... to verify CLI and shell precedence over file env.
Manual ad replay -u on a script with env entries to verify the rejection.

Support ${VAR} substitution with env header directives and CLI -e overrides so flows can be reused across app variants, environments, and devices without duplicating the script. Precedence (high->low): CLI -e > AD_* shell env > file-local env > built-ins (AD_PLATFORM, AD_SESSION, AD_FILENAME, AD_DEVICE, AD_ARTIFACTS). Supports ${VAR:-default} fallback, \${ escape, and fails with file:line on unresolved vars.

thymikee · 2026-04-23T12:18:59Z

+        resolvedPath: resolved,
+      }),
+      fileEnv: metadata.env,
+      shellEnv: collectReplayShellEnv(process.env),


[P2] This reads process.env in the daemon process, so the documented AD_K=V agent-device replay flow only works if that env was present when the daemon started. It also cannot work for a remote daemon. Collect the caller environment in the CLI/client and serialize it into the replay request instead.

Resolved ✅

thymikee · 2026-04-23T12:18:59Z

+  if (typeof device === 'string' && device.length > 0) builtins.AD_DEVICE = device;
+  const artifactsDir = flags.artifactsDir;
+  if (typeof artifactsDir === 'string' && artifactsDir.length > 0) {
+    builtins.AD_ARTIFACTS = artifactsDir;


[P2] AD_ARTIFACTS is derived only from the raw artifactsDir flag. Under agent-device test with the default artifacts directory it is unset, so ${AD_ARTIFACTS} fails despite being documented as available under test. With a custom --artifacts-dir it points at the raw root, not the resolved per-suite artifacts directory that test actually uses.

Resolved ✅

Tighten the parametrisation surface introduced in the feat commit after an internal review pass:

- Reserve the AD_* namespace for built-ins. User env (file env, CLI -e, shell AD_VAR_*) can no longer define AD_* keys, which closes a built-in shadowing vector (e.g. AD_VAR_AD_SESSION).
- Change the shell-env prefix from AD_* to AD_VAR_* so unrelated CI secrets that happen to start with AD_ (AD_TOKEN, AD_SECRET_KEY) are not auto-imported into replay scripts.
- Extend the replay -u guard to also reject scripts with ${VAR} substitutions in any action, not just those with env directives, so the writer never silently drops substitutions on heal-rewrite.
- Reword DX-unfriendly regex errors ("must match /^[A-Z_]...$/" -> "must be uppercase letters, digits, and underscores, e.g. APP_ID").
- Docs rewrite with precedence table, three recipes, fallback/escape examples, and a Notes block covering replay -u limitation, remote daemon caveat, no nested fallback, and loud typo behaviour.
- Additional unit tests: namespace reservation on every path, shell prefix migration, ${VAR} round-trip preservation through writeReplayScript, green-path integration test with a fake invoke.
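The namespace reservation and the AD_VAR_* prefix change described in this commit can be illustrated together. The sketch below is an assumption-laden example rather than the shipped code: `collectShellVars` and `assertUserKey` are hypothetical names, and it assumes invalid or reserved keys are rejected with an error instead of being skipped.

```typescript
type VarMap = Record<string, string>;

const SHELL_PREFIX = "AD_VAR_";
const VAR_KEY_RE = /^[A-Z_][A-Z0-9_]*$/;

// Rejects malformed names and anything in the reserved AD_* namespace.
function assertUserKey(key: string): void {
  if (!VAR_KEY_RE.test(key)) {
    throw new Error(
      `Invalid variable name "${key}": must be uppercase letters, digits, and underscores, e.g. APP_ID`,
    );
  }
  if (key.startsWith("AD_")) {
    throw new Error(`Variable "${key}" uses the AD_* namespace, which is reserved for built-ins`);
  }
}

// Imports only AD_VAR_* entries and strips the prefix from each key.
function collectShellVars(env: Record<string, string | undefined>): VarMap {
  const out: VarMap = {};
  for (const [rawKey, value] of Object.entries(env)) {
    if (!rawKey.startsWith(SHELL_PREFIX) || value === undefined) continue;
    const key = rawKey.slice(SHELL_PREFIX.length);
    assertUserKey(key);
    out[key] = value;
  }
  return out;
}
```

Given an environment containing `AD_VAR_APP_ID`, `AD_TOKEN`, and `PATH`, only `APP_ID` is imported; `AD_TOKEN` is ignored because it lacks the `AD_VAR_` prefix, and `AD_VAR_AD_SESSION` is rejected because the stripped key falls in the reserved namespace.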

Review feedback (callstackincubator#433 (comment)): the daemon was reading AD_VAR_* from its own process.env, which meant "AD_VAR_K=V agent-device replay" only worked if V was set when the daemon started, and never worked for remote daemons. The client now filters process.env for AD_VAR_* at request time and ships the result as replayShellEnv on the DaemonRequest flags. The daemon prefers the request value when present and falls back to its own process.env for direct-daemon callers (internal tests). Adds two integration tests pinning both paths and updates the docs Notes block to reflect the new behaviour.
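A small sketch of the client/daemon split described in this commit. Only the `replayShellEnv` field name comes from the commit message; the function names, the request shape, and the choice to keep the `AD_VAR_` prefix on the wire are assumptions made for illustration.

```typescript
type EnvLike = Record<string, string | undefined>;

// Hypothetical subset of the request flags; only `replayShellEnv` is named in the PR.
interface ReplayRequestFlags {
  replayShellEnv?: Record<string, string>;
}

// Client side: capture AD_VAR_* entries from the caller's environment at request time.
function pickReplayShellEnv(env: EnvLike): Record<string, string> {
  const picked: Record<string, string> = {};
  for (const [key, value] of Object.entries(env)) {
    if (key.startsWith("AD_VAR_") && value !== undefined) picked[key] = value;
  }
  return picked;
}

// Daemon side: prefer what the client sent; fall back to the daemon's own
// environment only when the request carries nothing (direct-daemon callers).
function selectShellEnv(flags: ReplayRequestFlags, daemonEnv: EnvLike): Record<string, string> {
  return flags.replayShellEnv ?? pickReplayShellEnv(daemonEnv);
}
```

Because the fallback triggers only when `replayShellEnv` is absent, an empty object sent by the client still takes priority over whatever the daemon process happened to inherit at startup.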

Review feedback (callstackincubator#433 (comment)): AD_ARTIFACTS is documented as available under "agent-device test", but buildReplayBuiltinVars was only reading the raw flags.artifactsDir. Under the default artifacts layout the flag is unset and ${AD_ARTIFACTS} failed; when set with --artifacts-dir it pointed at the suite root, not the resolved per-attempt directory the test runner actually writes to. Plumb the attempt-level artifacts dir from session-test.ts through the runReplay callback (via runReplayTestAttempt) down into the nested replay request's flags.artifactsDir. The daemon side is unchanged - buildReplayBuiltinVars just now sees the right value. Extracts the nested-request flag merge into a testable helper (buildNestedReplayFlags) to close the coverage gap between the test harness and the replay runtime.
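The flag merge described here might look roughly like the following. The real signature of buildNestedReplayFlags is not shown in this thread, so the sketch uses a deliberately different, hypothetical name (`mergeNestedReplayFlags`) and a guessed flag shape; the directory paths in the usage note are made up.

```typescript
// Hypothetical flag shape; only `artifactsDir` is taken from the PR text.
interface ReplayFlags {
  artifactsDir?: string;
  [key: string]: unknown;
}

// Copies the parent flags and overrides artifactsDir with the attempt-level
// directory when the test runner has resolved one.
function mergeNestedReplayFlags(
  parentFlags: ReplayFlags,
  attemptArtifactsDir: string | undefined,
): ReplayFlags {
  if (attemptArtifactsDir === undefined || attemptArtifactsDir.length === 0) {
    return { ...parentFlags };
  }
  return { ...parentFlags, artifactsDir: attemptArtifactsDir };
}
```

Under this sketch, a run with no --artifacts-dir flag still hands the nested replay a defined `artifactsDir` (for example `out/suite/attempt-1`), and a run with `--artifacts-dir out` sees the attempt-level path rather than the raw root.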

- Share VAR_KEY_RE between session-replay-vars and session-replay-script instead of re-declaring the same /^[A-Z_][A-Z0-9_]*$/ in each.
- Fold resolveReplayFlags + resolveReplayRuntime into a single generic resolveStringProps<T>; the two were near-identical object-walkers.
- Un-export parseReplayEnvLine (was only used inside its own module).
- Table-drive the four "reject AD_* namespace" tests with test.each instead of four near-duplicate test blocks.
- Extract runReplayFixture helper for runReplayScriptFile integration tests; each test is now ~10 lines instead of ~30.

No behaviour change. 812 tests still green.
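For readers unfamiliar with the "generic object-walker" idea, here is one way such a helper could be written. The name resolveStringProps<T> comes from the commit message, but the signature and behaviour below are guesses: this version walks only top-level properties and takes the resolver as a callback.

```typescript
// Sketch only: applies `resolve` to every top-level string property and
// leaves non-string values (numbers, booleans, objects) untouched.
function resolveStringProps<T extends Record<string, unknown>>(
  input: T,
  resolve: (value: string) => string,
): T {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(input)) {
    out[key] = typeof value === "string" ? resolve(value) : value;
  }
  return out as T;
}
```

The same function can then serve both a flags object and a runtime object, since neither caller needs to know which keys hold strings.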

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 95def33f52

chatgpt-codex-connector · 2026-04-24T21:59:53Z

 }): Promise<DaemonResponse> {
-  const { req, sessionName, action, invoke } = params;
+  const { req, sessionName, action, scope, filePath, line, invoke } = params;
+  const resolved = resolveReplayAction(action, scope, { file: filePath, line });


Interpolate replay vars before argument classification

Variable substitution now runs in invokeReplayAction, after parseReplayScriptLine has already classified tokens, which breaks parametrized inputs that depend on token shape. For example, click ${X} ${Y} is parsed as a selector and collapsed to one positional (later becoming "100 200"), and numeric options like snapshot -d ${DEPTH} are parsed as NaN and dropped before substitution. This means replay scripts can silently execute different behavior than intended when using ${...} in positionals/flag values that influence parsing.
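The ordering problem raised in this comment can be reproduced in isolation. The parser and substitution functions below are simplified stand-ins written for this example, not the project's parseReplayScriptLine or resolveReplayAction; they assume a numeric flag is dropped when it does not parse to a finite number.

```typescript
// Hypothetical numeric-flag parser: drops values that are not finite numbers.
function parseDepthFlag(raw: string): number | undefined {
  const n = Number(raw);
  return Number.isFinite(n) ? n : undefined;
}

const scope: Record<string, string | undefined> = { DEPTH: "3" };

// Minimal ${VAR} substitution; unknown names are left as-is in this sketch.
const substitute = (s: string): string =>
  s.replace(/\$\{([A-Z_][A-Z0-9_]*)\}/g, (m: string, k: string) => scope[k] ?? m);

const raw = "${DEPTH}";

// Classify first: Number("${DEPTH}") is NaN, so the flag is dropped before
// substitution ever sees it.
const classifyThenSubstitute = parseDepthFlag(raw);

// Substitute first: the parser sees "3" and keeps the flag.
const substituteThenClassify = parseDepthFlag(substitute(raw));
```

In this sketch `classifyThenSubstitute` is `undefined` while `substituteThenClassify` is `3`, which is the silent behaviour difference the review comment describes for `snapshot -d ${DEPTH}`.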

thymikee reviewed Apr 23, 2026

kacper-mikolajczak added 3 commits April 24, 2026 11:05

kacper-mikolajczak requested a review from thymikee April 24, 2026 09:23

thymikee marked this pull request as ready for review April 24, 2026 21:52

docs: clarify replay built-in variables

790e7c6

chatgpt-codex-connector Bot reviewed Apr 24, 2026

test: cover replay env serialization

f5a163b

thymikee merged commit 2062bff into callstackincubator:main Apr 24, 2026
9 of 10 checks passed

thymikee merged 7 commits into callstackincubator:main from kacper-mikolajczak:feat/parametrise-ad-scripts
