Skip to content

perf(persona): drop per-turn format!()+lowercase — use canonical RoleId::as_str() (#195 slice 3)#1534

Merged
joelteply merged 1 commit into
canaryfrom
7f238140/cache-specialty
Jun 6, 2026
Merged

perf(persona): drop per-turn format!()+lowercase — use canonical RoleId::as_str() (#195 slice 3)#1534
joelteply merged 1 commit into
canaryfrom
7f238140/cache-specialty

Conversation

@joelteply
Copy link
Copy Markdown
Contributor

@joelteply joelteply commented Jun 6, 2026

Summary

Slice 3 of task #195 (inference latency perfection campaign). Drop the per-turn format!("{:?}", ctx.role).to_lowercase() from serve_persona_loop_inner and read the canonical RoleId::as_str() directly. No new field on PersonaContext, no cache plumbing — as_str()'s &'static str IS the cache.

What ships (one-line behavior change + one test)

// service_loop.rs, in RespondInput construction:
// was: specialty: format!("{:?}", ctx.role).to_lowercase()
specialty: ctx.role.as_str().to_string(),

Per-turn cost drops from format!() (variadic macro + Debug formatter + heap alloc) + .to_lowercase() (second alloc + Unicode lowercase walk) to a single String::from of a static &'static str.

Why this shape (not a cache)

Initial draft of slice 3 added a PersonaContext.specialty: Arc<str> cache populated by a build_persona_specialty(role) helper. Three adversarial reviewers correctly flagged:

  1. Caching the wrong source. RoleId::as_str() is the documented, intentionally-stable specialty derivation (kebab-case identifier, pinned by an explicit match in role_template.rs). Caching the Debug+lowercase chain instead couples prompt assembly to the derived-Debug format — a future variant rename or wrapper struct could silently break the specialty with no test failure.
  2. The cache test was a tautology. The helper was Arc::from(format!("{:?}", role).to_lowercase()) and the test compared its output to format!("{:?}", role).to_lowercase() — comparing the helper to its own implementation.

The revised slice picks the right source and the right contract: use role.as_str() directly. No cache because the static-str pointer IS already pre-computed at compile time. The new test is non-circular: it compares two independently-derived strings (role.as_str() from role_template.rs::match, vs the pre-slice-3 derived-Debug chain).

Test

role_as_str_preserves_pre_slice3_specialty_format_for_each_role:

  • Pins that for EVERY RoleId variant (Helper, Coder, Sentinel, Custom), role.as_str() produces byte-identical output to the pre-slice-3 format!("{:?}", role).to_lowercase(). Non-circular.
  • Compile-time exhaustiveness: a closure-typed |role: RoleId| match role { ... } with NO wildcard arm forces a new RoleId variant to be added here AND to the variants array. (Initial draft's "cosmetic exhaustiveness" — matching on a literal RoleId::Helper — was fixed.)

Out of scope

  • No display_name change. String::clone of a fixed-per-session name has the same per-turn cost cache-or-not; the smell is RespondInput's String-typed field, not the source. Lift when RespondInput's type changes (Build(deps-dev): Bump @typescript-eslint/eslint-plugin from 8.29.1 to 8.46.2 #149-adjacent).
  • No prompt-template change. Substrate keeps the existing specialty-string contract verbatim; the migration is provably behavior-preserving.

Doctrine

  • [[init-once-handle-then-lease-zero-copy-refs]] — canonical "init-once" here is the compile-time match in RoleId::as_str(). Per-turn cost is a &'static str deref + one String::from (memcpy of 6-9 static bytes). Doesn't get cheaper without changing RespondInput's type signature.
  • [[observability-is-half-the-architecture]] — slice 1's respond_latency aggregate will tell us if this + cumulative slice-2/3+ wins are moving the needle on real persona turns.
  • [[no-fallbacks-ever]] — no graceful-degradation; the behavior-preservation test breaks loudly if either side drifts.

card: 0d780926
parent task: #195
slice 1: #1532 (merged)
slice 2: #1533 (merged)

@joelteply joelteply force-pushed the 7f238140/cache-specialty branch from becdb7c to c8bd170 Compare June 6, 2026 04:52
@joelteply joelteply changed the title perf(persona): cache role-specialty at PersonaContext (#195 slice 3) perf(persona): drop per-turn format!()+lowercase — use canonical RoleId::as_str() (#195 slice 3) Jun 6, 2026
…Id::as_str() (#195 slice 3)

Slice 3 of task #195 (inference latency perfection campaign).

Pre-slice-3, `serve_persona_loop_inner` did, every turn:

```rust
specialty: format!("{:?}", ctx.role).to_lowercase(),
```

Two allocations, a variadic-macro dispatch through derived-Debug,
and a Unicode lowercase walk — all to produce a value `RoleId`
already exposes via its canonical `as_str() -> &'static str`
method (in `role_template.rs`, pinned at compile time).

## What ships

Switch the per-turn read from the Debug+lowercase chain to
`ctx.role.as_str()`:

```rust
let respond_input = RespondInput {
    persona: PersonaSlot {
        // was: format!("{:?}", ctx.role).to_lowercase()
        specialty: ctx.role.as_str().to_string(),
        /* ... */
    },
    /* ... */
};
```

Per-turn cost drops from `format!()` (variadic macro + Debug
formatter + heap alloc) + `.to_lowercase()` (second alloc +
Unicode lowercase walk) to a single `String::from` of a static
`&'static str`. The source IS the cache — no field on
`PersonaContext`, no helper, no Arc plumbing.

## Why this shape (not a cache)

Initial draft of slice 3 added `PersonaContext.specialty: Arc<str>`
populated by a `build_persona_specialty(role)` helper that pre-
baked the Debug+lowercase output. Adversarial review (3 agents)
correctly identified two problems:

1. **Caching the wrong source.** `RoleId::as_str()` already exists
   as the documented, intentionally-stable specialty derivation
   (kebab-case identifier, pinned by an explicit `match` so
   renames are deliberate). Caching the Debug+lowercase output
   couples the prompt-assembly contract to the derived-Debug
   format — a future contributor adding a custom Debug or
   wrapping the enum silently breaks the specialty string with
   no test failure.
2. **Caching test was a tautology.** The helper was
   `Arc::from(format!("{:?}", role).to_lowercase())` and the test
   compared its output to `format!("{:?}", role).to_lowercase()`
   — comparing the helper to its own implementation. A buggy
   helper using `.to_uppercase()` would have passed.

The revised slice picks the right source and the right
contract: use `role.as_str()` directly. No cache needed because
the static str pointer IS already pre-computed at compile time.
The behavior-preservation test (below) is non-circular: it
compares the new direct path (`role.as_str()`, a hand-curated
match in role_template.rs) to the pre-slice-3 derived-Debug
chain.

## Test (+1 new)

`persona::service_loop::tests::role_as_str_preserves_pre_slice3_specialty_format_for_each_role`:

  - Pins that for every `RoleId` variant (Helper, Coder,
    Sentinel, Custom), `role.as_str()` produces byte-identical
    output to the pre-slice-3 `format!("{:?}", role).to_lowercase()`.
    Non-circular: two independently-derived strings. A future PR
    adjusting either `as_str()` or adding a variant where the
    two paths disagree must update this test to record the
    intentional divergence — silent drift breaks loudly.
  - Compile-time exhaustiveness: a closure-typed
    `|role: RoleId| match role { ... }` with NO wildcard arm
    forces a new RoleId variant to be added here AND to the
    `variants` array. Cosmetic-exhaustiveness defeats from the
    initial draft (matching on a literal `RoleId::Helper`) are
    fixed.

## What this slice does NOT do (deliberate)

- No `PersonaContext.specialty` field (initial draft removed
  per review). The original goal was reducing per-turn cost;
  `role.as_str()` already gives that with zero field bloat.
- No `display_name` change — `String::clone` of a fixed-per-
  session name is the same per-turn cost whether we cache or
  not, so the smell is RespondInput's `String`-typed field, not
  the source. Lift in a follow-up when `RespondInput`'s type
  changes (#149-adjacent).
- No prompt-template change — substrate keeps the existing
  specialty-string contract verbatim; the migration is
  provably behavior-preserving.

## Doctrine

- `[[init-once-handle-then-lease-zero-copy-refs]]` — the
  canonical "init-once" here is the compile-time `match` in
  `RoleId::as_str()`; per-turn cost is a `&'static str` deref +
  one `String::from` (memcpy of 6-9 static bytes). Doesn't get
  cheaper without changing `RespondInput`'s type signature.
- `[[observability-is-half-the-architecture]]` — slice 1's
  `respond_latency` aggregate will tell us if this and the
  cumulative slice-2/3+ wins are moving the needle on real
  persona turns.
- `[[no-fallbacks-ever]]` — no graceful-degradation path; the
  behavior-preservation test breaks loudly if either side
  drifts.

card: 0d780926
parent task: #195
slice 1: #1532 (merged) — phase decomposition
slice 2: #1533 (merged) — cache system_prompt at construction

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@joelteply joelteply changed the base branch from main to canary June 6, 2026 04:54
@joelteply joelteply force-pushed the 7f238140/cache-specialty branch from c8bd170 to f634870 Compare June 6, 2026 04:54
@github-actions github-actions Bot added size: M and removed size: XL labels Jun 6, 2026
@joelteply joelteply merged commit b916ca6 into canary Jun 6, 2026
10 of 12 checks passed
@joelteply joelteply deleted the 7f238140/cache-specialty branch June 6, 2026 14:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant