perf(persona): drop per-turn format!()+lowercase — use canonical RoleId::as_str() (#195 slice 3)#1534
Merged
Merged
Conversation
becdb7c to
c8bd170
Compare
…Id::as_str() (#195 slice 3) Slice 3 of task #195 (inference latency perfection campaign). Pre-slice-3, `serve_persona_loop_inner` did, every turn: ```rust specialty: format!("{:?}", ctx.role).to_lowercase(), ``` Two allocations, a variadic-macro dispatch through derived-Debug, and a Unicode lowercase walk — all to produce a value `RoleId` already exposes via its canonical `as_str() -> &'static str` method (in `role_template.rs`, pinned at compile time). ## What ships Switch the per-turn read from the Debug+lowercase chain to `ctx.role.as_str()`: ```rust let respond_input = RespondInput { persona: PersonaSlot { // was: format!("{:?}", ctx.role).to_lowercase() specialty: ctx.role.as_str().to_string(), /* ... */ }, /* ... */ }; ``` Per-turn cost drops from `format!()` (variadic macro + Debug formatter + heap alloc) + `.to_lowercase()` (second alloc + Unicode lowercase walk) to a single `String::from` of a static `&'static str`. The source IS the cache — no field on `PersonaContext`, no helper, no Arc plumbing. ## Why this shape (not a cache) Initial draft of slice 3 added `PersonaContext.specialty: Arc<str>` populated by a `build_persona_specialty(role)` helper that pre- baked the Debug+lowercase output. Adversarial review (3 agents) correctly identified two problems: 1. **Caching the wrong source.** `RoleId::as_str()` already exists as the documented, intentionally-stable specialty derivation (kebab-case identifier, pinned by an explicit `match` so renames are deliberate). Caching the Debug+lowercase output couples the prompt-assembly contract to the derived-Debug format — a future contributor adding a custom Debug or wrapping the enum silently breaks the specialty string with no test failure. 2. **Caching test was a tautology.** The helper was `Arc::from(format!("{:?}", role).to_lowercase())` and the test compared its output to `format!("{:?}", role).to_lowercase()` — comparing the helper to its own implementation. A buggy helper using `.to_uppercase()` would have passed. The revised slice picks the right source and the right contract: use `role.as_str()` directly. No cache needed because the static str pointer IS already pre-computed at compile time. The behavior-preservation test (below) is non-circular: it compares the new direct path (`role.as_str()`, a hand-curated match in role_template.rs) to the pre-slice-3 derived-Debug chain. ## Test (+1 new) `persona::service_loop::tests::role_as_str_preserves_pre_slice3_specialty_format_for_each_role`: - Pins that for every `RoleId` variant (Helper, Coder, Sentinel, Custom), `role.as_str()` produces byte-identical output to the pre-slice-3 `format!("{:?}", role).to_lowercase()`. Non-circular: two independently-derived strings. A future PR adjusting either `as_str()` or adding a variant where the two paths disagree must update this test to record the intentional divergence — silent drift breaks loudly. - Compile-time exhaustiveness: a closure-typed `|role: RoleId| match role { ... }` with NO wildcard arm forces a new RoleId variant to be added here AND to the `variants` array. Cosmetic-exhaustiveness defeats from the initial draft (matching on a literal `RoleId::Helper`) are fixed. ## What this slice does NOT do (deliberate) - No `PersonaContext.specialty` field (initial draft removed per review). The original goal was reducing per-turn cost; `role.as_str()` already gives that with zero field bloat. - No `display_name` change — `String::clone` of a fixed-per- session name is the same per-turn cost whether we cache or not, so the smell is RespondInput's `String`-typed field, not the source. Lift in a follow-up when `RespondInput`'s type changes (#149-adjacent). - No prompt-template change — substrate keeps the existing specialty-string contract verbatim; the migration is provably behavior-preserving. ## Doctrine - `[[init-once-handle-then-lease-zero-copy-refs]]` — the canonical "init-once" here is the compile-time `match` in `RoleId::as_str()`; per-turn cost is a `&'static str` deref + one `String::from` (memcpy of 6-9 static bytes). Doesn't get cheaper without changing `RespondInput`'s type signature. - `[[observability-is-half-the-architecture]]` — slice 1's `respond_latency` aggregate will tell us if this and the cumulative slice-2/3+ wins are moving the needle on real persona turns. - `[[no-fallbacks-ever]]` — no graceful-degradation path; the behavior-preservation test breaks loudly if either side drifts. card: 0d780926 parent task: #195 slice 1: #1532 (merged) — phase decomposition slice 2: #1533 (merged) — cache system_prompt at construction Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
c8bd170 to
f634870
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Slice 3 of task #195 (inference latency perfection campaign). Drop the per-turn
format!("{:?}", ctx.role).to_lowercase()fromserve_persona_loop_innerand read the canonicalRoleId::as_str()directly. No new field onPersonaContext, no cache plumbing —as_str()'s&'static strIS the cache.What ships (one-line behavior change + one test)
Per-turn cost drops from
format!()(variadic macro + Debug formatter + heap alloc) +.to_lowercase()(second alloc + Unicode lowercase walk) to a singleString::fromof a static&'static str.Why this shape (not a cache)
Initial draft of slice 3 added a
PersonaContext.specialty: Arc<str>cache populated by abuild_persona_specialty(role)helper. Three adversarial reviewers correctly flagged:RoleId::as_str()is the documented, intentionally-stable specialty derivation (kebab-case identifier, pinned by an explicitmatchinrole_template.rs). Caching the Debug+lowercase chain instead couples prompt assembly to the derived-Debug format — a future variant rename or wrapper struct could silently break the specialty with no test failure.Arc::from(format!("{:?}", role).to_lowercase())and the test compared its output toformat!("{:?}", role).to_lowercase()— comparing the helper to its own implementation.The revised slice picks the right source and the right contract: use
role.as_str()directly. No cache because the static-str pointer IS already pre-computed at compile time. The new test is non-circular: it compares two independently-derived strings (role.as_str()fromrole_template.rs::match, vs the pre-slice-3 derived-Debug chain).Test
role_as_str_preserves_pre_slice3_specialty_format_for_each_role:RoleIdvariant (Helper, Coder, Sentinel, Custom),role.as_str()produces byte-identical output to the pre-slice-3format!("{:?}", role).to_lowercase(). Non-circular.|role: RoleId| match role { ... }with NO wildcard arm forces a new RoleId variant to be added here AND to thevariantsarray. (Initial draft's "cosmetic exhaustiveness" — matching on a literalRoleId::Helper— was fixed.)Out of scope
display_namechange.String::cloneof a fixed-per-session name has the same per-turn cost cache-or-not; the smell isRespondInput'sString-typed field, not the source. Lift whenRespondInput's type changes (Build(deps-dev): Bump @typescript-eslint/eslint-plugin from 8.29.1 to 8.46.2 #149-adjacent).Doctrine
[[init-once-handle-then-lease-zero-copy-refs]]— canonical "init-once" here is the compile-timematchinRoleId::as_str(). Per-turn cost is a&'static strderef + oneString::from(memcpy of 6-9 static bytes). Doesn't get cheaper without changingRespondInput's type signature.[[observability-is-half-the-architecture]]— slice 1'srespond_latencyaggregate will tell us if this + cumulative slice-2/3+ wins are moving the needle on real persona turns.[[no-fallbacks-ever]]— no graceful-degradation; the behavior-preservation test breaks loudly if either side drifts.card:
0d780926parent task: #195
slice 1: #1532 (merged)
slice 2: #1533 (merged)