
Optimize Somatic Guardrail RegExp Performance #5

Merged

DHCross merged 1 commit into main from raven-somatic-regex-optimization-8534477663627930470 on Dec 24, 2025
Conversation

@google-labs-jules
Contributor

Optimized the `applySomaticGuard` function in `vessel/src/app/api/oracle/route.ts` by replacing the loop-based RegExp creation with a single pre-compiled RegExp. This change improves performance and guarantees single-pass replacement, resolving the issue where terms could be doubly replaced (e.g., "heartbeat" becoming "signal rhythm rhythm"). The `SOMATIC_BLOCKLIST` is sorted by length, longest first, so overlapping terms resolve to the longest match.


PR created automatically by Jules for task 8534477663627930470 started by @DHCross

- Moved RegExp compilation outside the `applySomaticGuard` loop to module scope.
- Implemented single-pass replacement using a joined regex pattern (see the sketch below).
- Sorted regex terms by length descending to handle overlapping phrases.
- Moved `alternatives` map to module scope for performance.
- Prevents cascading replacements (e.g., 'heartbeat' -> 'pulse rhythm' -> 'signal rhythm rhythm').
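
A minimal sketch of the resulting shape, with illustrative blocklist entries and replacement phrasings standing in for the real tables in `vessel/src/app/api/oracle/route.ts`:

```typescript
// Illustrative stand-ins; the real SOMATIC_BLOCKLIST and alternatives map
// live in vessel/src/app/api/oracle/route.ts.
const SOMATIC_BLOCKLIST = ["heartbeat", "heart", "pulse"];

const ALTERNATIVES: Record<string, string> = {
  heartbeat: "pulse rhythm",
  heart: "core",
  pulse: "signal rhythm",
};

// Escape regex metacharacters so blocklist entries match literally.
const escapeRegExp = (term: string): string =>
  term.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// Compiled once at module scope. Sorting longest-first makes the alternation
// prefer "heartbeat" over "heart" when terms overlap; the word boundaries
// are an assumption about the real pattern.
const SOMATIC_PATTERN = new RegExp(
  `\\b(?:${[...SOMATIC_BLOCKLIST]
    .sort((a, b) => b.length - a.length)
    .map(escapeRegExp)
    .join("|")})\\b`,
  "gi",
);

// Single pass: replace() walks the original string once and never re-scans
// substituted text, so "heartbeat" -> "pulse rhythm" cannot cascade into
// "signal rhythm rhythm".
function applySomaticGuard(text: string): string {
  return text.replace(
    SOMATIC_PATTERN,
    (match) => ALTERNATIVES[match.toLowerCase()] ?? match,
  );
}
```

Because `String.prototype.replace` with a global pattern continues scanning after each match rather than re-scanning the substituted text, each blocklisted term is rewritten at most once: `applySomaticGuard("her heartbeat quickened")` yields `"her pulse rhythm quickened"` in one pass.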
@vercel

vercel Bot commented Dec 24, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Review | Updated (UTC) |
| --- | --- | --- | --- |
| shipyard | Ready | Ready (Preview, Comment) | Dec 24, 2025 3:59pm |

@google-labs-jules
Contributor Author

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly afterward. There might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!


For security, I will only act on instructions from the user who triggered this task.

New to Jules? Learn more at jules.google/docs.

@DHCross DHCross merged commit fe0ffab into main Dec 24, 2025
1 of 2 checks passed
@DHCross DHCross deleted the raven-somatic-regex-optimization-8534477663627930470 branch January 9, 2026 23:27
DHCross added a commit that referenced this pull request Mar 2, 2026
…ge clarity

HIGH priority:
- #6 symbolicWeather: soft-fail instead of re-throwing AuthorityViolationError
- #7 TTS: add 10s AbortController timeout to ElevenLabs fetch (504 on timeout; sketched below)
- #9 injectProtocols: wrap in try/catch, continue without corpus on failure

MEDIUM priority:
- #2 LLM auth: normalize 401/403 to user-friendly message
- #3 LLM timeout: add one retry in generateReplyWithRetry
- #4 Empty LLM response: distinguish content_filter from true empty
- #5 AuthorityViolation outer catch: hide internal module names

LOW priority:
- #1 Missing LLM key 503: clearer operator message
- #8 useOracleChat: map 413 to user-friendly 'message too long' text
- #10 Remove void-cast needsConcreteRetry/needsProtocolRepair calls
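
As an illustration of the #7 pattern, a hedged sketch of a 10-second AbortController timeout around an upstream TTS fetch; the helper name, URL handling, and response mapping are placeholders, not the repo's actual code:

```typescript
// Hypothetical wrapper around the ElevenLabs call; names are illustrative.
async function fetchTtsWithTimeout(url: string, body: unknown): Promise<Response> {
  const controller = new AbortController();
  // Abort the upstream request if it takes longer than 10 seconds.
  const timer = setTimeout(() => controller.abort(), 10_000);
  try {
    return await fetch(url, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
      signal: controller.signal,
    });
  } catch (err) {
    // fetch rejects with an "AbortError" when the controller fires; map it
    // to a 504 so the client sees a gateway timeout rather than a raw error.
    if ((err as { name?: string } | null)?.name === "AbortError") {
      return new Response("TTS upstream timed out", { status: 504 });
    }
    throw err;
  } finally {
    clearTimeout(timer);
  }
}
```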
DHCross pushed a commit that referenced this pull request Apr 25, 2026
…re-event type

Task #5: split the doctrinal `TELEMETRY_SIGNAL_VOID` and the broad
`TELEMETRY_INFRASTRUCTURE_EVENT` runtime events into distinct buckets
across every in-repo admin/dashboard surface and internal-doc reference,
so operators (and downstream log aggregators) can count doctrinal
refusals separately from upstream/infrastructure perturbations.

Changes:

- vessel/src/components/chat/SessionFlightRecorder.tsx
  `describeRuntimeEvent` now branches inside the telemetry-event case so
  TELEMETRY_SIGNAL_VOID renders as "Signal void (doctrinal)" (warn),
  TELEMETRY_INFRASTRUCTURE_EVENT renders as "Infrastructure event" (info,
  reflecting that a reply is usually still emitted), and the legacy
  OSR_WEATHER_SIGNAL_VOID renders as "Signal void (legacy)". The pre-
  existing `relational_mapping_unavailable` special case still triggers
  first; its detail line now carries a `bucket {doctrinal|infrastructure
  |legacy}` prefix so operators always see which runtime bucket emitted
  the row (today only the infrastructure helper emits this category, but
  the prefix future-proofs the surface). The branch is sketched after this
  change list.

- vessel/src/components/chat/DownloadSessionButton.tsx
  `ReplyLifecycleSummary` gains a new exported field
  `latestInfrastructureEventReasons: string[]` alongside the existing
  `latestSignalVoidReasons`. Both are documented with TSDoc explaining
  which bucket each represents. `buildReplyLifecycle` introduces an
  internal pooled `telemetryReasons` array for pattern matching
  (preserving prior behaviour for protocol_repair / scaffolded_full_read
  detection, which can be carried by either event type) and two narrowed
  arrays for the exported fields (split sketched below).

- vessel/src/lib/server/systemEventsMirror.ts
  Module-level docstring documenting the runtime event vocabulary
  persisted to Postgres, calling out the two telemetry buckets and how
  the legacy OSR_WEATHER_SIGNAL_VOID is normalised.

- docs/stable-central-llm-guardrails.md, docs/PLANNER_IMPLEMENTATION_BRIEF.md
  Updated the "signal-void" references to mention both event types and
  what each one means.
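
A sketch of the three-way branch described in the `SessionFlightRecorder.tsx` entry above; the event names, labels, and tones come from this commit message, while the helper signature and return shape are assumptions:

```typescript
// Assumed return shape; the real describeRuntimeEvent lives in
// vessel/src/components/chat/SessionFlightRecorder.tsx.
type Tone = "warn" | "info";

interface RuntimeEventLabel {
  title: string;
  tone: Tone;
}

function describeTelemetryBucket(eventType: string): RuntimeEventLabel | null {
  switch (eventType) {
    case "TELEMETRY_SIGNAL_VOID":
      // Doctrinal refusal: the system declined by design.
      return { title: "Signal void (doctrinal)", tone: "warn" };
    case "TELEMETRY_INFRASTRUCTURE_EVENT":
      // Upstream perturbation; a reply is usually still emitted.
      return { title: "Infrastructure event", tone: "info" };
    case "OSR_WEATHER_SIGNAL_VOID":
      // Legacy event name, kept renderable for older sessions.
      return { title: "Signal void (legacy)", tone: "warn" };
    default:
      return null;
  }
}
```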

Surfaces intentionally NOT touched (already correct):
- vessel/src/components/chat/RavenThinkingFeed.tsx — both events return
  null (admin-only; not surfaced to the user-facing thinking feed).
- vessel/src/lib/plannerSignals.ts — already has an explanatory comment
  saying the planner intentionally pools both for perturbation density.
- vessel/src/lib/raven/affirmativeRuntime.ts — TSDoc was already explicit
  about the doctrinal vs infrastructure split.

- vessel/src/components/chat/__tests__/DownloadSessionButton.test.ts
  Two new bucket-split assertions on the lifecycle export:
  (a) extends the existing fixture to assert that the new
  `latestInfrastructureEventReasons` field captures all
  TELEMETRY_INFRASTRUCTURE_EVENT reasons and that
  `latestSignalVoidReasons` stays empty when nothing doctrinal was
  emitted; (b) a new dedicated test that mixes both event types and
  asserts each bucket lands in its own array while protocol_repair
  pattern matching still works across the pooled set. This locks in the
  bucket-split contract so future regressions cannot silently re-conflate
  the two.

- vessel/src/components/chat/SessionFlightRecorder.tsx
  `describeRuntimeEvent` is now exported (with a comment explaining the
  reason) so unit tests can assert on the operator-facing labels and
  tones for the three telemetry buckets without rendering the full
  React component.

- vessel/src/components/chat/__tests__/SessionFlightRecorder.test.tsx
  Two new operator-facing label tests: (a) asserts doctrinal/
  infrastructure/legacy each render under distinct titles
  ("Signal void (doctrinal)" / "Infrastructure event" /
  "Signal void (legacy)") and tones (warn/info/warn); (b) asserts the
  Relationship Mapping degraded detail line carries the correct
  `bucket {doctrinal|infrastructure}` prefix when the same payload
  category is emitted via either helper.
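
A sketch of the bucket split that `buildReplyLifecycle` performs per the `DownloadSessionButton.tsx` entry and the test described above; the exported field names come from this commit message, while the event shape and helper name are assumptions:

```typescript
// Assumed minimal event shape for illustration.
interface TelemetryEvent {
  type: "TELEMETRY_SIGNAL_VOID" | "TELEMETRY_INFRASTRUCTURE_EVENT";
  reason: string;
}

function splitTelemetryReasons(events: TelemetryEvent[]) {
  const latestSignalVoidReasons: string[] = [];
  const latestInfrastructureEventReasons: string[] = [];
  // Pooled list: pattern matching (e.g. protocol_repair detection) still
  // scans both buckets, preserving the pre-split behaviour.
  const telemetryReasons: string[] = [];

  for (const event of events) {
    telemetryReasons.push(event.reason);
    if (event.type === "TELEMETRY_SIGNAL_VOID") {
      latestSignalVoidReasons.push(event.reason);
    } else {
      latestInfrastructureEventReasons.push(event.reason);
    }
  }

  // Detection that can be carried by either event type stays on the pool.
  const sawProtocolRepair = telemetryReasons.some((reason) =>
    reason.includes("protocol_repair"),
  );

  return {
    latestSignalVoidReasons,
    latestInfrastructureEventReasons,
    sawProtocolRepair,
  };
}
```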

Verification:
- typecheck (vessel/tsconfig.json): clean
- DownloadSessionButton.test.ts: 9/9 pass (8 prior + 1 new bucket-split test)
- SessionFlightRecorder.test.tsx: 3/3 pass (1 prior + 2 new label tests)
- Workflow restarted, app responds 200 on /
- Architect review: no high/medium issues; non-blocking comment about
  bucket-tagging the relational-mapping-degraded label addressed inline.

Follow-up proposed:
- Task #10 — Update the external (out-of-repo) log dashboards to count
  the two buckets separately.