Skip to content

feat(prd): stress-test web UI — trigger + streaming progress (#561)#595

Merged
frankbria merged 5 commits into
mainfrom
feat/prd-stress-test-web-ui-561
May 27, 2026
Merged

feat(prd): stress-test web UI — trigger + streaming progress (#561)#595
frankbria merged 5 commits into
mainfrom
feat/prd-stress-test-web-ui-561

Conversation

@frankbria
Copy link
Copy Markdown
Owner

@frankbria frankbria commented May 27, 2026

Summary

Adds a Stress Test button to the PRD page that runs the existing recursive-decomposition stress test over SSE and streams progress into a modal. Closes #561.

This is the web surface for cf prd stress-test — web-only users can now run the most valuable part of the THINK phase (surfacing PRD ambiguities) without the CLI.

What changed

Backend (codeframe/)

  • core/prd_stress_test.py: new stress_test_prd_stream() async generator wrapping the headless stress_test_prd pipeline, yielding goals_extracted / goal_analyzed / complete / error events. Each blocking provider.complete() is offloaded via asyncio.to_thread so the event loop stays responsive. Stays headless (no FastAPI imports).
  • ui/routers/prd_v2.py: new GET /api/v2/prd/stress-test SSE endpoint that streams those events from the latest PRD. Missing PRD / missing ANTHROPIC_API_KEY surface as in-stream error events.

Frontend (web-ui/)

  • hooks/useStressTestStream.ts: hook built on the existing useEventSource, parsing events into an idle → streaming → complete | error state machine with human-readable progress lines.
  • components/prd/StressTestModal.tsx: streaming log, completion summary (ambiguity count), and a graceful error state with Retry.
  • Stress Test button wired through PRDHeader → PRDView → /prd page, enabled only when a PRD exists.

Deviations from the issue's plan (and why)

  1. Endpoint is GET, not POST. Browser EventSource only issues GET requests; this matches the existing GET /api/v2/tasks/{id}/stream SSE endpoint. (The plan's own sequence diagram shows a GET.)
  2. useStressTestStream hook instead of a raw EventSource factory in api.ts. Matches the established useTaskStream convention and connects directly to NEXT_PUBLIC_SSE_URL — the Next.js rewrite proxy buffers chunked responses and would break SSE streaming.
  3. Event types live in the hook (like useTaskStream), re-exported from hooks/index.ts.
  4. Recoverable problems (no PRD / no key) are in-stream error events, not HTTP errors, so the browser EventSource renders them via the modal's error state.

Testing

  • Backend: uv run pytest tests/core/test_prd_stress_test.py tests/ui/test_prd_stress_test_router.py — green; ruff + mypy clean.
  • Frontend: full Jest suite 905 passed; npm run build succeeds. New code ~98% covered.
  • Demo (live, real server + real frontend): verified the button, streaming lines in the modal, completion summary, and the error+retry state end to end. Screenshots in /tmp/demo-561-*.png. The LLM provider was the only mock (no API key in the demo env); the endpoint, core generator, SSE framing, hook, and modal all ran for real.

Acceptance criteria

Criterion Evidence
Stress Test button visible (enabled only with a PRD) Live snapshot of /prd shows the button; PRDHeader test asserts disabled when no PRD
Clicking starts the stress-test + shows streaming output Live modal rendered ✓ Extracted 3 goals → per-goal lines → ✓ Analysis complete — 1 ambiguity found
Backend endpoint calls core stress-test logic Live SSE emitted the real ordered events + rendered tech spec / ambiguity report from the core generator
Error state handled gracefully Live modal showed Stress test failed + backend message + Retry button
npm test and uv run pytest pass Verified

Known limitations

Summary by CodeRabbit

  • New Features

    • PRD stress-test with real-time streamed progress (goals, per-goal analysis, final spec & ambiguity report)
    • Server SSE endpoint to stream progress to clients
  • UI

    • "Stress Test" button, interactive modal, auto-scrolling progress log, retry/close controls
  • Client

    • Hook to subscribe to the stress-test stream exposing status, human-readable lines, results, and controls
  • Tests & Docs

    • End-to-end and unit tests for streaming, errors, disconnects; updated docs and roadmap notes

Review Change Stack

Add a Stress Test button to the PRD page that runs the existing
recursive-decomposition stress test over SSE and streams progress into
a modal.

Backend:
- core: stress_test_prd_stream() async generator wraps the headless
  stress_test_prd pipeline, yielding goals_extracted / goal_analyzed /
  complete / error events (provider.complete offloaded via asyncio.to_thread)
- GET /api/v2/prd/stress-test SSE endpoint streams those events; missing
  PRD / missing ANTHROPIC_API_KEY surface as in-stream error events.
  GET (not POST) for browser EventSource compatibility, matching the
  existing task stream endpoint.

Frontend:
- useStressTestStream hook (built on useEventSource, direct SSE connect)
  parses events into an idle->streaming->complete|error state machine
- StressTestModal renders streaming lines, completion summary, and a
  graceful error state with Retry
- Stress Test button wired through PRDHeader -> PRDView -> /prd page,
  enabled only when a PRD exists

Results rendering / answer flow is out of scope (tracked in #562); the
hook retains tech_spec_markdown + ambiguity_report for that work.

Closes #561
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 27, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: fd899ef9-22a8-489c-b5ee-a03f08415216

📥 Commits

Reviewing files that changed from the base of the PR and between d4d9267 and 2015b25.

📒 Files selected for processing (3)
  • CLAUDE.md
  • docs/PHASE_2_CLI_API_MAPPING.md
  • docs/PRODUCT_ROADMAP.md
✅ Files skipped from review due to trivial changes (3)
  • docs/PHASE_2_CLI_API_MAPPING.md
  • docs/PRODUCT_ROADMAP.md
  • CLAUDE.md

Walkthrough

Adds an async backend generator that streams PRD stress-test progress via an SSE router endpoint; a React hook subscribes, accumulates human-readable lines, and a modal UI renders streaming, completion, and error states with button wiring on the PRD page and tests across layers.

Changes

PRD Stress-Test Streaming Feature

Layer / File(s) Summary
Core async streaming generator and tests
codeframe/core/prd_stress_test.py, tests/core/test_prd_stress_test.py
stress_test_prd_stream async generator uses asyncio.to_thread to offload synchronous goal extraction and decomposition, emitting typed progress events (goals_extracted, goal_analyzed, complete, error). Tests verify event order, payloads, and error handling.
Backend SSE endpoint and integration tests
codeframe/ui/routers/prd_v2.py, tests/ui/test_prd_stress_test_router.py
GET /api/v2/prd/stress-test frames core events as SSE data: frames, resolves latest workspace PRD and LLM provider, emits in-stream error for missing PRD/API key, and aborts iteration on client disconnect. Router tests cover normal flow, missing PRD/key, provider selection, and disconnect semantics.
Client-side streaming hook and tests
web-ui/src/hooks/useStressTestStream.ts, web-ui/src/__tests__/hooks/useStressTestStream.test.ts
useStressTestStream connects to SSE endpoint, maintains idle→streaming→complete
Modal component, tests, and barrel exports
web-ui/src/components/prd/StressTestModal.tsx, web-ui/src/__tests__/components/prd/StressTestModal.test.tsx, web-ui/src/components/prd/index.ts, web-ui/src/hooks/index.ts, web-ui/__mocks__/@hugeicons/react.js
StressTestModal wires the hook lifecycle, auto-scrolls streaming lines, renders streaming/complete/error UI with ambiguity messaging and retry/close/cancel actions. Adds barrel exports and an icon mock for tests.
PRD page and header stress-test button
web-ui/src/components/prd/PRDHeader.tsx, web-ui/src/components/prd/PRDView.tsx, web-ui/src/app/prd/page.tsx, web-ui/src/__tests__/components/prd/PRDHeader.test.tsx, web-ui/src/__tests__/components/prd/PrdPage.test.tsx
PRDHeader gains optional onStressTest prop and renders a "Stress Test" button (disabled when no PRD). Prop is threaded through PRDView; PrdPage manages modal open state and mounts StressTestModal. Component tests validate visibility, enabled/disabled state, and modal open workflow.

Sequence Diagram

sequenceDiagram
  participant User
  participant PRDPage
  participant StressTestModal
  participant useStressTestStream
  participant EventSource
  participant Backend
  User->>PRDPage: Click "Stress Test" button
  PRDPage->>PRDPage: setStressTestOpen(true)
  PRDPage->>StressTestModal: mount open=true
  StressTestModal->>useStressTestStream: start()
  useStressTestStream->>EventSource: new EventSource(url)
  EventSource->>Backend: GET /api/v2/prd/stress-test
  Backend->>Backend: stress_test_prd_stream(prd, provider)
  loop emit events
    Backend-->>EventSource: data: {type:goals_extracted, goals:[...]}
    EventSource-->>useStressTestStream: goals_extracted
    useStressTestStream->>useStressTestStream: append lines, status=streaming
    useStressTestStream-->>StressTestModal: render progress
    Backend-->>EventSource: data: {type:goal_analyzed, classification, ambiguity_count}
    EventSource-->>useStressTestStream: goal_analyzed
  end
  Backend-->>EventSource: data: {type:complete, tech_spec_markdown, ambiguity_report}
  EventSource-->>useStressTestStream: complete
  useStressTestStream->>useStressTestStream: status=complete, close EventSource
  useStressTestStream-->>StressTestModal: status/result updated
  User->>StressTestModal: Click "Close"
  StressTestModal->>useStressTestStream: reset()
  StressTestModal->>PRDPage: onOpenChange(false)
  PRDPage->>PRDPage: setStressTestOpen(false)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • frankbria/codeframe#446: Implements the core synchronous stress_test_prd logic that this PR wraps with an async streaming interface and exposes via the web UI.
  • frankbria/codeframe#525: Touches the same PRD UI wiring files (PrdPage, PRDView, PRDHeader) for adding header actions/modal buttons.

Poem

🧪 I twitch my whiskers at streaming lines,
Threads hum softly under moonlit pines,
Goals split, reports arrive, events descend,
Ambiguities counted until the end,
A rabbit cheers: “Stress test — onward, friend!” 🐰

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 34.62% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title accurately describes the main feature: a stress-test web UI with trigger button and streaming progress, directly matching the primary changeset objective and issue #561.
Linked Issues check ✅ Passed The PR fulfills all coding requirements from #561: stress-test button added to PRD page (enabled only with PRD present), modal opens with streaming progress, backend SSE endpoint implemented (GET /api/v2/prd/stress-test), loading state with spinner/output, error handling with retry, tests pass.
Out of Scope Changes check ✅ Passed All changes are directly scoped to implementing the stress-test web UI trigger and streaming (backend generator, SSE endpoint, frontend hook, modal, button wiring) and supporting tests; no unrelated modifications found.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/prd-stress-test-web-ui-561

Comment @coderabbitai help to get the list of available commands and usage tips.

@claude
Copy link
Copy Markdown

claude Bot commented May 27, 2026

test

@claude
Copy link
Copy Markdown

claude Bot commented May 27, 2026

Review: feat(prd): stress-test web UI - see follow-up comments for full review

@claude
Copy link
Copy Markdown

claude Bot commented May 27, 2026

High: Multi-provider support bypassed

ui/routers/prd_v2.py hardcodes the Anthropic provider and gates on ANTHROPIC_API_KEY. Per CLAUDE.md, provider selection should follow the chain: CLI flag -> env var -> .codeframe/config.yaml -> default (anthropic). A workspace configured with CODEFRAME_LLM_PROVIDER=openai (or a local Ollama model) will hit the hard-coded error even though the system supports multi-provider. The other routers route through the existing adapter selection logic — this endpoint should do the same rather than importing AnthropicProvider directly.

@claude
Copy link
Copy Markdown

claude Bot commented May 27, 2026

High: Missing rate limiting on an expensive endpoint

Every other router in prd_v2.py (and across the codebase) carries @rate_limit_standard(). The new /api/v2/prd/stress-test endpoint runs recursive LLM decomposition (potentially many calls) with no rate limiting at all. This is a straightforward DoS vector: a single client can hammer it without any throttling. Should get @rate_limit_standard() (or a tighter custom limit given the cost) like the surrounding endpoints.

@claude
Copy link
Copy Markdown

claude Bot commented May 27, 2026

Medium: Redundant scroll containers in StressTestModal

In StressTestModal.tsx, the ScrollArea wraps a div that already has its own overflow-y-auto max-h-[40vh]. Shadcn ScrollArea provides its own styled scrollable viewport internally. The inner div's overflow-y-auto creates a second nested scroll container, so the Shadcn-styled scrollbar is bypassed in favour of the browser native one, making ScrollArea effectively a no-op here. Either: (a) use ScrollArea and attach the auto-scroll ref to its viewport, or (b) drop ScrollArea and keep just the inner div. The latter is simpler given that the auto-scroll ref is already wired to the inner div.

@claude
Copy link
Copy Markdown

claude Bot commented May 27, 2026

Low: No server-side disconnect detection

When the user clicks Cancel mid-stream, the EventSource closes client-side. However stress_test_prd_stream has no way to detect that disconnect — asyncio.to_thread will let the current recursive_decompose call run to completion before the generator can observe the closed connection. A request.is_disconnected() check between yields in _generate() would cover this for large PRDs with high max_depth. Not blocking for this PR but worth tracking.

@claude
Copy link
Copy Markdown

claude Bot commented May 27, 2026

What's well done

  • asyncio.to_thread offloading for blocking provider.complete() calls is exactly right
  • Errors surface as in-stream error events rather than HTTP 4xx/5xx — correct SSE pattern since EventSource cannot read HTTP status codes
  • runId bump on retry is a clean way to force useEventSource to reconnect (avoids URL equality no-op)
  • maxRetries: 0 correctly prevents auto-reconnect after a one-shot stream closes
  • X-Accel-Buffering: no header present — nginx proxy buffering will not silently swallow SSE chunks
  • Test coverage is thorough: mock EventSource, all status transitions, retry lifecycle, error + retry round-trip, missing PRD / missing key error events
  • onStressTest as an optional prop on PRDHeader is a clean way to keep the button conditional at the call-site level

Address cross-family (codex) review:
- Frontend: useStressTestStream now observes useEventSource onError, so a
  transport-level failure (server down, 404/CORS, dropped connection) with
  no data frame transitions the modal to the error state instead of hanging
  on 'Analyzing PRD...'. Guarded to ignore transient (non-CLOSED) reconnect
  errors and to not clobber a backend-sent error event.
- Backend: extract the SSE generator to _stress_test_event_stream and check
  request.is_disconnected() between events so an abandoned stream stops
  issuing further (billable) LLM calls — mirroring event_stream_generator.
@frankbria
Copy link
Copy Markdown
Owner Author

Cross-family review (codex) — addressed in 18ef219

codex flagged two P2 streaming edge cases; both fixed with tests:

  1. Transport-level EventSource failures never reached the UI. useStressTestStream now wires useEventSource's onError so a connection failure with no data: frame (server down, 404/CORS, dropped connection) transitions the modal to the error state instead of hanging on "Analyzing PRD…". Guarded to ignore transient (non-CLOSED) reconnect errors and to not overwrite a backend-sent error event.
  2. Abandoned streams kept burning LLM calls. The SSE generator (extracted to _stress_test_event_stream) now checks request.is_disconnected() between events and stops iterating the core generator, so closing the modal / navigating away halts further decomposition — mirroring event_stream_generator in streaming_v2.

Tests added: transport-failure / transient-error / no-clobber (hook) and disconnect-abort / stays-connected (router). Full suite: 908 frontend + 33 backend stress-test tests green.

@claude
Copy link
Copy Markdown

claude Bot commented May 27, 2026

Follow-up review — after 18ef219

Addressed since last review

The two P2 streaming edge cases flagged by codex are properly resolved:

  • Transport-level errors now propagate to the modal via onError wiring (with correct guards for transient reconnects and no-clobber of backend error events).
  • Disconnect detection is now in _stress_test_event_stream via request.is_disconnected() between yields — mirrors the existing event_stream_generator pattern correctly.

Still unresolved — both block merge

1. Multi-provider support bypassed (High)
_stress_test_event_stream still hardcodes AnthropicProvider with a direct ANTHROPIC_API_KEY check. A workspace configured with CODEFRAME_LLM_PROVIDER=openai or a local Ollama model will hit the hard-coded error path even though the system already supports those providers. Every other endpoint routes through the shared adapter selection chain (CLI flag → env var → .codeframe/config.yaml → default). This one should too rather than being an Anthropic-only carveout.

2. Missing rate limiting (High)
@rate_limit_standard() is on every other endpoint in prd_v2.py. The stress-test endpoint has no decorator at all. Given that each call fans out to multiple LLM completions (one per goal × depth), the cost per request is higher than average, making the omission more consequential. At minimum @rate_limit_standard() should be added; a tighter custom limit would be better.


Still present — medium

Redundant scroll containers in StressTestModal
ScrollArea from Shadcn wraps a div that also carries overflow-y-auto max-h-[40vh]. The inner div creates a native scrollbar; Shadcn's styled scrollbar is never reached. Since scrollRef is already wired to the inner div, the simplest fix is to drop ScrollArea and keep only the inner div — behaviour is identical, markup is simpler.


No new issues

The rest of the implementation is clean. The runId bump for retry, maxRetries: 0 to prevent reconnect loops, the async test suite with a proper mock EventSource, and the optional onStressTest prop pattern are all well done.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@codeframe/ui/routers/prd_v2.py`:
- Around line 244-249: The stress-test SSE endpoint function
stress_test_prd_stream_endpoint is missing the standard rate limiting decorator;
add the `@rate_limit_standard`() decorator immediately above the function
definition (matching how other endpoints in this router are protected) so the
LLM-backed streaming path uses the same rate-limiting policy and prevents burst
abuse.

In `@tests/ui/test_prd_stress_test_router.py`:
- Around line 197-200: The assertion on mock_provider.complete.call_count is too
loose (uses "< 6") and can miss regressions; tighten it to guard against a full
non-aborted run (which makes 4 provider calls) by changing the assertion to
assert mock_provider.complete.call_count < 4 so the test fails if abort behavior
regresses to a full run.

In `@web-ui/src/hooks/useStressTestStream.ts`:
- Around line 101-104: The hook sets the streaming state and calls start() even
when workspacePath is null, so ensure you guard against missing workspacePath
before initiating the EventSource: in useStressTestStream.ts check workspacePath
(and that url !== null) before calling start() or flipping to the "streaming"
state where the const url is computed, and return/abort early (or set state to
idle/error) if workspacePath is absent; apply the same guard to the other
start/streaming block around the code referenced at lines 182-190 to prevent
entering an analyzing/streaming state without a valid EventSource URL.
- Around line 8-65: The exported Stress Test UI types (StressTestEventType,
StressTestGoalsExtractedEvent, StressTestGoalAnalyzedEvent,
StressTestCompleteEvent, StressTestErrorEvent, StressTestEvent,
StressTestStatus, StressTestResultData, UseStressTestStreamReturn) should be
relocated into the shared types module (types/index) and this hook file should
import them instead of redeclaring them; remove the local exports, add imports
for each moved type, and ensure the hook implementation and any consumers
reference the imported types to keep the public type surface consistent.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 977f6039-e5ef-466e-ab73-c8bce7fb2caa

📥 Commits

Reviewing files that changed from the base of the PR and between 38007da and 18ef219.

📒 Files selected for processing (16)
  • codeframe/core/prd_stress_test.py
  • codeframe/ui/routers/prd_v2.py
  • tests/core/test_prd_stress_test.py
  • tests/ui/test_prd_stress_test_router.py
  • web-ui/__mocks__/@hugeicons/react.js
  • web-ui/src/__tests__/components/prd/PRDHeader.test.tsx
  • web-ui/src/__tests__/components/prd/PrdPage.test.tsx
  • web-ui/src/__tests__/components/prd/StressTestModal.test.tsx
  • web-ui/src/__tests__/hooks/useStressTestStream.test.ts
  • web-ui/src/app/prd/page.tsx
  • web-ui/src/components/prd/PRDHeader.tsx
  • web-ui/src/components/prd/PRDView.tsx
  • web-ui/src/components/prd/StressTestModal.tsx
  • web-ui/src/components/prd/index.ts
  • web-ui/src/hooks/index.ts
  • web-ui/src/hooks/useStressTestStream.ts

Comment thread codeframe/ui/routers/prd_v2.py
Comment thread tests/ui/test_prd_stress_test_router.py Outdated
Comment thread web-ui/src/hooks/useStressTestStream.ts Outdated
Comment thread web-ui/src/hooks/useStressTestStream.ts
Test User added 2 commits May 26, 2026 17:58
- Add @rate_limit_standard() to the stress-test SSE endpoint (it runs
  recursive LLM decomposition — throttle burst abuse), matching the rest
  of prd_v2. (coderabbit Major / claude High)
- Move SSE event payload types to web-ui/src/types/prd.ts (re-exported
  from types/index.ts) per the repo's type-location guideline; the hook
  imports them. (coderabbit Major)
- Guard useStressTestStream.start() when workspacePath is null — fail
  fast to the error state instead of hanging in 'streaming'. (coderabbit)
- Drop the redundant ScrollArea wrapper in StressTestModal so the
  auto-scroll ref and the visible scrollbar are the same element.
  (claude Medium)
- Tighten the disconnect-abort test assertion to <= 2 calls (this
  fixture's full run is 4). (coderabbit)
…561)

Address coderabbit Major: the SSE endpoint hardcoded AnthropicProvider.
Now resolve provider via the documented chain (CODEFRAME_LLM_PROVIDER env
→ .codeframe/config.yaml → default anthropic) using the shared
get_provider factory, mirroring runtime.py. The ANTHROPIC_API_KEY gate
only applies when the resolved provider is anthropic, so OpenAI-compatible
and local (ollama/vllm) providers work. Unknown provider types surface as
in-stream error events.
@frankbria
Copy link
Copy Markdown
Owner Author

Review feedback addressed (802ecdd, d4d9267)

CodeRabbit

  • 🟠 Rate limiting on the SSE endpoint — added @rate_limit_standard(). ✅
  • 🟠 Provider hardcoded to Anthropic — now resolves via the documented chain (CODEFRAME_LLM_PROVIDER.codeframe/config.yaml → default anthropic) through the shared get_provider factory, mirroring runtime.py. The ANTHROPIC_API_KEY gate only applies when the resolved provider is anthropic, so OpenAI-compatible / local providers work. Added a test. ✅
  • 🟠 Move exported types to types/ — SSE event payload types moved to web-ui/src/types/prd.ts (re-exported from types/index.ts); the hook imports them. Hook-return types stay with the hook (as useTaskStream does for its options type). ✅
  • 🟡 Guard start() when workspacePath is null — now fails fast to the error state. Added a test. ✅
  • 🟡 Tighten disconnect assertion — changed to <= 2 (this fixture's full run is 4 calls). ✅

claude-review

  • High: rate limiting — same as above. ✅
  • Medium: redundant scroll containers — dropped the no-op ScrollArea wrapper; the auto-scroll ref and visible scrollbar are now the same element. ✅
  • Low: server-side disconnect detection — already addressed in 18ef219 via request.is_disconnected() in _stress_test_event_stream. ✅

Suites green: 909 frontend + 6 stress-test router + core tests; ruff + mypy clean.

@claude
Copy link
Copy Markdown

claude Bot commented May 27, 2026

FULL REVIEW BODY

@claude
Copy link
Copy Markdown

claude Bot commented May 27, 2026

Follow-up review -- current HEAD

Resolved since last review

Rate limiting (High -- now fixed)
@rate_limit_standard() is now present on the /api/v2/prd/stress-test endpoint in the current diff. Closes the DoS concern raised in both prior reviews.

Redundant scroll containers (Medium -- now fixed)
StressTestModal.tsx no longer uses ScrollArea. The log area is a plain div with overflow-y-auto max-h-[40vh] and the auto-scroll ref wired directly. Clean.


Still unresolved -- blocks merge

Multi-provider support bypassed (High)
_stress_test_event_stream still hardcodes AnthropicProvider with a direct ANTHROPIC_API_KEY check. Per CLAUDE.md, provider selection should follow the chain: CLI flag -> env var -> .codeframe/config.yaml -> default (anthropic). A workspace configured with CODEFRAME_LLM_PROVIDER=openai or a local Ollama model will hit the hard-coded error path even though the system already supports those providers. All other endpoints route through the shared adapter selection logic -- this one should do the same rather than being an Anthropic-only carveout.


No new issues

Everything else in the current diff is clean. The transport-level error handling, disconnect guard, runId retry pattern, and test coverage are all solid. Once the provider-factory issue is addressed, this is ready to merge.

Mark Phase 5.4 trigger+streaming shipped in CLAUDE.md, PRODUCT_ROADMAP,
and the CLI↔API mapping; note results rendering (#562) still pending.
@claude
Copy link
Copy Markdown

claude Bot commented May 27, 2026

test comment - please ignore

@claude
Copy link
Copy Markdown

claude Bot commented May 27, 2026

deleting test

@claude
Copy link
Copy Markdown

claude Bot commented May 27, 2026

Code Review: feat(prd): stress-test web UI

This is a clean, well-structured PR. The layering is sound: headless core generator → thin SSE router → React hook → modal, mirroring existing conventions throughout.

What is working well

  • Headless core: stress_test_prd_stream has no FastAPI imports; asyncio.to_thread is the right bridge for the blocking sync LLM calls.
  • SSE pattern: GET endpoint with text/event-stream + X-Accel-Buffering: no mirrors /tasks/{id}/stream. In-stream error events for missing PRD / API key is the right call for EventSource.
  • Disconnect guard: checking request.is_disconnected() before each yield prevents billing for abandoned streams, matching the existing event_stream_generator pattern.
  • Hook design: terminalRef preventing transport-error overwriting of a server error event is a subtle but important correctness fix. The runId trick for forcing a fresh EventSource URL on retry is well-chosen.
  • Test coverage: backend + router + hook + component tests are all present and meaningful, including the disconnect-abort path.

Issues

1. base_url_override precedence is inverted (medium)

In prd_v2.py _stress_test_event_stream, model_override correctly follows env var over config file. base_url_override has it backwards — config file overrides OPENAI_BASE_URL:

# current (wrong order):
base_url_override = (llm_cfg.base_url if llm_cfg else None) or os.getenv("OPENAI_BASE_URL")

# should be (env var wins):
base_url_override = os.getenv("OPENAI_BASE_URL") or (llm_cfg.base_url if llm_cfg else None)

This matches the documented priority: env var > .codeframe/config.yaml > default.

2. Async tests may be missing @pytest.mark.asyncio (low)

TestStressTestPrdStream and TestStressTestDisconnect use async def methods without @pytest.mark.asyncio. These only run if pytest-asyncio is in auto mode. Worth confirming pyproject.toml has asyncio_mode = "auto" — otherwise these tests silently pass as no-ops.

3. No SSE heartbeat for long-running decompositions (low)

A deep decomposition (max_depth=10, large PRD) could take several minutes. Most reverse proxies have a default idle-timeout of 60s and will drop a connection with no bytes. Emitting a periodic SSE comment heartbeat would prevent premature disconnects in hosted deployments. Not a blocker, but worth tracking.

4. provider parameter is untyped in core (cosmetic)

async def stress_test_prd_stream(prd_content: str, provider, ...) — worth aligning with BaseLLMProvider if the rest of the file already uses it.

Minor observations (no action needed)

  • StressTestModal test: the mock for "calls start() when opened" sets status: 'streaming' as the initial value — the test is correct but status: 'idle' would be more semantically accurate.
  • The run={runId} query param is a cache-buster that the backend ignores. A brief inline comment explaining this intent would save future readers a second look.

Summary

The feature is production-ready as written. The one meaningful correctness issue is the base_url_override precedence inversion — worth a one-line fix before merge. The async test marker question is a fast confirm-or-add. Everything else is cosmetic.

@frankbria frankbria merged commit 79fa686 into main May 27, 2026
11 checks passed
@frankbria frankbria deleted the feat/prd-stress-test-web-ui-561 branch May 27, 2026 01:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Phase 5.4] PRD stress-test web UI: trigger and streaming progress

1 participant