Conversation
Move UI to Fresh, add LLM telemetry previews, improve generation parsing, and update tests/config.
Note: CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Consolidates the server into an API-first entrypoint, adds a telemetry subsystem, threads LLM timeouts and telemetry through AI calls, and introduces a Fresh/Preact UI.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Browser as Client (Browser)
    participant Main as Server (src/server/main.ts)
    participant Router as Route classifier (isApiRoute)
    participant API as createApiHandler
    participant Telemetry as src/telemetry.ts
    participant LLM as LLM Service
    Browser->>Main: HTTP request
    Main->>Router: classify route (API vs UI)
    alt API route
        Router->>API: forward request
        API->>Telemetry: recordHttpRequest(start)
        API->>Telemetry: recordAiStart (if AI call)
        API->>LLM: call LLM with AbortSignal.timeout
        LLM-->>API: response / error
        API->>Telemetry: recordAiResponse
        API-->>Main: Response
    else UI route
        Router->>Main: serve Fresh UI snapshot
        Main-->>Browser: HTML/JS
    end
    Main->>Telemetry: recordHttpRequest(end with status,duration)
```
```mermaid
sequenceDiagram
    participant Caller as Module (generator/judge/meta-evo)
    participant Wrapper as withAiTelemetry
    participant Telemetry as src/telemetry.ts
    participant LLM as LLM API
    Caller->>Wrapper: withAiTelemetry({name,model}, fn)
    Wrapper->>Telemetry: recordAiStart(name)
    Wrapper->>LLM: invoke fn() with AbortSignal.timeout(LLM_TIMEOUT_MS)
    alt success
        LLM-->>Wrapper: payload + tokenUsage
        Wrapper->>Telemetry: recordAiResponse(success=true, text, tokenUsage)
        Wrapper-->>Caller: result
    else error/timeout
        LLM-->>Wrapper: error
        Wrapper->>Telemetry: recordAiResponse(success=false, error)
        Wrapper-->>Caller: throw/error result
    end
```
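To make the diagram above concrete, here is a hedged, self-contained sketch of a `withAiTelemetry` wrapper. The real exports of src/telemetry.ts are not shown in this PR view, so the recorder functions (stubbed as in-memory arrays here) and the 120s default timeout are assumptions:

```typescript
interface AiCallMeta {
  name: string;
  model: string;
}

// In-memory stand-ins for the assumed src/telemetry.ts recorders.
const events: string[] = [];

function recordAiStart(name: string): void {
  events.push(`start:${name}`);
}

function recordAiResponse(name: string, success: boolean): void {
  events.push(`end:${name}:${success}`);
}

// Records start/end telemetry around an LLM call and threads an
// AbortSignal so a hung call fails after timeoutMs.
async function withAiTelemetry<T>(
  meta: AiCallMeta,
  fn: (signal: AbortSignal) => Promise<T>,
  timeoutMs = 120_000, // assumed LLM_TIMEOUT_MS default
): Promise<T> {
  recordAiStart(meta.name);
  try {
    const result = await fn(AbortSignal.timeout(timeoutMs));
    recordAiResponse(meta.name, true);
    return result;
  } catch (err) {
    recordAiResponse(meta.name, false);
    throw err;
  }
}
```

The error path rethrows after recording, matching the diagram's "throw/error result" branch.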
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Pull request overview
This PR removes the Vite-based UI deployment integration and migrates to a Fresh framework-based UI architecture. The changes eliminate the React/Vite frontend components and replace them with a Deno-native Fresh setup using Preact and Twind for styling.
Changes:
- Removed entire Vite/React UI implementation including all playground, evaluation, evolution, and methodology components
- Added Fresh framework-based UI with server-side rendering capabilities
- Introduced telemetry tracking system for HTTP and AI operations
- Updated test imports to reflect new source file locations
- Added new UI components using Preact and Twind CSS
Reviewed changes
Copilot reviewed 137 out of 152 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| ui/* (deleted) | Removed entire Vite-based React UI including all components, configuration, and build files |
| src/ui/* (added) | Added Fresh framework-based UI with Preact components, Twind styling, and SSR support |
| src/telemetry.ts (added) | Added comprehensive telemetry system for tracking HTTP requests and AI operations |
| tests/* | Updated import paths to reflect reorganized source structure |
| src/schema.ts | Updated zod import to use standard import instead of npm: specifier |
```ts
export function textSimilarity(
  textA: string,
  textB: string,
  dim = 512,
): number {
```

The function signature formatting change is inconsistent with the rest of the file. The parameters were split across multiple lines, but other similar functions in the file (like `hashVector` and `cosine`) keep their parameters on a single line or use different formatting. Consider maintaining consistent formatting throughout the file.
```diff
-export function textSimilarity(
-  textA: string,
-  textB: string,
-  dim = 512,
-): number {
+export function textSimilarity(textA: string, textB: string, dim = 512): number {
```
Deno fmt enforces the multiline signature here due to line length, so keeping formatter output for consistency with repo formatting.
```diff
@@ -1,4 +1,4 @@
-import { z } from "npm:zod@4.3.5";
+import { z } from "zod";
```
Changed from npm: specifier to bare import. Ensure deno.json has the correct import map entry for 'zod' to resolve to 'npm:zod@4.3.5' or the intended version to maintain dependency version consistency.
| import { z } from "zod"; | |
| import { z } from "npm:zod@4.3.5"; |
deno.json maps zod to npm:zod@4.3.5, so the bare import stays pinned to the intended version.
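For reference, a bare `zod` import resolves through the import map. A minimal deno.json sketch of the mapping the reply describes (other entries omitted):

```json
{
  "imports": {
    "zod": "npm:zod@4.3.5"
  }
}
```

With this entry in place, the bare `import { z } from "zod"` stays pinned to the same version the old `npm:` specifier used.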
Summary of Changes

Hello @venikman, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request modernizes the project's architecture by transitioning to the Deno Fresh framework for both UI and API serving. This change aims to simplify the development and deployment process, enhance performance through server-side rendering, and provide better observability into system operations with a new telemetry module. The refactoring consolidates server-side logic and updates build tooling to align with the new unified Deno environment.
Code Review
This is an excellent and substantial refactoring. The move from a separate Vite UI to an integrated Fresh application greatly simplifies the architecture, development workflow, and deployment process. The introduction of a dedicated telemetry system and Playwright for E2E tests are fantastic additions that will significantly improve observability and reliability. The code organization is also much cleaner with the new src/server and top-level tests directories.
I've left a few comments on specific areas for improvement, including a critical one about re-introducing production safety checks that were present in the previous version. Once those are addressed, this will be a massive step forward for the project.
| import { fromFileUrl } from "@std/path"; | ||
| import { app } from "../src/ui/app.ts"; | ||
| import { env } from "../src/config.ts"; | ||
|
|
||
| // Create scorer instance (reused across requests) | ||
| const scorer = createStoryDecompositionScorer(); | ||
|
|
||
| // ───────────────────────────────────────────────── | ||
| // Environment Detection & Validation | ||
| // ───────────────────────────────────────────────── | ||
| const isDeployed = !!Deno.env.get("DENO_DEPLOYMENT_ID"); | ||
|
|
||
| // OpenAI-compatible API (OpenRouter, LM Studio, etc.) | ||
| // Support both LLM_BASE_URL and legacy LLM_API_BASE_URL | ||
| const LLM_BASE_URL = | ||
| Deno.env.get("LLM_BASE_URL") ?? | ||
| Deno.env.get("LLM_API_BASE_URL") ?? | ||
| "http://localhost:1234/v1"; | ||
| const LLM_API_KEY = Deno.env.get("LLM_API_KEY") ?? ""; | ||
| const LLM_MODEL = Deno.env.get("LLM_MODEL") ?? "openai/gpt-oss-120b"; | ||
|
|
||
| // Production validation - fail fast if config is missing | ||
| if (isDeployed) { | ||
| if (!Deno.env.get("LLM_BASE_URL") && !Deno.env.get("LLM_API_BASE_URL")) { | ||
| throw new Error( | ||
| "LLM_BASE_URL (or LLM_API_BASE_URL) is required in production. " + | ||
| "Set it in Deno Deploy → Settings → Environment Variables. " + | ||
| "Example: https://openrouter.ai/api/v1", | ||
| ); | ||
| } | ||
| if (LLM_BASE_URL.includes("localhost")) { | ||
| throw new Error( | ||
| "LLM_BASE_URL cannot be localhost in production. " + | ||
| "Configure a real API endpoint like OpenRouter or OpenAI.", | ||
| ); | ||
| } | ||
| if (!LLM_API_KEY) { | ||
| throw new Error( | ||
| "LLM_API_KEY is required in production. " + | ||
| "Set it in Deno Deploy → Settings → Environment Variables.", | ||
| ); | ||
| } | ||
| } | ||
| import { createApiHandler, type ApiConfig } from "../src/server/handler.ts"; | ||
| import { | ||
| normalizePath, | ||
| recordHttpRequest, | ||
| startTelemetryReporter, | ||
| } from "../src/telemetry.ts"; |
This refactoring significantly simplifies the deployment entrypoint, which is great. However, it seems the production safety checks from the previous version of this file have been removed. The old deploy/main.ts included crucial fail-fast validation to ensure that LLM_BASE_URL was set to a non-localhost value and LLM_API_KEY was present in a production environment (detected via DENO_DEPLOYMENT_ID).
Without these checks, a misconfigured deployment could start successfully but then fail at runtime when trying to contact a localhost LLM. It's critical to reintroduce these startup validations to prevent production failures.
import { fromFileUrl } from "@std/path";
import { app } from "../src/ui/app.ts";
import { env } from "../src/config.ts";
import { createApiHandler, type ApiConfig } from "../src/server/handler.ts";
import {
normalizePath,
recordHttpRequest,
startTelemetryReporter,
} from "../src/telemetry.ts";
// Production safety checks
const isDeployed = !!Deno.env.get("DENO_DEPLOYMENT_ID");
if (isDeployed) {
if (!env.LMSTUDIO_BASE_URL || env.LMSTUDIO_BASE_URL.includes("localhost")) {
throw new Error(
"LMSTUDIO_BASE_URL must be set to a non-localhost value in production.",
);
}
if (!env.LMSTUDIO_API_KEY || env.LMSTUDIO_API_KEY === "lm-studio") {
throw new Error(
"LMSTUDIO_API_KEY must be set in production.",
);
}
}There was a problem hiding this comment.
Added production checks for LMSTUDIO_BASE_URL (non-localhost) and LMSTUDIO_API_KEY, and removed key prefix logging.
deno.json (outdated)
| "test:e2e:install": "deno run -A --node-modules-dir npm:@playwright/test@1.57.0 install", | ||
| "test-server": "deno run -A scripts/test-server.ts", | ||
| "typecheck": "deno check deploy/main.ts src/**/*.ts", | ||
| "typecheck": "deno check deploy/main.ts src/server/main.ts src/cli/optimize.ts src/cli/generate.ts src/cli/eval-dist.ts", |
The typecheck task has been updated, but its scope seems to have been significantly reduced. The previous version checked src/**/*.ts, ensuring all TypeScript files in the source directory were type-checked. The new version only checks a few specific entrypoint files. This could allow type errors in other modules to go unnoticed. I recommend restoring the broader scope to maintain type safety across the entire project.
| "typecheck": "deno check deploy/main.ts src/server/main.ts src/cli/optimize.ts src/cli/generate.ts src/cli/eval-dist.ts", | |
| "typecheck": "deno check deploy/main.ts src/**/*.ts", |
Updated typecheck task to check deploy/main.ts and src/**/*.ts.
```ts
export function executePairMiner(
  input: PairMinerInput,
  ctx: ToolContext,
): Promise<ToolResult<PairMinerOutput>> {
```
This function was changed from async to a synchronous function that returns a resolved promise. While functionally correct, this is inconsistent with other tool functions like executePatcher and executeEvaluator which are async. For better code consistency and readability across the tools, I'd suggest making this function async as well and removing the Promise.resolve() wrappers on the return statements.
```diff
-export function executePairMiner(
+export async function executePairMiner(
   input: PairMinerInput,
   ctx: ToolContext,
 ): Promise<ToolResult<PairMinerOutput>> {
```
Keeping executePairMiner synchronous since it has no awaits and Deno lint require-await would fail; Promise.resolve keeps the ToolResult promise shape.
Actionable comments posted: 20
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
evaluation-methodology.md (3)
361-474: Remove internal conversation history from documentation. This section contains raw conversation logs, internal AI context tracking, and verbatim user messages with numerous typos. This content is inappropriate for formal methodology documentation and should be removed entirely.
The section includes:
- Internal file paths and pending tasks
- Verbatim user messages with spelling errors
- AI context tracking ("Current Work", "Pending Tasks", "Optional Next Step")
- Implementation details that belong in internal notes, not user-facing docs
📝 Proposed fix
Remove this entire section (lines 361-474). If any information from this section is valuable for the documentation, extract and rewrite it formally in the appropriate section above.
```diff
-I want you to usi FPF to come up with evaluator for you proeject. So we would
-have custon uniqui, basedon SOTA mete-mate evaluateors. before we wil mae a
-choce with which new method to adotp
-
-1. Primary Request and Intent:
-  ...
-  [entire section through line 474]
```
552-552: Complete the truncated sentence. The sentence ends abruptly with "for every as" - this appears to be cut off mid-word or mid-phrase.
✍️ Suggested completion
Based on context, this likely should read:
```diff
-- **Auditable evidence** (SCR) for every as
+- **Auditable evidence** (SCR) for every assurance claim
```
554-567: Restructure or remove misplaced implementation tasks. This section contains two structural issues:
Lines 554-565: The "Medium Priority" section lists implementation tasks (endpoints, buttons, storage) that don't belong in a methodology document. These appear to be development TODOs that should be tracked elsewhere (e.g., GitHub issues, project management tools).
Line 567: The orphaned text fragment "rance claim" is clearly broken and appears to be a continuation error from line 552's incomplete sentence.
🔧 Proposed fix
Option 1 (Recommended): Remove the entire section and track implementation tasks in a separate location:
-### Medium Priority - -- Wire up scorer to evaluate generated stories -... -- Download buttons in Stories tab - -rance claimOption 2: If these tasks must remain, move them to a separate "Implementation Roadmap" appendix with proper context.
For line 567, simply remove the orphaned fragment as it's addressed by fixing line 552.
src/cli/optimize.ts (1)
186-201: Fire-and-forget async calls may silently lose important errors. Both `saveChampion` (line 190) and `writeFile` (line 196) are called without `await` inside the callback, using only `.catch()` for error handling. While errors are logged, failure to save a promoted champion could lead to data loss without the optimization loop being aware.

Consider tracking these promises and checking their status, or at minimum ensuring critical operations like saving promoted champions are more robustly handled.
tests/deployment.test.ts (1)
43-52: Test fails in CI: `.env` file doesn't exist in pipeline environments. The pipeline failure shows this test fails because CI environments typically don't have a `.env` file. Consider either:

- Skipping this test in CI environments
- Using `Deno.test` with `ignore` based on environment detection
- Changing the assertion to be informational rather than failing
🛠️ Suggested fix: Skip test in CI

```diff
 Deno.test(
-  "Environment - should have .env file for local development",
+  {
+    name: "Environment - should have .env file for local development",
+    ignore: Deno.env.get("CI") === "true",
+  },
   async () => {
     const envPath = join(ROOT_DIR, ".env");
     assert(
       await fileExists(envPath),
       ".env should exist for local development",
     );
   },
 );
```

src/orchestrator/optimization-loop.ts (1)
356-382: Add `maxTokens` to the `evaluateChampion` call for consistency. The `evaluateChampion` method (lines 365-372) does not pass `maxTokens` to `executeEvaluator`, while `runIteration` (line 174) does pass it. Since `maxTokens` is documented as "Override max tokens per generation" and is part of the config, the initial champion evaluation should also respect this setting to ensure consistent token constraints across all evaluations.

```diff
 const result = await executeEvaluator(
   {
     promptText: composedPrompt,
     epics: this.epics,
     replicates: this.config.replicates,
     seedBase: env.EVAL_SEED_BASE,
+    maxTokens: this.config.maxTokens,
     concurrency: this.config.concurrency,
     onProgress: this.config.onProgress,
   },
   ctx,
 );
```
🤖 Fix all issues with AI agents
In `@deno.json`:
- Line 43: The "test:unit" npm script's glob is being expanded by the shell so
Deno never gets to apply --ignore; update the "test:unit" script (the
"test:unit" key in deno.json) to pass the glob and ignore patterns as single
arguments to Deno by quoting or escaping them so the shell won't expand (e.g.,
wrap tests/**/*.test.ts and tests/api in quotes/escaped form), letting Deno
handle file discovery and the --ignore filter.
In `@playwright.config.ts`:
- Line 18: The Playwright config uses the Node-only global process in the
reuseExistingServer setting; replace process.env.CI with Deno.env.get("CI") and
coerce it to a boolean so reuseExistingServer becomes
!Boolean(Deno.env.get("CI")) (update the reuseExistingServer assignment in the
config object).
In `@src/config.ts`:
- Around line 35-36: z.coerce.boolean() will coerce string "false" to true;
replace both TELEMETRY_LOG_REQUESTS and TELEMETRY_INCLUDE_LLM_OUTPUT uses with a
deterministic string-to-boolean helper (e.g., envBoolean) that parses "true"/"1"
as true and everything else as false and returns a boolean default; add a helper
function (envBoolean) and use TELEMETRY_LOG_REQUESTS: envBoolean(true) and
TELEMETRY_INCLUDE_LLM_OUTPUT: envBoolean(false) so env vars like "false" behave
as expected.
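A minimal sketch of such a helper; the name `envBoolean` comes from the comment above, and the accepted truthy spellings are an assumption:

```typescript
// Deterministic env-string parser: only "true" and "1" are truthy, so an
// env var set to "false" no longer coerces to true the way
// z.coerce.boolean() would for any non-empty string.
function envBoolean(raw: string | undefined, fallback: boolean): boolean {
  if (raw === undefined || raw === "") return fallback;
  return raw === "true" || raw === "1";
}
```

In the schema, this could back `TELEMETRY_LOG_REQUESTS` (default true) and `TELEMETRY_INCLUDE_LLM_OUTPUT` (default false) via a preprocess step.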
In `@src/judge/promptagent-fpf-judge.ts`:
- Around line 368-379: The structuredOutput config passed to this.agent.generate
(invoked inside withAiTelemetry for "fpf-judge") currently sets only schema:
LLMSubscoresSchema; update that structuredOutput object to include
jsonPromptInjection: true and errorStrategy: 'strict' alongside the existing
schema, keeping the abortSignal and the rest of the call intact so the call
signature for this.agent.generate and the surrounding withAiTelemetry block are
unchanged.
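The requested shape, sketched as a fragment of the existing options object (the surrounding `this.agent.generate` call, its other arguments, and the `abortSignal` are unchanged and not reproduced here):

```ts
structuredOutput: {
  schema: LLMSubscoresSchema,
  jsonPromptInjection: true,
  errorStrategy: "strict",
},
```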
In `@src/server/handler.ts`:
- Around line 84-85: The in-memory Maps evalTasks, tournamentTasks, and
optimizationTasks grow without bounds; add a TTL-based cleanup: introduce a
TASK_TTL_MS constant and a cleanupOldTasks generic function that iterates a
given Map (e.g., evalTasks, tournamentTasks, optimizationTasks) and deletes
entries whose completedAt timestamp is older than TTL, then schedule periodic
invocation (setInterval) of cleanupOldTasks for each store; ensure completed
tasks set completedAt when finishing so cleanup can remove them.
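A minimal sketch of the TTL sweep described above; the `completedAt` representation (epoch millis) and the one-hour TTL are assumptions:

```typescript
const TASK_TTL_MS = 60 * 60 * 1000; // assumed 1h retention

interface TaskEntry {
  completedAt?: number; // epoch millis; unset while the task is running
}

// Deletes completed entries older than the TTL; deleting from a Map
// while iterating it is safe in JavaScript.
function cleanupOldTasks<T extends TaskEntry>(
  store: Map<string, T>,
  now: number = Date.now(),
): number {
  let removed = 0;
  for (const [id, task] of store) {
    if (task.completedAt !== undefined && now - task.completedAt > TASK_TTL_MS) {
      store.delete(id);
      removed++;
    }
  }
  return removed;
}
```

Each of the three task maps could be swept on a `setInterval`, leaving running tasks (no `completedAt`) untouched.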
- Around line 781-788: The fetch call that assigns to upstream
(fetch(`${LLM_BASE_URL.replace(/\/$/, "")}/chat/completions`, { method: "POST",
headers, body: JSON.stringify(requestBody) })) has no timeout and can hang; wrap
this fetch with an AbortController, start a timer that calls controller.abort()
after the standard request timeout used elsewhere, pass controller.signal to
fetch, clear the timer on success, and handle the abort/timeout error path to
return a controlled timeout response; reference the upstream variable,
LLM_BASE_URL, requestBody and headers when making these changes.
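A sketch of the abort wrapper; the injectable `fetchFn` parameter (used here so the pattern can be exercised without a network) is an illustration choice, not the repo's API:

```typescript
type FetchLike = (url: string, init?: RequestInit) => Promise<Response>;

// Wraps a fetch-style call with an AbortController so a hung upstream
// request fails after timeoutMs instead of blocking forever.
async function fetchWithTimeout(
  url: string,
  init: RequestInit,
  timeoutMs: number,
  fetchFn: FetchLike = fetch,
): Promise<Response> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetchFn(url, { ...init, signal: controller.signal });
  } finally {
    clearTimeout(timer); // always clear the timer, success or failure
  }
}
```

The handler would call this for the `/chat/completions` proxy and map the abort error to a controlled timeout response.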
- Around line 359-369: The /debug/config handler currently exposes a partial API
key via LLM_API_KEY.slice(...) (keyPrefix) in the jsonResponse, which is a
security risk; remove the keyPrefix field (or return a boolean hasKey only) from
the jsonResponse inside the url.pathname === "/debug/config" branch
(function/handler logic using jsonResponse and LLM_API_KEY, LLM_BASE_URL,
LLM_MODEL) or add an authentication check before returning debug info so only
authorized requests see any sensitive data; update the response to include only
non-sensitive fields (e.g., baseUrl, model, hasKey) and ensure no slices or
substrings of LLM_API_KEY are returned.
In `@src/ui/components/ai-elements/message.tsx`:
- Line 1: The import uses an inline "npm:" specifier which violates lint rules;
change the import in src/ui/components/ai-elements/message.tsx to import from
the configured package name (e.g., import type { ComponentChildren,
ComponentProps } from "preact";) and ensure "preact" is mapped in your import
map or deno.json so the runtime/resolver resolves it properly; update any
build/config (import_map.json or deno.json) if necessary to point "preact" to
the npm package or local alias.
In `@src/ui/components/ai-elements/prompt-input.tsx`:
- Line 1: Replace the inline npm specifier in the import for ComponentProps:
instead of importing from "npm:preact@10.28.2" update the import to use the
project import-map alias (for example "react" which maps to "preact/compat", or
add a "preact" mapping in deno.json and import from "preact"); change the import
target used in the file (the ComponentProps import) to the mapped alias so the
linter no longer sees an inline "npm:" specifier.
In `@src/ui/islands/Studio.tsx`:
- Line 472: The code uses window.setInterval to start polling (assignment to
intervalId = window.setInterval(poll, 2500)), which breaks compatibility with
Deno and Web Workers; change the call to use globalThis.setInterval or bare
setInterval instead (e.g., replace window.setInterval with
globalThis.setInterval or setInterval) so intervalId is set in a
platform-agnostic way while keeping the same poll callback and delay.
- Line 365: The code uses window.setInterval when creating intervalId which
breaks Deno/Web Worker compatibility because window may be undefined; change the
call in the Studio component (where intervalId is assigned from
window.setInterval(poll, 5000)) to use globalThis.setInterval or the bare
setInterval instead so it works in all JS environments and keep the same poll
callback and interval variable (intervalId).
- Around line 1-2: The imports in Studio.tsx use inline npm: specifiers
(useEffect/useMemo/useState from "npm:preact@10.28.2/hooks" and
ComponentChildren from "npm:preact@10.28.2"), which static analysis disallows;
add import map entries mapping "preact" to "npm:preact@10.28.2" and
"preact/hooks" to "npm:preact@10.28.2/hooks" in your import_map.json or
deno.json, then update the imports in Studio.tsx to import from "preact" and
"preact/hooks" (e.g., import { useEffect, useMemo, useState } from
"preact/hooks"; import type { ComponentChildren } from "preact";) so the runtime
resolves via the import map.
In `@src/ui/polyfills.ts`:
- Line 1: The import uses an inline npm: specifier which violates policy; add an
import map entry mapping "linkedom" to "npm:linkedom@0.18.5" in your
deno.json/import_map.json, then update the import statement in
src/ui/polyfills.ts to import { DOMParser as LinkedomDOMParser, parseHTML } from
"linkedom" so the module is resolved via the import map rather than an inline
specifier.
In `@src/ui/routes/_app.tsx`:
- Around line 1-2: Imports for Fresh are using the wrong module specifiers;
update the two import lines so the Head symbol is imported from
"$fresh/runtime.ts" and the PageProps type is imported from "$fresh/server.ts"
instead of "@fresh/core/runtime" and "@fresh/core/render" respectively, ensuring
the named symbols (Head and PageProps) are preserved.
In `@src/ui/twind_middleware.ts`:
- Line 8: The module-level virtualSheet() stored in the const sheet causes race
conditions across concurrent requests; move sheet creation into the
twindMiddleware factory so each middleware invocation gets its own isolated
sheet: inside the twindMiddleware (the factory function that returns the
middleware), call virtualSheet() to create a local sheet and pass that instance
into setup(options, sheet) and the middleware logic instead of the module-scoped
sheet, ensuring sheet.reset()/collection operate on a per-request/middleware
instance rather than a shared singleton.
In `@src/ui/twind_shared.ts`:
- Around line 15-22: The module augmentation currently declares module
"npm:preact@10.28.2"; update that module string to match the import-map alias
you now use for preact (replace "npm:preact@10.28.2" with the project's preact
alias, e.g. "preact" or whatever alias is used in imports) so the JSX
DOMAttributes augmentation (interface DOMAttributes in this file) applies to the
real imported module; keep the existing interface members (class, className)
intact.
- Around line 1-3: Replace inline `npm:preact@10.28.2` specifiers with the
import map alias (e.g., "preact") in the import statements that bring in JSX,
VNode and options (the lines importing JSX, VNode and preactOptions); update
imports to use "preact" instead of "npm:preact@..." and ensure your import map
(deno.json or import_map.json) contains the mapping "preact":
"npm:preact@10.28.2" so static analysis no longer flags the inline specifiers.
🧹 Nitpick comments (34)
src/ui/dev.ts (1)
9-10: Consider using `"development"` mode for the dev script. This file is named `dev.ts` and uses `--watch` flags for hot reloading, suggesting it's intended for development. However, `mode` is set to `"production"`. If this is intentional (e.g., to test production builds locally), consider adding a comment to clarify. Otherwise, switching to `"development"` mode would enable typical dev features like better error messages and faster rebuilds.

♻️ Suggested change if development mode is intended

```diff
-const mode = "production";
+const mode = "development";
 app.config.mode = mode;
```

src/ui/static/styles.css (1)
1-1: Consider self-hosting fonts or adding `font-display: swap`. The external Google Fonts request can cause FOIT (Flash of Invisible Text) and may raise GDPR concerns for EU users due to the implicit IP transfer. The `display=swap` parameter mitigates FOIT, but consider self-hosting the fonts if privacy compliance is a concern.

scripts/dev.ts (2)
8-8: Consider: `src/ui` is a subdirectory of `src`. Watching `src` should already include `src/ui`. Unless Deno's `--watch` requires explicit subdirectory listing, this may be redundant.

Suggested simplification

```diff
-const watchTargets = ["src", "src/ui"].join(",");
+const watchTargets = "src";
```
15-17: Simplify: Environment spreading is redundant. Spreading `Deno.env.toObject()` into a new object has no effect here. You can either pass it directly or omit the `env` option entirely, since Deno inherits the current environment by default.

Suggested simplification

```diff
 const process = new Deno.Command("deno", {
   args: ["run", "-A", `--watch=${watchTargets}`, "src/server/main.ts"],
   cwd,
-  env: {
-    ...Deno.env.toObject(),
-  },
   stdout: "inherit",
   stderr: "inherit",
 }).spawn();
```

src/orchestrator/state/kv-store.ts (1)
302-316: Consider batching atomic deletes for efficiency. The `cleanupOldTasks` function performs individual atomic commits in a loop. For large cleanup operations, this could be slow. Consider batching deletes (Deno KV supports up to 10 operations per atomic transaction).

♻️ Optional: Batch deletes for better performance

```diff
 export async function cleanupOldTasks(olderThanMs: number): Promise<number> {
   const kv = await getKv();
   const cutoff = Date.now() - olderThanMs;
   let deleted = 0;
+  const toDelete: TaskRecord[] = [];
   for await (const entry of kv.list<TaskRecord>({ prefix: ["tasks"] })) {
     const task = entry.value;
     if (
       task.completedAt &&
       new Date(task.completedAt).getTime() < cutoff
     ) {
-      await kv.atomic()
-        .delete(["tasks", task.id])
-        .delete(["tasks:by-status", task.status, task.id])
-        .commit();
-      deleted++;
+      toDelete.push(task);
     }
   }
+  // Batch deletes (max 5 tasks per atomic = 10 operations)
+  for (let i = 0; i < toDelete.length; i += 5) {
+    const batch = toDelete.slice(i, i + 5);
+    const atomic = kv.atomic();
+    for (const task of batch) {
+      atomic.delete(["tasks", task.id]);
+      atomic.delete(["tasks:by-status", task.status, task.id]);
+    }
+    await atomic.commit();
+    deleted += batch.length;
+  }
   return deleted;
 }
```

src/ui/twind.config.ts (1)
4-12: Consider adding type annotation for the config export. The previous version used a typed `Config` constant. Removing the type annotation loses compile-time validation of the configuration shape. If Fresh requires the untyped export, this is acceptable; otherwise, consider restoring type safety.

♻️ Optional: Restore type safety

```diff
 import presetAutoprefix from "@twind/preset-autoprefix";
 import presetTailwind from "@twind/preset-tailwind";
+import type { Config } from "@twind/core";
-export default {
+export default {
   selfURL: import.meta.url,
   presets: [presetTailwind(), presetAutoprefix()],
   // ...
-};
+} satisfies Config;
```

deno.json (1)
import presetAutoprefix from "@twind/preset-autoprefix"; import presetTailwind from "@twind/preset-tailwind"; +import type { Config } from "@twind/core"; -export default { +export default { selfURL: import.meta.url, presets: [presetTailwind(), presetAutoprefix()], // ... -}; +} satisfies Config;deno.json (1)
73-81: Redundant include entry:src/uiis already covered bysrc.Line 78 includes
src/ui, butsrcis already included at line 75. This is redundant sincesrcencompasses all subdirectories includingsrc/ui.♻️ Suggested fix
"include": [ "deploy", "src", "data", "prompts", - "src/ui", "deno.json", "deno.lock" ],src/ui/components/ai-elements/message.tsx (2)
39-48:modeprop is defined but never used.
MessageResponsePropsextendsStreamdownPropswhich includes amodeproperty, but it's not utilized in the component. Either implement the intended behavior or remove it to avoid confusion.
41-47: Inconsistent prop naming:classNamevsclass.
MessageandMessageContentuseclass(Preact convention), butMessageResponseusesclassName. This creates an inconsistent API for consumers.♻️ Proposed fix for consistency
-export const MessageResponse = ({ className, ...props }: MessageResponseProps) => { +export const MessageResponse = ({ class: className, ...props }: MessageResponseProps) => { const mergedClassName = cn( "size-full whitespace-pre-wrap [&>*:first-child]:mt-0 [&>*:last-child]:mb-0", className, ); - return <div class={mergedClassName}>{props.children}</div>; + return <div class={mergedClassName} {...props} />; };src/ui/twind_middleware.ts (2)
35-35: Fragile type assertions on Twind snapshot internals.Casting
snapshot[1]andsnapshot[3]without validation couples this code to Twind's internal snapshot structure. A library update could silently break this.Consider adding runtime validation:
const precedences = snapshot[1]; if (!Array.isArray(precedences)) { throw new Error("Unexpected Twind snapshot format"); }Also applies to: 43-43
48-49: Consider escaping JSON in script tag to prevent injection.While
mappingscontains Twind-generated values, it's good practice to escape</script>sequences in inline JSON to prevent accidental breakout.🛡️ Defensive escaping
const stateTag = - `<script id="${STATE_ELEMENT_ID}" type="application/json">${JSON.stringify(mappings)}</script>`; + `<script id="${STATE_ELEMENT_ID}" type="application/json">${JSON.stringify(mappings).replace(/</g, "\\u003c")}</script>`;tests/config.test.ts (2)
166-177: Boolean test doesn't cover the string coercion edge case.

This test passes actual boolean values (`true`/`false`), but environment variables are strings. Consider testing with string values `"true"`, `"false"`, `"0"`, `"1"` to verify expected behavior.

🧪 Additional test case

```ts
Deno.test("Schema - should handle boolean string values correctly", () => {
  const testEnv = {
    ...Deno.env.toObject(),
    POLL_ENABLED: "true",
    NQD_ENABLED: "false",
  };
  const parsed = EnvSchema.parse(testEnv);
  assertEquals(parsed.POLL_ENABLED, true);
  assertEquals(parsed.NQD_ENABLED, false, "String 'false' should parse as false");
});
```
201-215: Missing test coverage for new configuration fields.

The `EnvSchema` was extended with `LLM_TIMEOUT_MS` and telemetry settings (`TELEMETRY_REPORT_INTERVAL_MS`, `TELEMETRY_LOG_REQUESTS`, `TELEMETRY_INCLUDE_LLM_OUTPUT`, `TELEMETRY_LLM_PREVIEW_CHARS`), but no tests cover these additions.

🧪 Suggested test additions

```ts
Deno.test("Defaults - should have sensible telemetry defaults", () => {
  const parsed = EnvSchema.parse({});
  assertEquals(parsed.LLM_TIMEOUT_MS, 120_000);
  assertEquals(parsed.TELEMETRY_REPORT_INTERVAL_MS, 60_000);
  assertEquals(parsed.TELEMETRY_LOG_REQUESTS, true);
  assertEquals(parsed.TELEMETRY_INCLUDE_LLM_OUTPUT, false);
  assertEquals(parsed.TELEMETRY_LLM_PREVIEW_CHARS, 800);
});

Deno.test("Schema - should validate LLM_TIMEOUT_MS bounds", () => {
  assertThrows(() => EnvSchema.parse({ LLM_TIMEOUT_MS: "500" })); // below min
  assertThrows(() => EnvSchema.parse({ LLM_TIMEOUT_MS: "700000" })); // above max
});
```

src/orchestrator/tools/meta-patcher-tool.ts (3)
147-149: Consider logging or recording failed patch generations.

The catch block silently returns `null` without any logging. For debugging production issues, consider at least a debug-level log or telemetry event for failed patch generations.

🔧 Optional: Add minimal error logging

```diff
-  } catch {
+  } catch (err) {
+    // Optional: Log for debugging without breaking flow
+    console.debug("[meta-patcher] Patch generation failed:", err);
     return null;
   }
```
248-251: Potential floating-point precision concern in success rate calculation.

The exponential moving average calculation looks correct. However, consider clamping the result to `[0, 1]` to prevent any floating-point drift over many iterations.

🔧 Optional: Clamp success rate

```diff
 mutation.successRate = alpha * (improved ? 1 : 0) + (1 - alpha) * mutation.successRate;
+mutation.successRate = Math.max(0, Math.min(1, mutation.successRate));
 mutation.fitness = mutation.successRate;
```
319-323: Mutation array is mutated in-place.

The function modifies the `mutations` array in-place at line 322 (`mutations[idx] = evolvedMutation`). This is intentional based on the return type, but consider documenting this side effect in the JSDoc, or returning a new array for immutability if the caller doesn't expect mutation.

src/ui/types.ts (1)
82-95: Consider extracting the `aiResponses` entry type for reusability.

The inline object type in the `aiResponses` array is fairly complex. Consider extracting it as a named type for reuse and clearer documentation.

♻️ Optional: Extract inline type

```diff
+export type AiResponseEntry = {
+  key: string;
+  at: string;
+  durationMs?: number;
+  success: boolean;
+  error?: string;
+  preview?: string;
+  previewTruncated?: boolean;
+  tokenUsage?: {
+    prompt?: number;
+    completion?: number;
+    total?: number;
+  };
+};
+
 export type TelemetrySnapshot = {
   type: "telemetry";
   at: string;
   periodMs: number;
   uptimeMs: number;
   http: TelemetryEntry[];
   ai: TelemetryEntry[];
-  aiResponses: {
-    key: string;
-    at: string;
-    durationMs?: number;
-    success: boolean;
-    error?: string;
-    preview?: string;
-    previewTruncated?: boolean;
-    tokenUsage?: {
-      prompt?: number;
-      completion?: number;
-      total?: number;
-    };
-  }[];
+  aiResponses: AiResponseEntry[];
 };
```

deploy/main.ts (1)
56-76: Consider consolidating the duplicated `API_PREFIXES` array across files.

The `API_PREFIXES` array is currently synchronized with all routes in `handler.ts`, but it's duplicated in both `deploy/main.ts` and `src/server/main.ts`. To prevent future sync issues, consolidate this list into a single shared location (e.g., a constants file) that both entry points can import from.

src/ui/main.ts (1)
13-13: Consider making the port configurable.

Hardcoding port 8000 may be limiting for deployment scenarios. Consider using an environment variable with a fallback.

♻️ Suggested improvement

```diff
-await app.listen({ port: 8000 });
+const port = Number(Deno.env.get("PORT")) || 8000;
+await app.listen({ port });
```

src/ui/twind_client.ts (1)
43-51: Non-null assertion on `sheet` may throw if the style element has no sheet.

The `styleEl.sheet!` assertion assumes the CSSStyleSheet is always available. However, `sheet` can be `null` if the style element isn't connected to the document, or in certain edge cases. Additionally, `parseInt` on malformed rule comments could produce `NaN`, which would silently propagate.

Consider adding defensive checks:

Proposed defensive handling

```diff
-  const target = styleEl.sheet!;
+  const target = styleEl.sheet;
+  if (!target) {
+    console.warn("Twind: Unable to access stylesheet");
+    return;
+  }
   const ruleText = Array.from(target.cssRules).map((rule) => rule.cssText);
   for (const rule of ruleText) {
     const marker = rule.lastIndexOf("/*");
+    if (marker === -1) continue; // Skip rules without precedence marker
     const precedence = parseInt(rule.slice(marker + 2, -2), 36);
+    if (Number.isNaN(precedence)) continue;
     const rawRule = rule.slice(0, marker);
     rules.add(rawRule);
     precedences.push(precedence);
   }
```

playwright.config.ts (1)
20-25: Consider documenting the LMStudio environment variables.

The hardcoded environment variables appear to be local development defaults for LMStudio. Consider adding a comment explaining that these are test defaults, or document how to override them for different environments.
src/fpf/assurance-extractor.ts (1)
193-207: Consider removing redundant type assertions.

The type guards `isPoLLMetricInfo` and `isPromptAgentJudgeInfo` already narrow the type, making the explicit casts at lines 196 and 205 unnecessary.

♻️ Suggested simplification

```diff
 if (preprocessResult.pollResult?.info) {
   if (isPoLLMetricInfo(preprocessResult.pollResult.info)) {
-    return extractFromPoLL(
-      preprocessResult.pollResult.info as PoLLMetricInfo,
-    );
+    return extractFromPoLL(preprocessResult.pollResult.info);
   }
 }

 // Fall back to single-judge FPF
 if (preprocessResult.fpfJudgeResult?.info) {
   if (isPromptAgentJudgeInfo(preprocessResult.fpfJudgeResult.info)) {
     return extractFromSingleJudge(
-      preprocessResult.fpfJudgeResult.info as PromptAgentJudgeInfo,
+      preprocessResult.fpfJudgeResult.info,
     );
   }
 }
```

src/server/main.ts (1)
72-83: Consider extracting the duplicated telemetry logging.

The JSON logging block is duplicated between success and error paths. A small helper could reduce repetition.

♻️ Suggested extraction

```ts
const logHttpRequest = (method: string, route: string, status: number, durationMs: number) => {
  if (env.TELEMETRY_LOG_REQUESTS) {
    console.log(
      JSON.stringify({
        type: "http",
        at: new Date().toISOString(),
        method,
        route,
        status,
        durationMs: Math.round(durationMs),
      }),
    );
  }
};
```

Also applies to: 92-103
src/patchEngineer.ts (1)
162-165: Short patch filtering prevents trivial outputs.

Skipping patches shorter than 20 characters is reasonable to filter out empty or meaningless responses. However, this threshold is a magic number.

Consider extracting this to a named constant or configuration value for clarity and easier adjustment.

💡 Optional: Extract magic number to constant

```diff
+const MIN_PATCH_LENGTH = 20;
+
 export async function generatePatchCandidates(
   params: Omit<GeneratePatchParams, "temperature">,
   count: number = env.OPT_PATCH_CANDIDATES,
 ): Promise<string[]> {
   // ...
   // Skip empty or very short patches
-  if (patch.length >= 20) {
+  if (patch.length >= MIN_PATCH_LENGTH) {
     candidates.push(patch);
   }
```

AGENTS.md (1)
39-67: Add a language specifier to the directory tree code block.

The static analysis tool flags this code block as missing a language identifier. While directory trees don't have a specific language, using `text` or `plaintext` satisfies the linter and is a common convention.

💡 Suggested fix

````diff
-```
+```text
 src/
   cli/
````

src/orchestrator/tools/pair-miner-tool.ts (1)
39-42: Consider retaining the `async` keyword for clarity.

The function was changed from `async function` to a regular function returning `Promise.resolve()`. While functionally equivalent since the body has no `await`, using `async` provides clearer intent and better stack traces on errors.

The current approach works correctly but is unconventional. If the synchronous nature is intentional for a specific reason (e.g., avoiding microtask queue overhead), this is fine. Otherwise, consider reverting to `async` for consistency with typical TypeScript patterns.

♻️ Optional: Revert to async pattern

```diff
-export function executePairMiner(
+export async function executePairMiner(
   input: PairMinerInput,
   ctx: ToolContext,
 ): Promise<ToolResult<PairMinerOutput>> {
   const startTime = Date.now();
   try {
     // ... unchanged ...
-    return Promise.resolve(
-      successResult(
-        { pairs, formattedContext, tiered: !!input.tieredMining },
-        ctx,
-        startTime,
-      ),
-    );
+    return successResult(
+      { pairs, formattedContext, tiered: !!input.tieredMining },
+      ctx,
+      startTime,
+    );
   } catch (error) {
-    return Promise.resolve(
-      failureResult<PairMinerOutput>(error, ctx, startTime),
-    );
+    return failureResult<PairMinerOutput>(error, ctx, startTime);
   }
 }
```

Also applies to: 79-90
src/meta-evolution/evolution-engine.ts (1)
581-591: Consider retaining the `async` keyword for consistency.

Similar to `pair-miner-tool.ts`, this function changed from `async` to a regular function returning a Promise. While it works correctly (since it just delegates to `engine.evolve()`), the `async` keyword provides clearer intent.

src/orchestrator/root.ts (1)
160-187: `runOptimization` and `resumeOptimization` no longer use the `async` keyword.

Both methods changed from `async function` to regular functions returning Promises. They immediately return the result of `loopAgent.execute()`, which is itself a Promise.

This is functionally correct but inconsistent with the `async runPlayground` and `async runEvaluation` methods in the same class. Consider either:

- Making all methods consistently `async`, or
- Making all non-awaiting methods return Promises directly

The current mixed approach works but may confuse readers.
Also applies to: 192-222
src/orchestrator/nqd-tournament.ts (1)
59-70: Unused configuration option `includeIllumination`.

The `includeIllumination` field in `NQDTournamentConfig` is defined but never read or used in `runNQDTournament` or anywhere else in this file. Since `IlluminationTelemetry` was removed from imports, this field appears to be dead code.

Consider removing the unused field to avoid confusion:

♻️ Suggested cleanup

```diff
 export interface NQDTournamentConfig {
   /** Champion's objective score (baseline for use-value) */
   championObjective: number;
   /** Champion's prompt text (reference for novelty) */
   championPrompt?: string;
   /** All prompt texts in current portfolio */
   portfolioPrompts?: string[];
   /** Minimum constraint-fit threshold (default 1.0) */
   constraintFitThreshold?: number;
-  /** Whether to include illumination telemetry */
-  includeIllumination?: boolean;
 }
```

src/server/handler.ts (2)
842-863: Fire-and-forget async task lacks a top-level error boundary.

The background evaluation task uses `(async () => { ... })()` without a `.catch()` handler at the IIFE level. While errors are caught internally, any unhandled rejection from `evalPromptDistribution` failure paths could result in unhandled promise rejections.

The pattern is consistent but could be made more robust:

♻️ Defensive improvement

```diff
 // Start evaluation in background (don't await)
 (async () => {
   task.status = "running";
   try {
     // ... evaluation logic
   } catch (err) {
     task.status = "failed";
     task.error = err instanceof Error ? err.message : String(err);
     task.completedAt = new Date().toISOString();
   }
-})();
+})().catch((err) => {
+  console.error("Unhandled evaluation task error:", err);
+});
```
1118-1127: Non-null assertion on a potentially empty array.

Line 1122 uses `sorted[0]!` after sorting, but if `task.candidates` was somehow empty (edge case), this would throw. The check at line 1019 ensures epics exist, but not that candidates were added.

While unlikely given the flow, defensive coding would add a guard:

♻️ Defensive fix

```diff
 const sorted = [...task.candidates].sort(
   (a, b) => b.objective - a.objective,
 );
-const best = sorted[0]!;
+const best = sorted[0];
+if (!best) {
+  throw new Error("No candidates evaluated");
+}
 task.winner = {
   id: best.id,
```

src/telemetry.ts (3)
322-325: No mechanism to stop the telemetry reporter.

`startTelemetryReporter` stores `reporterId`, but there's no exported `stopTelemetryReporter` function. Once started, the interval runs indefinitely even if no longer needed, preventing clean shutdown.

♻️ Suggested addition

```diff
 export const startTelemetryReporter = (intervalMs = 60_000) => {
   if (reporterId !== null) return;
   reporterId = setInterval(() => flushTelemetry(), intervalMs);
 };
+
+export const stopTelemetryReporter = () => {
+  if (reporterId !== null) {
+    clearInterval(reporterId);
+    reporterId = null;
+  }
+};
```
109-132: Percentile calculation returns bucket upper bounds (approximation only).

The `percentile` function returns `BUCKETS_MS[index]` (the bucket's upper bound) rather than interpolating within the bucket. This is acceptable for monitoring but produces step-function percentiles.

This is a known limitation of histogram-based percentile estimation. Consider documenting this approximation for clarity.
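To make the step-function behavior concrete, here is a minimal sketch comparing upper-bound reporting with linear interpolation inside the winning bucket. The bucket bounds and counts below are hypothetical stand-ins, not the real `BUCKETS_MS` or telemetry state from `src/telemetry.ts`.

```typescript
// Hypothetical bucket upper bounds; the real BUCKETS_MS differs.
const BUCKETS_MS = [10, 50, 100, 500, 1000];

// Current-style estimate: return the upper bound of the bucket containing percentile p.
function percentileUpperBound(counts: number[], p: number): number {
  const total = counts.reduce((a, b) => a + b, 0);
  const target = Math.ceil((p / 100) * total);
  let cumulative = 0;
  for (let i = 0; i < counts.length; i++) {
    cumulative += counts[i];
    if (cumulative >= target) return BUCKETS_MS[i];
  }
  return BUCKETS_MS[BUCKETS_MS.length - 1];
}

// Smoother estimate: linearly interpolate within the winning bucket.
function percentileInterpolated(counts: number[], p: number): number {
  const total = counts.reduce((a, b) => a + b, 0);
  const target = (p / 100) * total;
  let cumulative = 0;
  for (let i = 0; i < counts.length; i++) {
    const lower = i === 0 ? 0 : BUCKETS_MS[i - 1];
    if (cumulative + counts[i] >= target) {
      const fraction = counts[i] === 0 ? 0 : (target - cumulative) / counts[i];
      return lower + fraction * (BUCKETS_MS[i] - lower);
    }
    cumulative += counts[i];
  }
  return BUCKETS_MS[BUCKETS_MS.length - 1];
}

// With all 100 samples in the 10-50ms bucket, the upper-bound estimate reports
// p50 = 50 while interpolation reports p50 = 30.
```

Interpolation assumes samples are uniformly distributed within a bucket, which is itself an approximation; documenting whichever choice is made is the main point.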
12-48: Type definitions duplicate `src/ui/types.ts`.

`TelemetryEntry` and `TelemetrySnapshot` are defined both here and in `src/ui/types.ts`. This duplication risks drift if one is updated without the other.

Consider either:

- Importing the types from a shared location, or
- Re-exporting these types from one canonical source
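As a sketch of the canonical-source option, one module could own both the type and a runtime guard, with `src/ui/types.ts` and `src/telemetry.ts` re-exporting it. The module path, field set, and guard name here are illustrative assumptions, not the real shapes.

```typescript
// Hypothetical canonical module, e.g. src/telemetry-types.ts.
// The field set is a simplified stand-in for the real TelemetryEntry.
export type TelemetryEntry = {
  key: string;
  count: number;
  p50Ms?: number;
};

// Runtime guard so UI code can validate snapshots received over the wire.
export function isTelemetryEntry(value: unknown): value is TelemetryEntry {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return typeof v.key === "string" &&
    typeof v.count === "number" &&
    (v.p50Ms === undefined || typeof v.p50Ms === "number");
}

// Both current definition sites would then do:
// export type { TelemetryEntry } from "../telemetry-types.ts";
```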
```ts
let LLM_BASE_URL = "";
let LLM_API_KEY = "";
let LLM_MODEL = "";
let DATA_ROOT = "";
let PROMPTS_ROOT = "";
```
Module-level mutable state creates shared state across handler instances.
The module-level variables `LLM_BASE_URL`, `LLM_API_KEY`, etc. are reassigned in `createApiHandler`. If multiple handlers are created with different configs, they will overwrite each other's configuration, since these are module-scoped variables.
Consider using closure-scoped variables instead:
♻️ Suggested refactor
```diff
 export function createApiHandler(config: ApiConfig) {
-  LLM_BASE_URL = config.llmBaseUrl;
-  LLM_API_KEY = config.llmApiKey;
-  LLM_MODEL = config.llmModel;
-  DATA_ROOT = config.dataRoot;
-  PROMPTS_ROOT = config.promptsRoot;
+  const LLM_BASE_URL = config.llmBaseUrl;
+  const LLM_API_KEY = config.llmApiKey;
+  const LLM_MODEL = config.llmModel;
+  const DATA_ROOT = config.dataRoot;
+  const PROMPTS_ROOT = config.promptsRoot;
+
+  // Move cached* and task stores inside the closure as well
+  let cachedEpics: unknown[] | null = null;
+  let cachedChampion: { base: string; patch: string; composed: string } | null = null;
+  let cachedOrchestrator: Orchestrator | null = null;
+  const evalTasks = new Map<string, EvalTask>();
+  const tournamentTasks = new Map<string, TournamentTask>();
+  const optimizationTasks = new Map<string, OptimizationTask>();
   // ... rest of handler
```
// ... rest of handlerThere was a problem hiding this comment.
Leaving module-level caches as-is because createApiHandler is instantiated once per process in server entrypoints; no concurrent configs today. Happy to refactor if multi-handler usage becomes needed.
@venikman, understood! Your single-handler-per-process deployment model makes the module-level state safe in practice. The current design is simpler and perfectly reasonable given your architecture. Thanks for clarifying the deployment pattern.
✏️ Learnings added
Learnt from: venikman
Repo: venikman/PromptAgent PR: 13
File: src/server/handler.ts:47-51
Timestamp: 2026-01-14T08:35:08.246Z
Learning: In the PromptAgent repository, `createApiHandler` in `src/server/handler.ts` is designed to be instantiated once per process in server entrypoints, so module-level caches (cachedEpics, cachedChampion, cachedOrchestrator, task stores) are safe and intentional for the single-handler deployment model.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: CR
Repo: venikman/PromptAgent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T19:14:53.230Z
Learning: Applies to **/scorer.ts : Implement scorer pipeline following the pattern: preprocess → analyze → generateScore → generateReason
src/ui/polyfills.ts
Outdated
```diff
@@ -0,0 +1,11 @@
+import { DOMParser as LinkedomDOMParser, parseHTML } from "npm:linkedom@0.18.5";
```
Inline `npm:` specifier violates project policy.

The static analysis reports that inline `npm:`, `jsr:`, or `https:` dependencies are not allowed. Move this dependency to the project's import map (e.g., `deno.json` or `import_map.json`).
Suggested fix
In your deno.json (or import map), add:
```json
{
  "imports": {
    "linkedom": "npm:linkedom@0.18.5"
  }
}
```

Then update this file:

```diff
-import { DOMParser as LinkedomDOMParser, parseHTML } from "npm:linkedom@0.18.5";
+import { DOMParser as LinkedomDOMParser, parseHTML } from "linkedom";
```

🧰 Tools
🪛 GitHub Check: check
[failure] 1-1:
Inline 'npm:', 'jsr:' or 'https:' dependency not allowed
🤖 Prompt for AI Agents
In `@src/ui/polyfills.ts` at line 1, The import uses an inline npm: specifier
which violates policy; add an import map entry mapping "linkedom" to
"npm:linkedom@0.18.5" in your deno.json/import_map.json, then update the import
statement in src/ui/polyfills.ts to import { DOMParser as LinkedomDOMParser,
parseHTML } from "linkedom" so the module is resolved via the import map rather
than an inline specifier.
```ts
import { Head } from "@fresh/core/runtime";
import type { PageProps } from "@fresh/core/render";
```
🧩 Analysis chain
🌐 Web query:
Fresh Deno framework version 2.x import paths Head component PageProps
💡 Result:
Use PageProps from "$fresh/server.ts" and Head from "$fresh/runtime.ts".
Examples:
- PageProps: import { PageProps } from "$fresh/server.ts"; — used for route/layout/_app props. [1]
- Head: import { Head, asset, Partial } from "$fresh/runtime.ts"; — used to modify the document head. [2]
References:
[1] Fresh docs — routes / PageProps usage.
[2] Fresh docs — runtime (Head, asset, Partial) and examples.
Fix Fresh framework import paths.
The imports are incorrect for Fresh 2.x. Update to:
- Line 1: `import { Head } from "$fresh/runtime.ts";`
- Line 2: `import type { PageProps } from "$fresh/server.ts";`
The `@fresh/core/` prefix is non-standard; Fresh uses `$fresh/` aliases for imports.
🤖 Prompt for AI Agents
In `@src/ui/routes/_app.tsx` around lines 1 - 2, Imports for Fresh are using the
wrong module specifiers; update the two import lines so the Head symbol is
imported from "$fresh/runtime.ts" and the PageProps type is imported from
"$fresh/server.ts" instead of "@fresh/core/runtime" and "@fresh/core/render"
respectively, ensuring the named symbols (Head and PageProps) are preserved.
💡 Codex Review
PromptAgent/src/orchestrator/optimization-loop.ts
Lines 365 to 369 in f03da6e
When optimization is configured with a maxTokens override (e.g., via the v3 optimize payload or CLI), the initial champion evaluation here does not pass that value to executeEvaluator. That means the baseline objective is computed with the default token cap while subsequent iterations use the override, which can skew promotion decisions and unintentionally increase cost for the first evaluation.
PromptAgent/src/orchestrator/optimization-loop.ts
Lines 404 to 408 in f03da6e
The tournament evaluation uses executeEvaluator without maxTokens, so candidate scoring ignores the configured token cap. If a caller sets maxTokens to control runtime/cost, champion evaluation (step 1) honors it but candidate evaluations do not, making tournament results inconsistent and potentially more expensive than intended.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In `@deploy/main.ts`:
- Around line 16-18: The current computation of keyPrefix using
env.LMSTUDIO_API_KEY.slice(0, 10) risks leaking sensitive key material in
production; update the logging logic that builds keyPrefix so it only includes a
short prefix (e.g., 4 chars) or is omitted entirely when running in production
(detect production via an env var like DENO_DEPLOYMENT_ID or NODE_ENV ===
"production"), and replace the explicit key string in the logged object with a
boolean like hasKey: !!env.LMSTUDIO_API_KEY; adjust references to keyPrefix and
env.LMSTUDIO_API_KEY accordingly so production logs never contain the API key
prefix.
In `@src/ui/islands/Studio.tsx`:
- Around line 650-664: The Refresh button clears the champion state immediately
then fetches /champion without preserving the previous value, so on network
failure the UI stays stuck in "Loading champion prompt..."; modify the onClick
handler around setChampion/fetch("/champion")/readJson<ChampionPrompt> to first
capture the current champion in a local variable, then attempt the fetch, and on
failure restore that saved champion (or set an explicit error state) instead of
leaving it null; update any UI that reads champion to display an error message
when the fetch fails if you choose the error-state approach.
- Around line 633-641: The MessageResponse component's mode prop is unused;
remove all occurrences of mode="static" passed into MessageResponse in
Studio.tsx (the three MessageResponse usages), and also remove the unused mode
prop from the MessageResponse props/type definition and any related default/prop
handling so the component signature no longer declares or expects mode.
In `@tests/deployment.test.ts`:
- Around line 178-186: The section header comment above the test is incorrect:
replace the header string "// API Endpoint Consistency Tests" with a header that
reflects this UI test (e.g., "// UI Entry Tests" or "// UI - Entry Point Tests")
so it aligns with the Deno.test named "UI - Fresh app entry should exist" which
checks the app entry at join(ROOT_DIR, "src", "ui", "app.ts").
♻️ Duplicate comments (3)
deno.json (2)
46-47: Shell glob expansion undermines the `--ignore` flag.

The glob `tests/**/*.test.ts` expands at the shell level before Deno processes the command. By the time `--ignore=tests/api` is evaluated, the file list is already resolved, making the exclusion ineffective.

🔧 Suggested fix

```diff
-    "test:unit": "deno test -A tests/**/*.test.ts --ignore=tests/api",
+    "test:unit": "deno test -A tests/ --ignore=tests/api",
```

This lets Deno handle file discovery and apply the ignore filter correctly.
52-52: Typecheck scope is significantly reduced.

The typecheck task only covers specific entrypoint files, which may allow type errors in other modules (e.g., `src/telemetry.ts`, `src/ui/**/*.ts`, utility modules) to go unnoticed. Consider broadening the scope.

🔧 Suggested fix

```diff
-    "typecheck": "deno check deploy/main.ts src/server/main.ts src/cli/optimize.ts src/cli/generate.ts src/cli/eval-dist.ts",
+    "typecheck": "deno check deploy/main.ts src/**/*.ts",
```

deploy/main.ts (1)
1-19: Missing production safety checks (fail-fast validation).

The previous version included critical startup validations to ensure `LMSTUDIO_BASE_URL` was set to a non-localhost value and `LMSTUDIO_API_KEY` was present when deployed (detected via `DENO_DEPLOYMENT_ID`). Without these, a misconfigured deployment could start successfully but fail at runtime.

🔧 Suggested fix

Add after the imports:

```ts
// Production safety checks
const isDeployed = !!Deno.env.get("DENO_DEPLOYMENT_ID");
if (isDeployed) {
  if (!env.LMSTUDIO_BASE_URL || env.LMSTUDIO_BASE_URL.includes("localhost")) {
    throw new Error(
      "LMSTUDIO_BASE_URL must be set to a non-localhost value in production.",
    );
  }
  if (!env.LMSTUDIO_API_KEY || env.LMSTUDIO_API_KEY === "lm-studio") {
    throw new Error(
      "LMSTUDIO_API_KEY must be set in production.",
    );
  }
}
```
🧹 Nitpick comments (6)
tests/deployment.test.ts (2)
44-54: Consider using Deno.test's `ignore` option for clearer test reporting.

The early return works but makes the test pass silently in CI. Using `ignore` provides explicit skip reporting in test output.

♻️ Suggested improvement

```diff
-Deno.test(
-  "Environment - should have .env file for local development",
-  async () => {
-    if (IS_CI) return;
+Deno.test({
+  name: "Environment - should have .env file for local development",
+  ignore: IS_CI,
+  fn: async () => {
     const envPath = join(ROOT_DIR, ".env");
     assert(
       await fileExists(envPath),
       ".env should exist for local development",
     );
   },
-);
+});
```
182-186: Consider adding content validation to the Fresh app entry test.

The test only checks file existence. For more robust validation, consider verifying that the file exports expected symbols (e.g., an `App` component or Fresh configuration).

♻️ Optional enhancement

```diff
 Deno.test("UI - Fresh app entry should exist", async () => {
   const appPath = join(ROOT_DIR, "src", "ui", "app.ts");
   const content = await readFileIfExists(appPath);
   assertExists(content, "Fresh app entry should exist");
+  // Optionally validate expected exports
+  assert(
+    content.includes("export") || content.includes("default"),
+    "Fresh app entry should have exports",
+  );
 });
```

deno.json (1)
76-84: Redundant entry in deploy include list.

`src/ui` (line 81) is redundant since `src` (line 78) already includes the entire `src/` directory. This doesn't cause harm but clutters the configuration.

🔧 Suggested fix

```diff
 "include": [
   "deploy",
   "src",
   "data",
   "prompts",
-  "src/ui",
   "deno.json",
   "deno.lock"
 ],
```

src/ui/twind_shared.ts (1)
24-55: Well-implemented Twind integration.

The setup function correctly:

- Chains the Preact vnode hook while preserving the original
- Processes both `class` and `className` through Twind's `tw()` utility
- Joins multiple class sources into a single `class` prop

One minor observation: `selfURL` from `Options` is spread into the config but isn't used by `twSetup()`. If you want to be explicit about not passing it to twind, you could destructure it out:

Optional: Exclude selfURL from config spread

```diff
 export function setup(options: Options, sheet: Sheet) {
+  const { selfURL: _, ...twindOptions } = options;
   const config: Configuration = {
-    ...options,
+    ...twindOptions,
     mode: "silent",
     sheet,
   };
```

This is purely a cleanliness preference; twind ignores unknown properties, so the current code works fine.
src/ui/islands/Studio.tsx (2)
441-478: Consider using `ReturnType<typeof setInterval>` for cross-runtime compatibility.

The explicit `number | null` type annotation on line 445 assumes browser semantics, where `setInterval` returns a number. In Deno, `setInterval` returns a `Timeout` object. Using `ReturnType<typeof setInterval>` ensures compatibility across runtimes.

Suggested fix

```diff
-  let intervalId: number | null = null;
+  let intervalId: ReturnType<typeof setInterval> | null = null;
```
709-757: Number inputs may produce NaN values.

`Number(target.value)` returns `NaN` for non-numeric input. While `min={1}` provides HTML validation, programmatic or paste inputs could still result in `NaN` being set in state, potentially causing issues in the optimization request.

Suggested fix with input validation

```diff
 onChange={(event) => {
   const target = event.currentTarget as HTMLInputElement;
+  const value = Number(target.value);
+  if (Number.isNaN(value) || value < 1) return;
   setOptConfig((current) => ({
     ...current,
-    maxIterations: Number(target.value),
+    maxIterations: value,
   }));
 }}
```

Apply the same pattern to the other number inputs (replicates, patchCandidates).
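A small helper keeps that guard in one place. This is a sketch; the `parsePositiveInt` name and contract are assumptions, not part of the PR.

```typescript
// Hypothetical helper: parse an <input type="number"> value, returning null
// for NaN, non-integers, or values below the minimum instead of propagating
// them into component state.
export function parsePositiveInt(raw: string, min = 1): number | null {
  const value = Number(raw);
  if (!Number.isInteger(value) || value < min) return null;
  return value;
}
```

Each onChange handler then becomes `const value = parsePositiveInt(target.value); if (value === null) return;` followed by the state update.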
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`deno.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (9)
- deno.json
- deploy/main.ts
- playwright.config.ts
- src/ui/components/ai-elements/message.tsx
- src/ui/components/ai-elements/prompt-input.tsx
- src/ui/islands/Studio.tsx
- src/ui/polyfills.ts
- src/ui/twind_shared.ts
- tests/deployment.test.ts
🚧 Files skipped from review as they are similar to previous changes (3)
- src/ui/components/ai-elements/message.tsx
- playwright.config.ts
- src/ui/components/ai-elements/prompt-input.tsx
🧰 Additional context used
🧬 Code graph analysis (1)
src/ui/islands/Studio.tsx (3)
src/ui/types.ts (2)

- `ScorerResult` (49-54)
- `TelemetrySnapshot` (75-96)

src/ui/components/ai-elements/prompt-input.tsx (5)

- `PromptInput` (6-14)
- `PromptInputBody` (18-20)
- `PromptInputTextarea` (24-35)
- `PromptInputFooter` (39-50)
- `PromptInputSubmit` (54-67)

src/ui/components/ai-elements/message.tsx (3)

- `Message` (14-23)
- `MessageContent` (27-37)
- `MessageResponse` (41-48)
🔇 Additional comments (16)
src/ui/polyfills.ts (2)
3-11: LGTM!

The polyfill implementation is sound:

- Defensive checks (`typeof ... === "undefined"`) ensure existing browser globals are not overwritten.
- Double type assertions (`as unknown as X`) are the accepted pattern for bridging linkedom's types to native DOM types.
- The minimal document stub is appropriate for server-side rendering scenarios.

1-1: Import specifier issue has been addressed.

The import now uses the bare specifier `"linkedom"` instead of the inline `npm:` specifier flagged in the previous review. The `deno.json` correctly maps `"linkedom"` to `"npm:linkedom@0.18.5"`, aligning with project policy.

tests/deployment.test.ts (2)
10-14: LGTM!

The import cleanup and CI detection pattern are appropriate. Using `Boolean(Deno.env.get("CI"))` correctly handles the undefined case when not in CI.

164-176: LGTM!

The refactored `hasStdImports` check is cleaner, and the logic correctly validates the presence of `@std/*` imports in the Deno configuration.
1-87: LGTM on the overall configuration restructuring.The migration to Fresh/Preact with proper JSX configuration, new test tasks (including e2e with Playwright), and updated import mappings looks well-structured. The compiler options correctly set
jsx: "react-jsx"withjsxImportSourcepointing to Preact.deploy/main.ts (3)
43-56: Good fallback pattern for the Fresh build cache.

The try/catch pattern that attempts to load the production build cache and gracefully falls back to an in-memory build is well-implemented. This ensures the deployment doesn't fail if the build artifacts are missing, while providing optimal performance when they're available.
61-81: API route detection is clean and maintainable.

The `API_PREFIXES` array with the `isApiRoute` helper provides a clear, centralized way to distinguish API routes from UI routes. The logic correctly handles both exact matches and prefix matches with trailing paths.
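A consolidated form of this helper, shared between `deploy/main.ts` and `src/server/main.ts` as suggested earlier in the review, might look like the following. The module path and the prefix list are illustrative assumptions, not the project's actual values.

```typescript
// Hypothetical shared module, e.g. src/server/routes.ts, imported by both
// entry points so the prefix list has a single source of truth.
export const API_PREFIXES = ["/health", "/champion", "/evaluate"] as const;

// Matches either the exact prefix or the prefix followed by a trailing path.
export function isApiRoute(pathname: string): boolean {
  return API_PREFIXES.some(
    (prefix) => pathname === prefix || pathname.startsWith(prefix + "/"),
  );
}
```

Note that requiring a trailing `/` after the prefix avoids false positives such as `/healthcheck` matching `/health`.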
83-131: Request handler with telemetry is well-structured.

The unified request handler properly:
- Records metrics for both success and error cases
- Logs structured JSON when telemetry is enabled
- Returns generic error messages to clients while logging full errors server-side
- Delegates to appropriate handlers based on route type
src/ui/twind_shared.ts (1)
1-22: Previous feedback addressed.

The imports and module augmentation now correctly use the `"preact"` import map alias instead of inline `npm:` specifiers. The interface and type definitions are clean.

src/ui/islands/Studio.tsx (7)
1-22: LGTM!

Previous static analysis issues have been addressed: imports now use import map entries instead of inline `npm:` specifiers.
60-86: LGTM!

Utility functions properly handle edge cases (undefined, NaN) and provide safe fallbacks.
88-114: LGTM!StoryCard component properly guards the optional story points field with a typeof check before rendering.
116-177: LGTM!
readJsonhelper provides clear error messages with endpoint context, andPlaygroundResultViewhandles all result states gracefully.
179-262: LGTM!State management is well-organized with appropriate memoization for derived values like
inFlightTotalandselectedEpic.
263-347: LGTM!Effect properly handles cleanup with the
cancelledflag and AbortController. The health check has a sensible 2.5s timeout.
349-371: LGTM!Previous
window.setIntervalissue has been addressed by usingglobalThis.setInterval. Silent error handling for telemetry polling is acceptable for non-critical background data.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
There was a problem hiding this comment.
Actionable comments posted: 6
🤖 Fix all issues with AI agents
In `@AGENTS.md`:
- Around line 25-28: The Commands section only lists e2e and UI tasks but omits
unit and other test tasks; update the documentation to include the missing test
commands by adding entries for "deno task test", "deno task test:unit", "deno
task test:api", "deno task test:coverage" (and any other project-defined test
tasks) with brief descriptions similar to the existing lines for e2e and UI
(e.g., "deno task test # Run all tests", "deno task test:unit #
Run unit tests", "deno task test:api # Run API tests", "deno task
test:coverage # Generate test coverage report") so the Commands section fully
documents available test tasks.
In `@src/ui/twind_client.ts`:
- Around line 45-51: The loop over ruleText can produce NaN precedences when the
marker is missing; in the block processing each rule (loop over ruleText)
validate that marker = rule.lastIndexOf("/*") is >= 0 before slicing, and after
parseInt(check parse result) ensure precedence is a finite integer (e.g.,
Number.isFinite or !Number.isNaN) otherwise handle gracefully (skip adding to
precedences, default to a safe precedence, or log/throw an error); keep adding
rawRule to rules only if appropriate and ensure precedences and rules remain in
sync by only pushing to precedences when precedence is valid.
- Around line 43-47: The code assumes styleEl.sheet is always present by using a
non-null assertion; guard against a null sheet in the function that reads
existing rules (replace the use of styleEl.sheet! / target with a checked
value). Specifically, in the block that declares target and iterates ruleText
(referencing styleEl and target), first check if styleEl.sheet is null and
handle it gracefully (e.g., treat as no existing rules or return early) before
accessing cssRules; this avoids runtime errors in disconnected DOM or edge cases
where ensureStyleElement did not yield a sheet.
- Around line 42-51: The sheet.init callback is passing only the first element
of sheetState (via sheetState.shift()) which yields a single-element snapshot
and causes undefined values when server code expects the full state array;
update the sheet.init implementation in twind_client.ts so it passes the entire
sheetState array to the callback (init: (cb) => cb(sheetState)) so twind
receives [precedences, rules, mappings, true] and
sheet.reset()/twind_middleware.ts can access snapshot[1] and snapshot[3]
correctly.
In `@src/ui/twind_middleware.ts`:
- Around line 54-58: The stateTag construction serializes mappings into a raw
script tag which allows JSON containing "</script>" or special line separators
to break out and enable XSS; update the code that builds stateTag (and where
STATE_ELEMENT_ID and mappings are used) to either: 1) create the script element
via DOM APIs on the client/SSR host and set its textContent to
JSON.stringify(mappings) (preferred), or 2) if emitting an HTML string, sanitize
the JSON string after JSON.stringify(mappings) by escaping "</" (e.g. replace
all occurrences of "</script>"/ "</" with "<\\/") and also escape U+2028 and
U+2029 (replace with "\\u2028" and "\\u2029") before embedding it in stateTag so
the serialized state cannot break out of the script tag.
♻️ Duplicate comments (8)
src/server/handler.ts (5)
47-51: Module-level mutable state creates shared state across handler instances.This issue was previously flagged. The module-level variables
LLM_BASE_URL,LLM_API_KEY, etc. are reassigned increateApiHandler. If multiple handlers are created with different configs, they will overwrite each other's configuration.
79-80: In-memory task stores grow unbounded — memory leak risk.This issue was previously flagged. The
evalTasks,tournamentTasks, andoptimizationTasksMaps store task data indefinitely without cleanup.Also applies to: 117-117, 123-123
354-364: Debug endpoint exposes partial API key.This issue was previously flagged. Line 361 exposes
keyPrefix: LLM_API_KEY.slice(0, 12) + "..."which could aid attackers.
559-566: Missing timeout on upstream LLM fetch in/generate-story.This issue was previously flagged. The
fetch()call lacks anAbortSignaltimeout, risking indefinite hangs if the LLM service is unresponsive.
780-787: Missing timeout on upstream LLM fetch in/generate.This issue was previously flagged. Same issue as
/generate-story— this fetch lacks a timeout.deploy/main.ts (2)
1-9: Production safety checks are missing.This was previously flagged. The deployment entrypoint should fail fast if misconfigured in production. Without validation, a deployed instance could start but fail at runtime when contacting a localhost LLM.
🛡️ Recommended production guard
// Add after imports const isDeployed = !!Deno.env.get("DENO_DEPLOYMENT_ID"); if (isDeployed) { if (!env.LMSTUDIO_BASE_URL || env.LMSTUDIO_BASE_URL.includes("localhost")) { throw new Error( "LMSTUDIO_BASE_URL must be set to a non-localhost value in production.", ); } if (!env.LMSTUDIO_API_KEY || env.LMSTUDIO_API_KEY === "lm-studio") { throw new Error( "LMSTUDIO_API_KEY must be set in production.", ); } }
16-18: API key prefix logged in production.This was previously flagged. Logging the first 10 characters of the API key could leak sensitive information in production logs.
src/ui/twind_middleware.ts (1)
15-15: Module-level sheet singleton creates a race condition in concurrent requests.This issue was previously identified: the
sheetis created once at module load. When concurrent requests invoke the middleware, they share this singleton and can cross-contaminate CSS state—sheet.reset()in one request can interfere with another request's rendering.Move sheet creation inside the factory function so each middleware instance has its own isolated sheet:
Proposed fix
-const sheet = virtualSheet(); - -const injectIntoHtml = (html: string, injection: string) => { +const injectIntoHtml = (html: string, injection: string) => { if (html.includes("</head>")) { return html.replace("</head>", `${injection}</head>`); } @@ -25,7 +23,8 @@ const injectIntoHtml = (html: string, injection: string) => { }; export const twindMiddleware = (options: Options): TwindMiddleware => { + const sheet = virtualSheet(); setup(options, sheet); return async (ctx) => {
🧹 Nitpick comments (8)
src/models.ts (1)
36-79: Consider extracting common URL/apiKey configuration.Both
makeGeneratorModelandmakeJudgeModelrepeat identical fallback chains forurlandapiKey. This is a minor DRY opportunity.♻️ Optional: Extract shared config
+const getBaseUrl = () => + getEnvWithFallback( + "LMSTUDIO_BASE_URL", + ["LLM_BASE_URL", "LLM_API_BASE_URL"], + "http://127.0.0.1:1234/v1", + ); + +const getApiKey = () => + getEnvWithFallback("LMSTUDIO_API_KEY", ["LLM_API_KEY"], "lm-studio"); + export function makeGeneratorModel(): ModelConfig { const modelId = getEnvWithFallback( "LMSTUDIO_MODEL", ["LLM_MODEL"], "openai/gpt-oss-120b", ); return { - url: getEnvWithFallback( - "LMSTUDIO_BASE_URL", - ["LLM_BASE_URL", "LLM_API_BASE_URL"], - "http://127.0.0.1:1234/v1", - ), + url: getBaseUrl(), id: `lmstudio/${modelId}` as const, - apiKey: getEnvWithFallback( - "LMSTUDIO_API_KEY", - ["LLM_API_KEY"], - "lm-studio", - ), + apiKey: getApiKey(), }; }src/telemetry.ts (2)
329-332: Consider exposing astopTelemetryReporterfunction.
startTelemetryReporterstores the interval ID but there's no exported function to stop it. This could be useful for graceful shutdown or testing.♻️ Proposed addition
export const startTelemetryReporter = (intervalMs = 60_000) => { if (reporterId !== null) return; reporterId = setInterval(() => flushTelemetry(), intervalMs); }; + +export const stopTelemetryReporter = () => { + if (reporterId === null) return; + clearInterval(reporterId); + reporterId = null; +};
12-48: Consolidate type definitions: moveTelemetryEntryandTelemetrySnapshotto a shared source.
TelemetryEntryandTelemetrySnapshotare currently defined identically in bothsrc/telemetry.tsandsrc/ui/types.ts, with no imports between them. This creates risk of unintended drift if one definition is updated without the other. Consider importing these types fromsrc/ui/types.tsinsrc/telemetry.tsand removing the duplicate definitions to establish a single source of truth.src/server/handler.ts (4)
842-863: Unhandled promise rejection in background task could crash silently.The async IIFE for evaluation runs in the background but exceptions thrown inside the
catchblock (e.g., if updatingtask.statusortask.errorfails) would result in an unhandled promise rejection. While the main try/catch handlesevalPromptDistributionerrors, any failure in the error-handling code itself is not caught.Consider adding a top-level catch or ensuring the entire async block is wrapped:
♻️ Suggested defensive wrapper
// Start evaluation in background (don't await) (async () => { task.status = "running"; try { // ... existing code } catch (err) { task.status = "failed"; task.error = err instanceof Error ? err.message : String(err); task.completedAt = new Date().toISOString(); } - })(); + })().catch((e) => console.error("Background eval task failed unexpectedly:", e));
1045-1136: Tournament background task has same unhandled rejection risk.Similar to the evaluation task, the tournament async IIFE could have unhandled rejections if the error-handling code itself fails.
♻️ Suggested defensive wrapper
- })(); + })().catch((e) => console.error("Background tournament task failed unexpectedly:", e));
1275-1300: Background tasks in V2/V3 endpoints also lack top-level error handling.The async IIFEs at lines 1275-1300 (V2 evaluate), 1366-1392 (V2 optimize), and 1452-1530 (V3 optimize) all have the same pattern where unexpected errors in the catch block would be unhandled.
Apply the same
.catch()pattern to all background async IIFEs for consistency.Also applies to: 1366-1392, 1452-1530
1492-1500: Potential division by zero in tournament progress calculation.At line 1493,
total / (config.patchCandidates + 1)could produce unexpected results iftotalis 0 (though unlikely given the flow). More importantly, theMath.floor(completed / ...)calculation may not accurately track candidate index if runs per candidate don't divide evenly.Consider tracking
candidateIdxexplicitly via the callback chain rather than inferring it from completion counts.AGENTS.md (1)
41-69: Add language specifier to fenced code block.The architecture tree code block at line 41 is missing a language specifier. Per static analysis (markdownlint MD040), fenced code blocks should specify a language for proper rendering and accessibility.
📝 Suggested fix
-``` +```text src/ cli/
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (17)
AGENTS.mddeploy/main.tsscripts/pre-commit-checks.tssrc/generator.tssrc/models.tssrc/server/handler.tssrc/server/main.tssrc/similarity.tssrc/telemetry.tssrc/ui/components/ai-elements/message.tsxsrc/ui/components/ai-elements/prompt-input.tsxsrc/ui/islands/Studio.tsxsrc/ui/static/styles.csssrc/ui/twind_client.tssrc/ui/twind_middleware.tstests/meta-evolution/meta-evolution.test.tstests/orchestrator/tools/tools.test.ts
✅ Files skipped from review due to trivial changes (1)
- src/similarity.ts
🚧 Files skipped from review as they are similar to previous changes (5)
- src/ui/components/ai-elements/message.tsx
- tests/meta-evolution/meta-evolution.test.ts
- src/ui/islands/Studio.tsx
- src/ui/static/styles.css
- src/server/main.ts
🧰 Additional context used
📓 Path-based instructions (1)
**/{generator,*agent*}.{ts,js}
📄 CodeRabbit inference engine (AGENTS.md)
**/{generator,*agent*}.{ts,js}: UsejsonPromptInjection: truein structuredOutput configuration when LM Studio'sresponse_formatsupport is incomplete
SeterrorStrategy: 'strict'in structuredOutput configuration for schema validation
Files:
src/generator.ts
🧠 Learnings (6)
📚 Learning: 2026-01-06T19:14:53.230Z
Learnt from: CR
Repo: venikman/PromptAgent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T19:14:53.230Z
Learning: Applies to **/scorer.ts : Composite score calculation should weight: 25% Epic keyword coverage, 30% INVEST principles adherence, 30% Acceptance criteria testability (prefer Given/When/Then), 10% Duplication penalty, 5% Story count sanity (4-8 optimal)
Applied to files:
AGENTS.md
📚 Learning: 2026-01-06T19:14:53.230Z
Learnt from: CR
Repo: venikman/PromptAgent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T19:14:53.230Z
Learning: Applies to **/{generator,*agent*}.{ts,js} : Use `jsonPromptInjection: true` in structuredOutput configuration when LM Studio's `response_format` support is incomplete
Applied to files:
AGENTS.mdsrc/server/handler.ts
📚 Learning: 2026-01-06T19:14:53.230Z
Learnt from: CR
Repo: venikman/PromptAgent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T19:14:53.230Z
Learning: Applies to **/{generator,*agent*}.{ts,js} : Set `errorStrategy: 'strict'` in structuredOutput configuration for schema validation
Applied to files:
AGENTS.md
📚 Learning: 2026-01-06T19:14:53.230Z
Learnt from: CR
Repo: venikman/PromptAgent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T19:14:53.230Z
Learning: Applies to **/scorer.ts : Use a separate judge model instance via `makeJudgeModel()` in scoring modules to reduce 'grading your own homework' bias
Applied to files:
AGENTS.md
📚 Learning: 2026-01-06T19:14:53.230Z
Learnt from: CR
Repo: venikman/PromptAgent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T19:14:53.230Z
Learning: Applies to **/scorer.ts : Implement scorer pipeline following the pattern: preprocess → analyze → generateScore → generateReason
Applied to files:
AGENTS.md
📚 Learning: 2026-01-06T19:14:53.230Z
Learnt from: CR
Repo: venikman/PromptAgent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T19:14:53.230Z
Learning: Applies to **/scorer.ts : Enforce schema validation as a hard gate: return score = 0 if validation fails
Applied to files:
AGENTS.md
🧬 Code graph analysis (6)
src/ui/twind_client.ts (1)
src/ui/twind_shared.ts (4)
STATE_ELEMENT_ID(8-8)STYLE_ELEMENT_ID(7-7)Options(10-13)setup(24-55)
src/ui/twind_middleware.ts (1)
src/ui/twind_shared.ts (4)
Options(10-13)setup(24-55)STYLE_ELEMENT_ID(7-7)STATE_ELEMENT_ID(8-8)
src/generator.ts (4)
src/telemetry.ts (3)
start(345-372)withAiTelemetry(275-303)recordAiResponse(246-273)src/schema.ts (3)
StoryPack(71-71)storyPackSchema(62-69)Epic(20-20)src/config.ts (1)
env(156-156)src/judge/promptagent-fpf-judge.ts (1)
ExecutionTrace(57-57)
src/ui/components/ai-elements/prompt-input.tsx (1)
src/ui/lib/utils.ts (1)
cn(1-2)
src/telemetry.ts (2)
src/ui/types.ts (2)
TelemetryEntry(62-73)TelemetrySnapshot(75-96)src/config.ts (1)
env(156-156)
src/server/handler.ts (6)
src/scorer.ts (1)
createStoryDecompositionScorer(49-306)src/eval.ts (2)
PromptDistReport(82-93)evalPromptDistribution(130-239)src/telemetry.ts (1)
getTelemetrySnapshot(321-321)src/pairMining.ts (4)
ScoredOutput(26-35)mineContrastivePairs(139-194)ContrastPair(50-66)formatPairsForPrompt(442-489)src/patchEngineer.ts (1)
generatePatchCandidates(146-169)src/config.ts (1)
env(156-156)
🪛 markdownlint-cli2 (0.18.1)
AGENTS.md
41-41: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (32)
src/ui/components/ai-elements/prompt-input.tsx (6)
1-2: LGTM!The inline npm specifier issue from the previous review has been addressed. The import now correctly uses the
"preact"alias from the import map.
4-16: LGTM!Clean form wrapper with appropriate focus-within visual feedback for accessibility. The prop forwarding pattern is idiomatic.
18-22: LGTM!Simple container component with consistent prop forwarding.
24-37: LGTM!The
focus:outline-noneis appropriate here since the parentPromptInputform provides focus-within visual feedback, avoiding duplicate focus indicators.
39-52: LGTM!Well-structured footer with appropriate flex layout for action items.
54-69: LGTM!Good default of
type="submit"for form context. Disabled state styling is properly handled withdisabled:cursor-not-allowed disabled:opacity-60.tests/orchestrator/tools/tools.test.ts (1)
7-20: LGTM! Import paths correctly updated to reflect new module structure.The import reorganization aligns with the PR's move to centralized modules under
src/. Type imports are properly segregated usingtypekeyword forFlatRunandStoryPack.src/models.ts (1)
18-34: LGTM! Clean environment variable fallback implementation.The two-stage lookup pattern with
getEnvWithFallbackprovides good flexibility for different deployment configurations while maintaining sensible defaults.scripts/pre-commit-checks.ts (2)
228-248: LGTM! Fresh UI structure checks appropriately replace Vite checks.The new
checkFreshUifunction validates the expected Fresh framework structure (routes/index.tsxandapp.ts), aligning with the PR's migration away from Vite/React. Using "warning" severity is appropriate since the UI might be optional in some configurations.
253-274: Server entrypoint checks updated correctly.The renamed check validates both local (
src/server/main.ts) and production (deploy/main.ts) server entrypoints, with production being a warning-level check. This matches the PR's "remove deploy integration" theme while still validating the expected structure.src/telemetry.ts (1)
341-378: Well-structured SSE stream implementation.The
createTelemetryStreamproperly handles cleanup on both abort signal and manual cancellation, includes keepalive messages, and pushes an initial snapshot. The implementation follows SSE best practices.src/generator.ts (4)
293-306: Correct structuredOutput configuration per coding guidelines.The
structuredOutputconfiguration correctly usesjsonPromptInjection: trueanderrorStrategy: 'strict'as required by the coding guidelines for generator/agent files. TheAbortSignal.timeoutintegration properly enforces LLM timeout.
97-111: Robust JSON extraction with multiple fallback strategies.The
extractJsonCandidatefunction handles both fenced code blocks (```json) and bare JSON objects, providing good resilience for varied LLM output formats.
375-384: Comprehensive telemetry recording for AI responses.The
recordAiResponsecall captures response-specific telemetry (duration, success, text preview, error, token usage) complementing the aggregate metrics tracked bywithAiTelemetry. TheNumber.isFiniteguard ondurationMsis a good defensive check.
340-350: Good error message extraction with fallbacks.The validation error handling extracts the first issue for a concise error message while preserving the full validation failure structure. The fallback to
responseError.messageensures errors are captured even when other extraction fails.src/server/handler.ts (4)
208-212: CORS configuration is overly permissive for production.
access-control-allow-origin: "*"allows any origin to make requests. While acceptable for development/demo, this should be restricted in production deployments to prevent unauthorized cross-origin access to the API.Is this API intended to be publicly accessible from any origin, or should CORS be configurable per environment?
242-323: LGTM! Robust acceptance criteria parser.The
parseAcceptanceCriteriafunction handles multiple input formats (GWT, numbered lists, bullets, HTML, newlines) with clear fallback logic and a sensible minimum length filter. The implementation is thorough and defensive.
398-458: Champion update logic looks correct with appropriate backup versioning.The POST
/championendpoint properly:
- Creates versioned backups with timestamps
- Parses base/patch sections from composed prompts
- Normalizes and persists all three files atomically
- Clears caches after update
36-37: The scorer instance is designed for safe reuse across requests. The metrics (KeywordCoverageMetric,PromptAgentFPFMetric) are measurement tools, not state containers—each request flows through the pipeline with isolated input/output data. Themeasure()methods operate on fresh inputs per request without accumulating cross-request state.AGENTS.md (1)
31-37: LGTM! Clear functional style guidelines.The new Code Style section provides clear, actionable guidance for maintaining a functional programming style with immutability and pure functions.
deploy/main.ts (3)
46-75: Good fallback handling for missing Fresh build output.The try/catch structure properly distinguishes between deployed and local environments:
- In production (Deno Deploy): Returns 503 with clear message
- In development: Falls back to in-memory build
This graceful degradation is well-designed.
78-98: LGTM! Clean API route detection.The
API_PREFIXESarray andisApiRoutefunction provide a clear, maintainable way to distinguish API routes from UI routes. The prefix-matching logic correctly handles both exact matches and sub-paths.
100-147: Request handler has comprehensive telemetry and error handling.The main request handler:
- Records timing metrics for all requests
- Logs structured telemetry when enabled
- Catches and logs errors while returning generic 500 responses
- Properly records failed requests in telemetry
src/ui/twind_middleware.ts (4)
1-7: Imports look correct.The imports align with the module's responsibilities: virtual sheet from twind/sheets and the shared utilities/constants from twind_shared.ts.
9-13: Type definitions are well-structured.
MappingStateefficiently represents identity mappings as strings and non-identity mappings as tuples. TheTwindMiddlewaretype properly captures the Fresh middleware signature.
17-25: HTML injection helper is correct but consider edge cases.The function handles
</head>,</body>, and fallback cases appropriately. Note thatreplace()only replaces the first occurrence, which is the correct behavior here. However, if the HTML is malformed or uses uppercase tags (</HEAD>), injection may fall through to appending.
60-68: Response reconstruction is handled correctly.Deleting
content-lengthis appropriate since the HTML body size has changed. Preserving other headers, status, and statusText maintains response fidelity.src/ui/twind_client.ts (5)
1-8: Imports are appropriate for client-side hydration.The module correctly imports the Sheet type, shared setup utilities, and element IDs that correspond to what the middleware injects server-side.
12-20: State reading is robust with proper fallbacks.The function gracefully handles missing elements and JSON parse errors by returning an empty array. This ensures hydration can proceed even if server-side state injection failed.
22-32: Style element creation is correct.The function properly checks for existing elements and creates one in
document.headif missing, using the shared constant for the ID.
53-57: Sheet interface implementation looks correct.The
insertmethod properly delegates toCSSStyleSheet.insertRule, andinitusesshift()to provide state sequentially. This matches typical twind sheet initialization patterns.
62-64: Browser guard is appropriate.Checking
typeof document !== "undefined"correctly gates client-side hydration for SSR/isomorphic module loading scenarios.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
| const styleTag = `<style id="${STYLE_ELEMENT_ID}">${cssText}</style>`; | ||
| const stateTag = | ||
| `<script id="${STATE_ELEMENT_ID}" type="application/json">${ | ||
| JSON.stringify(mappings) | ||
| }</script>`; |
There was a problem hiding this comment.
JSON injection vulnerability in state serialization.
If mappings contains strings with </script>, the JSON output could break out of the script tag and enable XSS. Consider escaping or using a safer serialization approach.
Proposed fix using JSON escaping
const styleTag = `<style id="${STYLE_ELEMENT_ID}">${cssText}</style>`;
const stateTag =
`<script id="${STATE_ELEMENT_ID}" type="application/json">${
- JSON.stringify(mappings)
+ JSON.stringify(mappings).replace(/</g, "\\u003c")
}</script>`;🤖 Prompt for AI Agents
In `@src/ui/twind_middleware.ts` around lines 54 - 58, The stateTag construction
serializes mappings into a raw script tag which allows JSON containing
"</script>" or special line separators to break out and enable XSS; update the
code that builds stateTag (and where STATE_ELEMENT_ID and mappings are used) to
either: 1) create the script element via DOM APIs on the client/SSR host and set
its textContent to JSON.stringify(mappings) (preferred), or 2) if emitting an
HTML string, sanitize the JSON string after JSON.stringify(mappings) by escaping
"</" (e.g. replace all occurrences of "</script>"/ "</" with "<\\/") and also
escape U+2028 and U+2029 (replace with "\\u2028" and "\\u2029") before embedding
it in stateTag so the serialized state cannot break out of the script tag.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@src/server/handler.ts`:
- Around line 1580-1588: The tournament progress calculation can divide by zero
when total === 0; change the logic around candidateIdx and totalRuns handling in
the block using candidateIdx, updateTaskTournamentProgress, and
config.patchCandidates so you first guard total (e.g., treat runsPerCandidate =
total === 0 ? 1 : total / (config.patchCandidates + 1)), compute candidateIdx
using that guarded runsPerCandidate, and clamp candidateIdx to the valid range
[0, config.patchCandidates] before calling updateTaskTournamentProgress; ensure
totalCandidates remains config.patchCandidates + 1 and runsCompleted/totalRuns
still reflect the actual values (or zero) but avoid any division by zero.
In `@src/ui/islands/Studio.tsx`:
- Around line 730-794: The number inputs for maxIterations, replicates and
patchCandidates set state with Number(target.value) without guarding against NaN
when the field is cleared or invalid; update the onChange handlers for the
inputs (the ones that call setOptConfig in Studio.tsx) to parse and validate the
value (e.g., use parseInt/Number, check isNaN, and clamp to a minimum of 1 or
fall back to the current state's value) before calling setOptConfig so
optConfig.maxIterations, optConfig.replicates and optConfig.patchCandidates
never become NaN.
🧹 Nitpick comments (4)
deno.json (1)
76-84: Redundant entry in deploy include list.
src/uion line 81 is already covered bysrcon line 78. This redundancy is harmless but could be cleaned up.🧹 Suggested cleanup
"include": [ "deploy", "src", "data", "prompts", - "src/ui", "deno.json", "deno.lock" ],deploy/main.ts (1)
111-159: Consider extracting duplicate telemetry logging logic.The success and error paths contain nearly identical telemetry logging code (lines 119-135 and 139-155). While functional, this could be simplified.
♻️ Suggested refactor
Deno.serve(async (req) => { const started = performance.now(); const url = new URL(req.url); const isApi = isApiRoute(url.pathname); const routeKey = isApi ? normalizePath(url.pathname) : "ui"; + + const logRequest = (status: number, durationMs: number) => { + recordHttpRequest({ + key: `${req.method} ${routeKey}`, + status, + durationMs, + }); + if (env.TELEMETRY_LOG_REQUESTS) { + console.log( + JSON.stringify({ + type: "http", + at: new Date().toISOString(), + method: req.method, + route: routeKey, + status, + durationMs: Math.round(durationMs), + }), + ); + } + }; + try { const res = isApi ? await apiHandler(req) : await freshHandler(req); const durationMs = performance.now() - started; - recordHttpRequest({ - key: `${req.method} ${routeKey}`, - status: res.status, - durationMs, - }); - if (env.TELEMETRY_LOG_REQUESTS) { - console.log( - JSON.stringify({ - type: "http", - at: new Date().toISOString(), - method: req.method, - route: routeKey, - status: res.status, - durationMs: Math.round(durationMs), - }), - ); - } + logRequest(res.status, durationMs); return res; } catch (err) { const durationMs = performance.now() - started; - recordHttpRequest({ - key: `${req.method} ${routeKey}`, - status: 500, - durationMs, - }); - if (env.TELEMETRY_LOG_REQUESTS) { - console.log( - JSON.stringify({ - type: "http", - at: new Date().toISOString(), - method: req.method, - route: routeKey, - status: 500, - durationMs: Math.round(durationMs), - }), - ); - } + logRequest(500, durationMs); console.error(err); return new Response("Internal server error", { status: 500 }); } });src/server/handler.ts (2)
924-945: Background task error handling is correct but consider adding a safety net.The fire-and-forget IIFE pattern is appropriate for async tasks. The try-catch properly handles errors by setting task status and error message. However, if
task.error = ...ortask.completedAt = ...assignments somehow throw, the rejection would be unhandled.Consider wrapping in an additional outer catch for defensive programming:
♻️ Suggested approach
// Wrap background tasks to prevent unhandled rejections (async () => { try { // ... task logic } catch (err) { task.status = "failed"; task.error = err instanceof Error ? err.message : String(err); task.completedAt = new Date().toISOString(); } })().catch((err) => console.error("Unhandled background task error:", err));
498-509: Non-atomic file writes could leave inconsistent state on failure.If one of the three
writeTextFilecalls fails (e.g., disk full), the prompt files could be left in an inconsistent state wherechampion.mddiffers from the combination ofbase.mdandpatch.md. The cache is cleared regardless, so the next load would read the inconsistent state.♻️ Suggested approach
Consider writing to temporary files first, then renaming atomically, or at minimum, writing in sequence so failures are more predictable:
// Write in sequence to fail fast before partial writes await Deno.writeTextFile(`${PROMPTS_ROOT}/champion.md`, normalizedComposed); await Deno.writeTextFile(`${PROMPTS_ROOT}/champion.base.md`, nextBase); await Deno.writeTextFile(`${PROMPTS_ROOT}/champion.patch.md`, nextPatch);Or use a write-then-rename pattern for true atomicity.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
AGENTS.mddeno.jsondeploy/main.tsplaywright.config.tssrc/config.tssrc/judge/promptagent-fpf-judge.tssrc/server/handler.tssrc/ui/components/ai-elements/message.tsxsrc/ui/islands/Studio.tsxsrc/ui/twind_middleware.tstests/deployment.test.ts
🚧 Files skipped from review as they are similar to previous changes (3)
- src/ui/components/ai-elements/message.tsx
- src/ui/twind_middleware.ts
- playwright.config.ts
🧰 Additional context used
📓 Path-based instructions (1)
**/{generator,*agent*}.{ts,js}
📄 CodeRabbit inference engine (AGENTS.md)
**/{generator,*agent*}.{ts,js}: UsejsonPromptInjection: truein structuredOutput configuration when LM Studio'sresponse_formatsupport is incomplete
SeterrorStrategy: 'strict'in structuredOutput configuration for schema validation
Files:
src/judge/promptagent-fpf-judge.ts
🧠 Learnings (8)
📚 Learning: 2026-01-06T19:14:53.230Z
Learnt from: CR
Repo: venikman/PromptAgent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T19:14:53.230Z
Learning: Applies to **/scorer.ts : Use a separate judge model instance via `makeJudgeModel()` in scoring modules to reduce 'grading your own homework' bias
Applied to files:
src/judge/promptagent-fpf-judge.tsAGENTS.md
📚 Learning: 2026-01-06T19:14:53.230Z
Learnt from: CR
Repo: venikman/PromptAgent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T19:14:53.230Z
Learning: Applies to **/{generator,*agent*}.{ts,js} : Use `jsonPromptInjection: true` in structuredOutput configuration when LM Studio's `response_format` support is incomplete
Applied to files:
`src/judge/promptagent-fpf-judge.ts`, `AGENTS.md`, `src/server/handler.ts`
📚 Learning: 2026-01-06T19:14:53.230Z
Learnt from: CR
Repo: venikman/PromptAgent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T19:14:53.230Z
Learning: Applies to **/{generator,*agent*}.{ts,js} : Set `errorStrategy: 'strict'` in structuredOutput configuration for schema validation
Applied to files:
`src/judge/promptagent-fpf-judge.ts`, `AGENTS.md`
📚 Learning: 2026-01-06T19:14:53.230Z
Learnt from: CR
Repo: venikman/PromptAgent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T19:14:53.230Z
Learning: Applies to **/scorer.ts : Composite score calculation should weight: 25% Epic keyword coverage, 30% INVEST principles adherence, 30% Acceptance criteria testability (prefer Given/When/Then), 10% Duplication penalty, 5% Story count sanity (4-8 optimal)
Applied to files:
`src/judge/promptagent-fpf-judge.ts`, `AGENTS.md`
📚 Learning: 2026-01-06T19:14:53.230Z
Learnt from: CR
Repo: venikman/PromptAgent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T19:14:53.230Z
Learning: Applies to **/scorer.ts : Enforce schema validation as a hard gate: return score = 0 if validation fails
Applied to files:
`src/judge/promptagent-fpf-judge.ts`, `AGENTS.md`
📚 Learning: 2026-01-06T19:14:53.230Z
Learnt from: CR
Repo: venikman/PromptAgent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T19:14:53.230Z
Learning: Applies to **/schema.ts : Use Zod 4.x for schema validation
Applied to files:
src/config.ts
📚 Learning: 2026-01-14T08:35:08.246Z
Learnt from: venikman
Repo: venikman/PromptAgent PR: 13
File: src/server/handler.ts:47-51
Timestamp: 2026-01-14T08:35:08.246Z
Learning: In the PromptAgent repository, `createApiHandler` in `src/server/handler.ts` is designed to be instantiated once per process in server entrypoints, so module-level caches (cachedEpics, cachedChampion, cachedOrchestrator, task stores) are safe and intentional for the single-handler deployment model.
Applied to files:
`deploy/main.ts`, `src/server/handler.ts`
📚 Learning: 2026-01-06T19:14:53.230Z
Learnt from: CR
Repo: venikman/PromptAgent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T19:14:53.230Z
Learning: Applies to **/scorer.ts : Implement scorer pipeline following the pattern: preprocess → analyze → generateScore → generateReason
Applied to files:
AGENTS.md
🧬 Code graph analysis (2)
src/judge/promptagent-fpf-judge.ts (2)
- src/config.ts (1): `env` (168-168)
- src/telemetry.ts (1): `withAiTelemetry` (275-303)

deploy/main.ts (4)
- src/config.ts (1): `env` (168-168)
- src/server/handler.ts (2): `ApiConfig` (39-45), `createApiHandler` (399-1658)
- src/ui/app.ts (1): `app` (6-9)
- src/telemetry.ts (3): `startTelemetryReporter` (329-332), `normalizePath` (179-182), `recordHttpRequest` (184-195)
🔇 Additional comments (22)
deno.json (1)
1-87: Configuration looks well-structured. The Fresh/Preact setup with proper import mappings, jsx configuration, and task definitions is clean. The feedback from previous reviews (typecheck scope, test:unit glob handling) has been addressed.
src/judge/promptagent-fpf-judge.ts (2)
368-383: Structured output configuration now complies with coding guidelines.

The `structuredOutput` configuration correctly includes `jsonPromptInjection: true` and `errorStrategy: "strict"` as required by the coding guidelines for agent files. The telemetry wrapping and timeout handling are also well-implemented.

456-473: Clean overloaded method signature for flexibility.

The `measure` method overloads provide a good API surface, supporting both the full `PromptAgentJudgeInput` object and a simpler `(input, output)` string signature. The fallback defaults for the string signature are sensible.

src/ui/islands/Studio.tsx (2)

276-362: Initial data loading effect is well-structured.

The effect properly handles cancellation, abort controllers, and cleanup. The parallel loading of health, epics, and champion data with appropriate error handling is clean. The early-return pattern for cancelled requests prevents state updates after unmount.

657-667: Past review feedback addressed: `mode` prop removed from `MessageResponse`.

The `MessageResponse` components are now used without the unused `mode="static"` prop, addressing the previous review concern about the unused prop.

src/config.ts (2)
15-20: Boolean environment variable handling is now correct.

The `envBoolean` helper properly addresses the previous review concern about `z.coerce.boolean()` treating the string `"false"` as `true`. The implementation correctly normalizes and parses `"true"`/`"1"` as `true` and defaults appropriately.
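The parsing behavior described here can be sketched as a plain function. This is an illustration of the string handling only, with assumed semantics for unrecognized values; the repo's actual helper wires this logic into its Zod schema.

```typescript
// Sketch of envBoolean-style parsing: only "true"/"1" mean true and only
// "false"/"0" mean false; anything else (including an unset variable) falls
// back to the default. By contrast, z.coerce.boolean() would treat any
// non-empty string — even "false" — as true.
function envBoolean(raw: string | undefined, defaultValue: boolean): boolean {
  if (raw === undefined) return defaultValue;
  switch (raw.trim().toLowerCase()) {
    case "true":
    case "1":
      return true;
    case "false":
    case "0":
      return false;
    default:
      return defaultValue;
  }
}
```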
26-34: LLM configuration with layered fallbacks is well-designed.

The fallback chain from `LMSTUDIO_*` to generic `LLM_*` variables provides flexibility for different deployment scenarios. The timeout constraints (1-600 seconds) are sensible bounds.

tests/deployment.test.ts (2)

44-54: CI-aware test skipping is appropriate.

Skipping the `.env` existence check on CI makes sense since CI environments typically inject secrets differently. The `IS_CI` detection pattern is clean.

178-186: Section header and test now align correctly.

The section header "UI Entry Point Tests" now accurately describes the test that validates the Fresh app entry point's existence. This addresses the previous review feedback about the mismatched header.
AGENTS.md (3)
35-42: Good addition of functional code style guidelines.

The new Code Style section clearly documents the project's preference for pure functions, immutability, and expression-based transforms. This aligns well with the functional patterns used throughout the codebase.

77-86: Structured output configuration matches learnings.

The example correctly includes `jsonPromptInjection: true` and `errorStrategy: "strict"` as specified in the project learnings for LM Studio compatibility.

43-73: Architecture documentation reflects new modular structure.

The updated file tree accurately documents the reorganized `src/` directory with the new modules (fpf, judge, meta-evolution, orchestrator, server, telemetry, ui) and prompt artifacts.
11-25: Production safety checks look good.

The checks correctly validate that `LMSTUDIO_BASE_URL` is set to a non-localhost value and that `LMSTUDIO_API_KEY` is not the default placeholder. The regex `/localhost|127\.0\.0\.1/` covers common local patterns.

57-86: Fresh UI fallback handling is well-designed.

The try-catch properly distinguishes between deployed and local environments:

- In production: returns 503 with a clear message when the build is missing
- In development: falls back to an in-memory build for convenience

This graceful degradation prevents cryptic errors while maintaining development ergonomics.
src/server/handler.ts (8)
47-51: Module-level state is acceptable for single-handler deployment.

Based on learnings, `createApiHandler` is designed to be instantiated once per process, so module-level caches for config, epics, champion, and orchestrator are safe and intentional.
128-153: Task cleanup implementation addresses memory leak concern.

The TTL-based cleanup with 1-hour retention and 10-minute intervals is well-implemented. The guard against multiple schedulers and proper handling of invalid timestamps make this robust.
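A minimal sketch of the TTL sweep described above; the type shape and function name are assumptions rather than the handler's exact API, and keeping tasks with unparseable timestamps is one reasonable reading of "proper handling".

```typescript
// Sweep completed tasks older than ttlMs. Running tasks (no completedAt) and
// tasks with unparseable timestamps are kept, so a bad date can never cause
// premature deletion. Returns how many entries were removed.
type TaskRecord = { completedAt?: string };

function sweepExpiredTasks(
  tasks: Map<string, TaskRecord>,
  ttlMs: number,
  now: number = Date.now(),
): number {
  let removed = 0;
  for (const [id, task] of [...tasks]) {
    if (!task.completedAt) continue; // still running
    const finishedAt = Date.parse(task.completedAt);
    if (!Number.isFinite(finishedAt)) continue; // invalid timestamp: keep
    if (now - finishedAt > ttlMs) {
      tasks.delete(id);
      removed++;
    }
  }
  return removed;
}
```

Scheduling a sweep like this on a 10-minute interval with a 1-hour TTL matches the retention-to-sweep ratio the review describes.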
309-345: Acceptance criteria parser handles multiple formats comprehensively.

The GWT (Given/When/Then) parsing correctly groups scenarios by "Given" keywords and handles the common BDD format. The fallback chain (GWT → numbered → bullets → HTML → newlines) ensures broad compatibility.
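The fallback chain can be sketched generically as "first parser to return anything wins". The parser names below are illustrative stand-ins, not the handler's real format-specific implementations.

```typescript
// First-non-empty fallback chain: try each format parser in order
// (e.g. GWT → numbered → bullets → HTML → newlines) and keep the result of
// the first parser that yields any items.
type CriteriaParser = (text: string) => string[];

function parseWithFallbacks(text: string, parsers: CriteriaParser[]): string[] {
  for (const parse of parsers) {
    const items = parse(text);
    if (items.length > 0) return items;
  }
  return [];
}

// Two illustrative parsers for numbered and bulleted lists:
const numbered: CriteriaParser = (text) =>
  text
    .split("\n")
    .map((line) => line.match(/^\s*\d+[.)]\s+(.*)$/)?.[1]?.trim() ?? "")
    .filter((item) => item.length > 0);

const bullets: CriteriaParser = (text) =>
  text
    .split("\n")
    .map((line) => line.match(/^\s*[-*]\s+(.*)$/)?.[1]?.trim() ?? "")
    .filter((item) => item.length > 0);
```

Ordering the parsers from most to least specific is what keeps the broad newline fallback from shadowing the structured formats.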
250-262: CORS configuration is secure for production.

The logic correctly prevents wildcard (`*`) origins in deployed environments and falls back to same-origin validation when no allowlist is configured. This is a safe default.
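As a sketch of the policy described, not the handler's exact code: the function name, parameter shape, and the permissive dev-mode branch are all assumptions here.

```typescript
// Resolve the Access-Control-Allow-Origin value: never "*" when deployed;
// honor an explicit allowlist; otherwise fall back to same-origin only.
// Returning null means "send no CORS allow header".
function resolveCorsOrigin(
  requestOrigin: string | null,
  allowlist: readonly string[],
  selfOrigin: string,
  deployed: boolean,
): string | null {
  if (!requestOrigin) return null;
  if (allowlist.includes(requestOrigin)) return requestOrigin;
  if (!deployed) return requestOrigin; // permissive only in local dev
  return requestOrigin === selfOrigin ? requestOrigin : null;
}
```

Because the allowed value is an echoed origin rather than a constant, responses should also carry `Vary: Origin` so caches key on the request origin.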
630-639: LLM fetch includes proper timeout.

The `AbortSignal.timeout(timeoutMs)` prevents indefinite hangs when the LLM service is unresponsive. This addresses the previous review concern about missing timeouts.

422-432: Debug endpoint no longer exposes API key material.

The endpoint now returns only `hasKey` and `keyLength` instead of a key prefix, addressing the previous security concern.

856-865: Generate endpoint also has proper timeout.

The `/generate` endpoint correctly includes `AbortSignal.timeout(timeoutMs)` for the LLM fetch, consistent with `/generate-story`.

1-45: Well-structured module with clear type definitions.

The imports are organized logically, and the `ApiConfig` type provides a clear contract for handler initialization. The separation of PromptAgent modules, orchestrator imports, and config/telemetry is clean.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/ui/twind_client.ts`:
- Line 63: The TS2345 error arises because `init: (cb) => cb(sheetState)` passes `sheetState` to a generically typed `cb`. Fix it with a type assertion so the compiler knows the callback accepts `SheetInitState`: implement `init` as a function typed like `(cb: (state?: unknown) => unknown) => cb(sheetState)` and assert it as `Sheet["init"]` (or cast the right-hand side to `Sheet["init"]`) to resolve the error.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- src/config.ts
- src/ui/twind_client.ts
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2026-01-06T19:14:53.230Z
Learnt from: CR
Repo: venikman/PromptAgent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T19:14:53.230Z
Learning: Applies to **/schema.ts : Use Zod 4.x for schema validation
Applied to files:
src/config.ts
🧬 Code graph analysis (1)
src/ui/twind_client.ts (1)
- src/ui/twind_shared.ts (4): `STATE_ELEMENT_ID` (8-8), `STYLE_ELEMENT_ID` (7-7), `Options` (10-13), `setup` (24-55)
🪛 GitHub Actions: CI
src/ui/twind_client.ts
[error] 63-63: TS2345: Argument of type 'unknown' is not assignable to parameter of type 'T | undefined'. init: (cb) => cb(sheetState). Type checking failed. (Command: deno check deploy/main.ts src/**/*.ts)
🔇 Additional comments (9)
src/ui/twind_client.ts (2)
44-57: LGTM! Defensive checks properly address edge cases.

The null check for `styleEl.sheet` (lines 44-45) and the robust precedence parsing with `marker === -1` guards and `Number.isFinite` validation (lines 49-56) properly handle the edge cases previously identified. The implementation now gracefully handles malformed CSS rules and disconnected DOM scenarios.

69-71: LGTM! Appropriate browser environment check before hydration.
src/config.ts (7)
1-2: LGTM! Clean import paths using import map resolution. This centralizes version management and follows Deno best practices.

10-14: LGTM! Good use of fallback variables to support alternative environment variable names (`LLM_BASE_URL`, `LLM_API_BASE_URL`, etc.), providing backward compatibility for existing deployments.

15-26: LGTM! The `envBoolean` helper correctly addresses the boolean coercion issue. Key behaviors verified:

- `"false"`/`"0"` → `false` ✓
- `"true"`/`"1"` → `true` ✓
- Missing env var → uses default ✓
- Case-insensitive handling ✓

27-40: LGTM! Well-structured LLM configuration with:

- URL validation via `.url()` for `LMSTUDIO_BASE_URL`
- Sensible timeout bounds (1s–10min) with a 2-minute default
- Fallback support for alternative env var names

41-51: LGTM! Sensible telemetry defaults:

- `TELEMETRY_INCLUDE_LLM_OUTPUT` defaults to `false` (privacy-conscious default for potentially sensitive LLM outputs)
- Preview character limit (800) balances debuggability with log size

123-123: LGTM! `POLL_ENABLED` and `NQD_ENABLED` now use the `envBoolean` helper, ensuring consistent boolean parsing across all env-based feature flags.

Also applies to: 166-166

172-176: LGTM! Clean fail-fast validation at module load, with both the parsed config and schema exported for testing/documentation purposes.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/server/handler.ts`:
- Around line 856-877: The fetch to the LLM endpoint (the call that sets
timeoutMs and assigns upstream from fetch(`${LLM_BASE_URL.replace(/\/$/,
"")}/chat/completions`)) lacks a try-catch and will throw on timeouts or network
errors; wrap that fetch + reading of upstream.text() in a try-catch, catch
AbortError/other errors and return a proper jsonResponse (matching the existing
error shape used elsewhere, e.g., error: "llm_error", status: 502 or appropriate
upstream.status if available, and include safeJson(error.message) or the error
details) instead of letting exceptions propagate; ensure you still handle the
upstream.ok branch the same way inside the try block.
♻️ Duplicate comments (3)
src/ui/islands/Studio.tsx (2)
70-73: Number input validation has been addressed.

The `parsePositiveInt` helper properly validates parsed values using `Number.isFinite` and a minimum-bound check, returning the fallback value when input is invalid. This addresses the previously raised concern about `NaN` values in config state.
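A plausible shape for such a guard, shown as a sketch; the island's actual helper may differ in its minimum bound or signature.

```typescript
// Parse a positive integer from an <input> value; return the fallback when
// parsing yields NaN (or a non-finite value) or the result is below min.
function parsePositiveInt(raw: string, fallback: number, min = 1): number {
  const parsed = Number.parseInt(raw, 10);
  return Number.isFinite(parsed) && parsed >= min ? parsed : fallback;
}
```

Feeding the current state value in as the fallback is what keeps the last valid setting when the field is cleared or garbage is typed.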
735-808: Number input validation is properly implemented.

All numeric configuration inputs (`maxIterations`, `replicates`, `patchCandidates`) now use the `parsePositiveInt` helper to guard against `NaN` and invalid values, falling back to the current state value when input is invalid.

src/server/handler.ts (1)
1578-1597: LGTM — Division by zero protection confirmed.

The guard `total > 0 ? ... : 1` prevents division by zero, and the `Math.min(Math.max(...))` clamping ensures `candidateIdx` stays within valid bounds `[0, config.patchCandidates]`. This properly addresses the concern raised in the previous review.
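The two guards can be shown in isolation; function names here are illustrative, not the component's real identifiers.

```typescript
// Progress fraction with a guarded divisor: report 1 (complete) when there
// is nothing to evaluate instead of dividing by zero and producing NaN.
function tournamentProgress(completed: number, total: number): number {
  return total > 0 ? completed / total : 1;
}

// Clamp an index into [0, max] so out-of-range values cannot escape.
function clampIndex(idx: number, max: number): number {
  return Math.min(Math.max(idx, 0), max);
}
```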
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- src/server/handler.ts
- src/ui/islands/Studio.tsx
- src/ui/twind_client.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- src/ui/twind_client.ts
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2026-01-14T08:35:08.246Z
Learnt from: venikman
Repo: venikman/PromptAgent PR: 13
File: src/server/handler.ts:47-51
Timestamp: 2026-01-14T08:35:08.246Z
Learning: In the PromptAgent repository, `createApiHandler` in `src/server/handler.ts` is designed to be instantiated once per process in server entrypoints, so module-level caches (cachedEpics, cachedChampion, cachedOrchestrator, task stores) are safe and intentional for the single-handler deployment model.
Applied to files:
src/server/handler.ts
📚 Learning: 2026-01-06T19:14:53.230Z
Learnt from: CR
Repo: venikman/PromptAgent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T19:14:53.230Z
Learning: Applies to **/{generator,*agent*}.{ts,js} : Use `jsonPromptInjection: true` in structuredOutput configuration when LM Studio's `response_format` support is incomplete
Applied to files:
src/server/handler.ts
🧬 Code graph analysis (1)
src/ui/islands/Studio.tsx (3)
- src/ui/types.ts (2): `ScorerResult` (49-54), `TelemetrySnapshot` (75-96)
- src/ui/components/ai-elements/prompt-input.tsx (5): `PromptInput` (6-16), `PromptInputBody` (20-22), `PromptInputTextarea` (26-37), `PromptInputFooter` (41-52), `PromptInputSubmit` (56-69)
- src/ui/components/ai-elements/message.tsx (3): `Message` (13-22), `MessageContent` (26-38), `MessageResponse` (42-51)
🔇 Additional comments (23)
src/ui/islands/Studio.tsx (11)
1-58: LGTM! Imports now use import map entries as previously requested. Type definitions are well-structured and provide good type safety for the component's state management.

93-122: LGTM! Clean component with proper defensive rendering for optional `StoryPoints`. Key generation approach is acceptable given the data context.

124-171: LGTM! Clean result rendering with proper fallback handling for missing `storyPack`. Error display is appropriately styled.

193-215: LGTM! Well-structured reusable panel component with clear prop interface.

217-280: LGTM! State management is well-organized. Memoized derived values have correct dependencies and sensible fallback behavior.

281-367: LGTM! Proper use of `AbortController` with a timeout for the health check. The cancellation-flag pattern correctly prevents state updates after unmount. Parallel loading of initial data is efficient.

369-391: LGTM! Telemetry polling correctly uses `globalThis.setInterval`. Silent error handling is acceptable for non-critical telemetry data.

393-464: LGTM! Both async handlers have proper validation, error handling, and loading-state management. The pattern of clearing previous state before starting new operations prevents stale data display.

466-505: LGTM! Optimization task polling correctly handles terminal states and cleanup. Uses `globalThis.setInterval` as addressed in the previous review.

678-693: LGTM! Champion refresh correctly preserves and restores the previous value on fetch failure, addressing the previously raised concern about the UI getting stuck in a loading state.

513-886: LGTM! JSX rendering is well-structured with proper semantic HTML, accessible form controls, and clear visual feedback for loading and error states. The `MessageResponse` components no longer use the unused `mode` prop.

src/server/handler.ts (12)
1-64: LGTM! Imports are well-organized, and the `normalizeMaxTokens` utility correctly validates input boundaries. Module-level configuration is appropriate per the single-handler deployment model documented in learnings.

124-153: LGTM — TTL cleanup implementation is solid. The cleanup logic correctly guards against missing `completedAt` and invalid date parsing. The 10-minute sweep interval for a 1-hour TTL is a reasonable balance between memory efficiency and CPU overhead.

188-236: LGTM! Lazy initialization with caching is appropriate. Error paths return safe defaults (empty arrays/strings), and the orchestrator is properly invalidated when the champion is updated (line 509).

238-280: LGTM! CORS handling correctly differentiates between deployed and development environments. The `Vary: Origin` header is properly added when using explicit origin matching, which ensures correct caching behavior.

298-390: LGTM! The acceptance criteria parser handles multiple formats with a sensible fallback chain. The minimum-length filter prevents noise from being included, and the GWT scenario grouping using `→` separators produces readable output.

422-432: LGTM — Security improvement confirmed. The `keyPrefix` field removal addresses the previous security concern. Exposing only `hasKey` and `keyLength` is acceptable for debugging purposes.

630-639: LGTM — Timeout protection confirmed. The `AbortSignal.timeout(timeoutMs)` addresses the previous concern about indefinite hangs on LLM requests. The try-catch at line 622 properly handles timeout/network errors.

924-945: LGTM! The background evaluation task properly catches exceptions and updates task status to "failed" with the error message. The `completedAt` timestamp is set in both success and failure paths, enabling proper TTL cleanup.

975-1077: LGTM! Both endpoints validate required inputs and follow the consistent error-handling pattern. The transformation from the internal `ContrastPair` to the frontend format in `/mine-pairs` is straightforward.

1129-1222: LGTM! The tournament execution properly evaluates the champion first (ensuring `candidates` is never empty), then iterates through patches. Error handling in the background task is comprehensive, and the winner selection correctly sorts by objective score.

1268-1483: LGTM! V2 endpoints correctly leverage the orchestrator abstraction and Deno KV for persistence. The `maxTokens` validation is consistently applied across all endpoints that accept it, and background tasks properly use `failTask()` to record errors.

1661-1665: LGTM! The fallback 404 response and consistent CORS header application complete the handler cleanly.
```ts
const timeoutMs = env.LLM_TIMEOUT_MS;
const upstream = await fetch(
  `${LLM_BASE_URL.replace(/\/$/, "")}/chat/completions`,
  {
    method: "POST",
    headers,
    body: JSON.stringify(requestBody),
    signal: AbortSignal.timeout(timeoutMs),
  },
);

const raw = await upstream.text();
if (!upstream.ok) {
  return jsonResponse(
    {
      error: "llm_error",
      status: upstream.status,
      detail: safeJson(raw),
    },
    upstream.status,
  );
}
```
Missing try-catch around LLM fetch — timeout/network errors will propagate uncaught.
Unlike `/generate-story` (line 622), this fetch call isn't wrapped in a try-catch. If the LLM request times out (`AbortError`) or fails due to network issues, the error propagates as an unhandled exception rather than returning a proper JSON error response.
🛡️ Suggested fix

```diff
    headers["authorization"] = `Bearer ${LLM_API_KEY}`;
  }
+ try {
    const timeoutMs = env.LLM_TIMEOUT_MS;
    const upstream = await fetch(
      `${LLM_BASE_URL.replace(/\/$/, "")}/chat/completions`,
      {
        method: "POST",
        headers,
        body: JSON.stringify(requestBody),
        signal: AbortSignal.timeout(timeoutMs),
      },
    );
    const raw = await upstream.text();
    if (!upstream.ok) {
      return jsonResponse(
        {
          error: "llm_error",
          status: upstream.status,
          detail: safeJson(raw),
        },
        upstream.status,
      );
    }
    const data = safeJson(raw) as {
      choices?: Array<{ message?: { content?: string } }>;
    };
    const content = data?.choices?.[0]?.message?.content ?? "";
    return jsonResponse({
      provider: "lm-studio",
      model,
      response: content,
    });
+ } catch (err) {
+   return jsonResponse(
+     { error: err instanceof Error ? err.message : String(err) },
+     500,
+   );
+ }
```

🤖 Prompt for AI Agents
}🤖 Prompt for AI Agents
In `@src/server/handler.ts` around lines 856 - 877, The fetch to the LLM endpoint
(the call that sets timeoutMs and assigns upstream from
fetch(`${LLM_BASE_URL.replace(/\/$/, "")}/chat/completions`)) lacks a try-catch
and will throw on timeouts or network errors; wrap that fetch + reading of
upstream.text() in a try-catch, catch AbortError/other errors and return a
proper jsonResponse (matching the existing error shape used elsewhere, e.g.,
error: "llm_error", status: 502 or appropriate upstream.status if available, and
include safeJson(error.message) or the error details) instead of letting
exceptions propagate; ensure you still handle the upstream.ok branch the same
way inside the try block.
Addressed review items: guarded tournament progress division-by-zero edge case, validated optimization number inputs to avoid NaN, and fixed twind_client init typing (typecheck clean). |
Summary by CodeRabbit
New Features
Refactor
Documentation
Chores
✏️ Tip: You can customize this high-level summary in your review settings.