Conversation
Move UI to Fresh, add LLM telemetry previews, improve generation parsing, and update tests/config.
Note: CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Consolidates the server into an API-first entrypoint, adds a telemetry subsystem, threads LLM timeouts and telemetry through AI calls, and introduces a Fresh/Preact UI.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Browser as Client (Browser)
    participant Main as Server (src/server/main.ts)
    participant Router as Route classifier (isApiRoute)
    participant API as createApiHandler
    participant Telemetry as src/telemetry.ts
    participant LLM as LLM Service
    Browser->>Main: HTTP request
    Main->>Router: classify route (API vs UI)
    alt API route
        Router->>API: forward request
        API->>Telemetry: recordHttpRequest(start)
        API->>Telemetry: recordAiStart (if AI call)
        API->>LLM: call LLM with AbortSignal.timeout
        LLM-->>API: response / error
        API->>Telemetry: recordAiResponse
        API-->>Main: Response
    else UI route
        Router->>Main: serve Fresh UI snapshot
        Main-->>Browser: HTML/JS
    end
    Main->>Telemetry: recordHttpRequest(end with status,duration)
```
```mermaid
sequenceDiagram
    participant Caller as Module (generator/judge/meta-evo)
    participant Wrapper as withAiTelemetry
    participant Telemetry as src/telemetry.ts
    participant LLM as LLM API
    Caller->>Wrapper: withAiTelemetry({name,model}, fn)
    Wrapper->>Telemetry: recordAiStart(name)
    Wrapper->>LLM: invoke fn() with AbortSignal.timeout(LLM_TIMEOUT_MS)
    alt success
        LLM-->>Wrapper: payload + tokenUsage
        Wrapper->>Telemetry: recordAiResponse(success=true, text, tokenUsage)
        Wrapper-->>Caller: result
    else error/timeout
        LLM-->>Wrapper: error
        Wrapper->>Telemetry: recordAiResponse(success=false, error)
        Wrapper-->>Caller: throw/error result
    end
```
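To make the diagram above concrete, here is a hedged, self-contained sketch of a `withAiTelemetry` wrapper. The real exports of src/telemetry.ts are not shown in this PR view, so the recorder functions (stubbed as in-memory arrays here) and the 120s default timeout are assumptions:

```typescript
interface AiCallMeta {
  name: string;
  model: string;
}

// In-memory stand-ins for the assumed src/telemetry.ts recorders.
const events: string[] = [];

function recordAiStart(name: string): void {
  events.push(`start:${name}`);
}

function recordAiResponse(name: string, success: boolean): void {
  events.push(`end:${name}:${success}`);
}

// Records start/end telemetry around an LLM call and threads an
// AbortSignal so a hung call fails after timeoutMs.
async function withAiTelemetry<T>(
  meta: AiCallMeta,
  fn: (signal: AbortSignal) => Promise<T>,
  timeoutMs = 120_000, // assumed LLM_TIMEOUT_MS default
): Promise<T> {
  recordAiStart(meta.name);
  try {
    const result = await fn(AbortSignal.timeout(timeoutMs));
    recordAiResponse(meta.name, true);
    return result;
  } catch (err) {
    recordAiResponse(meta.name, false);
    throw err;
  }
}
```

The error path rethrows after recording, matching the diagram's "throw/error result" branch.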
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Pull request overview
This PR removes the Vite-based UI deployment integration and migrates to a Fresh framework-based UI architecture. The changes eliminate the React/Vite frontend components and replace them with a Deno-native Fresh setup using Preact and Twind for styling.
Changes:
- Removed entire Vite/React UI implementation including all playground, evaluation, evolution, and methodology components
- Added Fresh framework-based UI with server-side rendering capabilities
- Introduced telemetry tracking system for HTTP and AI operations
- Updated test imports to reflect new source file locations
- Added new UI components using Preact and Twind CSS
Reviewed changes
Copilot reviewed 137 out of 152 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| ui/* (deleted) | Removed entire Vite-based React UI including all components, configuration, and build files |
| src/ui/* (added) | Added Fresh framework-based UI with Preact components, Twind styling, and SSR support |
| src/telemetry.ts (added) | Added comprehensive telemetry system for tracking HTTP requests and AI operations |
| tests/* | Updated import paths to reflect reorganized source structure |
| src/schema.ts | Updated zod import to use standard import instead of npm: specifier |
```ts
export function textSimilarity(
  textA: string,
  textB: string,
  dim = 512,
): number {
```

The function signature formatting change is inconsistent with the rest of the file. The parameters were split across multiple lines, but other similar functions in the file (like `hashVector` and `cosine`) keep their parameters on a single line or use different formatting. Consider maintaining consistent formatting throughout the file.
```diff
-export function textSimilarity(
-  textA: string,
-  textB: string,
-  dim = 512,
-): number {
+export function textSimilarity(textA: string, textB: string, dim = 512): number {
```
Deno fmt enforces the multiline signature here due to line length, so keeping formatter output for consistency with repo formatting.
```diff
@@ -1,4 +1,4 @@
-import { z } from "npm:zod@4.3.5";
+import { z } from "zod";
```
Changed from npm: specifier to bare import. Ensure deno.json has the correct import map entry for 'zod' to resolve to 'npm:zod@4.3.5' or the intended version to maintain dependency version consistency.
| import { z } from "zod"; | |
| import { z } from "npm:zod@4.3.5"; |
deno.json maps zod to npm:zod@4.3.5, so the bare import stays pinned to the intended version.
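For reference, a bare `zod` import resolves through the import map. A minimal deno.json sketch of the mapping the reply describes (other entries omitted):

```json
{
  "imports": {
    "zod": "npm:zod@4.3.5"
  }
}
```

With this entry in place, the bare `import { z } from "zod"` stays pinned to the same version the old `npm:` specifier used.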
Summary of Changes

Hello @venikman, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request modernizes the project's architecture by transitioning to the Deno Fresh framework for both UI and API serving. This change aims to simplify the development and deployment process, enhance performance through server-side rendering, and provide better observability into system operations with a new telemetry module. The refactoring consolidates server-side logic and updates build tooling to align with the new unified Deno environment.
Code Review
This is an excellent and substantial refactoring. The move from a separate Vite UI to an integrated Fresh application greatly simplifies the architecture, development workflow, and deployment process. The introduction of a dedicated telemetry system and Playwright for E2E tests are fantastic additions that will significantly improve observability and reliability. The code organization is also much cleaner with the new src/server and top-level tests directories.
I've left a few comments on specific areas for improvement, including a critical one about re-introducing production safety checks that were present in the previous version. Once those are addressed, this will be a massive step forward for the project.
| import { fromFileUrl } from "@std/path"; | ||
| import { app } from "../src/ui/app.ts"; | ||
| import { env } from "../src/config.ts"; | ||
|
|
||
| // Create scorer instance (reused across requests) | ||
| const scorer = createStoryDecompositionScorer(); | ||
|
|
||
| // ───────────────────────────────────────────────── | ||
| // Environment Detection & Validation | ||
| // ───────────────────────────────────────────────── | ||
| const isDeployed = !!Deno.env.get("DENO_DEPLOYMENT_ID"); | ||
|
|
||
| // OpenAI-compatible API (OpenRouter, LM Studio, etc.) | ||
| // Support both LLM_BASE_URL and legacy LLM_API_BASE_URL | ||
| const LLM_BASE_URL = | ||
| Deno.env.get("LLM_BASE_URL") ?? | ||
| Deno.env.get("LLM_API_BASE_URL") ?? | ||
| "http://localhost:1234/v1"; | ||
| const LLM_API_KEY = Deno.env.get("LLM_API_KEY") ?? ""; | ||
| const LLM_MODEL = Deno.env.get("LLM_MODEL") ?? "openai/gpt-oss-120b"; | ||
|
|
||
| // Production validation - fail fast if config is missing | ||
| if (isDeployed) { | ||
| if (!Deno.env.get("LLM_BASE_URL") && !Deno.env.get("LLM_API_BASE_URL")) { | ||
| throw new Error( | ||
| "LLM_BASE_URL (or LLM_API_BASE_URL) is required in production. " + | ||
| "Set it in Deno Deploy → Settings → Environment Variables. " + | ||
| "Example: https://openrouter.ai/api/v1", | ||
| ); | ||
| } | ||
| if (LLM_BASE_URL.includes("localhost")) { | ||
| throw new Error( | ||
| "LLM_BASE_URL cannot be localhost in production. " + | ||
| "Configure a real API endpoint like OpenRouter or OpenAI.", | ||
| ); | ||
| } | ||
| if (!LLM_API_KEY) { | ||
| throw new Error( | ||
| "LLM_API_KEY is required in production. " + | ||
| "Set it in Deno Deploy → Settings → Environment Variables.", | ||
| ); | ||
| } | ||
| } | ||
| import { createApiHandler, type ApiConfig } from "../src/server/handler.ts"; | ||
| import { | ||
| normalizePath, | ||
| recordHttpRequest, | ||
| startTelemetryReporter, | ||
| } from "../src/telemetry.ts"; |
This refactoring significantly simplifies the deployment entrypoint, which is great. However, it seems the production safety checks from the previous version of this file have been removed. The old deploy/main.ts included crucial fail-fast validation to ensure that LLM_BASE_URL was set to a non-localhost value and LLM_API_KEY was present in a production environment (detected via DENO_DEPLOYMENT_ID).
Without these checks, a misconfigured deployment could start successfully but then fail at runtime when trying to contact a localhost LLM. It's critical to reintroduce these startup validations to prevent production failures.
import { fromFileUrl } from "@std/path";
import { app } from "../src/ui/app.ts";
import { env } from "../src/config.ts";
import { createApiHandler, type ApiConfig } from "../src/server/handler.ts";
import {
normalizePath,
recordHttpRequest,
startTelemetryReporter,
} from "../src/telemetry.ts";
// Production safety checks
const isDeployed = !!Deno.env.get("DENO_DEPLOYMENT_ID");
if (isDeployed) {
if (!env.LMSTUDIO_BASE_URL || env.LMSTUDIO_BASE_URL.includes("localhost")) {
throw new Error(
"LMSTUDIO_BASE_URL must be set to a non-localhost value in production.",
);
}
if (!env.LMSTUDIO_API_KEY || env.LMSTUDIO_API_KEY === "lm-studio") {
throw new Error(
"LMSTUDIO_API_KEY must be set in production.",
);
}
}There was a problem hiding this comment.
Added production checks for LMSTUDIO_BASE_URL (non-localhost) and LMSTUDIO_API_KEY, and removed key prefix logging.
deno.json (outdated)
| "test:e2e:install": "deno run -A --node-modules-dir npm:@playwright/test@1.57.0 install", | ||
| "test-server": "deno run -A scripts/test-server.ts", | ||
| "typecheck": "deno check deploy/main.ts src/**/*.ts", | ||
| "typecheck": "deno check deploy/main.ts src/server/main.ts src/cli/optimize.ts src/cli/generate.ts src/cli/eval-dist.ts", |
The typecheck task has been updated, but its scope seems to have been significantly reduced. The previous version checked src/**/*.ts, ensuring all TypeScript files in the source directory were type-checked. The new version only checks a few specific entrypoint files. This could allow type errors in other modules to go unnoticed. I recommend restoring the broader scope to maintain type safety across the entire project.
| "typecheck": "deno check deploy/main.ts src/server/main.ts src/cli/optimize.ts src/cli/generate.ts src/cli/eval-dist.ts", | |
| "typecheck": "deno check deploy/main.ts src/**/*.ts", |
Updated typecheck task to check deploy/main.ts and src/**/*.ts.
```ts
export function executePairMiner(
  input: PairMinerInput,
  ctx: ToolContext,
): Promise<ToolResult<PairMinerOutput>> {
```
This function was changed from async to a synchronous function that returns a resolved promise. While functionally correct, this is inconsistent with other tool functions like executePatcher and executeEvaluator which are async. For better code consistency and readability across the tools, I'd suggest making this function async as well and removing the Promise.resolve() wrappers on the return statements.
```diff
-export function executePairMiner(
+export async function executePairMiner(
   input: PairMinerInput,
   ctx: ToolContext,
 ): Promise<ToolResult<PairMinerOutput>> {
```
Keeping executePairMiner synchronous since it has no awaits and Deno lint require-await would fail; Promise.resolve keeps the ToolResult promise shape.
Actionable comments posted: 20
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
evaluation-methodology.md (3)
361-474: Remove internal conversation history from documentation. This section contains raw conversation logs, internal AI context tracking, and verbatim user messages with numerous typos. This content is inappropriate for formal methodology documentation and should be removed entirely.
The section includes:
- Internal file paths and pending tasks
- Verbatim user messages with spelling errors
- AI context tracking ("Current Work", "Pending Tasks", "Optional Next Step")
- Implementation details that belong in internal notes, not user-facing docs
📝 Proposed fix
Remove this entire section (lines 361-474). If any information from this section is valuable for the documentation, extract and rewrite it formally in the appropriate section above.
```diff
-I want you to usi FPF to come up with evaluator for you proeject. So we would
-have custon uniqui, basedon SOTA mete-mate evaluateors. before we wil mae a
-choce with which new method to adotp
-
-1. Primary Request and Intent:
-  ...
-  [entire section through line 474]
```
552-552: Complete the truncated sentence. The sentence ends abruptly with "for every as" - this appears to be cut off mid-word or mid-phrase.
✍️ Suggested completion
Based on context, this likely should read:
```diff
-- **Auditable evidence** (SCR) for every as
+- **Auditable evidence** (SCR) for every assurance claim
```
554-567: Restructure or remove misplaced implementation tasks. This section contains two structural issues:
Lines 554-565: The "Medium Priority" section lists implementation tasks (endpoints, buttons, storage) that don't belong in a methodology document. These appear to be development TODOs that should be tracked elsewhere (e.g., GitHub issues, project management tools).
Line 567: The orphaned text fragment "rance claim" is clearly broken and appears to be a continuation error from line 552's incomplete sentence.
🔧 Proposed fix
Option 1 (Recommended): Remove the entire section and track implementation tasks in a separate location:
-### Medium Priority - -- Wire up scorer to evaluate generated stories -... -- Download buttons in Stories tab - -rance claimOption 2: If these tasks must remain, move them to a separate "Implementation Roadmap" appendix with proper context.
For line 567, simply remove the orphaned fragment as it's addressed by fixing line 552.
src/cli/optimize.ts (1)
186-201: Fire-and-forget async calls may silently lose important errors. Both `saveChampion` (line 190) and `writeFile` (line 196) are called without `await` inside the callback, using only `.catch()` for error handling. While errors are logged, failure to save a promoted champion could lead to data loss without the optimization loop being aware.

Consider tracking these promises and checking their status, or at minimum ensuring critical operations like saving promoted champions are more robustly handled.
tests/deployment.test.ts (1)
43-52: Test fails in CI: `.env` file doesn't exist in pipeline environments. The pipeline failure shows this test fails because CI environments typically don't have a `.env` file. Consider either:

- Skipping this test in CI environments
- Using `Deno.test` with `ignore` based on environment detection
- Changing the assertion to be informational rather than failing
🛠️ Suggested fix: Skip test in CI

```diff
 Deno.test(
-  "Environment - should have .env file for local development",
+  {
+    name: "Environment - should have .env file for local development",
+    ignore: Deno.env.get("CI") === "true",
+  },
   async () => {
     const envPath = join(ROOT_DIR, ".env");
     assert(
       await fileExists(envPath),
       ".env should exist for local development",
     );
   },
 );
```

src/orchestrator/optimization-loop.ts (1)
356-382: Add `maxTokens` to the `evaluateChampion` call for consistency. The `evaluateChampion` method (lines 365-372) does not pass `maxTokens` to `executeEvaluator`, while `runIteration` (line 174) does pass it. Since `maxTokens` is documented as "Override max tokens per generation" and is part of the config, the initial champion evaluation should also respect this setting to ensure consistent token constraints across all evaluations.

```diff
 const result = await executeEvaluator(
   {
     promptText: composedPrompt,
     epics: this.epics,
     replicates: this.config.replicates,
     seedBase: env.EVAL_SEED_BASE,
+    maxTokens: this.config.maxTokens,
     concurrency: this.config.concurrency,
     onProgress: this.config.onProgress,
   },
   ctx,
 );
```
🤖 Fix all issues with AI agents
In `@deno.json`:
- Line 43: The "test:unit" npm script's glob is being expanded by the shell so
Deno never gets to apply --ignore; update the "test:unit" script (the
"test:unit" key in deno.json) to pass the glob and ignore patterns as single
arguments to Deno by quoting or escaping them so the shell won't expand (e.g.,
wrap tests/**/*.test.ts and tests/api in quotes/escaped form), letting Deno
handle file discovery and the --ignore filter.
In `@playwright.config.ts`:
- Line 18: The Playwright config uses the Node-only global process in the
reuseExistingServer setting; replace process.env.CI with Deno.env.get("CI") and
coerce it to a boolean so reuseExistingServer becomes
!Boolean(Deno.env.get("CI")) (update the reuseExistingServer assignment in the
config object).
In `@src/config.ts`:
- Around line 35-36: z.coerce.boolean() will coerce string "false" to true;
replace both TELEMETRY_LOG_REQUESTS and TELEMETRY_INCLUDE_LLM_OUTPUT uses with a
deterministic string-to-boolean helper (e.g., envBoolean) that parses "true"/"1"
as true and everything else as false and returns a boolean default; add a helper
function (envBoolean) and use TELEMETRY_LOG_REQUESTS: envBoolean(true) and
TELEMETRY_INCLUDE_LLM_OUTPUT: envBoolean(false) so env vars like "false" behave
as expected.
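A minimal sketch of such a helper; the name `envBoolean` comes from the comment above, and the accepted truthy spellings are an assumption:

```typescript
// Deterministic env-string parser: only "true" and "1" are truthy, so an
// env var set to "false" no longer coerces to true the way
// z.coerce.boolean() would for any non-empty string.
function envBoolean(raw: string | undefined, fallback: boolean): boolean {
  if (raw === undefined || raw === "") return fallback;
  return raw === "true" || raw === "1";
}
```

In the schema, this could back `TELEMETRY_LOG_REQUESTS` (default true) and `TELEMETRY_INCLUDE_LLM_OUTPUT` (default false) via a preprocess step.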
In `@src/judge/promptagent-fpf-judge.ts`:
- Around line 368-379: The structuredOutput config passed to this.agent.generate
(invoked inside withAiTelemetry for "fpf-judge") currently sets only schema:
LLMSubscoresSchema; update that structuredOutput object to include
jsonPromptInjection: true and errorStrategy: 'strict' alongside the existing
schema, keeping the abortSignal and the rest of the call intact so the call
signature for this.agent.generate and the surrounding withAiTelemetry block are
unchanged.
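The requested shape, sketched as a fragment of the existing options object (the surrounding `this.agent.generate` call, its other arguments, and the `abortSignal` are unchanged and not reproduced here):

```ts
structuredOutput: {
  schema: LLMSubscoresSchema,
  jsonPromptInjection: true,
  errorStrategy: "strict",
},
```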
In `@src/server/handler.ts`:
- Around line 84-85: The in-memory Maps evalTasks, tournamentTasks, and
optimizationTasks grow without bounds; add a TTL-based cleanup: introduce a
TASK_TTL_MS constant and a cleanupOldTasks generic function that iterates a
given Map (e.g., evalTasks, tournamentTasks, optimizationTasks) and deletes
entries whose completedAt timestamp is older than TTL, then schedule periodic
invocation (setInterval) of cleanupOldTasks for each store; ensure completed
tasks set completedAt when finishing so cleanup can remove them.
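A minimal sketch of the TTL sweep described above; the `completedAt` representation (epoch millis) and the one-hour TTL are assumptions:

```typescript
const TASK_TTL_MS = 60 * 60 * 1000; // assumed 1h retention

interface TaskEntry {
  completedAt?: number; // epoch millis; unset while the task is running
}

// Deletes completed entries older than the TTL; deleting from a Map
// while iterating it is safe in JavaScript.
function cleanupOldTasks<T extends TaskEntry>(
  store: Map<string, T>,
  now: number = Date.now(),
): number {
  let removed = 0;
  for (const [id, task] of store) {
    if (task.completedAt !== undefined && now - task.completedAt > TASK_TTL_MS) {
      store.delete(id);
      removed++;
    }
  }
  return removed;
}
```

Each of the three task maps could be swept on a `setInterval`, leaving running tasks (no `completedAt`) untouched.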
- Around line 781-788: The fetch call that assigns to upstream
(fetch(`${LLM_BASE_URL.replace(/\/$/, "")}/chat/completions`, { method: "POST",
headers, body: JSON.stringify(requestBody) })) has no timeout and can hang; wrap
this fetch with an AbortController, start a timer that calls controller.abort()
after the standard request timeout used elsewhere, pass controller.signal to
fetch, clear the timer on success, and handle the abort/timeout error path to
return a controlled timeout response; reference the upstream variable,
LLM_BASE_URL, requestBody and headers when making these changes.
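A sketch of the abort wrapper; the injectable `fetchFn` parameter (used here so the pattern can be exercised without a network) is an illustration choice, not the repo's API:

```typescript
type FetchLike = (url: string, init?: RequestInit) => Promise<Response>;

// Wraps a fetch-style call with an AbortController so a hung upstream
// request fails after timeoutMs instead of blocking forever.
async function fetchWithTimeout(
  url: string,
  init: RequestInit,
  timeoutMs: number,
  fetchFn: FetchLike = fetch,
): Promise<Response> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetchFn(url, { ...init, signal: controller.signal });
  } finally {
    clearTimeout(timer); // always clear the timer, success or failure
  }
}
```

The handler would call this for the `/chat/completions` proxy and map the abort error to a controlled timeout response.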
- Around line 359-369: The /debug/config handler currently exposes a partial API
key via LLM_API_KEY.slice(...) (keyPrefix) in the jsonResponse, which is a
security risk; remove the keyPrefix field (or return a boolean hasKey only) from
the jsonResponse inside the url.pathname === "/debug/config" branch
(function/handler logic using jsonResponse and LLM_API_KEY, LLM_BASE_URL,
LLM_MODEL) or add an authentication check before returning debug info so only
authorized requests see any sensitive data; update the response to include only
non-sensitive fields (e.g., baseUrl, model, hasKey) and ensure no slices or
substrings of LLM_API_KEY are returned.
In `@src/ui/components/ai-elements/message.tsx`:
- Line 1: The import uses an inline "npm:" specifier which violates lint rules;
change the import in src/ui/components/ai-elements/message.tsx to import from
the configured package name (e.g., import type { ComponentChildren,
ComponentProps } from "preact";) and ensure "preact" is mapped in your import
map or deno.json so the runtime/resolver resolves it properly; update any
build/config (import_map.json or deno.json) if necessary to point "preact" to
the npm package or local alias.
In `@src/ui/components/ai-elements/prompt-input.tsx`:
- Line 1: Replace the inline npm specifier in the import for ComponentProps:
instead of importing from "npm:preact@10.28.2" update the import to use the
project import-map alias (for example "react" which maps to "preact/compat", or
add a "preact" mapping in deno.json and import from "preact"); change the import
target used in the file (the ComponentProps import) to the mapped alias so the
linter no longer sees an inline "npm:" specifier.
In `@src/ui/islands/Studio.tsx`:
- Line 472: The code uses window.setInterval to start polling (assignment to
intervalId = window.setInterval(poll, 2500)), which breaks compatibility with
Deno and Web Workers; change the call to use globalThis.setInterval or bare
setInterval instead (e.g., replace window.setInterval with
globalThis.setInterval or setInterval) so intervalId is set in a
platform-agnostic way while keeping the same poll callback and delay.
- Line 365: The code uses window.setInterval when creating intervalId which
breaks Deno/Web Worker compatibility because window may be undefined; change the
call in the Studio component (where intervalId is assigned from
window.setInterval(poll, 5000)) to use globalThis.setInterval or the bare
setInterval instead so it works in all JS environments and keep the same poll
callback and interval variable (intervalId).
- Around line 1-2: The imports in Studio.tsx use inline npm: specifiers
(useEffect/useMemo/useState from "npm:preact@10.28.2/hooks" and
ComponentChildren from "npm:preact@10.28.2"), which static analysis disallows;
add import map entries mapping "preact" to "npm:preact@10.28.2" and
"preact/hooks" to "npm:preact@10.28.2/hooks" in your import_map.json or
deno.json, then update the imports in Studio.tsx to import from "preact" and
"preact/hooks" (e.g., import { useEffect, useMemo, useState } from
"preact/hooks"; import type { ComponentChildren } from "preact";) so the runtime
resolves via the import map.
In `@src/ui/polyfills.ts`:
- Line 1: The import uses an inline npm: specifier which violates policy; add an
import map entry mapping "linkedom" to "npm:linkedom@0.18.5" in your
deno.json/import_map.json, then update the import statement in
src/ui/polyfills.ts to import { DOMParser as LinkedomDOMParser, parseHTML } from
"linkedom" so the module is resolved via the import map rather than an inline
specifier.
In `@src/ui/routes/_app.tsx`:
- Around line 1-2: Imports for Fresh are using the wrong module specifiers;
update the two import lines so the Head symbol is imported from
"$fresh/runtime.ts" and the PageProps type is imported from "$fresh/server.ts"
instead of "@fresh/core/runtime" and "@fresh/core/render" respectively, ensuring
the named symbols (Head and PageProps) are preserved.
In `@src/ui/twind_middleware.ts`:
- Line 8: The module-level virtualSheet() stored in the const sheet causes race
conditions across concurrent requests; move sheet creation into the
twindMiddleware factory so each middleware invocation gets its own isolated
sheet: inside the twindMiddleware (the factory function that returns the
middleware), call virtualSheet() to create a local sheet and pass that instance
into setup(options, sheet) and the middleware logic instead of the module-scoped
sheet, ensuring sheet.reset()/collection operate on a per-request/middleware
instance rather than a shared singleton.
In `@src/ui/twind_shared.ts`:
- Around line 15-22: The module augmentation currently declares module
"npm:preact@10.28.2"; update that module string to match the import-map alias
you now use for preact (replace "npm:preact@10.28.2" with the project's preact
alias, e.g. "preact" or whatever alias is used in imports) so the JSX
DOMAttributes augmentation (interface DOMAttributes in this file) applies to the
real imported module; keep the existing interface members (class, className)
intact.
- Around line 1-3: Replace inline `npm:preact@10.28.2` specifiers with the
import map alias (e.g., "preact") in the import statements that bring in JSX,
VNode and options (the lines importing JSX, VNode and preactOptions); update
imports to use "preact" instead of "npm:preact@..." and ensure your import map
(deno.json or import_map.json) contains the mapping "preact":
"npm:preact@10.28.2" so static analysis no longer flags the inline specifiers.
🧹 Nitpick comments (34)
src/ui/dev.ts (1)
9-10: Consider using `"development"` mode for the dev script. This file is named `dev.ts` and uses `--watch` flags for hot reloading, suggesting it's intended for development. However, `mode` is set to `"production"`. If this is intentional (e.g., to test production builds locally), consider adding a comment to clarify. Otherwise, switching to `"development"` mode would enable typical dev features like better error messages and faster rebuilds.

♻️ Suggested change if development mode is intended

```diff
-const mode = "production";
+const mode = "development";
 app.config.mode = mode;
```

src/ui/static/styles.css (1)
1-1: Consider self-hosting fonts or adding `font-display: swap`. The external Google Fonts request can cause FOIT (Flash of Invisible Text) and may raise GDPR concerns for EU users due to the implicit IP transfer. The `display=swap` parameter mitigates FOIT, but consider self-hosting the fonts if privacy compliance is a concern.

scripts/dev.ts (2)
8-8: Consider: `src/ui` is a subdirectory of `src`. Watching `src` should already include `src/ui`. Unless Deno's `--watch` requires explicit subdirectory listing, this may be redundant.

Suggested simplification

```diff
-const watchTargets = ["src", "src/ui"].join(",");
+const watchTargets = "src";
```
15-17: Simplify: Environment spreading is redundant. Spreading `Deno.env.toObject()` into a new object has no effect here. You can either pass it directly or omit the `env` option entirely, since Deno inherits the current environment by default.

Suggested simplification

```diff
 const process = new Deno.Command("deno", {
   args: ["run", "-A", `--watch=${watchTargets}`, "src/server/main.ts"],
   cwd,
-  env: {
-    ...Deno.env.toObject(),
-  },
   stdout: "inherit",
   stderr: "inherit",
 }).spawn();
```

src/orchestrator/state/kv-store.ts (1)
302-316: Consider batching atomic deletes for efficiency. The `cleanupOldTasks` function performs individual atomic commits in a loop. For large cleanup operations, this could be slow. Consider batching deletes (Deno KV supports up to 10 operations per atomic transaction).

♻️ Optional: Batch deletes for better performance

```diff
 export async function cleanupOldTasks(olderThanMs: number): Promise<number> {
   const kv = await getKv();
   const cutoff = Date.now() - olderThanMs;
   let deleted = 0;
+  const toDelete: TaskRecord[] = [];
   for await (const entry of kv.list<TaskRecord>({ prefix: ["tasks"] })) {
     const task = entry.value;
     if (
       task.completedAt &&
       new Date(task.completedAt).getTime() < cutoff
     ) {
-      await kv.atomic()
-        .delete(["tasks", task.id])
-        .delete(["tasks:by-status", task.status, task.id])
-        .commit();
-      deleted++;
+      toDelete.push(task);
     }
   }
+  // Batch deletes (max 5 tasks per atomic = 10 operations)
+  for (let i = 0; i < toDelete.length; i += 5) {
+    const batch = toDelete.slice(i, i + 5);
+    const atomic = kv.atomic();
+    for (const task of batch) {
+      atomic.delete(["tasks", task.id]);
+      atomic.delete(["tasks:by-status", task.status, task.id]);
+    }
+    await atomic.commit();
+    deleted += batch.length;
+  }
   return deleted;
 }
```

src/ui/twind.config.ts (1)
4-12: Consider adding type annotation for the config export. The previous version used a typed `Config` constant. Removing the type annotation loses compile-time validation of the configuration shape. If Fresh requires the untyped export, this is acceptable; otherwise, consider restoring type safety.

♻️ Optional: Restore type safety

```diff
 import presetAutoprefix from "@twind/preset-autoprefix";
 import presetTailwind from "@twind/preset-tailwind";
+import type { Config } from "@twind/core";
-export default {
+export default {
   selfURL: import.meta.url,
   presets: [presetTailwind(), presetAutoprefix()],
   // ...
-};
+} satisfies Config;
```

deno.json (1)
import presetAutoprefix from "@twind/preset-autoprefix"; import presetTailwind from "@twind/preset-tailwind"; +import type { Config } from "@twind/core"; -export default { +export default { selfURL: import.meta.url, presets: [presetTailwind(), presetAutoprefix()], // ... -}; +} satisfies Config;deno.json (1)
73-81: Redundant include entry:src/uiis already covered bysrc.Line 78 includes
src/ui, butsrcis already included at line 75. This is redundant sincesrcencompasses all subdirectories includingsrc/ui.♻️ Suggested fix
"include": [ "deploy", "src", "data", "prompts", - "src/ui", "deno.json", "deno.lock" ],src/ui/components/ai-elements/message.tsx (2)
39-48:modeprop is defined but never used.
MessageResponsePropsextendsStreamdownPropswhich includes amodeproperty, but it's not utilized in the component. Either implement the intended behavior or remove it to avoid confusion.
41-47: Inconsistent prop naming:classNamevsclass.
MessageandMessageContentuseclass(Preact convention), butMessageResponseusesclassName. This creates an inconsistent API for consumers.♻️ Proposed fix for consistency
-export const MessageResponse = ({ className, ...props }: MessageResponseProps) => { +export const MessageResponse = ({ class: className, ...props }: MessageResponseProps) => { const mergedClassName = cn( "size-full whitespace-pre-wrap [&>*:first-child]:mt-0 [&>*:last-child]:mb-0", className, ); - return <div class={mergedClassName}>{props.children}</div>; + return <div class={mergedClassName} {...props} />; };src/ui/twind_middleware.ts (2)
35-35: Fragile type assertions on Twind snapshot internals.Casting
snapshot[1]andsnapshot[3]without validation couples this code to Twind's internal snapshot structure. A library update could silently break this.Consider adding runtime validation:
const precedences = snapshot[1]; if (!Array.isArray(precedences)) { throw new Error("Unexpected Twind snapshot format"); }Also applies to: 43-43
48-49: Consider escaping JSON in script tag to prevent injection.While
mappingscontains Twind-generated values, it's good practice to escape</script>sequences in inline JSON to prevent accidental breakout.🛡️ Defensive escaping
const stateTag = - `<script id="${STATE_ELEMENT_ID}" type="application/json">${JSON.stringify(mappings)}</script>`; + `<script id="${STATE_ELEMENT_ID}" type="application/json">${JSON.stringify(mappings).replace(/</g, "\\u003c")}</script>`;tests/config.test.ts (2)
166-177: Boolean test doesn't cover the string coercion edge case.

This test passes actual boolean values (`true`/`false`), but environment variables are strings. Consider testing with string values `"true"`, `"false"`, `"0"`, `"1"` to verify expected behavior.

🧪 Additional test case

```ts
Deno.test("Schema - should handle boolean string values correctly", () => {
  const testEnv = {
    ...Deno.env.toObject(),
    POLL_ENABLED: "true",
    NQD_ENABLED: "false",
  };
  const parsed = EnvSchema.parse(testEnv);
  assertEquals(parsed.POLL_ENABLED, true);
  assertEquals(parsed.NQD_ENABLED, false, "String 'false' should parse as false");
});
```
201-215: Missing test coverage for new configuration fields.

The `EnvSchema` was extended with `LLM_TIMEOUT_MS` and telemetry settings (`TELEMETRY_REPORT_INTERVAL_MS`, `TELEMETRY_LOG_REQUESTS`, `TELEMETRY_INCLUDE_LLM_OUTPUT`, `TELEMETRY_LLM_PREVIEW_CHARS`), but no tests cover these additions.

🧪 Suggested test additions

```ts
Deno.test("Defaults - should have sensible telemetry defaults", () => {
  const parsed = EnvSchema.parse({});
  assertEquals(parsed.LLM_TIMEOUT_MS, 120_000);
  assertEquals(parsed.TELEMETRY_REPORT_INTERVAL_MS, 60_000);
  assertEquals(parsed.TELEMETRY_LOG_REQUESTS, true);
  assertEquals(parsed.TELEMETRY_INCLUDE_LLM_OUTPUT, false);
  assertEquals(parsed.TELEMETRY_LLM_PREVIEW_CHARS, 800);
});

Deno.test("Schema - should validate LLM_TIMEOUT_MS bounds", () => {
  assertThrows(() => EnvSchema.parse({ LLM_TIMEOUT_MS: "500" })); // below min
  assertThrows(() => EnvSchema.parse({ LLM_TIMEOUT_MS: "700000" })); // above max
});
```

src/orchestrator/tools/meta-patcher-tool.ts (3)
147-149: Consider logging or recording failed patch generations.

The catch block silently returns `null` without any logging. For debugging production issues, consider at least a debug-level log or telemetry event for failed patch generations.

🔧 Optional: Add minimal error logging

```diff
-  } catch {
+  } catch (err) {
+    // Optional: Log for debugging without breaking flow
+    console.debug("[meta-patcher] Patch generation failed:", err);
     return null;
   }
```
248-251: Potential floating-point precision concern in success rate calculation.

The exponential moving average calculation looks correct. However, consider clamping the result to `[0, 1]` to prevent any floating-point drift over many iterations.

🔧 Optional: Clamp success rate

```diff
 mutation.successRate = alpha * (improved ? 1 : 0) + (1 - alpha) * mutation.successRate;
+mutation.successRate = Math.max(0, Math.min(1, mutation.successRate));
 mutation.fitness = mutation.successRate;
```
319-323: Mutation array is mutated in-place.

The function modifies the `mutations` array in-place at line 322 (`mutations[idx] = evolvedMutation`). This is intentional based on the return type, but consider documenting this side effect in the JSDoc, or returning a new array for immutability if the caller doesn't expect mutation.

src/ui/types.ts (1)
82-95: Consider extracting the `aiResponses` entry type for reusability.

The inline object type in the `aiResponses` array is fairly complex. Consider extracting it as a named type for reuse and clearer documentation.

♻️ Optional: Extract inline type

```diff
+export type AiResponseEntry = {
+  key: string;
+  at: string;
+  durationMs?: number;
+  success: boolean;
+  error?: string;
+  preview?: string;
+  previewTruncated?: boolean;
+  tokenUsage?: {
+    prompt?: number;
+    completion?: number;
+    total?: number;
+  };
+};
+
 export type TelemetrySnapshot = {
   type: "telemetry";
   at: string;
   periodMs: number;
   uptimeMs: number;
   http: TelemetryEntry[];
   ai: TelemetryEntry[];
-  aiResponses: {
-    key: string;
-    at: string;
-    durationMs?: number;
-    success: boolean;
-    error?: string;
-    preview?: string;
-    previewTruncated?: boolean;
-    tokenUsage?: {
-      prompt?: number;
-      completion?: number;
-      total?: number;
-    };
-  }[];
+  aiResponses: AiResponseEntry[];
 };
```

deploy/main.ts (1)
56-76: Consider consolidating the duplicated `API_PREFIXES` array across files.

The `API_PREFIXES` array is currently synchronized with all routes in `handler.ts`, but it's duplicated in both `deploy/main.ts` and `src/server/main.ts`. To prevent future sync issues, consolidate this list into a single shared location (e.g., a constants file) that both entry points can import from.

src/ui/main.ts (1)
13-13: Consider making the port configurable.

Hardcoding port 8000 may be limiting for deployment scenarios. Consider using an environment variable with a fallback.

♻️ Suggested improvement

```diff
-await app.listen({ port: 8000 });
+const port = Number(Deno.env.get("PORT")) || 8000;
+await app.listen({ port });
```

src/ui/twind_client.ts (1)
43-51: Non-null assertion on `sheet` may throw if the style element has no sheet.

The `styleEl.sheet!` assertion assumes the CSSStyleSheet is always available. However, `sheet` can be `null` if the style element isn't connected to the document, or in certain edge cases. Additionally, `parseInt` on malformed rule comments could produce `NaN`, which would silently propagate.

Consider adding defensive checks:

Proposed defensive handling

```diff
-  const target = styleEl.sheet!;
+  const target = styleEl.sheet;
+  if (!target) {
+    console.warn("Twind: Unable to access stylesheet");
+    return;
+  }
   const ruleText = Array.from(target.cssRules).map((rule) => rule.cssText);
   for (const rule of ruleText) {
     const marker = rule.lastIndexOf("/*");
+    if (marker === -1) continue; // Skip rules without precedence marker
     const precedence = parseInt(rule.slice(marker + 2, -2), 36);
+    if (Number.isNaN(precedence)) continue;
     const rawRule = rule.slice(0, marker);
     rules.add(rawRule);
     precedences.push(precedence);
   }
```

playwright.config.ts (1)
20-25: Consider documenting the LMStudio environment variables.

The hardcoded environment variables appear to be local development defaults for LMStudio. Consider adding a comment explaining that these are test defaults, or document how to override them for different environments.
src/fpf/assurance-extractor.ts (1)
193-207: Consider removing redundant type assertions.

The type guards `isPoLLMetricInfo` and `isPromptAgentJudgeInfo` already narrow the type, making the explicit casts at lines 196 and 205 unnecessary.

♻️ Suggested simplification

```diff
 if (preprocessResult.pollResult?.info) {
   if (isPoLLMetricInfo(preprocessResult.pollResult.info)) {
-    return extractFromPoLL(
-      preprocessResult.pollResult.info as PoLLMetricInfo,
-    );
+    return extractFromPoLL(preprocessResult.pollResult.info);
   }
 }

 // Fall back to single-judge FPF
 if (preprocessResult.fpfJudgeResult?.info) {
   if (isPromptAgentJudgeInfo(preprocessResult.fpfJudgeResult.info)) {
     return extractFromSingleJudge(
-      preprocessResult.fpfJudgeResult.info as PromptAgentJudgeInfo,
+      preprocessResult.fpfJudgeResult.info,
     );
   }
 }
```

src/server/main.ts (1)
72-83: Consider extracting the duplicated telemetry logging.

The JSON logging block is duplicated between success and error paths. A small helper could reduce repetition.

♻️ Suggested extraction

```ts
const logHttpRequest = (method: string, route: string, status: number, durationMs: number) => {
  if (env.TELEMETRY_LOG_REQUESTS) {
    console.log(
      JSON.stringify({
        type: "http",
        at: new Date().toISOString(),
        method,
        route,
        status,
        durationMs: Math.round(durationMs),
      }),
    );
  }
};
```

Also applies to: 92-103
src/patchEngineer.ts (1)
162-165: Short patch filtering prevents trivial outputs.

Skipping patches shorter than 20 characters is reasonable to filter out empty or meaningless responses. However, this threshold is a magic number.

Consider extracting this to a named constant or configuration value for clarity and easier adjustment.

💡 Optional: Extract magic number to constant

```diff
+const MIN_PATCH_LENGTH = 20;
+
 export async function generatePatchCandidates(
   params: Omit<GeneratePatchParams, "temperature">,
   count: number = env.OPT_PATCH_CANDIDATES,
 ): Promise<string[]> {
   // ...
   // Skip empty or very short patches
-  if (patch.length >= 20) {
+  if (patch.length >= MIN_PATCH_LENGTH) {
     candidates.push(patch);
   }
```

AGENTS.md (1)
39-67: Add a language specifier to the directory tree code block.

The static analysis tool flags this code block as missing a language identifier. While directory trees don't have a specific language, using `text` or `plaintext` satisfies the linter and is a common convention.

💡 Suggested fix

````diff
-```
+```text
 src/
   cli/
````

src/orchestrator/tools/pair-miner-tool.ts (1)
39-42: Consider retaining the `async` keyword for clarity.

The function was changed from `async function` to a regular function returning `Promise.resolve()`. While functionally equivalent since the body has no `await`, using `async` provides clearer intent and better stack traces on errors.

The current approach works correctly but is unconventional. If the synchronous nature is intentional for a specific reason (e.g., avoiding microtask queue overhead), this is fine. Otherwise, consider reverting to `async` for consistency with typical TypeScript patterns.

♻️ Optional: Revert to async pattern

```diff
-export function executePairMiner(
+export async function executePairMiner(
   input: PairMinerInput,
   ctx: ToolContext,
 ): Promise<ToolResult<PairMinerOutput>> {
   const startTime = Date.now();
   try {
     // ... unchanged ...
-    return Promise.resolve(
-      successResult(
-        { pairs, formattedContext, tiered: !!input.tieredMining },
-        ctx,
-        startTime,
-      ),
-    );
+    return successResult(
+      { pairs, formattedContext, tiered: !!input.tieredMining },
+      ctx,
+      startTime,
+    );
   } catch (error) {
-    return Promise.resolve(
-      failureResult<PairMinerOutput>(error, ctx, startTime),
-    );
+    return failureResult<PairMinerOutput>(error, ctx, startTime);
   }
 }
```

Also applies to: 79-90
src/meta-evolution/evolution-engine.ts (1)
581-591: Consider retaining the `async` keyword for consistency.

Similar to `pair-miner-tool.ts`, this function changed from `async` to a regular function returning a Promise. While it works correctly (since it just delegates to `engine.evolve()`), the `async` keyword provides clearer intent.

src/orchestrator/root.ts (1)
160-187: `runOptimization` and `resumeOptimization` no longer use the `async` keyword.

Both methods changed from `async function` to regular functions returning Promises. They immediately return the result of `loopAgent.execute()`, which is itself a Promise.

This is functionally correct but inconsistent with the `async runPlayground` and `async runEvaluation` methods in the same class. Consider either:

- Making all methods consistently `async`, or
- Making all non-awaiting methods return Promises directly

The current mixed approach works but may confuse readers.
Also applies to: 192-222
src/orchestrator/nqd-tournament.ts (1)
59-70: Unused configuration option `includeIllumination`.

The `includeIllumination` field in `NQDTournamentConfig` is defined but never read or used in `runNQDTournament` or anywhere else in this file. Since `IlluminationTelemetry` was removed from imports, this field appears to be dead code.

Consider removing the unused field to avoid confusion:

♻️ Suggested cleanup

```diff
 export interface NQDTournamentConfig {
   /** Champion's objective score (baseline for use-value) */
   championObjective: number;
   /** Champion's prompt text (reference for novelty) */
   championPrompt?: string;
   /** All prompt texts in current portfolio */
   portfolioPrompts?: string[];
   /** Minimum constraint-fit threshold (default 1.0) */
   constraintFitThreshold?: number;
-  /** Whether to include illumination telemetry */
-  includeIllumination?: boolean;
 }
```

src/server/handler.ts (2)
842-863: Fire-and-forget async task lacks a top-level error boundary.

The background evaluation task uses `(async () => { ... })()` without a `.catch()` handler at the IIFE level. While errors are caught internally, any unhandled rejection from `evalPromptDistribution` failure paths could result in unhandled promise rejections.

The pattern is consistent but could be made more robust:

♻️ Defensive improvement

```diff
 // Start evaluation in background (don't await)
 (async () => {
   task.status = "running";
   try {
     // ... evaluation logic
   } catch (err) {
     task.status = "failed";
     task.error = err instanceof Error ? err.message : String(err);
     task.completedAt = new Date().toISOString();
   }
-})();
+})().catch((err) => {
+  console.error("Unhandled evaluation task error:", err);
+});
```
1118-1127: Non-null assertion on a potentially empty array.

Line 1122 uses `sorted[0]!` after sorting, but if `task.candidates` was somehow empty (edge case), this would throw. The check at line 1019 ensures epics exist, but not that candidates were added.

While unlikely given the flow, defensive coding would add a guard:

♻️ Defensive fix

```diff
 const sorted = [...task.candidates].sort(
   (a, b) => b.objective - a.objective,
 );
-const best = sorted[0]!;
+const best = sorted[0];
+if (!best) {
+  throw new Error("No candidates evaluated");
+}
 task.winner = {
   id: best.id,
```

src/telemetry.ts (3)
322-325: No mechanism to stop the telemetry reporter.

`startTelemetryReporter` stores `reporterId`, but there's no exported `stopTelemetryReporter` function. Once started, the interval runs indefinitely even if no longer needed, preventing clean shutdown.

♻️ Suggested addition

```diff
 export const startTelemetryReporter = (intervalMs = 60_000) => {
   if (reporterId !== null) return;
   reporterId = setInterval(() => flushTelemetry(), intervalMs);
 };
+
+export const stopTelemetryReporter = () => {
+  if (reporterId !== null) {
+    clearInterval(reporterId);
+    reporterId = null;
+  }
+};
```
109-132: Percentile calculation returns bucket upper bounds (approximation only).

The `percentile` function returns `BUCKETS_MS[index]` (the bucket's upper bound) rather than interpolating within the bucket. This is acceptable for monitoring but produces step-function percentiles.

This is a known limitation of histogram-based percentile estimation. Consider documenting this approximation for clarity.
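To make the step-function behavior concrete, here is a minimal sketch comparing upper-bound reporting with linear interpolation inside the winning bucket. The bucket bounds and counts below are hypothetical stand-ins, not the real `BUCKETS_MS` or telemetry state from `src/telemetry.ts`.

```typescript
// Hypothetical bucket upper bounds; the real BUCKETS_MS differs.
const BUCKETS_MS = [10, 50, 100, 500, 1000];

// Current-style estimate: return the upper bound of the bucket containing percentile p.
function percentileUpperBound(counts: number[], p: number): number {
  const total = counts.reduce((a, b) => a + b, 0);
  const target = Math.ceil((p / 100) * total);
  let cumulative = 0;
  for (let i = 0; i < counts.length; i++) {
    cumulative += counts[i];
    if (cumulative >= target) return BUCKETS_MS[i];
  }
  return BUCKETS_MS[BUCKETS_MS.length - 1];
}

// Smoother estimate: linearly interpolate within the winning bucket.
function percentileInterpolated(counts: number[], p: number): number {
  const total = counts.reduce((a, b) => a + b, 0);
  const target = (p / 100) * total;
  let cumulative = 0;
  for (let i = 0; i < counts.length; i++) {
    const lower = i === 0 ? 0 : BUCKETS_MS[i - 1];
    if (cumulative + counts[i] >= target) {
      const fraction = counts[i] === 0 ? 0 : (target - cumulative) / counts[i];
      return lower + fraction * (BUCKETS_MS[i] - lower);
    }
    cumulative += counts[i];
  }
  return BUCKETS_MS[BUCKETS_MS.length - 1];
}

// With all 100 samples in the 10-50ms bucket, the upper-bound estimate reports
// p50 = 50 while interpolation reports p50 = 30.
```

Interpolation assumes samples are uniformly distributed within a bucket, which is itself an approximation; documenting whichever choice is made is the main point.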
12-48: Type definitions duplicate `src/ui/types.ts`.

`TelemetryEntry` and `TelemetrySnapshot` are defined both here and in `src/ui/types.ts`. This duplication risks drift if one is updated without the other.

Consider either:

- Importing the types from a shared location, or
- Re-exporting these types from one canonical source
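As a sketch of the canonical-source option, one module could own both the type and a runtime guard, with `src/ui/types.ts` and `src/telemetry.ts` re-exporting it. The module path, field set, and guard name here are illustrative assumptions, not the real shapes.

```typescript
// Hypothetical canonical module, e.g. src/telemetry-types.ts.
// The field set is a simplified stand-in for the real TelemetryEntry.
export type TelemetryEntry = {
  key: string;
  count: number;
  p50Ms?: number;
};

// Runtime guard so UI code can validate snapshots received over the wire.
export function isTelemetryEntry(value: unknown): value is TelemetryEntry {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return typeof v.key === "string" &&
    typeof v.count === "number" &&
    (v.p50Ms === undefined || typeof v.p50Ms === "number");
}

// Both current definition sites would then do:
// export type { TelemetryEntry } from "../telemetry-types.ts";
```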
```ts
let LLM_BASE_URL = "";
let LLM_API_KEY = "";
let LLM_MODEL = "";
let DATA_ROOT = "";
let PROMPTS_ROOT = "";
```
Module-level mutable state creates shared state across handler instances.
The module-level variables `LLM_BASE_URL`, `LLM_API_KEY`, etc. are reassigned in `createApiHandler`. If multiple handlers are created with different configs, they will overwrite each other's configuration, since these are module-scoped variables.
Consider using closure-scoped variables instead:
♻️ Suggested refactor
```diff
 export function createApiHandler(config: ApiConfig) {
-  LLM_BASE_URL = config.llmBaseUrl;
-  LLM_API_KEY = config.llmApiKey;
-  LLM_MODEL = config.llmModel;
-  DATA_ROOT = config.dataRoot;
-  PROMPTS_ROOT = config.promptsRoot;
+  const LLM_BASE_URL = config.llmBaseUrl;
+  const LLM_API_KEY = config.llmApiKey;
+  const LLM_MODEL = config.llmModel;
+  const DATA_ROOT = config.dataRoot;
+  const PROMPTS_ROOT = config.promptsRoot;
+
+  // Move cached* and task stores inside the closure as well
+  let cachedEpics: unknown[] | null = null;
+  let cachedChampion: { base: string; patch: string; composed: string } | null = null;
+  let cachedOrchestrator: Orchestrator | null = null;
+  const evalTasks = new Map<string, EvalTask>();
+  const tournamentTasks = new Map<string, TournamentTask>();
+  const optimizationTasks = new Map<string, OptimizationTask>();
   // ... rest of handler
```
// ... rest of handlerThere was a problem hiding this comment.
Leaving module-level caches as-is because createApiHandler is instantiated once per process in server entrypoints; no concurrent configs today. Happy to refactor if multi-handler usage becomes needed.
@venikman, understood! Your single-handler-per-process deployment model makes the module-level state safe in practice. The current design is simpler and perfectly reasonable given your architecture. Thanks for clarifying the deployment pattern.
✏️ Learnings added
Learnt from: venikman
Repo: venikman/PromptAgent PR: 13
File: src/server/handler.ts:47-51
Timestamp: 2026-01-14T08:35:08.246Z
Learning: In the PromptAgent repository, `createApiHandler` in `src/server/handler.ts` is designed to be instantiated once per process in server entrypoints, so module-level caches (cachedEpics, cachedChampion, cachedOrchestrator, task stores) are safe and intentional for the single-handler deployment model.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: CR
Repo: venikman/PromptAgent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T19:14:53.230Z
Learning: Applies to **/scorer.ts : Implement scorer pipeline following the pattern: preprocess → analyze → generateScore → generateReason
src/ui/polyfills.ts
Outdated
```diff
@@ -0,0 +1,11 @@
+import { DOMParser as LinkedomDOMParser, parseHTML } from "npm:linkedom@0.18.5";
```
Inline `npm:` specifier violates project policy.

The static analysis reports that inline `npm:`, `jsr:`, or `https:` dependencies are not allowed. Move this dependency to the project's import map (e.g., `deno.json` or `import_map.json`).
Suggested fix
In your deno.json (or import map), add:
```json
{
  "imports": {
    "linkedom": "npm:linkedom@0.18.5"
  }
}
```

Then update this file:

```diff
-import { DOMParser as LinkedomDOMParser, parseHTML } from "npm:linkedom@0.18.5";
+import { DOMParser as LinkedomDOMParser, parseHTML } from "linkedom";
```

🧰 Tools
🪛 GitHub Check: check
[failure] 1-1:
Inline 'npm:', 'jsr:' or 'https:' dependency not allowed
🤖 Prompt for AI Agents
In `@src/ui/polyfills.ts` at line 1, The import uses an inline npm: specifier
which violates policy; add an import map entry mapping "linkedom" to
"npm:linkedom@0.18.5" in your deno.json/import_map.json, then update the import
statement in src/ui/polyfills.ts to import { DOMParser as LinkedomDOMParser,
parseHTML } from "linkedom" so the module is resolved via the import map rather
than an inline specifier.
```ts
import { Head } from "@fresh/core/runtime";
import type { PageProps } from "@fresh/core/render";
```
🧩 Analysis chain
🌐 Web query:
Fresh Deno framework version 2.x import paths Head component PageProps
💡 Result:
Use PageProps from "$fresh/server.ts" and Head from "$fresh/runtime.ts".
Examples:
- PageProps: import { PageProps } from "$fresh/server.ts"; — used for route/layout/_app props. [1]
- Head: import { Head, asset, Partial } from "$fresh/runtime.ts"; — used to modify the document head. [2]
References:
[1] Fresh docs — routes / PageProps usage.
[2] Fresh docs — runtime (Head, asset, Partial) and examples.
Fix Fresh framework import paths.
The imports are incorrect for Fresh 2.x. Update to:
- Line 1: `import { Head } from "$fresh/runtime.ts";`
- Line 2: `import type { PageProps } from "$fresh/server.ts";`
The `@fresh/core/` prefix is non-standard; Fresh uses `$fresh/` aliases for imports.
🤖 Prompt for AI Agents
In `@src/ui/routes/_app.tsx` around lines 1 - 2, Imports for Fresh are using the
wrong module specifiers; update the two import lines so the Head symbol is
imported from "$fresh/runtime.ts" and the PageProps type is imported from
"$fresh/server.ts" instead of "@fresh/core/runtime" and "@fresh/core/render"
respectively, ensuring the named symbols (Head and PageProps) are preserved.
💡 Codex Review
PromptAgent/src/orchestrator/optimization-loop.ts
Lines 365 to 369 in f03da6e
When optimization is configured with a maxTokens override (e.g., via the v3 optimize payload or CLI), the initial champion evaluation here does not pass that value to executeEvaluator. That means the baseline objective is computed with the default token cap while subsequent iterations use the override, which can skew promotion decisions and unintentionally increase cost for the first evaluation.
PromptAgent/src/orchestrator/optimization-loop.ts
Lines 404 to 408 in f03da6e
The tournament evaluation uses executeEvaluator without maxTokens, so candidate scoring ignores the configured token cap. If a caller sets maxTokens to control runtime/cost, champion evaluation (step 1) honors it but candidate evaluations do not, making tournament results inconsistent and potentially more expensive than intended.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In `@deploy/main.ts`:
- Around line 16-18: The current computation of keyPrefix using
env.LMSTUDIO_API_KEY.slice(0, 10) risks leaking sensitive key material in
production; update the logging logic that builds keyPrefix so it only includes a
short prefix (e.g., 4 chars) or is omitted entirely when running in production
(detect production via an env var like DENO_DEPLOYMENT_ID or NODE_ENV ===
"production"), and replace the explicit key string in the logged object with a
boolean like hasKey: !!env.LMSTUDIO_API_KEY; adjust references to keyPrefix and
env.LMSTUDIO_API_KEY accordingly so production logs never contain the API key
prefix.
In `@src/ui/islands/Studio.tsx`:
- Around line 650-664: The Refresh button clears the champion state immediately
then fetches /champion without preserving the previous value, so on network
failure the UI stays stuck in "Loading champion prompt..."; modify the onClick
handler around setChampion/fetch("/champion")/readJson<ChampionPrompt> to first
capture the current champion in a local variable, then attempt the fetch, and on
failure restore that saved champion (or set an explicit error state) instead of
leaving it null; update any UI that reads champion to display an error message
when the fetch fails if you choose the error-state approach.
- Around line 633-641: The MessageResponse component's mode prop is unused;
remove all occurrences of mode="static" passed into MessageResponse in
Studio.tsx (the three MessageResponse usages), and also remove the unused mode
prop from the MessageResponse props/type definition and any related default/prop
handling so the component signature no longer declares or expects mode.
In `@tests/deployment.test.ts`:
- Around line 178-186: The section header comment above the test is incorrect:
replace the header string "// API Endpoint Consistency Tests" with a header that
reflects this UI test (e.g., "// UI Entry Tests" or "// UI - Entry Point Tests")
so it aligns with the Deno.test named "UI - Fresh app entry should exist" which
checks the app entry at join(ROOT_DIR, "src", "ui", "app.ts").
♻️ Duplicate comments (3)
deno.json (2)
46-47: Shell glob expansion undermines the `--ignore` flag.

The glob `tests/**/*.test.ts` expands at the shell level before Deno processes the command. By the time `--ignore=tests/api` is evaluated, the file list is already resolved, making the exclusion ineffective.

🔧 Suggested fix

```diff
-    "test:unit": "deno test -A tests/**/*.test.ts --ignore=tests/api",
+    "test:unit": "deno test -A tests/ --ignore=tests/api",
```

This lets Deno handle file discovery and apply the ignore filter correctly.
52-52: Typecheck scope is significantly reduced.

The typecheck task only covers specific entrypoint files, which may allow type errors in other modules (e.g., `src/telemetry.ts`, `src/ui/**/*.ts`, utility modules) to go unnoticed. Consider broadening the scope.

🔧 Suggested fix

```diff
-    "typecheck": "deno check deploy/main.ts src/server/main.ts src/cli/optimize.ts src/cli/generate.ts src/cli/eval-dist.ts",
+    "typecheck": "deno check deploy/main.ts src/**/*.ts",
```

deploy/main.ts (1)
1-19: Missing production safety checks (fail-fast validation).

The previous version included critical startup validations to ensure `LMSTUDIO_BASE_URL` was set to a non-localhost value and `LMSTUDIO_API_KEY` was present when deployed (detected via `DENO_DEPLOYMENT_ID`). Without these, a misconfigured deployment could start successfully but fail at runtime.

🔧 Suggested fix

Add after the imports:

```ts
// Production safety checks
const isDeployed = !!Deno.env.get("DENO_DEPLOYMENT_ID");
if (isDeployed) {
  if (!env.LMSTUDIO_BASE_URL || env.LMSTUDIO_BASE_URL.includes("localhost")) {
    throw new Error(
      "LMSTUDIO_BASE_URL must be set to a non-localhost value in production.",
    );
  }
  if (!env.LMSTUDIO_API_KEY || env.LMSTUDIO_API_KEY === "lm-studio") {
    throw new Error(
      "LMSTUDIO_API_KEY must be set in production.",
    );
  }
}
```
🧹 Nitpick comments (6)
tests/deployment.test.ts (2)
44-54: Consider using Deno.test's `ignore` option for clearer test reporting.

The early return works but makes the test pass silently in CI. Using `ignore` provides explicit skip reporting in test output.

♻️ Suggested improvement

```diff
-Deno.test(
-  "Environment - should have .env file for local development",
-  async () => {
-    if (IS_CI) return;
+Deno.test({
+  name: "Environment - should have .env file for local development",
+  ignore: IS_CI,
+  fn: async () => {
     const envPath = join(ROOT_DIR, ".env");
     assert(
       await fileExists(envPath),
       ".env should exist for local development",
     );
   },
-);
+});
```
182-186: Consider adding content validation to the Fresh app entry test.

The test only checks file existence. For more robust validation, consider verifying that the file exports expected symbols (e.g., an `App` component or Fresh configuration).

♻️ Optional enhancement

```diff
 Deno.test("UI - Fresh app entry should exist", async () => {
   const appPath = join(ROOT_DIR, "src", "ui", "app.ts");
   const content = await readFileIfExists(appPath);
   assertExists(content, "Fresh app entry should exist");
+  // Optionally validate expected exports
+  assert(
+    content.includes("export") || content.includes("default"),
+    "Fresh app entry should have exports",
+  );
 });
```

deno.json (1)
76-84: Redundant entry in deploy include list.

`src/ui` (line 81) is redundant since `src` (line 78) already includes the entire `src/` directory. This doesn't cause harm but clutters the configuration.

🔧 Suggested fix

```diff
 "include": [
   "deploy",
   "src",
   "data",
   "prompts",
-  "src/ui",
   "deno.json",
   "deno.lock"
 ],
```

src/ui/twind_shared.ts (1)
24-55: Well-implemented Twind integration.

The setup function correctly:

- Chains the Preact vnode hook while preserving the original
- Processes both `class` and `className` through Twind's `tw()` utility
- Joins multiple class sources into a single `class` prop

One minor observation: `selfURL` from `Options` is spread into the config but isn't used by `twSetup()`. If you want to be explicit about not passing it to twind, you could destructure it out:

Optional: Exclude selfURL from config spread

```diff
 export function setup(options: Options, sheet: Sheet) {
+  const { selfURL: _, ...twindOptions } = options;
   const config: Configuration = {
-    ...options,
+    ...twindOptions,
     mode: "silent",
     sheet,
   };
```

This is purely a cleanliness preference; twind ignores unknown properties, so the current code works fine.
src/ui/islands/Studio.tsx (2)
441-478: Consider using `ReturnType<typeof setInterval>` for cross-runtime compatibility.

The explicit `number | null` type annotation on line 445 assumes browser semantics, where `setInterval` returns a number. In Deno, `setInterval` returns a `Timeout` object. Using `ReturnType<typeof setInterval>` ensures compatibility across runtimes.

Suggested fix

```diff
-  let intervalId: number | null = null;
+  let intervalId: ReturnType<typeof setInterval> | null = null;
```
709-757: Number inputs may produce NaN values.

`Number(target.value)` returns `NaN` for non-numeric input. While `min={1}` provides HTML validation, programmatic or paste inputs could still result in `NaN` being set in state, potentially causing issues in the optimization request.

Suggested fix with input validation

```diff
 onChange={(event) => {
   const target = event.currentTarget as HTMLInputElement;
+  const value = Number(target.value);
+  if (Number.isNaN(value) || value < 1) return;
   setOptConfig((current) => ({
     ...current,
-    maxIterations: Number(target.value),
+    maxIterations: value,
   }));
 }}
```

Apply the same pattern to the other number inputs (replicates, patchCandidates).
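A small helper keeps that guard in one place. This is a sketch; the `parsePositiveInt` name and contract are assumptions, not part of the PR.

```typescript
// Hypothetical helper: parse an <input type="number"> value, returning null
// for NaN, non-integers, or values below the minimum instead of propagating
// them into component state.
export function parsePositiveInt(raw: string, min = 1): number | null {
  const value = Number(raw);
  if (!Number.isInteger(value) || value < min) return null;
  return value;
}
```

Each onChange handler then becomes `const value = parsePositiveInt(target.value); if (value === null) return;` followed by the state update.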
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`deno.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (9)
- deno.json
- deploy/main.ts
- playwright.config.ts
- src/ui/components/ai-elements/message.tsx
- src/ui/components/ai-elements/prompt-input.tsx
- src/ui/islands/Studio.tsx
- src/ui/polyfills.ts
- src/ui/twind_shared.ts
- tests/deployment.test.ts
🚧 Files skipped from review as they are similar to previous changes (3)
- src/ui/components/ai-elements/message.tsx
- playwright.config.ts
- src/ui/components/ai-elements/prompt-input.tsx
🧰 Additional context used
🧬 Code graph analysis (1)
src/ui/islands/Studio.tsx (3)
src/ui/types.ts (2)

- `ScorerResult` (49-54)
- `TelemetrySnapshot` (75-96)

src/ui/components/ai-elements/prompt-input.tsx (5)

- `PromptInput` (6-14)
- `PromptInputBody` (18-20)
- `PromptInputTextarea` (24-35)
- `PromptInputFooter` (39-50)
- `PromptInputSubmit` (54-67)

src/ui/components/ai-elements/message.tsx (3)

- `Message` (14-23)
- `MessageContent` (27-37)
- `MessageResponse` (41-48)
🔇 Additional comments (16)
src/ui/polyfills.ts (2)
3-11: LGTM!

The polyfill implementation is sound:

- Defensive checks (`typeof ... === "undefined"`) ensure existing browser globals are not overwritten.
- Double type assertions (`as unknown as X`) are the accepted pattern for bridging linkedom's types to native DOM types.
- The minimal document stub is appropriate for server-side rendering scenarios.

1-1: Import specifier issue has been addressed.

The import now uses the bare specifier `"linkedom"` instead of the inline `npm:` specifier flagged in the previous review. The `deno.json` correctly maps `"linkedom"` to `"npm:linkedom@0.18.5"`, aligning with project policy.

tests/deployment.test.ts (2)
10-14: LGTM!

The import cleanup and CI detection pattern are appropriate. Using `Boolean(Deno.env.get("CI"))` correctly handles the undefined case when not in CI.

164-176: LGTM!

The refactored `hasStdImports` check is cleaner, and the logic correctly validates the presence of `@std/*` imports in the Deno configuration.
1-87: LGTM on the overall configuration restructuring.The migration to Fresh/Preact with proper JSX configuration, new test tasks (including e2e with Playwright), and updated import mappings looks well-structured. The compiler options correctly set
jsx: "react-jsx"withjsxImportSourcepointing to Preact.deploy/main.ts (3)
43-56: Good fallback pattern for the Fresh build cache.

The try/catch pattern that attempts to load the production build cache and gracefully falls back to an in-memory build is well-implemented. This ensures the deployment doesn't fail if the build artifacts are missing, while providing optimal performance when they're available.
61-81: API route detection is clean and maintainable.

The `API_PREFIXES` array with the `isApiRoute` helper provides a clear, centralized way to distinguish API routes from UI routes. The logic correctly handles both exact matches and prefix matches with trailing paths.
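A consolidated form of this helper, shared between `deploy/main.ts` and `src/server/main.ts` as suggested earlier in the review, might look like the following. The module path and the prefix list are illustrative assumptions, not the project's actual values.

```typescript
// Hypothetical shared module, e.g. src/server/routes.ts, imported by both
// entry points so the prefix list has a single source of truth.
export const API_PREFIXES = ["/health", "/champion", "/evaluate"] as const;

// Matches either the exact prefix or the prefix followed by a trailing path.
export function isApiRoute(pathname: string): boolean {
  return API_PREFIXES.some(
    (prefix) => pathname === prefix || pathname.startsWith(prefix + "/"),
  );
}
```

Note that requiring a trailing `/` after the prefix avoids false positives such as `/healthcheck` matching `/health`.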
83-131: Request handler with telemetry is well-structured.

The unified request handler properly:
- Records metrics for both success and error cases
- Logs structured JSON when telemetry is enabled
- Returns generic error messages to clients while logging full errors server-side
- Delegates to appropriate handlers based on route type
src/ui/twind_shared.ts (1)
1-22: Previous feedback addressed.

The imports and module augmentation now correctly use the `"preact"` import map alias instead of inline `npm:` specifiers. The interface and type definitions are clean.

src/ui/islands/Studio.tsx (7)
1-22: LGTM!

Previous static analysis issues have been addressed: imports now use import map entries instead of inline `npm:` specifiers.
60-86: LGTM!

Utility functions properly handle edge cases (undefined, NaN) and provide safe fallbacks.
88-114: LGTM!StoryCard component properly guards the optional story points field with a typeof check before rendering.
116-177: LGTM!
readJsonhelper provides clear error messages with endpoint context, andPlaygroundResultViewhandles all result states gracefully.
179-262: LGTM!State management is well-organized with appropriate memoization for derived values like
inFlightTotalandselectedEpic.
263-347: LGTM!Effect properly handles cleanup with the
cancelledflag and AbortController. The health check has a sensible 2.5s timeout.
349-371: LGTM!Previous
window.setIntervalissue has been addressed by usingglobalThis.setInterval. Silent error handling for telemetry polling is acceptable for non-critical background data.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
There was a problem hiding this comment.
Actionable comments posted: 6
🤖 Fix all issues with AI agents
In `@AGENTS.md`:
- Around line 25-28: The Commands section only lists e2e and UI tasks but omits
unit and other test tasks; update the documentation to include the missing test
commands by adding entries for "deno task test", "deno task test:unit", "deno
task test:api", "deno task test:coverage" (and any other project-defined test
tasks) with brief descriptions similar to the existing lines for e2e and UI
(e.g., "deno task test # Run all tests", "deno task test:unit #
Run unit tests", "deno task test:api # Run API tests", "deno task
test:coverage # Generate test coverage report") so the Commands section fully
documents available test tasks.
In `@src/ui/twind_client.ts`:
- Around line 45-51: The loop over ruleText can produce NaN precedences when the
marker is missing; in the block processing each rule (loop over ruleText)
validate that marker = rule.lastIndexOf("/*") is >= 0 before slicing, and after
parseInt(check parse result) ensure precedence is a finite integer (e.g.,
Number.isFinite or !Number.isNaN) otherwise handle gracefully (skip adding to
precedences, default to a safe precedence, or log/throw an error); keep adding
rawRule to rules only if appropriate and ensure precedences and rules remain in
sync by only pushing to precedences when precedence is valid.
- Around line 43-47: The code assumes styleEl.sheet is always present by using a
non-null assertion; guard against a null sheet in the function that reads
existing rules (replace the use of styleEl.sheet! / target with a checked
value). Specifically, in the block that declares target and iterates ruleText
(referencing styleEl and target), first check if styleEl.sheet is null and
handle it gracefully (e.g., treat as no existing rules or return early) before
accessing cssRules; this avoids runtime errors in disconnected DOM or edge cases
where ensureStyleElement did not yield a sheet.
- Around line 42-51: The sheet.init callback is passing only the first element
of sheetState (via sheetState.shift()) which yields a single-element snapshot
and causes undefined values when server code expects the full state array;
update the sheet.init implementation in twind_client.ts so it passes the entire
sheetState array to the callback (init: (cb) => cb(sheetState)) so twind
receives [precedences, rules, mappings, true] and
sheet.reset()/twind_middleware.ts can access snapshot[1] and snapshot[3]
correctly.
In `@src/ui/twind_middleware.ts`:
- Around line 54-58: The stateTag construction serializes mappings into a raw
script tag which allows JSON containing "</script>" or special line separators
to break out and enable XSS; update the code that builds stateTag (and where
STATE_ELEMENT_ID and mappings are used) to either: 1) create the script element
via DOM APIs on the client/SSR host and set its textContent to
JSON.stringify(mappings) (preferred), or 2) if emitting an HTML string, sanitize
the JSON string after JSON.stringify(mappings) by escaping "</" (e.g. replace
all occurrences of "</script>"/ "</" with "<\\/") and also escape U+2028 and
U+2029 (replace with "\\u2028" and "\\u2029") before embedding it in stateTag so
the serialized state cannot break out of the script tag.
♻️ Duplicate comments (8)
src/server/handler.ts (5)
47-51: Module-level mutable state creates shared state across handler instances.This issue was previously flagged. The module-level variables
LLM_BASE_URL,LLM_API_KEY, etc. are reassigned increateApiHandler. If multiple handlers are created with different configs, they will overwrite each other's configuration.
79-80: In-memory task stores grow unbounded — memory leak risk.This issue was previously flagged. The
evalTasks,tournamentTasks, andoptimizationTasksMaps store task data indefinitely without cleanup.Also applies to: 117-117, 123-123
354-364: Debug endpoint exposes partial API key.This issue was previously flagged. Line 361 exposes
keyPrefix: LLM_API_KEY.slice(0, 12) + "..."which could aid attackers.
559-566: Missing timeout on upstream LLM fetch in/generate-story.This issue was previously flagged. The
fetch()call lacks anAbortSignaltimeout, risking indefinite hangs if the LLM service is unresponsive.
780-787: Missing timeout on upstream LLM fetch in/generate.This issue was previously flagged. Same issue as
/generate-story— this fetch lacks a timeout.deploy/main.ts (2)
1-9: Production safety checks are missing.This was previously flagged. The deployment entrypoint should fail fast if misconfigured in production. Without validation, a deployed instance could start but fail at runtime when contacting a localhost LLM.
🛡️ Recommended production guard
// Add after imports const isDeployed = !!Deno.env.get("DENO_DEPLOYMENT_ID"); if (isDeployed) { if (!env.LMSTUDIO_BASE_URL || env.LMSTUDIO_BASE_URL.includes("localhost")) { throw new Error( "LMSTUDIO_BASE_URL must be set to a non-localhost value in production.", ); } if (!env.LMSTUDIO_API_KEY || env.LMSTUDIO_API_KEY === "lm-studio") { throw new Error( "LMSTUDIO_API_KEY must be set in production.", ); } }
16-18: API key prefix logged in production.This was previously flagged. Logging the first 10 characters of the API key could leak sensitive information in production logs.
src/ui/twind_middleware.ts (1)
15-15: Module-level sheet singleton creates a race condition in concurrent requests.This issue was previously identified: the
sheetis created once at module load. When concurrent requests invoke the middleware, they share this singleton and can cross-contaminate CSS state—sheet.reset()in one request can interfere with another request's rendering.Move sheet creation inside the factory function so each middleware instance has its own isolated sheet:
Proposed fix
-const sheet = virtualSheet(); - -const injectIntoHtml = (html: string, injection: string) => { +const injectIntoHtml = (html: string, injection: string) => { if (html.includes("</head>")) { return html.replace("</head>", `${injection}</head>`); } @@ -25,7 +23,8 @@ const injectIntoHtml = (html: string, injection: string) => { }; export const twindMiddleware = (options: Options): TwindMiddleware => { + const sheet = virtualSheet(); setup(options, sheet); return async (ctx) => {
🧹 Nitpick comments (8)
src/models.ts (1)
36-79: Consider extracting common URL/apiKey configuration.Both
makeGeneratorModelandmakeJudgeModelrepeat identical fallback chains forurlandapiKey. This is a minor DRY opportunity.♻️ Optional: Extract shared config
+const getBaseUrl = () => + getEnvWithFallback( + "LMSTUDIO_BASE_URL", + ["LLM_BASE_URL", "LLM_API_BASE_URL"], + "http://127.0.0.1:1234/v1", + ); + +const getApiKey = () => + getEnvWithFallback("LMSTUDIO_API_KEY", ["LLM_API_KEY"], "lm-studio"); + export function makeGeneratorModel(): ModelConfig { const modelId = getEnvWithFallback( "LMSTUDIO_MODEL", ["LLM_MODEL"], "openai/gpt-oss-120b", ); return { - url: getEnvWithFallback( - "LMSTUDIO_BASE_URL", - ["LLM_BASE_URL", "LLM_API_BASE_URL"], - "http://127.0.0.1:1234/v1", - ), + url: getBaseUrl(), id: `lmstudio/${modelId}` as const, - apiKey: getEnvWithFallback( - "LMSTUDIO_API_KEY", - ["LLM_API_KEY"], - "lm-studio", - ), + apiKey: getApiKey(), }; }src/telemetry.ts (2)
329-332: Consider exposing astopTelemetryReporterfunction.
startTelemetryReporterstores the interval ID but there's no exported function to stop it. This could be useful for graceful shutdown or testing.♻️ Proposed addition
export const startTelemetryReporter = (intervalMs = 60_000) => { if (reporterId !== null) return; reporterId = setInterval(() => flushTelemetry(), intervalMs); }; + +export const stopTelemetryReporter = () => { + if (reporterId === null) return; + clearInterval(reporterId); + reporterId = null; +};
12-48: Consolidate type definitions: moveTelemetryEntryandTelemetrySnapshotto a shared source.
TelemetryEntryandTelemetrySnapshotare currently defined identically in bothsrc/telemetry.tsandsrc/ui/types.ts, with no imports between them. This creates risk of unintended drift if one definition is updated without the other. Consider importing these types fromsrc/ui/types.tsinsrc/telemetry.tsand removing the duplicate definitions to establish a single source of truth.src/server/handler.ts (4)
842-863: Unhandled promise rejection in background task could crash silently.The async IIFE for evaluation runs in the background but exceptions thrown inside the
catchblock (e.g., if updatingtask.statusortask.errorfails) would result in an unhandled promise rejection. While the main try/catch handlesevalPromptDistributionerrors, any failure in the error-handling code itself is not caught.Consider adding a top-level catch or ensuring the entire async block is wrapped:
♻️ Suggested defensive wrapper
// Start evaluation in background (don't await) (async () => { task.status = "running"; try { // ... existing code } catch (err) { task.status = "failed"; task.error = err instanceof Error ? err.message : String(err); task.completedAt = new Date().toISOString(); } - })(); + })().catch((e) => console.error("Background eval task failed unexpectedly:", e));
1045-1136: Tournament background task has same unhandled rejection risk.Similar to the evaluation task, the tournament async IIFE could have unhandled rejections if the error-handling code itself fails.
♻️ Suggested defensive wrapper
- })(); + })().catch((e) => console.error("Background tournament task failed unexpectedly:", e));
1275-1300: Background tasks in V2/V3 endpoints also lack top-level error handling.The async IIFEs at lines 1275-1300 (V2 evaluate), 1366-1392 (V2 optimize), and 1452-1530 (V3 optimize) all have the same pattern where unexpected errors in the catch block would be unhandled.
Apply the same
.catch()pattern to all background async IIFEs for consistency.Also applies to: 1366-1392, 1452-1530
1492-1500: Potential division by zero in tournament progress calculation.At line 1493,
total / (config.patchCandidates + 1)could produce unexpected results iftotalis 0 (though unlikely given the flow). More importantly, theMath.floor(completed / ...)calculation may not accurately track candidate index if runs per candidate don't divide evenly.Consider tracking
candidateIdxexplicitly via the callback chain rather than inferring it from completion counts.AGENTS.md (1)
41-69: Add language specifier to fenced code block.The architecture tree code block at line 41 is missing a language specifier. Per static analysis (markdownlint MD040), fenced code blocks should specify a language for proper rendering and accessibility.
📝 Suggested fix
-``` +```text src/ cli/
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (17)
AGENTS.mddeploy/main.tsscripts/pre-commit-checks.tssrc/generator.tssrc/models.tssrc/server/handler.tssrc/server/main.tssrc/similarity.tssrc/telemetry.tssrc/ui/components/ai-elements/message.tsxsrc/ui/components/ai-elements/prompt-input.tsxsrc/ui/islands/Studio.tsxsrc/ui/static/styles.csssrc/ui/twind_client.tssrc/ui/twind_middleware.tstests/meta-evolution/meta-evolution.test.tstests/orchestrator/tools/tools.test.ts
✅ Files skipped from review due to trivial changes (1)
- src/similarity.ts
🚧 Files skipped from review as they are similar to previous changes (5)
- src/ui/components/ai-elements/message.tsx
- tests/meta-evolution/meta-evolution.test.ts
- src/ui/islands/Studio.tsx
- src/ui/static/styles.css
- src/server/main.ts
🧰 Additional context used
📓 Path-based instructions (1)
**/{generator,*agent*}.{ts,js}
📄 CodeRabbit inference engine (AGENTS.md)
**/{generator,*agent*}.{ts,js}: UsejsonPromptInjection: truein structuredOutput configuration when LM Studio'sresponse_formatsupport is incomplete
SeterrorStrategy: 'strict'in structuredOutput configuration for schema validation
Files:
src/generator.ts
🧠 Learnings (6)
📚 Learning: 2026-01-06T19:14:53.230Z
Learnt from: CR
Repo: venikman/PromptAgent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T19:14:53.230Z
Learning: Applies to **/scorer.ts : Composite score calculation should weight: 25% Epic keyword coverage, 30% INVEST principles adherence, 30% Acceptance criteria testability (prefer Given/When/Then), 10% Duplication penalty, 5% Story count sanity (4-8 optimal)
Applied to files:
AGENTS.md
📚 Learning: 2026-01-06T19:14:53.230Z
Learnt from: CR
Repo: venikman/PromptAgent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T19:14:53.230Z
Learning: Applies to **/{generator,*agent*}.{ts,js} : Use `jsonPromptInjection: true` in structuredOutput configuration when LM Studio's `response_format` support is incomplete
Applied to files:
AGENTS.mdsrc/server/handler.ts
📚 Learning: 2026-01-06T19:14:53.230Z
Learnt from: CR
Repo: venikman/PromptAgent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T19:14:53.230Z
Learning: Applies to **/{generator,*agent*}.{ts,js} : Set `errorStrategy: 'strict'` in structuredOutput configuration for schema validation
Applied to files:
AGENTS.md
📚 Learning: 2026-01-06T19:14:53.230Z
Learnt from: CR
Repo: venikman/PromptAgent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T19:14:53.230Z
Learning: Applies to **/scorer.ts : Use a separate judge model instance via `makeJudgeModel()` in scoring modules to reduce 'grading your own homework' bias
Applied to files:
AGENTS.md
📚 Learning: 2026-01-06T19:14:53.230Z
Learnt from: CR
Repo: venikman/PromptAgent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T19:14:53.230Z
Learning: Applies to **/scorer.ts : Implement scorer pipeline following the pattern: preprocess → analyze → generateScore → generateReason
Applied to files:
AGENTS.md
📚 Learning: 2026-01-06T19:14:53.230Z
Learnt from: CR
Repo: venikman/PromptAgent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T19:14:53.230Z
Learning: Applies to **/scorer.ts : Enforce schema validation as a hard gate: return score = 0 if validation fails
Applied to files:
AGENTS.md
🧬 Code graph analysis (6)
src/ui/twind_client.ts (1)
src/ui/twind_shared.ts (4)
STATE_ELEMENT_ID(8-8)STYLE_ELEMENT_ID(7-7)Options(10-13)setup(24-55)
src/ui/twind_middleware.ts (1)
src/ui/twind_shared.ts (4)
Options(10-13)setup(24-55)STYLE_ELEMENT_ID(7-7)STATE_ELEMENT_ID(8-8)
src/generator.ts (4)
src/telemetry.ts (3)
start(345-372)withAiTelemetry(275-303)recordAiResponse(246-273)src/schema.ts (3)
StoryPack(71-71)storyPackSchema(62-69)Epic(20-20)src/config.ts (1)
env(156-156)src/judge/promptagent-fpf-judge.ts (1)
ExecutionTrace(57-57)
src/ui/components/ai-elements/prompt-input.tsx (1)
src/ui/lib/utils.ts (1)
cn(1-2)
src/telemetry.ts (2)
src/ui/types.ts (2)
TelemetryEntry(62-73)TelemetrySnapshot(75-96)src/config.ts (1)
env(156-156)
src/server/handler.ts (6)
src/scorer.ts (1)
createStoryDecompositionScorer(49-306)src/eval.ts (2)
PromptDistReport(82-93)evalPromptDistribution(130-239)src/telemetry.ts (1)
getTelemetrySnapshot(321-321)src/pairMining.ts (4)
ScoredOutput(26-35)mineContrastivePairs(139-194)ContrastPair(50-66)formatPairsForPrompt(442-489)src/patchEngineer.ts (1)
generatePatchCandidates(146-169)src/config.ts (1)
env(156-156)
🪛 markdownlint-cli2 (0.18.1)
AGENTS.md
41-41: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (32)
src/ui/components/ai-elements/prompt-input.tsx (6)
1-2: LGTM!The inline npm specifier issue from the previous review has been addressed. The import now correctly uses the
"preact"alias from the import map.
4-16: LGTM!Clean form wrapper with appropriate focus-within visual feedback for accessibility. The prop forwarding pattern is idiomatic.
18-22: LGTM!Simple container component with consistent prop forwarding.
24-37: LGTM!The
focus:outline-noneis appropriate here since the parentPromptInputform provides focus-within visual feedback, avoiding duplicate focus indicators.
39-52: LGTM!Well-structured footer with appropriate flex layout for action items.
54-69: LGTM!Good default of
type="submit"for form context. Disabled state styling is properly handled withdisabled:cursor-not-allowed disabled:opacity-60.tests/orchestrator/tools/tools.test.ts (1)
7-20: LGTM! Import paths correctly updated to reflect new module structure.The import reorganization aligns with the PR's move to centralized modules under
src/. Type imports are properly segregated usingtypekeyword forFlatRunandStoryPack.src/models.ts (1)
18-34: LGTM! Clean environment variable fallback implementation.The two-stage lookup pattern with
getEnvWithFallbackprovides good flexibility for different deployment configurations while maintaining sensible defaults.scripts/pre-commit-checks.ts (2)
228-248: LGTM! Fresh UI structure checks appropriately replace Vite checks.The new
checkFreshUifunction validates the expected Fresh framework structure (routes/index.tsxandapp.ts), aligning with the PR's migration away from Vite/React. Using "warning" severity is appropriate since the UI might be optional in some configurations.
253-274: Server entrypoint checks updated correctly.The renamed check validates both local (
src/server/main.ts) and production (deploy/main.ts) server entrypoints, with production being a warning-level check. This matches the PR's "remove deploy integration" theme while still validating the expected structure.src/telemetry.ts (1)
341-378: Well-structured SSE stream implementation.The
createTelemetryStreamproperly handles cleanup on both abort signal and manual cancellation, includes keepalive messages, and pushes an initial snapshot. The implementation follows SSE best practices.src/generator.ts (4)
293-306: Correct structuredOutput configuration per coding guidelines.The
structuredOutputconfiguration correctly usesjsonPromptInjection: trueanderrorStrategy: 'strict'as required by the coding guidelines for generator/agent files. TheAbortSignal.timeoutintegration properly enforces LLM timeout.
97-111: Robust JSON extraction with multiple fallback strategies.The
extractJsonCandidatefunction handles both fenced code blocks (```json) and bare JSON objects, providing good resilience for varied LLM output formats.
375-384: Comprehensive telemetry recording for AI responses.The
recordAiResponsecall captures response-specific telemetry (duration, success, text preview, error, token usage) complementing the aggregate metrics tracked bywithAiTelemetry. TheNumber.isFiniteguard ondurationMsis a good defensive check.
340-350: Good error message extraction with fallbacks.The validation error handling extracts the first issue for a concise error message while preserving the full validation failure structure. The fallback to
responseError.messageensures errors are captured even when other extraction fails.src/server/handler.ts (4)
208-212: CORS configuration is overly permissive for production.
access-control-allow-origin: "*"allows any origin to make requests. While acceptable for development/demo, this should be restricted in production deployments to prevent unauthorized cross-origin access to the API.Is this API intended to be publicly accessible from any origin, or should CORS be configurable per environment?
242-323: LGTM! Robust acceptance criteria parser.The
parseAcceptanceCriteriafunction handles multiple input formats (GWT, numbered lists, bullets, HTML, newlines) with clear fallback logic and a sensible minimum length filter. The implementation is thorough and defensive.
398-458: Champion update logic looks correct with appropriate backup versioning.The POST
/championendpoint properly:
- Creates versioned backups with timestamps
- Parses base/patch sections from composed prompts
- Normalizes and persists all three files atomically
- Clears caches after update
36-37: The scorer instance is designed for safe reuse across requests. The metrics (KeywordCoverageMetric,PromptAgentFPFMetric) are measurement tools, not state containers—each request flows through the pipeline with isolated input/output data. Themeasure()methods operate on fresh inputs per request without accumulating cross-request state.AGENTS.md (1)
31-37: LGTM! Clear functional style guidelines.The new Code Style section provides clear, actionable guidance for maintaining a functional programming style with immutability and pure functions.
deploy/main.ts (3)
46-75: Good fallback handling for missing Fresh build output.The try/catch structure properly distinguishes between deployed and local environments:
- In production (Deno Deploy): Returns 503 with clear message
- In development: Falls back to in-memory build
This graceful degradation is well-designed.
78-98: LGTM! Clean API route detection.The
API_PREFIXESarray andisApiRoutefunction provide a clear, maintainable way to distinguish API routes from UI routes. The prefix-matching logic correctly handles both exact matches and sub-paths.
100-147: Request handler has comprehensive telemetry and error handling.The main request handler:
- Records timing metrics for all requests
- Logs structured telemetry when enabled
- Catches and logs errors while returning generic 500 responses
- Properly records failed requests in telemetry
src/ui/twind_middleware.ts (4)
1-7: Imports look correct.The imports align with the module's responsibilities: virtual sheet from twind/sheets and the shared utilities/constants from twind_shared.ts.
9-13: Type definitions are well-structured.
MappingStateefficiently represents identity mappings as strings and non-identity mappings as tuples. TheTwindMiddlewaretype properly captures the Fresh middleware signature.
17-25: HTML injection helper is correct but consider edge cases.The function handles
</head>,</body>, and fallback cases appropriately. Note thatreplace()only replaces the first occurrence, which is the correct behavior here. However, if the HTML is malformed or uses uppercase tags (</HEAD>), injection may fall through to appending.
60-68: Response reconstruction is handled correctly.Deleting
content-lengthis appropriate since the HTML body size has changed. Preserving other headers, status, and statusText maintains response fidelity.src/ui/twind_client.ts (5)
1-8: Imports are appropriate for client-side hydration.The module correctly imports the Sheet type, shared setup utilities, and element IDs that correspond to what the middleware injects server-side.
12-20: State reading is robust with proper fallbacks.The function gracefully handles missing elements and JSON parse errors by returning an empty array. This ensures hydration can proceed even if server-side state injection failed.
22-32: Style element creation is correct.The function properly checks for existing elements and creates one in
document.headif missing, using the shared constant for the ID.
53-57: Sheet interface implementation looks correct.The
insertmethod properly delegates toCSSStyleSheet.insertRule, andinitusesshift()to provide state sequentially. This matches typical twind sheet initialization patterns.
62-64: Browser guard is appropriate.Checking
typeof document !== "undefined"correctly gates client-side hydration for SSR/isomorphic module loading scenarios.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
| const styleTag = `<style id="${STYLE_ELEMENT_ID}">${cssText}</style>`; | ||
| const stateTag = | ||
| `<script id="${STATE_ELEMENT_ID}" type="application/json">${ | ||
| JSON.stringify(mappings) | ||
| }</script>`; |
There was a problem hiding this comment.
JSON injection vulnerability in state serialization.
If mappings contains strings with </script>, the JSON output could break out of the script tag and enable XSS. Consider escaping or using a safer serialization approach.
Proposed fix using JSON escaping
const styleTag = `<style id="${STYLE_ELEMENT_ID}">${cssText}</style>`;
const stateTag =
`<script id="${STATE_ELEMENT_ID}" type="application/json">${
- JSON.stringify(mappings)
+ JSON.stringify(mappings).replace(/</g, "\\u003c")
}</script>`;🤖 Prompt for AI Agents
In `@src/ui/twind_middleware.ts` around lines 54 - 58, The stateTag construction
serializes mappings into a raw script tag which allows JSON containing
"</script>" or special line separators to break out and enable XSS; update the
code that builds stateTag (and where STATE_ELEMENT_ID and mappings are used) to
either: 1) create the script element via DOM APIs on the client/SSR host and set
its textContent to JSON.stringify(mappings) (preferred), or 2) if emitting an
HTML string, sanitize the JSON string after JSON.stringify(mappings) by escaping
"</" (e.g. replace all occurrences of "</script>"/ "</" with "<\\/") and also
escape U+2028 and U+2029 (replace with "\\u2028" and "\\u2029") before embedding
it in stateTag so the serialized state cannot break out of the script tag.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@src/server/handler.ts`:
- Around line 1580-1588: The tournament progress calculation can divide by zero
when total === 0; change the logic around candidateIdx and totalRuns handling in
the block using candidateIdx, updateTaskTournamentProgress, and
config.patchCandidates so you first guard total (e.g., treat runsPerCandidate =
total === 0 ? 1 : total / (config.patchCandidates + 1)), compute candidateIdx
using that guarded runsPerCandidate, and clamp candidateIdx to the valid range
[0, config.patchCandidates] before calling updateTaskTournamentProgress; ensure
totalCandidates remains config.patchCandidates + 1 and runsCompleted/totalRuns
still reflect the actual values (or zero) but avoid any division by zero.
In `@src/ui/islands/Studio.tsx`:
- Around line 730-794: The number inputs for maxIterations, replicates and
patchCandidates set state with Number(target.value) without guarding against NaN
when the field is cleared or invalid; update the onChange handlers for the
inputs (the ones that call setOptConfig in Studio.tsx) to parse and validate the
value (e.g., use parseInt/Number, check isNaN, and clamp to a minimum of 1 or
fall back to the current state's value) before calling setOptConfig so
optConfig.maxIterations, optConfig.replicates and optConfig.patchCandidates
never become NaN.
🧹 Nitpick comments (4)
deno.json (1)
76-84: Redundant entry in deploy include list.
src/uion line 81 is already covered bysrcon line 78. This redundancy is harmless but could be cleaned up.🧹 Suggested cleanup
"include": [ "deploy", "src", "data", "prompts", - "src/ui", "deno.json", "deno.lock" ],deploy/main.ts (1)
111-159: Consider extracting duplicate telemetry logging logic.The success and error paths contain nearly identical telemetry logging code (lines 119-135 and 139-155). While functional, this could be simplified.
♻️ Suggested refactor
Deno.serve(async (req) => { const started = performance.now(); const url = new URL(req.url); const isApi = isApiRoute(url.pathname); const routeKey = isApi ? normalizePath(url.pathname) : "ui"; + + const logRequest = (status: number, durationMs: number) => { + recordHttpRequest({ + key: `${req.method} ${routeKey}`, + status, + durationMs, + }); + if (env.TELEMETRY_LOG_REQUESTS) { + console.log( + JSON.stringify({ + type: "http", + at: new Date().toISOString(), + method: req.method, + route: routeKey, + status, + durationMs: Math.round(durationMs), + }), + ); + } + }; + try { const res = isApi ? await apiHandler(req) : await freshHandler(req); const durationMs = performance.now() - started; - recordHttpRequest({ - key: `${req.method} ${routeKey}`, - status: res.status, - durationMs, - }); - if (env.TELEMETRY_LOG_REQUESTS) { - console.log( - JSON.stringify({ - type: "http", - at: new Date().toISOString(), - method: req.method, - route: routeKey, - status: res.status, - durationMs: Math.round(durationMs), - }), - ); - } + logRequest(res.status, durationMs); return res; } catch (err) { const durationMs = performance.now() - started; - recordHttpRequest({ - key: `${req.method} ${routeKey}`, - status: 500, - durationMs, - }); - if (env.TELEMETRY_LOG_REQUESTS) { - console.log( - JSON.stringify({ - type: "http", - at: new Date().toISOString(), - method: req.method, - route: routeKey, - status: 500, - durationMs: Math.round(durationMs), - }), - ); - } + logRequest(500, durationMs); console.error(err); return new Response("Internal server error", { status: 500 }); } });src/server/handler.ts (2)
924-945: Background task error handling is correct but consider adding a safety net.The fire-and-forget IIFE pattern is appropriate for async tasks. The try-catch properly handles errors by setting task status and error message. However, if
task.error = ...ortask.completedAt = ...assignments somehow throw, the rejection would be unhandled.Consider wrapping in an additional outer catch for defensive programming:
♻️ Suggested approach
// Wrap background tasks to prevent unhandled rejections (async () => { try { // ... task logic } catch (err) { task.status = "failed"; task.error = err instanceof Error ? err.message : String(err); task.completedAt = new Date().toISOString(); } })().catch((err) => console.error("Unhandled background task error:", err));
498-509: Non-atomic file writes could leave inconsistent state on failure.If one of the three
writeTextFilecalls fails (e.g., disk full), the prompt files could be left in an inconsistent state wherechampion.mddiffers from the combination ofbase.mdandpatch.md. The cache is cleared regardless, so the next load would read the inconsistent state.♻️ Suggested approach
Consider writing to temporary files first, then renaming atomically, or at minimum, writing in sequence so failures are more predictable:
// Write in sequence to fail fast before partial writes await Deno.writeTextFile(`${PROMPTS_ROOT}/champion.md`, normalizedComposed); await Deno.writeTextFile(`${PROMPTS_ROOT}/champion.base.md`, nextBase); await Deno.writeTextFile(`${PROMPTS_ROOT}/champion.patch.md`, nextPatch);Or use a write-then-rename pattern for true atomicity.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
AGENTS.mddeno.jsondeploy/main.tsplaywright.config.tssrc/config.tssrc/judge/promptagent-fpf-judge.tssrc/server/handler.tssrc/ui/components/ai-elements/message.tsxsrc/ui/islands/Studio.tsxsrc/ui/twind_middleware.tstests/deployment.test.ts
🚧 Files skipped from review as they are similar to previous changes (3)
- src/ui/components/ai-elements/message.tsx
- src/ui/twind_middleware.ts
- playwright.config.ts
🧰 Additional context used
📓 Path-based instructions (1)
**/{generator,*agent*}.{ts,js}
📄 CodeRabbit inference engine (AGENTS.md)
**/{generator,*agent*}.{ts,js}: UsejsonPromptInjection: truein structuredOutput configuration when LM Studio'sresponse_formatsupport is incomplete
SeterrorStrategy: 'strict'in structuredOutput configuration for schema validation
Files:
src/judge/promptagent-fpf-judge.ts
🧠 Learnings (8)
📚 Learning: 2026-01-06T19:14:53.230Z
Learnt from: CR
Repo: venikman/PromptAgent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T19:14:53.230Z
Learning: Applies to **/scorer.ts : Use a separate judge model instance via `makeJudgeModel()` in scoring modules to reduce 'grading your own homework' bias
Applied to files:
src/judge/promptagent-fpf-judge.tsAGENTS.md
📚 Learning: 2026-01-06T19:14:53.230Z
Learnt from: CR
Repo: venikman/PromptAgent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T19:14:53.230Z
Learning: Applies to **/{generator,*agent*}.{ts,js} : Use `jsonPromptInjection: true` in structuredOutput configuration when LM Studio's `response_format` support is incomplete
Applied to files:
`src/judge/promptagent-fpf-judge.ts`, `AGENTS.md`, `src/server/handler.ts`
📚 Learning: 2026-01-06T19:14:53.230Z
Learnt from: CR
Repo: venikman/PromptAgent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T19:14:53.230Z
Learning: Applies to **/{generator,*agent*}.{ts,js} : Set `errorStrategy: 'strict'` in structuredOutput configuration for schema validation
Applied to files:
`src/judge/promptagent-fpf-judge.ts`, `AGENTS.md`
📚 Learning: 2026-01-06T19:14:53.230Z
Learnt from: CR
Repo: venikman/PromptAgent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T19:14:53.230Z
Learning: Applies to **/scorer.ts : Composite score calculation should weight: 25% Epic keyword coverage, 30% INVEST principles adherence, 30% Acceptance criteria testability (prefer Given/When/Then), 10% Duplication penalty, 5% Story count sanity (4-8 optimal)
Applied to files:
`src/judge/promptagent-fpf-judge.ts`, `AGENTS.md`
📚 Learning: 2026-01-06T19:14:53.230Z
Learnt from: CR
Repo: venikman/PromptAgent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T19:14:53.230Z
Learning: Applies to **/scorer.ts : Enforce schema validation as a hard gate: return score = 0 if validation fails
Applied to files:
`src/judge/promptagent-fpf-judge.ts`, `AGENTS.md`
📚 Learning: 2026-01-06T19:14:53.230Z
Learnt from: CR
Repo: venikman/PromptAgent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T19:14:53.230Z
Learning: Applies to **/schema.ts : Use Zod 4.x for schema validation
Applied to files:
src/config.ts
📚 Learning: 2026-01-14T08:35:08.246Z
Learnt from: venikman
Repo: venikman/PromptAgent PR: 13
File: src/server/handler.ts:47-51
Timestamp: 2026-01-14T08:35:08.246Z
Learning: In the PromptAgent repository, `createApiHandler` in `src/server/handler.ts` is designed to be instantiated once per process in server entrypoints, so module-level caches (cachedEpics, cachedChampion, cachedOrchestrator, task stores) are safe and intentional for the single-handler deployment model.
Applied to files:
`deploy/main.ts`, `src/server/handler.ts`
📚 Learning: 2026-01-06T19:14:53.230Z
Learnt from: CR
Repo: venikman/PromptAgent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T19:14:53.230Z
Learning: Applies to **/scorer.ts : Implement scorer pipeline following the pattern: preprocess → analyze → generateScore → generateReason
Applied to files:
AGENTS.md
🧬 Code graph analysis (2)
src/judge/promptagent-fpf-judge.ts (2)
- src/config.ts (1): `env` (168-168)
- src/telemetry.ts (1): `withAiTelemetry` (275-303)

deploy/main.ts (4)
- src/config.ts (1): `env` (168-168)
- src/server/handler.ts (2): `ApiConfig` (39-45), `createApiHandler` (399-1658)
- src/ui/app.ts (1): `app` (6-9)
- src/telemetry.ts (3): `startTelemetryReporter` (329-332), `normalizePath` (179-182), `recordHttpRequest` (184-195)
🔇 Additional comments (22)
deno.json (1)
1-87: Configuration looks well-structured. The Fresh/Preact setup with proper import mappings, jsx configuration, and task definitions is clean. The feedback from previous reviews (typecheck scope, test:unit glob handling) has been addressed.
src/judge/promptagent-fpf-judge.ts (2)
368-383: Structured output configuration now complies with coding guidelines.

The `structuredOutput` configuration correctly includes `jsonPromptInjection: true` and `errorStrategy: "strict"` as required by the coding guidelines for agent files. The telemetry wrapping and timeout handling are also well-implemented.

456-473: Clean overloaded method signature for flexibility.

The `measure` method overloads provide a good API surface, supporting both the full `PromptAgentJudgeInput` object and a simpler `(input, output)` string signature. The fallback defaults for the string signature are sensible.

src/ui/islands/Studio.tsx (2)

276-362: Initial data loading effect is well-structured.

The effect properly handles cancellation, abort controllers, and cleanup. The parallel loading of health, epics, and champion data with appropriate error handling is clean. The early-return pattern for cancelled requests prevents state updates after unmount.

657-667: Past review feedback addressed: `mode` prop removed from `MessageResponse`.

The `MessageResponse` components are now used without the unused `mode="static"` prop, addressing the previous review concern about the unused prop.

src/config.ts (2)
15-20: Boolean environment variable handling is now correct.

The `envBoolean` helper properly addresses the previous review concern about `z.coerce.boolean()` treating the string `"false"` as `true`. The implementation correctly normalizes and parses `"true"`/`"1"` as `true` and defaults appropriately.
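The parsing behavior described here can be sketched as a plain function. This is an illustration of the string handling only, with assumed semantics for unrecognized values; the repo's actual helper wires this logic into its Zod schema.

```typescript
// Sketch of envBoolean-style parsing: only "true"/"1" mean true and only
// "false"/"0" mean false; anything else (including an unset variable) falls
// back to the default. By contrast, z.coerce.boolean() would treat any
// non-empty string — even "false" — as true.
function envBoolean(raw: string | undefined, defaultValue: boolean): boolean {
  if (raw === undefined) return defaultValue;
  switch (raw.trim().toLowerCase()) {
    case "true":
    case "1":
      return true;
    case "false":
    case "0":
      return false;
    default:
      return defaultValue;
  }
}
```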
26-34: LLM configuration with layered fallbacks is well-designed.

The fallback chain from `LMSTUDIO_*` to generic `LLM_*` variables provides flexibility for different deployment scenarios. The timeout constraints (1-600 seconds) are sensible bounds.

tests/deployment.test.ts (2)

44-54: CI-aware test skipping is appropriate.

Skipping the `.env` existence check on CI makes sense since CI environments typically inject secrets differently. The `IS_CI` detection pattern is clean.

178-186: Section header and test now align correctly.

The section header "UI Entry Point Tests" now accurately describes the test that validates the Fresh app entry point's existence. This addresses the previous review feedback about the mismatched header.
AGENTS.md (3)
35-42: Good addition of functional code style guidelines.

The new Code Style section clearly documents the project's preference for pure functions, immutability, and expression-based transforms. This aligns well with the functional patterns used throughout the codebase.

77-86: Structured output configuration matches learnings.

The example correctly includes `jsonPromptInjection: true` and `errorStrategy: "strict"` as specified in the project learnings for LM Studio compatibility.

43-73: Architecture documentation reflects new modular structure.

The updated file tree accurately documents the reorganized `src/` directory with the new modules (fpf, judge, meta-evolution, orchestrator, server, telemetry, ui) and prompt artifacts.
11-25: Production safety checks look good.

The checks correctly validate that `LMSTUDIO_BASE_URL` is set to a non-localhost value and that `LMSTUDIO_API_KEY` is not the default placeholder. The regex `/localhost|127\.0\.0\.1/` covers common local patterns.

57-86: Fresh UI fallback handling is well-designed.

The try-catch properly distinguishes between deployed and local environments:

- In production: returns 503 with a clear message when the build is missing
- In development: falls back to an in-memory build for convenience

This graceful degradation prevents cryptic errors while maintaining development ergonomics.
src/server/handler.ts (8)
47-51: Module-level state is acceptable for single-handler deployment.

Based on learnings, `createApiHandler` is designed to be instantiated once per process, so module-level caches for config, epics, champion, and orchestrator are safe and intentional.
128-153: Task cleanup implementation addresses memory leak concern.

The TTL-based cleanup with 1-hour retention and 10-minute intervals is well-implemented. The guard against multiple schedulers and proper handling of invalid timestamps make this robust.
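A minimal sketch of the TTL sweep described above; the type shape and function name are assumptions rather than the handler's exact API, and keeping tasks with unparseable timestamps is one reasonable reading of "proper handling".

```typescript
// Sweep completed tasks older than ttlMs. Running tasks (no completedAt) and
// tasks with unparseable timestamps are kept, so a bad date can never cause
// premature deletion. Returns how many entries were removed.
type TaskRecord = { completedAt?: string };

function sweepExpiredTasks(
  tasks: Map<string, TaskRecord>,
  ttlMs: number,
  now: number = Date.now(),
): number {
  let removed = 0;
  for (const [id, task] of [...tasks]) {
    if (!task.completedAt) continue; // still running
    const finishedAt = Date.parse(task.completedAt);
    if (!Number.isFinite(finishedAt)) continue; // invalid timestamp: keep
    if (now - finishedAt > ttlMs) {
      tasks.delete(id);
      removed++;
    }
  }
  return removed;
}
```

Scheduling a sweep like this on a 10-minute interval with a 1-hour TTL matches the retention-to-sweep ratio the review describes.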
309-345: Acceptance criteria parser handles multiple formats comprehensively.

The GWT (Given/When/Then) parsing correctly groups scenarios by "Given" keywords and handles the common BDD format. The fallback chain (GWT → numbered → bullets → HTML → newlines) ensures broad compatibility.
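The fallback chain can be sketched generically as "first parser to return anything wins". The parser names below are illustrative stand-ins, not the handler's real format-specific implementations.

```typescript
// First-non-empty fallback chain: try each format parser in order
// (e.g. GWT → numbered → bullets → HTML → newlines) and keep the result of
// the first parser that yields any items.
type CriteriaParser = (text: string) => string[];

function parseWithFallbacks(text: string, parsers: CriteriaParser[]): string[] {
  for (const parse of parsers) {
    const items = parse(text);
    if (items.length > 0) return items;
  }
  return [];
}

// Two illustrative parsers for numbered and bulleted lists:
const numbered: CriteriaParser = (text) =>
  text
    .split("\n")
    .map((line) => line.match(/^\s*\d+[.)]\s+(.*)$/)?.[1]?.trim() ?? "")
    .filter((item) => item.length > 0);

const bullets: CriteriaParser = (text) =>
  text
    .split("\n")
    .map((line) => line.match(/^\s*[-*]\s+(.*)$/)?.[1]?.trim() ?? "")
    .filter((item) => item.length > 0);
```

Ordering the parsers from most to least specific is what keeps the broad newline fallback from shadowing the structured formats.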
250-262: CORS configuration is secure for production.

The logic correctly prevents wildcard (`*`) origins in deployed environments and falls back to same-origin validation when no allowlist is configured. This is a safe default.
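As a sketch of the policy described, not the handler's exact code: the function name, parameter shape, and the permissive dev-mode branch are all assumptions here.

```typescript
// Resolve the Access-Control-Allow-Origin value: never "*" when deployed;
// honor an explicit allowlist; otherwise fall back to same-origin only.
// Returning null means "send no CORS allow header".
function resolveCorsOrigin(
  requestOrigin: string | null,
  allowlist: readonly string[],
  selfOrigin: string,
  deployed: boolean,
): string | null {
  if (!requestOrigin) return null;
  if (allowlist.includes(requestOrigin)) return requestOrigin;
  if (!deployed) return requestOrigin; // permissive only in local dev
  return requestOrigin === selfOrigin ? requestOrigin : null;
}
```

Because the allowed value is an echoed origin rather than a constant, responses should also carry `Vary: Origin` so caches key on the request origin.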
630-639: LLM fetch includes proper timeout.

The `AbortSignal.timeout(timeoutMs)` prevents indefinite hangs when the LLM service is unresponsive. This addresses the previous review concern about missing timeouts.

422-432: Debug endpoint no longer exposes API key material.

The endpoint now returns only `hasKey` and `keyLength` instead of a key prefix, addressing the previous security concern.

856-865: Generate endpoint also has proper timeout.

The `/generate` endpoint correctly includes `AbortSignal.timeout(timeoutMs)` for the LLM fetch, consistent with `/generate-story`.

1-45: Well-structured module with clear type definitions.

The imports are organized logically, and the `ApiConfig` type provides a clear contract for handler initialization. The separation of PromptAgent modules, orchestrator imports, and config/telemetry is clean.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/ui/twind_client.ts`:
- Line 63: The TS2345 error arises because `init: (cb) => cb(sheetState)` passes `sheetState` to a generically typed `cb`. Fix it with a type assertion so the compiler knows the callback accepts `SheetInitState`: implement `init` as a function typed like `(cb: (state?: unknown) => unknown) => cb(sheetState)` and assert it as `Sheet["init"]` (or cast the right-hand side to `Sheet["init"]`) to resolve the error.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- src/config.ts
- src/ui/twind_client.ts
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2026-01-06T19:14:53.230Z
Learnt from: CR
Repo: venikman/PromptAgent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T19:14:53.230Z
Learning: Applies to **/schema.ts : Use Zod 4.x for schema validation
Applied to files:
src/config.ts
🧬 Code graph analysis (1)
src/ui/twind_client.ts (1)
- src/ui/twind_shared.ts (4): `STATE_ELEMENT_ID` (8-8), `STYLE_ELEMENT_ID` (7-7), `Options` (10-13), `setup` (24-55)
🪛 GitHub Actions: CI
src/ui/twind_client.ts
[error] 63-63: TS2345: Argument of type 'unknown' is not assignable to parameter of type 'T | undefined'. init: (cb) => cb(sheetState). Type checking failed. (Command: deno check deploy/main.ts src/**/*.ts)
🔇 Additional comments (9)
src/ui/twind_client.ts (2)
44-57: LGTM! Defensive checks properly address edge cases.

The null check for `styleEl.sheet` (lines 44-45) and the robust precedence parsing with `marker === -1` guards and `Number.isFinite` validation (lines 49-56) properly handle the edge cases previously identified. The implementation now gracefully handles malformed CSS rules and disconnected DOM scenarios.

69-71: LGTM! Appropriate browser environment check before hydration.
src/config.ts (7)
1-2: LGTM! Clean import paths using import map resolution. This centralizes version management and follows Deno best practices.

10-14: LGTM! Good use of fallback variables to support alternative environment variable names (`LLM_BASE_URL`, `LLM_API_BASE_URL`, etc.), providing backward compatibility for existing deployments.

15-26: LGTM! The `envBoolean` helper correctly addresses the boolean coercion issue. Key behaviors verified:

- `"false"`/`"0"` → `false` ✓
- `"true"`/`"1"` → `true` ✓
- Missing env var → uses default ✓
- Case-insensitive handling ✓

27-40: LGTM! Well-structured LLM configuration with:

- URL validation via `.url()` for `LMSTUDIO_BASE_URL`
- Sensible timeout bounds (1s–10min) with a 2-minute default
- Fallback support for alternative env var names

41-51: LGTM! Sensible telemetry defaults:

- `TELEMETRY_INCLUDE_LLM_OUTPUT` defaults to `false` (privacy-conscious default for potentially sensitive LLM outputs)
- Preview character limit (800) balances debuggability with log size

123-123: LGTM! `POLL_ENABLED` and `NQD_ENABLED` now use the `envBoolean` helper, ensuring consistent boolean parsing across all env-based feature flags.

Also applies to: 166-166

172-176: LGTM! Clean fail-fast validation at module load, with both the parsed config and schema exported for testing/documentation purposes.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/server/handler.ts`:
- Around line 856-877: The fetch to the LLM endpoint (the call that sets
timeoutMs and assigns upstream from fetch(`${LLM_BASE_URL.replace(/\/$/,
"")}/chat/completions`)) lacks a try-catch and will throw on timeouts or network
errors; wrap that fetch + reading of upstream.text() in a try-catch, catch
AbortError/other errors and return a proper jsonResponse (matching the existing
error shape used elsewhere, e.g., error: "llm_error", status: 502 or appropriate
upstream.status if available, and include safeJson(error.message) or the error
details) instead of letting exceptions propagate; ensure you still handle the
upstream.ok branch the same way inside the try block.
♻️ Duplicate comments (3)
src/ui/islands/Studio.tsx (2)
70-73: Number input validation has been addressed.

The `parsePositiveInt` helper properly validates parsed values using `Number.isFinite` and a minimum-bound check, returning the fallback value when input is invalid. This addresses the previously raised concern about `NaN` values in config state.
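A plausible shape for such a guard, shown as a sketch; the island's actual helper may differ in its minimum bound or signature.

```typescript
// Parse a positive integer from an <input> value; return the fallback when
// parsing yields NaN (or a non-finite value) or the result is below min.
function parsePositiveInt(raw: string, fallback: number, min = 1): number {
  const parsed = Number.parseInt(raw, 10);
  return Number.isFinite(parsed) && parsed >= min ? parsed : fallback;
}
```

Feeding the current state value in as the fallback is what keeps the last valid setting when the field is cleared or garbage is typed.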
735-808: Number input validation is properly implemented.

All numeric configuration inputs (`maxIterations`, `replicates`, `patchCandidates`) now use the `parsePositiveInt` helper to guard against `NaN` and invalid values, falling back to the current state value when input is invalid.

src/server/handler.ts (1)
1578-1597: LGTM — Division by zero protection confirmed.

The guard `total > 0 ? ... : 1` prevents division by zero, and the `Math.min(Math.max(...))` clamping ensures `candidateIdx` stays within valid bounds `[0, config.patchCandidates]`. This properly addresses the concern raised in the previous review.
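The two guards can be shown in isolation; function names here are illustrative, not the component's real identifiers.

```typescript
// Progress fraction with a guarded divisor: report 1 (complete) when there
// is nothing to evaluate instead of dividing by zero and producing NaN.
function tournamentProgress(completed: number, total: number): number {
  return total > 0 ? completed / total : 1;
}

// Clamp an index into [0, max] so out-of-range values cannot escape.
function clampIndex(idx: number, max: number): number {
  return Math.min(Math.max(idx, 0), max);
}
```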
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- src/server/handler.ts
- src/ui/islands/Studio.tsx
- src/ui/twind_client.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- src/ui/twind_client.ts
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2026-01-14T08:35:08.246Z
Learnt from: venikman
Repo: venikman/PromptAgent PR: 13
File: src/server/handler.ts:47-51
Timestamp: 2026-01-14T08:35:08.246Z
Learning: In the PromptAgent repository, `createApiHandler` in `src/server/handler.ts` is designed to be instantiated once per process in server entrypoints, so module-level caches (cachedEpics, cachedChampion, cachedOrchestrator, task stores) are safe and intentional for the single-handler deployment model.
Applied to files:
src/server/handler.ts
📚 Learning: 2026-01-06T19:14:53.230Z
Learnt from: CR
Repo: venikman/PromptAgent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-06T19:14:53.230Z
Learning: Applies to **/{generator,*agent*}.{ts,js} : Use `jsonPromptInjection: true` in structuredOutput configuration when LM Studio's `response_format` support is incomplete
Applied to files:
src/server/handler.ts
🧬 Code graph analysis (1)
src/ui/islands/Studio.tsx (3)
- src/ui/types.ts (2): `ScorerResult` (49-54), `TelemetrySnapshot` (75-96)
- src/ui/components/ai-elements/prompt-input.tsx (5): `PromptInput` (6-16), `PromptInputBody` (20-22), `PromptInputTextarea` (26-37), `PromptInputFooter` (41-52), `PromptInputSubmit` (56-69)
- src/ui/components/ai-elements/message.tsx (3): `Message` (13-22), `MessageContent` (26-38), `MessageResponse` (42-51)
🔇 Additional comments (23)
src/ui/islands/Studio.tsx (11)
1-58: LGTM! Imports now use import map entries as previously requested. Type definitions are well-structured and provide good type safety for the component's state management.

93-122: LGTM! Clean component with proper defensive rendering for optional `StoryPoints`. Key generation approach is acceptable given the data context.

124-171: LGTM! Clean result rendering with proper fallback handling for missing `storyPack`. Error display is appropriately styled.

193-215: LGTM! Well-structured reusable panel component with clear prop interface.

217-280: LGTM! State management is well-organized. Memoized derived values have correct dependencies and sensible fallback behavior.

281-367: LGTM! Proper use of `AbortController` with a timeout for the health check. The cancellation-flag pattern correctly prevents state updates after unmount. Parallel loading of initial data is efficient.

369-391: LGTM! Telemetry polling correctly uses `globalThis.setInterval`. Silent error handling is acceptable for non-critical telemetry data.

393-464: LGTM! Both async handlers have proper validation, error handling, and loading-state management. The pattern of clearing previous state before starting new operations prevents stale data display.

466-505: LGTM! Optimization task polling correctly handles terminal states and cleanup. Uses `globalThis.setInterval` as addressed in the previous review.

678-693: LGTM! Champion refresh correctly preserves and restores the previous value on fetch failure, addressing the previously raised concern about the UI getting stuck in a loading state.

513-886: LGTM! JSX rendering is well-structured with proper semantic HTML, accessible form controls, and clear visual feedback for loading and error states. The `MessageResponse` components no longer use the unused `mode` prop.

src/server/handler.ts (12)
1-64: LGTM! Imports are well-organized, and the `normalizeMaxTokens` utility correctly validates input boundaries. Module-level configuration is appropriate per the single-handler deployment model documented in learnings.

124-153: LGTM — TTL cleanup implementation is solid. The cleanup logic correctly guards against missing `completedAt` and invalid date parsing. The 10-minute sweep interval for a 1-hour TTL is a reasonable balance between memory efficiency and CPU overhead.

188-236: LGTM! Lazy initialization with caching is appropriate. Error paths return safe defaults (empty arrays/strings), and the orchestrator is properly invalidated when the champion is updated (line 509).

238-280: LGTM! CORS handling correctly differentiates between deployed and development environments. The `Vary: Origin` header is properly added when using explicit origin matching, which ensures correct caching behavior.

298-390: LGTM! The acceptance criteria parser handles multiple formats with a sensible fallback chain. The minimum-length filter prevents noise from being included, and the GWT scenario grouping using `→` separators produces readable output.

422-432: LGTM — Security improvement confirmed. The `keyPrefix` field removal addresses the previous security concern. Exposing only `hasKey` and `keyLength` is acceptable for debugging purposes.

630-639: LGTM — Timeout protection confirmed. The `AbortSignal.timeout(timeoutMs)` addresses the previous concern about indefinite hangs on LLM requests. The try-catch at line 622 properly handles timeout/network errors.

924-945: LGTM! The background evaluation task properly catches exceptions and updates task status to "failed" with the error message. The `completedAt` timestamp is set in both success and failure paths, enabling proper TTL cleanup.

975-1077: LGTM! Both endpoints validate required inputs and follow the consistent error-handling pattern. The transformation from the internal `ContrastPair` to the frontend format in `/mine-pairs` is straightforward.

1129-1222: LGTM! The tournament execution properly evaluates the champion first (ensuring `candidates` is never empty), then iterates through patches. Error handling in the background task is comprehensive, and the winner selection correctly sorts by objective score.

1268-1483: LGTM! V2 endpoints correctly leverage the orchestrator abstraction and Deno KV for persistence. The `maxTokens` validation is consistently applied across all endpoints that accept it, and background tasks properly use `failTask()` to record errors.

1661-1665: LGTM! The fallback 404 response and consistent CORS header application complete the handler cleanly.
```ts
const timeoutMs = env.LLM_TIMEOUT_MS;
const upstream = await fetch(
  `${LLM_BASE_URL.replace(/\/$/, "")}/chat/completions`,
  {
    method: "POST",
    headers,
    body: JSON.stringify(requestBody),
    signal: AbortSignal.timeout(timeoutMs),
  },
);

const raw = await upstream.text();
if (!upstream.ok) {
  return jsonResponse(
    {
      error: "llm_error",
      status: upstream.status,
      detail: safeJson(raw),
    },
    upstream.status,
  );
}
```
Missing try-catch around LLM fetch — timeout/network errors will propagate uncaught.
Unlike `/generate-story` (line 622), this fetch call isn't wrapped in a try-catch. If the LLM request times out (`AbortError`) or fails due to network issues, the error propagates as an unhandled exception rather than returning a proper JSON error response.
🛡️ Suggested fix

```diff
    headers["authorization"] = `Bearer ${LLM_API_KEY}`;
  }
+ try {
    const timeoutMs = env.LLM_TIMEOUT_MS;
    const upstream = await fetch(
      `${LLM_BASE_URL.replace(/\/$/, "")}/chat/completions`,
      {
        method: "POST",
        headers,
        body: JSON.stringify(requestBody),
        signal: AbortSignal.timeout(timeoutMs),
      },
    );
    const raw = await upstream.text();
    if (!upstream.ok) {
      return jsonResponse(
        {
          error: "llm_error",
          status: upstream.status,
          detail: safeJson(raw),
        },
        upstream.status,
      );
    }
    const data = safeJson(raw) as {
      choices?: Array<{ message?: { content?: string } }>;
    };
    const content = data?.choices?.[0]?.message?.content ?? "";
    return jsonResponse({
      provider: "lm-studio",
      model,
      response: content,
    });
+ } catch (err) {
+   return jsonResponse(
+     { error: err instanceof Error ? err.message : String(err) },
+     500,
+   );
+ }
```

🤖 Prompt for AI Agents
}🤖 Prompt for AI Agents
In `@src/server/handler.ts` around lines 856 - 877, The fetch to the LLM endpoint
(the call that sets timeoutMs and assigns upstream from
fetch(`${LLM_BASE_URL.replace(/\/$/, "")}/chat/completions`)) lacks a try-catch
and will throw on timeouts or network errors; wrap that fetch + reading of
upstream.text() in a try-catch, catch AbortError/other errors and return a
proper jsonResponse (matching the existing error shape used elsewhere, e.g.,
error: "llm_error", status: 502 or appropriate upstream.status if available, and
include safeJson(error.message) or the error details) instead of letting
exceptions propagate; ensure you still handle the upstream.ok branch the same
way inside the try block.
Addressed review items: guarded tournament progress division-by-zero edge case, validated optimization number inputs to avoid NaN, and fixed twind_client init typing (typecheck clean). |
Summary by CodeRabbit
New Features
Refactor
Documentation
Chores
✏️ Tip: You can customize this high-level summary in your review settings.