Local Dockerised Eval Server #50
## Walkthrough

Adds a centralized frontend LLMConfigurationManager, a configure_llm JSON-RPC method on the eval server, nested/per-request and persistent LLM model/provider overrides, automated-mode credential bypasses, cross-tab sync, examples/docs/config updates, and wiring so evaluations use the unified configuration and per-client overrides.
### Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant UI as Frontend (AIChatPanel)
    participant CM as LLMConfigurationManager
    participant Eval as EvalServer (Node)
    Note over UI,Eval: Persistent runtime configuration (configure_llm)
    UI->>Eval: JSON-RPC configure_llm { provider, apiKey?, endpoint?, models, partial? }
    Eval->>Eval: validate + merge per-connection llmConfig
    Eval-->>UI: JSON-RPC response { status: "success", appliedConfig }
    UI->>CM: saveConfiguration(appliedConfig)
    UI->>UI: AgentService.refreshCredentials()
```

```mermaid
sequenceDiagram
    autonumber
    participant Client as Frontend Client
    participant CM as LLMConfigurationManager
    participant Eval as EvalServer
    Note over Client,Eval: Per-request model override during evaluate/chat
    Client->>CM: setOverride({ provider, mainModel, miniModel?, nanoModel?, apiKey?, endpoint? })
    Client->>Eval: evaluate { model: nested-or-flat config }
    Eval->>Eval: merge connection llmConfig + eval overrides -> nested modelConfig
    Eval-->>Client: evaluation result
    Client->>CM: clearOverride()
```
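For reference, a minimal sketch of the configure_llm exchange the first diagram describes. The parameter names (provider, apiKey, endpoint, models.main/mini/nano, partial) mirror the server handler quoted later in this review; the `sendConfigureLLM` helper and the example values are illustrative, not code from the PR.

```typescript
// Hypothetical client-side helper showing the configure_llm JSON-RPC shape.
interface LLMModels { main: string; mini?: string; nano?: string; }

interface ConfigureLLMParams {
  provider: 'openai' | 'litellm' | 'groq' | 'openrouter';
  apiKey?: string;
  endpoint?: string;
  models: LLMModels;
  partial?: boolean;
}

function sendConfigureLLM(ws: WebSocket, id: string, params: ConfigureLLMParams): void {
  // JSON-RPC 2.0 request; the server replies with { status, message, appliedConfig }.
  ws.send(JSON.stringify({ jsonrpc: '2.0', id, method: 'configure_llm', params }));
}

// Example usage (placeholder values):
// sendConfigureLLM(ws, 'config-request-id', {
//   provider: 'openai',
//   models: { main: 'gpt-4.1', mini: 'gpt-4.1-mini', nano: 'gpt-4.1-nano' },
//   partial: false,
// });
```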
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
## PR Review: Local Dockerised Eval Server (Claude)

I've completed a comprehensive review of this substantial PR implementing a local dockerized evaluation server with advanced LLM configuration management. This is an impressive addition with solid architecture, though there are several areas for improvement.
### 🎯 Overall Assessment

This PR introduces a sophisticated evaluation server system with excellent separation of concerns and flexible configuration management. The implementation demonstrates strong architectural decisions with room for security and robustness improvements.

### ✅ Strengths

**Architecture & Design**

**Code Quality**
### 🔍 Areas for Improvement

#### Critical Security Issues

**1. API Key Exposure Risk**

```javascript
// ISSUE: API keys logged at info level
logger.info('Setting early configuration override', {
  hasApiKey: !!apiKey,
  apiKeyLength: apiKey?.length // Should not log length
});
```

Recommendation: Never log API key lengths or related metadata. Use debug level for configuration details.

**2. Insufficient Input Validation**

```javascript
if (!requestBody.input || typeof requestBody.input !== 'string') {
  throw new Error('Missing or invalid "input" field. Expected a string.');
}
```

Recommendation: Add comprehensive validation for all request parameters, including size limits and sanitization.

**3. CORS Security**

```javascript
res.setHeader('Access-Control-Allow-Origin', '*');
```

Recommendation: Replace wildcard CORS with specific allowed origins for production use.

#### Memory & Performance Concerns

**4. Potential Memory Leaks**

```javascript
for (const [clientId, connection] of this.connectedClients) {
  connection.rpcClient.cleanup();
  // Missing: Remove event listeners and clear timers
}
```

Recommendation: Ensure all event listeners, intervals, and timeouts are properly cleaned up.
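A minimal sketch of that kind of teardown, assuming each connection tracks its WebSocket and a per-connection timer; the `TrackedConnection` shape and the `heartbeatTimer` field are assumptions for illustration, not the PR's actual types.

```typescript
// Sketch only: field names are assumptions, not the PR's connection structure.
interface TrackedConnection {
  ws: { removeAllListeners?: () => void; close: () => void };
  rpcClient: { cleanup: () => void };
  heartbeatTimer?: ReturnType<typeof setInterval>;
}

function shutdownConnections(connectedClients: Map<string, TrackedConnection>): void {
  for (const [, connection] of connectedClients) {
    connection.rpcClient.cleanup();
    // Clear timers so no callbacks fire after shutdown and the process can exit.
    if (connection.heartbeatTimer) {
      clearInterval(connection.heartbeatTimer);
    }
    // Drop listeners before closing to avoid handlers firing on a dead socket.
    connection.ws.removeAllListeners?.();
    connection.ws.close();
  }
  connectedClients.clear();
}
```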
**5. Missing Request Size Limits**

```javascript
for await (const chunk of req) {
  body += chunk; // No size limit
}
```

Recommendation: Implement request size limits to prevent DoS attacks.
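One way to enforce such a limit while still streaming the body; the 1 MB cap and the error message are placeholder choices, not values from the PR.

```typescript
import type { IncomingMessage } from 'node:http';

const MAX_BODY_BYTES = 1024 * 1024; // 1 MB cap; tune per deployment.

async function readBodyWithLimit(req: IncomingMessage, maxBytes = MAX_BODY_BYTES): Promise<string> {
  const chunks: Buffer[] = [];
  let received = 0;
  for await (const chunk of req) {
    const buf = Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk);
    received += buf.length;
    if (received > maxBytes) {
      // Stop reading and surface an error the caller can map to HTTP 413.
      req.destroy();
      throw new Error(`Request body exceeds ${maxBytes} bytes`);
    }
    chunks.push(buf);
  }
  return Buffer.concat(chunks).toString('utf8');
}
```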
#### Error Handling & Reliability

**6. Incomplete Error Context**

```javascript
} catch (error) {
  const errorMessage = error instanceof Error ? error.message : 'Unknown error';
  // Missing: Error stack traces, evaluation context
}
```

Recommendation: Include evaluation context and stack traces in error responses for better debugging.

**7. Inconsistent Timeout Handling**

```javascript
timeout: 1500000, // 25 minutes - very long
```

Recommendation: Implement configurable timeouts with reasonable defaults and proper cleanup on timeout.

#### Code Quality Issues

**8. Complex Configuration Logic**

**9. Missing Validation**

```typescript
setOverride(config: Partial<LLMConfig>): void {
  this.overrideConfig = { ...config }; // No validation
}
```

Recommendation: Add validation for provider compatibility and required fields.

### 🧪 Test Coverage Gaps

The PR lacks comprehensive test coverage for:
Recommendation: Add unit tests for core components and integration tests for the evaluation flow.

### 🚀 Performance Recommendations

**1. Connection Pooling**

Implement connection pooling for HTTP requests to external LLM providers to reduce latency.

**2. Configuration Caching**

Cache parsed configurations to avoid repeated localStorage access.

**3. Request Debouncing**

Add debouncing for rapid configuration changes to prevent unnecessary reinitializations.

### 🔧 Implementation Suggestions

**1. Enhanced Error Types**

```typescript
export class EvaluationError extends Error {
  constructor(
    message: string,
    public readonly code: string,
    public readonly evaluationId?: string,
    public readonly context?: Record<string, any>
  ) {
    super(message);
  }
}
```

**2. Configuration Validation Schema**

```typescript
const CONFIG_SCHEMA = {
  provider: { type: 'string', enum: ['openai', 'litellm', 'groq', 'openrouter'] },
  apiKey: { type: 'string', minLength: 10 },
  models: { type: 'object', required: ['main'] }
};
```

**3. Request Rate Limiting**

Implement rate limiting for evaluation requests to prevent abuse.
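A compact sketch of the kind of per-client rate limiting suggested here, using a fixed window keyed by client ID; the limits and the usage wiring are placeholders, not part of the PR.

```typescript
// Fixed-window limiter: at most `limit` evaluations per client per window.
class RateLimiter {
  private counts = new Map<string, { windowStart: number; count: number }>();

  constructor(private limit = 10, private windowMs = 60_000) {}

  allow(clientId: string, now = Date.now()): boolean {
    const entry = this.counts.get(clientId);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      this.counts.set(clientId, { windowStart: now, count: 1 });
      return true;
    }
    if (entry.count >= this.limit) {
      return false; // Caller should reject with a JSON-RPC error or HTTP 429.
    }
    entry.count += 1;
    return true;
  }
}

// Usage sketch inside an evaluation handler (names are illustrative):
// if (!rateLimiter.allow(connection.clientId)) {
//   throw new Error('Rate limit exceeded, try again later');
// }
```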
### 📋 Immediate Action Items

**High Priority**

**Medium Priority**

**Low Priority**
### 🏆 Conclusion

This is a well-architected and feature-rich addition that significantly enhances the evaluation capabilities. The core design is solid with excellent separation of concerns. The main concerns are around security hardening, comprehensive testing, and production readiness.

Recommendation: Address the security issues and add basic test coverage before merging. The architecture is sound and will provide a strong foundation for evaluation workflows.

Overall Score: 7.5/10 - Excellent architecture with important security/robustness improvements needed.
Actionable comments posted: 15
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
front_end/panels/ai_chat/ui/AIChatPanel.ts (3)
**51-63**: Stop logging/previewing secrets in StorageMonitor.
Current logs emit value length/preview for keys containing 'openrouter' or 'ai_chat'. This risks API key leakage in dev/prod logs.- localStorage.setItem = (key: string, value: string) => { - if (key.includes('openrouter') || key.includes('ai_chat')) { - logger.debug(`=== LOCALSTORAGE SET ===`); - logger.debug(`Key: ${key}`); - logger.debug(`Value exists: ${!!value}`); - logger.debug(`Value length: ${value?.length || 0}`); - logger.debug(`Value preview: ${value?.substring(0, 50) + (value?.length > 50 ? '...' : '') || 'null'}`); - logger.debug(`Timestamp: ${new Date().toISOString()}`); - } - return this.originalSetItem(key, value); - }; + localStorage.setItem = (key: string, value: string) => { + if (key.includes('openrouter') || key.includes('ai_chat')) { + logger.debug('=== LOCALSTORAGE SET ==='); + logger.debug(`Key: ${key}`); + logger.debug('Value: [redacted]'); + logger.debug(`Timestamp: ${new Date().toISOString()}`); + } + return this.originalSetItem(key, value); + }; @@ - if (key.includes('openrouter') || key.includes('ai_chat')) { + if (key.includes('openrouter') || key.includes('ai_chat')) { logger.debug(`=== LOCALSTORAGE REMOVE ===`); logger.debug(`Key: ${key}`); logger.debug(`Timestamp: ${new Date().toISOString()}`); } return this.originalRemoveItem(key); };Also gate StorageMonitor behind a debug flag to avoid shipping monkey‑patching:
- // Initialize storage monitoring for debugging - StorageMonitor.getInstance(); + // Initialize storage monitoring for debugging + if (BUILD_CONFIG.DEBUG) { + StorageMonitor.getInstance(); + }Also applies to: 66-73
**1057-1076**: Remove API‑key visibility logs on OAuth success.
Logging “API key length” and method is unnecessary and leaks sensitive metadata.- const apiKey = localStorage.getItem('ai_chat_openrouter_api_key'); - const authMethod = localStorage.getItem('openrouter_auth_method'); - logger.info('- API key exists:', !!apiKey); - logger.info('- API key length:', apiKey?.length || 0); - logger.info('- Auth method:', authMethod); + // Avoid logging credential presence/length in production + const apiKey = localStorage.getItem('ai_chat_openrouter_api_key'); + const authMethod = localStorage.getItem('openrouter_auth_method'); + logger.info('OpenRouter OAuth success');
**1496-1527**: Avoid logging API key prefix/length.

These logs leak secrets to console/storage. Log boolean presence only.

```diff
- logger.info('Retrieved API key:');
- logger.info('- Exists:', !!apiKey);
- logger.info('- Length:', apiKey?.length || 0);
- logger.info('- Prefix:', apiKey?.substring(0, 8) + '...' || 'none');
+ logger.info('Retrieved API key: [present=%s]', !!apiKey);
```

eval-server/nodejs/src/config.js (1)
**65-71**: Enforce OPENAI_API_KEY per guidelines.

Guideline requires OPENAI_API_KEY (and PORT default). validateConfig currently makes it optional unless requireLLM=true; make it required by default.

```diff
-export function validateConfig(requireLLM = false) {
+export function validateConfig(requireLLM = true) {
 @@
-  // Only require OpenAI API key if LLM judge is explicitly needed
-  if (requireLLM && !CONFIG.llm.apiKey) {
+  // Require OpenAI API key by default for evaluator/judge usage
+  if (!CONFIG.llm.apiKey) {
     errors.push('OPENAI_API_KEY is required when using LLM judge');
   }
```

Optionally warn if no provider creds exist (CONFIG.providers.*) even when judge not used.
front_end/panels/ai_chat/evaluation/remote/EvaluationAgent.ts (2)
**212-218**: Global LLM override + maxConcurrency=3 risks cross‑evaluation credential bleed. Set to 1 or make overrides per‑evaluation.

Because the LLMConfigurationManager override is global, concurrent evaluations can stomp each other's API key/provider. Until per‑call config is plumbed, reduce concurrency here.
Apply this diff:
```diff
- maxConcurrency: 3,
+ // NOTE: Global LLM override is not concurrency-safe. Keep at 1 until per-eval config is implemented.
+ maxConcurrency: 1,
```

If you prefer to keep 3, we need to make AgentService/LLM client accept per-call config and eliminate the global override path.
**335-342**: Redact sensitive payloads from logs (models, keys, message text).

Avoid logging raw params.model, merged inputs, or full user message. Keep booleans/lengths only.
Apply this diff:
- logger.info('Received evaluation request', { + logger.info('Received evaluation request', { evaluationId: params.evaluationId, tool: params.tool, url: params.url, isChat: params.tool === 'chat', - modelOverride: params.input?.ai_chat_model, - modelConfig: params.model + modelOverride: !!params.input?.ai_chat_model, + hasModelConfig: !!params.model });- logger.info('DEBUG: Checking for model configuration override', { - evaluationId: params.evaluationId, - hasModel: !!params.model, - model: params.model - }); + logger.info('DEBUG: Checking for model configuration override', { + evaluationId: params.evaluationId, + hasModel: !!params.model + });- logger.info('DEBUG: Created merged input for chat evaluation', { - evaluationId: params.evaluationId, - hasOverrideConfig: !!overrideConfig, - overrideConfig, - mergedInput: { - ...mergedInput, - api_key: mergedInput.api_key ? `${mergedInput.api_key.substring(0, 10)}...` : undefined - } - }); + logger.info('DEBUG: Created merged input for chat evaluation', { + evaluationId: params.evaluationId, + hasOverride: configManager.hasOverride?.() || !!overrideConfig, + mergedInputPreview: { + provider: mergedInput.provider, + main_model: mergedInput.main_model, + mini_model: mergedInput.mini_model, + nano_model: mergedInput.nano_model, + has_api_key: !!mergedInput.api_key + } + });- logger.info('Starting chat evaluation', { - message: input.message, + logger.info('Starting chat evaluation', { + messageLength: String(input.message || '').length, timeout, hasTracingContext: !!tracingContext });Also applies to: 346-351, 367-375, 561-569, 821-826
🧹 Nitpick comments (26)
eval-server/nodejs/.env.example (1)
**31-34**: Satisfy dotenv-linter ordering and trailing newline

Pure formatting, but easy to fix to keep linters green and examples clean.
Apply:
-PORT=8080 -HOST=127.0.0.1 +HOST=127.0.0.1 +PORT=8080 @@ -LITELLM_ENDPOINT=http://localhost:4000 -LITELLM_API_KEY=your-litellm-api-key-here +LITELLM_API_KEY=your-litellm-api-key-here +LITELLM_ENDPOINT=http://localhost:4000 @@ -DEFAULT_PROVIDER=openai -DEFAULT_MAIN_MODEL=gpt-4 -DEFAULT_MINI_MODEL=gpt-4-mini -DEFAULT_NANO_MODEL=gpt-3.5-turbo +DEFAULT_MAIN_MODEL= +DEFAULT_MINI_MODEL= +DEFAULT_NANO_MODEL= +DEFAULT_PROVIDER=openai @@ -LOG_LEVEL=info -LOG_DIR=./logs +LOG_DIR=./logs +LOG_LEVEL=info @@ -AUTH_SECRET_KEY= +AUTH_SECRET_KEY= +Also applies to: 39-41, 42-45
eval-server/nodejs/CLAUDE.md (2)
**110-131**: Add security note for runtime LLM config (API keys over RPC)

Recommend calling out that apiKey in configure_llm must only traverse trusted, local, encrypted channels, and keys must be redacted in logs.
Patch suggestion:
"id": "config-request-id" }+#### Security considerations
+- Only useconfigure_llm
over trusted local connections (wss:// or ws://localhost).
+- Never log raw API keys; always redact in server/client logs.
+- Consider scoping keys per‑client session and expiring overrides on disconnect.
+--- `133-163`: **Clarify naming conventions between protocol and UI** Protocol example uses snake_case (main_model, mini_model, nano_model) while frontend uses camelCase in LLMConfigurationManager. Add a note that the server/bridge maps these fields. </blockquote></details> <details> <summary>front_end/panels/ai_chat/core/AgentNodes.ts (1)</summary><blockquote> `595-605`: **Validate config and fall back to node defaults; pass endpoint when present** If override/localStorage is incomplete, tool.execute may get undefined apiKey/model. Validate and fall back to constructor defaults. Also pass endpoint for LiteLLM. ```diff - const configManager = LLMConfigurationManager.getInstance(); - const config = configManager.getConfiguration(); - - return await selectedTool.execute(toolArgs as any, { - apiKey: config.apiKey, - provider: config.provider, - model: config.mainModel, - miniModel: config.miniModel, - nanoModel: config.nanoModel - }); + const mgr = LLMConfigurationManager.getInstance(); + const cfg = mgr.getConfiguration(); + const fallback = { + provider: this.provider, + model: this.modelName, + miniModel: this.miniModel, + nanoModel: this.nanoModel, + }; + // Minimal validation: require provider+model; keep apiKey optional to support OAuth/OpenRouter flows + const provider = cfg.provider || (fallback.provider as any); + const model = cfg.mainModel || (fallback.model as any); + return await selectedTool.execute(toolArgs as any, { + apiKey: cfg.apiKey, + provider, + model, + miniModel: cfg.miniModel || fallback.miniModel, + nanoModel: cfg.nanoModel || fallback.nanoModel, + ...(cfg.endpoint ? { endpoint: cfg.endpoint } : {}), + });
eval-server/nodejs/examples/with-http-wrapper.js (1)
**14-17**: Example ports diverge from .env; consider env-driven defaults

The example binds EvalServer to 8082 and HTTP to 8080 while .env defaults to 8080. Consider reading from env with safe defaults to prevent confusion.
-const evalServer = new EvalServer({ - // No authKey - authentication disabled for automated mode - host: '127.0.0.1', - port: 8082 -}); +const evalServer = new EvalServer({ + // No authKey - authentication disabled for automated mode (local only) + host: process.env.HOST || '127.0.0.1', + port: Number(process.env.PORT || 8082), +}); @@ -const httpWrapper = new HTTPWrapper(evalServer, { - port: 8080, - host: '127.0.0.1' -}); +const httpWrapper = new HTTPWrapper(evalServer, { + port: Number(process.env.HTTP_PORT || 8080), + host: process.env.HOST || '127.0.0.1', +});Bind only to loopback in examples to avoid exposing no‑auth endpoints.
Also applies to: 21-23, 28-38
eval-server/nodejs/examples/multiple-evals.js (5)
**29-34**: Prefer CONFIG.defaults for model names.

Hardcoding 'gpt-4' drifts from server defaults; read defaults so examples follow env.

```diff
-      main_model: {
-        provider: 'openai',
-        model: 'gpt-4',
-        api_key: CONFIG.providers.openai.apiKey
-      }
+      main_model: {
+        provider: 'openai',
+        model: CONFIG.defaults.mainModel,
+        api_key: CONFIG.providers.openai.apiKey
+      }
```
45-51
: Same: use CONFIG.defaults or provider‑specific defaults.- model: 'llama-3.1-8b-instant', + model: CONFIG.defaults.mainModel, // or a GROQ default read from CONFIG if added
62-68
: Same: avoid hardcoding OpenRouter model in example.- model: 'anthropic/claude-3-sonnet', + model: CONFIG.defaults.mainModel,
79-86
: Same: LiteLLM model constant may not exist at endpoint.
Use CONFIG.defaults or make the model user‑provided to avoid 404s at proxies.- model: 'claude-3-haiku-20240307', + model: CONFIG.defaults.mainModel,
100-105
: Minor: read host/port from CONFIG for consistency.-const server = new EvalServer({ - authKey: 'hello', - host: '127.0.0.1', - port: 8080 -}); +const server = new EvalServer({ + authKey: 'hello', + host: CONFIG.server.host || '127.0.0.1', + port: CONFIG.server.port || 8080 +});front_end/panels/ai_chat/ui/AIChatPanel.ts (2)
**361-376**: Use LLMConfigurationManager for provider fallback.

This reads from localStorage; prefer the config manager to keep a single source of truth.

```diff
- const currentProvider = localStorage.getItem(PROVIDER_SELECTION_KEY) || 'openai';
+ const currentProvider = LLMConfigurationManager.getInstance().getProvider();
```
1971-1981
: Restore StorageMonitor hooks on panel hide.
To avoid global side effects across DevTools sessions, call restore().override willHide(): void { // Explicitly remove any event listeners to prevent memory leaks + try { StorageMonitor.getInstance().restore(); } catch {} if (this.#boundOnMessagesChanged) { this.#agentService.removeEventListener(AgentEvents.MESSAGES_CHANGED, this.#boundOnMessagesChanged); }
front_end/panels/ai_chat/evaluation/remote/EvaluationProtocol.ts (1)
92-95
: Tighten typing and align casing for provider fields.
Use a union type for provider in EvaluationParams.model to match LLMConfigurationParams; keep casing consistent across protocol objects to reduce translation glue.- provider?: string; + provider?: 'openai' | 'litellm' | 'groq' | 'openrouter'; - api_key?: string; // New: per-request API key + api_key?: string; // Note: snake_case for wire protocol; avoid loggingeval-server/nodejs/src/api-server.js (2)
37-43
: Defaults drift from CONFIG.defaults; unify source of truth.
Hardcoded 'gpt-4.1*' differs from src/config.js defaults. Load defaults from CONFIG to prevent divergence.+import { CONFIG } from './config.js'; @@ - this.configDefaults = { - model: { - main_model: 'gpt-4.1', - mini_model: 'gpt-4.1-mini', - nano_model: 'gpt-4.1-nano', - provider: 'openai' - } - }; + this.configDefaults = { + model: { + main_model: CONFIG.defaults.mainModel, + mini_model: CONFIG.defaults.miniModel, + nano_model: CONFIG.defaults.nanoModel, + provider: CONFIG.defaults.provider + } + }; @@ - this.configDefaults = { - model: { - main_model: 'gpt-4.1', - mini_model: 'gpt-4.1-mini', - nano_model: 'gpt-4.1-nano', - provider: 'openai' - } - }; + this.configDefaults = { + model: { + main_model: CONFIG.defaults.mainModel, + mini_model: CONFIG.defaults.miniModel, + nano_model: CONFIG.defaults.nanoModel, + provider: CONFIG.defaults.provider + } + };Also applies to: 48-55
58-66
: CORS wildcard on local API.
Fine for local dev; consider restricting or making it configurable before shipping.front_end/panels/ai_chat/core/LLMConfigurationManager.ts (1)
64-70
: Defensive provider parsinggetProvider casts arbitrary storage values to LLMProvider. Guard against invalid strings.
getProvider(): LLMProvider { if (this.overrideConfig?.provider) { return this.overrideConfig.provider; } const stored = localStorage.getItem(STORAGE_KEYS.PROVIDER); - return (stored as LLMProvider) || 'openai'; + const allowed: LLMProvider[] = ['openai','litellm','groq','openrouter']; + return allowed.includes(stored as LLMProvider) ? (stored as LLMProvider) : 'openai'; }front_end/panels/ai_chat/core/AgentService.ts (2)
647-681
: Use config manager for endpoint detection; avoid direct localStorage#doesCurrentConfigRequireApiKey reads localStorage directly. Prefer the centralized manager to avoid key drift.
#doesCurrentConfigRequireApiKey(): boolean { try { // Check the selected provider - const selectedProvider = this.#configManager.getProvider(); + const selectedProvider = this.#configManager.getProvider(); // OpenAI provider always requires an API key if (selectedProvider === 'openai') { return true; } // Groq provider always requires an API key if (selectedProvider === 'groq') { return true; } // OpenRouter provider always requires an API key if (selectedProvider === 'openrouter') { return true; } // For LiteLLM, only require API key if no endpoint is configured if (selectedProvider === 'litellm') { - const hasLiteLLMEndpoint = Boolean(localStorage.getItem('ai_chat_litellm_endpoint')); + const hasLiteLLMEndpoint = !!this.#configManager.getEndpoint(); // If we have an endpoint, API key is optional return !hasLiteLLMEndpoint; } // Default to requiring API key for any unknown provider return true;
4-4
: “Cache break” commentsHarmless, but consider removing before merge to keep history noise down.
Also applies to: 26-27
MODEL-CONFIGS.md (1)
56-62
: Fix markdownlint issues (MD036, MD024)Replace emphasis-styled headings with proper “### …” headings and dedupe repeated headings.
Also applies to: 62-62, 384-384
eval-server/nodejs/examples/library-usage.js (1)
50-53
: LiteLLM should be considered configured with endpoint onlyExamples mark LiteLLM configured only if apiKey AND endpoint are present. The rest of the system allows endpoint‑only LiteLLM.
- if (CONFIG.providers.litellm.apiKey && CONFIG.providers.litellm.endpoint) { + if (CONFIG.providers.litellm.endpoint) { availableProviders.push('litellm'); console.log(' ✅ LiteLLM configured'); }eval-server/nodejs/src/lib/EvalServer.js (1)
662-678
: Minor logging nit: messageType for JSON‑RPCsendMessage logs data.type which is undefined for JSON‑RPC responses/requests; consider guarding to avoid “undefined”.
- messageType: data.type + messageType: (data && data.type) ? data.type : (data && data.jsonrpc ? 'jsonrpc' : 'unknown')front_end/panels/ai_chat/evaluation/remote/EvaluationAgent.ts (5)
352-365
: Include endpoint in early override extraction; else LiteLLM/OpenRouter endpoint overrides are ignored.You parse provider/model/apiKey but omit endpoint; add it to extraction and override.
Apply this diff:
- const provider = mainModel?.provider || params.model.provider || 'openai'; - const apiKey = mainModel?.api_key || params.model.api_key; + const provider = mainModel?.provider || params.model.provider || 'openai'; + const apiKey = mainModel?.api_key || params.model.api_key; const modelName = mainModel?.model || mainModel || params.model.main_model; const miniModel = (params.model.mini_model as any)?.model || params.model.mini_model; const nanoModel = (params.model.nano_model as any)?.model || params.model.nano_model; + const endpoint = (params.model as any)?.endpoint || (mainModel as any)?.endpoint; @@ - configManager.setOverride({ + configManager.setOverride({ provider, apiKey, mainModel: modelName, miniModel: miniModel || modelName, - nanoModel: nanoModel || miniModel || modelName + nanoModel: nanoModel || miniModel || modelName, + endpoint });Also applies to: 385-393
545-551
: Also pass through endpoint when building merged chat input.This ensures the chat path can use LiteLLM/OpenRouter endpoints provided in the request.
Apply this diff:
...(params.model && { // Extract nested model configuration properly main_model: (params.model.main_model as any)?.model || params.model.main_model, mini_model: (params.model.mini_model as any)?.model || params.model.mini_model, nano_model: (params.model.nano_model as any)?.model || params.model.nano_model, provider: (params.model.main_model as any)?.provider || params.model.provider, - api_key: (params.model.main_model as any)?.api_key || (params.model as any).api_key + api_key: (params.model.main_model as any)?.api_key || (params.model as any).api_key, + endpoint: (params.model as any)?.endpoint || (params.model.main_model as any)?.endpoint }),
300-307
: Auth failure path: short‑circuit earlier on known failure.You send the verify message and later reject the auth promise. Consider rejecting immediately after sending the negative verification to avoid a dangling connection.
Also applies to: 306-308
4-4
: Nit: “Cache break” comment timestamp is great; consider a link to the change or commit for traceability.
978-986
: Minor: Don’t log whether an API key exists when processing configuration.Even boolean presence can be sensitive in shared logs. Safe to keep in debug builds only.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
eval-server/nodejs/package-lock.json
is excluded by!**/package-lock.json
📒 Files selected for processing (19)
MODEL-CONFIGS.md
(1 hunks)eval-server/nodejs/.env.example
(1 hunks)eval-server/nodejs/CLAUDE.md
(4 hunks)eval-server/nodejs/examples/library-usage.js
(3 hunks)eval-server/nodejs/examples/multiple-evals.js
(4 hunks)eval-server/nodejs/examples/with-http-wrapper.js
(1 hunks)eval-server/nodejs/src/api-server.js
(6 hunks)eval-server/nodejs/src/config.js
(1 hunks)eval-server/nodejs/src/lib/EvalServer.js
(5 hunks)front_end/panels/ai_chat/BUILD.gn
(2 hunks)front_end/panels/ai_chat/LLM/OpenAIProvider.ts
(2 hunks)front_end/panels/ai_chat/common/EvaluationConfig.ts
(3 hunks)front_end/panels/ai_chat/core/AgentNodes.ts
(2 hunks)front_end/panels/ai_chat/core/AgentService.ts
(9 hunks)front_end/panels/ai_chat/core/LLMConfigurationManager.ts
(1 hunks)front_end/panels/ai_chat/evaluation/EvaluationAgent.ts
(3 hunks)front_end/panels/ai_chat/evaluation/remote/EvaluationAgent.ts
(11 hunks)front_end/panels/ai_chat/evaluation/remote/EvaluationProtocol.ts
(2 hunks)front_end/panels/ai_chat/ui/AIChatPanel.ts
(5 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
eval-server/nodejs/src/config.js
📄 CodeRabbit inference engine (eval-server/nodejs/CLAUDE.md)
eval-server/nodejs/src/config.js
: Centralize configuration via environment variables in src/config.js
Require OPENAI_API_KEY and PORT (default 8080) from environment configuration
Files:
eval-server/nodejs/src/config.js
🧠 Learnings (10)
📚 Learning: 2025-08-29T23:08:07.342Z
Learnt from: CR
PR: BrowserOperator/browser-operator-core#0
File: eval-server/nodejs/CLAUDE.md:0-0
Timestamp: 2025-08-29T23:08:07.342Z
Learning: Applies to eval-server/nodejs/src/config.js : Require OPENAI_API_KEY and PORT (default 8080) from environment configuration
Applied to files:
eval-server/nodejs/.env.example
front_end/panels/ai_chat/core/AgentService.ts
eval-server/nodejs/examples/with-http-wrapper.js
eval-server/nodejs/src/config.js
eval-server/nodejs/examples/library-usage.js
📚 Learning: 2025-08-29T23:08:07.342Z
Learnt from: CR
PR: BrowserOperator/browser-operator-core#0
File: eval-server/nodejs/CLAUDE.md:0-0
Timestamp: 2025-08-29T23:08:07.342Z
Learning: Applies to eval-server/nodejs/src/config.js : Centralize configuration via environment variables in src/config.js
Applied to files:
eval-server/nodejs/.env.example
front_end/panels/ai_chat/evaluation/EvaluationAgent.ts
front_end/panels/ai_chat/common/EvaluationConfig.ts
eval-server/nodejs/examples/multiple-evals.js
eval-server/nodejs/src/config.js
eval-server/nodejs/examples/library-usage.js
📚 Learning: 2025-08-29T23:08:07.342Z
Learnt from: CR
PR: BrowserOperator/browser-operator-core#0
File: eval-server/nodejs/CLAUDE.md:0-0
Timestamp: 2025-08-29T23:08:07.342Z
Learning: Applies to eval-server/nodejs/src/evaluator.js : LLM evaluator must integrate with the OpenAI API to judge agent responses
Applied to files:
front_end/panels/ai_chat/LLM/OpenAIProvider.ts
eval-server/nodejs/examples/multiple-evals.js
eval-server/nodejs/CLAUDE.md
front_end/panels/ai_chat/evaluation/remote/EvaluationProtocol.ts
front_end/panels/ai_chat/evaluation/remote/EvaluationAgent.ts
eval-server/nodejs/examples/library-usage.js
📚 Learning: 2025-08-29T23:08:07.342Z
Learnt from: CR
PR: BrowserOperator/browser-operator-core#0
File: eval-server/nodejs/CLAUDE.md:0-0
Timestamp: 2025-08-29T23:08:07.342Z
Learning: Applies to eval-server/nodejs/src/server.js : WebSocket server must accept agent connections, manage agent lifecycle (connect/ready/disconnect), orchestrate evaluation sessions, and handle bidirectional RPC communication
Applied to files:
eval-server/nodejs/examples/with-http-wrapper.js
front_end/panels/ai_chat/evaluation/EvaluationAgent.ts
eval-server/nodejs/src/lib/EvalServer.js
eval-server/nodejs/CLAUDE.md
📚 Learning: 2025-08-29T23:08:07.342Z
Learnt from: CR
PR: BrowserOperator/browser-operator-core#0
File: eval-server/nodejs/CLAUDE.md:0-0
Timestamp: 2025-08-29T23:08:07.342Z
Learning: Applies to eval-server/nodejs/src/rpc-client.js : RPC client must call Evaluate(request: string) -> string on connected agents
Applied to files:
eval-server/nodejs/src/api-server.js
eval-server/nodejs/src/lib/EvalServer.js
eval-server/nodejs/CLAUDE.md
📚 Learning: 2025-08-29T23:08:07.342Z
Learnt from: CR
PR: BrowserOperator/browser-operator-core#0
File: eval-server/nodejs/CLAUDE.md:0-0
Timestamp: 2025-08-29T23:08:07.342Z
Learning: Applies to eval-server/nodejs/src/evaluator.js : LLM evaluator must return structured JSON with scores and reasoning
Applied to files:
eval-server/nodejs/examples/multiple-evals.js
📚 Learning: 2025-08-29T23:08:07.342Z
Learnt from: CR
PR: BrowserOperator/browser-operator-core#0
File: eval-server/nodejs/CLAUDE.md:0-0
Timestamp: 2025-08-29T23:08:07.342Z
Learning: Applies to eval-server/nodejs/src/rpc-client.js : RPC client must implement JSON-RPC 2.0 protocol
Applied to files:
eval-server/nodejs/src/lib/EvalServer.js
eval-server/nodejs/CLAUDE.md
📚 Learning: 2025-08-29T23:08:07.342Z
Learnt from: CR
PR: BrowserOperator/browser-operator-core#0
File: eval-server/nodejs/CLAUDE.md:0-0
Timestamp: 2025-08-29T23:08:07.342Z
Learning: Applies to eval-server/nodejs/src/logger.js : Log all RPC calls, evaluations, and connection events
Applied to files:
eval-server/nodejs/src/lib/EvalServer.js
📚 Learning: 2025-08-29T23:08:07.342Z
Learnt from: CR
PR: BrowserOperator/browser-operator-core#0
File: eval-server/nodejs/CLAUDE.md:0-0
Timestamp: 2025-08-29T23:08:07.342Z
Learning: Applies to eval-server/nodejs/src/rpc-client.js : RPC client must correlate requests and responses using unique IDs
Applied to files:
eval-server/nodejs/CLAUDE.md
📚 Learning: 2025-08-29T23:08:07.342Z
Learnt from: CR
PR: BrowserOperator/browser-operator-core#0
File: eval-server/nodejs/CLAUDE.md:0-0
Timestamp: 2025-08-29T23:08:07.342Z
Learning: Applies to eval-server/nodejs/src/rpc-client.js : RPC client must handle timeouts and error conditions
Applied to files:
eval-server/nodejs/CLAUDE.md
🧬 Code graph analysis (11)
front_end/panels/ai_chat/core/LLMConfigurationManager.ts (5)
front_end/panels/ai_chat/core/AgentService.ts (2)
logger
(295-306)event
(686-722)front_end/panels/ai_chat/ui/AIChatPanel.ts (14)
logger
(957-1050)logger
(1358-1428)logger
(1578-1591)logger
(1635-1638)window
(1055-1102)localStorage
(822-845)provider
(1478-1541)event
(1689-1696)event
(1701-1705)event
(1710-1714)event
(1719-1723)event
(1728-1732)event
(1737-1741)error
(1948-1961)front_end/panels/ai_chat/core/Logger.ts (1)
createLogger
(316-318)front_end/panels/ai_chat/LLM/LLMTypes.ts (1)
LLMProvider
(142-142)front_end/panels/ai_chat/core/Constants.ts (1)
STORAGE_KEYS
(39-45)
front_end/panels/ai_chat/LLM/OpenAIProvider.ts (2)
front_end/panels/ai_chat/core/BuildConfig.ts (1)
BUILD_CONFIG
(11-20)front_end/panels/ai_chat/ui/AIChatPanel.ts (4)
apiKey
(1191-1222)apiKey
(1264-1283)apiKey
(1596-1630)localStorage
(822-845)
front_end/panels/ai_chat/core/AgentService.ts (3)
front_end/panels/ai_chat/core/LLMConfigurationManager.ts (1)
LLMConfigurationManager
(41-368)front_end/panels/ai_chat/core/BuildConfig.ts (1)
BUILD_CONFIG
(11-20)front_end/panels/ai_chat/core/Graph.ts (1)
createAgentGraph
(24-42)
front_end/panels/ai_chat/evaluation/EvaluationAgent.ts (1)
front_end/panels/ai_chat/core/BuildConfig.ts (1)
BUILD_CONFIG
(11-20)
front_end/panels/ai_chat/common/EvaluationConfig.ts (1)
front_end/panels/ai_chat/core/BuildConfig.ts (1)
BUILD_CONFIG
(11-20)
eval-server/nodejs/examples/multiple-evals.js (2)
eval-server/nodejs/src/config.js (2)
CONFIG
(5-63)CONFIG
(5-63)eval-server/nodejs/examples/library-usage.js (3)
response
(63-71)response
(121-135)response
(195-223)
front_end/panels/ai_chat/core/AgentNodes.ts (1)
front_end/panels/ai_chat/core/LLMConfigurationManager.ts (1)
LLMConfigurationManager
(41-368)
eval-server/nodejs/src/lib/EvalServer.js (2)
front_end/panels/ai_chat/core/AgentService.ts (1)
logger
(295-306)eval-server/nodejs/src/config.js (2)
CONFIG
(5-63)CONFIG
(5-63)
front_end/panels/ai_chat/ui/AIChatPanel.ts (2)
front_end/panels/ai_chat/core/LLMConfigurationManager.ts (1)
LLMConfigurationManager
(41-368)front_end/panels/ai_chat/core/AgentService.ts (1)
logger
(295-306)
front_end/panels/ai_chat/evaluation/remote/EvaluationAgent.ts (5)
front_end/panels/ai_chat/evaluation/remote/EvaluationProtocol.ts (5)
isLLMConfigurationRequest
(297-299)LLMConfigurationRequest
(259-264)createLLMConfigurationResponse
(314-334)createErrorResponse
(240-255)ErrorCodes
(141-155)front_end/panels/ai_chat/core/BuildConfig.ts (1)
BUILD_CONFIG
(11-20)front_end/panels/ai_chat/core/AgentService.ts (1)
AgentService
(44-786)front_end/panels/ai_chat/core/LLMConfigurationManager.ts (1)
LLMConfigurationManager
(41-368)eval-server/nodejs/examples/library-usage.js (3)
response
(63-71)response
(121-135)response
(195-223)
eval-server/nodejs/examples/library-usage.js (1)
eval-server/nodejs/src/config.js (2)
CONFIG
(5-63)CONFIG
(5-63)
🪛 dotenv-linter (3.3.0)
eval-server/nodejs/.env.example
[warning] 6-6: [UnorderedKey] The HOST key should go before the PORT key
(UnorderedKey)
[warning] 16-16: [UnorderedKey] The LITELLM_API_KEY key should go before the LITELLM_ENDPOINT key
(UnorderedKey)
[warning] 27-27: [UnorderedKey] The DEFAULT_MAIN_MODEL key should go before the DEFAULT_PROVIDER key
(UnorderedKey)
[warning] 28-28: [UnorderedKey] The DEFAULT_MINI_MODEL key should go before the DEFAULT_PROVIDER key
(UnorderedKey)
[warning] 29-29: [UnorderedKey] The DEFAULT_NANO_MODEL key should go before the DEFAULT_PROVIDER key
(UnorderedKey)
[warning] 33-33: [UnorderedKey] The LOG_DIR key should go before the LOG_LEVEL key
(UnorderedKey)
[warning] 45-45: [EndingBlankLine] No blank line at the end of the file
(EndingBlankLine)
🪛 markdownlint-cli2 (0.17.2)
MODEL-CONFIGS.md
56-56: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
62-62: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
384-384: Multiple headings with the same content
(MD024, no-duplicate-heading)
🔇 Additional comments (33)
front_end/panels/ai_chat/BUILD.gn (1)
59-60
: LGTM: Manager included in both sources and metadataThe new LLMConfigurationManager.ts is correctly added to build and metadata lists.
Also applies to: 196-197
front_end/panels/ai_chat/evaluation/EvaluationAgent.ts (1)
277-281
: Good: no secret leakage in logsYou're logging only boolean presence and verification result, not the secrets themselves.
front_end/panels/ai_chat/LLM/OpenAIProvider.ts (2)
4-4
: Cache-break comment is fineDoc-only; non-functional.
11-11
: AUTOMATED_MODE credential bypass — override wiring verifiedvalidateCredentials() intentionally skips checks in AUTOMATED_MODE; the per-request override path supplies the API key into LLMClient → OpenAIProvider at runtime. Key pointers: front_end/panels/ai_chat/LLM/OpenAIProvider.ts (526–532), front_end/panels/ai_chat/core/LLMConfigurationManager.ts (103–107), front_end/panels/ai_chat/evaluation/remote/EvaluationAgent.ts (874–879), front_end/panels/ai_chat/core/AgentService.ts (116–137), front_end/panels/ai_chat/LLM/LLMClient.ts (78–82). Optional: guard verbose logs that expose API keys/payloads behind debug.
front_end/panels/ai_chat/core/AgentNodes.ts (1)
18-18
: Central config import looks goodUsing the singleton avoids drift across nodes.
front_end/panels/ai_chat/common/EvaluationConfig.ts (2)
33-36
: Default ws://localhost:8082 matches examplesNice alignment with the example start script.
53-64
: Automated mode: good defaults; avoid persisting secretsBehavior matches intent — automated mode defaults evaluation to enabled and endpoint to ws://localhost:8082 and does not persist the secret key. I attempted a repo-wide search for other hardcoded 8080/8082 references, but ripgrep returned "No files were searched" in this environment, so I could not verify. Manually run a repo-wide search for "localhost:8080", "localhost:8082", "ws://localhost:8082", and the ai_chat localStorage keys (ai_chat_evaluation_*) to avoid docs/UI confusion. Also applies to: 68–70, 108–111.
eval-server/nodejs/CLAUDE.md (1)
45-51
: configure_llm implemented and wired in EvalServer.jsSwitch case 'configure_llm' (eval-server/nodejs/src/lib/EvalServer.js:346) calls this.handleConfigureLLM; async handleConfigureLLM is implemented at eval-server/nodejs/src/lib/EvalServer.js:382.
eval-server/nodejs/.env.example (1)
5-6
: Align default ports and pin model IDs — verification requiredRipgrep returned "No files were searched", so repo-wide verification couldn't complete.
- Port: eval-server/nodejs/.env.example (lines 5–6) sets PORT=8080 while examples/server/frontend use 8082 — unify to 8082 (recommended) or update examples to 8080.
- Models: replace placeholders (gpt-4, gpt-4-mini, gpt-3.5-turbo) with pinned, supported IDs for stability (e.g., gpt-4-0613, gpt-3.5-turbo-0613) or leave blank with a comment referencing OpenAI's canonical model list.
- Action: run a repo-wide search for ports and model placeholders (e.g., rg -n --hidden -S 'localhost:808[0-9]' && rg -n --hidden -S 'gpt-4|gpt-4-mini|gpt-3.5-turbo') or allow re-run; verification is incomplete until that search succeeds.
eval-server/nodejs/examples/multiple-evals.js (3)
12-12
: Good: centralized config import.
Importing CONFIG keeps example wiring in sync with server defaults.
94-96
: Nice touch: provider/default label in stack log.
Clearer operator feedback without leaking secrets.
132-141
: Good: runtime log mentions provider and model only.
No API keys in logs. Keep it that way.front_end/panels/ai_chat/ui/AIChatPanel.ts (6)
321-327
: Getter fallback logic LGTM.
Using config manager + fallback to main is correct.
330-337
: Nano fallback chain LGTM.
[nano -> mini -> main] is sensible.
339-347
: Provider resolution via config manager improves cohesion.
350-358
: Same for mini+provider.
727-742
: Config manager initialization at construction: good timing.
1108-1123
: Update tests/e2e to call setLLMConfiguration instead of editing localStorage directly.Search for setLLMConfiguration returned only its definition at front_end/panels/ai_chat/ui/AIChatPanel.ts:1111 — no call sites detected; update any tests/e2e that previously mutated localStorage to use this API. Run: rg -n "setLLMConfiguration\(" --type=ts -C2 to re-check.
eval-server/nodejs/src/config.js (1)
7-9
: Server port defaults OK.
parseInt fallback to 8080 covers unset/invalid env.front_end/panels/ai_chat/evaluation/remote/EvaluationProtocol.ts (1)
259-334
: LLM configuration RPC: API shape looks good.
Clear request/response types and guards.front_end/panels/ai_chat/core/AgentService.ts (4)
192-205
: Redundant API‑key gate after prior validationFollowing the fix above, this gate remains useful for non‑AUTOMATED_MODE. No change needed, just confirming behavior.
If you’d like to keep one source of truth, you can drop the apiKey portion from validateConfiguration entirely and rely on this gate for runtime enforcement in non‑AUTOMATED_MODE.
344-349
: AUTOMATED_MODE cold‑start still hit by validationsendMessage initializes the graph in AUTOMATED_MODE when #graph is absent, but initialize() → #initializeLLMClient would previously fail validation. The earlier fix resolves this path.
After applying the validation bypass, please test an AUTOMATED_MODE run with no stored API key to confirm first message succeeds.
292-306
: Config change handler looks goodMarking the service uninitialized and clearing the graph is the right minimal action on config changes.
171-177
: Log detail is appropriateProvider list and counts are useful; no secrets are logged.
eval-server/nodejs/examples/library-usage.js (1)
76-85
: Good progressive demo flowBasic eval, then model selection, then provider config demos — solid developer UX.
eval-server/nodejs/src/lib/EvalServer.js (3)
329-379
: JSON‑RPC handler: solid structureRouting requests and returning JSON‑RPC compliant responses is correct.
701-729
: Model config layering looks goodNested main/mini/nano with evaluation overrides is sensible.
Also applies to: 730-748
199-205
: Exposing getClientManager() is finePublic accessor is a useful addition.
front_end/panels/ai_chat/evaluation/remote/EvaluationAgent.ts (5)
193-195
: Good: routes configure_llm messages to a dedicated handler.This integrates the new JSON‑RPC method cleanly.
689-762
: Tool timeout handling looks solid.Clear timeouts, trace updates, and error propagation are handled correctly.
782-793
: Heartbeat is fine.Interval cleared on disconnect; low risk.
1-1048
: Repo‑wide verification helpersRun these to confirm assumptions and spot remaining hazards.
**278-284**: Restrict automated-mode auth bypass to local endpoints and make it an explicit opt‑in.

AUTOMATED_MODE currently skips auth for any endpoint — gate the bypass so it only applies to loopback/private hosts (localhost, 127.0.0.1, ::1, *.local) and require an explicit allow flag (e.g., AUTOMATED_MODE_AUTH_BYPASS) in addition to AUTOMATED_MODE; refuse bypass and log a warning for non-local endpoints.
Location: front_end/panels/ai_chat/evaluation/remote/EvaluationAgent.ts (around lines 278–284).
```diff
 logger.info('Processing responses request:', {
   input: requestBody.input,
-  modelConfig
+  modelConfig: nestedModelConfig
 });
```
Do not log full model config (leaks API keys).
This prints nestedModelConfig containing api_key. Remove or redact.
- logger.info('Processing responses request:', {
- input: requestBody.input,
- modelConfig: nestedModelConfig
- });
+ logger.info('Processing responses request', { input: requestBody.input });
If you need visibility, log only provider/model names without secrets.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```javascript
logger.info('Processing responses request:', {
  input: requestBody.input,
  modelConfig: nestedModelConfig
});

logger.info('Processing responses request', { input: requestBody.input });
```
🤖 Prompt for AI Agents
In eval-server/nodejs/src/api-server.js around lines 270 to 273, the current
logger.info call prints nestedModelConfig which may contain secrets (api_key);
remove or redact any sensitive fields before logging. Replace logging of the
full object with a sanitized object that only includes non-sensitive metadata
(e.g., provider name and model name) or explicitly remove api_key/credentials
fields, and ensure nested structures are traversed to strip any secret keys
before passing data to logger.info.
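One possible shape for that sanitization, traversing nested objects and dropping key-like fields before logging; the field-name list is a placeholder and would need to match the server's actual config shape.

```typescript
const SECRET_KEYS = new Set(['api_key', 'apiKey', 'authorization', 'token', 'secret']);

// Recursively strip secret-looking fields so the result is safe to hand to a logger.
function redactSecrets(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map(redactSecrets);
  }
  if (value && typeof value === 'object') {
    const out: Record<string, unknown> = {};
    for (const [key, val] of Object.entries(value as Record<string, unknown>)) {
      out[key] = SECRET_KEYS.has(key) ? '[redacted]' : redactSecrets(val);
    }
    return out;
  }
  return value;
}

// Usage sketch:
// logger.info('Processing responses request', redactSecrets({ input, modelConfig: nestedModelConfig }));
```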
```javascript
  async handleConfigureLLM(connection, params) {
    if (!connection.registered) {
      throw new Error('Client must be registered before configuring LLM');
    }

    const { provider, apiKey, endpoint, models, partial = false } = params;

    // Validate provider
    const supportedProviders = ['openai', 'litellm', 'groq', 'openrouter'];
    if (!supportedProviders.includes(provider)) {
      throw new Error(`Unsupported provider: ${provider}. Supported providers: ${supportedProviders.join(', ')}`);
    }

    // Validate models
    if (!models || !models.main) {
      throw new Error('Main model is required');
    }

    // Store configuration for this client connection
    if (!connection.llmConfig) {
      connection.llmConfig = {};
    }

    // Apply configuration (full or partial update)
    if (partial && connection.llmConfig) {
      // Partial update - merge with existing config
      connection.llmConfig = {
        ...connection.llmConfig,
        provider: provider || connection.llmConfig.provider,
        apiKey: apiKey || connection.llmConfig.apiKey,
        endpoint: endpoint || connection.llmConfig.endpoint,
        models: {
          ...connection.llmConfig.models,
          ...models
        }
      };
    } else {
      // Full update - replace entire config
      connection.llmConfig = {
        provider,
        apiKey: apiKey || CONFIG.providers[provider]?.apiKey,
        endpoint: endpoint || CONFIG.providers[provider]?.endpoint,
        models: {
          main: models.main,
          mini: models.mini || models.main,
          nano: models.nano || models.mini || models.main
        }
      };
    }

    logger.info('LLM configuration updated', {
      clientId: connection.clientId,
      provider: connection.llmConfig.provider,
      models: connection.llmConfig.models,
      hasApiKey: !!connection.llmConfig.apiKey,
      hasEndpoint: !!connection.llmConfig.endpoint
    });

    return {
      status: 'success',
      message: 'LLM configuration updated successfully',
      appliedConfig: {
        provider: connection.llmConfig.provider,
        models: connection.llmConfig.models
      }
    };
  }
```
configure_llm: partial updates require provider/models even when not changing them
For partial=true, the code still validates provider and models.main unconditionally. This rejects legitimate partial updates.
Use an effectiveProvider (from params or existing/default), and only require models.main for full updates.
async handleConfigureLLM(connection, params) {
if (!connection.registered) {
throw new Error('Client must be registered before configuring LLM');
}
- const { provider, apiKey, endpoint, models, partial = false } = params;
+ const { provider, apiKey, endpoint, models, partial = false } = params;
- // Validate provider
- const supportedProviders = ['openai', 'litellm', 'groq', 'openrouter'];
- if (!supportedProviders.includes(provider)) {
- throw new Error(`Unsupported provider: ${provider}. Supported providers: ${supportedProviders.join(', ')}`);
- }
+ const supportedProviders = ['openai', 'litellm', 'groq', 'openrouter'];
+ const existing = connection.llmConfig || {};
+ const effectiveProvider = provider || existing.provider || CONFIG.defaults.provider;
+ if (!supportedProviders.includes(effectiveProvider)) {
+ throw new Error(`Unsupported provider: ${effectiveProvider}. Supported providers: ${supportedProviders.join(', ')}`);
+ }
- // Validate models
- if (!models || !models.main) {
- throw new Error('Main model is required');
- }
+ // Validate models: required only for full update
+ if (!partial && (!models || !models.main)) {
+ throw new Error('Main model is required');
+ }
// Store configuration for this client connection
if (!connection.llmConfig) {
connection.llmConfig = {};
}
// Apply configuration (full or partial update)
if (partial && connection.llmConfig) {
// Partial update - merge with existing config
connection.llmConfig = {
...connection.llmConfig,
- provider: provider || connection.llmConfig.provider,
- apiKey: apiKey || connection.llmConfig.apiKey,
- endpoint: endpoint || connection.llmConfig.endpoint,
+ provider: effectiveProvider,
+ apiKey: apiKey ?? connection.llmConfig.apiKey ?? CONFIG.providers[effectiveProvider]?.apiKey,
+ endpoint: endpoint ?? connection.llmConfig.endpoint ?? CONFIG.providers[effectiveProvider]?.endpoint,
models: {
...connection.llmConfig.models,
...models
}
};
} else {
// Full update - replace entire config
connection.llmConfig = {
- provider,
- apiKey: apiKey || CONFIG.providers[provider]?.apiKey,
- endpoint: endpoint || CONFIG.providers[provider]?.endpoint,
+ provider: effectiveProvider,
+ apiKey: apiKey || CONFIG.providers[effectiveProvider]?.apiKey,
+ endpoint: endpoint || CONFIG.providers[effectiveProvider]?.endpoint,
models: {
- main: models.main,
- mini: models.mini || models.main,
- nano: models.nano || models.mini || models.main
+ main: models.main,
+ mini: models.mini || models.main,
+ nano: models.nano || models.mini || models.main
}
};
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
async handleConfigureLLM(connection, params) { | |
if (!connection.registered) { | |
throw new Error('Client must be registered before configuring LLM'); | |
} | |
const { provider, apiKey, endpoint, models, partial = false } = params; | |
// Validate provider | |
const supportedProviders = ['openai', 'litellm', 'groq', 'openrouter']; | |
if (!supportedProviders.includes(provider)) { | |
throw new Error(`Unsupported provider: ${provider}. Supported providers: ${supportedProviders.join(', ')}`); | |
} | |
// Validate models | |
if (!models || !models.main) { | |
throw new Error('Main model is required'); | |
} | |
// Store configuration for this client connection | |
if (!connection.llmConfig) { | |
connection.llmConfig = {}; | |
} | |
// Apply configuration (full or partial update) | |
if (partial && connection.llmConfig) { | |
// Partial update - merge with existing config | |
connection.llmConfig = { | |
...connection.llmConfig, | |
provider: provider || connection.llmConfig.provider, | |
apiKey: apiKey || connection.llmConfig.apiKey, | |
endpoint: endpoint || connection.llmConfig.endpoint, | |
models: { | |
...connection.llmConfig.models, | |
...models | |
} | |
}; | |
} else { | |
// Full update - replace entire config | |
connection.llmConfig = { | |
provider, | |
apiKey: apiKey || CONFIG.providers[provider]?.apiKey, | |
endpoint: endpoint || CONFIG.providers[provider]?.endpoint, | |
models: { | |
main: models.main, | |
mini: models.mini || models.main, | |
nano: models.nano || models.mini || models.main | |
} | |
}; | |
} | |
logger.info('LLM configuration updated', { | |
clientId: connection.clientId, | |
provider: connection.llmConfig.provider, | |
models: connection.llmConfig.models, | |
hasApiKey: !!connection.llmConfig.apiKey, | |
hasEndpoint: !!connection.llmConfig.endpoint | |
}); | |
return { | |
status: 'success', | |
message: 'LLM configuration updated successfully', | |
appliedConfig: { | |
provider: connection.llmConfig.provider, | |
models: connection.llmConfig.models | |
} | |
}; | |
} | |
async handleConfigureLLM(connection, params) { | |
if (!connection.registered) { | |
throw new Error('Client must be registered before configuring LLM'); | |
} | |
const { provider, apiKey, endpoint, models, partial = false } = params; | |
const supportedProviders = ['openai', 'litellm', 'groq', 'openrouter']; | |
const existing = connection.llmConfig || {}; | |
const effectiveProvider = provider || existing.provider || CONFIG.defaults.provider; | |
if (!supportedProviders.includes(effectiveProvider)) { | |
throw new Error(`Unsupported provider: ${effectiveProvider}. Supported providers: ${supportedProviders.join(', ')}`); | |
} | |
// Validate models: required only for full update | |
if (!partial && (!models || !models.main)) { | |
throw new Error('Main model is required'); | |
} | |
// Store configuration for this client connection | |
if (!connection.llmConfig) { | |
connection.llmConfig = {}; | |
} | |
// Apply configuration (full or partial update) | |
if (partial && connection.llmConfig) { | |
// Partial update - merge with existing config | |
connection.llmConfig = { | |
...connection.llmConfig, | |
provider: effectiveProvider, | |
apiKey: apiKey ?? connection.llmConfig.apiKey ?? CONFIG.providers[effectiveProvider]?.apiKey, | |
endpoint: endpoint ?? connection.llmConfig.endpoint ?? CONFIG.providers[effectiveProvider]?.endpoint, | |
models: { | |
...connection.llmConfig.models, | |
...models | |
} | |
}; | |
} else { | |
// Full update - replace entire config | |
connection.llmConfig = { | |
provider: effectiveProvider, | |
apiKey: apiKey || CONFIG.providers[effectiveProvider]?.apiKey, | |
endpoint: endpoint || CONFIG.providers[effectiveProvider]?.endpoint, | |
models: { | |
main: models.main, | |
mini: models.mini || models.main, | |
nano: models.nano || models.mini || models.main | |
} | |
}; | |
} | |
logger.info('LLM configuration updated', { | |
clientId: connection.clientId, | |
provider: connection.llmConfig.provider, | |
models: connection.llmConfig.models, | |
hasApiKey: !!connection.llmConfig.apiKey, | |
hasEndpoint: !!connection.llmConfig.endpoint | |
}); | |
return { | |
status: 'success', | |
message: 'LLM configuration updated successfully', | |
appliedConfig: { | |
provider: connection.llmConfig.provider, | |
models: connection.llmConfig.models | |
} | |
}; | |
} |
🤖 Prompt for AI Agents
In eval-server/nodejs/src/lib/EvalServer.js around lines 384–450, change the
configure LLM logic so partial updates don't force provider/models validation:
compute an effectiveProvider = params.provider || connection.llmConfig?.provider
|| CONFIG.defaultProvider; only validate supportedProviders when params.provider
is present (or when there is no existing provider at all), and only require
models.main for full updates (partial === false or no existing models). For
partial updates, merge models permissively (do not throw if params.models or
params.models.main is missing) and when filling apiKey/endpoint for the applied
config use params values or fall back to connection.llmConfig values or
CONFIG.providers[effectiveProvider] defaults. Ensure logging and returned
appliedConfig reflect the final effectiveProvider and merged models.
// Validate and add the selected provider
const validation = this.#configManager.validateConfiguration();
if (!validation.isValid) {
  throw new Error(`Configuration validation failed: ${validation.errors.join(', ')}`);
}

if (provider === 'openrouter' && openrouterKey) {
  providers.push({
    provider: 'openrouter' as const,
    apiKey: openrouterKey
  });

// Only add the selected provider if it has valid configuration
switch (provider) {
  case 'openai':
    if (apiKey) {
      providers.push({
        provider: 'openai' as const,
        apiKey
      });
    }
    break;
  case 'litellm':
    if (endpoint) {
      providers.push({
        provider: 'litellm' as const,
        apiKey: apiKey || '', // Can be empty for some LiteLLM endpoints
        providerURL: endpoint
      });
    }
    break;
  case 'groq':
    if (apiKey) {
      providers.push({
        provider: 'groq' as const,
        apiKey
      });
    }
    break;
  case 'openrouter':
    if (apiKey) {
      providers.push({
        provider: 'openrouter' as const,
        apiKey
      });
    }
    break;
}

if (providers.length === 0) {
  let errorMessage = 'OpenAI API key is required for this configuration';
  if (provider === 'litellm') {
    errorMessage = 'LiteLLM endpoint is required for this configuration';
  } else if (provider === 'groq') {
    errorMessage = 'Groq API key is required for this configuration';
  } else if (provider === 'openrouter') {
    errorMessage = 'OpenRouter API key is required for this configuration';
  }
  throw new Error(errorMessage);
  throw new Error(`No valid configuration found for provider ${provider}`);
}
AUTOMATED_MODE blocked by strict validation; skip credential checks when enabled
#initializeLLMClient calls validateConfiguration unconditionally. In AUTOMATED_MODE this rejects configs that intentionally omit API keys (key provided per‑request). Pass an “ignore credentials” flag.
- const validation = this.#configManager.validateConfiguration();
+ const validation = this.#configManager.validateConfiguration({
+ ignoreCredentialRequirements: BUILD_CONFIG.AUTOMATED_MODE
+ });
if (!validation.isValid) {
throw new Error(`Configuration validation failed: ${validation.errors.join(', ')}`);
}
Note: This change depends on LLMConfigurationManager.validateConfiguration supporting the new option. See suggested diff there.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In front_end/panels/ai_chat/core/AgentService.ts around lines 124 to 169, the
unconditional call to this.#configManager.validateConfiguration() causes failure
when AUTOMATED_MODE intentionally omits API keys; update the call to pass an
option like { ignoreCredentials: true } when AUTOMATED_MODE is enabled (e.g.,
detect the mode before validation) so credential presence is not enforced, and
relax the subsequent provider checks to allow adding providers when
AUTOMATED_MODE is enabled (do not require apiKey for providers in that mode, but
still include providerURL for litellm); ensure this change aligns with the
updated LLMConfigurationManager.validateConfiguration signature.
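One possible shape for such an option on the manager side, sketched only (the flag name here is an assumption; a later review note suggests the implemented flag ended up being called skipCredentialChecks):

```typescript
// Hypothetical validation options; not the actual LLMConfigurationManager API.
interface ValidationOptions {
  ignoreCredentialRequirements?: boolean;
}

interface ValidationResult {
  isValid: boolean;
  errors: string[];
}

function validateConfiguration(
  config: { provider: string, apiKey?: string, endpoint?: string, mainModel?: string },
  options: ValidationOptions = {},
): ValidationResult {
  const errors: string[] = [];
  if (!config.mainModel?.trim()) {
    errors.push('Main model is required');
  }
  // Credential checks are skipped in automated mode, where keys arrive per request.
  if (!options.ignoreCredentialRequirements) {
    if (config.provider === 'litellm' && !config.endpoint) {
      errors.push('LiteLLM endpoint is required');
    } else if (config.provider !== 'litellm' && !config.apiKey) {
      errors.push(`${config.provider} API key is required`);
    }
  }
  return { isValid: errors.length === 0, errors };
}
```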
// Clean up early override if we set one
if (hasSetEarlyOverride) {
  try {
    const configManager = LLMConfigurationManager.getInstance();
    configManager.clearOverride();
    logger.info('Cleared early configuration override', { evaluationId: params.evaluationId });
  } catch (clearError) {
    logger.warn('Failed to clear early configuration override:', clearError);
  }
}
Global override cleanup race across finally blocks.
handleEvaluationRequest and executeChatEvaluation both clear the same global override. With concurrency>1, overrides for other evaluations may be cleared unexpectedly.
Short‑term: set maxConcurrency=1 (see prior comment). Long‑term: remove global override; pass config to AgentService per call (e.g., AgentService.sendMessage(input, { provider, apiKey, models… })) and make AgentService use that context instead of global state.
🤖 Prompt for AI Agents
In front_end/panels/ai_chat/evaluation/remote/EvaluationAgent.ts around lines
675-684, the code clears a global LLMConfigurationManager override
(clearOverride) which races with other evaluations and can wipe overrides for
concurrent runs; short-term, set maxConcurrency=1 to avoid concurrent clears,
and remove the clearOverride call here; long-term, stop using global overrides:
change AgentService APIs to accept an explicit per-call config object (provider,
apiKey, models, etc.), thread that config through EvaluationAgent ->
AgentService.sendMessage(..., config) and make AgentService use that passed
context instead of reading/writing LLMConfigurationManager global state, then
delete code paths that set or clear global overrides.
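To make the long-term direction concrete, a minimal sketch of a per-call configuration (the option shape and method names are illustrative assumptions, not the current AgentService API):

```typescript
// Hypothetical per-call configuration instead of a global override.
interface PerCallLLMConfig {
  provider: 'openai' | 'litellm' | 'groq' | 'openrouter';
  mainModel: string;
  miniModel?: string;
  nanoModel?: string;
  apiKey?: string;
  endpoint?: string;
}

class AgentServiceSketch {
  async sendMessage(input: string, config?: PerCallLLMConfig): Promise<string> {
    // Resolve the per-call config first, then fall back to the persisted configuration,
    // so concurrent evaluations never share or clear each other's overrides.
    const effective = config ?? this.loadPersistedConfig();
    return this.runGraph(input, effective);
  }

  private loadPersistedConfig(): PerCallLLMConfig {
    return { provider: 'openai', mainModel: 'gpt-4.1' }; // placeholder fallback
  }

  private async runGraph(input: string, config: PerCallLLMConfig): Promise<string> {
    return `echo(${config.provider}/${config.mainModel}): ${input}`; // placeholder
  }
}
```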
### ✅ Phase 1: Core Infrastructure (COMPLETED)
1. ✅ **Create LLMConfigurationManager** singleton class - `/front_end/panels/ai_chat/core/LLMConfigurationManager.ts`
2. ✅ **Update AgentService** to use manager instead of localStorage
3. ✅ **Update AIChatPanel** model selection methods
4. ✅ **Add configuration refresh mechanisms**

### ✅ Phase 2: Per-Request Override Support (COMPLETED)
5. ✅ **Update EvaluationAgent** per-request override logic
6. ✅ **Update Graph/AgentNodes** configuration passing via LLMConfigurationManager
7. ✅ **Per-request override functionality** implemented

### ✅ Phase 3: Persistent Configuration API (COMPLETED)
8. ✅ **Extend EvaluationProtocol** with `configure_llm` method
9. ✅ **Implement `configure_llm` handler** in EvaluationAgent
10. 🔄 **Add eval-server support** for persistent configuration (server-side implementation needed)
11. ✅ **Add configuration validation** and error handling

### 🔄 Phase 4: Testing & Documentation (NEXT)
12. ✅ **Maintain backwards compatibility** during transition
13. 🔄 **Add comprehensive tests** for both modes
14. ✅ **Update documentation** and usage examples
15. 🔄 **Performance testing** with multiple tabs and configurations
🧩 Analysis chain
Doc/code mismatch: partial configure_llm vs server enforcement
The doc says partial updates are supported, but server handleConfigureLLM currently requires provider and models.main in all cases. See EvalServer.js suggestions to align behavior.
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "---- MODEL-CONFIGS.md lines 335-356 ----"
if [ -f MODEL-CONFIGS.md ]; then
sed -n '335,356p' MODEL-CONFIGS.md || true
else
echo "MODEL-CONFIGS.md not found in repo root"
fi
echo
echo "---- Searching for configure_llm / handleConfigureLLM / EvalServer ----"
rg -n --hidden --no-ignore -S 'configure_llm|configure-llm|handleConfigureLLM|configureLLM|handle_configure_llm' -g '!**/node_modules/**' || true
echo
echo "---- Showing context for matches (±5 lines) ----"
rg -n -C5 --hidden --no-ignore -S 'configure_llm|configure-llm|handleConfigureLLM|configureLLM|handle_configure_llm' -g '!**/node_modules/**' || true
echo
echo "---- Searching for EvalServer filename / class / eval-server directory ----"
rg -n --hidden --no-ignore -S 'class\s+EvalServer|EvalServer.js|EvalServer.ts|eval-server' -g '!**/node_modules/**' || true
find . -type f \( -iname 'evalserver*' -o -iname '*eval-server*' -o -iname 'EvalServer.js' -o -iname 'EvalServer.ts' \) -print || true
echo
echo "---- Done ----"
Length of output: 1950
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "---- File: eval-server/nodejs/src/lib/EvalServer.js (full) ----"
if [ -f eval-server/nodejs/src/lib/EvalServer.js ]; then
nl -ba eval-server/nodejs/src/lib/EvalServer.js | sed -n '1,240p' || true
else
echo "File not found: eval-server/nodejs/src/lib/EvalServer.js"
fi
echo
echo "---- Search for configure/handleConfigure keywords across repo ----"
rg -n --hidden --no-ignore -S 'configure_llm|configure-llm|configureLLM|handleConfigureLLM|handle_configure_llm' -g '!**/node_modules/**' || true
echo
echo "---- Show matches with context (±10 lines) ----"
rg -n -C10 --hidden --no-ignore -S 'configure_llm|configure-llm|configureLLM|handleConfigureLLM|handle_configure_llm' -g '!**/node_modules/**' || true
Length of output: 520
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="eval-server/nodejs/src/lib/EvalServer.js"
echo "---- File: $FILE ----"
if [ -f "$FILE" ]; then
awk '{printf "%5d: %s\n", NR, $0}' "$FILE" || true
else
echo "File not found: $FILE"
fi
echo
echo "---- Repo-wide search for configure/handleConfigure patterns ----"
grep -RIn --exclude-dir=node_modules -C5 -E 'configure(_|-)?llm|configureLLM|handleConfigureLLM|handle_configure_llm' . || true
echo
echo "---- Done ----"
Length of output: 44342
EvalServer unconditionally rejects partial configure_llm updates (provider + models.main required)
- Evidence: eval-server/nodejs/src/lib/EvalServer.js — handleConfigureLLM (lines 384–400) destructures params and then unconditionally validates provider and models.main, so calls with params.partial === true but missing provider/models will throw.
- Action: Make validation conditional — require provider/models.main only when partial is false (or validate only the fields present when partial is true). Ensure partial merges only when connection.llmConfig exists; otherwise reject with a clear error. (Change validation at eval-server/nodejs/src/lib/EvalServer.js lines ~391–400 and adjust the partial-merge logic at ~408–432.)
- Docs: Reconcile MODEL-CONFIGS.md / eval-server/nodejs/CLAUDE.md examples with the server behavior or update docs to reflect stricter requirements.
🤖 Prompt for AI Agents
In MODEL-CONFIGS.md around lines 335 to 356: the comment notes eval-server
currently rejects partial configure_llm updates because EvalServer.js
unconditionally validates provider and models.main; update
eval-server/nodejs/src/lib/EvalServer.js (around lines ~391–400 and ~408–432) so
validation of provider and models.main only occurs when params.partial is false,
and when params.partial is true validate only the fields present (and perform a
partial merge only if connection.llmConfig exists, otherwise return a clear
error indicating no existing config to patch); also adjust the partial-merge
logic to merge safely and persist the result, and update MODEL-CONFIGS.md /
eval-server/nodejs/CLAUDE.md examples to reflect the new behavior or stricter
requirements.
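For reference, once partial updates are accepted server-side, a client-side partial `configure_llm` call could look roughly like this; the JSON-RPC envelope and field names follow the discussion above but are an assumption, not a verified protocol excerpt:

```typescript
// Illustrative only: sending a partial configure_llm update over the eval-server WebSocket.
const partialUpdate = {
  jsonrpc: '2.0',
  id: 'cfg-001',
  method: 'configure_llm',
  params: {
    partial: true,
    // No provider and no models.main: only the mini model changes;
    // everything else is merged from the connection's existing llmConfig.
    models: { mini: 'gpt-4.1-mini' },
  },
};

declare const ws: WebSocket; // assume an open connection to the eval server
ws.send(JSON.stringify(partialUpdate));
```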
Actionable comments posted: 0
♻️ Duplicate comments (1)
front_end/panels/ai_chat/core/AgentService.ts (1)
651-651: Replace direct localStorage access with configuration manager
The method uses `this.#configManager.getProvider()` for the provider but still directly accesses localStorage for the endpoint check at line 670. This inconsistency could lead to mismatched state.
🧹 Nitpick comments (3)
front_end/panels/ai_chat/core/AgentService.ts (3)
113-178: Consider adding provider-specific error messages
The initialization logic is solid with AUTOMATED_MODE support. However, when validation fails, the error message could be more specific about which provider failed and why.
Consider enhancing the error message to include the provider name:
if (!validation.isValid) { - throw new Error(`Configuration validation failed: ${validation.errors.join(', ')}`); + throw new Error(`Configuration validation failed for ${provider}: ${validation.errors.join(', ')}`); }
296-307: Consider handling ongoing operations during configuration changes
The configuration change handler properly marks the service for reinitialization. However, if there's an ongoing graph operation when the configuration changes, it might lead to undefined behavior.
Consider canceling any ongoing operations before clearing the graph:
#handleConfigurationChange(): void { logger.info('LLM configuration changed, reinitializing if needed'); // If we're initialized, we need to reinitialize with new configuration if (this.#isInitialized) { + // Cancel any ongoing graph operations + if (this.#runningGraphStatePromise) { + logger.info('Canceling ongoing graph operation due to config change'); + this.#runningGraphStatePromise = undefined; + } + // Mark as uninitialized to force reinit on next use this.#isInitialized = false; this.#graph = undefined; logger.info('Marked agent service for reinitialization due to config change'); } }
313-328: Add error recovery mechanism for failed reinitialization
The refresh method properly reinitializes the service, but if reinitialization fails, the service is left in an unusable state.
Consider adding a fallback to restore the previous state if reinitialization fails:
async refreshCredentials(): Promise<void> { logger.info('Refreshing credentials and reinitializing agent service'); + // Store current state in case we need to rollback + const previousInitialized = this.#isInitialized; + const previousGraph = this.#graph; + this.#isInitialized = false; this.#graph = undefined; // Force reinitialization on next use try { const config = this.#configManager.getConfiguration(); await this.initialize(config.apiKey || null, config.mainModel, config.miniModel || '', config.nanoModel || ''); logger.info('Agent service reinitialized successfully'); } catch (error) { logger.error('Failed to reinitialize agent service:', error); + // Restore previous state if we were previously initialized + if (previousInitialized && previousGraph) { + this.#isInitialized = previousInitialized; + this.#graph = previousGraph; + logger.warn('Restored previous state after failed reinitialization'); + } throw error; } }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
front_end/panels/ai_chat/LLM/OpenAIProvider.ts (2 hunks)
front_end/panels/ai_chat/core/AgentService.ts (9 hunks)
front_end/panels/ai_chat/core/LLMConfigurationManager.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- front_end/panels/ai_chat/core/LLMConfigurationManager.ts
- front_end/panels/ai_chat/LLM/OpenAIProvider.ts
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-29T23:08:07.342Z
Learnt from: CR
PR: BrowserOperator/browser-operator-core#0
File: eval-server/nodejs/CLAUDE.md:0-0
Timestamp: 2025-08-29T23:08:07.342Z
Learning: Applies to eval-server/nodejs/src/config.js : Require OPENAI_API_KEY and PORT (default 8080) from environment configuration
Applied to files:
front_end/panels/ai_chat/core/AgentService.ts
🧬 Code graph analysis (1)
front_end/panels/ai_chat/core/AgentService.ts (3)
front_end/panels/ai_chat/core/LLMConfigurationManager.ts (1): LLMConfigurationManager (42-372)
front_end/panels/ai_chat/core/BuildConfig.ts (1): BUILD_CONFIG (11-20)
front_end/panels/ai_chat/core/Graph.ts (1): createAgentGraph (24-42)
🔇 Additional comments (4)
front_end/panels/ai_chat/core/AgentService.ts (4)
17-18: LGTM! Clean centralized configuration management setup
Good architectural decision to use a centralized configuration manager. The imports are properly organized and the singleton pattern ensures consistent state management across the application.
Also applies to: 24-24
62-90: LGTM! Proper initialization with configuration change handling
Good initialization of the configuration manager and subscription to configuration changes. This enables real-time config updates without requiring a restart.
193-205: LGTM! Proper AUTOMATED_MODE bypass for API key validation
The implementation correctly handles AUTOMATED_MODE by bypassing API key requirements when appropriate. The provider-specific error messages are clear and helpful.
296-307: LGTM! Proper AUTOMATED_MODE initialization in sendMessage
Good implementation of AUTOMATED_MODE support that ensures the graph is initialized even without an API key. This allows automated testing and evaluation flows to work correctly.
2. ✓ Fixed multi-provider credential preservation in saveProviderSpecificSettings()
3. ✓ Secured logging in both handleStorageChange() and getDebugInfo()
Actionable comments posted: 0
♻️ Duplicate comments (3)
front_end/panels/ai_chat/core/LLMConfigurationManager.ts (3)
253-293: Skip-credential checks: addressed and looks good.
The AUTOMATED_MODE bypass via skipCredentialChecks is correctly implemented.
295-336: Per-provider credential persistence: addressed and looks good.
Only the active provider's keys are updated; others are preserved.
373-389: Debug info redaction: looks good.
apiKey is masked in both override and current configs.
🧹 Nitpick comments (7)
front_end/panels/ai_chat/core/LLMConfigurationManager.ts (7)
65-71: Harden provider parsing; avoid unsafe cast.
Validate stored provider against the allowed set instead of casting arbitrary strings.
Apply this diff:
getProvider(): LLMProvider { if (this.overrideConfig?.provider) { return this.overrideConfig.provider; } - const stored = localStorage.getItem(STORAGE_KEYS.PROVIDER); - return (stored as LLMProvider) || 'openai'; + const stored = localStorage.getItem(STORAGE_KEYS.PROVIDER); + switch (stored) { + case 'openai': + case 'litellm': + case 'groq': + case 'openrouter': + return stored; + default: + return 'openai'; + } }
145-153: Don't populate optional fields with empty strings in getConfiguration.
Map empty strings to undefined for optional fields to align with LLMConfig and avoid downstream truthiness bugs.
Apply this diff:
getConfiguration(): LLMConfig { return { provider: this.getProvider(), - apiKey: this.getApiKey(), + apiKey: this.getApiKey() || undefined, endpoint: this.getEndpoint(), mainModel: this.getMainModel(), - miniModel: this.getMiniModel(), - nanoModel: this.getNanoModel(), + miniModel: this.getMiniModel() || undefined, + nanoModel: this.getNanoModel() || undefined, }; }
266-269: Trim whitespace when validating mainModel.
Prevent accepting a string of spaces.
Apply this diff:
-    if (!config.mainModel) {
+    if (!config.mainModel?.trim()) {
       errors.push('Main model is required');
     }
341-358: Redact endpoint and tighten event-key typing in storage logs.
Endpoints can reveal internal hosts or bearer tokens in URLs; redact them too. Also avoid the any cast by using a Set check.
Apply this diff:
- private handleStorageChange(event: StorageEvent): void { - if (event.key && Object.values(STORAGE_KEYS).includes(event.key as any)) { - const sensitiveKeys = new Set([ + private handleStorageChange(event: StorageEvent): void { + const allKeys = new Set(Object.values(STORAGE_KEYS)); + if (event.key && allKeys.has(event.key)) { + const sensitiveKeys = new Set<string>([ STORAGE_KEYS.OPENAI_API_KEY, STORAGE_KEYS.LITELLM_API_KEY, STORAGE_KEYS.GROQ_API_KEY, STORAGE_KEYS.OPENROUTER_API_KEY, + STORAGE_KEYS.LITELLM_ENDPOINT, // endpoints may be sensitive ]); - const redacted = - sensitiveKeys.has(event.key as any) ? '(redacted)' : + const redacted = + sensitiveKeys.has(event.key) ? '(redacted)' : (event.newValue ? `${event.newValue.slice(0, 8)}…` : null); logger.debug('Configuration changed in another tab', { key: event.key, newValue: redacted }); this.notifyListeners(); } }
363-371: Debounce change notifications to avoid listener storms.
Saving updates multiple keys, which can spam listeners across tabs. Batch via microtask.
Apply this diff:
export class LLMConfigurationManager { private static instance: LLMConfigurationManager; private overrideConfig?: Partial<LLMConfig>; // Override for automated mode private changeListeners: Array<() => void> = []; + private notifyQueued = false;
private notifyListeners(): void { - this.changeListeners.forEach(listener => { - try { - listener(); - } catch (error) { - logger.error('Error in configuration change listener:', error); - } - }); + if (this.notifyQueued) return; + this.notifyQueued = true; + queueMicrotask(() => { + this.notifyQueued = false; + for (const listener of this.changeListeners) { + try { + listener(); + } catch (error) { + logger.error('Error in configuration change listener:', error); + } + } + }); }Also applies to: 43-50
239-251: Return an unsubscribe and prevent duplicate listeners.
Improves ergonomics and avoids accidental duplicate registrations.
Apply this diff:
- addChangeListener(listener: () => void): void { - this.changeListeners.push(listener); - } + addChangeListener(listener: () => void): () => void { + if (!this.changeListeners.includes(listener)) { + this.changeListeners.push(listener); + } + return () => this.removeChangeListener(listener); + }
199-220: Minor: consider grouping saves to reduce cross-tab churn.
If feasible, persist fewer keys (e.g., skip writing unchanged values) to minimize storage events.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
front_end/panels/ai_chat/core/LLMConfigurationManager.ts
(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
front_end/panels/ai_chat/core/LLMConfigurationManager.ts (3)
front_end/panels/ai_chat/core/AgentService.ts (1)
logger
(296-307)front_end/panels/ai_chat/core/Logger.ts (1)
createLogger
(316-318)front_end/panels/ai_chat/LLM/LLMTypes.ts (1)
LLMProvider
(142-142)
…e, implemented single consolidated override
Actionable comments posted: 1
♻️ Duplicate comments (3)
front_end/panels/ai_chat/evaluation/remote/EvaluationAgent.ts (3)
537-560: Don't merge persisted config into chat input unless an override is actually set.
Using getConfiguration() unconditionally can clobber per-request values with localStorage. Check hasOverride() first.
677-686: Global override cleanup race across finally blocks.
handleEvaluationRequest and executeChatEvaluation both clear the same global override. With concurrency>1, overrides for other evaluations may be cleared unexpectedly.
978-1023: Support partial LLM configuration updates, but rollback on validation failure needs improvement.
The current handler validates after saving, and the rollback mechanism (restoring currentConfig) has issues: saveConfiguration might fail silently or partially apply changes. Instead, validate the merged config first, then save only if valid.
Apply this diff to validate before persisting:
// Merge configurations based on partial flag const mergedConfig = { provider: params.partial ? (params.provider ?? currentConfig.provider) : params.provider, apiKey: params.partial ? (params.apiKey ?? currentConfig.apiKey) : params.apiKey, endpoint: params.partial ? (params.endpoint ?? currentConfig.endpoint) : params.endpoint, mainModel: params.partial ? (params.models?.main ?? currentConfig.mainModel) : params.models.main, miniModel: params.partial ? (params.models?.mini ?? currentConfig.miniModel) : params.models?.mini, nanoModel: params.partial ? (params.models?.nano ?? currentConfig.nanoModel) : params.models?.nano }; - // Validate the merged configuration - const validation = configManager.validateConfiguration(); - - // Check validation after setting the merged config temporarily - configManager.saveConfiguration(mergedConfig); - const postSaveValidation = configManager.validateConfiguration(); - - if (!postSaveValidation.isValid) { - // Restore the original config if validation fails - configManager.saveConfiguration(currentConfig); + // Temporarily set override to validate the merged config + configManager.setOverride(mergedConfig); + const validation = configManager.validateConfiguration(); + configManager.clearOverride(); + + if (!validation.isValid) { // Send error response with validation errors const errorResponse = createErrorResponse( id, ErrorCodes.INVALID_PARAMS, 'Invalid configuration', - { errors: postSaveValidation.errors } + { errors: validation.errors } ); if (this.client) { this.client.send(errorResponse); } logger.error('Configuration validation failed', { - errors: postSaveValidation.errors + errors: validation.errors }); return; } + + // Save configuration only after validation passes + configManager.saveConfiguration(mergedConfig);
🧹 Nitpick comments (1)
front_end/panels/ai_chat/evaluation/remote/EvaluationAgent.ts (1)
4-4: Remove or update the obsolete cache break comment.
The comment references a date from September 2025, which appears to be in the future. This should either be removed or updated with an accurate date.
-// Cache break: 2025-09-17T17:38:00Z - Fixed API key validation
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
front_end/panels/ai_chat/evaluation/remote/EvaluationAgent.ts (12 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-29T23:08:07.342Z
Learnt from: CR
PR: BrowserOperator/browser-operator-core#0
File: eval-server/nodejs/CLAUDE.md:0-0
Timestamp: 2025-08-29T23:08:07.342Z
Learning: Applies to eval-server/nodejs/src/evaluator.js : LLM evaluator must integrate with the OpenAI API to judge agent responses
Applied to files:
front_end/panels/ai_chat/evaluation/remote/EvaluationAgent.ts
🧬 Code graph analysis (1)
front_end/panels/ai_chat/evaluation/remote/EvaluationAgent.ts (6)
front_end/panels/ai_chat/evaluation/remote/EvaluationProtocol.ts (6)
isLLMConfigurationRequest
(297-299)createAuthVerifyMessage
(197-203)LLMConfigurationRequest
(259-264)createErrorResponse
(240-255)ErrorCodes
(141-155)createLLMConfigurationResponse
(314-334)front_end/panels/ai_chat/core/BuildConfig.ts (1)
BUILD_CONFIG
(11-20)front_end/panels/ai_chat/core/AgentService.ts (1)
AgentService
(44-787)front_end/panels/ai_chat/core/LLMConfigurationManager.ts (2)
LLMConfigurationManager
(42-390)hasOverride
(184-186)front_end/panels/ai_chat/models/ChatTypes.ts (1)
ChatMessage
(74-75)front_end/panels/ai_chat/tracing/TracingConfig.ts (1)
withTracingContext
(313-315)
🔇 Additional comments (3)
front_end/panels/ai_chat/evaluation/remote/EvaluationAgent.ts (3)
278-284: LGTM! Clean automated-mode authentication bypass.
The implementation correctly checks BUILD_CONFIG.AUTOMATED_MODE and sends a verified auth response early, bypassing the secret key verification flow for Docker/CI deployments.
352-413: Early override configuration extraction handles both flat and nested formats well.
The debug logging is comprehensive and the extraction logic correctly handles both nested (`main_model: { provider, model, api_key }`) and flat formats with proper fallbacks.
958-961: Override cleanup OK — maxConcurrency = 3; confirm concurrency strategy.
Finally block clears the config override. The register message sets maxConcurrency: 3 in front_end/panels/ai_chat/evaluation/remote/EvaluationAgent.ts; since concurrency >1 a race condition is still possible — either reduce to 1 or add per-evaluation locking/scoped overrides.
const provider = input.provider || config.provider || 'openai';
const mainModel = input.main_model || config.mainModel || this.judgeModel;
const miniModel = input.mini_model ?? config.miniModel ?? this.miniModel;
const nanoModel = input.nano_model ?? config.nanoModel ?? this.nanoModel;
const apiKey = input.api_key ?? config.apiKey ?? agentService.getApiKey();
const endpoint = input.endpoint ?? config.endpoint;

logger.info('Setting consolidated configuration override for chat evaluation', {
  provider: provider,
  mainModel: mainModel,
  hasApiKey: !!apiKey,
  hasEndpoint: !!endpoint,
  isInitialized: agentService.isInitialized()
});

// Always reinitialize with the current model and explicit mini/nano
await agentService.initialize(agentService.getApiKey(), modelName, miniModel, nanoModel);

// Set single consolidated override with all fallback values
configManager.setOverride({
  provider: provider,
  mainModel: mainModel,
  miniModel: miniModel,
  nanoModel: nanoModel,
  apiKey: apiKey || undefined,
  endpoint: endpoint || undefined
});
🛠️ Refactor suggestion
🧩 Analysis chain
Single override point with merged fallback values (looks good but ensure no duplicates).
The consolidation looks correct: you set a single override with proper fallback values (input → config → defaults), avoiding the duplicate override issue. Just ensure this is the only setOverride call in this flow.
🏁 Script executed:
#!/bin/bash
# Verify there's only one setOverride call in the executeChatEvaluation flow
rg -A3 -B3 'configManager\.setOverride' front_end/panels/ai_chat/evaluation/remote/EvaluationAgent.ts | grep -n 'setOverride'
Length of output: 76
🏁 Script executed:
#!/bin/bash
set -euo pipefail
file="front_end/panels/ai_chat/evaluation/remote/EvaluationAgent.ts"
echo "File: $file"
echo
echo "Search for executeChatEvaluation definition:"
rg -n --line-number 'executeChatEvaluation' "$file" || true
echo
echo "Matches for configManager.setOverride with context:"
rg -n -C5 'configManager\.setOverride' "$file" || true
Length of output: 1023
Consolidate duplicate configManager.setOverride calls — keep a single consolidated override or explicitly justify both.
- front_end/panels/ai_chat/evaluation/remote/EvaluationAgent.ts — configManager.setOverride at line 385.
- front_end/panels/ai_chat/evaluation/remote/EvaluationAgent.ts — configManager.setOverride at line 864.
🤖 Prompt for AI Agents
front_end/panels/ai_chat/evaluation/remote/EvaluationAgent.ts lines ~385 and
848-871: there are duplicate configManager.setOverride calls; keep a single
consolidated override (the one at 848-871 which already collects provider,
mainModel, miniModel, nanoModel, apiKey and endpoint) or, if both are required
for different lifecycles, add a clear comment justifying the second call and
ensure the earlier call only sets the minimal/intentional fields. Fix by
removing the redundant call at line 385 (or merging its unique fields into the
consolidated call at 848-871), and run a quick search for usages relying on the
earlier override to confirm no behavioral regression.
Actionable comments posted: 20
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (17)
eval-server/nodejs/schemas/client.schema.json (1)
156-158
: Add "llm_judge" to client.schema.json enum for backward compatibilityMany eval YAMLs under eval-server/nodejs/evals use type: "llm_judge" (underscore). Expand the enum in eval-server/nodejs/schemas/client.schema.json to accept both forms to avoid validation failures.
"type": "string", - "enum": ["llm-judge", "snapshot", "hybrid"] + "enum": ["llm-judge", "llm_judge", "snapshot", "hybrid"]front_end/panels/ai_chat/tools/FetcherTool.ts (1)
99-116
: Abort handling: stop processing on AbortError instead of accumulating failures.Current loop catches AbortError per URL and continues, producing many failed entries post‑abort. Short‑circuit and return partial results.
for (const url of urlsToProcess) { throwIfAborted(); try { logger.info('Processing URL', { url }); const fetchedContent = await this.fetchContentFromUrl(url, reasoning, ctx); results.push(fetchedContent); } catch (error: any) { - logger.error('Error processing URL', { url, error: error.message, stack: error.stack }); - results.push({ - url, - title: '', - markdownContent: '', - success: false, - error: `Failed to process URL: ${error.message}` - }); + if (error instanceof DOMException && error.name === 'AbortError') { + logger.info('Fetch aborted; stopping further URL processing.'); + return { + sources: results, + success: results.some(r => r.success), + error: 'aborted' + }; + } + logger.error('Error processing URL', { url, error: error.message, stack: error.stack }); + results.push({ + url, + title: '', + markdownContent: '', + success: false, + error: `Failed to process URL: ${error.message}` + }); } }front_end/panels/ai_chat/tools/Tools.ts (3)
1129-1159
: Escape selector to avoid JS eval breakage/injectionselector is embedded unescaped into an evaluated string. Quotes or backticks in selector can break the expression or be abused.
Use JSON.stringify for safe injection:
- const result = await target.runtimeAgent().invoke_evaluate({ - expression: `(() => { - const element = document.querySelector("${selector}"); + const result = await target.runtimeAgent().invoke_evaluate({ + expression: `(() => { + const sel = ${JSON.stringify(selector)}; + const element = document.querySelector(sel); if (!element) { return { success: false, - message: "Element not found with selector: ${selector}" + message: "Element not found with selector: " + sel }; }
1204-1213
: Escape query string in evaluated codequery is directly interpolated; escape to prevent syntax errors/injection.
- const query = "${query}"; + const query = ${JSON.stringify(query)}; const limit = ${limit};
3181-3199
: Escape XPath in evaluated codexpath is injected into document.evaluate without escaping; can break the expression.
- expression: ` - (function() { - const element = document.evaluate("${xpath}", document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue; + expression: ` + (function() { + const xpath = ${JSON.stringify(xpath)}; + const element = document.evaluate(xpath, document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue; if (!element) return { found: false };front_end/panels/ai_chat/evaluation/test-cases/schema-extractor-tests.ts (2)
120-126
: Replace remaining 'extract_schema' UI references with 'extract_data'TOOL_TEST_MAPPING and the default toolType still reference 'extract_schema' which doesn't match ToolRegistry (it registers 'extract_data'); update these to 'extract_data'.
- front_end/panels/ai_chat/ui/EvaluationDialog.ts:28 — TOOL_TEST_MAPPING key 'extract_schema'
- front_end/panels/ai_chat/ui/EvaluationDialog.ts:175 — initial state toolType: 'extract_schema'
24-31
: Action required: 'extract_data' routes to SchemaBasedExtractorTool (not CombinedExtractionTool)createSchemaTest uses 'extract_data' but ToolRegistry.registerToolFactory('extract_data', () => new SchemaBasedExtractorTool()) — CombinedExtractionTool is named 'navigate_url_and_extraction' and is not bound to 'extract_data'.
Files to inspect/fix:
- front_end/panels/ai_chat/evaluation/test-cases/schema-extractor-tests.ts (uses 'extract_data')
- front_end/panels/ai_chat/agent_framework/implementation/ConfiguredAgents.ts (registerToolFactory('extract_data', ...))
- front_end/panels/ai_chat/tools/SchemaBasedExtractorTool.ts (name = 'extract_data')
- front_end/panels/ai_chat/tools/CombinedExtractionTool.ts (name = 'navigate_url_and_extraction')
Resolution options:
- Update tests to use 'navigate_url_and_extraction' if CombinedExtractionTool is intended, or
- Change ConfiguredAgents.ts to register CombinedExtractionTool under 'extract_data' and verify CombinedExtractionTool's argument/response contract matches the test expectations.
config/gni/devtools_grd_files.gni (1)
661-676
: Missing LLMConfigurationManager in GRD may break imports at runtimeAI summary and PR context reference LLMConfigurationManager, but it isn't listed here. If any frontend code imports it, the GRD omission will 404 in DevTools.
Add it alongside the other ai_chat core files:
"front_end/panels/ai_chat/core/AgentErrorHandler.js", + "front_end/panels/ai_chat/core/LLMConfigurationManager.js", "front_end/panels/ai_chat/core/AgentDescriptorRegistry.js",
front_end/panels/ai_chat/ui/ChatView.ts (1)
1015-1032
: Avoid navigating DevTools UI with window.location; open in new tab via host APIUsing window.location.href navigates the DevTools front-end away from DevTools. Use InspectorFrontendHost to open external links safely.
Apply:
+import * as Host from '../../../core/host/host.js'; @@ - // Navigate to OpenAI API keys page in current window - window.location.href = 'https://platform.openai.com/settings/organization/api-keys'; + // Open in a new tab to avoid navigating DevTools UI + Host.InspectorFrontendHost.InspectorFrontendHostInstance.openInNewTab( + 'https://platform.openai.com/settings/organization/api-keys' + );Optional: prefer DevTools settings storage over raw localStorage for provider selection.
front_end/panels/ai_chat/ui/__tests__/AIChatPanel.test.ts (1)
194-209
: Fix assertion: Map.get returns undefined, not null.Your mock localStorage uses a Map; after delete, Map.get(...) returns undefined. The current strictEqual(null) will fail.
Apply:
- // Should clear mini model (falls back to main model) - assert.strictEqual(mockLocalStorage.get('ai_chat_mini_model'), null); + // Should clear mini model (falls back to main model) + assert.strictEqual(mockLocalStorage.get('ai_chat_mini_model'), undefined); + // Or, if you want to assert via the Storage API semantics: + assert.strictEqual(window.localStorage.getItem('ai_chat_mini_model'), null);front_end/panels/ai_chat/core/ToolNameMap.ts (1)
46-57
: Potential spec violation: collision resolution can exceed 64 chars.After appending - and _, names can exceed 64 even if sanitize() truncated. This violates the provider regex and may be rejected.
Apply:
let candidate = sanitize(original); if (sanitizedToOriginal.has(candidate) && sanitizedToOriginal.get(candidate) !== original) { const suffix = shortHash(original); - const base = candidate.replace(/_+$/g, ''); - candidate = base + '-' + suffix; + const base = candidate.replace(/_+$/g, ''); + // Ensure total length <= 64 after adding "-<suffix>" + const maxBase = Math.max(1, 64 - 1 - suffix.length); + candidate = (base.length > maxBase ? base.slice(0, maxBase) : base) + '-' + suffix; } let unique = candidate; let counter = 1; while (sanitizedToOriginal.has(unique) && sanitizedToOriginal.get(unique) !== original) { - const add = `_${counter++}`; - unique = candidate + add; + const add = `_${counter++}`; + // Ensure total length <= 64 when adding numeric suffix + const maxCandidate = Math.max(1, 64 - add.length); + unique = candidate.length > maxCandidate ? candidate.slice(0, maxCandidate) + add : candidate + add; }Also consider logging when post-collision trimming occurs for observability.
front_end/panels/ai_chat/ui/AIChatPanel.ts (5)
53-64
: Stop logging secret previews in StorageMonitor (severe secret leakage).Current logs include value previews/lengths for keys that may contain API secrets. Remove them.
- localStorage.setItem = (key: string, value: string) => { - if (key.includes('openrouter') || key.includes('ai_chat')) { - logger.debug(`=== LOCALSTORAGE SET ===`); - logger.debug(`Key: ${key}`); - logger.debug(`Value exists: ${!!value}`); - logger.debug(`Value length: ${value?.length || 0}`); - logger.debug(`Value preview: ${value?.substring(0, 50) + (value?.length > 50 ? '...' : '') || 'null'}`); - logger.debug(`Timestamp: ${new Date().toISOString()}`); - } - return this.originalSetItem(key, value); - }; + localStorage.setItem = (key: string, value: string) => { + if (key.includes('openrouter') || key.includes('ai_chat')) { + logger.debug(`=== LOCALSTORAGE SET ===`); + logger.debug(`Key: ${key}`); + logger.debug(`Timestamp: ${new Date().toISOString()}`); + } + return this.originalSetItem(key, value); + };
1071-1091
: Remove API key presence/length logging on OAuth success.Avoid logging any characteristics of secrets.
- window.addEventListener('openrouter-oauth-success', async () => { + window.addEventListener('openrouter-oauth-success', async () => { logger.info('=== OAUTH SUCCESS EVENT RECEIVED IN AICHATPANEL ==='); - logger.info('Timestamp:', new Date().toISOString()); - logger.info('Current localStorage state for OpenRouter:'); - const apiKey = localStorage.getItem('ai_chat_openrouter_api_key'); - const authMethod = localStorage.getItem('openrouter_auth_method'); - logger.info('- API key exists:', !!apiKey); - logger.info('- API key length:', apiKey?.length || 0); - logger.info('- Auth method:', authMethod); + logger.info('Timestamp:', new Date().toISOString()); + const apiKey = localStorage.getItem('ai_chat_openrouter_api_key');
1104-1115
: Avoid logging secret lengths on storage change events.- if (event.key === 'ai_chat_openrouter_api_key' || + if (event.key === 'ai_chat_openrouter_api_key' || event.key === 'openrouter_auth_method') { logger.info('=== STORAGE CHANGE EVENT FOR OPENROUTER ==='); logger.info('Changed key:', event.key); - logger.info('Old value exists:', !!event.oldValue); - logger.info('New value exists:', !!event.newValue); - logger.info('New value length:', event.newValue?.length || 0); logger.info('Re-initializing agent service after storage change...'); this.#initializeAgentService(); }
1144-1154
: Remove API key presence/length logs from refreshCredentials.- logger.info('Current OpenRouter storage state:'); - const apiKey = localStorage.getItem('ai_chat_openrouter_api_key'); - const authMethod = localStorage.getItem('openrouter_auth_method'); - logger.info('- API key exists:', !!apiKey); - logger.info('- API key length:', apiKey?.length || 0); - logger.info('- Auth method:', authMethod); + // Redacted OpenRouter storage state (do not log secrets)
1532-1541
: Do not log API key prefixes.Even partial keys are sensitive.
- logger.info('Retrieved API key:'); - logger.info('- Exists:', !!apiKey); - logger.info('- Length:', apiKey?.length || 0); - logger.info('- Prefix:', apiKey?.substring(0, 8) + '...' || 'none'); + logger.info('Retrieved API key (redacted):', { exists: !!apiKey });front_end/panels/ai_chat/agent_framework/AgentRunner.ts (1)
889-902: Complex handoff logic with nested session handling could lead to lost sessions
The handoff execution for LLM-triggered tools has complex session management that could result in orphaned or incorrectly linked sessions if an error occurs during handoff.
Consider extracting the handoff session management into a separate, well-tested utility function to reduce complexity and potential for errors.
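A rough shape for such a helper, purely illustrative (the session fields and the AgentRunner internals are assumptions, not the actual types in AgentRunner.ts):

```typescript
// Hypothetical helper that keeps parent/child session linking in one place,
// so a failed handoff never leaves an orphaned or unlinked session behind.
interface AgentSession {
  id: string;
  parentId?: string;
  status: 'running' | 'completed' | 'error';
  endTime?: Date;
}

async function executeHandoff<T>(
  parent: AgentSession,
  child: AgentSession,
  run: (child: AgentSession) => Promise<T>,
): Promise<T> {
  child.parentId = parent.id; // link up front so the child is never orphaned
  try {
    const result = await run(child);
    child.status = 'completed';
    return result;
  } catch (error) {
    child.status = 'error';
    throw error;
  } finally {
    child.endTime = new Date();
  }
}
```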
tool: "extract_data" | ||
timeout: 45000 | ||
|
🛠️ Refactor suggestion
🧩 Analysis chain
Tool rename to "extract_data" and longer timeout look good; verify end‑to‑end support.
Confirm server/tool registry, schemas, and tests accept "extract_data" (no lingering "extract_schema_data" usage).
🏁 Script executed:
#!/bin/bash
# Check for any remaining references to the old tool name
rg -nP '\bextract_schema_data\b' -C2
# Spot YAML evals still using the old tool field
rg -nP --type=yaml -n -C2 '\btool:\s*"?extract_schema_data"?'
# Sanity-check that the new name is present in configs and schema enums
rg -nP -n -C2 '\btool:\s*"?extract_data"?'
rg -nP -n -C3 '"evaluation"\s*:.*\{[^}]*"tool"[^}]*\benum\b[^}]*\bextract_data\b'
Length of output: 9746
Replace remaining "extract_schema_data" references with "extract_data" and confirm server/tool registry + tests accept the new name.
Docs/examples still use the old tool name while eval YAMLs/templates use "extract_data" — update the docs/snippets and ensure any server enums/switches or tests accept "extract_data".
Affected files (examples found):
- eval-server/nodejs/docs/PROTOCOL.md
- eval-server/nodejs/docs/YAML_SCHEMA.md
- eval-server/nodejs/docs/CLIENT_SETUP.md
- eval-server/nodejs/docs/TRIGGERING_EVALUATIONS.md
🤖 Prompt for AI Agents
In eval-server/nodejs/evals/schema-extractor/google-search-001.yaml around lines
12 to 14, the YAML uses the new tool name "extract_data" but other docs and code
still reference "extract_schema_data"; search and replace all remaining
"extract_schema_data" occurrences with "extract_data" in the listed docs
(eval-server/nodejs/docs/PROTOCOL.md, YAML_SCHEMA.md, CLIENT_SETUP.md,
TRIGGERING_EVALUATIONS.md) and update any server-side registries/enums/switch
statements and unit/integration tests that validate tool names to accept
"extract_data" (or map the old name to the new one). After changes, run the test
suite and CI to ensure no references remain and the new name is accepted.
- "Coordinated property type selection through action_agent" | ||
- "Applied search filters through proper action_agent calls" | ||
- "Extracted property listings with schema_based_extractor" | ||
- "Extracted property listings with extract_data" |
🛠️ Refactor suggestion
🧩 Analysis chain
Ensure repo-wide extractor rename alignment.
Double-check that client schema/tool enums and templates now use extract_data.
Run:
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Find lingering old extractor names..."
rg -n -C1 -g '!**/node_modules/**' -e 'schema_based_extractor' -e 'extract_schema_data' || true
echo
echo "Validate schema/tool enums reference extract_data:"
fd default-client.yaml | xargs -I{} rg -n -C1 'extract_data' {}
fd client.schema.json | xargs -I{} rg -n -C2 'extract_data|extract_schema_data|schema_based_extractor' {}
Length of output: 465
Ensure repo-wide extractor rename alignment
default-client.yaml already uses "extract_data" (lines 29–31) but client.schema.json still enumerates "extract_schema_streamlined" alongside "extract_data" (lines 131–135); update/remove the legacy enum value and align any templates/libs to "extract_data", then run a repo-wide search for other old extractor names.
🤖 Prompt for AI Agents
In eval-server/nodejs/evals/web-task-agent/realestate-001.yaml around line 52
the extractor name is "extract_data" but client.schema.json (around lines
131–135) still includes the legacy enum value "extract_schema_streamlined";
remove that legacy enum entry from client.schema.json (leaving only
"extract_data"), update any templates or library references that still use
"extract_schema_streamlined" to "extract_data", and run a repo-wide
search/replace to ensure all extractor names are aligned and update tests or
docs if required.
- "Coordinated property type selection through action_agent" | ||
- "Applied search filters through proper action_agent calls" | ||
- "Extracted property listings with schema_based_extractor" | ||
- "Extracted property listings with extract_data" |
🛠️ Refactor suggestion
🧩 Analysis chain
Consistency check: fully migrate to extract_data across configs/schemas.
Ensure no lingering references to schema_based_extractor or extract_schema_data remain in evals, templates, or schemas.
Run to verify:
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Searching for old extractor names..."
rg -n -C1 -g '!**/node_modules/**' -e 'schema_based_extractor' -e 'extract_schema_data' || true
echo
echo "Confirming extract_data is used for evaluation.tool values..."
rg -n -C1 -g '!**/node_modules/**' -e 'tool:\s*"extract_data"' -e '"tool"\s*:\s*"extract_data"' || true
echo
echo "Check client schema enum still doesn't reference old names..."
fd client.schema.json | xargs -I{} rg -n -C2 -e 'extract_schema_data|schema_based_extractor|extract_data' {}
Length of output: 629
Consistency check — fully migrate to extract_data across configs/schemas.
No matches for schema_based_extractor or extract_schema_data, but client.schema.json's enum still contains "extract_schema_streamlined" alongside "extract_data" — remove/rename that legacy entry so only "extract_data" remains.
client.schema.json (enum, ~lines 131–135)
🤖 Prompt for AI Agents
In eval-server/nodejs/evals/web-task-agent/web-task-agent-realestate-001.yaml
around line 52 and in client.schema.json around lines 131–135, the enum still
contains the legacy value "extract_schema_streamlined" alongside "extract_data";
remove (or rename) that legacy enum entry so the enum only includes
"extract_data", and scan the repo/configs for any remaining references to
"extract_schema_streamlined" (rename them to "extract_data" or delete) to
complete the migration to extract_data.
wait_timeout: 5000

tool: "extract_schema_data"
tool: "extract_data"
🛠️ Refactor suggestion
🧩 Analysis chain
Potential schema key mismatch with new tool name
You switched to tool: "extract_data" but input still uses “schema” (not “extraction_schema” like other configs). If the new tool expects extraction_schema, this example will fail.
- Confirm the expected input key for extract_data.
- If needed, update this template to:
- input.extraction_schema: <schema…>
- Or ensure the tool supports both keys.
Also check validator type naming consistency (“llm-judge” vs “llm_judge” used elsewhere).
Run:
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Search extract_data handlers:"
rg -n -C3 -S 'extract_data'
echo
echo "Check llm[-_]judge variants:"
rg -n -C1 -S '\bllm[_-]judge\b' eval-server
Length of output: 101830
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== SchemaBasedExtractorTool top ==="
sed -n '1,260p' front_end/panels/ai_chat/tools/SchemaBasedExtractorTool.ts || true
echo
echo "=== Search for related arg/interface names and any 'extraction_schema' usage ==="
rg -n --hidden --no-ignore 'SchemaExtractionArgs|SchemaExtractionResult|extraction_schema|extractionSchema' || true
echo
echo "=== client.schema.json (start) ==="
sed -n '1,260p' eval-server/nodejs/schemas/client.schema.json || true
echo
echo "=== Search for 'schema' keys in eval-server templates/evals to confirm usage ==="
rg -n --hidden -S --glob 'eval-server/**' '\binput:\s*$' -n -C1 || true
rg -n --hidden -S --glob 'eval-server/**' '\bschema\b' -n -C1 || true
Length of output: 87952
No change for extract_data input; fix inconsistent llm-judge naming
- extract_data expects input.schema (SchemaBasedExtractorTool requires a "schema") — default-client.yaml's input.schema is correct (see front_end/panels/ai_chat/tools/SchemaBasedExtractorTool.ts and eval-server/nodejs/templates/default-client.yaml).
- Validation type is inconsistent: client.schema.json and EvalServer.js expect validation.type == "llm-judge" (hyphen) but many eval YAMLs use "llm_judge" (underscore) — e.g., files under eval-server/nodejs/evals/** (streamlined-schema-extractor/*.yaml). Normalize validation.type to "llm-judge" across YAMLs or update eval-server/nodejs/schemas/client.schema.json and eval-server/nodejs/src/lib/EvalServer.js to accept "llm_judge" as well.
🤖 Prompt for AI Agents
In eval-server/nodejs/templates/default-client.yaml around line 30, there's a
mismatch in validation.type naming: some YAMLs use "llm_judge" (underscore)
while client.schema.json and EvalServer.js expect "llm-judge" (hyphen). Fix by
normalizing to "llm-judge" across all eval YAMLs under
eval-server/nodejs/evals/** (replace "llm_judge" with "llm-judge");
alternatively, if you prefer to accept the underscore, update
eval-server/nodejs/schemas/client.schema.json and
eval-server/nodejs/src/lib/EvalServer.js to allow both "llm-judge" and
"llm_judge" (e.g., accept either value in validation.type) and then run tests to
ensure validation passes.
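If accepting both spellings is preferred over editing every YAML, a minimal server-side normalization step could look like the sketch below; the helper name and the shape of the validation object are illustrative assumptions, not the actual EvalServer.js internals.
// Minimal sketch: treat "llm_judge" as an alias for the canonical "llm-judge".
// ValidationConfig and normalizeValidationType are assumed names, not existing code.
type ValidationConfig = { type: string; [key: string]: unknown };

function normalizeValidationType(validation: ValidationConfig): ValidationConfig {
  const type = validation.type === 'llm_judge' ? 'llm-judge' : validation.type;
  return { ...validation, type };
}

// Usage: normalize each parsed eval definition before JSON-schema validation,
// so legacy YAMLs keep passing while new ones use the hyphenated form.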
    if (abortSignal?.aborted) {
      logger.info(`${agentName} execution aborted at iteration ${iteration + 1}/${maxIterations}`);
      const abortResult = createErrorResult('Execution was cancelled', messages, 'error');
      return { ...abortResult, agentSession: currentSession };
    }
Early abort check doesn't clean up properly
When execution is aborted, the function returns immediately without proper cleanup or status updates for the agent session.
// Check if execution has been aborted
if (abortSignal?.aborted) {
logger.info(`${agentName} execution aborted at iteration ${iteration + 1}/${maxIterations}`);
+ // Update session status before returning
+ currentSession.status = 'error';
+ currentSession.endTime = new Date();
+ currentSession.terminationReason = 'cancelled';
const abortResult = createErrorResult('Execution was cancelled', messages, 'error');
return { ...abortResult, agentSession: currentSession };
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
// Original:
if (abortSignal?.aborted) {
  logger.info(`${agentName} execution aborted at iteration ${iteration + 1}/${maxIterations}`);
  const abortResult = createErrorResult('Execution was cancelled', messages, 'error');
  return { ...abortResult, agentSession: currentSession };
}

// Suggested:
if (abortSignal?.aborted) {
  logger.info(`${agentName} execution aborted at iteration ${iteration + 1}/${maxIterations}`);
  // Update session status before returning
  currentSession.status = 'error';
  currentSession.endTime = new Date();
  currentSession.terminationReason = 'cancelled';
  const abortResult = createErrorResult('Execution was cancelled', messages, 'error');
  return { ...abortResult, agentSession: currentSession };
}
 */
export class SchemaBasedExtractorTool implements Tool<SchemaExtractionArgs, SchemaExtractionResult> {
- name = 'extract_schema_data';
+ name = 'extract_data';
🧩 Analysis chain
Rename to 'extract_data' is consistent with config changes.
Good alignment with YAML/tooling.
🏁 Script executed:
#!/bin/bash
# Ensure no references to the old name linger in code/JSON/YAML/tests
rg -nP '\bextract_schema_data\b' -C2
Length of output: 5869
Rename to 'extract_data' — update remaining docs referencing 'extract_schema_data'
Change in SchemaBasedExtractorTool.ts is correct, but search shows many occurrences of "extract_schema_data" in eval-server/nodejs/docs (YAML_SCHEMA.md, CLIENT_SETUP.md, PROTOCOL.md, TRIGGERING_EVALUATIONS.md). Update those docs or add a compatibility alias so clients aren’t broken.
🤖 Prompt for AI Agents
In front_end/panels/ai_chat/tools/SchemaBasedExtractorTool.ts around line 39,
the tool name was changed to 'extract_data' but many docs and clients still
reference 'extract_schema_data'; update all occurrences in
eval-server/nodejs/docs (YAML_SCHEMA.md, CLIENT_SETUP.md, PROTOCOL.md,
TRIGGERING_EVALUATIONS.md) to 'extract_data' and/or add a compatibility alias so
old clients continue working: either update documentation examples and protocol
specs to the new name, and add server-side handling that accepts
'extract_schema_data' and maps it to 'extract_data' (or vice versa) to avoid
breaking clients while rolling out the rename.
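A minimal sketch of such a compatibility alias is shown below; the registry shape and function names are assumptions for illustration, not the existing dispatch code.
// Sketch: map the legacy tool name onto the new one before dispatch.
const TOOL_NAME_ALIASES: Record<string, string> = {
  extract_schema_data: 'extract_data',  // legacy name still present in older docs/clients
};

function resolveToolName(requested: string): string {
  return TOOL_NAME_ALIASES[requested] ?? requested;
}

// e.g. const tool = toolRegistry.get(resolveToolName(request.tool));
// where toolRegistry is whatever lookup structure the server already uses (assumed here).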
      // Add delay before retry to prevent overloading the LLM (abortable)
      if (attempt > 1) {
-       await new Promise(resolve => setTimeout(resolve, this.RETRY_DELAY_MS));
+       await this.sleep(this.RETRY_DELAY_MS, ctx?.abortSignal);
      }
🛠️ Refactor suggestion
🧩 Analysis chain
LGTM: retry delay is now abortable; consider propagating abort to LLM calls too
Nice improvement. Also pass ctx.abortSignal into callLLMWithTracing (if supported) so in‑flight calls cancel promptly.
// If callLLMWithTracing supports it, add:
// options: { retryConfig: {...}, signal: ctx?.abortSignal }
Run to verify types and support:
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check LLMContext includes abortSignal
rg -n -C3 -P '(interface|type)\s+LLMContext\b'
rg -n -C2 '\babortSignal\b' | head -n 50
# Inspect callLLMWithTracing signature/options for 'signal' support
rg -n -C5 -P 'export\s+(async\s+)?function\s+callLLMWithTracing\b|callLLMWithTracing\('
Length of output: 20076
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "---- front_end/panels/ai_chat/tools/Tools.ts (around LLMContext) ----"
sed -n '1,240p' front_end/panels/ai_chat/tools/Tools.ts || true
echo
echo "---- front_end/panels/ai_chat/tools/LLMTracingWrapper.ts (full) ----"
sed -n '1,400p' front_end/panels/ai_chat/tools/LLMTracingWrapper.ts || true
echo
echo "---- front_end/panels/ai_chat/tools/StreamlinedSchemaExtractorTool.ts (around call sites) ----"
sed -n '1,520p' front_end/panels/ai_chat/tools/StreamlinedSchemaExtractorTool.ts || true
Length of output: 29124
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Searching for LLMCallOptions and LLMClient..."
rg -n --hidden -S "LLMCallOptions" || true
rg -n --hidden -S "interface LLMClient\b|class LLMClient\b|export class LLMClient" || true
echo
echo "---- front_end/panels/ai_chat/LLM/LLMTypes.ts (if exists) ----"
sed -n '1,240p' front_end/panels/ai_chat/LLM/LLMTypes.ts || true
echo
echo "---- front_end/panels/ai_chat/LLM/LLMClient.ts (if exists) ----"
sed -n '1,400p' front_end/panels/ai_chat/LLM/LLMClient.ts || true
Length of output: 19739
🏁 Script executed:
#!/bin/bash
set -euo pipefail
files=(
"front_end/panels/ai_chat/LLM/LLMProvider.ts"
"front_end/panels/ai_chat/LLM/OpenAIProvider.ts"
"front_end/panels/ai_chat/LLM/OpenRouterProvider.ts"
"front_end/panels/ai_chat/LLM/LiteLLMProvider.ts"
"front_end/panels/ai_chat/LLM/GroqProvider.ts"
)
for f in "${files[@]}"; do
echo "---- $f ----"
sed -n '1,420p' "$f" || true
echo
done
echo "---- Search for callWithMessages definitions/usages ----"
rg -n --hidden "callWithMessages\(" || true
Length of output: 56564
Propagate AbortSignal through LLM call stack
sleep(...) is already abortable via ctx.abortSignal, but callLLMWithTracing won't cancel in‑flight LLM requests: LLMCallOptions (LLMTypes.ts) has no signal and providers don't pass an AbortSignal to fetch. Action: add signal?: AbortSignal to LLMCallOptions, accept/forward options.signal in callLLMWithTracing -> LLMClient.call -> provider.callWithMessages, and pass { signal: options.signal } to fetch in provider implementations (OpenAIProvider.ts, LiteLLMProvider.ts, OpenRouterProvider.ts, GroqProvider.ts). Then update call sites (e.g. front_end/panels/ai_chat/tools/StreamlinedSchemaExtractorTool.ts) to include options: { retryConfig: {...}, signal: ctx?.abortSignal } so LLM calls abort promptly.
🤖 Prompt for AI Agents
In front_end/panels/ai_chat/tools/StreamlinedSchemaExtractorTool.ts around lines
183-186, the code awaits an abortable sleep but does not propagate
ctx.abortSignal into the LLM call stack so in‑flight LLM requests cannot be
cancelled; add support for AbortSignal by adding signal?: AbortSignal to
LLMCallOptions (LLMTypes.ts), update callLLMWithTracing to accept and forward
options.signal into LLMClient.call, modify LLMClient.call to forward
options.signal to provider.callWithMessages, update each provider implementation
(OpenAIProvider.ts, LiteLLMProvider.ts, OpenRouterProvider.ts, GroqProvider.ts)
to accept an optional signal and pass { signal: options.signal } into fetch/use
of HTTP clients, and finally update call sites (including this file) to include
options: { retryConfig: {...}, signal: ctx?.abortSignal } so LLM requests are
aborted promptly when the context is cancelled.
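A minimal sketch of that threading, assuming the local LLM layer can grow an optional signal field; only fetch's standard signal option is real, the other names below are illustrative placeholders rather than the actual LLMTypes.ts / provider signatures.
// Sketch: add an optional AbortSignal to the call options and forward it to fetch.
interface LLMCallOptionsSketch {
  retryConfig?: { maxRetries: number };
  signal?: AbortSignal;  // new field; lets callers cancel in-flight requests
}

async function callWithMessagesSketch(
  endpoint: string,
  body: unknown,
  options: LLMCallOptionsSketch = {},
): Promise<Response> {
  // fetch rejects with an AbortError once the signal fires, so retries can stop early.
  return fetch(endpoint, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
    signal: options.signal,
  });
}

// Call-site shape: callLLMWithTracing(..., { retryConfig: { maxRetries: 3 }, signal: ctx?.abortSignal })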
  #onMCPConnectorsClick(): void {
    MCPConnectorsCatalogDialog.show({
      onClose: async () => {
        // Refresh MCP registry when catalog is closed in case new connectors were added
        try {
          await MCPRegistry.init(true);
          logger.info('MCP registry refreshed after catalog closed');
        } catch (error) {
          logger.error('Failed to refresh MCP registry after catalog closed', error);
        }
      }
    });
  }
🧩 Analysis chain
Confirm MCP reinit semantics on catalog close.
You call MCPRegistry.init(true) but not refresh(); other places do both. Ensure init(true) covers refresh or call refresh() here too.
🏁 Script executed:
#!/bin/bash
# Inspect MCPRegistry.init to verify whether it performs refresh and what it returns.
rg -n -C3 --type=ts 'class\s+MCPRegistry|static\s+init\s*\(|init\s*\('
Length of output: 23632
🏁 Script executed:
#!/bin/bash
set -euo pipefail
file="front_end/panels/ai_chat/mcp/MCPRegistry.ts"
echo "== File and lines around init =="
nl -ba "$file" | sed -n '420,520p'
echo
echo "== Search for refresh definition/calls in file =="
rg -n --hidden --no-ignore -C2 '\brefresh\s*\(' "$file" || true
echo
echo "== Show surrounding lines for any refresh definition (if found) =="
rg -n --hidden --no-ignore '\basync\s+refresh\s*\(|\brefresh\s*:\s*async\b' "$file" -n -C3 || true
Length of output: 73
🏁 Script executed:
#!/bin/bash
set -euo pipefail
file="front_end/panels/ai_chat/mcp/MCPRegistry.ts"
echo "== init region (420-500) =="
awk 'NR>=420 && NR<=500{printf "%6d: %s\n",NR,$0}' "$file" || true
echo
echo "== lines with 'refresh(' (with context) =="
rg -n -C3 '\brefresh\s*\(' "$file" || true
echo
echo "== possible 'async refresh' definitions =="
rg -n '\basync\s+refresh\s*\(' "$file" || true
echo
echo "== fallback: full file with numbers (first 900 lines) =="
awk '{printf "%6d: %s\n",NR,$0}' "$file" > /tmp/mcprestxt || true
sed -n '1,900p' /tmp/mcprestxt || true
Length of output: 31635
Call MCPRegistry.refresh() after init(true) in the catalog onClose handler.
init(...) only connects and runs refreshToolsForServer (it lists tools) — refresh() is required to register tools with ToolRegistry. Update front_end/panels/ai_chat/ui/AIChatPanel.ts (onClose handler ~lines 2126–2132) to call await MCPRegistry.refresh() after await MCPRegistry.init(true).
🤖 Prompt for AI Agents
In front_end/panels/ai_chat/ui/AIChatPanel.ts around lines 2124 to 2136, the
onClose handler currently calls await MCPRegistry.init(true) but does not call
MCPRegistry.refresh(), so tools are not registered with the ToolRegistry; update
the try block to await MCPRegistry.init(true) followed immediately by await
MCPRegistry.refresh(), and then log the success (e.g., "MCP registry refreshed
after catalog closed") in the same try; keep the existing catch to log errors.
The MIT License (MIT)

- Copyright (c) 2024 Anthropic, PBC
+ Copyright (c) 2015-2017 Evgeny Poberezkin
🧩 Analysis chain
License mismatch: content appears to be Ajv’s MIT license, not MCP SDK’s.
This is a compliance risk. Restore the correct MCP SDK license here and place Ajv’s license under the Ajv vendored path (or avoid vendoring Ajv docs). Mixing third‑party licenses under a misleading path can cause attribution errors.
🏁 Script executed:
#!/bin/bash
# Show indicators that this directory contains Ajv content rather than MCP SDK
head -n 3 front_end/third_party/mcp-sdk/README.md || true
rg -n 'Ajv' front_end/third_party/mcp-sdk -C2 || true
rg -n 'Anthropic|Model Context Protocol|MCP' front_end/third_party/mcp-sdk -C2 || true
Length of output: 284608
License mismatch — restore MCP SDK license and relocate Ajv's MIT
front_end/third_party/mcp-sdk contains Ajv sources (README.md, lib/, dist/, ajv/) but front_end/third_party/mcp-sdk/LICENSE is Ajv’s MIT (Evgeny Poberezkin). Replace that file with the correct MCP SDK license, and move Ajv’s MIT into the Ajv vendored path (e.g. front_end/third_party/mcp-sdk/ajv/LICENSE or front_end/third_party/ajv/LICENSE) — or stop vendoring Ajv docs.
    const { code, state } = this.parseAuthCallback(url);
    const expected = sessionStorage.getItem(this.STATE_KEY);
    if (expected && state && expected !== state) {
      throw new Error('oauth_invalid_state');
    }
    // Cleanup state now that we used it
    sessionStorage.removeItem(this.STATE_KEY);
    // Return user to original page
Enforce OAuth state validation when expected is set.
If a stored state exists, require an exact match; missing state should be rejected. Prevents CSRF on the redirect.
- const expected = sessionStorage.getItem(this.STATE_KEY);
- if (expected && state && expected !== state) {
- throw new Error('oauth_invalid_state');
- }
- // Cleanup state now that we used it
- sessionStorage.removeItem(this.STATE_KEY);
+ const expected = sessionStorage.getItem(this.STATE_KEY);
+ if (expected) {
+ if (!state || expected !== state) {
+ throw new Error('oauth_invalid_state');
+ }
+ // Cleanup state only after successful validation
+ sessionStorage.removeItem(this.STATE_KEY);
+ }
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
// Original:
const { code, state } = this.parseAuthCallback(url);
const expected = sessionStorage.getItem(this.STATE_KEY);
if (expected && state && expected !== state) {
  throw new Error('oauth_invalid_state');
}
// Cleanup state now that we used it
sessionStorage.removeItem(this.STATE_KEY);
// Return user to original page

// Suggested:
const { code, state } = this.parseAuthCallback(url);
const expected = sessionStorage.getItem(this.STATE_KEY);
if (expected) {
  if (!state || expected !== state) {
    throw new Error('oauth_invalid_state');
  }
  // Cleanup state only after successful validation
  sessionStorage.removeItem(this.STATE_KEY);
}
// Return user to original page
🤖 Prompt for AI Agents
In front_end/third_party/mcp-sdk/MCPOAuthProvider.ts around lines 136 to 143,
the OAuth state validation only rejects when both expected and state exist but
differ; update the logic so that if a stored expected state exists it must be
present in the callback and must match exactly. Replace the current conditional
with a check that if (expected) then if (!state || expected !== state) throw new
Error('oauth_invalid_state');, then proceed to remove the STATE_KEY and continue
returning the user.
Summary by CodeRabbit
New Features
Improvements
Documentation