
Assessment: Assessment Pipeline#122

Open
vprashrex wants to merge 8 commits into main from
feature/assesment

Conversation


@vprashrex vprashrex commented Apr 21, 2026

Adds the Assessment module, a feature-gated workflow for running LLM evaluations across datasets and multiple model configurations side by side.

Features
Multi-step evaluation setup

  1. Dataset tab — upload or select an existing dataset (CSV/Excel); preview sample rows
  2. Column Mapper — assign each dataset column a role: Text, Attachment (image/PDF, supplied as a URL or base64), Ground Truth, or Skip
  3. Prompt & Config — write a prompt template, define a structured output schema (field name + type), and select one or more model configs to run against
  4. Review — summarise all inputs before submitting

Results tab

  1. Lists all assessment runs grouped by experiment, with per-run status (pending / processing / completed / failed)
  2. Drill into a run to see per-item results in a table; supports retry for failed runs
  3. Export results as JSON, CSV, or XLSX (multi-run exports are zipped)

Real-time updates

  1. SSE stream (/assessment/events) with a circuit-breaker and auto-reconnect so the results tab stays live without polling

Feature flag gating

The /assessment route is gated behind the assessment feature flag, read from a server-set cookie; users without the flag are redirected to home.
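The core of that gate, stripped of Next.js middleware specifics, is a cookie parse plus a membership check. The cookie name `kaapi_features` comes from the PR description; the URL-encoded JSON-array value format is an assumption for illustration:

```typescript
// Parse a raw Cookie header and check whether a feature flag is present in
// the kaapi_features cookie. The JSON-array value format is an assumed
// encoding; the real middleware may store flags differently.
function hasFeature(cookieHeader: string, flag: string): boolean {
  // Cookie header looks like: "session=abc; kaapi_features=%5B%22assessment%22%5D"
  const pair = cookieHeader
    .split(/;\s*/)
    .find((p) => p.startsWith("kaapi_features="));
  if (!pair) return false;
  try {
    const value = decodeURIComponent(pair.slice("kaapi_features=".length));
    const features: unknown = JSON.parse(value);
    return Array.isArray(features) && features.includes(flag);
  } catch {
    return false; // malformed cookie: treat as no access
  }
}
```

Middleware would redirect to home whenever `hasFeature(cookieHeader, "assessment")` is false for a request to /assessment.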

File Changes

  • middleware.ts — feature gate added; /assessment redirects to home unless the assessment flag is in the user's cookie
  • app/lib/authCookie.ts — kaapi_features cookie helpers added (set, clear)
  • app/lib/context/AuthContext.tsx — features + hasFeature() exposed on the auth context; listens to FEATURES_UPDATED_EVENT
  • app/api/auth/logout/route.ts — clears the features cookie on logout
  • app/api/users/me/route.ts — sets the features cookie on login from user.features
  • app/(main)/datasets/page.tsx — import path fix: types/dataset → types/datasets
  • app/(main)/keystore/page.tsx — removed duplicate STORAGE_KEY constant (moved to app/lib/constants/keystore.ts)

Summary by CodeRabbit

  • New Features

    • Full Assessment workflow: upload/preview datasets, map columns, build prompts/configs, define output schema, run evaluations, and review/export results.
    • Real-time event streaming and polling for live assessment status; per-run preview/export and retry actions.
    • Config management and multi-config comparison with create/version workflows.
    • Dataset preview modal, JSON editor with highlighting, and new assessment UI/tabs; sidebar now includes Assessment entry gated by feature flags.
  • Refactor

    • Centralized feature-flag and client-side storage/keystore plumbing; auth emits feature updates and route gating updated.

…luation

- Added AssessmentPage component with nested steps for dataset selection, column mapping, prompt editing, configuration, output schema, and review.
- Integrated Zustand for state management of assessment dataset.
- Created utility functions for schema conversion and event handling.
- Enhanced Sidebar component to include a link to the new Assessment page.
- Added necessary types for assessment evaluation and schema properties.
- Implemented polling for evaluation status and indicators for processing, success, and failure.
- Included loading states and error handling for API interactions.
- Updated package dependencies to include zustand and xlsx.

coderabbitai Bot commented Apr 21, 2026

📝 Walkthrough

Walkthrough

Adds a new Assessment feature with UI, state, types, and multiple Next.js proxy routes for assessments/datasets/evaluations/events; introduces feature-flag infrastructure and cookies; dataset/config management utilities and components; centralizes keystore constant; adds icons, colors, and new runtime dependencies.

Changes

Cohort / File(s) Summary
Assessment UI & Page
app/assessment/page.tsx, app/assessment/AssessmentPageClient.tsx
New assessment page and main client component orchestrating dataset/config/result tabs, API-key handling, polling, submission flow, and feature-gate handling.
Assessment Steps & Components
app/assessment/components/...
DatasetStep.tsx, ColumnMapperStep.tsx, PromptEditorStep.tsx, PromptAndConfigStep.tsx, MultiConfigStep.tsx, OutputSchemaStep.tsx, ReviewStep.tsx, Stepper.tsx, JsonEditor.tsx, DataViewModal.tsx, EvaluationsTab.tsx
New stepper-driven multi-step UI and supporting components for dataset upload/parsing, column mapping, prompt editing, config selection/creation, schema editor (visual + code), review/submit, evaluations listing, preview/export, and data view modal.
Assessment State, Types & Utilities
app/assessment/types.ts, app/assessment/store.ts, app/assessment/schemaUtils.ts, app/assessment/useAssessmentEvents.ts, app/assessment/errorUtils.ts
New TypeScript types for assessment flows, Zustand store for dataset/column state, schema → JSON-schema util, SSE client hook with circuit-breaker and reconnection, and forbidden-error helpers.
Config Management API & Constants
app/assessment/config/api.ts, app/assessment/config/constants.ts, app/lib/configTypes.ts
Client-side config API (paged fetch, versions, save, cache invalidation), model config constants/defaults, and centralized config type definitions.
Assessment API Proxy Routes
app/api/assessment/datasets/route.ts, app/api/assessment/datasets/[dataset_id]/route.ts, app/api/assessment/assessments/route.ts, app/api/assessment/assessments/[assessment_id]/results/route.ts, app/api/assessment/assessments/[assessment_id]/retry/route.ts, app/api/assessment/evaluations/route.ts, app/api/assessment/evaluations/[evaluation_id]/results/route.ts, app/api/assessment/evaluations/[evaluation_id]/retry/route.ts, app/api/assessment/events/route.ts
New proxy endpoints forwarding requests to backend including X-API-KEY and cookie forwarding; dataset routes support fetch_content with signed-URL fetch+base64; result routes handle binary downloads; events route proxies SSE with streaming headers.
Feature Flags, Auth & Middleware
app/lib/constants/featureFlags.ts, app/lib/constants/keystore.ts, app/lib/authCookie.ts, app/lib/featureState.ts, app/lib/context/AuthContext.tsx, app/lib/constants.ts, app/lib/types/auth.ts, app/lib/types/nav.ts, app/lib/navConfig.ts, middleware.ts
Adds feature-flag constants/types, features cookie helpers, auth context feature propagation and hasFeature(), client feature removal utility, nav item feature gating, and middleware route gating by features.
Keystore & Dataset type changes
app/(main)/datasets/page.tsx, app/(main)/keystore/page.tsx, app/lib/constants/keystore.ts
Dataset type import switched to new app/lib/types/datasets; removed local exported STORAGE_KEY and introduced centralized STORAGE_KEY.
Icons & Styling
app/components/icons/sidebar/AssessmentIcon.tsx, app/components/icons/index.tsx, app/lib/colors.ts, app/components/Sidebar.tsx
New Assessment SVG icon and re-export; Sidebar updated to conditionally include feature-gated nav items and read feature flags; added colors.text.white.
Speech-to-Text UI
app/components/speech-to-text/ModelComparisonCard.tsx
New ModelComparisonCard component showing WER metrics, transcript preview, and expand/collapse details.
Config Routes & Auth updates
app/api/configs/[config_id]/versions/route.ts, app/api/configs/route.ts, app/api/auth/logout/route.ts, app/api/users/me/route.ts
Preserve query params in config versions endpoint; logout now clears features cookie; users/me sets features cookie from backend response.
Dependencies
package.json
Added dependencies: cockatiel, xlsx, zustand.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Client as AssessmentPageClient
    participant UI as Assessment UI (Steps)
    participant Proxy as Next.js API (app/api/assessment/*)
    participant Backend as Backend API

    User->>Client: Open /assessment
    Client->>UI: render steps, load API key
    UI->>Proxy: POST/GET /api/assessment/datasets (upload/list)
    Proxy->>Backend: forward request with X-API-KEY and cookies
    Backend-->>Proxy: dataset metadata / signed URL / JSON
    Proxy-->>UI: dataset metadata or base64 file content
    UI->>Client: set columns/sampleRow
    User->>UI: submit assessment
    UI->>Proxy: POST /api/assessment/evaluations (create)
    Proxy->>Backend: forward evaluation create
    Backend-->>Proxy: evaluation_id
    Proxy-->>Client: evaluation_id
    Client->>Proxy: GET /api/assessment/events (SSE) with X-API-KEY
    Proxy->>Backend: SSE subscription
    Backend-->>Proxy: SSE events (assessment.results_ready)
    Proxy-->>Client: stream events
    Client->>Proxy: GET /api/assessment/evaluations/:id/results
    Proxy->>Backend: forward, stream/download if binary
    Backend-->>Proxy: results (JSON/CSV/XLSX)
    Proxy-->>Client: results

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~105 minutes

Possibly related PRs

Suggested reviewers

  • nishika26
  • Ayush8923
  • Prajna1999

Poem

🐰 I hopped in with datasets, configs, and streams,

Carrots of schemas and bright little dreams.
I mapped every column and stitched every test,
SSE hummed softly — the results did their best.
Hooray for assessments — may merging be blessed!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 7.55%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit’s high-level summary is enabled.
  • Title Check: ✅ Passed. The title 'Assessment: Assessment Pipeline' is directly related to the main change: the PR adds a comprehensive Assessment module with a complete evaluation pipeline covering dataset upload, column mapping, prompt/config setup, and results visualization.
  • Linked Issues Check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes Check: ✅ Passed. Check skipped because no linked issues were found for this pull request.



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

Note

Due to the large number of review comments, Critical severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
app/lib/context/AuthContext.tsx (1)

126-139: ⚠️ Potential issue | 🟠 Major

Notify feature-flag listeners during logout too.

Line 138 clears apiKeys directly, bypassing persist, so the new kaapi-auth-changed event from Line 98 is not emitted on logout. That can leave same-tab feature-gated UI using stale flags after auth is cleared.

🐛 Proposed fix
     setSession(null);
     setCurrentUser(null);
     clearAllStorage();
-    setApiKeys([]);
+    persist([]);
   }, [persist]);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/lib/context/AuthContext.tsx` around lines 126 - 139, The logout function
currently clears apiKeys by calling setApiKeys([]) which bypasses the persist
hook and prevents the "kaapi-auth-changed" feature-flag event from firing;
change the logout flow in the logout callback to clear api keys via the existing
persist mechanism (use the same persist call used elsewhere to update/clear
"apiKeys" so it emits the kaapi-auth-changed event) instead of calling
setApiKeys directly, keeping the rest of the cleanup (setSession(null),
setCurrentUser(null), clearAllStorage()) intact and ensuring persist remains in
the dependency array.
🟠 Major comments (26)
app/components/speech-to-text/ModelComparisonCard.tsx-149-172 (1)

149-172: ⚠️ Potential issue | 🟠 Major

Add type="button" to non-submitting buttons and ensure the expand toggle is accessible.

Both buttons lack explicit type="button" attributes, causing them to default to type="submit" per HTML spec, which can accidentally trigger form submission if the component is used within a form. Additionally, the icon-only expand button needs an accessible name (aria-label) and expanded state indicator (aria-expanded), and both SVG icons should use aria-hidden="true".

Suggested changes
             {hasExpandedContent && (
               <button
+                type="button"
                 onClick={handleExpandToggle}
+                aria-expanded={isExpanded}
+                aria-label={
+                  isExpanded ? "Collapse model details" : "Expand model details"
+                }
                 className="p-1 rounded hover:bg-gray-100"
                 style={{ color: colors.text.secondary }}
               >
                 <svg
                   className="w-4 h-4"
+                  aria-hidden="true"
                   fill="none"
                   viewBox="0 0 24 24"
                   stroke="currentColor"
           {onClick && (
             <button
+              type="button"
               onClick={onClick}
               className="w-full py-1.5 rounded text-xs font-medium flex items-center justify-center gap-1"
               <svg
                 className="w-3 h-3"
+                aria-hidden="true"
                 fill="none"
                 viewBox="0 0 24 24"
                 stroke="currentColor"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/speech-to-text/ModelComparisonCard.tsx` around lines 149 -
172, The expand toggle currently defined inside the hasExpandedContent
conditional should be made non-submitting and accessible: add type="button" to
the button element that calls handleExpandToggle, add an accessible name (e.g.,
aria-label="Toggle details" or similar) and an aria-expanded attribute bound to
isExpanded, and mark the SVG icon as presentational with aria-hidden="true";
update any other nearby buttons lacking type attributes the same way to prevent
accidental form submission.
app/api/assessment/evaluations/route.ts-8-88 (1)

8-88: ⚠️ Potential issue | 🟠 Major

Use apiClient so auth relay and response status stay consistent.

These handlers manually enforce X-API-KEY, bypassing the shared cookie/header forwarding path, and they return 200 for every successful backend response. For evaluation submission, preserving backend statuses like 201 or 202 is important for clients tracking async work.

♻️ Proposed direction
 import { NextRequest, NextResponse } from 'next/server';
+import { apiClient } from "@/app/lib/apiClient";
 
 export async function GET(request: NextRequest) {
   try {
-    const apiKey = request.headers.get('X-API-KEY');
-
-    if (!apiKey) {
-      return NextResponse.json(
-        { error: 'Missing X-API-KEY header' },
-        { status: 401 }
-      );
-    }
-
-    const backendUrl = process.env.BACKEND_URL || 'http://localhost:8000';
     const searchParams = request.nextUrl.searchParams.toString();
     const queryString = searchParams ? `?${searchParams}` : '';
 
-    const response = await fetch(`${backendUrl}/api/v1/assessment/evaluations${queryString}`, {
-      method: 'GET',
-      headers: {
-        'X-API-KEY': apiKey,
-      },
+    const { status, data } = await apiClient(
+      request,
+      `/api/v1/assessment/evaluations${queryString}`,
+      {
+        method: "GET",
+      },
+    );
-    });
-
-    const data = await response.json();
-
-    if (!response.ok) {
-      return NextResponse.json(data, { status: response.status });
-    }
-
-    return NextResponse.json(data, { status: 200 });
+    return NextResponse.json(data, { status });

Apply the same pattern in POST, passing the JSON body through apiClient and returning its status.

As per coding guidelines, app/api/**/*.{ts,tsx}: Use apiClient(request, endpoint, options) in server-side Next.js route handlers to forward requests to the backend, which automatically relays X-API-KEY and Cookie headers and returns { status, data, headers }.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/api/assessment/evaluations/route.ts` around lines 8 - 88, The GET and
POST handlers in route.ts manually enforce X-API-KEY and call fetch directly,
bypassing the shared header/cookie relay and always returning 200; replace those
fetch flows with apiClient(request, endpoint, options) so auth/cookie forwarding
is preserved and actual backend statuses are returned: for GET call
apiClient(request,
`/api/v1/assessment/evaluations${request.nextUrl.searchParams.toString() ?
`?${request.nextUrl.searchParams.toString()}` : ''}`, { method: 'GET' }) and
return NextResponse.json(response.data, { status: response.status, headers:
response.headers }), and for POST call apiClient(request,
'/api/v1/assessment/evaluations', { method: 'POST', body: await request.json()
}) and return NextResponse.json(response.data, { status: response.status,
headers: response.headers }); remove the manual X-API-KEY checks and
console.error proxy messages in GET and POST so the handlers rely on apiClient
behavior.
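The recurring guidance in these comments is that proxy routes should relay X-API-KEY and Cookie through the shared apiClient rather than hand-rolling header handling. The relay step itself reduces to a small header-selection function; this helper is a hypothetical sketch of that step, not the repo's actual apiClient:

```typescript
// Hypothetical sketch of the auth-relay step a shared proxy client performs:
// copy only the auth-relevant headers from the incoming request onto the
// outgoing backend request, dropping everything else.
function relayAuthHeaders(
  incoming: Record<string, string>,
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [key, value] of Object.entries(incoming)) {
    const k = key.toLowerCase();
    if (k === "x-api-key" || k === "cookie") out[key] = value;
  }
  return out;
}
```

Centralizing this means a route that forgets the Cookie header (as several routes above do) cannot silently break cookie-authenticated sessions.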
app/api/features/route.ts-6-32 (1)

6-32: 🛠️ Refactor suggestion | 🟠 Major

Use the shared server proxy client here.

This route manually reconstructs auth forwarding instead of using the project’s apiClient, so it can drift from the shared X-API-KEY/Cookie relay and error-handling behavior used by other server routes.

♻️ Proposed direction
-import { cookies } from "next/headers";
 import { NextRequest, NextResponse } from "next/server";
+import { apiClient } from "@/app/lib/apiClient";
 
 export const dynamic = "force-dynamic";
 
 export async function GET(request: NextRequest) {
   try {
-    const cookieStore = await cookies();
-    const apiKey =
-      request.headers.get("X-API-KEY") ??
-      cookieStore.get("kaapi_api_key")?.value;
-    if (!apiKey) {
-      return NextResponse.json(
-        { error: "Missing X-API-KEY header" },
-        { status: 401 },
-      );
-    }
-
-    const backendUrl = process.env.BACKEND_URL || "http://localhost:8000";
-    const response = await fetch(`${backendUrl}/api/v1/features`, {
+    const { status, data } = await apiClient(request, "/api/v1/features", {
       method: "GET",
-      headers: { "X-API-KEY": decodeURIComponent(apiKey) },
       cache: "no-store",
     });
 
-    const data = await response.json();
     return NextResponse.json(data, {
-      status: response.status,
+      status,
       headers: {
         "Cache-Control": "no-store, no-cache, must-revalidate",
       },

As per coding guidelines, app/api/**/*.{ts,tsx}: Use apiClient(request, endpoint, options) in server-side Next.js route handlers to forward requests to the backend, which automatically relays X-API-KEY and Cookie headers and returns { status, data, headers }.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/api/features/route.ts` around lines 6 - 32, The GET handler currently
does its own cookie/header extraction and fetch; replace that logic to call the
shared apiClient(request, "/api/v1/features", { method: "GET", cache: "no-store"
}) so auth (X-API-KEY and Cookie) and error handling are forwarded consistently;
use the returned { status, data, headers } from apiClient to construct the
NextResponse (preserving Cache-Control from response.headers or setting
"no-store, no-cache, must-revalidate" as before) and remove manual cookies() and
decodeURIComponent(apiKey) handling in the GET function.
app/api/assessment/datasets/route.ts-8-87 (1)

8-87: ⚠️ Potential issue | 🟠 Major

Route through apiClient and preserve backend statuses.

Both handlers manually require X-API-KEY, so cookie-authenticated requests are rejected before the shared proxy layer can relay cookies. They also collapse every successful backend response to 200, which can hide 201 Created/202 Accepted from dataset uploads.

♻️ Proposed direction
 import { NextRequest, NextResponse } from 'next/server';
+import { apiClient } from "@/app/lib/apiClient";
 
 export async function GET(request: NextRequest) {
   try {
-    const apiKey = request.headers.get('X-API-KEY');
-
-    if (!apiKey) {
-      return NextResponse.json(
-        { error: 'Missing X-API-KEY header' },
-        { status: 401 }
-      );
-    }
-
-    const backendUrl = process.env.BACKEND_URL || 'http://localhost:8000';
-
-    const response = await fetch(`${backendUrl}/api/v1/assessment/datasets`, {
-      method: 'GET',
-      headers: {
-        'X-API-KEY': apiKey,
-      },
+    const { status, data } = await apiClient(request, "/api/v1/assessment/datasets", {
+      method: "GET",
     });
-
-    const data = await response.json();
-
-    if (!response.ok) {
-      return NextResponse.json(data, { status: response.status });
-    }
-
-    return NextResponse.json(data, { status: 200 });
+    return NextResponse.json(data, { status });

Apply the same pattern in POST, passing the parsed formData as the request body and returning the status from apiClient.

As per coding guidelines, app/api/**/*.{ts,tsx}: Use apiClient(request, endpoint, options) in server-side Next.js route handlers to forward requests to the backend, which automatically relays X-API-KEY and Cookie headers and returns { status, data, headers }.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/api/assessment/datasets/route.ts` around lines 8 - 87, The GET and POST
handlers in route.ts should use the shared apiClient(request, endpoint, options)
so cookies and X-API-KEY are relayed and backend status/headers are preserved;
remove manual request.headers.get('X-API-KEY') checks, call apiClient(request,
'/api/v1/assessment/datasets', { method: 'GET' }) for GET and apiClient(request,
'/api/v1/assessment/datasets', { method: 'POST', body: formData }) for POST (use
await request.formData() before calling), then return NextResponse.json(data, {
status, headers }) using the status and headers returned by apiClient instead of
always mapping successes to 200.
app/assessment/components/DataViewModal.tsx-24-51 (1)

24-51: ⚠️ Potential issue | 🟠 Major

Add dialog semantics and keyboard-close support.

This modal has no role="dialog", aria-modal, labelled title, Escape handler, or accessible name on the icon-only close button. That makes the data preview hard to operate with assistive tech/keyboard-only flows.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/DataViewModal.tsx` around lines 24 - 51, The modal
markup in DataViewModal.tsx is missing dialog semantics and keyboard support;
update the outer modal container element (the fixed inset-0 wrapper or the inner
dialog container used for rendering) to include role="dialog" and
aria-modal="true", give the title element (the h3 that renders {title}) a stable
id (e.g., dataViewModalTitle) and add aria-labelledby referencing that id on the
dialog container, add an accessible label to the icon-only close button (e.g.,
aria-label="Close preview") instead of relying on color-only styling, and wire
an Escape key handler (onKeyDown on the container or a useEffect keyboard
listener) that calls the existing onClose function so keyboard users can close
with Esc. Ensure the stopPropagation click handler remains on the inner
container so clicks inside do not close the dialog.
app/api/assessment/assessments/[assessment_id]/results/route.ts-13-29 (1)

13-29: ⚠️ Potential issue | 🟠 Major

Use the shared proxy path for this backend call.

This route bypasses apiClient and forwards only X-API-KEY, so backend calls won’t receive cookies that the shared route-handler proxy normally relays. If binary passthrough is the blocker, centralize that support in apiClient instead of duplicating proxy logic.

As per coding guidelines, app/api/**/*.{ts,tsx} routes should use apiClient(request, endpoint, options) to forward backend requests.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `app/api/assessment/assessments/[assessment_id]/results/route.ts` around lines
13 - 29, Replace the direct fetch in route.ts with the central apiClient call:
instead of building backendUrl and calling fetch with only 'X-API-KEY', call
apiClient(request,
`/api/v1/assessment/assessments/${assessment_id}/results${queryString}`, {
method: 'GET', headers: { 'X-API-KEY': apiKey } }) so the shared proxy will
forward cookies and handle binary passthrough; if apiClient doesn't yet support
binary passthrough, add that support there and remove duplicate proxy logic from
this route, keeping references to request, assessment_id, apiClient, and
queryString to locate where to change.
app/api/assessment/evaluations/[evaluation_id]/results/route.ts-13-29 (1)

13-29: ⚠️ Potential issue | 🟠 Major

Keep backend forwarding behind apiClient.

This direct fetch path only forwards X-API-KEY and drops cookies that the shared proxy helper is supposed to relay. If downloads need raw-response handling, consider extending apiClient for passthrough responses rather than bypassing it per route.

As per coding guidelines, app/api/**/*.{ts,tsx} routes should use apiClient(request, endpoint, options) to forward backend requests.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `app/api/assessment/evaluations/[evaluation_id]/results/route.ts` around lines
13 - 29, This route is directly calling fetch and only forwards the X-API-KEY
header, dropping cookies and bypassing the shared proxy; replace the direct
fetch with the shared apiClient(request, endpoint, options) call so cookies and
other proxy logic are preserved (use apiClient(request,
`/api/v1/assessment/evaluations/${evaluation_id}/results${queryString}`, {
method: 'GET', headers: { 'X-API-KEY': apiKey } })). If you need raw-response
handling for downloads, extend apiClient to support a passthrough/raw response
mode and call that new option from this handler instead of bypassing apiClient.
app/api/assessment/assessments/[assessment_id]/retry/route.ts-7-32 (1)

7-32: ⚠️ Potential issue | 🟠 Major

Use the shared apiClient proxy here.

This route bypasses the server-side proxy helper and only forwards X-API-KEY, so any backend auth/session behavior that depends on forwarded cookies will be lost. Prefer the shared helper so header forwarding and response handling stay consistent.

Proposed direction
 import { NextRequest, NextResponse } from 'next/server';
+import { apiClient } from '@/app/lib/apiClient';

@@
-    const apiKey = request.headers.get('X-API-KEY');
-
-    if (!apiKey) {
-      return NextResponse.json(
-        { error: 'Missing X-API-KEY header' },
-        { status: 401 }
-      );
-    }
-
     const { assessment_id } = await context.params;
-    const backendUrl = process.env.BACKEND_URL || 'http://localhost:8000';
-
-    const response = await fetch(
-      `${backendUrl}/api/v1/assessment/assessments/${assessment_id}/retry`,
-      {
-        method: 'POST',
-        headers: {
-          'X-API-KEY': apiKey,
-        },
-      }
-    );
-
-    const data = await response.json();
-    return NextResponse.json(data, { status: response.status });
+    const { status, data } = await apiClient(
+      request,
+      `/api/v1/assessment/assessments/${assessment_id}/retry`,
+      { method: 'POST' },
+    );
+
+    return NextResponse.json(data, { status });

As per coding guidelines, app/api/**/*.{ts,tsx} routes should use apiClient(request, endpoint, options) to forward backend requests.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `app/api/assessment/assessments/[assessment_id]/retry/route.ts` around lines 7
- 32, The POST handler is calling fetch directly and bypasses the shared proxy;
update the POST function to call the existing apiClient(request, endpoint,
options) instead of fetch so cookies and forwarded headers are preserved. Build
the endpoint as `/api/v1/assessment/assessments/${assessment_id}/retry`, call
apiClient(request, endpoint, { method: 'POST', headers: { 'X-API-KEY': apiKey }
}) (or omit explicit headers if apiClient already forwards them), await the
response JSON and return it with NextResponse.json(data, { status:
response.status }); ensure you reference the POST function and the assessment_id
param when making the change.
app/assessment/schemaUtils.ts-19-20 (1)

19-20: ⚠️ Potential issue | 🟠 Major

Do not emit empty enum schemas.

If all enum values are blank, this produces enum: [], which makes the field impossible to satisfy. Trim/filter values first and skip or downgrade invalid enum properties instead.

Proposed fix
     } else if (p.type === 'enum') {
-      def = { type: 'string', enum: p.enumValues.filter(v => v.trim()) };
+      const enumValues = p.enumValues.map((v) => v.trim()).filter(Boolean);
+      if (enumValues.length === 0) return;
+      def = { type: 'string', enum: enumValues };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/schemaUtils.ts` around lines 19 - 20, When handling p.type ===
'enum' in schemaUtils.ts, trim and filter p.enumValues first (e.g., const vals =
p.enumValues.map(v => v.trim()).filter(Boolean)); if vals is non-empty set def =
{ type: 'string', enum: vals } otherwise do not emit an empty enum—fallback to a
plain string type (def = { type: 'string' }) or omit the enum constraint for
that property so you avoid producing enum: [] for property p.
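The defensive handling proposed here is small enough to show in isolation: trim and filter enum values before emitting the schema fragment, and fall back to a plain string when nothing valid remains. The types below are simplified stand-ins for the PR's schema-property shape, not its actual definitions:

```typescript
// Convert an enum schema property to a JSON-schema fragment, never emitting
// `enum: []` (an empty enum is unsatisfiable). Falls back to a plain string
// type when all values are blank.
type EnumProperty = { type: "enum"; enumValues: string[] };

function enumToJsonSchema(
  p: EnumProperty,
): { type: "string"; enum?: string[] } {
  const values = p.enumValues.map((v) => v.trim()).filter(Boolean);
  return values.length > 0
    ? { type: "string", enum: values }
    : { type: "string" };
}
```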
package.json-25-25 (1)

25-25: ⚠️ Potential issue | 🟠 Major

Avoid adding vulnerable xlsx@0.18.5 for parsing user-uploaded files.

Version 0.18.5 is affected by two high-severity advisories: GHSA-4r6h-8v6p-xvw6 (Prototype Pollution) and GHSA-5pgg-2g8v-p4x9 (Regular Expression Denial of Service). Since DatasetStep parses user-provided spreadsheet content with this package, this introduces unnecessary supply-chain and input-processing risk. Use a maintained parser, implement server-side constraints on parsing, or choose a non-vulnerable distribution strategy before accepting arbitrary uploads.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@package.json` at line 25, Your package currently depends on a vulnerable xlsx
version (xlsx@0.18.5) which is used by DatasetStep to parse user uploads; remove
or replace this dependency with a non-vulnerable, actively maintained parser (or
upgrade to a patched release) and update usages in DatasetStep to the new API.
Additionally, enforce server-side constraints when parsing untrusted
spreadsheets: validate MIME/type and extension before processing, enforce strict
file size and row/column limits, parse inside an isolated worker/process with
timeouts, and add input sanitization and error handling to DatasetStep to avoid
prototype pollution or ReDoS risks. Ensure tests cover the new parser and the
defensive checks.
app/api/assessment/assessments/[assessment_id]/results/route.ts-31-39 (1)

31-39: ⚠️ Potential issue | 🟠 Major

Stream file exports instead of materializing a Blob.

The misleading comment "stream the response through" contradicts the implementation: await response.blob() buffers the entire CSV/XLSX/ZIP payload in memory before responding. Use response.body directly instead to stream the response without buffering.

Same buffering issue exists in app/api/assessment/evaluations/[evaluation_id]/results/route.ts at line 33—both routes should be fixed.

Proposed streaming change
-      const blob = await response.blob();
       const headers = new Headers();
       headers.set('Content-Type', contentType);
       const disposition = response.headers.get('content-disposition');
       if (disposition) headers.set('Content-Disposition', disposition);
-      return new NextResponse(blob, { status: response.status, headers });
+      return new NextResponse(response.body, { status: response.status, headers });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/api/assessment/assessments/`[assessment_id]/results/route.ts around lines
31 - 39, The code currently buffers file exports by calling response.blob() in
the route handler (see the block that reads response.headers and calls await
response.blob()), which contradicts the "stream the response through" intent;
change the implementation to pipe the upstream response.body stream directly
into the NextResponse without materializing a Blob, e.g. use response.body as
the body for new NextResponse and copy relevant headers (Content-Type and
Content-Disposition) and status; apply the same change to the analogous handler
in app/api/assessment/evaluations/[evaluation_id]/results/route.ts where
response.blob() is used.
app/assessment/page.tsx-1-19 (1)

1-19: ⚠️ Potential issue | 🟠 Major

Move the Assessment route into the (main) route group.

This authenticated application route is currently at app/assessment/page.tsx; place it at app/(main)/assessment/page.tsx so it follows the repo's App Router grouping without changing the public /assessment URL.

As per coding guidelines, app/{(auth),(main)}/**/*.{ts,tsx} routes should be organized using Next.js App Router route groups: app/(auth)/ for unauthenticated flows and app/(main)/ for authenticated application routes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/page.tsx` around lines 1 - 19, The Assessment page route is in
the wrong route group; move the file that exports default async function
AssessmentPage (the component importing AssessmentPageClient, FeatureRouteGuard,
FeatureFlag, and getServerFeatureFlags) into the (main) route group so it lives
under the app/(main)/assessment route group while keeping the public /assessment
URL; after moving, verify and update any relative imports (e.g.,
AssessmentPageClient, FeatureRouteGuard, getServerFeatureFlags, FeatureFlag) so
they resolve from the new location and ensure the exported function signature
and behavior (checking initialFlags[FeatureFlag.ASSESSMENT] and calling
notFound()) remain unchanged.
app/api/assessment/evaluations/[evaluation_id]/retry/route.ts-7-42 (1)

7-42: 🛠️ Refactor suggestion | 🟠 Major

Use apiClient for this proxy route.

Same issue as app/api/assessment/assessments/route.ts: the handler reimplements header forwarding and JSON parsing that apiClient handles. Also address the Prettier formatting CI warning.

As per coding guidelines: "Use apiClient(request, endpoint, options) in server-side Next.js route handlers to forward requests to the backend, which automatically relays X-API-KEY and Cookie headers and returns { status, data, headers }".

♻️ Proposed refactor
-import { NextRequest, NextResponse } from 'next/server';
+import { NextRequest, NextResponse } from 'next/server';
+import { apiClient } from '@/app/lib/apiClient';
 
 interface RouteContext {
   params: Promise<{ evaluation_id: string }>;
 }
 
 export async function POST(request: NextRequest, context: RouteContext) {
   try {
-    const apiKey = request.headers.get('X-API-KEY');
-
-    if (!apiKey) {
-      return NextResponse.json(
-        { error: 'Missing X-API-KEY header' },
-        { status: 401 }
-      );
-    }
-
     const { evaluation_id } = await context.params;
-    const backendUrl = process.env.BACKEND_URL || 'http://localhost:8000';
-
-    const response = await fetch(
-      `${backendUrl}/api/v1/assessment/evaluations/${evaluation_id}/retry`,
-      {
-        method: 'POST',
-        headers: {
-          'X-API-KEY': apiKey,
-        },
-      }
-    );
-
-    const data = await response.json();
-    return NextResponse.json(data, { status: response.status });
+    const { status, data } = await apiClient(
+      request,
+      `/api/v1/assessment/evaluations/${evaluation_id}/retry`,
+      { method: 'POST' }
+    );
+    return NextResponse.json(data, { status });
   } catch (error: unknown) {
     return NextResponse.json(
       {
         error: 'Failed to forward evaluation retry request',
         details: error instanceof Error ? error.message : String(error),
       },
       { status: 500 }
     );
   }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/api/assessment/evaluations/`[evaluation_id]/retry/route.ts around lines 7
- 42, The POST handler in route.ts reimplements header forwarding and JSON
parsing; replace the manual fetch logic with the shared apiClient utility: call
apiClient(request, `/api/v1/assessment/evaluations/${evaluation_id}/retry`, {
method: 'POST' }), then return NextResponse.json(response.data, { status:
response.status }) (or forward headers if needed) instead of manually extracting
X-API-KEY and parsing JSON; update the function POST and remove the custom
headers block and try/catch parsing duplication, and ensure the file formatting
matches Prettier (run formatter) to resolve the CI warning.
app/api/assessment/assessments/route.ts-3-32 (1)

3-32: 🛠️ Refactor suggestion | 🟠 Major

Use apiClient instead of raw fetch for backend forwarding.

Route handlers under app/api/** must use apiClient(request, endpoint, options) which automatically relays X-API-KEY and Cookie headers and returns { status, data, headers }. This eliminates manual key validation, query reconstruction, and JSON parsing boilerplate. Also run npx prettier --write on this file to resolve CI warnings.

♻️ Proposed refactor
-import { NextRequest, NextResponse } from 'next/server';
-
-export async function GET(request: NextRequest) {
-  try {
-    const apiKey = request.headers.get('X-API-KEY');
-
-    if (!apiKey) {
-      return NextResponse.json(
-        { error: 'Missing X-API-KEY header' },
-        { status: 401 }
-      );
-    }
-
-    const backendUrl = process.env.BACKEND_URL || 'http://localhost:8000';
-    const searchParams = request.nextUrl.searchParams.toString();
-    const queryString = searchParams ? `?${searchParams}` : '';
-
-    const response = await fetch(`${backendUrl}/api/v1/assessment/assessments${queryString}`, {
-      method: 'GET',
-      headers: {
-        'X-API-KEY': apiKey,
-      },
-    });
-
-    const data = await response.json();
-    return NextResponse.json(data, { status: response.status });
-  } catch (error: unknown) {
-    return NextResponse.json(
-      { error: 'Failed to forward request to backend', details: error instanceof Error ? error.message : String(error) },
-      { status: 500 }
-    );
-  }
-}
+import { NextRequest, NextResponse } from 'next/server';
+import { apiClient } from '@/app/lib/apiClient';
+
+export async function GET(request: NextRequest) {
+  try {
+    const qs = request.nextUrl.searchParams.toString();
+    const endpoint = `/api/v1/assessment/assessments${qs ? `?${qs}` : ''}`;
+    const { status, data } = await apiClient(request, endpoint, { method: 'GET' });
+    return NextResponse.json(data, { status });
+  } catch (error: unknown) {
+    return NextResponse.json(
+      {
+        error: 'Failed to forward request to backend',
+        details: error instanceof Error ? error.message : String(error),
+      },
+      { status: 500 }
+    );
+  }
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/api/assessment/assessments/route.ts` around lines 3 - 32, Replace the
manual fetch flow inside the exported GET function: remove the manual X-API-KEY
header check, backendUrl/queryString construction, and raw fetch/response.json
logic, and instead call apiClient(request, '/api/v1/assessment/assessments' +
request.nextUrl.search, { method: 'GET' }) (or appropriate options), then return
NextResponse.json(result.data, { status: result.status, headers: result.headers
}) using the returned { status, data, headers } from apiClient; also run npx
prettier --write on this file to fix CI formatting warnings.
app/lib/FeatureFlagProvider.tsx-38-84 (1)

38-84: ⚠️ Potential issue | 🟠 Major

SSR-provided flags can be wiped on first client render.

getServerFeatureFlags() resolves flags from the kaapi_api_key cookie, while fetchFlags here gates on the STORAGE_KEY localStorage entry. These two auth sources can disagree:

  • A signed-in user whose API key only lives in the HttpOnly cookie (or whose localStorage hasn’t populated yet at first render) will hit lines 57–61, setting flags to {} and overwriting the server snapshot that initialFlags just seeded on line 39.
  • Consumers like FeatureRouteGuard / Sidebar that gate on isEnabled will briefly (or persistently) show “feature disabled” even though the server already determined it was enabled.

Consider either (a) skipping the localStorage gate when hasServerSnapshot is true and trusting the server until a real fetch returns a new answer, or (b) attempting the /api/features call unconditionally (the route already returns 401 when unauthenticated, which you can treat as “no change”). This preserves the SSR→CSR continuity the initialFlags prop was designed for.
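Option (a) reduces to a small decision that can be unit-tested in isolation (names here are illustrative; the provider's actual state shape may differ):

```typescript
type FlagAction = "fetch" | "keep" | "clear";

// Sketch of option (a): only wipe flags when neither auth source exists;
// an SSR snapshot is trusted until a real /api/features answer arrives.
function decideFlagAction(
  hasLocalStorageKey: boolean,
  hasServerSnapshot: boolean,
): FlagAction {
  if (hasLocalStorageKey) return "fetch"; // normal client-side path
  if (hasServerSnapshot) return "keep"; // trust the SSR-seeded flags
  return "clear"; // truly unauthenticated: {} is correct
}
```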

app/api/assessment/datasets/[dataset_id]/route.ts-42-54 (1)

42-54: ⚠️ Potential issue | 🟠 Major

S3 download has no timeout and buffers the full file in memory.

Two reliability/scalability concerns in the fetch_content=true branch:

  1. No timeout on line 47. If the signed URL hangs (slow S3 region, network partition, dead DNS), this fetch has no AbortSignal.timeout(...) and will tie up the serverless/Node handler indefinitely. On platforms with concurrency-per-instance caps, a handful of such requests can exhaust the pool.
  2. Whole-file base64 in memory on line 51–52. arrayBuffer() + Buffer.from(...).toString('base64') allocates the full file plus ~1.33× for the base64 copy. For even moderately large assessment datasets (tens of MB), this is a large transient allocation on every call and the response body is then serialized through NextResponse.json. Given DatasetStep.tsx just base64-decodes it back to parse with XLSX, a much better contract would be to return the signed_url to the client and let the browser download + parse directly, or to stream the bytes through as application/octet-stream.
🛡️ Minimal fix (timeout), plus streaming suggestion
-      const fileResponse = await fetch(signedUrl);
+      const fileResponse = await fetch(signedUrl, {
+        signal: AbortSignal.timeout(30_000),
+      });

Longer term, consider returning the signed URL directly (client-side fetch already works for presigned S3 URLs without CORS issues if the bucket is configured) or streaming via new Response(fileResponse.body, { headers: { 'content-type': ... } }) instead of buffering + base64.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/api/assessment/datasets/`[dataset_id]/route.ts around lines 42 - 54, The
fetch_content=true branch currently does an untimeouted fetch(signedUrl) and
buffers the entire file (arrayBuffer + Buffer.from(...).toString('base64')),
which can hang handlers and OOM; change the logic in the fetchContent block to
(1) use an AbortController with a reasonable timeout (e.g., 10s) and pass
controller.signal to fetch(signedUrl) to avoid hanging requests, and (2) stop
base64-buffering large files on the server — instead return the signedUrl back
to the client (or stream the fileResponse.body with new Response(...) and
appropriate content-type) via NextResponse.json({ ...data, signed_url: signedUrl
}) so DatasetStep.tsx can fetch/parse in the browser rather than using
arrayBuffer/Buffer.from on the server. Ensure you update references to
signedUrl, fileResponse, and remove the arrayBuffer/Buffer.from usage in that
branch.
app/api/assessment/datasets/[dataset_id]/route.ts-10-64 (1)

10-64: ⚠️ Potential issue | 🟠 Major

Use apiClient to forward the backend request per coding guidelines.

This route uses raw fetch() to call the backend instead of the required apiClient(request, endpoint, options). The coding guideline for app/api/** routes states:

"Use apiClient(request, endpoint, options) in server-side Next.js route handlers to forward requests to the backend, which automatically relays X-API-KEY and Cookie headers and returns { status, data, headers }"

Currently this route:

  • Only reads X-API-KEY from the request headers (line 15), missing the cookie fallback that apiClient provides — callers using cookie-based auth will 401
  • Mirrors the entire upstream response body on error (line 38) instead of extracting a normalized message using the pattern body.error || body.message || body.detail
  • Re-implements query-string passthrough manually

Additionally:

  • The S3 signed URL fetch (line 47) has no timeout, risking indefinite thread pins if the connection hangs
  • Loading the entire file into memory as an ArrayBuffer and base64-encoding it (lines 51–53) causes memory spikes and ~33% size overhead, which can exhaust memory on larger datasets or constrained runtimes

Refactor to use apiClient() and consider returning the signed URL to the client instead of fetching the file server-side.
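The normalized-error extraction the guideline calls for is a one-liner worth centralizing (the body shape is an assumption about typical backend error payloads):

```typescript
// Extract a single human-readable message from a backend error body,
// following the body.error || body.message || body.detail convention.
function normalizeBackendError(body: Record<string, unknown>): string {
  const msg = body.error ?? body.message ?? body.detail;
  return typeof msg === "string" ? msg : "Unknown backend error";
}
```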

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/api/assessment/datasets/`[dataset_id]/route.ts around lines 10 - 64, The
GET route handler currently calls fetch() directly inside the exported GET
function and manually parses query params, reads only X-API-KEY, mirrors
upstream bodies on error, and downloads the signed S3 file into memory; replace
that call with apiClient(request, `/assessment/datasets/${dataset_id}`, {
method: 'GET', params: <forwarded searchParams with include_signed_url when
fetch_content> }) so headers/cookies are relayed automatically, stop manual
URL/query-string handling, and use the apiClient response shape ({ status, data,
headers }) to return normalized errors (use data.error || data.message ||
data.detail) instead of mirroring the backend body; do not fetch the signed URL
server-side in GET (remove the fileResponse/arrayBuffer/base64 logic) — return
the signed_url to the client or, if you must fetch server-side, implement a
bounded timeout and stream-to-client approach instead of Buffer.from to avoid
memory spikes.
app/assessment/components/DatasetStep.tsx-188-204 (1)

188-204: ⚠️ Potential issue | 🟠 Major

Clear stale columns when dataset parsing fails.

setDatasetId(id) runs before fetchAndParseFile() succeeds. If parsing fails or returns no headers, the selected dataset ID can remain paired with columns/sample data from the previous dataset, and line 285 can enable "Next" with an invalid mapping context.

Track the parsed dataset ID, reset columns on failure, and gate Next on successful parsing.

🐛 Suggested direction
+const [parsedDatasetId, setParsedDatasetId] = useState("");

 const handleDatasetSelect = async (id: string) => {
   setDatasetId(id);
+  setParsedDatasetId("");
   if (!id) return;

   setIsLoadingColumns(true);
   try {
     const parsed = await fetchAndParseFile(id);
-    if (parsed?.headers) {
+    if (parsed?.headers?.length) {
       const firstRow = parsed.rows[0] || [];
       const sampleRow = Object.fromEntries(
         parsed.headers.map((header, index) => [
           header,
           String(firstRow[index] ?? ""),
         ]),
       );
       onColumnsLoaded(parsed.headers, sampleRow);
+      setParsedDatasetId(id);
+    } else {
+      onColumnsLoaded([], {});
+      toast.error("Could not read columns from this dataset");
     }
   } catch (e) {
     console.error("Failed to fetch dataset columns:", e);
+    onColumnsLoaded([], {});
   } finally {
     setIsLoadingColumns(false);
   }
 };

-const canProceed = datasetId && !isLoadingColumns;
+const canProceed = Boolean(datasetId) && parsedDatasetId === datasetId && !isLoadingColumns;

Also applies to: 285-285

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/DatasetStep.tsx` around lines 188 - 204,
handleDatasetSelect currently sets setDatasetId(id) before fetchAndParseFile
succeeds, so if parsing fails or returns no headers the UI can keep stale
columns; modify handleDatasetSelect to clear/reset columns/sample data (call
onColumnsLoaded with empty arrays/object or a dedicated reset handler) and clear
any parsedDatasetId state before calling fetchAndParseFile, then only call
setDatasetId and onColumnsLoaded with real headers/sampleRow after a successful
parse; ensure setIsLoadingColumns is used around the async call and gate the
"Next" button (the logic that checks parsedDatasetId or columns) to require a
successful parse (use a parsedDatasetId or a boolean like isColumnsLoaded) so
Next is disabled unless parsing produced headers.
app/assessment/AssessmentPageClient.tsx-95-144 (1)

95-144: 🛠️ Refactor suggestion | 🟠 Major

Use clientFetch() for assessment API requests.

The polling and submit requests are made with raw fetch(), bypassing centralized 401 refresh handling and auth-expiry dispatch.

♻️ Suggested pattern
-const response = await fetch("/api/assessment/assessments?limit=10", {
+const response = await clientFetch("/api/assessment/assessments?limit=10", {
   headers: { "X-API-KEY": selectedKey.key },
 });

As per coding guidelines, **/*.{ts,tsx} should use clientFetch(endpoint, options) on the client-side for API requests, which handles token refresh on 401 errors and dispatches AUTH_EXPIRED_EVENT when refresh fails.

Also applies to: 190-224

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/AssessmentPageClient.tsx` around lines 95 - 144, Replace raw
fetch calls in the client-side assessment API logic with the centralized helper
clientFetch to ensure token refresh/401 handling; specifically update the
pollEvalStatus function (and the related submit function referenced around lines
190-224) to call clientFetch("/api/assessment/assessments?limit=10", { headers:
{ "X-API-KEY": selectedKey.key }, /* include same options as before */ })
instead of fetch, then preserve the existing response.ok check, JSON parsing,
and subsequent logic (runs detection, hasProcessing check, setEvalIndicator
branches, and dismissedRef usage). Ensure you import/usage-reference clientFetch
where pollEvalStatus and the submit handler are defined and keep the same error
handling behavior (catch block) semantics.
app/assessment/components/PromptAndConfigStep.tsx-512-537 (1)

512-537: ⚠️ Potential issue | 🟠 Major

Check MAX_CONFIGS before saving a new behavior.

When the selection is already full, this still persists a new config, addSelection() rejects it, and the UI then resets and shows "saved and added" anyway. Add the cap check before saveAssessmentConfig() and only show success after the selection is actually added.

🐛 Suggested guard
 const handleCreateAndAdd = async () => {
   if (!configName.trim()) {
     toast.error("Configuration name is required");
     return;
   }
+  if (configs.length >= MAX_CONFIGS) {
+    toast.error(`You can select up to ${MAX_CONFIGS} configurations`);
+    return;
+  }
   setIsSaving(true);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/PromptAndConfigStep.tsx` around lines 512 - 537,
Before calling saveAssessmentConfig in handleCreateAndAdd, check the current
selection count against MAX_CONFIGS (use whatever selection state or selector
the component uses) and abort with toast.error if full; only proceed to save
after confirming there is room. Move the persistence and UI-reset logic
(saveAssessmentConfig, invalidateAssessmentConfigCache,
setDraft(buildInitialDraft()), setConfigName(""), setCommitMessage(""),
setConfigMode("existing"), and toast.success) to occur after addSelection
succeeds, and handle addSelection failure by showing an error and not resetting
the form. Ensure you still call loadConfigs(0, true) after successful
addSelection and keep references to addSelection, saveAssessmentConfig,
handleCreateAndAdd, and MAX_CONFIGS when updating the flow.
app/assessment/config/ConfigurationStep.tsx-217-438 (1)

217-438: ⚠️ Potential issue | 🟠 Major

Avoid stale selection state after async config loads.

toggleVersionSelection() awaits fetchConfigSelection(), then addSelection() writes using the configs array captured before the request. Parallel selections can overwrite each other or apply max/duplicate checks against stale state.

Use a functional setter type for setConfigs, or a latest-configs ref, so async completions merge with the current selection list.

🐛 Suggested direction
 interface ConfigurationStepProps {
   configs: ConfigSelection[];
-  setConfigs: (configs: ConfigSelection[]) => void;
+  setConfigs: React.Dispatch<React.SetStateAction<ConfigSelection[]>>;
 }

 const removeSelection = useCallback(
   (configId: string, version: number) => {
-    setConfigs(
-      configs.filter(
+    setConfigs((prev) =>
+      prev.filter(
         (item) =>
           !(item.config_id === configId && item.config_version === version),
       ),
     );
   },
-  [configs, setConfigs],
+  [setConfigs],
 );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/config/ConfigurationStep.tsx` around lines 217 - 438, The
async path in toggleVersionSelection leads to stale writes because addSelection
uses the captured configs array; change addSelection to use the functional state
updater form (call setConfigs(prev => { perform duplicate/max checks against
prev and return [...prev, selection] })) so duplicate detection and MAX_CONFIGS
enforcement operate on the latest list; ensure any other places that call
setConfigs from async callbacks (e.g., toggleVersionSelection) follow the same
pattern or read from a latest-configs ref.
app/assessment/components/EvaluationsTab.tsx-186-431 (1)

186-431: 🛠️ Refactor suggestion | 🟠 Major

Use clientFetch() for client-side API calls.

These direct fetch() calls bypass the project’s token-refresh and auth-expiry handling. Keep the X-API-KEY header, but route the request through clientFetch(endpoint, options).

♻️ Suggested pattern
-const response = await fetch('/api/assessment/assessments', {
+const response = await clientFetch('/api/assessment/assessments', {
   headers: { 'X-API-KEY': apiKey },
 });

As per coding guidelines, **/*.{ts,tsx} should use clientFetch(endpoint, options) on the client-side for API requests, which handles token refresh on 401 errors and dispatches AUTH_EXPIRED_EVENT when refresh fails.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/EvaluationsTab.tsx` around lines 186 - 431,
Multiple client-side fetch calls (loadAssessments, loadChildRuns,
loadConfigDetail, triggerDownload, handleRerun, handleRetryAssessment,
handlePreview) bypass the project's token-refresh/auth-expiry logic; replace
each direct fetch(...) with clientFetch(endpoint, options) while preserving the
existing headers (including 'X-API-KEY'), HTTP methods, query strings, response
checks, JSON/blob parsing, and error handling. Ensure you import/ensure
clientFetch is available in this module and use it in the same places and shapes
as the original fetch calls so behavior (response.ok checks, parsing, thrown
errors, and finally blocks) remains identical but benefits from token refresh
and AUTH_EXPIRED_EVENT handling.
app/assessment/components/DatasetStep.tsx-58-255 (1)

58-255: 🛠️ Refactor suggestion | 🟠 Major

Use clientFetch() for client-side API calls.

Dataset list/load/upload/delete requests currently use raw fetch(), so 401 refresh and AUTH_EXPIRED_EVENT behavior is skipped.

♻️ Suggested pattern
-const response = await fetch("/api/assessment/datasets", {
+const response = await clientFetch("/api/assessment/datasets", {
   headers: { "X-API-KEY": apiKey },
 });

As per coding guidelines, **/*.{ts,tsx} should use clientFetch(endpoint, options) on the client-side for API requests, which handles token refresh on 401 errors and dispatches AUTH_EXPIRED_EVENT when refresh fails.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/DatasetStep.tsx` around lines 58 - 255, Several
client-side API calls use raw fetch (notably in fetchAndParseFile, loadDatasets,
handleCreateDataset, handleDatasetSelect indirectly, handleViewDataset, and
handleDeleteDataset), which bypasses token refresh and AUTH_EXPIRED_EVENT;
replace those fetch calls with clientFetch(endpoint, options) and preserve all
existing options (method, body/formData, headers including "X-API-KEY": apiKey)
and existing response/error handling logic, import clientFetch at the top of the
file, and ensure you still call await response.json() and check response.ok
exactly as before so behavior remains the same except for automatic
refresh/AUTH_EXPIRED_EVENT handling.
app/assessment/components/PromptAndConfigStep.tsx-345-394 (1)

345-394: ⚠️ Potential issue | 🟠 Major

Avoid stale configs writes after async selection fetches.

addSelection() closes over the configs array from the render that started the request. If users select multiple versions quickly, later async completions can call setConfigs([...configs, selection]) with stale state and drop an earlier selection.

Prefer a functional state update by widening the prop type to React.Dispatch<React.SetStateAction<ConfigSelection[]>>, or keep a synchronized ref for the latest configs before applying async results.

🐛 Suggested direction
 interface PromptAndConfigStepProps {
-  setConfigs: (configs: ConfigSelection[]) => void;
+  setConfigs: React.Dispatch<React.SetStateAction<ConfigSelection[]>>;
 }

 const addSelection = useCallback(
   (selection: ConfigSelection) => {
-    if (
-      configs.some(
+    setConfigs((prev) => {
+      if (
+        prev.some(
           (c) =>
             c.config_id === selection.config_id &&
             c.config_version === selection.config_version,
-        )
-      ) {
-        toast.error("This configuration version is already selected");
-        return;
+        )
+      ) {
+        return prev;
       }
-      if (configs.length >= MAX_CONFIGS) {
-        toast.error(`You can select up to ${MAX_CONFIGS} configurations`);
-        return;
+
+      if (prev.length >= MAX_CONFIGS) {
+        return prev;
       }
-      setConfigs([...configs, selection]);
+
+      return [...prev, selection];
+    });
   },
-  [configs, setConfigs, toast],
+  [setConfigs],
 );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/PromptAndConfigStep.tsx` around lines 345 - 394,
addSelection currently closes over the stale configs array and uses
setConfigs([...configs, selection]) which can drop selections when async fetches
(in toggleVersionSelection -> fetchConfigSelection) resolve out of order; change
addSelection to use the functional state updater form (call setConfigs(prev =>
[...prev, selection])) or change the configs prop type to
React.Dispatch<React.SetStateAction<ConfigSelection[]>> and update all callers
accordingly so asynchronous completions in toggleVersionSelection/addSelection
safely append without relying on a captured configs variable.
app/assessment/config/api.ts-95-100 (1)

95-100: 🛠️ Refactor suggestion | 🟠 Major

Use the exported CACHE_KEY constant instead of hardcoding the string.

constants.ts already exports CACHE_KEY = 'kaapi_configs_cache' (line 276), but here the same literal is duplicated. If the key is ever renamed in constants.ts, cache invalidation will silently stop working. Import and use CACHE_KEY to stay in sync with the rest of the module.

♻️ Proposed fix
-import { CACHE_INVALIDATED_EVENT } from "./constants";
+import { CACHE_INVALIDATED_EVENT, CACHE_KEY } from "./constants";
@@
 export function invalidateAssessmentConfigCache(): void {
   if (typeof window === "undefined") return;

-  localStorage.removeItem("kaapi_configs_cache");
+  localStorage.removeItem(CACHE_KEY);
   window.dispatchEvent(new CustomEvent(CACHE_INVALIDATED_EVENT));
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/config/api.ts` around lines 95 - 100, The function
invalidateAssessmentConfigCache currently hardcodes the cache key string; import
and use the exported CACHE_KEY constant (from constants.ts) instead of the
literal "kaapi_configs_cache" inside invalidateAssessmentConfigCache and in any
localStorage.removeItem calls so the key stays in sync; update the top-of-file
imports to include CACHE_KEY and replace the hardcoded string in
localStorage.removeItem with CACHE_KEY.
app/assessment/config/api.ts-178-201 (1)

178-201: ⚠️ Potential issue | 🟠 Major

Client-side duplicate-name check is O(N) pagination with unhandled TOCTOU race condition.

findConfigByExactName paginates through all configs sequentially (100 per page) to verify a name doesn't exist before creating, yet another client/tab can still create that same name between the check (line 215) and the POST request (line 256). The backend proxy at app/api/configs/route.ts passes query parameters through to the backend, so if the backend /api/v1/configs endpoint supports a name filter parameter, use it to fetch directly instead of client-side pagination. If the backend returns a 409 on duplicate creates, requestJson will throw on non-2xx status, but currently there is no retry logic or special handling for duplicate-name errors—add explicit handling for the duplicate case as a retry that performs a new version POST instead of failing outright.
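The 409-retry path could be wrapped like this (createFn/newVersionFn are stand-ins for the real requestJson-based calls, and the err.status shape is an assumption about how requestJson reports failures):

```typescript
// Try the plain create first; on a duplicate-name conflict (409), fall
// back to posting a new version instead of surfacing the error.
async function createOrNewVersion<T>(
  createFn: () => Promise<T>,
  newVersionFn: () => Promise<T>,
): Promise<T> {
  try {
    return await createFn();
  } catch (err: unknown) {
    const status = (err as { status?: number }).status;
    if (status === 409) return newVersionFn();
    throw err; // anything other than a duplicate is a real failure
  }
}
```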

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/config/api.ts` around lines 178 - 201, The client-side
pagination in findConfigByExactName is inefficient and racey; change it to call
the backend configs endpoint with a name filter (use the same query param
pass-through that app/api/configs/route.ts supports) so fetchConfigPage is not
needed — i.e., query /api/v1/configs?name=<normalizedName> and return the single
match or null. Also add explicit handling around the POST request that creates a
config (the requestJson call used for creates): if the POST fails with a 409
duplicate-name response, catch that error and implement a retry path that
performs a “create new version” POST (instead of surfacing the error) so
concurrent creates result in a versioned config rather than an unhandled
failure.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f50ed920-8ac2-4b8e-9512-2e83818393bc

📥 Commits

Reviewing files that changed from the base of the PR and between a52ee8e and 5d8d9bc.

⛔ Files ignored due to path filters (2)
  • package-lock.json is excluded by !**/package-lock.json
  • yarn.lock is excluded by !**/yarn.lock, !**/*.lock
📒 Files selected for processing (51)
  • app/(main)/datasets/page.tsx
  • app/(main)/keystore/page.tsx
  • app/api/assessment/assessments/[assessment_id]/results/route.ts
  • app/api/assessment/assessments/[assessment_id]/retry/route.ts
  • app/api/assessment/assessments/route.ts
  • app/api/assessment/datasets/[dataset_id]/route.ts
  • app/api/assessment/datasets/route.ts
  • app/api/assessment/evaluations/[evaluation_id]/results/route.ts
  • app/api/assessment/evaluations/[evaluation_id]/retry/route.ts
  • app/api/assessment/evaluations/route.ts
  • app/api/assessment/events/route.ts
  • app/api/configs/[config_id]/versions/route.ts
  • app/api/configs/route.ts
  • app/api/features/route.ts
  • app/assessment/AssessmentPageClient.tsx
  • app/assessment/components/ColumnMapperStep.tsx
  • app/assessment/components/DataViewModal.tsx
  • app/assessment/components/DatasetStep.tsx
  • app/assessment/components/EvaluationsTab.tsx
  • app/assessment/components/JsonEditor.tsx
  • app/assessment/components/MultiConfigStep.tsx
  • app/assessment/components/OutputSchemaStep.tsx
  • app/assessment/components/PromptAndConfigStep.tsx
  • app/assessment/components/PromptEditorStep.tsx
  • app/assessment/components/ReviewStep.tsx
  • app/assessment/components/Stepper.tsx
  • app/assessment/config/ConfigurationStep.tsx
  • app/assessment/config/api.ts
  • app/assessment/config/constants.ts
  • app/assessment/page.tsx
  • app/assessment/schemaUtils.ts
  • app/assessment/store.ts
  • app/assessment/types.ts
  • app/assessment/useAssessmentEvents.ts
  • app/components/FeatureRouteGuard.tsx
  • app/components/Sidebar.tsx
  • app/components/icons/index.tsx
  • app/components/icons/sidebar/AssessmentIcon.tsx
  • app/components/speech-to-text/ModelComparisonCard.tsx
  • app/layout.tsx
  • app/lib/FeatureFlagProvider.tsx
  • app/lib/colors.ts
  • app/lib/configTypes.ts
  • app/lib/constants/featureFlags.ts
  • app/lib/constants/keystore.ts
  • app/lib/context/AuthContext.tsx
  • app/lib/featureFlags.server.ts
  • app/lib/navConfig.ts
  • app/lib/types/datasets.ts
  • app/lib/types/nav.ts
  • package.json
💤 Files with no reviewable changes (1)
  • app/(main)/keystore/page.tsx

Comment thread app/components/speech-to-text/ModelComparisonCard.tsx
Comment thread app/layout.tsx Outdated
- Added feature flag handling in various components and API routes.
- Introduced cookies for managing feature flags and updated authentication flows.
- Removed deprecated feature flag provider and related components.
- Enhanced sidebar and assessment page to utilize feature flags for conditional rendering.
- Updated middleware to enforce feature-based access control.
- Updated OutputSchemaStep to remove eslint-disable comments for dependencies.
- Enhanced PromptAndConfigStep to require at least one configuration and a response format to proceed, with updated UI elements and messages.
- Modified ReviewStep to make the onEdit prop optional and improved dataset display.
- Adjusted Stepper logic to allow sequential unlocking of steps.
- Simplified ConfigurationStep by removing Google (Gemini) configurations and updating related UI elements.
- Updated constants to remove references to Google provider and related configurations.
- Enhanced Zustand store to include datasetName and updated dataset management methods.
- Updated types to include datasetName in AssessmentFormState.
…add "use client" directive in ModelComparisonCard

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 10

🧹 Nitpick comments (12)
app/assessment/store.ts (2)

2-2: Use the @/ path alias for the types import.

-import { ColumnMapping } from "./types";
+import { ColumnMapping } from "@/app/assessment/types";

As per coding guidelines: "Use the @/ import alias configured in tsconfig.json to resolve imports from the project root (e.g., import { apiClient } from '@/app/lib/apiClient')".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/store.ts` at line 2, Replace the relative import of the
ColumnMapping type in app/assessment/store.ts with the project root alias;
change the import that currently references "./types" so it imports
ColumnMapping from "@/app/assessment/types" (update the import statement that
declares ColumnMapping).

23-27: Optional: avoid sharing a single DEFAULT_MAPPING reference across initial state and resets.

DEFAULT_MAPPING (and its inner arrays) is reused by reference for the initial state, setDataset's reset, and clearDataset. Today's consumers replace columnMapping with a brand-new object (see ColumnMapperStep.tsx), so this is safe — but any future code that mutates state.columnMapping.textColumns (or one of the other arrays) in place would corrupt the module-level default and every subsequent reset.

A small factory makes this robust against future regressions:

♻️ Proposed refactor
-const DEFAULT_MAPPING: ColumnMapping = {
-  textColumns: [],
-  attachments: [],
-  groundTruthColumns: [],
-};
+const createDefaultMapping = (): ColumnMapping => ({
+  textColumns: [],
+  attachments: [],
+  groundTruthColumns: [],
+});

Then use createDefaultMapping() in the initial state, setDataset, and clearDataset.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/store.ts` around lines 23 - 27, DEFAULT_MAPPING and its arrays
are shared by reference causing potential future mutation bugs; add a factory
function (e.g. createDefaultMapping()) that returns a fresh ColumnMapping object
with new arrays and replace usages of DEFAULT_MAPPING in the initial state, in
setDataset, and in clearDataset so each state/init/reset gets its own
independent object (leave ColumnMapperStep.tsx unchanged but ensure it still
receives a new object).
app/assessment/components/Stepper.tsx (1)

62-67: Use colors.text.white for consistency.

Per the AI summary, this PR introduces colors.text.white (already used in ColumnMapperStep.tsx line 434). Replacing the hardcoded "#ffffff" here keeps the design tokens centralized.

♻️ Proposed refactor
-                color: isActive
-                  ? "#ffffff"
-                  : isCompleted
-                    ? colors.text.primary
-                    : colors.text.secondary,
+                color: isActive
+                  ? colors.text.white
+                  : isCompleted
+                    ? colors.text.primary
+                    : colors.text.secondary,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/Stepper.tsx` around lines 62 - 67, The color value
in Stepper's render uses a hardcoded "#ffffff"; replace that literal with the
design token colors.text.white for consistency (update the ternary expression
setting color, e.g., inside the component or function rendering the step circle
where isActive/isCompleted are evaluated). Ensure the border expression remains
the same but if it references the same white anywhere else in this component,
swap to colors.text.white as well so design tokens are centralized (look for the
color: ... and border: `1px solid ${...}` expressions in Stepper.tsx).
app/assessment/components/DatasetStep.tsx (1)

286-304: Avoid synthesizing a native change event on a hidden file input.

handleDrop constructs a DataTransfer, mutates fileInputRef.current.files, and dispatches a native change event to piggyback on handleFileSelect. While this happens to work because React delegates events at the root, it duplicates the extension allowlist (lines 133 vs. 290) and is fragile — e.g., it silently no-ops when the dropped file fails the allowlist check (no toast like handleFileSelect produces).

Extract a shared acceptFile(file) and call it directly in both handlers.

♻️ Suggested refactor
+  const ALLOWED_EXTS = [".csv", ".xlsx", ".xls"];
+  const acceptFile = (file: File) => {
+    const hasValidExt = ALLOWED_EXTS.some((ext) =>
+      file.name.toLowerCase().endsWith(ext),
+    );
+    if (!hasValidExt) {
+      toast.error("Please select a CSV or Excel (.xlsx, .xls) file");
+      return;
+    }
+    setUploadedFile(file);
+    if (!datasetName) {
+      setDatasetName(file.name.replace(/\.(csv|xlsx|xls)$/i, ""));
+    }
+  };

   const handleFileSelect = (event: React.ChangeEvent<HTMLInputElement>) => {
     const file = event.target.files?.[0];
     if (!file) return;
-    const allowedExts = [".csv", ".xlsx", ".xls"];
-    ...
+    acceptFile(file);
+    event.target.value = "";
   };

   const handleDrop = (e: React.DragEvent) => {
     e.preventDefault();
     setIsDragging(false);
     const file = e.dataTransfer.files?.[0];
-    const allowedExts = [".csv", ".xlsx", ".xls"];
-    if (file && allowedExts.some(...)) {
-      const dt = new DataTransfer();
-      ...
-    }
+    if (file) acceptFile(file);
   };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/DatasetStep.tsx` around lines 286 - 304, handleDrop
currently synthesizes a native change event and duplicates the allowlist logic;
extract a shared function (e.g., acceptFile(file: File)) that encapsulates the
extension allowlist check, state updates, toasts, and any file-input
assignment/processing performed by handleFileSelect, then replace the
DataTransfer + dispatch logic in handleDrop with a direct call to
acceptFile(file); update handleFileSelect to call acceptFile(file) as well so
the allowlist and side-effects live in one place (refer to handleDrop and
handleFileSelect to locate where to extract and call acceptFile).
app/assessment/components/ReviewStep.tsx (2)

18-26: Avoid nesting interactive elements; remove the unused stepNumber prop.

Two minor items in AccordionSection:

  • The header is a <div role="button" tabIndex={0}> (line 44–60) that contains an inner <button> (line 99–112) for the Edit action. Nested interactive elements are non-conformant per the WAI-ARIA spec and can confuse assistive tech even though stopPropagation prevents the visual click bubbling. Consider rendering the header as <button type="button"> and placing Edit outside the toggle (e.g., as a sibling positioned via flex), or rework the layout so only one element is interactive.
  • stepNumber is declared on AccordionSectionProps (line 20) and threaded through every call site but never read — safe to drop.

Also applies to: 44-112

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/ReviewStep.tsx` around lines 18 - 26,
AccordionSection currently nests an inner interactive Edit button inside a
header implemented as a div with role="button" and declares a never-used
stepNumber prop; remove stepNumber from AccordionSectionProps and all call
sites, and refactor the header so the toggle is a single native control (change
the header container to <button type="button"> and keep onToggle on that button)
while moving the Edit action outside the toggle (render the Edit control as a
sibling element — e.g., a flex-aligned button or icon button — so it is not a
child of the toggle), and ensure keyboard handlers and aria attributes are
preserved on the new toggle button and that stopPropagation/handlers are updated
accordingly.

200-203: Prefer CSS :focus over imperative onFocus/onBlur for border styling.

Inline event handlers that mutate e.target.style work but defeat memoization and can desync with controlled re-renders. A Tailwind focus:border-[var(--accent)] (or a small CSS class) achieves the same result declaratively.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/ReviewStep.tsx` around lines 200 - 203, The JSX
currently uses imperative onFocus and onBlur handlers that mutate
e.target.style.borderColor; replace those handlers on the affected input
element(s) in ReviewStep.tsx with a declarative focus style (e.g., a Tailwind
class like focus:border-[var(--accent)] or a small CSS class using :focus) so
border color is handled by CSS instead of the onFocus/onBlur functions, and
remove the inline event handlers (search for the onFocus={(e)=>
(e.target.style.borderColor = colors.accent.primary)} and onBlur={(e)=>
(e.target.style.borderColor = colors.border)} occurrences to update the
element’s className accordingly).
app/assessment/components/OutputSchemaStep.tsx (2)

118-156: Unsafe type casts when ingesting external JSON.

fromJsonSchema casts actualDef.enum as string[] (line 142) and actualDef.type as SchemaPropertyType (line 144) without runtime validation. A user pasting a valid-looking JSON Schema with "enum": [1, 2, 3] or "type": "null" would silently produce a broken SchemaProperty. Consider filtering enum values to strings and falling back to "string" when actualDef.type is not in the supported SchemaPropertyType union.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/OutputSchemaStep.tsx` around lines 118 - 156,
fromJsonSchema currently unsafely casts actualDef.enum and actualDef.type;
update it to validate and sanitize inputs: when handling enum, set enumValues =
Array.isArray(actualDef.enum) ? (actualDef.enum.filter(v => typeof v ===
"string") as string[]) : []; and when deriving type, check actualDef.type
against the supported SchemaPropertyType union (e.g., allowedTypes = new
Set<SchemaPropertyType>(["string","number","boolean","object","enum","integer","..."]))
and use actualDef.type only if it is in allowedTypes, otherwise fallback to
"string"; keep all other behavior (isArray, children, required, genId()) intact.

8-20: Replace the Google Fonts dependency with an inline SVG asterisk.

useMaterialSymbols lazily injects a <link> to fonts.googleapis.com solely to render one glyph (asterisk at line 334). This adds:

  • A third-party network request on first mount (privacy/GDPR consideration when self-hosted otherwise).
  • A render flash where the text "asterisk" is visible before the font loads.
  • A failure mode in restricted/offline environments.

An inline SVG (or even the * character) would remove the external dependency entirely.

♻️ Suggested refactor
-/* ── Lazy-load Material Symbols ── */
-const MATERIAL_SYMBOLS_HREF =
-  "https://fonts.googleapis.com/css2?family=Material+Symbols+Outlined:opsz,wght,FILL,GRAD@24,400,0,0&icon_names=asterisk";
-
-function useMaterialSymbols() {
-  useEffect(() => {
-    if (document.querySelector(`link[href="${MATERIAL_SYMBOLS_HREF}"]`)) return;
-    const link = document.createElement("link");
-    link.rel = "stylesheet";
-    link.href = MATERIAL_SYMBOLS_HREF;
-    document.head.appendChild(link);
-  }, []);
-}
-          <span
-            className="material-symbols-outlined"
-            style={{ fontSize: "18px" }}
-          >
-            asterisk
-          </span>
+          <svg className="w-[18px] h-[18px]" viewBox="0 0 24 24" fill="currentColor" aria-hidden="true">
+            <path d="M12 2v20M4.93 6.93l14.14 10.14M19.07 6.93L4.93 17.07" stroke="currentColor" strokeWidth="2" strokeLinecap="round"/>
+          </svg>

You can then remove the useMaterialSymbols() call at line 561.

Also applies to: 330-336

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/OutputSchemaStep.tsx` around lines 8 - 20, The
current useMaterialSymbols hook injects an external Google Fonts link
(MATERIAL_SYMBOLS_HREF) just to render a single "asterisk" glyph; remove the
external dependency by deleting MATERIAL_SYMBOLS_HREF and the useMaterialSymbols
hook and removing its invocation, then replace the JSX that currently renders
the Material Symbols "asterisk" (where the icon name "asterisk" is used) with a
small inline SVG asterisk or the plain "*" character (update the
OutputSchemaStep JSX where that glyph is rendered); ensure any imports or
references solely for the font/hook are cleaned up.
app/assessment/page.tsx (1)

1-5: Organize the assessment route under the (main) route group.

app/assessment/page.tsx is an authenticated route but lives at the App Router root instead of under app/(main)/, inconsistent with all other authenticated feature routes in the codebase. Move to app/(main)/assessment/page.tsx (along with the colocated AssessmentPageClient.tsx and components/ directory). Route groups are URL-transparent, so the /assessment path is preserved while conforming to the project's routing convention: "Organize routes using Next.js App Router route groups: app/(auth)/ for unauthenticated flows (invite, verify) and app/(main)/ for authenticated application routes."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/page.tsx` around lines 1 - 5, The assessment route file lives
at the App Router root but should be inside the authenticated route group; move
app/assessment/page.tsx plus its colocated AssessmentPageClient.tsx and
components/ into app/(main)/assessment/, preserve the URL path (/assessment)
since route groups are transparent, and update any imports that reference the
old paths to the new location (e.g., import AssessmentPageClient from
"@/app/(main)/assessment/AssessmentPageClient" or use the appropriate alias).
Ensure the default export function AssessmentPage still returns
<AssessmentPageClient /> and adjust any relative imports inside
AssessmentPageClient or components to reflect the new directory structure.
app/assessment/config/constants.ts (3)

266-273: Silent GPT4_STYLE_CONFIG fallback can mislead the UI for unknown models.

When modelName doesn't match any entry, this returns the GPT-4 family params (top_p, temperature). For a reasoning-only model that hasn't been added to ASSESSMENT_MODEL_CONFIGS yet, the configuration UI will silently render irrelevant sliders, and buildDefaultParams will seed top_p/temperature into CompletionParams — values that may be invalid for that model. Consider either returning an empty definition {} (fail-closed) or logging a console.warn so missing entries surface during development.

♻️ Suggested change
 export function getModelConfigDefinition(
   modelName: string,
 ): Record<string, ConfigParamDefinition> {
-  return (
-    ASSESSMENT_MODEL_CONFIGS.find((item) => item.model_name === modelName)
-      ?.config ?? GPT4_STYLE_CONFIG
-  );
+  const entry = ASSESSMENT_MODEL_CONFIGS.find(
+    (item) => item.model_name === modelName,
+  );
+  if (!entry) {
+    if (process.env.NODE_ENV !== "production") {
+      console.warn(
+        `[assessment] Unknown model "${modelName}"; falling back to empty config.`,
+      );
+    }
+    return {};
+  }
+  return entry.config;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/config/constants.ts` around lines 266 - 273, The function
getModelConfigDefinition currently falls back to GPT4_STYLE_CONFIG when
ASSESSMENT_MODEL_CONFIGS has no matching model, which silently surfaces
inappropriate GPT-4 params in the UI; change it to fail-closed by returning an
empty definition {} (or at minimum emit a console.warn including the unknown
modelName) instead of GPT4_STYLE_CONFIG so the UI and downstream
buildDefaultParams don't seed invalid top_p/temperature values; update the logic
in getModelConfigDefinition and add a warning referencing
ASSESSMENT_MODEL_CONFIGS, GPT4_STYLE_CONFIG, and buildDefaultParams so missing
model entries are visible during development.

14-18: Tighten provider typing for type-safe lookups.

AssessmentModelConfig.provider is the literal "openai", but getModelsByProvider/getDefaultModelForProvider accept arbitrary string. Deriving a Provider union from PROVIDER_OPTIONS (or vice versa) would catch typos at the call sites and remove the need for the hardcoded "gpt-4o-mini" fallback string in getDefaultModelForProvider.

export type Provider = (typeof PROVIDER_OPTIONS)[number]["value"]; // "openai"

export function getModelsByProvider(provider: Provider): ModelOption[] { /* ... */ }

export function getDefaultModelForProvider(provider: Provider): string {
  // ASSESSMENT_MODEL_CONFIGS guaranteed non-empty for any valid Provider
  const entry = ASSESSMENT_MODEL_CONFIGS.find((m) => m.provider === provider);
  return entry?.model_name ?? ASSESSMENT_MODEL_CONFIGS[0].model_name;
}

Also applies to: 250-264

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/config/constants.ts` around lines 14 - 18, Change the loose
provider typing to a derived union and update callers: derive a Provider type
from PROVIDER_OPTIONS (e.g., Provider = (typeof
PROVIDER_OPTIONS)[number]["value"]), change AssessmentModelConfig.provider from
the literal "openai" to this Provider type, and update getModelsByProvider and
getDefaultModelForProvider to accept Provider rather than string; then remove
the hardcoded fallback model string ("gpt-4o-mini") in
getDefaultModelForProvider and return ASSESSMENT_MODEL_CONFIGS[0].model_name
when an entry isn't found (use ASSESSMENT_MODEL_CONFIGS and model_name for the
guaranteed default).

53-216: Reduce duplication: hoist the shared reasoning config like GPT4_STYLE_CONFIG.

The effort + summary parameter blocks are repeated almost verbatim across o3, o3-mini, o4-mini, gpt-5, gpt-5-mini, gpt-5-nano, gpt-5.1, and gpt-5.2. Only the effort.options differ (["low","medium","high"], ["minimal","low","medium","high"], ["none","low","medium","high"], ["none","low","medium","high","xhigh"]). Extracting helpers makes future additions (e.g., new effort tiers, copy tweaks) a one-line change and prevents drift like the current "null"/no-"null" inconsistency.

♻️ Proposed refactor
const SUMMARY_PARAM = {
  type: "enum",
  default: "auto",
  options: ["auto", "detailed", "concise"],
  description: "Summarize the reasoning result.",
} as const satisfies ConfigParamDefinition;

const reasoningConfig = (
  effortOptions: readonly string[],
): Record<string, ConfigParamDefinition> => ({
  effort: {
    type: "enum",
    default: "medium",
    options: [...effortOptions],
    description:
      "How long the model spends reasoning. Higher = better but slower.",
  },
  summary: { ...SUMMARY_PARAM },
});

// usage
{ provider: "openai", model_name: "o3", config: reasoningConfig(["low", "medium", "high"]) },
{ provider: "openai", model_name: "gpt-5", config: reasoningConfig(["minimal", "low", "medium", "high"]) },
{ provider: "openai", model_name: "gpt-5.2", config: reasoningConfig(["none", "low", "medium", "high", "xhigh"]) },
// "*-chat-latest" / "*-pro" models that only need summary:
{ provider: "openai", model_name: "gpt-5.1-chat-latest", config: { summary: { ...SUMMARY_PARAM } } },
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/config/constants.ts` around lines 53 - 216, Hoist the
duplicated effort+summary blocks by creating a shared SUMMARY_PARAM constant and
a reasoningConfig(effortOptions) factory, then replace the repeated config
blocks for model_name values like "o3-mini", "o3", "o4-mini", "gpt-5",
"gpt-5-mini", "gpt-5-nano", "gpt-5.1", and "gpt-5.2" to call
reasoningConfig(...) with the appropriate effort options (e.g.,
["low","medium","high"], ["minimal","low","medium","high"],
["none","low","medium","high","xhigh"]) and replace the "gpt-5.1-chat-latest"
entry to use only SUMMARY_PARAM; ensure the types satisfy ConfigParamDefinition
and that the shared SUMMARY_PARAM options do not include the stray "null" value
so all model entries inherit the consistent summary options.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/assessment/AssessmentPageClient.tsx`:
- Around line 367-379: The payload being built in AssessmentPageClient.tsx omits
the tracked columnMapping.groundTruthColumns, so ground-truth labels are never
sent; update the payload construction (where payload is created) to include the
groundTruthColumns mapping (e.g., add a ground_truth_columns or
groundTruthColumns field) by mapping columnMapping.groundTruthColumns into the
same simple shape as attachments/text_columns (extracting the column and any
needed type/format), ensuring it follows the server's expected key name and
structure so reference labels are included in the POST body.
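
A sketch of the payload fix, assuming shapes based on the review's description; the mapping type, the snake_case key names, and the attachment fields are all hypothetical and must be checked against the server contract.

```typescript
// Hypothetical sketch: ColumnMapping and the ground_truth_columns key are
// assumptions based on the review, not the actual AssessmentPageClient code.
type ColumnMapping = {
  textColumns: string[];
  attachments: { column: string; format: "url" | "base64" }[];
  groundTruthColumns: string[];
};

export function buildAssessmentPayload(mapping: ColumnMapping) {
  return {
    text_columns: mapping.textColumns,
    attachments: mapping.attachments.map((a) => ({
      column: a.column,
      format: a.format,
    })),
    // Previously omitted: without this key the server never receives the
    // reference labels assigned in the Column Mapper step.
    ground_truth_columns: mapping.groundTruthColumns,
  };
}
```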

In `@app/assessment/components/ColumnMapperStep.tsx`:
- Around line 74-103: The current Record<string, ColumnConfig> state
(columnConfigs) keyed by column name causes collisions when columns contains
duplicate names; change columnConfigs to be an array aligned with the columns
array (e.g., ColumnConfig[] managed by setColumnConfigs) so each column instance
is tracked by index, update any places using columnConfigs (initialization loop
that builds configs from columns and columnMapping, state updates when toggling
roles via setColumnConfigs, and handleNext which currently iterates
Object.entries(columnConfigs)) to use the index-aligned array, and ensure the
rendered list uses key={index} (or a stable index-based id) instead of
key={column} to eliminate duplicate-key warnings and preserve per-column state
for duplicates.

In `@app/assessment/components/DatasetStep.tsx`:
- Around line 60-94: fetchAndParseFile currently swallows all failure cases by
returning null, which lets handleDatasetSelect silently continue and leaves
isLoadingColumns true/columns empty so canProceed (which checks datasetId &&
!isLoadingColumns) can allow advancing with zero columns; change
fetchAndParseFile to throw descriptive errors (or return an error object) for
each failure path (non-ok response, missing file_content, missing sheet, empty
data), then update handleDatasetSelect to catch those specific errors, call the
toast/error UI with the error message, set isLoadingColumns to false and ensure
columns state is cleared or marked invalid so users cannot proceed, and keep the
canProceed logic (datasetId && !isLoadingColumns) intact so it blocks forward
progression when loading/parse fails; reference functions: fetchAndParseFile,
handleDatasetSelect and the canProceed/isLoadingColumns/columns state variables.

In `@app/assessment/components/OutputSchemaStep.tsx`:
- Around line 452-456: The useEffect that sets draft state when the modal opens
(useEffect that calls setDraftSchema(JSON.parse(JSON.stringify(schema))))
currently lists only [open] but also reads schema — either include schema in the
dependency array or intentionally silence the linter with an explanatory
eslint-disable-next-line comment; prefer keeping the effect triggered only on
open by using a ref to hold the latest schema (e.g., schemaRef.current updated
elsewhere) or add "// eslint-disable-next-line react-hooks/exhaustive-deps -- we
only want to run this when `open` changes, not when `schema` changes" above the
effect. Similarly, the effect that calls setSchema (the effect depending on
codeValue and editorMode) must include setSchema in its dependency array or be
wrapped so a stable setter is used (e.g., useCallback/memoize in parent) or add
a clear eslint-disable-next-line with comment explaining why setSchema is
intentionally excluded to avoid stale-closure bugs.

In `@app/assessment/components/PromptAndConfigStep.tsx`:
- Around line 1066-1165: The primary CTA always targets version 1 causing stale
selections; change the logic to compute the "latestVersion" from the loaded
version state (e.g. derive latestVersionId or latestVersionIndex from
versionStateByConfig[config.id].items[0] or the highest-version item) and use
that instead of the hard-coded 1 when computing defaultSelected, building the
loading key in loadingSelectionKeys (currently `${config.id}:1`), and when
calling toggleVersionSelection(config, 1); update those references to use the
computed latestVersion identifier so the button and selected state target the
newest saved version.

In `@app/assessment/config/ConfigurationStep.tsx`:
- Around line 827-927: The primary CTA is hard-coded to version "1"; instead
compute the default version from the latest available version in versions.items
and use that id everywhere instead of 1. For example, derive const
latestVersionId = versions.items[0]?.id ?? 1 and replace uses of the literal 1
in isSelected(config.id, 1), loadingSelectionKeys[`${config.id}:1`], and the
onClick handler void toggleVersionSelection(config, 1) with
isSelected(config.id, latestVersionId),
loadingSelectionKeys[`${config.id}:${latestVersionId}`], and void
toggleVersionSelection(config, latestVersionId) respectively (keeping a safe
fallback when versions.items is empty).

In `@app/assessment/config/constants.ts`:
- Line 86: Remove the invalid "null" string from the reasoning.summary options
arrays in the model config constants so the only accepted values are "auto",
"concise", and "detailed"; locate each config object that contains an options:
["auto", "detailed", "concise", "null"] (e.g., the reasoning.summary options
entries for the o3, o4-mini, gpt-5*, and gpt-5.1/5.2 variants) and change those
arrays to ["auto", "detailed", "concise"] to match the o3-mini entry and the
OpenAI Responses API allowed values.

In `@app/assessment/store.ts`:
- Around line 40-47: setDataset currently overwrites columnMapping with
DEFAULT_MAPPING every call; change it so columnMapping is only reset when the
dataset actually changes by comparing the incoming datasetId to the current
state.datasetId — if they differ, set columnMapping to DEFAULT_MAPPING,
otherwise preserve state.columnMapping (and continue to update datasetId,
datasetName, columns, sampleRow as already done). Refer to the setDataset setter
and the symbols columnMapping, DEFAULT_MAPPING and state.datasetId to implement
this conditional logic.
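
The conditional reset can be sketched as a pure state transition; the state shape below is an assumption based on the review, not the actual Zustand store.

```typescript
// Hypothetical sketch of the conditional mapping reset described above.
type Mapping = {
  textColumns: string[];
  attachments: string[];
  groundTruthColumns: string[];
};

const createDefaultMapping = (): Mapping => ({
  textColumns: [],
  attachments: [],
  groundTruthColumns: [],
});

type State = { datasetId: string | null; columnMapping: Mapping };

export function nextStateOnSetDataset(state: State, datasetId: string): State {
  return {
    datasetId,
    // Only reset the mapping when the dataset actually changes; re-selecting
    // the same dataset preserves the user's column assignments.
    columnMapping:
      state.datasetId === datasetId
        ? state.columnMapping
        : createDefaultMapping(),
  };
}
```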

In `@middleware.ts`:
- Around line 44-45: The middleware currently reads feature flags from a
client-writable cookie via
parseFeatures(request.cookies.get(FEATURES_COOKIE)?.value), which is untrusted;
change the check so feature gating is based on a server-trusted signal: either
read an HttpOnly/signed feature cookie issued by the server and verify its
signature (implement a verifyFeatureCookie function) or fetch the user’s feature
state from trusted server storage (e.g., getFeaturesForUser(userId) or
validateFeaturesAgainstServer(userId, features)) before allowing routes that
require ASSESSMENT; update the code paths that use parseFeatures and
FEATURES_COOKIE (including the other usage around the later block) to call the
new verification method and fall back to denying the feature if verification
fails.
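
One way to implement the suggested verifyFeatureCookie is an HMAC-signed cookie value. The `payload.signature` format, the secret handling, and both function names below are assumptions for illustration, not the project's actual scheme.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical sketch of a server-signed feature cookie. The cookie format
// ("payload.signature") and the secret source are assumptions.
export function signFeatures(features: string[], secret: string): string {
  const payload = Buffer.from(JSON.stringify(features)).toString("base64url");
  const sig = createHmac("sha256", secret).update(payload).digest("base64url");
  return `${payload}.${sig}`;
}

export function verifyFeatureCookie(
  cookie: string | undefined,
  secret: string,
): string[] | null {
  if (!cookie) return null;
  const [payload, sig] = cookie.split(".");
  if (!payload || !sig) return null;
  const expected = createHmac("sha256", secret).update(payload).digest();
  const actual = Buffer.from(sig, "base64url");
  // Constant-time comparison; deny the feature on any mismatch.
  if (expected.length !== actual.length || !timingSafeEqual(expected, actual)) {
    return null;
  }
  try {
    return JSON.parse(Buffer.from(payload, "base64url").toString());
  } catch {
    return null;
  }
}
```

With this shape, the middleware would call verifyFeatureCookie and treat a null result as the feature being absent, so a client-forged or tampered cookie fails closed. Setting the cookie HttpOnly as well keeps client script from reading or rewriting it.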

---

Nitpick comments:
In `@app/assessment/components/DatasetStep.tsx`:
- Around line 286-304: handleDrop currently synthesizes a native change event
and duplicates the allowlist logic; extract a shared function (e.g.,
acceptFile(file: File)) that encapsulates the extension allowlist check, state
updates, toasts, and any file-input assignment/processing performed by
handleFileSelect, then replace the DataTransfer + dispatch logic in handleDrop
with a direct call to acceptFile(file); update handleFileSelect to call
acceptFile(file) as well so the allowlist and side-effects live in one place
(refer to handleDrop and handleFileSelect to locate where to extract and call
acceptFile).

In `@app/assessment/components/OutputSchemaStep.tsx`:
- Around line 118-156: fromJsonSchema currently unsafely casts actualDef.enum
and actualDef.type; update it to validate and sanitize inputs: when handling
enum, set enumValues = Array.isArray(actualDef.enum) ? (actualDef.enum.filter(v
=> typeof v === "string") as string[]) : []; and when deriving type, check
actualDef.type against the supported SchemaPropertyType union (e.g.,
allowedTypes = new
Set<SchemaPropertyType>(["string","number","boolean","object","enum","integer","..."]))
and use actualDef.type only if it is in allowedTypes, otherwise fallback to
"string"; keep all other behavior (isArray, children, required, genId()) intact.
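A sketch of the validation described above, assuming a SchemaPropertyType union roughly like the one in this PR:

```typescript
type SchemaPropertyType = "string" | "number" | "integer" | "boolean" | "object" | "enum";

const ALLOWED_TYPES = new Set<SchemaPropertyType>([
  "string", "number", "integer", "boolean", "object", "enum",
]);

// Keep only string enum members instead of casting the raw array.
export function sanitizeEnumValues(raw: unknown): string[] {
  return Array.isArray(raw)
    ? raw.filter((v): v is string => typeof v === "string")
    : [];
}

// Accept the declared type only if it is in the supported union.
export function sanitizeType(raw: unknown): SchemaPropertyType {
  return ALLOWED_TYPES.has(raw as SchemaPropertyType)
    ? (raw as SchemaPropertyType)
    : "string"; // fall back rather than trusting arbitrary input
}
```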
- Around line 8-20: The current useMaterialSymbols hook injects an external
Google Fonts link (MATERIAL_SYMBOLS_HREF) just to render a single "asterisk"
glyph; remove the external dependency by deleting MATERIAL_SYMBOLS_HREF and the
useMaterialSymbols hook and removing its invocation, then replace the JSX that
currently renders the Material Symbols "asterisk" (where the icon name
"asterisk" is used) with a small inline SVG asterisk or the plain "*" character
(update the OutputSchemaStep JSX where that glyph is rendered); ensure any
imports or references solely for the font/hook are cleaned up.

In `@app/assessment/components/ReviewStep.tsx`:
- Around line 18-26: AccordionSection currently nests an inner interactive Edit
button inside a header implemented as a div with role="button" and declares a
never-used stepNumber prop; remove stepNumber from AccordionSectionProps and all
call sites, and refactor the header so the toggle is a single native control
(change the header container to <button type="button"> and keep onToggle on that
button) while moving the Edit action outside the toggle (render the Edit control
as a sibling element — e.g., a flex-aligned button or icon button — so it is not
a child of the toggle), and ensure keyboard handlers and aria attributes are
preserved on the new toggle button and that stopPropagation/handlers are updated
accordingly.
- Around line 200-203: The JSX currently uses imperative onFocus and onBlur
handlers that mutate e.target.style.borderColor; replace those handlers on the
affected input element(s) in ReviewStep.tsx with a declarative focus style
(e.g., a Tailwind class like focus:border-[var(--accent)] or a small CSS class
using :focus) so border color is handled by CSS instead of the onFocus/onBlur
functions, and remove the inline event handlers (search for the onFocus={(e)=>
(e.target.style.borderColor = colors.accent.primary)} and onBlur={(e)=>
(e.target.style.borderColor = colors.border)} occurrences to update the
element’s className accordingly).

In `@app/assessment/components/Stepper.tsx`:
- Around line 62-67: The color value in Stepper's render uses a hardcoded
"#ffffff"; replace that literal with the design token colors.text.white for
consistency (update the ternary expression setting color, e.g., inside the
component or function rendering the step circle where isActive/isCompleted are
evaluated). Ensure the border expression remains the same but if it references
the same white anywhere else in this component, swap to colors.text.white as
well so design tokens are centralized (look for the color: ... and border: `1px
solid ${...}` expressions in Stepper.tsx).

In `@app/assessment/config/constants.ts`:
- Around line 266-273: The function getModelConfigDefinition currently falls
back to GPT4_STYLE_CONFIG when ASSESSMENT_MODEL_CONFIGS has no matching model,
which silently surfaces inappropriate GPT-4 params in the UI; change it to
fail-closed by returning an empty definition {} (or at minimum emit a
console.warn including the unknown modelName) instead of GPT4_STYLE_CONFIG so
the UI and downstream buildDefaultParams don't seed invalid top_p/temperature
values; update the logic in getModelConfigDefinition and add a warning
referencing ASSESSMENT_MODEL_CONFIGS, GPT4_STYLE_CONFIG, and buildDefaultParams
so missing model entries are visible during development.
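A fail-closed sketch of that change (the definition shape and the single model entry are placeholders, not the real constants.ts contents):

```typescript
type ConfigParamDefinition = { type: string; default?: unknown; options?: string[] };
type ModelConfigDefinition = Record<string, ConfigParamDefinition>;

// Stand-in for ASSESSMENT_MODEL_CONFIGS keyed by model name.
const MODEL_DEFINITIONS: Record<string, ModelConfigDefinition> = {
  "gpt-4o-mini": { temperature: { type: "number", default: 1 } },
};

// Unknown models get an empty definition plus a dev-visible warning,
// so buildDefaultParams never seeds GPT-4-style params for the wrong model.
export function getModelConfigDefinition(modelName: string): ModelConfigDefinition {
  const def = MODEL_DEFINITIONS[modelName];
  if (!def) {
    console.warn(`getModelConfigDefinition: no entry for "${modelName}"; returning {}`);
    return {};
  }
  return def;
}
```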
- Around line 14-18: Change the loose provider typing to a derived union and
update callers: derive a Provider type from PROVIDER_OPTIONS (e.g., Provider =
(typeof PROVIDER_OPTIONS)[number]["value"]), change
AssessmentModelConfig.provider from the literal "openai" to this Provider type,
and update getModelsByProvider and getDefaultModelForProvider to accept Provider
rather than string; then remove the hardcoded fallback model string
("gpt-4o-mini") in getDefaultModelForProvider and return
ASSESSMENT_MODEL_CONFIGS[0].model_name when an entry isn't found (use
ASSESSMENT_MODEL_CONFIGS and model_name for the guaranteed default).
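Roughly what the derived union could look like — the provider values and model list below are placeholders:

```typescript
const PROVIDER_OPTIONS = [
  { label: "OpenAI", value: "openai" },
  { label: "Google", value: "google" },
] as const;

// Derived from the options array, so adding a provider there
// automatically widens every signature that accepts it.
type Provider = (typeof PROVIDER_OPTIONS)[number]["value"];

const ASSESSMENT_MODEL_CONFIGS: { provider: Provider; model_name: string }[] = [
  { provider: "openai", model_name: "gpt-4o-mini" },
  { provider: "openai", model_name: "o3-mini" },
];

// No hardcoded fallback string: the first config is the guaranteed default.
export function getDefaultModelForProvider(provider: Provider): string {
  const entry = ASSESSMENT_MODEL_CONFIGS.find((c) => c.provider === provider);
  return (entry ?? ASSESSMENT_MODEL_CONFIGS[0]).model_name;
}
```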
- Around line 53-216: Hoist the duplicated effort+summary blocks by creating a
shared SUMMARY_PARAM constant and a reasoningConfig(effortOptions) factory, then
replace the repeated config blocks for model_name values like "o3-mini", "o3",
"o4-mini", "gpt-5", "gpt-5-mini", "gpt-5-nano", "gpt-5.1", and "gpt-5.2" to call
reasoningConfig(...) with the appropriate effort options (e.g.,
["low","medium","high"], ["minimal","low","medium","high"],
["none","low","medium","high","xhigh"]) and replace the "gpt-5.1-chat-latest"
entry to use only SUMMARY_PARAM; ensure the types satisfy ConfigParamDefinition
and that the shared SUMMARY_PARAM options do not include the stray "null" value
so all model entries inherit the consistent summary options.
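A possible shape for the shared constant and factory (descriptions abbreviated and only a few model entries shown):

```typescript
type ConfigParamDefinition = {
  type: "enum";
  default: string;
  options: string[];
  description: string;
};

// Shared summary parameter — note: no stray "null" option.
export const SUMMARY_PARAM: ConfigParamDefinition = {
  type: "enum",
  default: "auto",
  options: ["auto", "detailed", "concise"],
  description: "Summarize the reasoning result.",
};

// Factory hoisting the duplicated effort+summary blocks.
function reasoningConfig(effortOptions: string[]) {
  return {
    reasoning_effort: {
      type: "enum" as const,
      default: effortOptions.includes("medium") ? "medium" : effortOptions[0],
      options: effortOptions,
      description: "Reasoning effort level.",
    },
    summary: SUMMARY_PARAM,
  };
}

export const REASONING_MODELS = {
  "o3-mini": reasoningConfig(["low", "medium", "high"]),
  "gpt-5": reasoningConfig(["minimal", "low", "medium", "high"]),
  "gpt-5.1-chat-latest": { summary: SUMMARY_PARAM },
};
```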

In `@app/assessment/page.tsx`:
- Around line 1-5: The assessment route file lives at the App Router root but
should be inside the authenticated route group; move app/assessment/page.tsx
plus its colocated AssessmentPageClient.tsx and components/ into
app/(main)/assessment/, preserve the URL path (/assessment) since route groups
are transparent, and update any imports that reference the old paths to the new
location (e.g., import AssessmentPageClient from
"@/app/(main)/assessment/AssessmentPageClient" or use the appropriate alias).
Ensure the default export function AssessmentPage still returns
<AssessmentPageClient /> and adjust any relative imports inside
AssessmentPageClient or components to reflect the new directory structure.

In `@app/assessment/store.ts`:
- Line 2: Replace the relative import of the ColumnMapping type in
app/assessment/store.ts with the project root alias; change the import that
currently references "./types" so it imports ColumnMapping from
"@/app/assessment/types" (update the import statement that declares
ColumnMapping).
- Around line 23-27: DEFAULT_MAPPING and its arrays are shared by reference
causing potential future mutation bugs; add a factory function (e.g.
createDefaultMapping()) that returns a fresh ColumnMapping object with new
arrays and replace usages of DEFAULT_MAPPING in the initial state, in
setDataset, and in clearDataset so each state/init/reset gets its own
independent object (leave ColumnMapperStep.tsx unchanged but ensure it still
receives a new object).
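The factory itself is small; the point is that each call returns fresh arrays (field names follow the ColumnMapping shape used in this PR):

```typescript
interface ColumnMapping {
  textColumns: string[];
  groundTruthColumns: string[];
  attachments: { column: string; type: string; format: string }[];
}

// Replaces the shared DEFAULT_MAPPING constant: every caller gets its own
// arrays, so mutating one state slice can never leak into another.
export function createDefaultMapping(): ColumnMapping {
  return { textColumns: [], groundTruthColumns: [], attachments: [] };
}
```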

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 513748a0-05aa-428e-ad41-acfebb76fbd9

📥 Commits

Reviewing files that changed from the base of the PR and between 5d8d9bc and fcbf161.

📒 Files selected for processing (21)
  • app/api/auth/logout/route.ts
  • app/api/users/me/route.ts
  • app/assessment/AssessmentPageClient.tsx
  • app/assessment/components/ColumnMapperStep.tsx
  • app/assessment/components/DatasetStep.tsx
  • app/assessment/components/OutputSchemaStep.tsx
  • app/assessment/components/PromptAndConfigStep.tsx
  • app/assessment/components/ReviewStep.tsx
  • app/assessment/components/Stepper.tsx
  • app/assessment/config/ConfigurationStep.tsx
  • app/assessment/config/constants.ts
  • app/assessment/page.tsx
  • app/assessment/store.ts
  • app/assessment/types.ts
  • app/components/Sidebar.tsx
  • app/lib/authCookie.ts
  • app/lib/constants.ts
  • app/lib/context/AuthContext.tsx
  • app/lib/featureState.ts
  • app/lib/types/auth.ts
  • middleware.ts
✅ Files skipped from review due to trivial changes (2)
  • app/lib/constants.ts
  • app/assessment/types.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • app/lib/context/AuthContext.tsx

Comment on lines +367 to +379
const payload = {
  experiment_name: experimentName.trim(),
  dataset_id: parseInt(datasetId, 10),
  prompt_template: promptTemplate || null,
  text_columns: columnMapping.textColumns,
  attachments: columnMapping.attachments.map(
    ({ column, type, format }) => ({ column, type, format }),
  ),
  output_schema: schemaToJsonSchema(outputSchema) || null,
  configs: configs.map(({ config_id, config_version }) => ({
    config_id,
    config_version,
  })),

⚠️ Potential issue | 🟠 Major

Submission drops the mapped ground-truth columns.

columnMapping.groundTruthColumns is tracked in this page, but it never makes it into the POST body. Any assessment that depends on reference labels will be submitted without them.

Suggested fix
       const payload = {
         experiment_name: experimentName.trim(),
         dataset_id: parseInt(datasetId, 10),
         prompt_template: promptTemplate || null,
         text_columns: columnMapping.textColumns,
+        ground_truth_columns: columnMapping.groundTruthColumns,
         attachments: columnMapping.attachments.map(
           ({ column, type, format }) => ({ column, type, format }),
         ),
         output_schema: schemaToJsonSchema(outputSchema) || null,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/AssessmentPageClient.tsx` around lines 367 - 379, The payload
being built in AssessmentPageClient.tsx omits the tracked
columnMapping.groundTruthColumns, so ground-truth labels are never sent; update
the payload construction (where payload is created) to include the
groundTruthColumns mapping (e.g., add a ground_truth_columns or
groundTruthColumns field) by mapping columnMapping.groundTruthColumns into the
same simple shape as attachments/text_columns (extracting the column and any
needed type/format), ensuring it follows the server's expected key name and
structure so reference labels are included in the POST body.

Comment on lines +74 to +103
const [columnConfigs, setColumnConfigs] = useState<
  Record<string, ColumnConfig>
>(() => {
  const configs: Record<string, ColumnConfig> = {};

  columns.forEach((column) => {
    if (columnMapping.textColumns.includes(column)) {
      configs[column] = { role: "text" };
      return;
    }

    if (columnMapping.groundTruthColumns.includes(column)) {
      configs[column] = { role: "ground_truth" };
      return;
    }

    const attachment = columnMapping.attachments.find(
      (item) => item.column === column,
    );
    configs[column] = attachment
      ? {
          role: "attachment",
          attachmentType: attachment.type,
          attachmentFormat: attachment.format,
        }
      : { role: "unmapped" };
  });

  return configs;
});

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Confirm column source path and check for any de-duplication step.
fd -t f -e ts -e tsx | xargs rg -nP --type=ts --type=tsx -C2 'onColumnsLoaded|columnMapping\.textColumns|sheet_to_json' 2>/dev/null | head -120

Repository: ProjectTech4DevAI/kaapi-frontend

Length of output: 58


🏁 Script executed:

# Check if the file exists
fd -type f "ColumnMapperStep.tsx"

Repository: ProjectTech4DevAI/kaapi-frontend

Length of output: 247


🏁 Script executed:

# Get file size and read the target lines (74-103)
wc -l app/assessment/components/ColumnMapperStep.tsx
sed -n '74,103p' app/assessment/components/ColumnMapperStep.tsx

Repository: ProjectTech4DevAI/kaapi-frontend

Length of output: 940


🏁 Script executed:

# Check lines 240-246 and 166 mentioned in scratchpad
sed -n '160,175p' app/assessment/components/ColumnMapperStep.tsx
sed -n '235,250p' app/assessment/components/ColumnMapperStep.tsx

Repository: ProjectTech4DevAI/kaapi-frontend

Length of output: 1012


🏁 Script executed:

# Find where columns come from - look for onColumnsLoaded or similar patterns
rg "onColumnsLoaded|DatasetStep" --type ts --type tsx -n

Repository: ProjectTech4DevAI/kaapi-frontend

Length of output: 104


🏁 Script executed:

# Search for sheet_to_json or XLSX usage
rg "sheet_to_json|XLSX" --type ts --type tsx -n

Repository: ProjectTech4DevAI/kaapi-frontend

Length of output: 104


🏁 Script executed:

# Find DatasetStep and onColumnsLoaded references
rg "onColumnsLoaded|DatasetStep" -n

Repository: ProjectTech4DevAI/kaapi-frontend

Length of output: 801


🏁 Script executed:

# Search for XLSX usage
rg "XLSX|sheet_to_json" -n

Repository: ProjectTech4DevAI/kaapi-frontend

Length of output: 984


🏁 Script executed:

# Read the full ColumnMapperStep.tsx to understand props and context
head -100 app/assessment/components/ColumnMapperStep.tsx

Repository: ProjectTech4DevAI/kaapi-frontend

Length of output: 2497


🏁 Script executed:

# Check where columns prop comes from and how it's used
sed -n '1,50p' app/assessment/components/ColumnMapperStep.tsx

Repository: ProjectTech4DevAI/kaapi-frontend

Length of output: 1209


🏁 Script executed:

# Check DatasetStep around line 228 to see how headers are extracted
sed -n '210,240p' app/assessment/components/DatasetStep.tsx

Repository: ProjectTech4DevAI/kaapi-frontend

Length of output: 946


🏁 Script executed:

# Look for parsed.headers definition and how XLSX processes headers
sed -n '75,95p' app/assessment/components/DatasetStep.tsx

Repository: ProjectTech4DevAI/kaapi-frontend

Length of output: 733


🏁 Script executed:

# Find handleColumnsLoaded in AssessmentPageClient.tsx
rg "handleColumnsLoaded" -A 10 app/assessment/AssessmentPageClient.tsx

Repository: ProjectTech4DevAI/kaapi-frontend

Length of output: 777


🏁 Script executed:

# Check if there's any deduplication logic in AssessmentPageClient or elsewhere
rg "duplicate|dedup|suffix|__" app/assessment/ -n

Repository: ProjectTech4DevAI/kaapi-frontend

Length of output: 643


🏁 Script executed:

# Read handleNext function to verify it emits column names once (silently collapsing duplicates)
sed -n '148,173p' app/assessment/components/ColumnMapperStep.tsx

Repository: ProjectTech4DevAI/kaapi-frontend

Length of output: 863


Duplicate column names cause silent state collisions and React key warnings.

columnConfigs is keyed by column name (line 79), and the rendered list uses key={column} at line 240. CSV/Excel datasets routinely contain duplicate headers (e.g., two unnamed columns, or the same header repeated). XLSX.utils.sheet_to_json preserves these duplicates verbatim in the headers array passed through onColumnsLoaded with no upstream deduplication.

With duplicates:

  1. Only one config entry survives—later forEach iterations overwrite earlier ones.
  2. Toggling a role on the second "name" updates state for both visually rendered rows.
  3. React emits a duplicate-key warning.
  4. handleNext iterates Object.entries(columnConfigs), which has only unique keys, so duplicates are silently collapsed when emitting textColumns / attachments / groundTruthColumns.

Recommendation: Either de-duplicate upstream in DatasetStep.tsx (e.g., suffix repeats: name, name__2) before calling onColumnsLoaded, or key the component by index throughout instead of by column name.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/ColumnMapperStep.tsx` around lines 74 - 103, The
current Record<string, ColumnConfig> state (columnConfigs) keyed by column name
causes collisions when columns contains duplicate names; change columnConfigs to
be an array aligned with the columns array (e.g., ColumnConfig[] managed by
setColumnConfigs) so each column instance is tracked by index, update any places
using columnConfigs (initialization loop that builds configs from columns and
columnMapping, state updates when toggling roles via setColumnConfigs, and
handleNext which currently iterates Object.entries(columnConfigs)) to use the
index-aligned array, and ensure the rendered list uses key={index} (or a stable
index-based id) instead of key={column} to eliminate duplicate-key warnings and
preserve per-column state for duplicates.
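If the upstream de-duplication route is taken, a small suffixing pass over the parsed headers would suffice (the `__N` suffix scheme is just one option):

```typescript
// Suffix repeated headers (name, name__2, name__3) before they reach
// onColumnsLoaded, so column names stay safe to use as Record keys and
// React keys downstream.
export function dedupeHeaders(headers: string[]): string[] {
  const seen = new Map<string, number>();
  return headers.map((h) => {
    const count = (seen.get(h) ?? 0) + 1;
    seen.set(h, count);
    return count === 1 ? h : `${h}__${count}`;
  });
}
```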

Comment on lines +60 to +94
// Fetch file via proxy (server-side S3 download) and parse with XLSX
const fetchAndParseFile = async (
  id: string | number,
): Promise<{ headers: string[]; rows: string[][] } | null> => {
  // Proxy downloads from S3 server-side and returns base64
  const res = await fetch(
    `/api/assessment/datasets/${id}?fetch_content=true`,
    {
      headers: { "X-API-KEY": apiKey },
    },
  );
  if (!res.ok) return null;
  const json = await res.json();
  const base64 = json?.file_content;
  if (!base64) return null;

  // Decode base64 to binary and parse with XLSX
  const binary = Uint8Array.from(atob(base64), (c) => c.charCodeAt(0));
  const workbook = XLSX.read(binary, { type: "array" });
  const sheet = workbook.Sheets[workbook.SheetNames[0]];
  if (!sheet) return null;

  const rawData: string[][] = XLSX.utils.sheet_to_json(sheet, {
    header: 1,
    defval: "",
  });
  if (rawData.length === 0) return null;

  const headers = rawData[0].map(String);
  const rows = rawData
    .slice(1)
    .filter((row) => row.some((cell) => String(cell).trim() !== ""));

  return { headers, rows: rows.map((row) => row.map(String)) };
};

⚠️ Potential issue | 🟠 Major

Silent failure when dataset parse/fetch fails — user gets stuck.

fetchAndParseFile returns null on every failure path (HTTP non-OK at line 71, missing file_content at 74, missing sheet at 80, empty rawData at 86). In handleDatasetSelect, when parsed is null, neither the catch block nor any toast fires, so the user sees no feedback — yet canProceed (line 313) only checks datasetId && !isLoadingColumns, allowing them to advance to the Map Columns step with zero columns loaded.

Surface the error and prevent forward progression in this state.

🛡️ Suggested fix
   const fetchAndParseFile = async (
     id: string | number,
   ): Promise<{ headers: string[]; rows: string[][] } | null> => {
     // Proxy downloads from S3 server-side and returns base64
     const res = await fetch(
       `/api/assessment/datasets/${id}?fetch_content=true`,
       {
         headers: { "X-API-KEY": apiKey },
       },
     );
-    if (!res.ok) return null;
+    if (!res.ok) {
+      const body = await res.json().catch(() => ({}));
+      throw new Error(
+        body.error || body.message || body.detail || `Failed to fetch dataset (HTTP ${res.status})`,
+      );
+    }
     const json = await res.json();
     const base64 = json?.file_content;
-    if (!base64) return null;
+    if (!base64) throw new Error("Dataset content is unavailable.");
   const handleDatasetSelect = async (id: string, name?: string) => {
     setDatasetId(id);
     ...
     setIsLoadingColumns(true);
     try {
       const parsed = await fetchAndParseFile(id);
-      if (parsed?.headers) {
+      if (!parsed?.headers?.length) {
+        toast.error("Could not read columns from this dataset.");
+        return;
+      }
         const firstRow = parsed.rows[0] || [];
         ...
         onColumnsLoaded(parsed.headers, sampleRow);
-      }
     } catch (e) {
-      console.error("Failed to fetch dataset columns:", e);
+      toast.error(e instanceof Error ? e.message : "Failed to fetch dataset columns");
     } finally {
       setIsLoadingColumns(false);
     }
   };

Also applies to: 218-234

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/DatasetStep.tsx` around lines 60 - 94,
fetchAndParseFile currently swallows all failure cases by returning null, which
lets handleDatasetSelect silently continue and leaves isLoadingColumns
true/columns empty so canProceed (which checks datasetId && !isLoadingColumns)
can allow advancing with zero columns; change fetchAndParseFile to throw
descriptive errors (or return an error object) for each failure path (non-ok
response, missing file_content, missing sheet, empty data), then update
handleDatasetSelect to catch those specific errors, call the toast/error UI with
the error message, set isLoadingColumns to false and ensure columns state is
cleared or marked invalid so users cannot proceed, and keep the canProceed logic
(datasetId && !isLoadingColumns) intact so it blocks forward progression when
loading/parse fails; reference functions: fetchAndParseFile, handleDatasetSelect
and the canProceed/isLoadingColumns/columns state variables.

Comment on lines +452 to +456
useEffect(() => {
  if (open) {
    setDraftSchema(JSON.parse(JSON.stringify(schema)) as SchemaProperty[]);
  }
}, [open]);

⚠️ Potential issue | 🟡 Minor

useEffect dependency arrays are incomplete — will trip react-hooks/exhaustive-deps.

  • Line 452–456: depends on schema but lists only [open]. The intent (initialize draft only on open) is correct; either capture this with a ref-based pattern or add an eslint-disable-next-line with a comment explaining why schema is intentionally excluded.
  • Line 649–670: reads setSchema but lists only [codeValue, editorMode]. If the parent ever passes a fresh setSchema reference, the effect will use the stale one.

Also applies to: 648-670

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/OutputSchemaStep.tsx` around lines 452 - 456, The
useEffect that sets draft state when the modal opens (useEffect that calls
setDraftSchema(JSON.parse(JSON.stringify(schema)))) currently lists only [open]
but also reads schema — either include schema in the dependency array or
intentionally silence the linter with an explanatory eslint-disable-next-line
comment; prefer keeping the effect triggered only on open by using a ref to hold
the latest schema (e.g., schemaRef.current updated elsewhere) or add "//
eslint-disable-next-line react-hooks/exhaustive-deps -- we only want to run this
when `open` changes, not when `schema` changes" above the effect. Similarly, the
effect that calls setSchema (the effect depending on codeValue and editorMode)
must include setSchema in its dependency array or be wrapped so a stable setter
is used (e.g., useCallback/memoize in parent) or add a clear
eslint-disable-next-line with comment explaining why setSchema is intentionally
excluded to avoid stale-closure bugs.

Comment on lines +1066 to +1165
const defaultSelected = isSelected(config.id, 1);
const defaultLoading =
loadingSelectionKeys[`${config.id}:1`];
const isExpanded = expandedConfigId === config.id;
const versions =
versionStateByConfig[config.id] ??
buildInitialVersionState();
const knownVersionCount = versions.items.length;
const hasVersionsPanel =
knownVersionCount > 0 ||
versions.hasMore ||
versions.isLoading ||
Boolean(versions.error);
const previewVersions = versions.items.slice(0, 3);
const versionCountLabel =
knownVersionCount > 0
? `${knownVersionCount}${versions.hasMore ? "+" : ""}`
: "Check";

return (
<div
key={config.id}
className="flex min-h-[188px] flex-col rounded-2xl border p-4"
style={{
borderColor: isExpanded
? colors.accent.primary
: defaultSelected
? colors.accent.primary
: colors.border,
backgroundColor: colors.bg.primary,
boxShadow: isExpanded
? "0 10px 24px rgba(15, 23, 42, 0.06)"
: "0 1px 2px rgba(15, 23, 42, 0.03)",
}}
>
<div className="flex min-h-[88px] items-start justify-between gap-3">
<div className="min-w-0 flex-1">
<div
className="truncate text-sm font-semibold"
style={{ color: colors.text.primary }}
>
{config.name}
</div>
<div
className="mt-1 text-xs"
style={{ color: colors.text.secondary }}
>
{config.updated_at
? formatRelativeTime(config.updated_at)
: "Saved behavior"}
</div>
{config.description && (
<div
className="mt-3 text-xs leading-5"
style={{ color: colors.text.secondary }}
>
{config.description}
</div>
)}
</div>
{defaultSelected && (
<span
className="shrink-0 rounded-full px-2.5 py-1 text-[11px] font-medium"
style={{
backgroundColor: colors.bg.secondary,
color: colors.text.primary,
border: `1px solid ${colors.border}`,
}}
>
Added
</span>
)}
</div>

<div className="mt-4 flex items-center gap-2">
<button
onClick={() =>
void toggleVersionSelection(config, 1)
}
disabled={Boolean(defaultLoading)}
className="cursor-pointer min-w-[152px] rounded-xl px-3 py-2.5 text-sm font-medium"
style={{
backgroundColor: defaultSelected
? colors.bg.secondary
: colors.accent.primary,
color: defaultSelected
? colors.text.primary
: "#fff",
border: `1px solid ${defaultSelected ? colors.border : colors.accent.primary}`,
cursor: defaultLoading
? "progress"
: "pointer",
}}
>
{defaultLoading
? "Working..."
: defaultSelected
? "Added"
: "Use this behavior"}
</button>

⚠️ Potential issue | 🟠 Major

Primary CTA is pinned to the oldest saved version.

defaultSelected and the main button both target version 1, so clicking “Use this behavior” selects v1 even when the config has newer saved versions. This will apply stale behavior settings unless the user manually opens the versions list first.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/PromptAndConfigStep.tsx` around lines 1066 - 1165,
The primary CTA always targets version 1 causing stale selections; change the
logic to compute the "latestVersion" from the loaded version state (e.g. derive
latestVersionId or latestVersionIndex from
versionStateByConfig[config.id].items[0] or the highest-version item) and use
that instead of the hard-coded 1 when computing defaultSelected, building the
loading key in loadingSelectionKeys (currently `${config.id}:1`), and when
calling toggleVersionSelection(config, 1); update those references to use the
computed latestVersion identifier so the button and selected state target the
newest saved version.

Comment on lines +827 to +927
const selectedVersions = selectedCountForConfig(
config.id,
);
const defaultSelected = isSelected(config.id, 1);
const defaultLoading =
loadingSelectionKeys[`${config.id}:1`];
const knownVersionCount = versions.items.length;
const hasVersionsPanel =
knownVersionCount > 0 ||
versions.hasMore ||
versions.isLoading ||
Boolean(versions.error);
const previewVersions = versions.items.slice(0, 3);
const versionCountLabel =
knownVersionCount > 0
? `${knownVersionCount}${versions.hasMore ? "+" : ""}`
: "Check";

return (
<div
key={config.id}
className="self-start rounded-2xl border transition-shadow"
style={{
borderColor: isExpanded
? colors.accent.primary
: colors.border,
backgroundColor: colors.bg.primary,
boxShadow: isExpanded
? "0 10px 22px rgba(23, 23, 23, 0.07)"
: "0 1px 2px rgba(23, 23, 23, 0.03)",
}}
>
<button
onClick={() => toggleConfigExpansion(config.id)}
className="cursor-pointer w-full rounded-2xl p-4 text-left"
>
<div className="flex items-start justify-between gap-3">
<div className="min-w-0">
<h4
className="truncate text-base font-semibold"
style={{ color: colors.text.primary }}
>
{config.name}
</h4>
<div
className="mt-1 text-xs"
style={{ color: colors.text.secondary }}
>
{formatRelativeTime(config.updated_at)}
</div>
</div>
{selectedVersions > 0 && (
<span
className="rounded-full px-2.5 py-1 text-[11px] font-medium"
style={{
backgroundColor:
"rgba(23, 23, 23, 0.06)",
color: colors.text.primary,
}}
>
In use
</span>
)}
</div>

<p
className="mt-3 text-sm leading-6"
style={{ color: colors.text.secondary }}
>
{config.description ||
"No description provided for this configuration."}
</p>
</button>

<div className="px-4 pb-4 pt-1">
<div className="flex items-center gap-2">
<button
onClick={(event) => {
event.stopPropagation();
void toggleVersionSelection(config, 1);
}}
disabled={Boolean(defaultLoading)}
className="flex-1 rounded-xl px-4 py-2.5 text-sm font-medium"
style={{
backgroundColor: defaultSelected
? colors.bg.secondary
: colors.accent.primary,
color: defaultSelected
? colors.text.primary
: "#ffffff",
border: `1px solid ${defaultSelected ? colors.border : colors.accent.primary}`,
cursor: defaultLoading
? "progress"
: "pointer",
}}
>
{defaultLoading
? "Working..."
: defaultSelected
? "Added"
: "Use this behavior"}

⚠️ Potential issue | 🟠 Major

Primary CTA always selects version 1.

The card shows the current config metadata, but the default action is hard-coded to toggleVersionSelection(config, 1). For configs with saved revisions, “Use this behavior” picks the oldest version, not the latest version the card represents. Please derive the default version from the latest available config version instead of pinning it to 1.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/config/ConfigurationStep.tsx` around lines 827 - 927, The
primary CTA is hard-coded to version "1"; instead compute the default version
from the latest available version in versions.items and use that id everywhere
instead of 1. For example, derive const latestVersionId = versions.items[0]?.id
?? 1 and replace uses of the literal 1 in isSelected(config.id, 1),
loadingSelectionKeys[`${config.id}:1`], and the onClick handler void
toggleVersionSelection(config, 1) with isSelected(config.id, latestVersionId),
loadingSelectionKeys[`${config.id}:${latestVersionId}`], and void
toggleVersionSelection(config, latestVersionId) respectively (keeping a safe
fallback when versions.items is empty).
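The derivation suggested for both PromptAndConfigStep and ConfigurationStep can live in one helper (this assumes numeric version ids; the fallback to 1 mirrors the current behavior when no versions have loaded yet):

```typescript
interface ConfigVersion {
  id: number;
}

// Default to the newest loaded version instead of pinning to 1.
export function latestVersionId(items: ConfigVersion[]): number {
  if (items.length === 0) return 1; // safe fallback before versions load
  return items.reduce((max, v) => Math.max(max, v.id), items[0].id);
}
```

Call sites would then use `isSelected(config.id, latestVersionId(versions.items))`, the matching loading key, and `toggleVersionSelection(config, latestVersionId(versions.items))`.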

summary: {
  type: "enum",
  default: "auto",
  options: ["auto", "detailed", "concise", "null"],

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

OpenAI Responses API reasoning summary parameter accepted values

💡 Result:

The accepted values for the reasoning.summary parameter in the OpenAI Responses API are "auto", "concise", and "detailed". "auto" selects the most detailed summarizer available for the model (equivalent to "detailed" for most models). "concise" is supported for computer-use-preview models and reasoning models after gpt-5. "detailed" is supported by models like o4-mini. Support varies by model; check model documentation for specifics. A deprecated generate_summary parameter uses the same values. Examples from official docs use reasoning: { summary: "auto" } with models like gpt-5.5 and o3.

Citations:


Remove "null" string from summary enum options—it is not an accepted OpenAI API value.

The OpenAI Responses API documentation specifies that the reasoning.summary parameter accepts only "auto", "concise", and "detailed". The literal string "null" is not a valid value and will be rejected or cause runtime failures when users select it from the UI dropdown.

Note that o3-mini (line 67) correctly omits "null" with options ["auto", "detailed", "concise"], while the other models (o3, o4-mini, gpt-5 variants, gpt-5.1 variants, gpt-5.2 variants, and gpt-5.2-pro) incorrectly include it. Remove "null" from all occurrences.

🔧 Suggested fix
       summary: {
         type: "enum",
         default: "auto",
-        options: ["auto", "detailed", "concise", "null"],
+        options: ["auto", "detailed", "concise"],
         description: "Summarize the reasoning result.",
       },

Apply to all occurrences at lines: 86, 105, 124, 143, 162, 181, 193, 212, 224, 236.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
options: ["auto", "detailed", "concise", "null"],
options: ["auto", "detailed", "concise"],
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/config/constants.ts` at line 86, Remove the invalid "null"
string from the reasoning.summary options arrays in the model config constants
so the only accepted values are "auto", "concise", and "detailed"; locate each
config object that contains an options: ["auto", "detailed", "concise", "null"]
(e.g., the reasoning.summary options entries for the o3, o4-mini, gpt-5*, and
gpt-5.1/5.2 variants) and change those arrays to ["auto", "detailed", "concise"]
to match the o3-mini entry and the OpenAI Responses API allowed values.
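One way to keep the fix from regressing is to centralize the allowed values in a single constant with a type guard, so the UI dropdown cannot offer an invalid option. The names below are illustrative, not from the PR:

```typescript
// Allowed values per the OpenAI Responses API docs cited above.
const REASONING_SUMMARY_OPTIONS = ["auto", "detailed", "concise"] as const;

type ReasoningSummary = (typeof REASONING_SUMMARY_OPTIONS)[number];

// Narrow an arbitrary string (e.g. a dropdown selection) to a valid value.
function isReasoningSummary(value: string): value is ReasoningSummary {
  return (REASONING_SUMMARY_OPTIONS as readonly string[]).includes(value);
}
```

Each model config could then spread `REASONING_SUMMARY_OPTIONS` instead of repeating the array literal ten times.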

Comment thread app/assessment/store.ts
Comment on lines +40 to +47
  setDataset: (datasetId, columns, sampleRow, datasetName) =>
    set((state) => ({
      datasetId,
      datasetName: datasetName ?? state.datasetName,
      columns,
      sampleRow,
      columnMapping: DEFAULT_MAPPING,
    })),

⚠️ Potential issue | 🟡 Minor

setDataset silently discards any existing columnMapping, even on re-load of the same dataset.

setDataset unconditionally resets columnMapping to DEFAULT_MAPPING. Per the relevant snippet from app/assessment/AssessmentPageClient.tsx, handleColumnsLoaded calls setDataset(currentId, cols, firstRow) with the current datasetId. If this callback ever fires after the user has already mapped columns (e.g., a refetch, a re-render after columns load, or re-entering the step), the mapping the user just configured will be wiped without warning.

Consider only resetting columnMapping when the dataset actually changes, e.g.:

♻️ Proposed refinement
     setDataset: (datasetId, columns, sampleRow, datasetName) =>
       set((state) => ({
         datasetId,
         datasetName: datasetName ?? state.datasetName,
         columns,
         sampleRow,
-        columnMapping: DEFAULT_MAPPING,
+        columnMapping:
+          state.datasetId === datasetId ? state.columnMapping : { ...DEFAULT_MAPPING },
       })),

Please confirm the intent — if wiping on every setDataset call is deliberate, this can be ignored.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
  setDataset: (datasetId, columns, sampleRow, datasetName) =>
    set((state) => ({
      datasetId,
      datasetName: datasetName ?? state.datasetName,
      columns,
      sampleRow,
      columnMapping: DEFAULT_MAPPING,
    })),
  setDataset: (datasetId, columns, sampleRow, datasetName) =>
    set((state) => ({
      datasetId,
      datasetName: datasetName ?? state.datasetName,
      columns,
      sampleRow,
      columnMapping:
        state.datasetId === datasetId ? state.columnMapping : { ...DEFAULT_MAPPING },
    })),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/store.ts` around lines 40 - 47, setDataset currently
overwrites columnMapping with DEFAULT_MAPPING every call; change it so
columnMapping is only reset when the dataset actually changes by comparing the
incoming datasetId to the current state.datasetId — if they differ, set
columnMapping to DEFAULT_MAPPING, otherwise preserve state.columnMapping (and
continue to update datasetId, datasetName, columns, sampleRow as already done).
Refer to the setDataset setter and the symbols columnMapping, DEFAULT_MAPPING
and state.datasetId to implement this conditional logic.

Comment thread app/lib/featureState.ts
Comment on lines +61 to +66
export function removeFeatureFromClient(feature: string): void {
const fromCookie = parseFeatures(readCookie(FEATURES_COOKIE));
const nextFeatures = fromCookie.filter((value) => value !== feature);
writeFeaturesCookie(nextFeatures);
syncSessionFeatures(nextFeatures);
broadcastFeatures(nextFeatures);

⚠️ Potential issue | 🟠 Major

Removing one flag can clear the entire feature set.

removeFeatureFromClient() only trusts the cookie. If that cookie is absent or malformed while localStorage still has valid session features, nextFeatures becomes [] and syncSessionFeatures([]) wipes every flag, not just the one being removed. Fall back to the stored session features before filtering, or bail out when no trustworthy source list is available.
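The fallback logic can be sketched as a pure helper that the cookie/localStorage plumbing would call. The function name and the two-source model are assumptions based on the snippet above:

```typescript
// Compute the next feature list when removing one flag. Prefer the cookie,
// fall back to the stored session features when the cookie is absent or
// unparsable, and return null when neither source is usable so the caller
// can bail out instead of wiping every flag.
function nextFeaturesAfterRemoval(
  cookieFeatures: string[] | null,
  sessionFeatures: string[] | null,
  feature: string,
): string[] | null {
  const source = cookieFeatures ?? sessionFeatures;
  if (source === null) return null; // no trustworthy list; do nothing
  return source.filter((value) => value !== feature);
}
```

`removeFeatureFromClient` would only write the cookie, sync the session, and broadcast when this returns a non-null list.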

Comment thread middleware.ts
Comment on lines +44 to 45
const features = parseFeatures(request.cookies.get(FEATURES_COOKIE)?.value);
const isAuthenticated = role === "superuser" || role === "user";

⚠️ Potential issue | 🟠 Major

Middleware feature gating trusts a client-writable cookie.

app/lib/featureState.ts writes kaapi_features with document.cookie, so any authenticated user can mint ASSESSMENT locally and satisfy this check. That makes the middleware redirect advisory only. Use a server-issued/HttpOnly signal here, or revalidate the feature against trusted server state before granting the route.

Also applies to: 70-74

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@middleware.ts` around lines 44 - 45, The middleware currently reads feature
flags from a client-writable cookie via
parseFeatures(request.cookies.get(FEATURES_COOKIE)?.value), which is untrusted;
change the check so feature gating is based on a server-trusted signal: either
read an HttpOnly/signed feature cookie issued by the server and verify its
signature (implement a verifyFeatureCookie function) or fetch the user’s feature
state from trusted server storage (e.g., getFeaturesForUser(userId) or
validateFeaturesAgainstServer(userId, features)) before allowing routes that
require ASSESSMENT; update the code paths that use parseFeatures and
FEATURES_COOKIE (including the other usage around the later block) to call the
new verification method and fall back to denying the feature if verification
fails.
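The signed-cookie option can be sketched with Node's built-in crypto. The `payload.signature` cookie format, the `verifyFeatureCookie` name, and the secret handling are all assumptions for illustration, not the PR's design:

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Verify a server-issued feature cookie of the form
// "<base64url JSON payload>.<hex HMAC-SHA256>". Returns the feature list
// only when the signature matches; otherwise null, so the middleware can
// deny the route instead of trusting client-writable state.
function verifyFeatureCookie(cookie: string, secret: string): string[] | null {
  const dot = cookie.lastIndexOf(".");
  if (dot < 0) return null;
  const payload = cookie.slice(0, dot);
  const signature = cookie.slice(dot + 1);
  const expected = createHmac("sha256", secret).update(payload).digest("hex");
  const a = Buffer.from(signature, "hex");
  const b = Buffer.from(expected, "hex");
  // Constant-time comparison; mismatched lengths short-circuit safely.
  if (a.length !== b.length || !timingSafeEqual(a, b)) return null;
  try {
    const parsed = JSON.parse(Buffer.from(payload, "base64url").toString("utf8"));
    return Array.isArray(parsed)
      ? parsed.filter((v: unknown) => typeof v === "string")
      : null;
  } catch {
    return null;
  }
}
```

The server would mint the cookie (HttpOnly) at login from `user.features`; the middleware then calls `verifyFeatureCookie` and denies `/assessment` on null.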

@vprashrex vprashrex changed the title Feature/assesment Assessment: Assessment Pipeline Apr 28, 2026
@vprashrex vprashrex self-assigned this Apr 28, 2026
@vprashrex vprashrex added the enhancement New feature or request label Apr 28, 2026
@vprashrex vprashrex linked an issue Apr 28, 2026 that may be closed by this pull request
@vprashrex vprashrex moved this to In Review in Kaapi-dev Apr 28, 2026
Collaborator

@Ayush8923 Ayush8923 left a comment


I’ve added a few comments that will mostly require updates across multiple files. It would be better to address these first so the next round of review becomes easier. I’ll continue the review from there.

Comment on lines +26 to +32
const response = await fetch(
`${backendUrl}/api/v1/assessment/assessments/${assessment_id}/results${queryString}`,
{
method: "GET",
headers: { "X-API-KEY": apiKey },
},
);

Use the apiClient fetch function and pass the values into it directly; we don't need to add the backendUrl again and again.

Take the reference from this file:
https://github.com/ProjectTech4DevAI/kaapi-frontend/blob/main/app/api/onboard/route.ts

Apply these changes across the codebase.

Comment on lines +23 to +24
const searchParams = request.nextUrl.searchParams.toString();
const queryString = searchParams ? `?${searchParams}` : "";

Update this code like this:

const queryParams = new URLSearchParams();
queryParams.set("get_trace_info", "true");

Then pass the params in directly; as currently written, it's unclear what queryString is being forwarded here.
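The reviewer's suggestion amounts to building the query from an explicit allow-list of parameters rather than forwarding whatever arrived on the request. A sketch, with illustrative parameter names:

```typescript
// Build the query string from explicitly named parameters instead of
// echoing request.nextUrl.searchParams wholesale. Param names here are
// illustrative, not taken from the PR.
function buildResultsQuery(getTraceInfo: boolean, page?: number): string {
  const queryParams = new URLSearchParams();
  queryParams.set("get_trace_info", String(getTraceInfo));
  if (page !== undefined) queryParams.set("page", String(page));
  const s = queryParams.toString();
  return s ? `?${s}` : "";
}
```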

Comment on lines +35 to +48
const contentType = response.headers.get("content-type") || "";
if (
contentType.includes("text/csv") ||
contentType.includes("spreadsheetml") ||
contentType.includes("octet-stream") ||
contentType.includes("application/zip")
) {
const blob = await response.blob();
const headers = new Headers();
headers.set("Content-Type", contentType);
const disposition = response.headers.get("content-disposition");
if (disposition) headers.set("Content-Disposition", disposition);
return new NextResponse(blob, { status: response.status, headers });
}

Extract this logic into its own function; the current handler violates the single-responsibility principle. Then call that helper here and pass the values through.
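Extracting just the content-type decision into a named helper might look like this. A sketch, not the PR's code; the constant and function names are made up:

```typescript
// Content types that should be streamed back as a binary download rather
// than parsed as JSON. Mirrors the checks in the handler above.
const BINARY_CONTENT_TYPES = [
  "text/csv",
  "spreadsheetml",
  "octet-stream",
  "application/zip",
];

function isBinaryContentType(contentType: string): boolean {
  return BINARY_CONTENT_TYPES.some((t) => contentType.includes(t));
}
```

The route handler then reads as a single decision: if `isBinaryContentType(contentType)`, proxy the blob with its Content-Disposition; otherwise fall through to the JSON path.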

Comment on lines +9 to +22
const apiKey = request.headers.get("X-API-KEY");

if (!apiKey) {
return NextResponse.json(
{ error: "Missing X-API-KEY header" },
{ status: 401 },
);
}

const { assessment_id } = await context.params;
const backendUrl = process.env.BACKEND_URL || "http://localhost:8000";

const response = await fetch(
`${backendUrl}/api/v1/assessment/assessments/${assessment_id}/retry`,

Update this as well, same as per this comment: https://github.com/ProjectTech4DevAI/kaapi-frontend/pull/122/changes#r3153720284

This code will break if the user logs in via Google auth.


Comment on lines +62 to +74
<svg
className="w-5 h-5"
fill="none"
viewBox="0 0 24 24"
stroke="currentColor"
>
<path
strokeLinecap="round"
strokeLinejoin="round"
strokeWidth={2}
d="M6 18L18 6M6 6l12 12"
/>
</svg>

Move the SVGs into components/icons and then use them here. Please check the other icons for reference.

Comment on lines +117 to +122
try {
const response = await fetch("/api/assessment/datasets", {
headers: { "X-API-KEY": selectedKey.key },
});
if (!response.ok || cancelled) return;


Use the apiFetch helper function on the client side; we just need to pass the required parameters and the function will automatically handle the apiKey, etc.

Please apply this change across the file as well.

For reference check this:
https://github.com/ProjectTech4DevAI/kaapi-frontend/blob/main/app/(main)/document/page.tsx#L133-L137


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 15

🧹 Nitpick comments (2)
app/assessment/components/EvaluationsTab.tsx (2)

6-6: Switch this import to the project alias.

Using a relative path here drifts from the repo import rule and makes later file moves noisier than they need to be.

Suggested change
-import DataViewModal, { jsonResultsToTableData } from "./DataViewModal";
+import DataViewModal, {
+  jsonResultsToTableData,
+} from "@/app/assessment/components/DataViewModal";

As per coding guidelines: Use the @/ import alias configured in tsconfig.json to resolve imports from the project root.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/EvaluationsTab.tsx` at line 6, Replace the relative
import of DataViewModal and jsonResultsToTableData in EvaluationsTab.tsx with
the project root alias import; locate the import line that references
"./DataViewModal" and change it to use the
"@/app/assessment/components/DataViewModal" path so DataViewModal and
jsonResultsToTableData are loaded via the configured tsconfig alias instead of a
relative path.

80-94: Prefer date-fns for relative time formatting.

This hand-rolled formatter bypasses the app’s date handling standard and will drift from the rest of the UI over time.

Suggested change
+import { formatDistanceToNow } from "date-fns";
@@
 function formatRelativeTime(dateStr: string): string {
-  const date = new Date(dateStr);
-  const now = new Date();
-  const diffMs = now.getTime() - date.getTime();
-  const diffMins = Math.floor(diffMs / 60000);
-  const diffHours = Math.floor(diffMs / 3600000);
-  const diffDays = Math.floor(diffMs / 86400000);
-
-  if (diffMins < 1) return "just now";
-  if (diffMins < 60) return `${diffMins}m ago`;
-  if (diffHours < 24) return `${diffHours}h ago`;
-  if (diffDays < 30) return `${diffDays} day${diffDays !== 1 ? "s" : ""} ago`;
-  const diffMonths = Math.floor(diffDays / 30);
-  return `about ${diffMonths} month${diffMonths !== 1 ? "s" : ""} ago`;
+  return formatDistanceToNow(new Date(dateStr), { addSuffix: true });
 }

As per coding guidelines: Use date-fns 4.1.0 and date-fns-tz 3.2.0 for date/time handling across the application.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/EvaluationsTab.tsx` around lines 80 - 94, The
custom function formatRelativeTime bypasses the app's date utilities; replace
its logic by using date-fns (v4.1.0) and date-fns-tz (v3.2.0) to produce
consistent relative times—specifically swap out the body of formatRelativeTime
to call date-fns' formatDistanceToNow (or formatRelative with proper tz
conversion via date-fns-tz) to compute the human-friendly string, ensure you
parse the incoming dateStr into a Date with the app timezone functions, and
maintain the same return semantics ("just now", shortened units) or map
formatDistanceToNow results to match existing UI text.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/api/assessment/datasets/route.ts`:
- Around line 37-41: The response currently returns raw exception details in
NextResponse.json (see the catch blocks around the error variable), which can
leak internal information; instead, log the full error server-side (e.g.,
console.error or your server logger) inside those catch blocks and change the
JSON response to return a generic, client-safe message (e.g., "An internal error
occurred") and remove or replace the details field; update both places using
NextResponse.json so only the generic message is sent to clients while full
error details remain in server logs.
- Around line 21-32: Replace the manual fetch call in this route (and the
similar block at lines ~68-80) with the shared apiClient(request,
"/api/v1/assessment/datasets", { method: "GET", headers: { "X-API-KEY": apiKey }
}) so the server proxy handles header/cookie forwarding; use the returned body
and status to construct the NextResponse and when returning errors extract the
message as body.error || body.message || body.detail and include that in the
JSON response. Ensure you pass the incoming request object to apiClient and
replace usage of backendUrl/apiKey + fetch with apiClient and
NextResponse.json(body, { status }).

In `@app/api/assessment/evaluations/[evaluation_id]/results/route.ts`:
- Around line 21-32: Replace the manual fetch that builds backendUrl/apiKey with
a call to apiClient so cookie and API-key forwarding are preserved: construct
the endpoint using evaluation_id and the existing searchParams (e.g.,
`/api/v1/assessment/evaluations/${evaluation_id}/results${queryString}`), call
apiClient(request, endpoint, { method: "GET" }), then use the returned { status,
data, headers } to build and return the Response (preserving status and relevant
headers). Remove direct use of backendUrl and apiKey in this handler and ensure
the function names referenced are apiClient and evaluation_id to locate the code
to change.

In `@app/api/assessment/evaluations/[evaluation_id]/retry/route.ts`:
- Around line 18-32: This route is using raw fetch which drops cookies and
duplicates forwarding logic; replace the fetch call with the shared apiClient
call: call apiClient(request,
`/api/v1/assessment/evaluations/${evaluation_id}/retry`, { method: "POST" })
(referencing evaluation_id and apiClient) and use the returned { status, data,
headers } to build the NextResponse (e.g., NextResponse.json(data, { status,
headers })) so X-API-KEY and Cookie are automatically forwarded and response
headers/status are preserved.

In `@app/api/assessment/events/route.ts`:
- Around line 15-23: The handler currently builds backendUrl and calls fetch
directly (using backendUrl and manually setting "X-API-KEY") which bypasses
shared forwarding and auth; replace that fetch call with the shared apiClient by
calling apiClient(request, "/api/v1/assessment/events", { method: "GET",
headers: { Accept: "text/event-stream" }, cache: "no-store" }) so Cookie and API
key forwarding and centralized error handling are preserved; remove backendUrl
and the manual "X-API-KEY" header and ensure apiClient is imported/used in this
route handler.
- Around line 25-29: The handler currently returns raw upstream response text
(variable text) in NextResponse.json which can leak internals; update the
failure branch that checks response.ok/response.body to log the full response
text to server logs (e.g., use console.error or your request logger) and return
a sanitized JSON error to the client via NextResponse.json (keep status from
response.status || 500 and replace details with a generic message like "Upstream
service error" or omit details entirely). Locate the failing block that calls
NextResponse.json and the response.text() call to make these changes (preserve
status handling but remove/raw-text exposure).
- Around line 16-24: The upstream fetch to
`${backendUrl}/api/v1/assessment/events` can throw and currently bubbles into an
unstructured 500; wrap the call to fetch (the code that assigns response) in a
try/catch inside the route handler (the exported GET handler or default route
function) and on error return a controlled 502 JSON Response with an explanatory
message and minimal error info. Preserve existing headers/behavior on success,
but in the catch return new Response(JSON.stringify({ error: "Bad Gateway",
message: "Failed to connect to upstream assessment events" , detail: /* short
error message */ }), { status: 502, headers: { "Content-Type":
"application/json" } }). Ensure you reference backendUrl and apiKey variables
unchanged.

In `@app/assessment/components/EvaluationsTab.tsx`:
- Around line 926-928: The checks that set isCompletedChild only match
"completed", causing runs with status "completed_with_errors" to lose
preview/export actions; update the status checks (where isCompletedChild is
computed and the similar logic between the isFailedChild/isCompletedChild blocks
around the isCompletedChild and childRun.status references) to treat
"completed_with_errors" as completed too (e.g., include it in the condition or
use an array/Set of completed-like statuses) so preview/export actions are
enabled for partially successful runs.
- Around line 238-240: Replace direct window fetch calls in EvaluationsTab
(e.g., the request to "/api/assessment/assessments" and the other occurrences
flagged) with the shared clientFetch utility so client-side requests go through
the token-refresh/401 handling; locate the fetch invocations in
EvaluationsTab.tsx (the one creating "response" around
"/api/assessment/assessments" and the other fetches at the ranges called out)
and swap them to call clientFetch(endpoint, options) passing the same
headers/body as before, keeping the surrounding await/response handling
unchanged so AUTH_EXPIRED_EVENT and refresh logic are used.
- Around line 824-832: The badge currently uses run.status.replace("_", " ")
which only replaces the first underscore; replace that with the app's existing
status formatter used elsewhere (e.g., formatStatusLabel or formatStatus) so all
underscores are handled correctly: in EvaluationsTab change
{run.status.replace("_", " ")} to the canonical formatter call (e.g.,
{formatStatusLabel(run.status)}) and add the necessary import/usage while
keeping statusStyle/bg/text as-is.

In `@app/assessment/components/JsonEditor.tsx`:
- Around line 168-173: The textarea in JsonEditor lacks an accessible name and
ARIA invalid state; update the textarea element used in the JsonEditor component
to include an accessible label (either by adding an associated <label> with
htmlFor matching the textarea id or adding aria-label/aria-labelledby) and set
aria-invalid={!!error} so assistive tech knows when the input is invalid; also
ensure the error span has role="alert" or is referenced via aria-describedby on
the textarea (use the error variable/id) so screen readers announce the error.

In `@app/components/speech-to-text/ModelComparisonCard.tsx`:
- Around line 151-153: The button elements (e.g., the one using
onClick={handleExpandToggle} and the other at the later occurrence) lack
explicit types and may act as type="submit" inside forms; update both button JSX
elements (the one referencing handleExpandToggle and the other button around
lines ~368) to include type="button" to prevent accidental form submission and
keep their onClick behavior unchanged.
- Around line 151-173: The expand/collapse icon button in ModelComparisonCard is
missing an accessible name and state; update the button (the element using
onClick={handleExpandToggle}) to include an accessible label (e.g., aria-label
or aria-labelledby) and expose its expanded state with
aria-expanded={isExpanded}, and if there is a collapsible panel add
aria-controls pointing to that panel's id; ensure the label text clearly
indicates the action (e.g., "Expand details" / "Collapse details") and that the
aria attributes reference the existing isExpanded state and the panel element
rendered by this component.
- Around line 74-78: The effect currently only resets expansion when status ===
"pending", so changing modelId while status is "success" leaves stale state;
update the logic in the useEffect (or add a separate useEffect) to also reset
expansion when modelId changes — e.g., track previous modelId with a ref and in
useEffect compare prevModelId !== modelId then call setIsExpanded(false) (and
update the ref), or add a useEffect that depends solely on modelId and calls
setIsExpanded(false); ensure you reference the existing useEffect, status,
modelId, and setIsExpanded symbols when making the change.
- Around line 156-172: The inline SVG used in ModelComparisonCard (the chevron
SVG controlled by isExpanded) should be replaced with a static SVG asset per
repo guidelines: move the SVG markup into a file under /public (e.g.,
chevron.svg) or import it as a static asset via Next.js, then replace the inline
<svg> block inside the ModelComparisonCard component with an <img> (or
next/image) reference to that asset and keep the transform/transition styling by
applying the same rotation logic to the image element or its wrapper; apply the
same change for the other inline SVG at the other location mentioned (lines
~377-389) so all inline SVGs are served as static assets.

---

Nitpick comments:
In `@app/assessment/components/EvaluationsTab.tsx`:
- Line 6: Replace the relative import of DataViewModal and
jsonResultsToTableData in EvaluationsTab.tsx with the project root alias import;
locate the import line that references "./DataViewModal" and change it to use
the "@/app/assessment/components/DataViewModal" path so DataViewModal and
jsonResultsToTableData are loaded via the configured tsconfig alias instead of a
relative path.
- Around line 80-94: The custom function formatRelativeTime bypasses the app's
date utilities; replace its logic by using date-fns (v4.1.0) and date-fns-tz
(v3.2.0) to produce consistent relative times—specifically swap out the body of
formatRelativeTime to call date-fns' formatDistanceToNow (or formatRelative with
proper tz conversion via date-fns-tz) to compute the human-friendly string,
ensure you parse the incoming dateStr into a Date with the app timezone
functions, and maintain the same return semantics ("just now", shortened units)
or map formatDistanceToNow results to match existing UI text.
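Two of the EvaluationsTab findings above (the `completed_with_errors` gating and the single-underscore `replace`) can be sketched as small pure helpers. Names are illustrative; the PR may already have a canonical formatter to reuse instead:

```typescript
// Statuses that should unlock preview/export actions, including
// partially successful runs.
const COMPLETED_LIKE = new Set(["completed", "completed_with_errors"]);

function isCompletedLike(status: string): boolean {
  return COMPLETED_LIKE.has(status);
}

// Replace every underscore (the global flag matters: a plain
// .replace("_", " ") only touches the first one) when rendering a badge.
function formatStatusLabel(status: string): string {
  return status.replace(/_/g, " ");
}
```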

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0cb313bf-88d2-4263-bfb4-5ef971355683

📥 Commits

Reviewing files that changed from the base of the PR and between fcbf161 and fd34470.

📒 Files selected for processing (14)
  • app/api/assessment/assessments/[assessment_id]/results/route.ts
  • app/api/assessment/assessments/[assessment_id]/retry/route.ts
  • app/api/assessment/assessments/route.ts
  • app/api/assessment/datasets/[dataset_id]/route.ts
  • app/api/assessment/datasets/route.ts
  • app/api/assessment/evaluations/[evaluation_id]/results/route.ts
  • app/api/assessment/evaluations/[evaluation_id]/retry/route.ts
  • app/api/assessment/evaluations/route.ts
  • app/api/assessment/events/route.ts
  • app/assessment/components/DataViewModal.tsx
  • app/assessment/components/EvaluationsTab.tsx
  • app/assessment/components/JsonEditor.tsx
  • app/assessment/schemaUtils.ts
  • app/components/speech-to-text/ModelComparisonCard.tsx
✅ Files skipped from review due to trivial changes (3)
  • app/assessment/schemaUtils.ts
  • app/api/assessment/assessments/route.ts
  • app/assessment/components/DataViewModal.tsx
🚧 Files skipped from review as they are similar to previous changes (3)
  • app/api/assessment/assessments/[assessment_id]/results/route.ts
  • app/api/assessment/evaluations/route.ts
  • app/api/assessment/datasets/[dataset_id]/route.ts

Comment thread app/api/assessment/datasets/route.ts Outdated
Comment thread app/api/assessment/datasets/route.ts
Comment thread app/api/assessment/evaluations/[evaluation_id]/results/route.ts Outdated
Comment on lines +18 to +32
    const { evaluation_id } = await context.params;
    const backendUrl = process.env.BACKEND_URL || "http://localhost:8000";

    const response = await fetch(
      `${backendUrl}/api/v1/assessment/evaluations/${evaluation_id}/retry`,
      {
        method: "POST",
        headers: {
          "X-API-KEY": apiKey,
        },
      },
    );

    const data = await response.json();
    return NextResponse.json(data, { status: response.status });

⚠️ Potential issue | 🟠 Major

Use apiClient instead of raw fetch here.

This proxy only forwards X-API-KEY, so backend cookies/session context are dropped. That can break cookie-backed auth flows and duplicates forwarding logic the shared helper already centralizes.

Suggested refactor
 import { NextRequest, NextResponse } from "next/server";
+import { apiClient } from "@/app/lib/apiClient";
@@
-    const backendUrl = process.env.BACKEND_URL || "http://localhost:8000";
-
-    const response = await fetch(
-      `${backendUrl}/api/v1/assessment/evaluations/${evaluation_id}/retry`,
-      {
-        method: "POST",
-        headers: {
-          "X-API-KEY": apiKey,
-        },
-      },
-    );
-
-    const data = await response.json();
-    return NextResponse.json(data, { status: response.status });
+    const { status, data } = await apiClient(
+      request,
+      `/api/v1/assessment/evaluations/${evaluation_id}/retry`,
+      { method: "POST" },
+    );
+
+    return NextResponse.json(data, { status });

Based on learnings: Use apiClient(request, endpoint, options) in server-side Next.js route handlers to forward requests to the backend, which automatically relays X-API-KEY and Cookie headers and returns { status, data, headers }.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
    const { evaluation_id } = await context.params;
    const backendUrl = process.env.BACKEND_URL || "http://localhost:8000";

    const response = await fetch(
      `${backendUrl}/api/v1/assessment/evaluations/${evaluation_id}/retry`,
      {
        method: "POST",
        headers: {
          "X-API-KEY": apiKey,
        },
      },
    );

    const data = await response.json();
    return NextResponse.json(data, { status: response.status });
import { NextRequest, NextResponse } from "next/server";
import { apiClient } from "@/app/lib/apiClient";

// ... other imports/code ...

export async function POST(
  request: NextRequest,
  context: { params: Promise<{ evaluation_id: string }> },
) {
  // ... auth logic ...
  const { evaluation_id } = await context.params;

  const { status, data } = await apiClient(
    request,
    `/api/v1/assessment/evaluations/${evaluation_id}/retry`,
    { method: "POST" },
  );

  return NextResponse.json(data, { status });
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/api/assessment/evaluations/[evaluation_id]/retry/route.ts` around lines
18 - 32, This route is using raw fetch which drops cookies and duplicates
forwarding logic; replace the fetch call with the shared apiClient call: call
apiClient(request, `/api/v1/assessment/evaluations/${evaluation_id}/retry`, {
method: "POST" }) (referencing evaluation_id and apiClient) and use the returned
{ status, data, headers } to build the NextResponse (e.g.,
NextResponse.json(data, { status, headers })) so X-API-KEY and Cookie are
automatically forwarded and response headers/status are preserved.

Comment thread app/api/assessment/events/route.ts Outdated
Comment thread app/assessment/components/JsonEditor.tsx
Comment thread app/components/speech-to-text/ModelComparisonCard.tsx Outdated
Comment thread app/components/speech-to-text/ModelComparisonCard.tsx
Comment thread app/components/speech-to-text/ModelComparisonCard.tsx
Comment on lines +156 to +172
<svg
className="w-4 h-4"
fill="none"
viewBox="0 0 24 24"
stroke="currentColor"
style={{
transform: isExpanded ? "rotate(180deg)" : "rotate(0deg)",
transition: "transform 0.2s ease",
}}
>
<path
strokeLinecap="round"
strokeLinejoin="round"
strokeWidth={2}
d="M19 9l-7 7-7-7"
/>
</svg>

🛠️ Refactor suggestion | 🟠 Major

Replace inline SVGs with static SVG assets per repo guideline.

This file embeds SVG markup directly; the project guideline requires SVGs from /public (or static import via Next.js image handling).

As per coding guidelines, “SVGs follow Next.js defaults and should be imported as static assets via next/image or referenced from /public”.

Also applies to: 377-389

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/speech-to-text/ModelComparisonCard.tsx` around lines 156 -
172, The inline SVG used in ModelComparisonCard (the chevron SVG controlled by
isExpanded) should be replaced with a static SVG asset per repo guidelines: move
the SVG markup into a file under /public (e.g., chevron.svg) or import it as a
static asset via Next.js, then replace the inline <svg> block inside the
ModelComparisonCard component with an <img> (or next/image) reference to that
asset and keep the transform/transition styling by applying the same rotation
logic to the image element or its wrapper; apply the same change for the other
inline SVG at the other location mentioned (lines ~377-389) so all inline SVGs
are served as static assets.

…nd enhance accessibility

- Replaced inline SVG icons with imported icon components in DataViewModal and DatasetStep.
- Added error handling for forbidden API responses across various components.
- Improved accessibility by adding aria attributes and roles to relevant elements in JsonEditor and EvaluationsTab.
- Refactored API request logic to use a centralized apiFetch function for consistency.
- Enhanced state management and effects in ModelComparisonCard for better user experience.

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 5

♻️ Duplicate comments (5)
app/api/assessment/events/route.ts (1)

19-24: ⚠️ Potential issue | 🟠 Major

Sanitize SSE proxy failures before returning them.

details: text echoes the upstream response body back to the browser. That can leak backend internals from the event service. Log the full body server-side and return a generic error payload to the client instead.

Suggested fix
     if (!response.ok || !response.body) {
       const text = await response.text();
+      console.error("Assessment events upstream error:", response.status, text);
       return NextResponse.json(
-        { error: "Failed to connect assessment event stream", details: text },
+        { error: "Failed to connect assessment event stream" },
         { status: response.status || 500 },
       );
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/api/assessment/events/route.ts` around lines 19 - 24, The handler
currently returns the upstream SSE body directly to the client (response.text()
-> details: text); change this to log the full body server-side (use the
existing logger or processLogger/console.error to record the text and
response.status) and return a sanitized JSON to the client via NextResponse.json
(e.g., { error: "Failed to connect assessment event stream" }) with the same
status code; keep using the existing response variable and NextResponse.json
call but remove echoing `text` in the client payload while preserving
server-side logging for debugging.
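The split the fix describes, log the raw upstream body and return only a generic payload, can be captured in a small pure helper. This is a sketch; the function name and return shape are ours, not from the PR:

```typescript
// Splits an upstream SSE failure into a server-side log entry and a
// sanitized client payload, so backend internals never reach the browser.
function sanitizeUpstreamError(status: number, upstreamBody: string): {
  log: string;
  clientBody: { error: string };
  clientStatus: number;
} {
  return {
    // Full detail stays on the server (e.g., passed to console.error).
    log: `Assessment events upstream error: ${status} ${upstreamBody}`,
    // The client only learns that the connection failed.
    clientBody: { error: "Failed to connect assessment event stream" },
    // Fall back to 500 when the upstream status is missing or zero.
    clientStatus: status || 500,
  };
}
```

The route handler would then log `result.log` and pass `result.clientBody` with `result.clientStatus` to `NextResponse.json`.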
app/assessment/components/EvaluationsTab.tsx (2)

786-786: ⚠️ Potential issue | 🟡 Minor

Use the existing status formatter for the assessment badge.

replace("_", " ") only changes the first underscore, so completed_with_errors renders as completed with_errors. Use the formatStatusLabel function defined at line 540 which uses a global regex.

🐛 Suggested fix
-                          {run.status.replace("_", " ")}
+                          {formatStatusLabel(run.status)}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/EvaluationsTab.tsx` at line 786, The badge
currently uses run.status.replace("_", " ") which only replaces the first
underscore; update the badge to call the existing formatStatusLabel function
(defined as formatStatusLabel) with run.status so all underscores are replaced
via the global regex and the label is consistently formatted; locate the usage
near the EvaluationsTab component where run.status is rendered and replace the
replace call with formatStatusLabel(run.status).

881-883: ⚠️ Potential issue | 🟠 Major

completed_with_errors child runs lose preview/export actions.

isCompletedChild only matches "completed", so partially successful runs with status "completed_with_errors" cannot be previewed or exported. These runs have valid results that users should be able to access.

🐛 Suggested fix
                             const isFailedChild = childRun.status === "failed";
                             const isCompletedChild =
-                              childRun.status === "completed";
+                              childRun.status === "completed" ||
+                              childRun.status === "completed_with_errors";
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/EvaluationsTab.tsx` around lines 881 - 883, The
isCompletedChild check only matches childRun.status === "completed", causing
"completed_with_errors" runs to miss preview/export actions; update the logic
that defines isCompletedChild (the variable using childRun.status) to include
"completed_with_errors" as a valid completed state (e.g., check for status ===
"completed" || status === "completed_with_errors" or use a startsWith/includes
test) so those child runs show the preview/export actions.
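The widened check can also be expressed as a small predicate (a sketch; the status strings come from the review, the helper name is illustrative):

```typescript
// Statuses whose runs have accessible results and should therefore
// expose the preview/export actions.
const COMPLETED_STATUSES = new Set(["completed", "completed_with_errors"]);

function hasAccessibleResults(status: string): boolean {
  return COMPLETED_STATUSES.has(status);
}
```

Centralizing the set also keeps the badge, preview, and export checks from drifting apart if more terminal statuses are added later.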
app/assessment/AssessmentPageClient.tsx (1)

354-367: ⚠️ Potential issue | 🟠 Major

Submission drops the mapped ground-truth columns.

columnMapping.groundTruthColumns is tracked in the component state (line 305), but it is not included in the POST body. Any assessment that depends on reference labels will be submitted without them.

🛡️ Suggested fix
       const payload = {
         experiment_name: experimentName.trim(),
         dataset_id: parseInt(datasetId, 10),
         prompt_template: promptTemplate || null,
         text_columns: columnMapping.textColumns,
+        ground_truth_columns: columnMapping.groundTruthColumns,
         attachments: columnMapping.attachments.map(
           ({ column, type, format }) => ({ column, type, format }),
         ),
         output_schema: schemaToJsonSchema(outputSchema) || null,
         configs: configs.map(({ config_id, config_version }) => ({
           config_id,
           config_version,
         })),
       };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/AssessmentPageClient.tsx` around lines 354 - 367, The POST
payload in AssessmentPageClient.tsx is missing the mapped ground-truth columns
(state symbol columnMapping.groundTruthColumns) so reference labels are not
sent; update the payload construction (the payload object near the code creating
experiment_name, text_columns, attachments) to include a ground_truth_columns
property that serializes columnMapping.groundTruthColumns in the same shape as
text_columns/attachments (e.g., map each entry to the expected { column, ... }
structure or pass the raw mapping if the API accepts it), ensuring the key name
matches the API contract used by submit/create handlers.
app/assessment/components/DatasetStep.tsx (1)

76-105: ⚠️ Potential issue | 🟠 Major

Silent failure when dataset parse/fetch fails — user gets stuck.

The past review concern remains valid. fetchAndParseFile returns null on every failure path (missing file_content at line 85, missing sheet at line 91, empty rawData at line 97). In handleDatasetSelect, when parsed is null, no error feedback is shown, yet canProceed (line 321) checks only datasetId && !isLoadingColumns, potentially allowing navigation forward with zero columns loaded.

🛡️ Suggested fix
   const fetchAndParseFile = async (
     id: string | number,
   ): Promise<{ headers: string[]; rows: string[][] } | null> => {
     const json = await apiFetch<DatasetFileResponse>(
       `/api/assessment/datasets/${id}?fetch_content=true`,
       apiKey,
     );
     const base64 = json?.file_content;
-    if (!base64) return null;
+    if (!base64) throw new Error("Dataset content is unavailable.");

     // Decode base64 to binary and parse with XLSX
     const binary = Uint8Array.from(atob(base64), (c) => c.charCodeAt(0));
     const workbook = XLSX.read(binary, { type: "array" });
     const sheet = workbook.Sheets[workbook.SheetNames[0]];
-    if (!sheet) return null;
+    if (!sheet) throw new Error("No worksheet found in the file.");

     const rawData: string[][] = XLSX.utils.sheet_to_json(sheet, {
       header: 1,
       defval: "",
     });
-    if (rawData.length === 0) return null;
+    if (rawData.length === 0) throw new Error("Dataset appears to be empty.");

Then in handleDatasetSelect:

     try {
       const parsed = await fetchAndParseFile(id);
-      if (parsed?.headers) {
+      if (!parsed?.headers?.length) {
+        toast.error("Could not read columns from this dataset.");
+        return;
+      }
         const firstRow = parsed.rows[0] || [];
         ...
         onColumnsLoaded(parsed.headers, sampleRow);
-      }
     } catch (e) {
       if (handleForbiddenApiError(e, onForbidden)) return;
-      console.error("Failed to fetch dataset columns:", e);
+      toast.error(e instanceof Error ? e.message : "Failed to fetch dataset columns");
     } finally {

Also applies to: 224-242

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/DatasetStep.tsx` around lines 76 - 105,
fetchAndParseFile currently returns null on multiple failure paths which leaves
handleDatasetSelect unaware of failures; update fetchAndParseFile to throw
descriptive errors (e.g., "no file_content", "no sheet", "empty data") or return
a {error: string} result so callers can react, and ensure handleDatasetSelect
wraps the call in try/catch (or checks the error) to set
setIsLoadingColumns(false), clear/set columns via setColumns([]) on failure, and
surface a user-visible message (via setAlert or the existing error UI) instead
of silently doing nothing; also update canProceed to include a check on
columns.length (or !isLoadingColumns && columns.length > 0) so navigation is
blocked when no columns are loaded.
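The throw-instead-of-null shape can be sketched as a pure validation step with the XLSX decoding elided. The error messages follow the suggested fix above; the helper itself is illustrative:

```typescript
// Validates decoded spreadsheet rows before the caller trusts them.
// Each failure path throws a descriptive error instead of returning null,
// so the caller's catch block can surface the message to the user.
function validateParsedDataset(
  rows: string[][] | null,
): { headers: string[]; rows: string[][] } {
  if (!rows) throw new Error("Dataset content is unavailable.");
  if (rows.length === 0) throw new Error("Dataset appears to be empty.");
  const [headers, ...dataRows] = rows;
  if (headers.length === 0) {
    throw new Error("Could not read columns from this dataset.");
  }
  return { headers, rows: dataRows };
}
```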
🧹 Nitpick comments (7)
app/components/speech-to-text/ModelComparisonCard.tsx (1)

70-70: Consider stabilizing detailsId with useId() to avoid collisions.

Line 70 builds DOM ids directly from modelId; if duplicate model ids are rendered, aria-controls can become ambiguous. useId() + model suffix is safer.

🧩 Proposed refactor
-import React, { useState, useEffect } from "react";
+import React, { useState, useEffect, useId } from "react";
@@
-  const detailsId = `model-card-details-${modelId}`;
+  const reactId = useId();
+  const detailsId = `model-card-details-${modelId}-${reactId}`;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/speech-to-text/ModelComparisonCard.tsx` at line 70, Replace
the unstable DOM id generation in ModelComparisonCard (detailsId built from
modelId) with a stable, React-generated id using useId(): import useId from
React, call const reactId = useId(), then compose detailsId by combining reactId
with the model-specific suffix (e.g.,
`${reactId}-model-card-details-${modelId}`) and update any
aria-controls/aria-labelledby usages to use this new detailsId so ids remain
unique even when modelId values collide; ensure useId() is called inside the
ModelComparisonCard component function.
app/assessment/components/JsonEditor.tsx (2)

16-23: Prefer shared theme tokens over local hardcoded hex values.

C currently hardcodes syntax colors, which makes future theming/dark-mode consistency harder. Consider moving these to shared tokens (CSS vars or centralized color map) and reading from there.

As per coding guidelines, "Use Tailwind CSS 4.x for styling, with custom color system and styles defined in /app/globals.css for cases not supported by Tailwind".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/JsonEditor.tsx` around lines 16 - 23, Replace the
local hardcoded color map C in JsonEditor.tsx with shared theme tokens: remove
hex literals and reference the centralized color system (CSS custom properties
defined in /app/globals.css or the project-wide color map) using the token names
used by our Tailwind 4.x custom color system; update the JsonEditor component to
read from those tokens (e.g., via getComputedStyle/CSS variables or importing
the shared color map) so syntax colors (keys: key, string, number, boolean,
null, punct) come from the global theme rather than local hex values, preserving
existing property names in C to minimize downstream changes.

3-3: Memoize highlighted HTML to reduce avoidable recomputation.

highlight(value) runs during every render. Memoizing the computed HTML keeps this cheaper when rerenders happen without value changes.

♻️ Proposed refactor
-import { useRef, useCallback, useId } from "react";
+import { useRef, useCallback, useId, useMemo } from "react";
...
   const textareaId = useId();
   const errorId = `${textareaId}-error`;
+  const highlightedHtml = useMemo(() => `${highlight(value)}\n`, [value]);
...
-          dangerouslySetInnerHTML={{ __html: highlight(value) + "\n" }}
+          dangerouslySetInnerHTML={{ __html: highlightedHtml }}

Please verify with React Profiler on a large JSON payload (e.g., several thousand characters) to confirm reduced render cost during non-text updates.

Also applies to: 83-85, 234-234

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/JsonEditor.tsx` at line 3, The call to
highlight(value) is recomputing on every render; wrap the highlighted HTML
computation in React.useMemo so it only recalculates when the input value
changes (e.g., const highlightedHtml = useMemo(() => highlight(value),
[value])). Update all occurrences where highlight(value) is used (including the
places around the existing useRef/useCallback hooks and the usage near the
render that injects the highlighted HTML) to use the memoized highlightedHtml
variable instead; run React Profiler afterwards with a large JSON payload to
confirm reduced work during renders that do not change value.
app/assessment/config/api.ts (1)

16-16: Use the root import alias here.

Switch this relative import to @/app/assessment/types for consistency with the repo standard.

Suggested fix
-import { ConfigSelection } from "../types";
+import { ConfigSelection } from "@/app/assessment/types";

As per coding guidelines, **/*.{ts,tsx,js,jsx}: Use the @/ import alias configured in tsconfig.json to resolve imports from the project root.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/config/api.ts` at line 16, The import in api.ts currently uses
a relative path for ConfigSelection; change it to the project root alias by
importing ConfigSelection from "@/app/assessment/types" instead of "../types" so
it follows the repo's tsconfig alias convention and keeps imports consistent
(update the import statement that references ConfigSelection).
app/assessment/AssessmentPageClient.tsx (1)

48-52: Global window property for navigation lock is acceptable but consider alternatives.

Using window.__assessmentForbiddenNavLock works for preventing duplicate redirects, but this pattern can be fragile. A module-scoped variable or a ref would be cleaner and wouldn't pollute the global namespace.

♻️ Alternative using module-scoped variable
+let assessmentForbiddenNavLock = false;
+
-declare global {
-  interface Window {
-    __assessmentForbiddenNavLock?: boolean;
-  }
-}

// Then in handleAssessmentForbidden:
-      if (
-        typeof window !== "undefined" &&
-        window.__assessmentForbiddenNavLock
-      ) {
+      if (assessmentForbiddenNavLock) {
         return;
       }
       ...
-      if (typeof window !== "undefined") {
-        window.__assessmentForbiddenNavLock = true;
-      }
+      assessmentForbiddenNavLock = true;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/AssessmentPageClient.tsx` around lines 48 - 52, The code uses
the global flag window.__assessmentForbiddenNavLock to prevent duplicate
redirects; replace this global with a module-scoped variable or a React ref to
avoid polluting window. Locate uses of window.__assessmentForbiddenNavLock in
AssessmentPageClient.tsx (e.g., in the redirect/navigation handler) and change
them to a top-level let (module-scoped) or a useRef inside the component (e.g.,
navigationLockRef) that you check/set instead of reading/writing
window.__assessmentForbiddenNavLock; ensure the logic for checking and setting
the lock is preserved and remove the global declaration block.
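A sketch of the module-scoped alternative (names are illustrative; the redirect logic around the lock is elided):

```typescript
// Module-scoped lock: shared by every caller in this module without
// touching the window object. It survives re-renders and resets on a
// full page reload, the same lifetime the window flag had.
let assessmentForbiddenNavLock = false;

// Returns true exactly once; later callers see the lock already taken
// and skip the duplicate redirect.
function tryAcquireForbiddenNavLock(): boolean {
  if (assessmentForbiddenNavLock) return false;
  assessmentForbiddenNavLock = true;
  return true;
}
```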
app/assessment/components/EvaluationsTab.tsx (2)

278-345: Potential stale closure in loadConfigDetail deduplication check.

The check at line 283 (if (configDetailsByKey[key] || configLoadingKeys[key]) return) reads from state objects that are dependencies of the useCallback. Since object references change on every state update, this should work correctly, but the deduplication logic relies on the closure capturing the latest state. Consider using a ref for the loading keys set if you encounter duplicate requests.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/EvaluationsTab.tsx` around lines 278 - 345, The
deduplication check inside loadConfigDetail can use stale closure values for
configLoadingKeys/configDetailsByKey; switch to using a mutable ref for at least
the loading set (e.g., configLoadingKeysRef) that you keep in sync whenever
setConfigLoadingKeys runs (update ref in the same setter or via useEffect), read
configLoadingKeysRef.current in the early-return check inside loadConfigDetail,
and update both state (setConfigLoadingKeys) and the ref when marking/clearing
keys so UI state stays consistent while dedupe uses the always-current ref; keep
references to loadConfigDetail, configLoadingKeys state setter, and
configDetailsByKey updates intact.

398-401: Raw fetch bypasses shared error handling.

triggerDownload uses raw fetch instead of apiFetch. While this is needed for blob downloads, the 403 handling at lines 402-405 is manual. Consider extracting a download-specific helper or documenting why raw fetch is required here.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/assessment/components/EvaluationsTab.tsx` around lines 398 - 401, The raw
fetch in triggerDownload bypasses shared API error handling; extract or use a
download-specific helper (e.g., apiFetchBlob) that accepts the URL, headers
(buildAuthHeaders()), and export_format, performs the fetch for a blob, and
invokes the same centralized error handling used elsewhere (e.g., the existing
apiFetch or handleApiError logic) so 403/unauthorized responses are handled
consistently instead of duplicating the 403 check in triggerDownload; update
triggerDownload to call that helper and return the blob/trigger the download.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `app/api/assessment/datasets/[dataset_id]/route.ts`:
- Around line 79-85: The DELETE response handling should special-case HTTP 204:
after calling apiClient(request, `/api/v1/assessment/datasets/${dataset_id}`, {
method: "DELETE" }) inspect the returned status and if it equals 204 return an
empty NextResponse with that status instead of calling NextResponse.json(data, {
status }), otherwise continue to return NextResponse.json(data, { status });
update the route handler in route.ts where apiClient and dataset_id are used to
implement this conditional.
- Around line 45-55: The route currently calls fetch(signedUrl) then awaits
fileResponse.arrayBuffer() and base64-encodes it (using
Buffer.from(...).toString("base64")), which can OOM on large objects; update the
logic in the handler in route.ts to first inspect
fileResponse.headers.get('content-length') (or otherwise determine size) and
enforce a strict max size (e.g., reject with NextResponse.json({ error: "File
too large" }, { status: 413 })) before calling arrayBuffer(), or alternatively
return the signedUrl to the client instead of downloading and encoding
server-side; ensure you change the branch that currently performs arrayBuffer()
and Buffer.from(...) to use the size check or return the signed URL so the
server never buffers large files into memory.

In `@app/assessment/config/api.ts`:
- Around line 25-33: The helper getApiKey currently always returns keys[0].key
which breaks when multiple keys exist; update getApiKey to resolve the same
selected/active keystore entry the rest of the assessment uses (instead of
blindly picking index 0) by reading the selection marker the app stores (e.g.,
an activeKeyId or selectedIndex stored alongside STORAGE_KEY) or accept the
selected key id as an argument; locate the function getApiKey and STORAGE_KEY
usage, read the selection field the rest of the assessment uses (or add a
parameter to getApiKey to receive selectedKeyId), then return the key object
matching that id (or null if not found) rather than keys[0].key.

In `@app/assessment/useAssessmentEvents.ts`:
- Around line 90-95: The SSE connection error handling in useAssessmentEvents
currently calls onForbidden() and then throws, which causes the reconnect loop
to keep retrying; instead, make 403 a terminal case by invoking onForbidden()
and returning early (do not throw) so the reconnect/breaker isn't fed a
retryable error. Update the block that checks response.status === 403 (the SSE
response handling in useAssessmentEvents) to call onForbidden() and exit the
function/connection setup immediately for 403, while continuing to throw for
other non-ok statuses.

In `@app/components/speech-to-text/ModelComparisonCard.tsx`:
- Around line 130-137: The visual spinner rendered in ModelComparisonCard.tsx
when status === "pending" is not announced to assistive tech; update the spinner
container (the div that currently has className "w-4 h-4 ... animate-spin") to
include an accessible status by adding role="status" and an appropriate text
alternative such as aria-label="Loading" or an offscreen <span> with
aria-live="polite" (or aria-hidden on purely decorative span) so screen readers
receive a clear "loading" announcement; ensure the change is applied inside the
component where the status variable is checked so it only appears during the
pending state.

---

Duplicate comments:
In `@app/api/assessment/events/route.ts`:
- Around line 19-24: The handler currently returns the upstream SSE body
directly to the client (response.text() -> details: text); change this to log
the full body server-side (use the existing logger or
processLogger/console.error to record the text and response.status) and return a
sanitized JSON to the client via NextResponse.json (e.g., { error: "Failed to
connect assessment event stream" }) with the same status code; keep using the
existing response variable and NextResponse.json call but remove echoing `text`
in the client payload while preserving server-side logging for debugging.

In `@app/assessment/AssessmentPageClient.tsx`:
- Around line 354-367: The POST payload in AssessmentPageClient.tsx is missing
the mapped ground-truth columns (state symbol columnMapping.groundTruthColumns)
so reference labels are not sent; update the payload construction (the payload
object near the code creating experiment_name, text_columns, attachments) to
include a ground_truth_columns property that serializes
columnMapping.groundTruthColumns in the same shape as text_columns/attachments
(e.g., map each entry to the expected { column, ... } structure or pass the raw
mapping if the API accepts it), ensuring the key name matches the API contract
used by submit/create handlers.

In `@app/assessment/components/DatasetStep.tsx`:
- Around line 76-105: fetchAndParseFile currently returns null on multiple
failure paths which leaves handleDatasetSelect unaware of failures; update
fetchAndParseFile to throw descriptive errors (e.g., "no file_content", "no
sheet", "empty data") or return a {error: string} result so callers can react,
and ensure handleDatasetSelect wraps the call in try/catch (or checks the error)
to set setIsLoadingColumns(false), clear/set columns via setColumns([]) on
failure, and surface a user-visible message (via setAlert or the existing error
UI) instead of silently doing nothing; also update canProceed to include a check
on columns.length (or !isLoadingColumns && columns.length > 0) so navigation is
blocked when no columns are loaded.

In `@app/assessment/components/EvaluationsTab.tsx`:
- Line 786: The badge currently uses run.status.replace("_", " ") which only
replaces the first underscore; update the badge to call the existing
formatStatusLabel function (defined as formatStatusLabel) with run.status so all
underscores are replaced via the global regex and the label is consistently
formatted; locate the usage near the EvaluationsTab component where run.status
is rendered and replace the replace call with formatStatusLabel(run.status).
- Around line 881-883: The isCompletedChild check only matches childRun.status
=== "completed", causing "completed_with_errors" runs to miss preview/export
actions; update the logic that defines isCompletedChild (the variable using
childRun.status) to include "completed_with_errors" as a valid completed state
(e.g., check for status === "completed" || status === "completed_with_errors" or
use a startsWith/includes test) so those child runs show the preview/export
actions.

---

Nitpick comments:
In `@app/assessment/AssessmentPageClient.tsx`:
- Around line 48-52: The code uses the global flag
window.__assessmentForbiddenNavLock to prevent duplicate redirects; replace this
global with a module-scoped variable or a React ref to avoid polluting window.
Locate uses of window.__assessmentForbiddenNavLock in AssessmentPageClient.tsx
(e.g., in the redirect/navigation handler) and change them to a top-level let
(module-scoped) or a useRef inside the component (e.g., navigationLockRef) that
you check/set instead of reading/writing window.__assessmentForbiddenNavLock;
ensure the logic for checking and setting the lock is preserved and remove the
global declaration block.

In `@app/assessment/components/EvaluationsTab.tsx`:
- Around line 278-345: The deduplication check inside loadConfigDetail can use
stale closure values for configLoadingKeys/configDetailsByKey; switch to using a
mutable ref for at least the loading set (e.g., configLoadingKeysRef) that you
keep in sync whenever setConfigLoadingKeys runs (update ref in the same setter
or via useEffect), read configLoadingKeysRef.current in the early-return check
inside loadConfigDetail, and update both state (setConfigLoadingKeys) and the
ref when marking/clearing keys so UI state stays consistent while dedupe uses
the always-current ref; keep references to loadConfigDetail, configLoadingKeys
state setter, and configDetailsByKey updates intact.
- Around line 398-401: The raw fetch in triggerDownload bypasses shared API
error handling; extract or use a download-specific helper (e.g., apiFetchBlob)
that accepts the URL, headers (buildAuthHeaders()), and export_format, performs
the fetch for a blob, and invokes the same centralized error handling used
elsewhere (e.g., the existing apiFetch or handleApiError logic) so
403/unauthorized responses are handled consistently instead of duplicating the
403 check in triggerDownload; update triggerDownload to call that helper and
return the blob/trigger the download.

In `@app/assessment/components/JsonEditor.tsx`:
- Around line 16-23: Replace the local hardcoded color map C in JsonEditor.tsx
with shared theme tokens: remove hex literals and reference the centralized
color system (CSS custom properties defined in /app/globals.css or the
project-wide color map) using the token names used by our Tailwind 4.x custom
color system; update the JsonEditor component to read from those tokens (e.g.,
via getComputedStyle/CSS variables or importing the shared color map) so syntax
colors (keys: key, string, number, boolean, null, punct) come from the global
theme rather than local hex values, preserving existing property names in C to
minimize downstream changes.
- Line 3: The call to highlight(value) is recomputing on every render; wrap the
highlighted HTML computation in React.useMemo so it only recalculates when the
input value changes (e.g., const highlightedHtml = useMemo(() =>
highlight(value), [value])). Update all occurrences where highlight(value) is
used (including the places around the existing useRef/useCallback hooks and the
usage near the render that injects the highlighted HTML) to use the memoized
highlightedHtml variable instead; run React Profiler afterwards with a large
JSON payload to confirm reduced work during renders that do not change value.

In `@app/assessment/config/api.ts`:
- Line 16: The import in api.ts currently uses a relative path for
ConfigSelection; change it to the project root alias by importing
ConfigSelection from "@/app/assessment/types" instead of "../types" so it
follows the repo's tsconfig alias convention and keeps imports consistent
(update the import statement that references ConfigSelection).

In `@app/components/speech-to-text/ModelComparisonCard.tsx`:
- Line 70: Replace the unstable DOM id generation in ModelComparisonCard
(detailsId built from modelId) with a stable, React-generated id using useId():
import useId from React, call const reactId = useId(), then compose detailsId by
combining reactId with the model-specific suffix (e.g.,
`${reactId}-model-card-details-${modelId}`) and update any
aria-controls/aria-labelledby usages to use this new detailsId so ids remain
unique even when modelId values collide; ensure useId() is called inside the
ModelComparisonCard component function.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a8bc382f-67d4-4a19-bfe1-93e4b7b924b0

📥 Commits

Reviewing files that changed from the base of the PR and between fd34470 and c5640ad.

📒 Files selected for processing (19)
  • app/api/assessment/assessments/[assessment_id]/results/route.ts
  • app/api/assessment/assessments/[assessment_id]/retry/route.ts
  • app/api/assessment/assessments/route.ts
  • app/api/assessment/datasets/[dataset_id]/route.ts
  • app/api/assessment/datasets/route.ts
  • app/api/assessment/evaluations/[evaluation_id]/results/route.ts
  • app/api/assessment/evaluations/[evaluation_id]/retry/route.ts
  • app/api/assessment/evaluations/route.ts
  • app/api/assessment/events/route.ts
  • app/api/assessment/utils.ts
  • app/assessment/AssessmentPageClient.tsx
  • app/assessment/components/DataViewModal.tsx
  • app/assessment/components/DatasetStep.tsx
  • app/assessment/components/EvaluationsTab.tsx
  • app/assessment/components/JsonEditor.tsx
  • app/assessment/config/api.ts
  • app/assessment/errorUtils.ts
  • app/assessment/useAssessmentEvents.ts
  • app/components/speech-to-text/ModelComparisonCard.tsx
✅ Files skipped from review due to trivial changes (1)
  • app/assessment/components/DataViewModal.tsx
🚧 Files skipped from review as they are similar to previous changes (5)
  • app/api/assessment/assessments/route.ts
  • app/api/assessment/datasets/route.ts
  • app/api/assessment/assessments/[assessment_id]/retry/route.ts
  • app/api/assessment/assessments/[assessment_id]/results/route.ts
  • app/api/assessment/evaluations/route.ts

Comment on lines +45 to +55
      const fileResponse = await fetch(signedUrl);
      if (!fileResponse.ok) {
        return NextResponse.json(
          { error: "Failed to fetch file from storage" },
          { status: 502 },
        );
      }

      const arrayBuffer = await fileResponse.arrayBuffer();
      const base64 = Buffer.from(arrayBuffer).toString("base64");
      return NextResponse.json(

⚠️ Potential issue | 🟠 Major

Cap fetch_content before base64-encoding the whole file.

This path buffers the entire object into memory and then expands it again as base64. A single large dataset can spike memory hard enough to take the route down. Either return the signed URL to the client or enforce a strict size cap before arrayBuffer().

Possible guardrail
       const fileResponse = await fetch(signedUrl);
       if (!fileResponse.ok) {
         return NextResponse.json(
           { error: "Failed to fetch file from storage" },
           { status: 502 },
         );
       }

+      const contentLength = Number(
+        fileResponse.headers.get("content-length") || "0",
+      );
+      const MAX_EMBEDDED_BYTES = 5 * 1024 * 1024;
+      if (contentLength > MAX_EMBEDDED_BYTES) {
+        return NextResponse.json(
+          { error: "Dataset file is too large to embed in the response" },
+          { status: 413 },
+        );
+      }
+
       const arrayBuffer = await fileResponse.arrayBuffer();
       const base64 = Buffer.from(arrayBuffer).toString("base64");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `app/api/assessment/datasets/[dataset_id]/route.ts` around lines 45 - 55: The
route currently calls fetch(signedUrl), awaits fileResponse.arrayBuffer(), and
base64-encodes the result with Buffer.from(...).toString("base64"), which can
OOM on large objects. Update the handler to inspect
fileResponse.headers.get("content-length") (or otherwise determine the size) and
enforce a strict maximum before calling arrayBuffer(), e.g. reject with
NextResponse.json({ error: "File too large" }, { status: 413 }). Alternatively,
return the signedUrl to the client instead of downloading and encoding
server-side, so the server never buffers large files into memory.

Comment on lines +79 to +85
    const { status, data } = await apiClient(
      request,
      `/api/v1/assessment/datasets/${dataset_id}`,
      { method: "DELETE" },
    );

    return NextResponse.json(data, { status });

⚠️ Potential issue | 🟠 Major

Handle 204 deletes without NextResponse.json().

If the backend returns 204 No Content, NextResponse.json(data, { status: 204 }) still tries to build a body, which is invalid for 204 responses and can throw. Return an empty NextResponse for that status instead.

Suggested fix
     const { status, data } = await apiClient(
       request,
       `/api/v1/assessment/datasets/${dataset_id}`,
       { method: "DELETE" },
     );

+    if (status === 204) {
+      return new NextResponse(null, { status });
+    }
+
     return NextResponse.json(data, { status });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `app/api/assessment/datasets/[dataset_id]/route.ts` around lines 79 - 85: The
DELETE response handling should special-case HTTP 204. After calling
apiClient(request, `/api/v1/assessment/datasets/${dataset_id}`, { method:
"DELETE" }), inspect the returned status; if it equals 204, return an empty
NextResponse with that status instead of calling NextResponse.json(data, {
status }). Otherwise keep returning NextResponse.json(data, { status }).

Comment on lines +25 to +33
function getApiKey(): string | null {
  if (typeof window === "undefined") return null;

  try {
    const stored = localStorage.getItem(STORAGE_KEY);
    if (!stored) return null;

    const keys = JSON.parse(stored);
    return keys.length > 0 ? keys[0].key : null;

⚠️ Potential issue | 🟠 Major

Don't always use the first stored API key.

The assessment flow already tracks an active keystore entry elsewhere, but this helper always sends keys[0].key. As soon as a user has multiple keys, config reads/writes can hit the wrong tenant or fail with 403s. Plumb the selected key into this module, or resolve the same selected-key source the rest of the assessment code uses.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `app/assessment/config/api.ts` around lines 25 - 33: The helper getApiKey
currently always returns keys[0].key, which breaks once multiple keys exist.
Update getApiKey to resolve the same selected/active keystore entry the rest of
the assessment flow uses instead of blindly picking index 0: read the selection
marker the app stores alongside STORAGE_KEY (e.g. an activeKeyId or
selectedIndex), or accept the selected key id as an argument, and return the
key matching that id (or null if not found) rather than keys[0].key.
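
The suggested fix can be sketched as a pure helper so the selection logic is testable in isolation. The StoredKey shape and the selectedKeyId marker are assumptions, not code from this PR:

```typescript
// Hypothetical sketch: resolve the user's *selected* key instead of keys[0].
interface StoredKey {
  id: string;
  key: string;
}

function resolveApiKey(
  stored: string | null,
  selectedKeyId: string | null,
): string | null {
  if (!stored) return null;
  try {
    const keys: StoredKey[] = JSON.parse(stored);
    if (!Array.isArray(keys) || keys.length === 0) return null;
    // With an explicit selection, only that key is acceptable; returning null
    // on a miss surfaces a stale selection instead of silently sending
    // another tenant's key.
    if (selectedKeyId) {
      const match = keys.find((k) => k.id === selectedKeyId);
      return match ? match.key : null;
    }
    return keys[0].key;
  } catch {
    return null;
  }
}
```

In the browser module, getApiKey() would then read both STORAGE_KEY and the selection marker from localStorage and delegate to this helper.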

Comment on lines +90 to +95
      if (!response.ok || !response.body) {
        if (response.status === 403 && onForbidden) {
          onForbidden();
        }
        throw new Error(`SSE connection failed: ${response.status}`);
      }

⚠️ Potential issue | 🟠 Major

Treat 403 as terminal for this connection.

This currently calls onForbidden() and then throws, so the reconnect loop keeps retrying a permanently denied stream and can spam toasts or redirects. Stop the loop after the first 403 instead of feeding it back into the breaker.

Suggested fix
       if (!response.ok || !response.body) {
         if (response.status === 403 && onForbidden) {
+          cancelled = true;
           onForbidden();
+          return;
         }
         throw new Error(`SSE connection failed: ${response.status}`);
       }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `app/assessment/useAssessmentEvents.ts` around lines 90 - 95: The SSE error
handling currently calls onForbidden() and then throws, so the reconnect loop
keeps retrying a permanently denied stream. Make 403 terminal: invoke
onForbidden() and return early (do not throw) so the reconnect/breaker isn't
fed a retryable error, while continuing to throw for other non-ok statuses.

Comment on lines +130 to +137
            {status === "pending" && (
              <div
                className="w-4 h-4 border-2 border-t-transparent rounded-full animate-spin"
                style={{
                  borderColor: colors.text.secondary,
                  borderTopColor: "transparent",
                }}
              />

⚠️ Potential issue | 🟡 Minor

Pending state is not announced to screen readers.

Line 130 currently renders only a visual spinner, so assistive tech may miss that work is in progress. Add a status role/label.

♿ Proposed fix
             {status === "pending" && (
               <div
+                role="status"
+                aria-live="polite"
+                aria-label="Transcription in progress"
                 className="w-4 h-4 border-2 border-t-transparent rounded-full animate-spin"
                 style={{
                   borderColor: colors.text.secondary,
                   borderTopColor: "transparent",
                 }}
               />
             )}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `app/components/speech-to-text/ModelComparisonCard.tsx` around lines 130 -
137: The visual spinner rendered when status === "pending" is not announced to
assistive tech. Add role="status" to the spinner container (the div with
className "w-4 h-4 ... animate-spin") plus a text alternative such as
aria-label="Loading", or an offscreen <span> with aria-live="polite", so screen
readers receive a clear loading announcement. Keep the change inside the
pending branch so it only applies during that state.


Labels

enhancement New feature or request ready-for-review

Projects

Status: In Review

Development

Successfully merging this pull request may close these issues.

Assessment: Multi-step LLM workflow

2 participants