@brendan-kellam brendan-kellam commented Nov 13, 2025

After benchmarking our search API, I noticed that the zod parseAsync call in searchApi.ts accounted for a significant portion of total search time (64.6% on average in my benchmarks), especially for larger queries that return payloads over 10MB. colinhacks/zod#205 confirms that others are hitting this with zod as well.

We don't really need to parse zoekt's response bodies (since we always expect them to be valid), so this PR removes the parseAsync call and does a simple cast instead. The results are pretty dramatic: my initial benchmarks show search performance improving by an order of magnitude (89.8% reduction in average search time). Here's the before & after on the benchmark:

Before:
📈 Latency Distribution:
   Min (p0):    407.04ms
   p50:        2660.49ms
   p75:       11770.50ms
   p90:       50848.31ms
   p95:       54193.05ms
   p99:       74902.95ms
   Max (p100):84267.76ms

After:
📈 Latency Distribution:
   Min (p0):    293.31ms
   p50:         569.53ms
   p75:        1855.31ms
   p90:        3682.20ms
   p95:        4733.21ms
   p99:        5561.01ms
   Max (p100): 5805.22ms
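
For a rough per-percentile view of the two distributions above, the relative reduction can be computed directly from the printed values. This is only a sketch: `percentReduction` is a hypothetical helper, and the 89.8% figure quoted earlier refers to the average search time, which is not part of this output, so the per-percentile numbers will differ from it.

```typescript
// Percentile latencies in milliseconds, copied from the Before/After output above.
// Each row is [label, before, after].
const latencyRows: Array<[string, number, number]> = [
  ["p50", 2660.49, 569.53],
  ["p75", 11770.5, 1855.31],
  ["p90", 50848.31, 3682.2],
  ["p95", 54193.05, 4733.21],
  ["p99", 74902.95, 5561.01],
];

// Hypothetical helper: relative reduction from `before` to `after`, in percent.
function percentReduction(before: number, after: number): number {
  return ((before - after) / before) * 100;
}

for (const [label, before, after] of latencyRows) {
  console.log(`${label}: ${percentReduction(before, after).toFixed(1)}% lower`);
}
```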

With this change, I figured we can bump the default number of results requested to 100k (up from 5k!). In my testing, performance was good and I was able to get 100k results in <5 seconds.

Summary by CodeRabbit

  • New Features

    • Added debug timing information to search responses for performance visibility.
  • Changed

    • Increased default search result count from 5,000 to 100,000.
  • Bug Fixes

    • Resolved a significant performance bottleneck in the search API, yielding substantial performance gains.

coderabbitai bot commented Nov 13, 2025

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Search optimization increases default result limits from 5,000 to 100,000 entries. Schema-based validation in the client API is replaced with type assertions. Performance instrumentation and timing measurements are added to the search flow, with debug timings returned in responses.
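
The trade-off between schema validation and a type assertion can be sketched roughly as follows. The `ExampleSearchResponse` shape and both function names are hypothetical and only for illustration; the real response types in this PR are larger. The second function shows one possible middle ground, a cheap top-level check, which is an assumption on my part and not something the PR does.

```typescript
// Hypothetical, trimmed-down response shape used only for this sketch.
interface ExampleSearchResponse {
  files: Array<{ fileName: string; repository: string }>;
}

// Type assertion: no per-field work at runtime, but also no runtime check.
// A malformed body is only noticed later, when a missing field is used.
function parseByCast(body: string): ExampleSearchResponse {
  return JSON.parse(body) as ExampleSearchResponse;
}

// Possible middle ground (not part of the PR): verify only the top-level
// shape, which is cheap compared to validating every nested entry.
function parseWithSpotCheck(body: string): ExampleSearchResponse {
  const value: unknown = JSON.parse(body);
  const files = (value as { files?: unknown } | null)?.files;
  if (!Array.isArray(files)) {
    throw new Error("Unexpected search response shape");
  }
  return value as ExampleSearchResponse;
}
```

With the cast, an error payload such as `{"error": "..."}` is returned as if it were a valid response, whereas the spot check rejects it immediately.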

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Search Configuration & Documentation: CHANGELOG.md, packages/web/src/app/[domain]/search/components/searchResultsPage.tsx | Increased default search result count from 5,000 to 100,000 after optimization pass; documented in changelog. |
| Schema & Type Definitions: packages/web/src/features/search/schemas.ts, packages/web/src/features/search/zoektSchema.ts | Added optional __debug_timings field to search response schema; introduced ZoektSearchResponse type alias. |
| Search API Implementation: packages/web/src/features/search/searchApi.ts | Instrumented search flow with performance measurements for fetch, parsing, and transformation steps; added timing breakdown in response; introduced ZoektSearchResponse handling and transformZoektSearchResponse function. |
| Client API: packages/web/src/app/api/(client)/client.ts | Removed schema-based validation; replaced with type assertions for search, fetchFileSource, getRepos, and getVersion functions; broadened error handling return types. |

Sequence Diagram

sequenceDiagram
    participant Client
    participant SearchAPI as Search API
    participant Zoekt
    participant Response

    Client->>SearchAPI: search({ query, matches, ... })
    
    rect rgb(240, 248, 255)
    Note over SearchAPI: measure: fetch
    SearchAPI->>Zoekt: fetch search results
    Zoekt-->>SearchAPI: raw response
    end
    
    rect rgb(240, 248, 255)
    Note over SearchAPI: measure: parse_json
    SearchAPI->>SearchAPI: JSON.parse(response)
    end
    
    rect rgb(240, 248, 255)
    Note over SearchAPI: measure: transform
    SearchAPI->>SearchAPI: transformZoektSearchResponse()
    end
    
    SearchAPI->>Response: SearchResponse + __debug_timings
    Response-->>Client: { results, __debug_timings }
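
The measured steps in the diagram (fetch, parse_json, transform) could be instrumented along these lines. This is a minimal sketch, not the code from searchApi.ts: the `measure` helper, `fakeFetchBody`, and `searchWithTimings` are hypothetical names, and the clock defaults to `Date.now` for portability (a higher-resolution clock such as `performance.now` can be passed in instead).

```typescript
type Timings = Record<string, number>;

// Hypothetical helper: runs `fn`, records its wall-clock duration in
// milliseconds under `label`, and returns the result. The duration is
// recorded even if `fn` throws.
async function measure<T>(
  label: string,
  timings: Timings,
  fn: () => Promise<T>,
  now: () => number = Date.now,
): Promise<T> {
  const start = now();
  try {
    return await fn();
  } finally {
    timings[label] = now() - start;
  }
}

// Hypothetical stand-in for the HTTP request to Zoekt.
async function fakeFetchBody(): Promise<string> {
  return JSON.stringify({ files: [{ fileName: "main.go" }] });
}

async function searchWithTimings() {
  const timings: Timings = {};
  const body = await measure("fetch", timings, fakeFetchBody);
  const parsed = await measure("parse_json", timings, async () =>
    JSON.parse(body) as { files: Array<{ fileName: string }> },
  );
  const results = await measure("transform", timings, async () =>
    parsed.files.map((file) => file.fileName),
  );
  // Timings travel with the response, mirroring the __debug_timings field.
  return { results, __debug_timings: timings };
}
```

Collecting the timings in a plain object keeps the helper independent of any logging setup, so the breakdown can simply be attached to the response.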

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

  • packages/web/src/features/search/searchApi.ts: Significant refactoring with performance instrumentation, new response transformation pathway, and error handling updates—requires careful review of timing measurement logic and response shape changes.
  • packages/web/src/app/api/(client)/client.ts: Wholesale removal of schema-based validation across multiple functions in favor of type assertions—needs verification that this doesn't introduce runtime validation gaps.
  • packages/web/src/features/search/zoektSchema.ts: Appears to contain duplicate ZoektSearchResponse type definitions that should be consolidated.

Suggested reviewers

  • msukkari

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'fix(web): Search performance improvements' directly relates to the main objective of this PR, which is to optimize search performance by removing expensive zod validation and increasing result counts. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

@brendan-kellam

@coderabbitai review

coderabbitai bot commented Nov 13, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/web/src/app/api/(client)/client.ts (1)

43-52: Restore ServiceError handling when fetching repos.
Line 51 currently casts the raw JSON to GetReposResponse, so a backend ServiceError payload comes back as if it were a repo list and downstream code will explode when it dereferences repo fields. Please propagate the error by updating the signature and checking isServiceError before casting.

Apply this diff:

-export const getRepos = async (): Promise<GetReposResponse> => {
+export const getRepos = async (): Promise<GetReposResponse | ServiceError> => {
     const result = await fetch("/api/repos", {
         method: "GET",
         headers: {
             "Content-Type": "application/json",
         },
     }).then(response => response.json());

-    return result as GetReposResponse | ServiceError;
+    if (isServiceError(result)) {
+        return result;
+    }
+
+    return result as GetReposResponse;
 }
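
For context, a type guard like the `isServiceError` used in the diff above might look roughly like this. The `ServiceError` fields (`statusCode`, `errorCode`, `message`), the `Repo` type, and `toReposOrError` are assumptions made for illustration; the project's actual definitions may differ.

```typescript
// Assumed error shape for this sketch only; the project's real type may differ.
interface ServiceError {
  statusCode: number;
  errorCode: string;
  message: string;
}

// Hypothetical repo shape, standing in for the entries of GetReposResponse.
type Repo = { name: string };

// Runtime check that narrows an unknown JSON value to ServiceError.
function isServiceError(value: unknown): value is ServiceError {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const candidate = value as Record<string, unknown>;
  return (
    typeof candidate["statusCode"] === "number" &&
    typeof candidate["errorCode"] === "string" &&
    typeof candidate["message"] === "string"
  );
}

// Check for the error payload first; only then fall back to the cast.
function toReposOrError(json: unknown): Repo[] | ServiceError {
  if (isServiceError(json)) {
    return json;
  }
  return json as Repo[];
}
```

Callers then have to narrow the union before touching repo fields, which is what makes the error path visible at compile time.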
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 06c84f0 and d4ae3ff.

📒 Files selected for processing (6)
  • CHANGELOG.md (1 hunks)
  • packages/web/src/app/[domain]/search/components/searchResultsPage.tsx (1 hunks)
  • packages/web/src/app/api/(client)/client.ts (4 hunks)
  • packages/web/src/features/search/schemas.ts (1 hunks)
  • packages/web/src/features/search/searchApi.ts (4 hunks)
  • packages/web/src/features/search/zoektSchema.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*

📄 CodeRabbit inference engine (.cursor/rules/style.mdc)

Filenames should always be camelCase. Exception: if there are filenames in the same directory with a format other than camelCase, use that format to keep things consistent.

Files:

  • packages/web/src/features/search/schemas.ts
  • CHANGELOG.md
  • packages/web/src/app/api/(client)/client.ts
  • packages/web/src/app/[domain]/search/components/searchResultsPage.tsx
  • packages/web/src/features/search/zoektSchema.ts
  • packages/web/src/features/search/searchApi.ts
🪛 LanguageTool
CHANGELOG.md

[grammar] ~18-~18: Ensure spelling is correct
Context: ... in search api, resulting in a order of magnitutde performance improvement. [#615](https:/...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build

@brendan-kellam brendan-kellam merged commit a814bd6 into main Nov 13, 2025
9 checks passed
@brendan-kellam brendan-kellam deleted the bkellam/search_perf_improvements branch November 13, 2025 07:20