Conversation
Add a cursor-paginated API endpoint that returns which documentation pages have changed since a given date, using Elasticsearch search_after for efficient deep pagination with no 10k hit limit. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…OT compatibility The Lambda NativeAOT build failed because JsonSerializer.Deserialize<T> requires runtime code generation. Use JsonDocument.Parse instead, which is fully AOT-compatible. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Use params overload for .Sort() so both sort fields (LastUpdated, Url) are applied instead of the second overwriting the first - Add SearchTitle and Type to source filter to prevent deserialization failures on required DocumentationDocument properties - Handle ES returning date sort values as double with TryGetDouble fallback Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Mpdreamz
left a comment
There was a problem hiding this comment.
Nice!
This should use PIT explicitly, multiple queries can share the same one.
That way deploys do not interfere with this API
…ation Ensures consistent results across paginated requests by opening a PIT on the first page and threading its ID through the cursor for subsequent pages. Auto-recovers from expired PITs by opening a fresh one and retrying without search_after. Old 2-element cursors are backward compatible. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdded GET /changes (query: since, optional cursor, optional size). Implemented ChangesUsecase (page-size clamping, tolerant base64url JSON cursor decode/encode) and public DTOs for request/response/changed pages. Added IChangesGateway and an Elasticsearch-backed ChangesGateway that uses PIT + search_after, retries once on expired/missing PIT, and produces next-page cursors. Introduced SharedPointInTimeManager to manage PIT lifecycle, registered DI services for the changes flow, and added JSON source-gen entries for the new response DTOs. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/services/Elastic.Documentation.Search/ChangesGateway.cs`:
- Around line 41-51: The current ChangesGateway flow logs but continues when
response.IsValidResponse is false, which leads to a 200 OK with empty/truncated
data; change the post-request handling in ChangesGateway (the block that reads
response.IsValidResponse and then calls BuildResult) to surface failures: if
response.IsValidResponse is still false after retries, throw a descriptive
exception (or return a failed result type) including
response.ElasticsearchServerError?.Error.Reason, response.DebugInformation and
response.HttpStatusCode so the caller (ChangesUsecase.cs /
MappingsExtensions.cs) receives an error instead of a 200; ensure the thrown
type is appropriate for your API error handling pipeline so it becomes an HTTP
error response.
- Around line 34-38: The retry path for an expired PIT must not reset pagination
to the beginning; instead, when IsExpiredPit(response) and request.Cursor?.PitId
is not null, log via LogPitExpired, obtain a new pitId via OpenPit(ctx), and
call Search with the original request.Cursor's sort/position preserved but with
its PitId replaced by the new pitId (i.e., pass request with Cursor updated to
new PitId and same sort/last-position) rather than request with Cursor = null;
alternatively, if you cannot reconstruct the prior sort position, surface a
clear error so the client can explicitly restart pagination. Ensure you update
the call to Search(request with { Cursor = updatedCursor }, pitId, fetchSize,
ctx) and preserve symbols request.Cursor, pitId, Search, IsExpiredPit, OpenPit,
LogPitExpired.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 4029dba8-eb52-421d-ac11-beaa8b146b02
📒 Files selected for processing (6)
src/api/Elastic.Documentation.Api.Core/Changes/ChangesUsecase.cssrc/api/Elastic.Documentation.Api.Core/Changes/IChangesGateway.cssrc/api/Elastic.Documentation.Api.Core/SerializationContext.cssrc/api/Elastic.Documentation.Api.Infrastructure/MappingsExtensions.cssrc/services/Elastic.Documentation.Search/ChangesGateway.cssrc/services/Elastic.Documentation.Search/ServicesExtension.cs
When a PIT expired mid-pagination, the retry was resetting the cursor to null, silently restarting from the beginning. Now the retry preserves the original search_after sort values and only replaces the PIT ID. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…y data Previously, a failed search response was logged as a warning but execution continued to BuildResult, returning HTTP 200 with empty/truncated data. Now throws InvalidOperationException so the failure surfaces as an error to the caller. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
♻️ Duplicate comments (1)
src/services/Elastic.Documentation.Search/ChangesGateway.cs (1)
42-46:⚠️ Potential issue | 🔴 CriticalES failures still return 200 OK with empty/truncated data.
When
response.IsValidResponseis false after retry, this logs a warning but continues toBuildResult. Callers receive a successful response hiding backend outages.Throw an exception so the API layer can return an appropriate error status:
Proposed fix
if (!response.IsValidResponse) { - logger.LogWarning("Elasticsearch changes query returned invalid response. Reason: {Reason}", - response.ElasticsearchServerError?.Error.Reason ?? "Unknown"); + var reason = response.ElasticsearchServerError?.Error.Reason ?? "Unknown"; + logger.LogError("Elasticsearch changes query failed. Reason: {Reason}", reason); + throw new InvalidOperationException($"Changes query failed: {reason}"); },
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/services/Elastic.Documentation.Search/ChangesGateway.cs` around lines 42 - 46, The current code in ChangesGateway (when checking response.IsValidResponse) only logs a warning and proceeds to BuildResult, hiding ES failures; modify the error path so that when response.IsValidResponse is false (after retries) you throw an exception (e.g., InvalidOperationException or a domain-specific exception) including the Elasticsearch error reason and any response metadata instead of continuing to BuildResult, so the API layer can translate it into an error status; update the method that performs the query (the method containing response, in ChangesGateway) to throw with a clear message and include response.ElasticsearchServerError?.Error.Reason ?? "Unknown" and relevant context.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/services/Elastic.Documentation.Search/ChangesGateway.cs`:
- Around line 42-46: The current code in ChangesGateway (when checking
response.IsValidResponse) only logs a warning and proceeds to BuildResult,
hiding ES failures; modify the error path so that when response.IsValidResponse
is false (after retries) you throw an exception (e.g., InvalidOperationException
or a domain-specific exception) including the Elasticsearch error reason and any
response metadata instead of continuing to BuildResult, so the API layer can
translate it into an error status; update the method that performs the query
(the method containing response, in ChangesGateway) to throw with a clear
message and include response.ElasticsearchServerError?.Error.Reason ?? "Unknown"
and relevant context.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 4bae2922-2f5a-4c07-a785-09e843860fa8
📒 Files selected for processing (1)
src/services/Elastic.Documentation.Search/ChangesGateway.cs
Added Point In Time (PIT) support to the changes endpoint for consistent pagination across requests. The PIT ID is threaded through the cursor and automatically refreshed if it expires, resuming from the same position. |
| ) | ||
| ) | ||
| ); | ||
| var pitId = await ResolvePitId(request.Cursor?.PitId, ctx); |
There was a problem hiding this comment.
Thanks for tackling this!
I think this requires a bit more work, e.g
- We have one static pit id registration in the class.
- If it expires using locks, only one requests updates or starts a pit id.
That way we don't have to communicate the PIT back the to user or start multiple if many consumers are scrolling this API.
There was a problem hiding this comment.
Switched to a shared PIT. This works because last_updated only ever increases, so if the PIT refreshes mid-session the cursor just moves forward. Worst case, you get some duplicates, never missed docs.
Extract PIT lifecycle from ChangesGateway into SharedPointInTimeManager, a thread-safe singleton that holds a single PIT for all concurrent paginators. Reduces ES resource usage for the public changes API. The cursor no longer carries a PIT ID — it's just [epochMs, url]. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ption Address CodeQL finding by catching only OperationCanceledException in DisposeAsync instead of generic Exception, letting unexpected errors propagate rather than being silently swallowed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/services/Elastic.Documentation.Search/SharedPointInTimeManager.cs`:
- Around line 85-100: The early return when (_pitId is null) skips the finally
block and leaks _semaphore; ensure _semaphore is disposed even when no PIT was
opened by disposing it unconditionally. Modify the method in
SharedPointInTimeManager so that you do not return before disposing _semaphore
(e.g., remove the early return and wrap the PIT-close logic in a try/finally
that always calls _semaphore.Dispose(), or explicitly call _semaphore.Dispose()
before returning when _pitId is null), keeping references to _pitId,
clientAccessor.Client.ClosePointInTimeAsync, and LogPitClosed intact.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 280fb2a9-6858-45fe-ac68-08814cd5c978
📒 Files selected for processing (1)
src/services/Elastic.Documentation.Search/SharedPointInTimeManager.cs
src/services/Elastic.Documentation.Search/SharedPointInTimeManager.cs
Outdated
Show resolved
Hide resolved
Move the _pitId null check inside the try block so the finally block always executes, preventing _semaphore from leaking when DisposeAsync is called without a PIT having been opened. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| { | ||
| if (_pitId is not null && DateTimeOffset.UtcNow < _expiresAt) | ||
| return _pitId; | ||
|
|
There was a problem hiding this comment.
I think it makes sense to handle expired pits here and not in the call site. That way GetPitIdAsync() is the only method that needs to be public.
Merge HandleExpiredPitAsync into GetPitIdAsync via an optional expiredPitId parameter so call sites no longer need to orchestrate the invalidate-then-reopen dance. This makes GetPitIdAsync the single public entry point for PIT lifecycle management. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
GET /v1/changes?since=<ISO8601>&size=100&cursor=<token>API endpoint that returns which documentation pages have changed since a given datesearch_afterfor cursor-based pagination with no 10k hit limiturl,title,lastUpdatedper changed page, withhasMoreandnextCursorfor paginationHow to use
Get the first page of changes
Response:
{ "pages": [ { "url": "/docs/getting-started", "title": "Getting Started", "lastUpdated": "2026-03-15T10:30:00Z" }, { "url": "/docs/api-reference", "title": "API Reference", "lastUpdated": "2026-03-16T14:00:00Z" } ], "hasMore": true, "nextCursor": "eyJhYmM..." }Get subsequent pages
Pass the
nextCursorvalue back as thecursorparameter:Repeat until
hasMoreisfalse.Parameters
sincesizecursornextCursor.Notes
SharedPointInTimeManager, so all concurrent paginators share one PIT rather than each opening their own. This reduces ES resource consumption.last_updatedcan only increase).Details
New files:
Api.Core/Changes/IChangesGateway.cs— gateway interface + domain recordsApi.Core/Changes/ChangesUsecase.cs— usecase with Base64Url cursor encode/decode + API modelsSearch/ChangesGateway.cs— Elasticsearch range query + search_after paginationSearch/SharedPointInTimeManager.cs— thread-safe singleton managing a shared PIT lifecycle (open, refresh, expire, close on shutdown)Modified files:
SerializationContext.cs— AOT serialization registrationMappingsExtensions.cs— endpoint mappingSearch/ServicesExtension.cs— DI registrationTest plan
sinceparameter is missingsincedatehasMore: falseand no cursor[epochMs, url](no PIT ID)🤖 Generated with Claude Code