feat(api): migrate GET /api/apify/runs/{runId} #463
Conversation
Ports the Apify run-status endpoint from the legacy Express service into mono/api as a RESTful Next.js route. Uses the Apify SDK (not raw fetch) to match sibling start-scrape helpers. Wire format renames datasetId -> dataset_id (snake_case). Auth is required via validateAuthContext; no per-account access check (runId is an Apify-scoped identifier, not user-scoped). Does not preserve the legacy silent-error-to-RUNNING behaviour; errors propagate to a clean 500. Row 27 of the Agent API migration.
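The `datasetId` -> `dataset_id` wire-format rename described above can be illustrated with a minimal mapper. This is a sketch only: `LegacyRunStatus`, `RunStatusResponse`, and `toWireFormat` are hypothetical names, not symbols from this PR; the real handler builds its response inline.

```typescript
// Hypothetical sketch of the wire-format change: the legacy Express service
// returned camelCase `datasetId`; the migrated route returns snake_case
// `dataset_id`, plus an optional `data` array on success.

interface LegacyRunStatus {
  status: string;
  datasetId: string | null;
}

interface RunStatusResponse {
  status: string;
  dataset_id: string | null;
  data?: unknown[];
}

function toWireFormat(legacy: LegacyRunStatus, data?: unknown[]): RunStatusResponse {
  return {
    status: legacy.status,
    dataset_id: legacy.datasetId, // renamed field
    ...(data !== undefined ? { data } : {}), // omit `data` entirely when absent
  };
}

const body = toWireFormat({ status: "SUCCEEDED", datasetId: "c4t9gsY5fAWbX0GNu" });
```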
📝 Walkthrough

A new GET API endpoint is introduced at `/api/apify/runs/{runId}`.
Sequence Diagram

```mermaid
sequenceDiagram
    actor Client
    participant Route as Route Handler<br/>/api/apify/runs/[runId]
    participant Validator as validateGetScraperResultsRequest
    participant Handler as getScraperResultsHandler
    participant Apify as Apify Client
    Client->>Route: GET /api/apify/runs/[runId]
    Route->>Validator: validateGetScraperResultsRequest(request, runId)
    Validator->>Validator: Parse & validate runId (Zod)
    Validator->>Validator: validateAuthContext(request)
    alt Validation or Auth Failed
        Validator-->>Route: NextResponse (400/403)
    else Success
        Validator-->>Route: { runId }
    end
    alt Validation Passed
        Route->>Handler: getScraperResultsHandler(request, runId)
        Handler->>Apify: apifyClient.run(runId).get()
        Apify-->>Handler: Run data (status, defaultDatasetId)
        alt Status is SUCCEEDED & dataset_id exists
            Handler->>Apify: apifyClient.dataset(dataset_id).listItems()
            Apify-->>Handler: Dataset items
            Handler-->>Route: 200 { status, dataset_id, data }
        else Status is FAILED or ABORTED
            Handler-->>Route: 500 { status, dataset_id }
        else Status is SUCCEEDED (no dataset)
            Handler-->>Route: 500 { status, dataset_id }
        else Any other status
            Handler-->>Route: 200 { status, dataset_id }
        end
    end
    Route-->>Client: JSON Response + CORS Headers
```
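The status-to-response branching in the diagram can be expressed framework-free as a pure mapping function. This is a sketch: `mapRunToResponse` and `MappedResponse` are illustrative names, not symbols from this PR.

```typescript
type RunStatus = "SUCCEEDED" | "FAILED" | "ABORTED" | string;

interface MappedResponse {
  httpStatus: number;
  body: { status: RunStatus; dataset_id: string | null; data?: unknown[] };
}

// Mirrors the alt/else branches of the sequence diagram above.
function mapRunToResponse(
  status: RunStatus,
  dataset_id: string | null,
  data?: unknown[],
): MappedResponse {
  if (status === "SUCCEEDED" && dataset_id && data) {
    return { httpStatus: 200, body: { status, dataset_id, data } };
  }
  if (status === "FAILED" || status === "ABORTED") {
    return { httpStatus: 500, body: { status, dataset_id } };
  }
  if (status === "SUCCEEDED") {
    // SUCCEEDED but no dataset (or no items fetched) is treated as an error
    return { httpStatus: 500, body: { status, dataset_id } };
  }
  // RUNNING, READY, etc.: report status, let the client keep polling
  return { httpStatus: 200, body: { status, dataset_id } };
}
```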
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ 3 passed
Actionable comments posted: 5
🧹 Nitpick comments (2)
lib/apify/getScraperResultsHandler.ts (1)
27-63: Split the status response branching into a small helper.
`getScraperResultsHandler` exceeds the 20-line guideline and currently handles validation, Apify orchestration, dataset fetch, and response mapping in one function. Extracting the status-to-response branch would keep the route orchestration easier to maintain. As per coding guidelines, `**/*.{js,ts,tsx,jsx,py,java,cs,go,rb,php}`: "Flag functions longer than 20 lines".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `lib/apify/getScraperResultsHandler.ts` around lines 27-63: the function getScraperResultsHandler is doing both orchestration and response-mapping; extract the status-to-response branching into a small helper (e.g., mapActorStatusToResponse or buildScraperResponse) that takes the actor status object ({ status, dataset_id }), plus optional data (from getDataset) and getCorsHeaders(), and returns the proper NextResponse for each branch (SUCCEEDED with/without dataset_id or data, FAILED/ABORTED -> 500, other -> 200). In practice keep validateGetScraperResultsRequest, getActorStatus and getDataset calls in getScraperResultsHandler but delegate all conditional logic that inspects status and dataset_id to the new helper (reference symbols: getScraperResultsHandler, validateGetScraperResultsRequest, getActorStatus, getDataset, getCorsHeaders); ensure the new helper returns NextResponse and replace the inline branching with a single call to it.

lib/apify/validateGetScraperResultsRequest.ts (1)
6-32: Export the actual Zod schema, not just the shape.
`getScraperResultsParamsSchema` is named and exported as a schema, but Line 6 exports only the object shape. This makes the exported API less reusable and forces Line 32 to recreate the schema. Export the `z.object(...)` directly and infer from it.

♻️ Proposed cleanup
```diff
-export const getScraperResultsParamsSchema = {
+export const getScraperResultsParamsSchema = z.object({
   runId: z.string().min(1).describe("The Apify run identifier from the URL path."),
-};
+});

-export type GetScraperResultsParams = z.infer<z.ZodObject<typeof getScraperResultsParamsSchema>>;
+export type GetScraperResultsParams = z.infer<typeof getScraperResultsParamsSchema>;
@@
-  const parsed = z.object(getScraperResultsParamsSchema).safeParse({ runId });
+  const parsed = getScraperResultsParamsSchema.safeParse({ runId });
```

As per coding guidelines, `lib/**/validate*.ts`: "Create validate functions in `validate<EndpointName>Body.ts` or `validate<EndpointName>Query.ts` files that export both the schema and inferred TypeScript type".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/apify/validateGetScraperResultsRequest.ts` around lines 6 - 32, getScraperResultsParamsSchema currently exports a plain object shape instead of a Zod schema which forces recreate of the schema in validateGetScraperResultsRequest; change getScraperResultsParamsSchema to export the actual Zod object (e.g. const getScraperResultsParamsSchema = z.object({ runId: z.string().min(1).describe(...) })) and update GetScraperResultsParams to infer from z.infer<typeof getScraperResultsParamsSchema>, then in validateGetScraperResultsRequest use getScraperResultsParamsSchema.safeParse({ runId }) (and remove the inline z.object(...) there) so the single exported schema is reused across the file and by callers.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/api/apify/runs/`[runId]/route.ts:
- Around line 13-36: The route lacks route-level tests for the OPTIONS preflight
and the async GET params resolution; add tests that call the exported OPTIONS
function and assert it returns status 200 with headers from getCorsHeaders(),
and add a test that invokes the exported GET function with a mock NextRequest
and a promise-based params object (resolving to { runId }) to ensure the async
params are awaited and that GET delegates to getScraperResultsHandler with the
resolved runId; reference the exported functions OPTIONS and GET and the handler
getScraperResultsHandler (and helper getCorsHeaders) when locating the code to
test.
In `@lib/apify/getActorStatus.ts`:
- Around line 20-25: The getActorStatus function silently returns a fallback
"UNKNOWN" and null dataset when apifyClient.run(...).get() returns undefined;
update getActorStatus to guard for a missing run (the local variable run from
apifyClient.run(runId).get()) and throw an Error (or propagate a descriptive
error) instead of returning a default status, removing the ?? "UNKNOWN" and ??
null fallbacks; update callers/tests by adding a unit/integration test that
simulates apifyClient.run(...).get() returning undefined and asserts that
getActorStatus throws so this regression cannot recur.
In `@lib/apify/getDataset.ts`:
- Around line 12-15: getDataset currently calls
apifyClient.dataset(datasetId).listItems() once and returns only the first page
(default 1000 items); change getDataset to page through results by calling
listItems repeatedly with a page limit (e.g., 1000) and an increasing offset (or
using the API's pagination token) until you've collected result.total items (or
a page returns no items). Accumulate items into an array and return the full
array (or null if initial call fails); update references to
apifyClient.dataset(...).listItems and the getDataset function to implement this
loop and respect result.total and per-page result.items.
In `@lib/apify/getScraperResultsHandler.ts`:
- Around line 64-65: The catch block inside getScraperResultsHandler currently
logs the raw caught value (error); change it to log a sanitized representation
instead: extract and log only safe fields such as error.name, error.message, and
a truncated error.stack (or omit stack in production), and if the error looks
like an HTTP/axios error (presence of config/headers/request/response), remove
or redact sensitive subfields (headers, authorization tokens, cookies, and full
request config) before logging; ensure the symbol getScraperResultsHandler's
catch uses this sanitized object and avoid logging the original error variable
directly.
In `@lib/apify/validateGetScraperResultsRequest.ts`:
- Around line 16-19: Update the validator comment to use account-scoped
terminology: replace "user-" with "account-" (and any other "user"/"entity"
occurrences) so it reads that a `runId` is an Apify-scoped identifier, not an
account- or artist-scoped resource; ensure the doc block in
validateGetScraperResultsRequest.ts consistently uses "account" (or specific
terms like "artist", "workspace", "organization" if applicable) to follow repo
guidelines.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: f0b9d110-1bc1-4f1e-b469-a905e0195ace
⛔ Files ignored due to path filters (4)
- `lib/apify/__tests__/getActorStatus.test.ts` is excluded by `!**/*.test.*`, `!**/__tests__/**` and included by `lib/**`
- `lib/apify/__tests__/getDataset.test.ts` is excluded by `!**/*.test.*`, `!**/__tests__/**` and included by `lib/**`
- `lib/apify/__tests__/getScraperResultsHandler.test.ts` is excluded by `!**/*.test.*`, `!**/__tests__/**` and included by `lib/**`
- `lib/apify/__tests__/validateGetScraperResultsRequest.test.ts` is excluded by `!**/*.test.*`, `!**/__tests__/**` and included by `lib/**`
📒 Files selected for processing (5)
- `app/api/apify/runs/[runId]/route.ts`
- `lib/apify/getActorStatus.ts`
- `lib/apify/getDataset.ts`
- `lib/apify/getScraperResultsHandler.ts`
- `lib/apify/validateGetScraperResultsRequest.ts`
```ts
export async function OPTIONS() {
  return new NextResponse(null, {
    status: 200,
    headers: getCorsHeaders(),
  });
}

/**
 * GET /api/apify/runs/{runId}
 *
 * Returns the status (and, on SUCCEEDED, the dataset items) of an Apify actor
 * run. Authentication is required via `x-api-key` or `Authorization: Bearer`.
 *
 * @param request - The incoming request.
 * @param options - Route options containing params.
 * @param options.params - Route params containing the Apify `runId`.
 * @returns A NextResponse with `{ status, dataset_id, data? }` shape.
 */
export async function GET(
  request: NextRequest,
  { params }: { params: Promise<{ runId: string }> },
) {
  const { runId } = await params;
  return getScraperResultsHandler(request, runId);
}
```
🧩 Analysis chain
🏁 Script executed:

```bash
#!/bin/bash
# Description: Search for tests that exercise the new /api/apify/runs route surface.
# Expectation: Tests should cover OPTIONS and GET delegation/async params behavior.
rg -nP 'api/apify/runs|\bOPTIONS\s*\(|\bGET\s*\(' --glob '*.{test,spec}.{ts,tsx}' -C 3
```

Repository: recoupable/api. Length of output: 9747

🏁 Script executed:

```bash
# Check if route-level test file exists for the apify runs endpoint
find . -path "*/api/apify/runs*/__tests__/route*" -o -path "*/api/apify/runs*/*.test.ts" 2>/dev/null | head -20
```

Repository: recoupable/api. Length of output: 40

🏁 Script executed:

```bash
# Also check the directory structure of the apify runs endpoint
ls -la app/api/apify/runs/\[runId\]/ 2>/dev/null || find app/api/apify -type d -name "runs" | xargs ls -la 2>/dev/null | head -30
```

Repository: recoupable/api. Length of output: 226
Add route-level tests for OPTIONS and async GET parameter handling.
This endpoint lacks route-level test coverage. Handler tests cover the business logic, but the route surface—CORS preflight via OPTIONS and async parameter resolution—should have dedicated tests per project pattern. Similar routes (e.g., app/api/admins/sandboxes) include tests for both. Add tests covering the OPTIONS response status and CORS headers, and verify the async params resolution is exercised before delegation to the handler.
```ts
export async function getActorStatus(runId: string) {
  const run = await apifyClient.run(runId).get();
  return {
    status: run?.status ?? "UNKNOWN",
    dataset_id: run?.defaultDatasetId ?? null,
  };
}
```
🧩 Analysis chain
🌐 Web query:
Apify JavaScript client run(runId).get() behavior when the run ID does not exist: does it return undefined/null or throw?
💡 Result:
The Apify JavaScript client method run(runId).get returns undefined when the run ID does not exist, rather than throwing an error.
Citations:
- 1: https://docs.apify.com/api/client/js/reference/class/RunClient
- 2: https://docs.apify.com/api/client/js/reference/next/class/RunClient
🏁 Script executed:

```bash
# Check if there are other related files that handle errors from apifyClient calls
rg "apifyClient" --type ts --type tsx -B 2 -A 2 | head -50
```

Repository: recoupable/api. Length of output: 85

🏁 Script executed:

```bash
# Search for how apifyClient is used and error handling patterns
rg "apifyClient" -A 3 -B 1
```

Repository: recoupable/api. Length of output: 8389

🏁 Script executed:

```bash
# Check the test file to see expected behavior
cat -n lib/apify/__tests__/getActorStatus.test.ts
```

Repository: recoupable/api. Length of output: 1494

🏁 Script executed:

```bash
# Check the current implementation of getActorStatus.ts
cat -n lib/apify/getActorStatus.ts
```

Repository: recoupable/api. Length of output: 1339
Align with documented error-propagation design: throw on missing Apify runs.
Lines 23-24 return "UNKNOWN" for undefined runs, contradicting the JSDoc (lines 8-14) which explicitly states errors should propagate so handlers respond 500 cleanly. The SDK returns undefined for nonexistent run IDs; silently converting that to "UNKNOWN" masks missing runs as valid states, defeating the design goal of distinguishing real statuses from upstream outages.
Add a guard to throw when the run is missing, and remove the fallback:
Proposed fix

```diff
 export async function getActorStatus(runId: string) {
   const run = await apifyClient.run(runId).get();
+  if (!run) {
+    throw new Error(`Apify run not found: ${runId}`);
+  }
   return {
-    status: run?.status ?? "UNKNOWN",
-    dataset_id: run?.defaultDatasetId ?? null,
+    status: run.status,
+    dataset_id: run.defaultDatasetId ?? null,
   };
 }
```

Also add a test case for the undefined run scenario to prevent regression.
📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```ts
export async function getActorStatus(runId: string) {
  const run = await apifyClient.run(runId).get();
  if (!run) {
    throw new Error(`Apify run not found: ${runId}`);
  }
  return {
    status: run.status,
    dataset_id: run.defaultDatasetId ?? null,
  };
}
```
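The regression test described above can be sketched with a faked client. Everything here is self-contained and illustrative: `ApifyClientLike` and `fakeApifyClient` are hypothetical stand-ins for the real SDK client, and `getActorStatus` is reimplemented with the proposed guard so the test can run in isolation.

```typescript
// Sketch of the requested regression test: when the client returns undefined
// for a nonexistent run, getActorStatus should throw instead of "UNKNOWN".

interface RunLike {
  status: string;
  defaultDatasetId?: string;
}

interface ApifyClientLike {
  run(runId: string): { get(): Promise<RunLike | undefined> };
}

async function getActorStatus(client: ApifyClientLike, runId: string) {
  const run = await client.run(runId).get();
  if (!run) {
    throw new Error(`Apify run not found: ${runId}`);
  }
  return { status: run.status, dataset_id: run.defaultDatasetId ?? null };
}

// Fake client that simulates a missing run (SDK returns undefined).
const fakeApifyClient: ApifyClientLike = {
  run: () => ({ get: async () => undefined }),
};

async function runTest(): Promise<boolean> {
  try {
    await getActorStatus(fakeApifyClient, "nonexistent");
    return false; // should not reach here
  } catch {
    return true; // threw as expected
  }
}
```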
```ts
export async function getDataset(datasetId: string): Promise<unknown[] | null> {
  const result = await apifyClient.dataset(datasetId).listItems();
  if (!result) return null;
  return result.items as unknown[];
}
```
🧩 Analysis chain
🌐 Web query:
Apify JavaScript client DatasetClient listItems default limit pagination result total offset limit
💡 Result:
The Apify JavaScript API client `DatasetClient.listItems` method lists items from a dataset using offset-based pagination. It returns a `PaginatedList` (or `PaginatedIterator`) object with these key properties:

- `items`: Array of dataset items
- `total`: Total number of items available in the dataset
- `offset`: Starting position of the current page (default: 0)
- `count`: Number of items returned in this response
- `limit`: Number of items requested per page

Parameters (from `DatasetClientListItemOptions`):

- `offset` (number, optional): Position of first returned entry (default: 0)
- `limit` (number, optional): Maximum number of entries requested. Default/maximum not explicitly stated in docs, but the API endpoint limits it to 1000.
- `chunkSize` (number, optional): For async iteration, max items per API call (respects API limits)
- `desc` (boolean): Sort in descending order
- Other filters: `fields`, `clean`, `skipEmpty`, etc.

Examples:

```ts
const { items, total } = await datasetClient.listItems({ limit: 100 });
const { items } = await datasetClient.listItems({ offset: 100, limit: 50, desc: true });
```

For full iteration:

```ts
for await (const item of datasetClient.listItems({ limit: 1000 })) {
  // process item
}
```

The default limit is unspecified but practically 1000 max per call per the API docs. Use `offset`/`limit` for pagination, and `total` to know when to stop (offset + count >= total).
Citations:
- 1: https://docs.apify.com/api/client/js/reference/class/DatasetClient
- 2: https://github.com/apify/apify-client-js
- 3: https://docs.apify.com/api/client/js/reference/next/class/DatasetClient
- 4: https://docs.apify.com/api/client/js/docs/concepts/pagination.md
- 5: https://docs.apify.com/api/client/js/docs/concepts/pagination
- 6: https://docs.apify.com/api/client/js/reference/interface/DatasetClientListItemOptions
🏁 Script executed:

```bash
# Find and check package.json for Apify SDK version
fd -e package.json | head -5 | xargs -I {} grep -l "apify" {} 2>/dev/null || grep -r "apify" package.json 2>/dev/null | head -10
```

Repository: recoupable/api. Length of output: 40

🏁 Script executed:

```bash
# Also check if package-lock.json exists for exact version
git ls-files | grep -E "package(-lock)?\.json" | head -3
```

Repository: recoupable/api. Length of output: 70

🏁 Script executed:

```bash
# Search for usage of getDataset throughout the codebase
rg "getDataset" --type ts --type tsx -B 2 -A 2
```

Repository: recoupable/api. Length of output: 85

🏁 Script executed:

```bash
# Check if there are other pagination patterns in lib/apify/
fd . lib/apify/ -e ts -e tsx | head -10 | xargs -I {} sh -c 'echo "=== {} ===" && cat -n {}'
```

Repository: recoupable/api. Length of output: 16869

🏁 Script executed:

```bash
cat package.json | grep -A 2 -B 2 "apify"
```

Repository: recoupable/api. Length of output: 204
Add pagination loop to fetch all dataset items beyond the 1000-item default limit.
The Apify SDK's listItems() defaults to limit 1000, returning only the first page. For datasets with >1000 items, this silently truncates results without error. The response includes a total field indicating available items—checking this reveals the truncation risk.
Implement pagination to accumulate all items across pages:
♻️ Pagination fix

```diff
 export async function getDataset(datasetId: string): Promise<unknown[] | null> {
-  const result = await apifyClient.dataset(datasetId).listItems();
-  if (!result) return null;
-  return result.items as unknown[];
+  const items: unknown[] = [];
+  const limit = 1000;
+  let offset = 0;
+
+  while (true) {
+    const result = await apifyClient.dataset(datasetId).listItems({ offset, limit });
+    if (!result) return null;
+
+    items.push(...(result.items as unknown[]));
+
+    offset += result.items.length;
+    if (result.items.length === 0 || offset >= result.total) break;
+  }
+
+  return items;
 }
```
```ts
} catch (error) {
  console.error("[ERROR] getScraperResultsHandler error:", error);
```
Avoid logging raw caught errors.
Line 65 logs the full thrown value. Upstream/auth/client errors can carry headers, tokens, request config, or other sensitive metadata. Log sanitized fields instead.
🛡️ Proposed sanitized logging

```diff
 } catch (error) {
-  console.error("[ERROR] getScraperResultsHandler error:", error);
+  const message = error instanceof Error ? error.message : String(error);
+  console.error("[ERROR] getScraperResultsHandler error:", { message });
   return NextResponse.json(
```

📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```ts
} catch (error) {
  const message = error instanceof Error ? error.message : String(error);
  console.error("[ERROR] getScraperResultsHandler error:", { message });
```
3 issues found across 9 files

Confidence score: 3/5

- There is concrete API behavior risk in `lib/apify/getActorStatus.ts`: missing Apify runs can fall through to `UNKNOWN` and return HTTP 200 for nonexistent `runId`s, which can mislead clients and mask errors.
- `lib/apify/getScraperResultsHandler.ts` is flagged for missing rate-limiting on a scraping request path, creating operational/abuse risk and inconsistency with the project's API rules.
- The `app/api/apify/runs/[runId]/route.ts` multi-export finding may be partly convention/framework-driven, but with severity 7/10 items present, this sits in a moderate-risk range rather than a clearly safe merge.
- Pay close attention to `lib/apify/getActorStatus.ts`, `lib/apify/getScraperResultsHandler.ts`, and `app/api/apify/runs/[runId]/route.ts`: correct status handling for missing runs, enforce rate limiting, and validate route export conventions.
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="app/api/apify/runs/[runId]/route.ts">
<violation number="1" location="app/api/apify/runs/[runId]/route.ts:13">
P1: Custom agent: **Module should export a single primary function whose name matches the filename**
Module violates single-primary-export rule by exporting multiple top-level functions (`OPTIONS`, `GET`) and none matches filename basename `route`.</violation>
</file>
<file name="lib/apify/getScraperResultsHandler.ts">
<violation number="1" location="lib/apify/getScraperResultsHandler.ts:32">
P1: Custom agent: **API Design Consistency and Maintainability**
Scraping results endpoint is missing rate-limiting enforcement in its request path, violating the rule requiring rate limiting for scraping APIs.</violation>
</file>
<file name="lib/apify/getActorStatus.ts">
<violation number="1" location="lib/apify/getActorStatus.ts:21">
P1: Handle missing Apify runs explicitly instead of defaulting to `UNKNOWN`; otherwise nonexistent `runId`s are returned as HTTP 200.</violation>
</file>
Architecture diagram

```mermaid
sequenceDiagram
    participant Client
    participant API as "mono/api (Next.js)"
    participant Auth as "Auth Service"
    participant Apify as "Apify SDK/API"
    Note over Client,Apify: NEW: migrated route GET /api/apify/runs/{runId}
    Client->>API: GET /api/apify/runs/{runId}
    API->>API: Validate runId (Zod)
    API->>Auth: validateAuthContext(request)
    alt Auth Failed
        Auth-->>Client: 401 Unauthorized
    end
    Auth-->>API: AuthContext (Account/Org)
    Note over API,Apify: Interaction via Apify SDK (CHANGED from raw fetch)
    API->>Apify: getActorStatus(runId)
    Apify-->>API: { status, defaultDatasetId }
    alt status == "SUCCEEDED"
        opt has dataset_id
            API->>Apify: getDataset(dataset_id)
            Apify-->>API: { items }
        end
        alt dataset found
            API-->>Client: 200 OK { status, dataset_id, data: items }
        else dataset missing/null
            API-->>Client: 500 Internal Server Error
        end
    else status == "RUNNING" | "READY"
        API-->>Client: 200 OK { status, dataset_id }
    else status == "FAILED" | "ABORTED"
        API-->>Client: CHANGED: 500 Internal Server Error { status, dataset_id }
    else SDK/Network Error
        API-->>Client: CHANGED: 500 Internal Server Error (No longer masks as RUNNING)
    end
```
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
```ts
 *
 * @returns A NextResponse with CORS headers.
 */
export async function OPTIONS() {
```
P1: Custom agent: Module should export a single primary function whose name matches the filename
Module violates single-primary-export rule by exporting multiple top-level functions (OPTIONS, GET) and none matches filename basename route.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At app/api/apify/runs/[runId]/route.ts, line 13:
<comment>Module violates single-primary-export rule by exporting multiple top-level functions (`OPTIONS`, `GET`) and none matches filename basename `route`.</comment>
<file context>

```diff
@@ -0,0 +1,37 @@
+ *
+ * @returns A NextResponse with CORS headers.
+ */
+export async function OPTIONS() {
+  return new NextResponse(null, {
+    status: 200,
```

</file context>
```ts
  runId: string,
): Promise<NextResponse> {
  try {
    const validated = await validateGetScraperResultsRequest(request, runId);
```
P1: Custom agent: API Design Consistency and Maintainability
Scraping results endpoint is missing rate-limiting enforcement in its request path, violating the rule requiring rate limiting for scraping APIs.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At lib/apify/getScraperResultsHandler.ts, line 32:
<comment>Scraping results endpoint is missing rate-limiting enforcement in its request path, violating the rule requiring rate limiting for scraping APIs.</comment>
<file context>

```diff
@@ -0,0 +1,73 @@
+  runId: string,
+): Promise<NextResponse> {
+  try {
+    const validated = await validateGetScraperResultsRequest(request, runId);
+    if (validated instanceof NextResponse) {
+      return validated;
```

</file context>
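One way to satisfy the rate-limiting rule flagged here is a fixed-window check before the Apify call. The sketch below is illustrative only: it is in-memory (a real deployment would back this with Redis or similar for multi-instance consistency), and the window size, request cap, and `limitKey` choice are assumptions, not project values.

```typescript
// Illustrative fixed-window rate limiter. In-memory only: suitable as a
// sketch, not for multi-instance deployments (use a shared store there).

const WINDOW_MS = 60_000; // 1-minute window (assumed)
const MAX_REQUESTS = 30;  // per key per window (assumed)

const windows = new Map<string, { start: number; count: number }>();

function isRateLimited(limitKey: string, now: number = Date.now()): boolean {
  const w = windows.get(limitKey);
  if (!w || now - w.start >= WINDOW_MS) {
    // First request, or the previous window expired: start a fresh window.
    windows.set(limitKey, { start: now, count: 1 });
    return false;
  }
  w.count += 1;
  return w.count > MAX_REQUESTS;
}

// In the handler, before calling Apify (authContext.accountId is assumed):
//   if (isRateLimited(authContext.accountId)) {
//     return NextResponse.json({ error: "Too many requests" }, { status: 429 });
//   }
```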
- Drop `lib/apify/getActorStatus.ts` and `lib/apify/getDataset.ts` helpers; call the SDK directly from the handler.
- Flatten `getScraperResultsHandler` branching into a single success path plus a shared status-code fallback.
- Declare `getScraperResultsParamsSchema` as a `z.object` directly.
- Trim JSDoc across the PR.
1 issue found across 8 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="lib/apify/getScraperResultsHandler.ts">
<violation number="1" location="lib/apify/getScraperResultsHandler.ts:22">
P2: Handle `undefined` from `apifyClient.run(...).get()` explicitly; otherwise nonexistent runs return 200 with `status: "UNKNOWN"` and mask an error condition.</violation>
</file>
```ts
const status = run?.status ?? "UNKNOWN";
const dataset_id = run?.defaultDatasetId ?? null;
```
P2: Handle undefined from apifyClient.run(...).get() explicitly; otherwise nonexistent runs return 200 with status: "UNKNOWN" and mask an error condition.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At lib/apify/getScraperResultsHandler.ts, line 22:
<comment>Handle `undefined` from `apifyClient.run(...).get()` explicitly; otherwise nonexistent runs return 200 with `status: "UNKNOWN"` and mask an error condition.</comment>
<file context>

```diff
@@ -1,72 +1,39 @@
-    const { status, dataset_id } = await getActorStatus(validated.runId);
+    const run = await apifyClient.run(validated.runId).get();
+    const status = run?.status ?? "UNKNOWN";
+    const dataset_id = run?.defaultDatasetId ?? null;
```

</file context>
Suggested change:

```ts
if (!run) {
  throw new Error("Apify run not found");
}
const status = run.status;
const dataset_id = run.defaultDatasetId ?? null;
```
- Switch validator from `validateAuthContext` to `validateAdminAuth` so only admin accounts can poll Apify run status.
- Run auth before the runId schema check: an unauthenticated request should never reveal param-level errors.
Preview smoke test

Against preview. Results:
End-to-end chain exercised

Triggered scrape on PinkPantheress's Instagram social profile:

```jsonc
// POST /api/socials/02061320-978a-4394-a2c1-6062272683a8/scrape → 200
{ "runId": "VpqICClParjRjKNCf", "datasetId": "c4t9gsY5fAWbX0GNu" }
```

Polling the new endpoint returned SUCCEEDED on the first call with populated `data`:

```jsonc
// GET /api/apify/runs/VpqICClParjRjKNCf → 200
{
  "status": "SUCCEEDED",
  "dataset_id": "c4t9gsY5fAWbX0GNu",
  "data": [
    {
      "inputUrl": "https://www.instagram.com/pinkpantheress",
      "id": "39559476848",
      "username": "pinkpantheress",
      "fullName": "🫀",
      "biography": "",
      "externalUrls": [
        { "title": "VISIT MY STORE 💋🤭🤓❤️", "url": "..." }
      ]
      // ...rest of the Instagram profile scrape
    }
  ]
}
```
}Findings
Not directly exercised (no fixtures available)
🤖 Tested with Claude Code
Ports the Apify run-status endpoint to `GET /api/apify/runs/{runId}` using the Apify SDK; the response renames `datasetId` to `dataset_id` (snake_case). Errors now surface as 500 rather than being masked as `RUNNING` with an empty dataset. Auth required; no per-account access check since `runId` is not an account-scoped resource.

Test plan

- `GET /api/apify/runs/{runId}` with `x-api-key` returns 200 with status and `dataset_id`

Summary by CodeRabbit