Conversation
7cdaa79 to f745f14 (Compare)
8d4811b to 0a249a3 (Compare)
@cubic-dev-ai review pls
@mogery I've started the AI code review. It'll take a few minutes to complete.
6 issues found across 51 files
Prompt for AI agents (all 6 issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="apps/api/src/lib/gcs-jobs.ts">
<violation number="1" location="apps/api/src/lib/gcs-jobs.ts:215">
Map results upload retries swallow every error, allowing all attempts to fail without logging or throwing, so the data is never written but the caller still sees success.</violation>
</file>
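A minimal sketch of the shape such a fix could take, assuming hypothetical uploadMapResults and logger helpers rather than the actual gcs-jobs.ts API: log each failed attempt and rethrow after the last one so the caller no longer sees a silent success.

// Sketch only: uploadMapResults and logger are assumed stand-ins, not the real gcs-jobs.ts helpers.
type Logger = { warn: (msg: string, meta?: Record<string, unknown>) => void };

async function saveMapResultsWithRetry(
  uploadMapResults: (payload: unknown) => Promise<void>,
  logger: Logger,
  payload: unknown,
  maxAttempts = 3,
): Promise<void> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await uploadMapResults(payload);
      return;
    } catch (error) {
      lastError = error;
      logger.warn("Map results upload attempt failed", { attempt, error });
    }
  }
  // Every attempt failed: surface the error instead of returning as if the write succeeded.
  throw lastError;
}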
<file name="apps/api/src/controllers/v2/map.ts">
<violation number="1" location="apps/api/src/controllers/v2/map.ts:58">
logRequest no longer stores `req.body.integration`, so integration-originated map requests cannot be attributed in the new requests table.</violation>
</file>
<file name="apps/api/src/controllers/v1/map.ts">
<violation number="1" location="apps/api/src/controllers/v1/map.ts:465">
logMap returns a promise and is never awaited here, so any failure inside logMap (e.g., saveMapToGCS rejecting) becomes an unhandled rejection and the map log may be lost. Await or explicitly handle the promise.</violation>
</file>
<file name="apps/api/src/services/worker/crawl-logic.ts">
<violation number="1" location="apps/api/src/services/worker/crawl-logic.ts:157">
Batch scrapes now record `credits_cost = 0` whenever `credits_billed` is still null, so missing billing data is indistinguishable from a free job. Preserve the null (or defer logging) instead of forcing zero to avoid corrupting billing records.</violation>
</file>
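A sketch of the distinction the comment asks to preserve, with field names taken from the review text rather than the actual crawl-logic.ts code: keep credits_cost null while credits_billed is still unknown, so missing billing data never reads as a free job.

// Sketch only: the record shape mirrors the review comment, not the real logging call.
interface BatchScrapeLogFields {
  credits_cost: number | null;
}

function creditsCostForLog(creditsBilled: number | null | undefined): BatchScrapeLogFields {
  // null stays null: "billing not recorded yet" is kept distinct from "cost of 0".
  return { credits_cost: creditsBilled ?? null };
}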
<file name="apps/api/src/lib/extract/extraction-service.ts">
<violation number="1" location="apps/api/src/lib/extract/extraction-service.ts:682">
Persist the actual exception message when logging the multi-entity transform failure so the stored extract record contains actionable diagnostics instead of a generic string.</violation>
<violation number="2" location="apps/api/src/lib/extract/extraction-service.ts:792">
Log the actual scrape exception message instead of a generic string so extract failures remain diagnosable.</violation>
</file>
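Both violations point at the same pattern; a sketch of it, with logExtract and the error field treated as assumptions rather than the real extraction-service.ts API: derive a message from the caught exception and store that instead of a fixed string.

// Sketch only: logExtract and the error field are illustrative, not the extraction-service.ts API.
function describeError(error: unknown): string {
  return error instanceof Error ? error.message : String(error);
}

// e.g. in the catch block:
// await logExtract({ ...record, error: `Multi-entity transform failed: ${describeError(error)}` });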
Reply to cubic to teach it or ask questions. Re-run a review with @cubic-dev-ai review this PR
4 issues found across 68 files
Prompt for AI agents (all 4 issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="apps/api/src/controllers/v0/admin/crawl-check.ts">
<violation number="1" location="apps/api/src/controllers/v0/admin/crawl-check.ts:173">
`status` is now always `finished` because the code treats the mere presence of a `requests` row as completion, but those rows are created when the crawl is enqueued, so active crawls no longer report their real status.</violation>
</file>
<file name="apps/api/src/controllers/v1/crawl-status.ts">
<violation number="1" location="apps/api/src/controllers/v1/crawl-status.ts:71">
`getJobs` no longer provides any persisted document payload when the queue job or GCS blob is missing, so websocket consumers stop receiving completed crawl documents even though the scrapes still exist in Supabase.</violation>
</file>
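One possible shape for the fallback the comment describes, kept deliberately abstract because the real getJobs sources are not shown here; every name below is hypothetical.

// Sketch only: all three source functions are hypothetical stand-ins for the real lookups.
async function getJobDocument(
  jobId: string,
  sources: {
    fromQueue: (id: string) => Promise<unknown | null>;
    fromGCS: (id: string) => Promise<unknown | null>;
    fromSupabase: (id: string) => Promise<unknown | null>;
  },
): Promise<unknown | null> {
  // Fall back to the persisted Supabase copy so websocket consumers still receive
  // completed documents when the queue job or GCS blob is gone.
  return (
    (await sources.fromQueue(jobId)) ??
    (await sources.fromGCS(jobId)) ??
    (await sources.fromSupabase(jobId))
  );
}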
<file name="apps/api/src/controllers/v0/search.ts">
<violation number="1" location="apps/api/src/controllers/v0/search.ts:245">
`logSearch` is async and can throw, but the controller fire-and-forgets it, so any logging/GCS failure produces an unhandled promise rejection. Await the call (or at least add a `.catch`) so failures are captured instead of crashing the process.</violation>
</file>
<file name="apps/api/src/controllers/v0/admin/zdrcleaner.ts">
<violation number="1" location="apps/api/src/controllers/v0/admin/zdrcleaner.ts:141">
Requests are removed from the cleanup schedule after the first successful blob deletion, so blobs that failed to delete are never retried. Defer clearing `dr_clean_by` until all blobs for a request delete successfully.</violation>
</file>
Reply to cubic to teach it or ask questions. Re-run a review with @cubic-dev-ai review this PR
batch.map(async (row: { request_id: string; blob_id: string }) => {
  try {
    await cleanUpJob(row.blob_id);
    cleanedUpRequestIds.add(row.request_id);
Requests are removed from the cleanup schedule after the first successful blob deletion, so blobs that failed to delete are never retried. Defer clearing dr_clean_by until all blobs for a request delete successfully.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At apps/api/src/controllers/v0/admin/zdrcleaner.ts, line 141:
<comment>Requests are removed from the cleanup schedule after the first successful blob deletion, so blobs that failed to delete are never retried. Defer clearing `dr_clean_by` until all blobs for a request delete successfully.</comment>
<file context>
@@ -97,13 +100,102 @@ async function cleanUp(specificTeamId: string | null, _logger: Logger) {
+ batch.map(async (row: { request_id: string; blob_id: string }) => {
+ try {
+ await cleanUpJob(row.blob_id);
+ cleanedUpRequestIds.add(row.request_id);
+ } catch (error) {
+ logger.error(`Error cleaning up blob`, {
</file context>
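A sketch of the deferred-clearing approach, reusing cleanUpJob from the context above; the per-request bookkeeping is an assumption: collect failures first and only treat a request as cleaned when none of its blobs failed.

// Sketch only: cleanUpJob mirrors the helper in the context above; the rest is illustrative.
async function cleanUpBatch(
  batch: { request_id: string; blob_id: string }[],
  cleanUpJob: (blobId: string) => Promise<void>,
  logError: (msg: string, meta: Record<string, unknown>) => void,
): Promise<Set<string>> {
  const failedRequestIds = new Set<string>();

  await Promise.all(
    batch.map(async row => {
      try {
        await cleanUpJob(row.blob_id);
      } catch (error) {
        failedRequestIds.add(row.request_id);
        logError("Error cleaning up blob", { error, blobId: row.blob_id });
      }
    }),
  );

  // Only requests whose blobs all deleted leave the schedule; the rest keep
  // dr_clean_by set and are retried on a later pass.
  const cleanedUpRequestIds = new Set<string>();
  for (const row of batch) {
    if (!failedRequestIds.has(row.request_id)) {
      cleanedUpRequestIds.add(row.request_id);
    }
  }
  return cleanedUpRequestIds;
}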
✅ Addressed in 6af8ba8
1 issue found across 1 file (reviewed changes from recent commits).
Prompt for AI agents (1 issue)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="apps/api/src/controllers/v0/search.ts">
<violation number="1" location="apps/api/src/controllers/v0/search.ts:245">
P2: Awaiting `logSearch` makes successful searches fail whenever the analytics insert/upload throws, so a logging hiccup now returns HTTP 500 to the client. Keep the logging best-effort instead of blocking the response.</violation>
</file>
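A sketch of the best-effort pattern the comment recommends, with logSearch and logger as assumed stand-ins for the controller's real helpers: don't await the analytics write on the response path, but attach a .catch so a logging failure is recorded rather than becoming an unhandled rejection or a 500.

// Sketch only: logSearch and logger are assumed stand-ins for the controller's helpers.
function logSearchBestEffort(
  logSearch: (entry: Record<string, unknown>) => Promise<void>,
  logger: { error: (msg: string, meta?: Record<string, unknown>) => void },
  entry: Record<string, unknown>,
): void {
  // Fire-and-forget, but with the rejection handled so the request still returns 200.
  void logSearch(entry).catch(error => {
    logger.error("Failed to log search", { error });
  });
}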
Reply to cubic to teach it or ask questions. Re-run a review with @cubic-dev-ai review this PR
794fbdc to 3209e88 (Compare)
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
0be3b16 to b82bee6 (Compare)
@cubic-dev-ai please re-review
@mogery I've started the AI code review. It'll take a few minutes to complete.
4 issues found across 72 files
Prompt for AI agents (all 4 issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="apps/api/src/controllers/v0/admin/crawl-migration.ts">
<violation number="1" location="apps/api/src/controllers/v0/admin/crawl-migration.ts:36">
P1: Using .single() here throws for the normal “not found” case, so every new crawl fails to migrate. Use maybeSingle() (or handle the error) so missing rows don’t abort the migration.</violation>
</file>
<file name="apps/api/src/controllers/v1/map.ts">
<violation number="1" location="apps/api/src/controllers/v1/map.ts:466">
P2: Handle the `logMap` promise so failures writing map logs don’t become unhandled rejections.</violation>
</file>
<file name="apps/api/src/controllers/v0/admin/zdrcleaner.ts">
<violation number="1" location="apps/api/src/controllers/v0/admin/zdrcleaner.ts:141">
P1: Requests are marked cleaned even when some of their blobs fail to delete, so `dr_clean_by` is cleared and the failed blobs are never retried.</violation>
</file>
<file name="apps/api/src/__tests__/snips/zdr-helpers.ts">
<violation number="1" location="apps/api/src/__tests__/snips/zdr-helpers.ts:117">
P2: `batch_scrapes` rows never contain a url column, so this expectation always throws and the helper can’t verify cleanup.</violation>
</file>
Reply to cubic to teach it or ask questions. Re-run a review with @cubic-dev-ai review this PR
.from("requests")
.select("id")
.eq("id", crawlId)
.single();
P1: Using .single() here throws for the normal “not found” case, so every new crawl fails to migrate. Use maybeSingle() (or handle the error) so missing rows don’t abort the migration.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At apps/api/src/controllers/v0/admin/crawl-migration.ts, line 36:
<comment>Using .single() here throws for the normal “not found” case, so every new crawl fails to migrate. Use maybeSingle() (or handle the error) so missing rows don’t abort the migration.</comment>
<file context>
@@ -0,0 +1,118 @@
+ .from("requests")
+ .select("id")
+ .eq("id", crawlId)
+ .single();
+
+ if (existingRequest) {
</file context>
- .single();
+ .maybeSingle();
logMap({
  id: result.job_id,
  request_id: result.job_id,
  url: req.body.url,
  crawlerOptions: {},
  scrapeOptions: {},
  origin: req.body.origin ?? "api",
  integration: req.body.integration,
  num_tokens: 0,
  credits_billed: 1,
  team_id: req.auth.team_id,
  options: {
    search: req.body.search,
    limit: req.body.limit,
    ignoreSitemap: req.body.ignoreSitemap,
    includeSubdomains: req.body.includeSubdomains,
    filterByPath: req.body.filterByPath !== false,
    useIndex: req.body.useIndex,
    timeout: req.body.timeout,
    location: req.body.location,
  },
  results: result.links,
  credits_cost: 1,
  zeroDataRetention: false, // not supported
});
P2: Handle the logMap promise so failures writing map logs don’t become unhandled rejections.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At apps/api/src/controllers/v1/map.ts, line 466:
<comment>Handle the `logMap` promise so failures writing map logs don’t become unhandled rejections.</comment>
<file context>
@@ -446,22 +463,23 @@ export async function mapController(
- time_taken: result.time_taken,
- team_id: req.auth.team_id,
- mode: "map",
+ logMap({
+ id: result.job_id,
+ request_id: result.job_id,
</file context>
- logMap({
-   id: result.job_id,
-   request_id: result.job_id,
-   url: req.body.url,
-   crawlerOptions: {},
-   scrapeOptions: {},
-   origin: req.body.origin ?? "api",
-   integration: req.body.integration,
-   num_tokens: 0,
-   credits_billed: 1,
-   team_id: req.auth.team_id,
-   options: {
-     search: req.body.search,
-     limit: req.body.limit,
-     ignoreSitemap: req.body.ignoreSitemap,
-     includeSubdomains: req.body.includeSubdomains,
-     filterByPath: req.body.filterByPath !== false,
-     useIndex: req.body.useIndex,
-     timeout: req.body.timeout,
-     location: req.body.location,
-   },
-   results: result.links,
-   credits_cost: 1,
-   zeroDataRetention: false, // not supported
- });
+ logMap({
+   id: result.job_id,
+   request_id: result.job_id,
+   url: req.body.url,
+   team_id: req.auth.team_id,
+   options: {
+     search: req.body.search,
+     limit: req.body.limit,
+     ignoreSitemap: req.body.ignoreSitemap,
+     includeSubdomains: req.body.includeSubdomains,
+     filterByPath: req.body.filterByPath !== false,
+     useIndex: req.body.useIndex,
+     timeout: req.body.timeout,
+     location: req.body.location,
+   },
+   results: result.links,
+   credits_cost: 1,
+   zeroDataRetention: false, // not supported
+ }).catch(error => {
+   logger.error(`Failed to log job for team ${req.auth.team_id}: ${error}`);
+ });
batch.map(async (row: { request_id: string; blob_id: string }) => {
  try {
    await cleanUpJob(row.blob_id);
    cleanedUpRequestIds.add(row.request_id);
P1: Requests are marked cleaned even when some of their blobs fail to delete, so dr_clean_by is cleared and the failed blobs are never retried.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At apps/api/src/controllers/v0/admin/zdrcleaner.ts, line 141:
<comment>Requests are marked cleaned even when some of their blobs fail to delete, so `dr_clean_by` is cleared and the failed blobs are never retried.</comment>
<file context>
@@ -97,13 +100,102 @@ async function cleanUp(specificTeamId: string | null, _logger: Logger) {
+ batch.map(async (row: { request_id: string; blob_id: string }) => {
+ try {
+ await cleanUpJob(row.blob_id);
+ cleanedUpRequestIds.add(row.request_id);
+ } catch (error) {
+ logger.error(`Error cleaning up blob`, {
</file context>
if (requestData && requestData.length === 1) {
  const record = requestData[0];
  expect(record.url).not.toContain("://"); // no url stored
P2: batch_scrapes rows never contain a url column, so this expectation always throws and the helper can’t verify cleanup.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At apps/api/src/__tests__/snips/zdr-helpers.ts, line 117:
<comment>`batch_scrapes` rows never contain a url column, so this expectation always throws and the helper can’t verify cleanup.</comment>
<file context>
@@ -76,70 +85,130 @@ export async function getLogs() {
+
+ if (requestData && requestData.length === 1) {
+ const record = requestData[0];
+ expect(record.url).not.toContain("://"); // no url stored
+ expect(record.kind).toBe("crawl");
+ expect(record.dr_clean_by).not.toBeNull();
</file context>