fix: db leaks + socket cleanup #900
Conversation
Walkthrough

Adds fingerprinting libs and integrates enhanced fingerprint injection; hardens browser lifecycle with timeouts, retries, and memory cleanup; replaces direct PgBoss usage with a pgBossClient singleton and queued-task helpers; introduces bounded integration task queues, a circuit breaker for queued runs, extensive socket cleanup, and multiple timeout guards.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
RohitR311 left a comment
@amhsirak Ready for review
Actionable comments posted: 12
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
server/src/workflow-management/integrations/airtable.ts (1)
184-196: Leftover loop is unreachable code. The loops at lines 185-196 will never execute because `maxLength` is already `Math.max(schemaData.length, listData.length, screenshotData.length)`, so `i < schemaData.length` after starting at `maxLength` is always false. This dead code should be removed:

```diff
- // Push leftovers
- for (let i = maxLength; i < schemaData.length; i++) {
-   allRecords.push({ Label: schemaData[i].Field, Value: schemaData[i].Value });
- }
- for (let i = maxLength; i < listData.length; i++) {
-   allRecords.push(listData[i]);
- }
- for (let i = maxLength; i < screenshotData.length; i++) {
-   allRecords.push({
-     Key: screenshotData[i].key,
-     Screenshot: screenshotData[i].url,
-   });
- }
```

server/src/browser-management/controller.ts (1)
290-298: Replace `switchOff()` with `interpreter.stopInterpretation()` to stop only the workflow execution. Calling `browserSession?.switchOff()` closes the entire browser session including intervals and DOM streaming, whereas `stopRunningInterpretation` should only halt the current workflow execution. Use `browserSession?.interpreter.stopInterpretation()` instead to properly stop the interpretation workflow without terminating the browser session.

server/src/workflow-management/scheduler/index.ts (2)
116-140: Fire-and-forget pattern for integration updates may mask failures. The `withTimeout` calls on lines 132-136 use `.catch()` to handle errors, but the function doesn't await these promises. This is intentional fire-and-forget, but if a task is added and the process immediately fails or restarts, the update may never complete. This is acceptable given the retry logic in the processors, but worth noting.

Also, `retries: 5` is used as the initial value, but looking at the relevant snippets, the processors check `task.retries < MAX_RETRIES` and increment on failure. Setting initial retries to 5 means the task will already be at max retries from the start.

```diff
 addGoogleSheetUpdateTask(runId, {
   robotId: robotMetaId,
   runId: runId,
   status: 'pending',
-  retries: 5,
+  retries: 0,
 });
 addAirtableUpdateTask(runId, {
   robotId: robotMetaId,
   runId: runId,
   status: 'pending',
-  retries: 5,
+  retries: 0,
 });
```
649-652: `resetRecordingState` has no effect. This function reassigns local parameters rather than any external state. The assignments `browserId = ''` and `id = ''` modify only the local copies of these values, not any external variables.

If this function is intended to reset some global or module-level state, the implementation needs to be changed. If it's dead code, consider removing it.
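For readers unfamiliar with the pitfall: in JavaScript/TypeScript, reassigning a parameter rebinds only the function-local variable, never the caller's binding. A standalone sketch (the names mirror the review's example but are illustrative):

```typescript
// Reassigning a parameter only rebinds the local variable inside the function.
function resetRecordingState(browserId: string, id: string): void {
  browserId = ''; // rebinds the local parameter only
  id = '';        // the caller's variables are untouched
}

let browserId = 'browser-123';
let id = 'run-456';
resetRecordingState(browserId, id);

console.log(browserId); // still 'browser-123'
console.log(id);        // still 'run-456'
```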
server/src/pgboss-worker.ts (1)
98-122: Same `retries: 5` issue as in scheduler/index.ts. The initial `retries` value of 5 will cause tasks to immediately exceed the retry limit. Apply the same fix as in scheduler/index.ts:

```diff
 addGoogleSheetUpdateTask(runId, {
   robotId: robotMetaId,
   runId: runId,
   status: 'pending',
-  retries: 5,
+  retries: 0,
 });
 addAirtableUpdateTask(runId, {
   robotId: robotMetaId,
   runId: runId,
   status: 'pending',
-  retries: 5,
+  retries: 0,
 });
```

server/src/api/record.ts (1)
620-623: `resetRecordingState` has no effect here either. Same issue as in scheduler/index.ts: this function only modifies local parameter copies.
🧹 Nitpick comments (15)
server/src/workflow-management/integrations/gsheet.ts (2)
25-38: Bounded queue implementation has potential ordering issues. The queue eviction uses `Object.keys()[0]` to find the "oldest" task, but object key ordering is only guaranteed for string keys in modern JS engines and may not reflect true insertion order in all scenarios.

Consider using a `Map`, which guarantees insertion order, or track timestamps explicitly:

```diff
-export let googleSheetUpdateTasks: { [runId: string]: GoogleSheetUpdateTask } = {};
+export let googleSheetUpdateTasks: Map<string, GoogleSheetUpdateTask> = new Map();
```

Then update the eviction logic accordingly:

```diff
-  const oldestKey = Object.keys(googleSheetUpdateTasks)[0];
-  if (oldestKey) {
-    delete googleSheetUpdateTasks[oldestKey];
-  }
+  const oldestKey = googleSheetUpdateTasks.keys().next().value;
+  if (oldestKey) {
+    googleSheetUpdateTasks.delete(oldestKey);
+  }
```
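To see why the `Map` version gives a reliable eviction order, here is a self-contained sketch of the bounded-queue idea (hypothetical names, not the module's exports):

```typescript
// Bounded task queue: a Map preserves insertion order, so the first key
// returned by keys() is genuinely the oldest entry.
const MAX_QUEUE_SIZE = 3;
const tasks = new Map<string, { status: string }>();

function addTask(runId: string, task: { status: string }): void {
  if (tasks.size >= MAX_QUEUE_SIZE) {
    const oldestKey = tasks.keys().next().value; // oldest insertion
    if (oldestKey !== undefined) tasks.delete(oldestKey);
  }
  tasks.set(runId, task);
}

['run-1', 'run-2', 'run-3', 'run-4'].forEach((rid) => addTask(rid, { status: 'pending' }));

console.log([...tasks.keys()]); // ['run-2', 'run-3', 'run-4'] (run-1 was evicted)
```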
312-364: Concurrency guard and timeout are well-implemented. The `isProcessingGoogleSheets` flag with a `finally` block ensures proper cleanup. The 60-second timeout prevents indefinite processing.

However, note the inconsistent indentation within the `while` loop body (lines 325-359): the inner code appears to have mixed indentation levels, which affects readability. Consider aligning the indentation inside the `while` loop for consistency.

server/src/browser-management/classes/RemoteBrowser.ts (1)
420-440: Redundant nested try-catch blocks. The inner try-catch (lines 425-435) handles all fingerprint injection errors, making the outer try-catch (lines 424, 436-439) redundant since it can never catch anything.

```diff
 private async applyEnhancedFingerprinting(context: BrowserContext): Promise<void> {
   try {
-    try {
-      const fingerprintGenerator = new FingerprintGenerator();
-      const fingerprint = fingerprintGenerator.getFingerprint();
-      const fingerprintInjector = new FingerprintInjector();
-
-      await fingerprintInjector.attachFingerprintToPlaywright(context as any, fingerprint);
-
-      logger.info("Enhanced fingerprinting applied successfully");
-    } catch (fingerprintError: any) {
-      logger.warn(`Modern fingerprint injection failed: ${fingerprintError.message}. Using existing protection.`);
-    }
+    const fingerprintGenerator = new FingerprintGenerator();
+    const fingerprint = fingerprintGenerator.getFingerprint();
+    const fingerprintInjector = new FingerprintInjector();
+
+    await fingerprintInjector.attachFingerprintToPlaywright(context as any, fingerprint);
+
+    logger.info("Enhanced fingerprinting applied successfully");
   } catch (error: any) {
-    logger.error(`Enhanced fingerprinting failed: ${error.message}`);
-    // Don't throw - fallback to basic functionality
+    logger.warn(`Enhanced fingerprinting failed: ${error.message}. Using existing protection.`);
   }
 }
```

server/src/routes/record.ts (1)
82-84: Queue creation pattern is idempotent but potentially inefficient. `createQueue` is called before every `send` operation. While this is safe (idempotent), it adds unnecessary overhead on every request.

Consider creating queues once during application startup instead of on every request:

```diff
+// At module level or in app initialization
+const QUEUES_INITIALIZED = pgBossClient.start().then(async () => {
+  await pgBossClient.createQueue('initialize-browser-recording');
+  await pgBossClient.createQueue('destroy-browser');
+  await pgBossClient.createQueue('interpret-workflow');
+  await pgBossClient.createQueue('stop-interpretation');
+});
```

Then in handlers, just call `send` directly after ensuring initialization is complete.
27-30: Consider adding connection timeout and retry options. The PgBoss client is configured with only `max: 3` connections. For production resilience, consider adding timeout and retry configurations to handle transient database connection issues.

```diff
 export const pgBossClient = new PgBoss({
   connectionString: pgBossConnectionString,
   max: 3,
+  connectionTimeoutSeconds: 30,
+  retryLimit: 3,
+  retryDelay: 1000,
 });
```

server/src/workflow-management/classes/Generator.ts (1)
907-908: Inconsistent logger API usage. Lines 908 and 995 use `logger.warn()` and `logger.error()` directly, while the rest of the codebase uses `logger.log('warn', ...)` and `logger.log('error', ...)`.

```diff
-  logger.warn(`Error removing Generator socket listeners: ${error.message}`);
+  logger.log('warn', `Error removing Generator socket listeners: ${error.message}`);
```

```diff
-  logger.error(`Error during Generator cleanup: ${error.message}`);
+  logger.log('error', `Error during Generator cleanup: ${error.message}`);
```

Also applies to: 994-995
server/src/storage/schedule.ts (1)
44-67: Consider returning the count of cancelled jobs for better visibility. The function always returns `true` even when no matching jobs are found. This could mask issues where cancellation was expected but the job didn't exist.

```diff
-export async function cancelScheduledWorkflow(robotId: string) {
+export async function cancelScheduledWorkflow(robotId: string): Promise<number> {
   try {
     const jobs = await pgBossClient.getSchedules();
     const matchingJobs = jobs.filter((job: any) => {
       try {
         const data = job.data;
         return data && data.id === robotId;
       } catch {
         return false;
       }
     });
     for (const job of matchingJobs) {
       logger.log('info', `Cancelling scheduled job ${job.name} for robot ${robotId}`);
       await pgBossClient.unschedule(job.name);
     }
-    return true;
+    if (matchingJobs.length === 0) {
+      logger.log('debug', `No scheduled jobs found for robot ${robotId}`);
+    }
+    return matchingJobs.length;
   } catch (error: unknown) {
```

server/src/routes/storage.ts (1)
1151-1159: Circuit breaker may trip on non-database errors. The error handler increments `consecutiveDbErrors` for any error in `processQueuedRuns`, including PgBoss queue errors or browser creation failures. Consider distinguishing database errors from other errors if you want the breaker to specifically protect against DB issues.

If intentional to protect against all errors (not just DB), consider renaming to clarify:

```diff
-// Circuit breaker for database connection issues
-let consecutiveDbErrors = 0;
+// Circuit breaker for processQueuedRuns failures
+let consecutiveErrors = 0;
```

server/src/browser-management/inputHandlers.ts (1)
865-1014: Consider extracting duplicated `evaluateSelector` function. The `evaluateSelector` function is defined identically twice within `onTestPaginationScroll` (lines 906-932 and 958-983). This duplicates ~25 lines of code.

Extract to a shared function outside the `page.evaluate` calls:

```diff
+const evaluateSelectorScript = `
+  function evaluateSelector(sel) {
+    try {
+      const isXPath = sel.startsWith('//') || sel.startsWith('(//');
+      if (isXPath) {
+        const result = document.evaluate(sel, document, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);
+        const elements = [];
+        for (let i = 0; i < result.snapshotLength; i++) {
+          const node = result.snapshotItem(i);
+          if (node && node.nodeType === Node.ELEMENT_NODE) {
+            elements.push(node);
+          }
+        }
+        return elements;
+      } else {
+        return Array.from(document.querySelectorAll(sel));
+      }
+    } catch (err) {
+      return [];
+    }
+  }
+`;

 const onTestPaginationScroll = async (...) => {
   // ...
-  const initialCount = await currentPage.evaluate((selector) => {
-    function evaluateSelector(sel: string): Element[] { ... }
-    return evaluateSelector(selector).length;
-  }, listSelector);
+  const initialCount = await currentPage.evaluate((selector) => {
+    ${evaluateSelectorScript}
+    return evaluateSelector(selector).length;
+  }, listSelector);
```

server/src/workflow-management/integrations/airtable.ts (1)
26-39: Object key order for "oldest" task may not be reliable. Using `Object.keys(airtableUpdateTasks)[0]` to find the "oldest" task relies on object property insertion order. While modern Node.js maintains this order, using a `Map` would provide guaranteed ordering and clearer semantics.

Consider using a `Map` instead:

```diff
-export let airtableUpdateTasks: { [runId: string]: AirtableUpdateTask } = {};
+export let airtableUpdateTasks: Map<string, AirtableUpdateTask> = new Map();

 export function addAirtableUpdateTask(runId: string, task: AirtableUpdateTask): boolean {
-  const currentSize = Object.keys(airtableUpdateTasks).length;
+  const currentSize = airtableUpdateTasks.size;
   if (currentSize >= MAX_QUEUE_SIZE) {
     logger.log('warn', `Airtable task queue full (${currentSize}/${MAX_QUEUE_SIZE}), dropping oldest task`);
-    const oldestKey = Object.keys(airtableUpdateTasks)[0];
-    if (oldestKey) {
-      delete airtableUpdateTasks[oldestKey];
-    }
+    const oldestKey = airtableUpdateTasks.keys().next().value;
+    if (oldestKey) airtableUpdateTasks.delete(oldestKey);
   }
-  airtableUpdateTasks[runId] = task;
+  airtableUpdateTasks.set(runId, task);
   return true;
 }
```

server/src/workflow-management/scheduler/index.ts (1)
696-716: Comprehensive socket cleanup implementation. The `cleanupSocketConnection` function properly:

- Removes all listeners from the client socket
- Disconnects the client socket
- Cleans up the server-side namespace (removes listeners, disconnects sockets, deletes from the `_nsps` map)

One consideration: directly accessing `(serverIo as any)._nsps` is fragile, as it relies on Socket.IO internals that could change between versions. The internal `_nsps` map access is a known pattern for namespace cleanup but may break in future Socket.IO versions. Consider wrapping this in a more defensive check or documenting the Socket.IO version dependency.

server/src/pgboss-worker.ts (1)
89-96: Duplicate `withTimeout` utility across files. This function is duplicated in `scheduler/index.ts` (lines 107-114) and `api/record.ts`. Consider extracting to a shared utility module:

```typescript
// server/src/utils/timeout.ts
export function withTimeout<T>(promise: Promise<T>, timeoutMs: number, operation: string): Promise<T> {
  return Promise.race([
    promise,
    new Promise<T>((_, reject) =>
      setTimeout(() => reject(new Error(`${operation} timed out after ${timeoutMs}ms`)), timeoutMs)
    )
  ]);
}
```
560-567: Third copy of `withTimeout` utility. This is now duplicated in three files. Strongly recommend extracting to a shared module.
1145-1165: `cleanupSocketConnection` duplicated from scheduler. This is identical to the implementation in `scheduler/index.ts`. Consider extracting to a shared module:

```typescript
// server/src/utils/socket-cleanup.ts
import { Socket } from "socket.io-client";
import { Server } from "socket.io";
import logger from "../logger";

export function cleanupSocketConnection(socket: Socket, serverIo: Server, browserId: string, id: string) {
  // ... implementation
}
```
401-419: Unhandled rejection and uncaught exception handlers. These handlers log errors and exit in production. The delay before exit (1s for rejections, 5s for exceptions) allows logs to flush.
Note: In production, exiting on unhandled rejections is the correct approach, but you may want to consider a more graceful shutdown that attempts resource cleanup.
Consider calling the SIGINT handler logic instead of raw `process.exit()` to ensure proper cleanup:

```typescript
process.on('unhandledRejection', (reason, promise) => {
  console.error('Unhandled promise rejection:', reason);
  if (process.env.NODE_ENV === 'production') {
    // Trigger graceful shutdown instead of immediate exit
    process.emit('SIGINT');
  }
});
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (20)
- package.json (1 hunks)
- server/src/api/record.ts (10 hunks)
- server/src/browser-management/classes/BrowserPool.ts (0 hunks)
- server/src/browser-management/classes/RemoteBrowser.ts (17 hunks)
- server/src/browser-management/controller.ts (6 hunks)
- server/src/browser-management/inputHandlers.ts (2 hunks)
- server/src/pgboss-worker.ts (13 hunks)
- server/src/routes/record.ts (6 hunks)
- server/src/routes/storage.ts (9 hunks)
- server/src/schedule-worker.ts (3 hunks)
- server/src/server.ts (8 hunks)
- server/src/socket-connection/connection.ts (2 hunks)
- server/src/storage/db.ts (1 hunks)
- server/src/storage/pgboss.ts (1 hunks)
- server/src/storage/schedule.ts (1 hunks)
- server/src/workflow-management/classes/Generator.ts (3 hunks)
- server/src/workflow-management/classes/Interpreter.ts (4 hunks)
- server/src/workflow-management/integrations/airtable.ts (8 hunks)
- server/src/workflow-management/integrations/gsheet.ts (5 hunks)
- server/src/workflow-management/scheduler/index.ts (10 hunks)
💤 Files with no reviewable changes (1)
- server/src/browser-management/classes/BrowserPool.ts
🧰 Additional context used
🧬 Code graph analysis (9)
server/src/storage/pgboss.ts (1)
server/src/db/config/database.js (1)
dotenv (1-1)
server/src/routes/record.ts (1)
server/src/storage/pgboss.ts (1)
pgBossClient (27-30)
server/src/storage/schedule.ts (1)
server/src/storage/pgboss.ts (1)
pgBossClient (27-30)
server/src/socket-connection/connection.ts (1)
server/src/browser-management/inputHandlers.ts (1)
removeInputHandlers (1086-1086)
server/src/routes/storage.ts (1)
server/src/storage/pgboss.ts (1)
pgBossClient (27-30)
server/src/pgboss-worker.ts (3)
server/src/workflow-management/integrations/gsheet.ts (2)
addGoogleSheetUpdateTask (25-38), processGoogleSheetUpdates (312-365)
server/src/workflow-management/integrations/airtable.ts (1)
addAirtableUpdateTask (26-39)
server/src/markdownify/scrape.ts (2)
convertPageToMarkdown (8-57), convertPageToHTML (63-111)
server/src/browser-management/inputHandlers.ts (2)
server/src/server.ts (1)
browserPool (99-99)
maxun-core/src/browserSide/scraper.js (2)
selector (28-28), node (664-664)
server/src/workflow-management/scheduler/index.ts (7)
server/src/workflow-management/integrations/gsheet.ts (2)
addGoogleSheetUpdateTask (25-38), processGoogleSheetUpdates (312-365)
server/src/workflow-management/integrations/airtable.ts (2)
addAirtableUpdateTask (26-39), processAirtableUpdates (514-569)
server/src/markdownify/scrape.ts (2)
convertPageToMarkdown (8-57), convertPageToHTML (63-111)
server/src/routes/index.ts (1)
workflow (11-11)
server/src/browser-management/controller.ts (1)
destroyRemoteBrowser (124-200)
maxun-core/src/interpret.ts (1)
run (1638-1674)
server/src/server.ts (1)
io (87-94)
server/src/browser-management/classes/RemoteBrowser.ts (1)
server/src/routes/proxy.ts (1)
getDecryptedProxyConfig (165-183)
🔇 Additional comments (50)
server/src/storage/db.ts (1)
20-29: Good connection pooling configuration for leak prevention. The pool and timeout settings are well-configured:

- `max: 10` caps concurrent connections to prevent exhaustion
- `min: 0` allows the pool to release all idle connections
- `evict: 1000` aggressively cleans up stale connections
- `statement_timeout: 60000` prevents runaway queries

This aligns well with the PR objective of fixing DB leaks.
server/src/workflow-management/integrations/gsheet.ts (1)
240-250: Good use of `.once()` for token listener. Switching from `.on()` to `.once()` prevents listener accumulation when tokens are refreshed, which helps address the socket/listener cleanup objectives of this PR.

server/src/browser-management/classes/RemoteBrowser.ts (5)
18-19: Fingerprinting imports added. The imports for `FingerprintInjector` and `FingerprintGenerator` support the new browser fingerprinting functionality used in `applyEnhancedFingerprinting`.
460-613: Well-structured initialization with timeouts and cleanup. The initialization flow now has:

- Overall 120s timeout via `Promise.race`
- Per-context creation 15s timeout
- Retry logic (3 attempts) with proper browser cleanup between attempts
- Fingerprinting applied to new contexts
This is a significant improvement for handling hung browser launches.
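The `Promise.race` timeout guard used throughout these changes can be illustrated with a standalone helper (an assumption-level sketch, not the file's exact code):

```typescript
// Race a promise against a timer; whichever settles first wins the race.
function withTimeout<T>(promise: Promise<T>, timeoutMs: number, operation: string): Promise<T> {
  return Promise.race([
    promise,
    new Promise<T>((_, reject) =>
      setTimeout(() => reject(new Error(`${operation} timed out after ${timeoutMs}ms`)), timeoutMs)
    ),
  ]);
}

async function demo(): Promise<void> {
  // Fast operation settles before the 100ms deadline.
  console.log(await withTimeout(Promise.resolve('ready'), 100, 'fast-op')); // 'ready'

  // Slow operation loses the race and rejects.
  try {
    await withTimeout(new Promise((resolve) => setTimeout(resolve, 500)), 50, 'slow-op');
  } catch (err) {
    console.log((err as Error).message); // 'slow-op timed out after 50ms'
  }
}

demo();
```

Note that `Promise.race` subscribes to every promise in the array, so the losing timer's later rejection is still observed and does not surface as an unhandled rejection.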
672-689: Good centralized socket listener cleanup. The `removeAllSocketListeners()` method provides a single point for cleaning up socket listeners, called both before re-registration and during shutdown. This directly addresses the socket cleanup objectives of the PR.
896-1028: Comprehensive and robust shutdown sequence. The `switchOff` method now:
- Clears all timers/intervals
- Removes all listeners (socket, page)
- Cleans up generator and interpreter
- Uses individual 5-second timeouts for CDP detach, page close, context close, and browser close
- Each operation has its own try-catch with finally blocks to null references
This thorough cleanup addresses the DB leaks and socket cleanup objectives effectively.
285-292: Good defensive listener cleanup with closed-page check. Checking `!this.currentPage.isClosed()` before calling `removeAllListeners` prevents errors when operating on already-closed pages. The try-catch ensures robustness.

server/src/routes/record.ts (2)
20-20: Migrated to centralized pgBossClient. The import change from the local PgBoss instance to the shared `pgBossClient` from `storage/pgboss` aligns with the PR's goal of centralizing database connection management.
29-61: Job polling implementation is functional. The `waitForJobCompletion` function correctly handles job state transitions and timeout. The 200ms polling interval is reasonable for the expected job durations.
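The polling shape described above can be sketched in isolation; `getState` and the fake job below are hypothetical stand-ins for the real job store:

```typescript
// Poll a job's state every `intervalMs` until it finishes or `timeoutMs` elapses.
type JobState = 'created' | 'active' | 'completed' | 'failed';

async function waitForJobCompletion(
  getState: () => JobState,
  intervalMs: number,
  timeoutMs: number
): Promise<JobState> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const state = getState();
    if (state === 'completed' || state === 'failed') return state;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`job did not finish within ${timeoutMs}ms`);
}

// Fake job that completes on the third poll.
let polls = 0;
const fakeJob = (): JobState => (++polls >= 3 ? 'completed' : 'active');

waitForJobCompletion(fakeJob, 10, 1000).then((state) => console.log(state)); // prints 'completed'
```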
server/src/storage/pgboss.ts (1)
1-80: Good singleton pattern for centralizing PgBoss job submission. The module correctly separates job submission from job processing (handled by pgboss-worker.ts), preventing duplicate connection pools. The idempotent start/stop pattern and environment variable validation are well implemented.
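An idempotent start/stop singleton of the kind described can be reduced to a few lines (illustrative sketch; the real module wraps PgBoss rather than this toy class):

```typescript
// Singleton whose start() and stop() are safe to call any number of times.
class QueueClient {
  private started = false;
  public startCalls = 0; // instrumentation for the demo only

  async start(): Promise<void> {
    if (this.started) return; // idempotent: repeat calls are no-ops
    this.started = true;
    this.startCalls++;
    // ...open the connection pool here...
  }

  async stop(): Promise<void> {
    if (!this.started) return; // idempotent in the other direction
    this.started = false;
    // ...drain and close the pool here...
  }
}

const queueClient = new QueueClient();

(async () => {
  await queueClient.start();
  await queueClient.start(); // no-op
  console.log(queueClient.startCalls); // prints 1
})();
```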
server/src/socket-connection/connection.ts (1)
20-24: Good socket cleanup implementation. The disconnect handler now properly removes input handlers for sockets that had them registered, preventing memory leaks from accumulated event listeners.
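The listener-accumulation leak this prevents is easy to reproduce with a bare Node `EventEmitter` (standalone sketch; the project's socket types are more involved):

```typescript
import { EventEmitter } from 'events';

// Simulate re-registering input handlers on every connection without cleanup.
const socket = new EventEmitter();
const handler = (): void => { /* handle input */ };

for (let i = 0; i < 3; i++) {
  socket.on('mousemove', handler); // each registration accumulates
}
console.log(socket.listenerCount('mousemove')); // 3

// Cleanup on disconnect removes everything that was registered.
socket.removeAllListeners('mousemove');
console.log(socket.listenerCount('mousemove')); // 0
```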
server/src/workflow-management/classes/Generator.ts (1)
889-910: Socket listener cleanup is well-structured. The explicit removal of specific event listeners prevents memory leaks when generators are cleaned up. The try-catch ensures cleanup continues even if individual removals fail.
server/src/routes/storage.ts (3)
1079-1083: Good circuit breaker implementation for resilience. The circuit breaker pattern protects against cascading failures during database issues. The configuration (3 errors, 30s cooldown) provides reasonable protection while allowing recovery.
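The breaker described here (trip after 3 consecutive errors, 30-second cooldown) boils down to a counter and a deadline; this is an illustrative sketch, not the file's implementation:

```typescript
// Minimal consecutive-error circuit breaker: trip after N failures in a row,
// then skip work until the cooldown deadline passes.
const MAX_CONSECUTIVE_ERRORS = 3;
const COOLDOWN_MS = 30_000;

let consecutiveErrors = 0;
let breakerOpenUntil = 0;

function canRun(now: number): boolean {
  return now >= breakerOpenUntil;
}

function recordResult(ok: boolean, now: number): void {
  if (ok) {
    consecutiveErrors = 0; // any success resets the streak
    return;
  }
  consecutiveErrors++;
  if (consecutiveErrors >= MAX_CONSECUTIVE_ERRORS) {
    breakerOpenUntil = now + COOLDOWN_MS; // trip the breaker
    consecutiveErrors = 0;
  }
}

// Three failures in a row open the breaker for 30 seconds.
recordResult(false, 0);
recordResult(false, 0);
recordResult(false, 0);
console.log(canRun(1_000));  // false (inside the cooldown)
console.log(canRun(31_000)); // true (cooldown elapsed)
```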
667-673: Correct migration to centralized pgBossClient. Queue creation and job submission now use the shared `pgBossClient` singleton, preventing connection pool leaks that would occur with multiple PgBoss instances.
1090-1093: Good addition of ORDER BY for deterministic queue processing. The `order: [['startedAt', 'ASC']]` clause ensures FIFO processing of queued runs, which is important for predictable behavior.

server/src/browser-management/inputHandlers.ts (1)
1028-1050: Good addition of socket listener cleanup to prevent memory leaks. The `removeInputHandlers` function properly removes all registered listeners. This is essential for preventing memory leaks when sockets disconnect.

Also applies to: 1059-1084
server/src/workflow-management/integrations/airtable.ts (1)
514-568: Good concurrency guard and timeout implementation. The `isProcessingAirtable` flag prevents duplicate processing, and the 60-second timeout with a `finally` block ensures the flag is always reset. The bounded processing loop is a solid pattern.

server/src/schedule-worker.ts (1)
16-20: PgBoss configuration looks reasonable. The `max: 3` connection limit and `expireInHours: 23` are sensible defaults. The 23-hour expiration prevents job queue buildup while allowing daily jobs to complete.

server/src/workflow-management/classes/Interpreter.ts (2)
387-410: Good cleanup of timers and listeners to prevent resource leaks. The `clearState` method now properly:

- Clears `persistenceRetryTimer` (lines 387-390)
- Calls interpreter cleanup if available (lines 400-403)
- Removes pausing listeners (line 409)
This ensures no dangling timers or listeners remain after cleanup.
851-866: Good retry logic with proper timer cleanup. The persistence retry now:
- Clears existing retry timer before scheduling new one (lines 857-859)
- Sets timer reference to null when it fires (line 862)
- Uses exponential backoff with max delay cap
This prevents timer accumulation on repeated failures.
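Exponential backoff with a cap, plus clear-before-reschedule, can be sketched as follows (the base and cap constants here are made up, not the file's values):

```typescript
// Delay grows as base * 2^attempt, clamped to a maximum.
function backoffDelay(attempt: number, baseMs = 1_000, maxDelayMs = 30_000): number {
  return Math.min(baseMs * 2 ** attempt, maxDelayMs);
}

console.log([0, 1, 2, 3, 4, 5, 6].map((a) => backoffDelay(a)));
// [1000, 2000, 4000, 8000, 16000, 30000, 30000]

// Scheduling a retry: clear any pending timer first so timers never stack up.
let retryTimer: ReturnType<typeof setTimeout> | null = null;

function scheduleRetry(attempt: number, fn: () => void): void {
  if (retryTimer) clearTimeout(retryTimer); // replace, don't accumulate
  retryTimer = setTimeout(() => {
    retryTimer = null; // drop the reference once the timer fires
    fn();
  }, backoffDelay(attempt));
}
```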
server/src/browser-management/controller.ts (2)
124-200: Good timeout protection for browser destruction. The `DESTROY_TIMEOUT` with `Promise.race` ensures destruction doesn't hang indefinitely. The fallback to force-delete on timeout is appropriate for resource cleanup.
42-77: Good error handling for browser initialization failures. The try/catch properly:
- Notifies frontend via socket events
- Waits briefly for events to propagate
- Cleans up the failed session with `switchOff()`
- Returns early to prevent partial initialization
server/src/workflow-management/scheduler/index.ts (6)
107-114: Utility function looks correct. The `withTimeout` helper cleanly races a promise against a timeout. Implementation is straightforward and reusable.
265-284: Timeout logic for scrape operations is well-implemented. The per-format timeout using `Promise.race` provides good protection against hanging conversions. The 120-second timeout is reasonable given the 100-second `networkidle` timeout in the underlying `convertPageToMarkdown`/`convertPageToHTML` functions.
430-441: Interpretation timeout enforcement looks good. The 600-second (10-minute) timeout for workflow interpretation provides a reasonable upper bound for long-running workflows while preventing indefinite hangs.
557-566: Good addition of browser cleanup in error handler. The try-catch around interpreter state clearing and browser destruction ensures cleanup is attempted even if one step fails. This helps prevent resource leaks.
628-647: `readyForRunHandler` now properly cleans up socket in `finally` block. The addition of the `socket` parameter and the `finally` block ensures socket cleanup happens regardless of success or failure. This is a solid improvement for resource management.
654-693: Socket lifecycle handling in `handleRunRecording` is improved. Good additions:

- `CONNECTION_TIMEOUT` of 30 seconds
- `connect_error` handler that triggers cleanup
- `disconnect` handler that triggers cleanup
- Error path cleanup when socket exists

However, the socket is created but there's no timeout handling if the `ready-for-run` event never fires. The socket would remain open indefinitely until the browser emits the event or disconnects.

Consider adding a connection establishment timeout that triggers cleanup if `ready-for-run` isn't received within a reasonable timeframe.

server/src/pgboss-worker.ts (8)
59-63: PgBoss configuration with `max: 5` limits concurrent connections. The `max: 5` setting limits the connection pool size for PgBoss. This is reasonable for controlling resource usage, but ensure this aligns with expected concurrent job processing needs.
127-129: Reduced browser timeouts may cause issues on slower systems. `BROWSER_INIT_TIMEOUT` was reduced to 30s and `BROWSER_PAGE_TIMEOUT` to 15s. The summary mentions these were reduced from 60s and 45s respectively. On systems under load or with slower network conditions, these tighter timeouts could cause premature failures.

Verify these timeout values are sufficient for your production environment. Consider making them configurable via environment variables for different deployment scenarios.
416-421: Good interpreter state cleanup on abort. The try-catch around `browser.interpreter.clearState()` ensures the abort flow continues even if state clearing fails.
668-675: Good interpreter state cleanup on execution error. This mirrors the abort cleanup pattern, ensuring resources are released even on failures.
846-847: Worker intervals properly tracked for cleanup. The `workerIntervals` array and its use in `registerRunExecutionWorker` ensure intervals can be cleared during shutdown.

Also applies to: 920-927
956-963: Consistent interval tracking for abort queue checker. The same pattern is applied to the abort queue interval checker.
1068-1087: Graceful shutdown properly clears intervals. Both SIGTERM and SIGINT handlers now clear the worker intervals before stopping PgBoss. This prevents orphaned interval callbacks after shutdown begins.
1089-1089: Export reduced to only `startWorkers`. No external dependencies found on `pgBoss`, `registerWorkerForQueue`, or `registerAbortWorkerForQueue`. This reduction in export surface is safe.

server/src/api/record.ts (6)
595-617: `readyForRunHandler` properly cleans up socket in finally block. Consistent with the scheduler implementation. Good.
705-723: Scrape timeout logic consistent with scheduler. Same implementation pattern, good consistency.
879-890: Interpretation timeout consistent with scheduler. Same 600-second timeout implementation.
894-896: Interpreter state cleared after successful interpretation. Good addition to clean up interpreter state before destroying the browser.
1021-1030: Comprehensive cleanup in error handler. The try-catch around interpreter state clearing and browser destruction is consistent with the pattern in other files.
1167-1189: `waitForRunCompletion` has a reasonable 3-hour timeout. The 180-minute maximum wait time prevents indefinite polling. The function also properly handles aborted/aborting states.
One consideration: the polling interval of 2000ms means up to 5400 database queries for a 3-hour wait. This is likely acceptable but worth noting for high-volume scenarios.
server/src/server.ts (8)
41-46: PostgreSQL pool configuration is reasonable.
`max: 10`, `min: 0` with appropriate timeout settings. The `allowExitOnIdle: true` option helps with graceful shutdown.
87-94: Socket.IO server initialized with comprehensive options. Good configuration:

- `cleanupEmptyChildNamespaces: true` helps prevent namespace leaks
- `pingTimeout: 60000` and `pingInterval: 25000` are reasonable for detecting stale connections
- `maxHttpBufferSize: 1e8` (100MB) is generous but may be needed for large payloads
- `transports: ['websocket', 'polling']` with `allowEIO3: true` provides broad compatibility
101-102: `recentRecoveries` Map for storing recovery notifications. This pattern allows recovery notifications to be stored and sent to users when they reconnect. Good for ensuring users are notified of recovered runs even if they weren't connected when recovery happened.
152-166: Server intervals properly tracked for cleanup. Both `processQueuedRuns` and `browserPoolCleanupInterval` are tracked in `serverIntervals` for proper cleanup during shutdown.
154-160: Good error handling around `processQueuedRuns`. The try-catch ensures a single failure doesn't stop the interval from running again.
176-181: Startup sequence is well-ordered. The sequence (recover orphaned runs → start PgBoss client → start workers) ensures the system is ready before processing new work.
191-198: Recovery notification emission on client connection is well-implemented. Users connecting to `/queued-run` receive any stored recovery notifications, then the data is cleaned up to prevent duplicate emissions.
266-399: Comprehensive SIGINT shutdown handler. The shutdown sequence is thorough:
- Wait 2 seconds for in-flight operations
- Preserve partial data from running browsers
- Clear server intervals
- Clean up all browsers
- Terminate worker processes (dev mode)
- Close Socket.IO server
- Close HTTP server
- Close PostgreSQL pool
- Stop PgBoss client
- Close Sequelize connection
- Exit with appropriate code
The `shutdownSuccessful` flag tracking is good for debugging.
package.json:

```json
    "fingerprint-generator": "^2.1.77",
    "fingerprint-injector": "^2.1.77",
```
🧩 Analysis chain
🌐 Web query:
fingerprint-generator npm latest version
💡 Result:
The npm package "fingerprint-generator" is published as version 2.1.72 on npmjs.com (published 4 days ago). [1] jsDelivr shows a later registry entry at 2.1.76. [2]
Sources:
- npm package page: fingerprint-generator — version 2.1.72. [1]
- jsDelivr package listing: fingerprint-generator — version 2.1.76. [2]
🌐 Web query:
fingerprint-injector npm latest version
💡 Result:
The latest published version on npm is 2.1.72 (published Nov 27, 2025). (npmjs.com)
Version mismatch: fingerprinting dependencies specify non-existent versions.
The fingerprint-generator and fingerprint-injector packages are pinned to ^2.1.77, but this version does not exist on npm. The latest available versions are 2.1.72 (published Nov 27, 2025) and 2.1.76 (via jsDelivr). Update both dependencies to use actual published versions, such as ^2.1.72.
🤖 Prompt for AI Agents
In package.json around lines 41-42, the fingerprint-generator and
fingerprint-injector dependencies are pinned to a non-existent version ^2.1.77;
update both entries to a published version (for example ^2.1.72) or any other
valid released version available on npm/jsDelivr, save the file, and run npm
install (or yarn) to update lockfile and verify installs succeed.
 async function triggerIntegrationUpdates(runId: string, robotMetaId: string): Promise<void> {
   try {
-    googleSheetUpdateTasks[runId] = {
+    addGoogleSheetUpdateTask(runId, {
       robotId: robotMetaId,
       runId: runId,
       status: 'pending',
       retries: 5,
-    };
+    });

-    airtableUpdateTasks[runId] = {
+    addAirtableUpdateTask(runId, {
       robotId: robotMetaId,
       runId: runId,
       status: 'pending',
       retries: 5,
-    };
+    });

-    processAirtableUpdates().catch(err => logger.log('error', `Airtable update error: ${err.message}`));
-    processGoogleSheetUpdates().catch(err => logger.log('error', `Google Sheets update error: ${err.message}`));
+    withTimeout(processAirtableUpdates(), 65000, 'Airtable update')
+      .catch(err => logger.log('error', `Airtable update error: ${err.message}`));
+
+    withTimeout(processGoogleSheetUpdates(), 65000, 'Google Sheets update')
+      .catch(err => logger.log('error', `Google Sheets update error: ${err.message}`));
   } catch (err: any) {
     logger.log('error', `Failed to update integrations for run: ${runId}: ${err.message}`);
   }
 }
Same retries: 5 issue as in other files.
This is the third occurrence of the same bug.
  addGoogleSheetUpdateTask(runId, {
    robotId: robotMetaId,
    runId: runId,
    status: 'pending',
-   retries: 5,
+   retries: 0,
  });

  addAirtableUpdateTask(runId, {
    robotId: robotMetaId,
    runId: runId,
    status: 'pending',
-   retries: 5,
+   retries: 0,
  });

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
async function triggerIntegrationUpdates(runId: string, robotMetaId: string): Promise<void> {
  try {
    addGoogleSheetUpdateTask(runId, {
      robotId: robotMetaId,
      runId: runId,
      status: 'pending',
      retries: 0,
    });
    addAirtableUpdateTask(runId, {
      robotId: robotMetaId,
      runId: runId,
      status: 'pending',
      retries: 0,
    });
    withTimeout(processAirtableUpdates(), 65000, 'Airtable update')
      .catch(err => logger.log('error', `Airtable update error: ${err.message}`));
    withTimeout(processGoogleSheetUpdates(), 65000, 'Google Sheets update')
      .catch(err => logger.log('error', `Google Sheets update error: ${err.message}`));
  } catch (err: any) {
    logger.log('error', `Failed to update integrations for run: ${runId}: ${err.message}`);
  }
}
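The suggestion above assumes a withTimeout wrapper with the call shape withTimeout(promise, ms, label). A minimal sketch of such a helper (an assumption, not the project's actual implementation) races the promise against a timer and clears the timer either way, so a fast-settling promise does not leave a dangling timeout:

```typescript
// Minimal sketch of a withTimeout helper matching the call shape used above.
// Assumption: it simply races the wrapped promise against a labeled timeout.

export function withTimeout<T>(promise: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => {
    if (timer) clearTimeout(timer); // always release the timer
  });
}
```

With this shape, withTimeout(processAirtableUpdates(), 65000, 'Airtable update') rejects with a labeled error after 65 s if the integration update hangs, which the .catch handlers above then log.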
 private initializeMemoryManagement(): void {
-  setInterval(() => {
+  this.memoryManagementInterval = setInterval(() => {
     const memoryUsage = process.memoryUsage();
     const heapUsageRatio = memoryUsage.heapUsed / MEMORY_CONFIG.maxHeapSize;

     if (heapUsageRatio > MEMORY_CONFIG.heapUsageThreshold * 1.2) {
       logger.warn('Critical memory pressure detected, triggering emergency cleanup');
       this.performMemoryCleanup();
     } else if (heapUsageRatio > MEMORY_CONFIG.heapUsageThreshold) {
       logger.warn('High memory usage detected, triggering cleanup');

-      if (this.screenshotQueue.length > 0) {
-        this.screenshotQueue = [];
-        logger.info('Screenshot queue cleared due to memory pressure');
-      }
-
       if (global.gc && heapUsageRatio > MEMORY_CONFIG.heapUsageThreshold * 1.1) {
         global.gc();
       }
     }
-
-    if (this.screenshotQueue.length > SCREENCAST_CONFIG.maxQueueSize) {
-      this.screenshotQueue = this.screenshotQueue.slice(-SCREENCAST_CONFIG.maxQueueSize);
-    }
   }, MEMORY_CONFIG.gcInterval);
 }
initializeMemoryManagement() is never called.
The method sets up a memory management interval but is never invoked during browser initialization or elsewhere in this class. The interval will never run, making this dead code.
Either call this method during initialization (e.g., at the end of initialize()), or remove it if memory management is not yet needed:
// At the end of the initialize method, after success = true:
+ this.initializeMemoryManagement();
  success = true;

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In server/src/browser-management/classes/RemoteBrowser.ts around lines 157 to
178, initializeMemoryManagement() is defined but never called so its interval
never runs; to fix, invoke this.initializeMemoryManagement() at the end of the
class initialization flow (for example at the end of the initialize() method) so
the interval is started when the browser is created, or if memory management is
not desired remove the method and associated MEMORY_CONFIG usage and tests;
ensure any call is guarded so it only runs once and that teardown/clearInterval
is handled in the browser shutdown path.
const nsps = (io as any)._nsps;
if (nsps && nsps.has(`/${id}`)) {
  const ns = nsps.get(`/${id}`);
  if (ns && ns.sockets && ns.sockets.size === 0) {
    nsps.delete(`/${id}`);
    logger.log('debug', `Deleted empty namespace /${id} from io._nsps Map`);
  } else {
    logger.log('warn', `Namespace /${id} still has ${ns?.sockets?.size || 0} sockets, skipping manual deletion`);
  }
}
🧩 Analysis chain
🌐 Web query:
Socket.IO v4 dynamic namespace cleanup best practices public API
💡 Result:
Short answer — use the built‑in public APIs introduced in v4 (and v4.6.0) rather than poking internal fields.
Recommended best practices (with public APIs)
- Enable automatic cleanup for dynamic child namespaces (v4.6.0): new Server option cleanupEmptyChildNamespaces: true — Socket.IO will remove a child namespace and close its adapter when its last socket disconnects. [1][2]
- Example: const io = new Server(httpServer, { cleanupEmptyChildNamespaces: true });. [1][2]
- Track and attach logic to namespace creation with the public "new_namespace" event so you can hook lifecycle/cleanup logic centrally. Example: io.on("new_namespace", nsp => { /* attach middleware, metrics, cleanup hooks */ });. [3]
- Use nsp.sockets (Map) to check number of connected sockets when you need to decide to clean up. nsp.sockets.size is public. [3]
- If you must perform manual cleanup (when not using the automatic option):
- Disconnect remaining sockets via socket.disconnect() or io.in(nspName).disconnectSockets() (public API). [4]
- Remove event listeners from the namespace (nsp.removeAllListeners()) so handlers don’t keep references. (Namespace is an EventEmitter.)
- Close the namespace adapter if your adapter exposes a close() method (adapters added support for close() to cooperate with cleanupEmptyChildNamespaces). [5]
Note: avoid touching internal/private server maps (e.g., io.nsps / io._nsps) — that’s not a public API and may break across versions. [6]
- If you use a pub/sub adapter (Redis, etc.), prefer adapter options tuned for dynamic namespaces (e.g., redis adapter subscriptionMode "static") and rely on adapter close() to free resources when namespaces are removed. [5]
- Monitor for leaks: when creating many dynamic namespaces, instrument new_namespace, adapter create/delete events and namespace.sockets size to ensure namespaces are removed as expected. Adapter emits create-room/delete-room/join-room/leave-room events you can listen to. [5]
Primary references
- v4.6.0 changelog + cleanupEmptyChildNamespaces addition. [1]
- Server options / cleanupEmptyChildNamespaces docs. [2]
- Namespaces (dynamic namespaces, nsp.sockets) and io.of() docs. [3]
- Server API (new_namespace event; disconnectSockets). [4]
- Adapter docs (adapter events and adapter close support for cleanup). [5]
- Community/manual method (older guidance showing disconnect + remove listeners + delete namespace — not public API; use only as last resort). [6]
If you want, I can paste minimal example code that demonstrates (a) enabling cleanupEmptyChildNamespaces and (b) a safe manual cleanup routine using only public APIs.
🌐 Web query:
Socket.IO v4 _nsps internal API alternatives remove namespace
💡 Result:
Short answer: don't use the internal io._nsps. Use the public APIs + the v4.6+ cleanup option.
Options
- If you want to disconnect all clients and effectively remove a namespace:
  - Disconnect sockets in that namespace: await io.of('/my-nsp').disconnectSockets(true);
  - Enable automatic cleanup so empty dynamic child namespaces are removed: const io = new Server(httpServer, { cleanupEmptyChildNamespaces: true }); (added in v4.6.0). [1][2]
- If you only need to stop a namespace from accepting new connections, keep a reference to the Namespace (io.of('/nsp')) and remove/replace your connection handlers; then disconnect existing sockets as above. [2]
Notes
- io._nsps is an internal/private map (not part of the public API) and may change; avoid relying on it. If you must force-remove a namespace (last resort) you can delete from that map in older code, but it’s unsupported. See the library types showing the internal _nsps field. [3]
Sources
- cleanupEmptyChildNamespaces & server options (v4.6.0). [1]
- Server API (io.of, disconnectSockets, new_namespace event). [2]
- socket.io types / internal _nsps reference. [3]
🌐 Web query:
Socket.IO v4 namespace management public API documentation
💡 Result:
Official Socket.IO v4 namespace management docs — key pages and main public API surface:
- Namespace overview (creation, main namespace, custom namespaces, dynamic namespaces, middleware, rooms, broadcasting). See "Namespaces" in the v4 docs. [1]
- Server API (Server / io instance inherits main-namespace methods; emits new_namespace; adapter setting). [2]
Main public API surface (server-side, v4)
- io.of(name | regexp | fn) — create/get a namespace (string for static, RegExp or function for dynamic namespaces). Use returned Namespace for further ops. [1]
- Namespace.use(fn) — register a middleware for that namespace (or parent namespace to apply to children). [1]
- Namespace.on('connection', socket => { ... }) — listen for connections in that namespace. [1]
- Namespace.emit(event, ...args) — broadcast from the namespace to its connected sockets. [1]
- Namespace.to(room) / Namespace.in(room) — target a room within the namespace for emitting. [1]
- io.on('new_namespace', nsp => { ... }) — server-level event fired when a dynamic namespace is created. [2]
- io.of('/') / io.sockets — the main (default) namespace; io inherits namespace methods. [1][2]
Client-side
- io(url?) or io('/namespace') — connect to main or a specific namespace; multiplexing behavior described. [1]
For full method/property lists and examples (complete API for Namespace, Socket, and Server), consult the v4 docs pages above. [1][2]
Sources:
[1] Socket.IO v4 — Namespaces.
[2] Socket.IO v4 — Server API.
🏁 Script executed:
find . -name "package.json" -type f -exec grep -l "socket.io" {} \;

Repository: getmaxun/maxun
Length of output: 72
🏁 Script executed:
find . -name "package.json" -type f | head -5 | xargs cat

Repository: getmaxun/maxun
Length of output: 5736
Refactor to use Socket.IO's public API instead of internal _nsps.
The code accesses (io as any)._nsps, which is Socket.IO's private internal map and not part of the public API. While the project uses Socket.IO v4.4.1 (predating the v4.6.0+ cleanupEmptyChildNamespaces option), manual cleanup can still be achieved using public APIs:
- Disconnect sockets: io.of('/namespace').disconnectSockets(true)
- Remove listeners: nsp.removeAllListeners()
- Close adapter: call adapter.close() if your adapter supports it
Even with a version check or documentation, relying on _nsps remains fragile and unsupported. Consider upgrading to Socket.IO v4.6.0+ to use the built-in cleanupEmptyChildNamespaces option, or refactor this cleanup logic to use public methods.
🤖 Prompt for AI Agents
In server/src/browser-management/controller.ts around lines 155 to 164, replace
the direct access to the internal (io as any)._nsps map with Socket.IO public
APIs: obtain the namespace with io.of(`/${id}`), check if it exists and has zero
connected sockets, call nsp.disconnectSockets(true) to force-disconnect any
sockets, then call nsp.removeAllListeners() and, if nsp.adapter && typeof
nsp.adapter.close === 'function', call nsp.adapter.close() to let the adapter
clean up; finally avoid manipulating private maps—if you must remove the
namespace reference ensure you do so using supported Socket.IO APIs or upgrade
to v4.6.0+ to enable cleanupEmptyChildNamespaces instead.
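A sketch of the manual cleanup routine using only public-style methods could look like the following. NamespaceLike is a hypothetical structural type standing in for a Socket.IO Namespace, so the routine can be exercised without a live server; the real code would pass io.of(`/${id}`).

```typescript
// Sketch of namespace cleanup using only public-style Socket.IO APIs,
// as recommended above. NamespaceLike is a hypothetical stand-in type.

interface NamespaceLike {
  sockets: Map<string, unknown>;
  disconnectSockets(close?: boolean): void;
  removeAllListeners(): void;
  adapter?: { close?: () => void };
}

export function cleanupNamespace(nsp: NamespaceLike): boolean {
  if (nsp.sockets.size > 0) {
    nsp.disconnectSockets(true); // force-disconnect remaining sockets
  }
  nsp.removeAllListeners();      // drop handlers so they don't retain references
  nsp.adapter?.close?.();        // let the adapter free its resources
  return nsp.sockets.size === 0;
}
```

Unlike deleting entries from io._nsps, none of these calls depend on Socket.IO internals, so the routine should survive library upgrades.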
const connectWithRetry = async (maxRetries: number = 3): Promise<Socket | null> => {
  let retryCount = 0;

  while (retryCount < maxRetries) {
    try {
      const socket = await waitForConnection;
      if (socket || retryCount === maxRetries - 1) {
        return socket;
      }
    } catch (error: any) {
      logger.log('warn', `Connection attempt ${retryCount + 1} failed for browser ${id}: ${error.message}`);
    }

    retryCount++;
    if (retryCount < maxRetries) {
      const delay = Math.pow(2, retryCount) * 1000;
      logger.log('info', `Retrying connection for browser ${id} in ${delay}ms (attempt ${retryCount + 1}/${maxRetries})`);
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }

  return null;
};
Retry logic is ineffective: same promise awaited in each iteration.
The waitForConnection promise is created once before connectWithRetry is called, but the retry loop awaits the same promise on each iteration. Once a promise is settled, awaiting it again returns the same result immediately without retrying.
The retry should create a new connection attempt each time:
- const connectWithRetry = async (maxRetries: number = 3): Promise<Socket | null> => {
- let retryCount = 0;
-
- while (retryCount < maxRetries) {
- try {
- const socket = await waitForConnection;
- if (socket || retryCount === maxRetries - 1) {
- return socket;
- }
- } catch (error: any) {
- logger.log('warn', `Connection attempt ${retryCount + 1} failed for browser ${id}: ${error.message}`);
- }
-
- retryCount++;
- if (retryCount < maxRetries) {
- const delay = Math.pow(2, retryCount) * 1000;
- logger.log('info', `Retrying connection for browser ${id} in ${delay}ms (attempt ${retryCount + 1}/${maxRetries})`);
- await new Promise(resolve => setTimeout(resolve, delay));
- }
- }
-
- return null;
- };
-
- const socket = await connectWithRetry(3);
+ // Wait for connection with single timeout - retries not effective for websocket listeners
+ const socket = await waitForConnection;Alternatively, if true retries are needed, the waitForConnection promise factory should be created inside the loop.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In server/src/browser-management/controller.ts around lines 328 to 350, the
retry loop currently awaits the same pre-created waitForConnection promise each
iteration so retries are ineffective; change the code to create a fresh
connection promise on every attempt (e.g. call a waitForConnectionFactory or
move creation of waitForConnection inside the loop) so each retry triggers a new
connection attempt, handle its success/error the same way, increment retryCount,
and keep the existing backoff/delay and logging behavior.
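The factory-based alternative can be sketched as follows. connectFactory is a placeholder for the real "wait for connection" logic; the key point is that each iteration calls it again, producing a new promise per attempt instead of re-awaiting a settled one.

```typescript
// Sketch of the factory-based retry: each attempt creates a NEW promise,
// so retries actually re-attempt. connectFactory is a placeholder for the
// project's real connection-wait logic.

export async function connectWithRetry<T>(
  connectFactory: () => Promise<T | null>,
  maxRetries = 3,
  baseDelayMs = 1000
): Promise<T | null> {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      const result = await connectFactory(); // fresh attempt each iteration
      if (result !== null) return result;
    } catch (err) {
      console.warn(`Connection attempt ${attempt + 1} failed:`, err);
    }
    if (attempt < maxRetries - 1) {
      const delay = Math.pow(2, attempt + 1) * baseDelayMs; // exponential backoff
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
  return null;
}
```

The generic signature keeps the sketch independent of the Socket type while preserving the original's backoff and logging behavior.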
await startWorkers();

io = new Server(server);
Socket.IO server is re-initialized after startup.
Line 182 creates a new Server(server) instance, overwriting the one initialized on lines 87-94. This means the carefully configured options (pingTimeout, cleanupEmptyChildNamespaces, etc.) are lost.
Either remove line 182 to keep the existing configured instance, or move the configuration to this location:
- io = new Server(server);
+ // io is already initialized with options at module levelIf re-initialization is intentional (e.g., to attach to the now-listening server), preserve the options:
- io = new Server(server);
+ io = new Server(server, {
+ cleanupEmptyChildNamespaces: true,
+ pingTimeout: 60000,
+ pingInterval: 25000,
+ maxHttpBufferSize: 1e8,
+ transports: ['websocket', 'polling'],
+ allowEIO3: true
+ });📝 Committable suggestion
io = new Server(server, {
  cleanupEmptyChildNamespaces: true,
  pingTimeout: 60000,
  pingInterval: 25000,
  maxHttpBufferSize: 1e8,
  transports: ['websocket', 'polling'],
  allowEIO3: true
});
🤖 Prompt for AI Agents
In server/src/server.ts around line 182, a new Socket.IO Server is being created
(io = new Server(server)), which overwrites the previously configured instance
from lines 87–94 and drops options like pingTimeout and
cleanupEmptyChildNamespaces; either remove this re-initialization so the
original configured `io` is used, or if you must create it here (e.g., after
server.listen), re-create it with the same options as the original configuration
(copy pingTimeout, cleanupEmptyChildNamespaces, transports, cors, etc.) and
re-attach any namespaces/middlewares previously registered so no configuration
is lost.
process.on('SIGTERM', async () => {
  await stopPgBossClient();
});

process.on('SIGINT', async () => {
  await stopPgBossClient();
});
🧩 Analysis chain
🏁 Script executed:
rg -n -A5 'stopPgBossClient' --type=ts

Repository: getmaxun/maxun
Length of output: 1459
🏁 Script executed:
rg -n -B5 -A10 'process\.on.*SIG' server/src/server.ts

Repository: getmaxun/maxun
Length of output: 565
🏁 Script executed:
rg -n -A30 "process\.on\('SIGINT'" server/src/server.ts

Repository: getmaxun/maxun
Length of output: 1626
🏁 Script executed:
rg -n 'SIGTERM' server/src/server.ts

Repository: getmaxun/maxun
Length of output: 156
🏁 Script executed:
rg -n -B5 -A20 'await stopPgBossClient' server/src/server.ts

Repository: getmaxun/maxun
Length of output: 1014
Signal handlers will not reliably prevent process exit.
The process.on() handlers for SIGTERM and SIGINT do not guarantee that stopPgBossClient() completes before the process exits because process.on() doesn't await async callbacks. For SIGTERM, there is no handler in server.ts, making shutdown of PgBoss unreliable. For SIGINT, both pgboss.ts and server.ts define handlers, creating redundancy and potential race conditions.
Remove these signal handlers from pgboss.ts entirely—server.ts already implements proper graceful shutdown with explicit await of stopPgBossClient() (line 384) and should be the sole coordinator of shutdown. If SIGTERM handling is needed, add it to server.ts's shutdown sequence rather than relying on async signal callbacks.
🤖 Prompt for AI Agents
In server/src/storage/pgboss.ts around lines 72-78, remove the
process.on('SIGTERM'...) and process.on('SIGINT'...) signal handlers entirely so
pgboss.ts no longer registers async shutdown callbacks; the application-level
server.ts should be the sole coordinator of graceful shutdown (it already awaits
stopPgBossClient()), so ensure stopPgBossClient remains exported and is invoked
from server.ts's shutdown sequence (and do not add duplicate signal handlers
elsewhere).
export async function scheduleWorkflow(id: string, userId: string, cronExpression: string, timezone: string): Promise<void> {
  try {
    const runId = uuid();

    const queueName = `scheduled-workflow-${id}`;

    logger.log('info', `Scheduling workflow ${id} with cron expression ${cronExpression} in timezone ${timezone}`);

    await pgBossClient.createQueue(queueName);

    await pgBossClient.schedule(queueName, cronExpression,
      { id, runId, userId },
      { tz: timezone }
    );
runId is generated at schedule time, not execution time.
The runId is generated once when the workflow is scheduled (line 18) and embedded in the job data. For recurring cron jobs, this means every execution will have the same runId, which could cause issues with run tracking and deduplication.
Consider generating the runId in the worker that processes the scheduled job, not in the scheduling function:
export async function scheduleWorkflow(id: string, userId: string, cronExpression: string, timezone: string): Promise<void> {
try {
- const runId = uuid();
-
const queueName = `scheduled-workflow-${id}`;
logger.log('info', `Scheduling workflow ${id} with cron expression ${cronExpression} in timezone ${timezone}`);
await pgBossClient.createQueue(queueName);
await pgBossClient.schedule(queueName, cronExpression,
- { id, runId, userId },
+ { id, userId }, // runId should be generated by the worker on each execution
{ tz: timezone }
);🤖 Prompt for AI Agents
In server/src/storage/schedule.ts around lines 16–29, the code currently
generates a runId at schedule time and embeds it in the scheduled job payload
which causes every cron occurrence to share the same runId; remove the uuid()
generation here and stop including runId in the data passed to
pgBossClient.schedule, and instead generate a new runId inside the job
processing worker when the scheduled job executes (update the job payload
typing/consumers to expect runId to be created at execution time and propagate
that new runId to any run-tracking/deduplication logic).
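Worker-side generation can be sketched as follows. The job shape and handler wiring are illustrative placeholders, not the project's actual worker code; randomUUID is Node's built-in from node:crypto.

```typescript
// Sketch of generating runId inside the worker, per the suggestion above.
// ScheduledJobData and the execute callback are hypothetical placeholders.

import { randomUUID } from 'node:crypto';

interface ScheduledJobData { id: string; userId: string; }

export function makeScheduledWorkflowHandler(
  execute: (workflowId: string, userId: string, runId: string) => Promise<void>
) {
  return async (job: { data: ScheduledJobData }) => {
    const runId = randomUUID(); // fresh per execution, not per schedule
    await execute(job.data.id, job.data.userId, runId);
    return runId;
  };
}
```

Because the UUID is minted inside the handler, every cron occurrence of the same scheduled workflow gets a distinct runId, which is what run tracking and deduplication need.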
private pageCloseListeners: Map<Page, () => void> = new Map();
pageCloseListeners map is declared but never populated.
The cleanup method iterates over pageCloseListeners (lines 974-983) to remove page close listeners, but the map is never populated. Looking at notifyOnNewTab (line 1221), page close listeners are added directly without being tracked in this map.
Either populate the map when adding listeners, or remove the unused cleanup logic:
public notifyOnNewTab = (page: Page, pageIndex: number) => {
if (this.socket) {
- page.on('close', () => {
+ const closeListener = () => {
this.socket.emit('tabHasBeenClosed', pageIndex);
- })
+ };
+ page.on('close', closeListener);
+ this.pageCloseListeners.set(page, closeListener);
const parsedUrl = new URL(page.url());
const host = parsedUrl.hostname?.match(/\b(?!www\.)[a-zA-Z0-9]+/g)?.join('.');
this.socket.emit('newTab', host ? host : 'new tab')
}
}Also applies to: 974-983
🤖 Prompt for AI Agents
In server/src/workflow-management/classes/Generator.ts around lines 74-75 (and
refer to lines 974-983 and 1221), the pageCloseListeners Map is declared but
never populated while notifyOnNewTab adds close listeners directly; update
notifyOnNewTab (near line 1221) to store each listener in pageCloseListeners
keyed by the Page when you add it (so cleanup at lines 974-983 can iterate and
remove them), or if you prefer not to track them, remove the cleanup loop and
the pageCloseListeners declaration; implement whichever option you choose
consistently (either populate the map when adding listeners and remove them in
cleanup, or delete the unused map and cleanup code).
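The track-then-remove pattern suggested above can be demonstrated end to end with Node's EventEmitter standing in for a Playwright Page. TabTracker is a hypothetical illustration of the shape, not the Generator class itself.

```typescript
// Sketch of the tracked-listener pattern: remember each close listener in a
// Map so cleanup can remove exactly the handlers that were registered.
// EventEmitter is a stand-in for a Playwright Page.

import { EventEmitter } from 'node:events';

export class TabTracker {
  private pageCloseListeners = new Map<EventEmitter, () => void>();

  track(page: EventEmitter, onClose: () => void): void {
    page.on('close', onClose);
    this.pageCloseListeners.set(page, onClose); // remember for later removal
  }

  cleanup(): void {
    for (const [page, listener] of this.pageCloseListeners) {
      page.removeListener('close', listener);
    }
    this.pageCloseListeners.clear();
  }

  get trackedCount(): number {
    return this.pageCloseListeners.size;
  }
}
```

Keeping the exact listener reference is what makes removal possible; removeListener with a different function instance would silently do nothing, which is how untracked listeners leak.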
/**
 * Removes pausing-related socket listeners to prevent memory leaks
 * Must be called before re-registering listeners or during cleanup
 * @private
 */
private removePausingListeners(): void {
  try {
    this.socket.removeAllListeners('pause');
    this.socket.removeAllListeners('resume');
    this.socket.removeAllListeners('step');
    this.socket.removeAllListeners('breakpoints');
    logger.log('debug', 'Removed pausing socket listeners');
  } catch (error: any) {
    logger.warn(`Error removing pausing listeners: ${error.message}`);
  }
}
Incorrect logger method: logger.warn() should be logger.log('warn', ...).
Line 190 uses logger.warn() but based on the import and usage pattern throughout the file, the logger uses logger.log('warn', ...) syntax.
} catch (error: any) {
- logger.warn(`Error removing pausing listeners: ${error.message}`);
+ logger.log('warn', `Error removing pausing listeners: ${error.message}`);
}📝 Committable suggestion
/**
 * Removes pausing-related socket listeners to prevent memory leaks
 * Must be called before re-registering listeners or during cleanup
 * @private
 */
private removePausingListeners(): void {
  try {
    this.socket.removeAllListeners('pause');
    this.socket.removeAllListeners('resume');
    this.socket.removeAllListeners('step');
    this.socket.removeAllListeners('breakpoints');
    logger.log('debug', 'Removed pausing socket listeners');
  } catch (error: any) {
    logger.log('warn', `Error removing pausing listeners: ${error.message}`);
  }
}
🤖 Prompt for AI Agents
In server/src/workflow-management/classes/Interpreter.ts around lines 177 to
192, the catch block currently calls logger.warn(...) which is inconsistent with
the file's logger usage; change the call to logger.log('warn', `Error removing
pausing listeners: ${error?.message}`) (preserving the same message and safe
access to error.message) so it matches the established logger.log(level,
message) pattern.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
server/src/routes/record.ts (2)
26-58: Potential timer leak if promise rejects early.

The setTimeout on line 50 schedules the next check, but if the promise rejects on lines 32, 39, 47, or 52, there's a brief window where the next checkJobStatus may still execute. While the check on line 31 prevents infinite recursion, this could cause one extra unnecessary poll after rejection. Consider storing the timer and clearing it on early exit:
 async function waitForJobCompletion(jobId: string, queueName: string, timeout = 15000): Promise<any> {
   return new Promise((resolve, reject) => {
     const startTime = Date.now();
+    let timeoutId: NodeJS.Timeout | null = null;
     const checkJobStatus = async () => {
       if (Date.now() - startTime > timeout) {
+        if (timeoutId) clearTimeout(timeoutId);
         return reject(new Error(`Timeout waiting for job ${jobId} to complete`));
       }
       try {
         const job = await pgBossClient.getJobById(queueName, jobId);
         if (!job) {
+          if (timeoutId) clearTimeout(timeoutId);
           return reject(new Error(`Job ${jobId} not found`));
         }
         if (job.state === 'completed') {
+          if (timeoutId) clearTimeout(timeoutId);
           return resolve(job.output);
         }
         if (job.state === 'failed') {
+          if (timeoutId) clearTimeout(timeoutId);
           return reject(new Error(`Job ${jobId} failed.`));
         }
-        setTimeout(checkJobStatus, 200);
+        timeoutId = setTimeout(checkJobStatus, 200);
       } catch (error) {
+        if (timeoutId) clearTimeout(timeoutId);
         reject(error);
       }
     };
     checkJobStatus();
   });
 }
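An alternative to sprinkling clearTimeout at every exit is an async polling loop, which has no scheduled callback to clear on early exit. This is a sketch under the assumption that the job-status lookup can be abstracted behind a callback; getState is a placeholder for the real pgBossClient.getJobById call.

```typescript
// Alternative sketch: an async polling loop avoids the timer-leak problem
// entirely, since nothing is scheduled that would need cancelling.
// getState is a hypothetical stand-in for the real job-status lookup.

export async function pollUntilDone<T>(
  getState: () => Promise<{ done: boolean; value?: T; error?: string }>,
  timeoutMs: number,
  intervalMs = 200
): Promise<T> {
  const start = Date.now();
  while (Date.now() - start <= timeoutMs) {
    const state = await getState();
    if (state.error) throw new Error(state.error);   // failed job: reject
    if (state.done) return state.value as T;          // completed job: resolve
    await new Promise(resolve => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Timed out after ${timeoutMs}ms`);
}
```

Any early throw simply exits the loop, so there is no window where a stray poll can fire after rejection.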
79-89: Verify pgBossClient is initialized before use by adjusting startup timing. The server listens for requests at line 168 in
`server/src/server.ts`, but `startPgBossClient()` is called asynchronously inside the listen callback at line 178. This creates a race condition: requests arriving between `server.listen()` and `startPgBossClient()` completion will attempt to call `pgBossClient.send()` on an unstarted client. While try-catch blocks in
`server/src/routes/record.ts` (lines 78, 138, 232, 268) handle errors and fall back to `initializeRemoteBrowserForRecording()`, this masks the underlying initialization gap. Either move `startPgBossClient()` to execute before `server.listen()` or add a startup guard to prevent route handlers from processing requests until initialization completes.
server/src/api/record.ts (1)
591-613: API `formats` override is currently ignored
`executeRun` supports an optional `requestedFormats?: string[]` parameter and correctly overrides `recording.recording_meta.formats` when provided. However:
- `readyForRunHandler` calls `executeRun(id, userId)` without passing `requestedFormats`.
- `handleRunRecording` receives `requestedFormats` and uses it only to call `handleRunRecording` itself; it is never forwarded to `readyForRunHandler` / `executeRun`.

As a result, the
`formats` field in the POST body to `/api/robots/{id}/runs` has no effect. To honor the API contract, wire the argument through:
```diff
-async function readyForRunHandler(browserId: string, id: string, userId: string, socket: Socket){
+async function readyForRunHandler(
+  browserId: string,
+  id: string,
+  userId: string,
+  socket: Socket,
+  requestedFormats?: string[]
+) {
   try {
-    const result = await executeRun(id, userId);
+    const result = await executeRun(id, userId, requestedFormats);
@@
 export async function handleRunRecording(id: string, userId: string, requestedFormats?: string[]) {
@@
-    const readyHandler = () => readyForRunHandler(browserId, newRunId, userId, socket!);
+    const readyHandler = () => readyForRunHandler(browserId, newRunId, userId, socket!, requestedFormats);
```
Also applies to: 632-699, 1276-1279
♻️ Duplicate comments (7)
server/src/workflow-management/classes/Generator.ts (1)
74-75:pageCloseListenersmap is still never populated.The map is declared but
notifyOnNewTab(lines 1219-1228) still adds page close listeners directly viapage.on('close', ...)without storing them in this map. The cleanup loop at lines 974-983 will iterate over an empty map, making the cleanup ineffective.Apply the fix from the past review to populate the map:
public notifyOnNewTab = (page: Page, pageIndex: number) => { if (this.socket) { - page.on('close', () => { + const closeListener = () => { this.socket.emit('tabHasBeenClosed', pageIndex); - }) + }; + page.on('close', closeListener); + this.pageCloseListeners.set(page, closeListener); const parsedUrl = new URL(page.url()); const host = parsedUrl.hostname?.match(/\b(?!www\.)[a-zA-Z0-9]+/g)?.join('.'); this.socket.emit('newTab', host ? host : 'new tab') } }server/src/browser-management/classes/RemoteBrowser.ts (1)
154-175:initializeMemoryManagement()is still never called.The method sets up memory monitoring intervals but is never invoked during browser initialization or elsewhere. The interval will never run.
Call this method at the end of successful initialization:
this.emitLoadingProgress(60, 0); success = true; + this.initializeMemoryManagement(); logger.log('debug', `Browser initialized successfully for user ${userId}`);server/src/workflow-management/classes/Interpreter.ts (1)
177-192: Fix ordering aroundupdateSocketto actually detach listeners from the old socket, and align logger usage.Two issues here:
- Socket listener leak due to order of operations
removePausingListeners()operates onthis.socket.- In
updateSocket, you first assignthis.socket = socketand then callsubscribeToPausing(), which in turn callsremovePausingListeners().- Result: you remove listeners from the new socket (which has none) and leave all listeners attached to the old socket, which is likely the opposite of what you intend and can cause memory leaks / double‑handling if the old socket is still alive.
A minimal fix is to remove listeners before swapping sockets:
```diff
 public updateSocket = (socket: Socket): void => {
-  this.socket = socket;
-  this.subscribeToPausing();
+  // Detach pausing listeners from the current socket before switching
+  this.removePausingListeners();
+  this.socket = socket;
+  this.subscribeToPausing();
 };
```
- First call cleans up listeners on the existing socket.
subscribeToPausing()then removes (no‑op) and registers fresh listeners on the new socket.- Subsequent socket swaps correctly clean up the previous socket’s handlers.
- Logger usage inconsistent with rest of file (previously flagged)
In
removePausingListeners’s catch block you still uselogger.warn(...), but everywhere else the file followslogger.log(level, message):- } catch (error: any) { - logger.warn(`Error removing pausing listeners: ${error.message}`); - } + } catch (error: any) { + logger.log('warn', `Error removing pausing listeners: ${error?.message ?? error}`); + }This avoids API inconsistency and matches the established logger pattern mentioned in the earlier review.
Also applies to: 200-226, 409-410, 731-734
server/src/pgboss-worker.ts (1)
98-122: Initialize integration task `retries` to 0 instead of 5. Both Google Sheets and Airtable tasks are enqueued with
`retries: 5`. The processors (`processAirtableUpdates` / `processGoogleSheetUpdates`) treat `retries` as an attempt counter starting from 0 and only retry while `retries < MAX_RETRIES`. Initializing to 5 effectively disables retries (and can make logging misleading). Consider:
```diff
 addGoogleSheetUpdateTask(runId, {
   robotId: robotMetaId,
   runId: runId,
   status: 'pending',
-  retries: 5,
+  retries: 0,
 });
 addAirtableUpdateTask(runId, {
   robotId: robotMetaId,
   runId: runId,
   status: 'pending',
-  retries: 5,
+  retries: 0,
 });
```
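For illustration, the counter semantics can be reduced to a few lines. This is a standalone sketch: `IntegrationTask` and `remainingRetries` are hypothetical names, not the project's actual types.

```typescript
// Sketch only: models a task whose `retries` field counts attempts already
// consumed; processors retry while retries < MAX_RETRIES.
const MAX_RETRIES = 5;

interface IntegrationTask {
  runId: string;
  retries: number; // attempts consumed so far
}

// How many retry attempts the task will still receive.
function remainingRetries(task: IntegrationTask): number {
  return Math.max(0, MAX_RETRIES - task.retries);
}

// Seeded with 0, the task keeps its full retry budget...
const fresh: IntegrationTask = { runId: "run-1", retries: 0 };
// ...seeded with 5, the budget is spent before the first failure.
const exhausted: IntegrationTask = { runId: "run-2", retries: 5 };

console.log(remainingRetries(fresh)); // 5
console.log(remainingRetries(exhausted)); // 0
```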
104-137: Integration task retries should start at 0; helper duplication mirrors other files
triggerIntegrationUpdatesenqueues both Google Sheets and Airtable tasks withretries: 5, then starts the processors viawithTimeout. As in the other modules, the processors interpretretriesas a counter from 0, so starting at 5 effectively skips the retry window.Suggest:
```diff
 addGoogleSheetUpdateTask(runId, {
   robotId: robotMetaId,
   runId: runId,
   status: 'pending',
-  retries: 5,
+  retries: 0,
 });
 addAirtableUpdateTask(runId, {
   robotId: robotMetaId,
   runId: runId,
   status: 'pending',
-  retries: 5,
+  retries: 0,
 });
```
withTimeouthere is identical to the helpers inpgboss-worker.tsandapi/record.ts; a shared utility would avoid drift if you ever tweak its behavior.server/src/browser-management/inputHandlers.ts (1)
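A minimal shape such a shared utility could take, sketched under the assumption that the existing helpers are `Promise.race`-based (the name `withTimeout` follows the review's usage; everything else here is illustrative):

```typescript
// Races a promise against a timer; clears the timer afterwards so a pending
// timeout can't fire late or keep the event loop alive.
function withTimeout<T>(promise: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

async function demo(): Promise<string> {
  // Settles well before its deadline.
  const fast = await withTimeout(Promise.resolve("done"), 1_000, "fast task");
  // Never settles on its own, so the 50ms deadline rejects it.
  const slow = await withTimeout(
    new Promise<string>(() => { /* never settles */ }),
    50,
    "slow task"
  ).catch((err: Error) => err.message);
  return `${fast} / ${slow}`;
}
```

Centralizing it in one module also keeps details like timer cleanup consistent across every call site.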
1059-1084: Remove unuseddom:inputlistener cleanup or add matching registration
removeInputHandlerscallssocket.removeAllListeners("dom:input"), but there is no correspondingsocket.on("dom:input", ...)registration inregisterInputHandlers. This is harmless but misleading and may cause future drift.Consider either:
- Removing the
socket.removeAllListeners("dom:input");line, or- Adding the missing
"dom:input"registration if it was intended to be supported.server/src/api/record.ts (1)
556-586: Integration task retries should start at 0, not 5As in the worker and scheduler modules,
triggerIntegrationUpdateshere enqueues tasks withretries: 5, but the Airtable/Google Sheets processors treatretriesas a retry counter starting from 0. Initializing to 5 effectively prevents any retries.Consider:
```diff
 addGoogleSheetUpdateTask(runId, {
   robotId: robotMetaId,
   runId: runId,
   status: 'pending',
-  retries: 5,
+  retries: 0,
 });
 addAirtableUpdateTask(runId, {
   robotId: robotMetaId,
   runId: runId,
   status: 'pending',
-  retries: 5,
+  retries: 0,
 });
```
🧹 Nitpick comments (8)
server/src/browser-management/classes/RemoteBrowser.ts (1)
417-437: Redundant nested try-catch blocks.The outer try-catch (lines 421, 433-436) is unreachable because the inner catch (lines 430-432) swallows all errors without rethrowing. Simplify to a single try-catch:
```diff
 private async applyEnhancedFingerprinting(context: BrowserContext): Promise<void> {
   try {
-    try {
-      const fingerprintGenerator = new FingerprintGenerator();
-      const fingerprint = fingerprintGenerator.getFingerprint();
-      const fingerprintInjector = new FingerprintInjector();
-
-      await fingerprintInjector.attachFingerprintToPlaywright(context as any, fingerprint);
-
-      logger.info("Enhanced fingerprinting applied successfully");
-    } catch (fingerprintError: any) {
-      logger.warn(`Modern fingerprint injection failed: ${fingerprintError.message}. Using existing protection.`);
-    }
-  } catch (error: any) {
-    logger.error(`Enhanced fingerprinting failed: ${error.message}`);
-    // Don't throw - fallback to basic functionality
+    const fingerprintGenerator = new FingerprintGenerator();
+    const fingerprint = fingerprintGenerator.getFingerprint();
+    const fingerprintInjector = new FingerprintInjector();
+
+    await fingerprintInjector.attachFingerprintToPlaywright(context as any, fingerprint);
+
+    logger.info("Enhanced fingerprinting applied successfully");
+  } catch (error: any) {
+    logger.warn(`Modern fingerprint injection failed: ${error.message}. Using existing protection.`);
+    // Don't throw - fallback to basic functionality
   }
 }
```
1085-1087: Add logging when circuit breaker prevents processing.The early return when the circuit breaker is open happens silently. This could make it difficult to diagnose why queued runs aren't being processed during cooldown periods.
Add a log message:
```diff
 if (Date.now() < circuitBreakerOpenUntil) {
+  logger.log('debug', `Circuit breaker open, skipping queue processing until ${new Date(circuitBreakerOpenUntil).toISOString()}`);
   return;
 }
```
146-152: Bounded persistence retries and timer cleanup look good; consider minor tidy‑ups.The new
persistenceRetryTimer,MAX_PERSISTENCE_RETRIES, and corresponding cleanup inclearState+flushPersistenceBuffergive you bounded, backoff‑based retries and avoid timer leaks. Behaviorally this looks solid.Two minor cleanups you might consider:
- The guard in the catch block is a no‑op and can be dropped:
- if (!this.persistenceRetryCount) { - this.persistenceRetryCount = 0; - }
- Optionally clear any outstanding retry timer on a successful flush, to avoid a stray timeout firing later (it will early‑return, but you can skip the extra call):
try { const sequelize = require('../../storage/db').default; await sequelize.transaction(async (transaction: any) => { // ... }); this.persistenceRetryCount = 0; + if (this.persistenceRetryTimer) { + clearTimeout(this.persistenceRetryTimer); + this.persistenceRetryTimer = null; + }These are purely maintenance/clarity improvements; current logic is functionally fine.
Also applies to: 372-391, 851-867
server/src/browser-management/inputHandlers.ts (1)
871-1014: Pagination scroll test handler is correct; selector helper duplication is optional to DRY upThe scroll test logic (pre/post item counts, scroll restore, error emission) looks solid and matches the intended XPath/CSS behavior. Since
evaluateSelectoris already implemented insrc/helpers/clientPaginationDetector.ts, consider extracting a shared helper or at least avoiding duplicating the same logic twice inside this function to keep maintenance simpler.server/src/pgboss-worker.ts (1)
89-96: Timeout handling is good; consider centralizing helper usageThe new timeouts around interpretation and scrape conversions are a solid guardrail against hung jobs. Right now you:
- Define
withTimeoutonce.- Re‑implement Promise.race-based timeouts for
SCRAPE_TIMEOUTandINTERPRETATION_TIMEOUT.For consistency and easier tuning, consider using
withTimeoutfor the scrape conversions as well (or extracting a shared async-utils module so this helper is reused across this file,scheduler/index.ts, andapi/record.ts).Also applies to: 248-265, 398-412
server/src/workflow-management/scheduler/index.ts (2)
625-644: Socket-awarereadyForRunHandlerand cleanup are correct;resetRecordingStateis effectively a no-opPassing the socket into
readyForRunHandlerand always invokingcleanupSocketConnectioninfinallynicely guarantees socket cleanup regardless of run outcome. One small nit:resetRecordingState(browserId, id)only reassigns its parameters, so it has no observable effect and can be removed unless you later evolve it to mutate shared state.Also applies to: 693-713
651-691: Verify BACKEND_URL fallback port consistency with API runnerHere the connection URL falls back to
http://localhost:5000, whileserver/src/api/record.ts’shandleRunRecordingfalls back tohttp://localhost:8080. If both paths are intended to talk to the same backend instance, this mismatch could cause scheduled vs API runs to behave differently in local/dev environments.Please double-check which port is correct and align both call sites (or configure via a single shared constant).
server/src/api/record.ts (1)
1097-1114: Verify BACKEND_URL fallback port vs schedulerHere the socket client falls back to
http://localhost:8080, while the scheduler version useshttp://localhost:5000. If both should target the same backend, this discrepancy could cause API-triggered runs to connect to a different port than scheduler-triggered runs.Recommend aligning both to a single constant (e.g.,
SOCKET_BACKEND_URL) so local/dev configs don’t silently diverge.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
package.json(1 hunks)server/src/api/record.ts(10 hunks)server/src/browser-management/classes/RemoteBrowser.ts(17 hunks)server/src/browser-management/inputHandlers.ts(2 hunks)server/src/pgboss-worker.ts(13 hunks)server/src/routes/record.ts(6 hunks)server/src/routes/storage.ts(9 hunks)server/src/workflow-management/classes/Generator.ts(3 hunks)server/src/workflow-management/classes/Interpreter.ts(4 hunks)server/src/workflow-management/scheduler/index.ts(10 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- package.json
🧰 Additional context used
🧬 Code graph analysis (5)
server/src/routes/storage.ts (1)
server/src/storage/pgboss.ts (1)
pgBossClient(27-30)
server/src/routes/record.ts (1)
server/src/storage/pgboss.ts (1)
pgBossClient(27-30)
server/src/browser-management/inputHandlers.ts (3)
server/src/server.ts (1)
browserPool(99-99)maxun-core/src/browserSide/scraper.js (2)
selector(28-28)node(664-664)src/helpers/clientPaginationDetector.ts (1)
evaluateSelector(290-318)
server/src/api/record.ts (7)
server/src/workflow-management/integrations/gsheet.ts (2)
addGoogleSheetUpdateTask(25-38)processGoogleSheetUpdates(312-365)server/src/workflow-management/integrations/airtable.ts (2)
addAirtableUpdateTask(26-39)processAirtableUpdates(514-569)server/src/markdownify/scrape.ts (2)
convertPageToMarkdown(26-77)convertPageToHTML(85-135)server/src/routes/webhook.ts (1)
sendWebhook(404-434)maxun-core/src/interpret.ts (1)
run(1664-1700)server/src/browser-management/controller.ts (1)
destroyRemoteBrowser(124-200)server/src/server.ts (1)
io(87-94)
server/src/workflow-management/scheduler/index.ts (6)
server/src/workflow-management/integrations/gsheet.ts (2)
addGoogleSheetUpdateTask(25-38)processGoogleSheetUpdates(312-365)server/src/workflow-management/integrations/airtable.ts (2)
addAirtableUpdateTask(26-39)processAirtableUpdates(514-569)server/src/markdownify/scrape.ts (2)
convertPageToMarkdown(26-77)convertPageToHTML(85-135)server/src/browser-management/controller.ts (1)
destroyRemoteBrowser(124-200)maxun-core/src/interpret.ts (1)
run(1664-1700)server/src/server.ts (1)
io(87-94)
🔇 Additional comments (23)
server/src/workflow-management/classes/Generator.ts (2)
889-910: LGTM!The
`removeSocketListeners` method correctly removes all event listeners that are registered in `initializeSocketListeners`, `initializeDOMListeners`, and `registerEventHandlers`. Error handling is appropriate.
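The register/remove discipline behind such a method can be sketched with a plain `EventEmitter`. Names here are illustrative, not the Generator class's real fields:

```typescript
import { EventEmitter } from "node:events";

// Records every handler attached through it, so the exact same function
// references can be detached later (one handler per event, for brevity).
class ListenerRegistry {
  private handlers = new Map<string, (...args: unknown[]) => void>();

  constructor(private emitter: EventEmitter) {}

  on(event: string, handler: (...args: unknown[]) => void): void {
    this.emitter.on(event, handler);
    this.handlers.set(event, handler);
  }

  removeAll(): void {
    for (const [event, handler] of this.handlers) {
      this.emitter.removeListener(event, handler);
    }
    this.handlers.clear();
  }
}

const emitter = new EventEmitter();
const registry = new ListenerRegistry(emitter);
let calls = 0;
registry.on("action", () => { calls += 1; });

emitter.emit("action"); // handler attached: increments calls
registry.removeAll();
emitter.emit("action"); // handler detached: no further increments
```

Keeping the original function references around is the key point: `removeListener` only detaches the exact function that was registered.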
966-997: Well-structured cleanup method.The cleanup method properly:
- Removes socket listeners first
- Defensively checks
page.isClosed()before removing listeners- Resets internal state completely
- Has granular error handling
Once the
pageCloseListenerspopulation issue is fixed (per the earlier comment), this cleanup will be fully effective.server/src/browser-management/classes/RemoteBrowser.ts (5)
233-277: Good defensive cleanup pattern for scroll listener.Properly removes existing
dom:scrolllistener before re-registering to prevent duplicate handlers. The try-catch ensures stability even if removal fails.
653-670: LGTM!The method correctly removes all socket listeners registered in
registerEditorEvents. Error handling is appropriate.
877-1009: Excellent comprehensive cleanup implementation.The
switchOffmethod now properly:
- Clears all intervals and timeouts first (lines 878-908)
- Removes socket listeners (line 910)
- Cleans up generator (lines 926-934)
- Uses 5-second timeout guards for async operations (CDP, page, context, browser close)
- Nulls resources in
finallyblocks ensuring cleanup even on timeout- Has individual try-catch blocks preventing cascading failures
The cleanup order (listeners → generator → interpreter → DOM → CDP → page → context → browser) is correct.
1018-1027: LGTM!Good conditional logic to only re-register the scroll event listener when DOM streaming is active, preventing unnecessary listener setup.
445-594: Good timeout protection for initialization.The initialization now has:
- Overall 120-second timeout guard (line 447, 589-593)
- Nested 15-second timeout for context creation (lines 508-513)
- Retry logic with cleanup between attempts (lines 571-584)
This prevents indefinite hangs during browser initialization.
server/src/routes/record.ts (2)
18-18: LGTM: Import updated to use pgBossClient singleton.The import path change aligns with the PR's objective to consolidate queue operations through a shared pgBossClient singleton.
139-169: LGTM: Consistent queue operation pattern.The endpoint follows the same pgBossClient pattern established in the
`/start` endpoint with appropriate fallback handling.
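The singleton pattern being consolidated onto can be sketched as follows. `QueueClient` and `getPgBossClient` are hypothetical names standing in for the real PgBoss wrapper:

```typescript
// Stand-in for the real PgBoss wrapper; start() marks it usable.
class QueueClient {
  private started = false;
  async start(): Promise<this> {
    this.started = true;
    return this;
  }
  get isStarted(): boolean {
    return this.started;
  }
}

let instance: QueueClient | null = null;
let starting: Promise<QueueClient> | null = null;

// Every importer shares one instance; concurrent callers share the same
// in-flight start() instead of racing to create duplicate clients.
async function getPgBossClient(): Promise<QueueClient> {
  if (instance) return instance;
  if (!starting) {
    starting = new QueueClient().start().then((client) => (instance = client));
  }
  return starting;
}
```

Memoizing the start promise (not just the instance) is what closes the race window noted in the outside-diff comment above.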
665-667: LGTM: Consistent queue operations.The queue creation and job sending pattern is consistent with the approach in
record.ts.
1077-1081: Circuit breaker state is shared across all queue processing.The circuit breaker variables are module-level, meaning the circuit breaker state affects all users and all queued runs globally. If one user's queued runs trigger database errors, all queue processing stops during the cooldown period.
This is acceptable for protecting the database from cascading failures, but be aware that a single problematic run could temporarily block queue processing for all users.
Consider whether per-user circuit breakers would be more appropriate for your use case, or if the global circuit breaker adequately serves your database protection needs.
1149-1158: LGTM: Circuit breaker error handling is correct.The circuit breaker properly tracks consecutive errors and opens after reaching the threshold. The cooldown mechanism will prevent excessive database load during outages.
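A stripped-down version of that consecutive-error breaker, for reference. Threshold and cooldown values here are illustrative, not the ones in `storage.ts`:

```typescript
const ERROR_THRESHOLD = 3;
const COOLDOWN_MS = 30_000;

let consecutiveErrors = 0;
let openUntil = 0; // timestamp before which processing is skipped

function recordSuccess(): void {
  consecutiveErrors = 0;
}

function recordFailure(now: number): void {
  consecutiveErrors += 1;
  if (consecutiveErrors >= ERROR_THRESHOLD) {
    openUntil = now + COOLDOWN_MS; // trip: skip work during the cooldown
    consecutiveErrors = 0;
  }
}

function canProcess(now: number): boolean {
  return now >= openUntil;
}
```

Passing `now` explicitly keeps the logic deterministic and easy to test; production code would call it with `Date.now()`.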
1130-1138: LGTM: Queue operations properly error-handled.The queue operations for execution are wrapped in the try-catch block that feeds the circuit breaker, ensuring queue failures are properly tracked.
1238-1238: LGTM: Exporting processQueuedRuns for external use.The export allows other modules (likely server initialization or scheduled jobs) to trigger queue processing, which aligns with the PR's goals for improved queue management.
973-980: LGTM: Schedule cancellation updated.The comment and code correctly reflect the use of
cancelScheduledWorkflow, which internally uses pgBossClient as per the PR's consolidation objectives.
591-592: LGTM: Comment accurately describes queue behavior.The updated comment correctly reflects the use of pgBossClient for queueing runs when browser limits are reached.
1054-1056: LGTM: Abort queue operations consistent.The abort functionality correctly uses pgBossClient following the same pattern as other queue operations.
server/src/workflow-management/classes/Interpreter.ts (1)
392-407: Interpreter cleanup hook is correctly integrated.The additional
this.interpreter.cleanup()call (guarded by a type check and wrapped in the existing try/catch) cleanly extends the shutdown path without changing external behavior. This should help prevent lingering interpreter‑level resources.server/src/browser-management/inputHandlers.ts (1)
1049-1050: NewtestPaginationScrollwiring looks consistentRegistering
"testPaginationScroll"here and exportingregisterInputHandlers/removeInputHandlersaligns with the new server-side pagination test handler; I don't see issues with the event wiring.Also applies to: 1086-1086
server/src/pgboss-worker.ts (1)
846-847: Worker interval tracking and shutdown cleanup look correctTracking queue-check
setIntervalhandles inworkerIntervalsand clearing them on SIGTERM/SIGINT prevents orphaned timers from keeping the process alive. The pattern is consistent for both user and abort queues; I don’t see issues here.Also applies to: 920-928, 956-964, 1072-1077, 1082-1087
server/src/workflow-management/scheduler/index.ts (1)
272-290: Scrape/interpretation timeouts and interpreter cleanup look goodThe added
SCRAPE_TIMEOUTandINTERPRETATION_TIMEOUTwrappers around markdown/HTML conversion andInterpretRecordingprovide clear upper bounds on run duration, and the follow‑upbrowser.interpreter.clearState()+destroyRemoteBrowsercalls in both normal and error paths should significantly reduce long‑lived interpreter/browser leaks. I don’t see logic bugs in these flows.Also applies to: 427-438, 668-675, 677-679
server/src/api/record.ts (2)
591-613: Socket lifecycle and run-wait logic look robustThe new
readyForRunHandler+cleanupSocketConnectionpattern ensures:
- Client socket listeners are removed and the socket is disconnected in all cases.
- Server namespaces are torn down when
browserIdis known.
Combined withwaitForRunCompletion’s 3‑hour cap and explicit aborted/failed run handling, this should greatly reduce long-lived socket and DB resource leakage for API runs.Also applies to: 1097-1139, 1141-1161, 1163-1185
711-728: Scrape/interpretation timeouts and interpreter cleanup are well integratedThe addition of
SCRAPE_TIMEOUTaround markdown/HTML conversions andINTERPRETATION_TIMEOUTaroundInterpretRecording, plusbrowser.interpreter.clearState()beforedestroyRemoteBrowser, gives you clear safeguards against hung conversions and interpreter state leaks. The error paths also correctly update run status, fire webhooks, and emit socket events, so behavior remains observable.Also applies to: 875-886, 890-893, 1006-1007
```typescript
import { pgBoss, registerWorkerForQueue, registerAbortWorkerForQueue } from '../pgboss-worker';
import { cancelScheduledWorkflow, scheduleWorkflow } from '../storage/schedule';
import { pgBossClient } from '../storage/pgboss';
chromium.use(stealthPlugin());
```
🧩 Analysis chain
🏁 Script executed:
# First, check if the file exists and read the relevant section
if [ -f "server/src/routes/storage.ts" ]; then
echo "=== File exists, reading lines 1-30 ==="
head -n 30 "server/src/routes/storage.ts" | cat -n
else
echo "File not found at server/src/routes/storage.ts"
echo "=== Searching for storage.ts files ==="
find . -name "storage.ts" -type f 2>/dev/null
fiRepository: getmaxun/maxun
Length of output: 1773
🏁 Script executed:
# Search for chromium and stealthPlugin usage in the codebase
echo "=== Searching for chromium imports/usage ==="
rg "chromium" --type ts -B 2 -A 2 server/src/
echo -e "\n=== Searching for stealthPlugin ==="
rg "stealthPlugin" --type ts -B 2 -A 2 server/src/Repository: getmaxun/maxun
Length of output: 3220
🏁 Script executed:
# Check package.json for stealth-related dependencies
echo "=== Checking package.json ==="
fd -e json "package.json" -x cat {} \; | jq '.dependencies, .devDependencies' 2>/dev/null | grep -i -E "(playwright|puppeteer|stealth)"Repository: getmaxun/maxun
Length of output: 373
🏁 Script executed:
# Check if chromium.use pattern is used elsewhere
echo "=== Searching for .use( pattern with chromium ==="
rg "chromium\.use\(" --type ts -B 5 -A 2
echo -e "\n=== Check browserConnection.ts more thoroughly ==="
rg -n "stealthPlugin|\.use\(" server/src/browser-management/browserConnection.ts
echo -e "\n=== Search for playwright-extra usage ==="
rg "playwright-extra" --type ts -B 2 -A 2Repository: getmaxun/maxun
Length of output: 1350
Critical: Remove misplaced line from storage.ts
Line 18 contains `chromium.use(stealthPlugin());`, which does not belong in this file. This code is already correctly implemented in `browser/server.ts` with proper imports (`chromium` from `'playwright-extra'` and `stealthPlugin` from `'puppeteer-extra-plugin-stealth'`). Delete this line from `storage.ts` entirely: it will cause a ReferenceError at runtime and serves no purpose in the router module.
🤖 Prompt for AI Agents
In server/src/routes/storage.ts around line 18, remove the stray line
`chromium.use(stealthPlugin());` entirely — it does not belong in this router
module and will cause a ReferenceError at runtime; after deletion, also scan the
file for any unused imports or references to `chromium` or `stealthPlugin` and
remove them to avoid lint/errors so the module contains only relevant router
code.
```diff
 const queuedRun = await Run.findOne({
   where: { status: 'queued' },
-  order: [['startedAt', 'ASC']]
+  order: [['startedAt', 'ASC']],
 });

 consecutiveDbErrors = 0;
 if (!queuedRun) return;
```
Fix circuit breaker reset logic.
The consecutiveDbErrors counter is reset on line 1092 immediately after a successful database query, even if the query returns no queued runs. This means the circuit breaker will reset even when no actual work is processed.
The counter should only reset after successfully processing a queued run, not just after any successful query.
Apply this diff to fix the reset logic:
```diff
 const queuedRun = await Run.findOne({
   where: { status: 'queued' },
   order: [['startedAt', 'ASC']],
 });
-consecutiveDbErrors = 0;
 if (!queuedRun) return;
+// Reset error counter after successfully finding and beginning to process a run
+consecutiveDbErrors = 0;
+
 const userId = queuedRun.runByUserId;
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
 const queuedRun = await Run.findOne({
   where: { status: 'queued' },
-  order: [['startedAt', 'ASC']]
+  order: [['startedAt', 'ASC']],
 });
-consecutiveDbErrors = 0;
 if (!queuedRun) return;
+// Reset error counter after successfully finding and beginning to process a run
+consecutiveDbErrors = 0;
```
🤖 Prompt for AI Agents
In server/src/routes/storage.ts around lines 1088-1093 the code resets
consecutiveDbErrors immediately after the database findOne call, which clears
the circuit breaker even when no queued run was returned; change the logic so
that consecutiveDbErrors is NOT reset here — instead reset it only after a
queuedRun is actually processed successfully (move the reset to after the
successful run-processing block, e.g. after line 1138) or at the exact point
where the full run processing completes without error, ensuring the counter only
clears when real work finished.
closes : #898
Summary by CodeRabbit
New Features
Bug Fixes
Chores
✏️ Tip: You can customize this high-level summary in your review settings.