fix(api): defer docker startup tasks until ready#1952
Conversation
- Purpose: keep PM2 startup on the critical path limited to Nest bootstrap and socket binding. - Before: Docker template scans and digest refreshes were scheduled with Nest timeout decorators during application startup, which could consume readiness budget before PM2 saw the process as ready. - Problem: background Docker work could make startup timing noisier and harder to diagnose during clean restarts. - Now: the API sends the PM2 ready signal once the server is listening, then schedules Docker startup work from a dedicated helper outside the bootstrap path. - Coverage: adds a focused Vitest spec for delayed execution, rejection logging, and missing-provider behavior.
- Purpose: trim temporary bootstrap progress logging from the Nest startup path. - Before: the deferred Docker startup change added several step-by-step bootstrap info logs while validating startup sequencing. - Problem: those extra lines add noise to normal startup output without changing behavior now that the readiness path is confirmed. - Now: the API keeps the durable startup and ready-signal behavior while removing the temporary progress logs. - Scope: this follow-up only changes logging in api/src/unraid-api/main.ts and leaves local dev config changes out of the commit.
WalkthroughAdds a centralized Docker startup task scheduler and listener, removes NestJS Changes
Sequence DiagramsequenceDiagram
participant Bootstrap as Bootstrap Process
participant App as NestJS App
participant EventBus as EventEmitter2
participant Listener as DockerStartupTasksListener
participant Scheduler as scheduleDockerStartupTasks
participant Scanner as DockerTemplateScannerService
participant Job as ContainerStatusJob
Bootstrap->>App: start & listen
App-->>Bootstrap: server listening
Bootstrap->>EventBus: emit APP_READY_EVENT (app.ready)
EventBus->>Listener: handleAppReady
Listener->>Scheduler: schedule tasks (delay)
Note over Scheduler: setTimeout(delay)
par Delayed parallel execution
Scheduler->>Scanner: invoke bootstrapScan()
Scanner-->>Scheduler: completes / throws (caught & logged)
Scheduler->>Job: invoke refreshContainerDigestsAfterStartup()
Job-->>Scheduler: completes / throws (caught & logged)
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1952 +/- ##
=======================================
Coverage 51.74% 51.74%
=======================================
Files 1026 1028 +2
Lines 70725 70792 +67
Branches 7890 7884 -6
=======================================
+ Hits 36595 36630 +35
- Misses 34007 34039 +32
Partials 123 123 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
- Purpose: move deferred Docker startup work behind a Nest domain event instead of triggering it directly from main.ts. - Before: main.ts signaled readiness and then manually imported the Docker startup scheduler. - Problem: the startup flow worked, but the orchestration lived outside the Docker module and felt less idiomatic to Nest. - Now: main.ts emits an app-ready event after listen(), and a Docker listener provider reacts with @onevent to schedule the delayed scan and digest refresh. - How: added an app lifecycle event constant, registered a Docker startup listener provider, kept the scheduling helper testable, and updated the spec to cover both the helper and the event-driven listener path.
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
api/src/unraid-api/graph/resolvers/docker/docker-startup-tasks.ts (1)
8-15: MakeNestAppGetter.get()generic so Line 24 can drop the cast.Returning
unknownhere is what forcesas TingetOptionalProvider. A genericget<T>(typeOrToken: Type<T>) => Tkeeps the provider contract intact and avoids weakening the helper right at the DI boundary.♻️ Proposed refactor
interface NestAppGetter { - get( - typeOrToken: unknown, + get<T>( + typeOrToken: Type<T>, options?: { strict?: boolean; } - ): unknown; + ): T; } @@ const getOptionalProvider = <T>(app: NestAppGetter, token: Type<T>): T | null => { try { - return app.get(token, { strict: false }) as T; + return app.get(token, { strict: false }); } catch { return null; } };As per coding guidelines, "Avoid using casting whenever possible, prefer proper typing from the start".
Also applies to: 22-25
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/src/unraid-api/graph/resolvers/docker/docker-startup-tasks.ts` around lines 8 - 15, The NestAppGetter.get signature returns unknown which forces a cast in getOptionalProvider; change NestAppGetter to be generic (e.g., get<T>(typeOrToken: Type<T>, options?: { strict?: boolean }): T | undefined) so callers like getOptionalProvider can drop the `as T` cast and preserve DI typing integrity; update any call sites of NestAppGetter.get (notably the getOptionalProvider helper) to use the new generic form and adjust return/nullable types accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api/src/unraid-api/graph/resolvers/docker/docker-startup-tasks.spec.ts`:
- Line 1: Add an afterEach hook in each top-level describe block that calls
vi.useRealTimers() to guarantee timers are restored even if a test fails; locate
the two describe blocks in docker-startup-tasks.spec.ts (the blocks containing
tests that call vi.useFakeTimers()) and insert afterEach(() => {
vi.useRealTimers(); }) inside each block (alongside or after existing tests that
currently call vi.useRealTimers() manually) so fake timers are always cleaned
up.
In `@api/src/unraid-api/main.ts`:
- Around line 27-39: The ready-sent guard is set unconditionally even if
process.send('ready') fails or returns false; update the code around
process.send to wrap the call in a try/catch and check its boolean return value,
only setting readySignalState[READY_SIGNAL_SENT_SYMBOL] and calling
apiLogger.info when process.send returns a truthy value, and on failure (false
return or caught error) log the error via apiLogger.warn or apiLogger.error
(including the caught exception) and return false so the send can be retried
later; reference process.send, readySignalState[READY_SIGNAL_SENT_SYMBOL],
READY_SIGNAL_SENT_SYMBOL, and apiLogger.info/apiLogger.warn in your fix.
---
Nitpick comments:
In `@api/src/unraid-api/graph/resolvers/docker/docker-startup-tasks.ts`:
- Around line 8-15: The NestAppGetter.get signature returns unknown which forces
a cast in getOptionalProvider; change NestAppGetter to be generic (e.g.,
get<T>(typeOrToken: Type<T>, options?: { strict?: boolean }): T | undefined) so
callers like getOptionalProvider can drop the `as T` cast and preserve DI typing
integrity; update any call sites of NestAppGetter.get (notably the
getOptionalProvider helper) to use the new generic form and adjust
return/nullable types accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 8544651e-7138-49c4-94d7-8233151a8c60
📒 Files selected for processing (6)
api/src/index.tsapi/src/unraid-api/graph/resolvers/docker/container-status.job.tsapi/src/unraid-api/graph/resolvers/docker/docker-startup-tasks.spec.tsapi/src/unraid-api/graph/resolvers/docker/docker-startup-tasks.tsapi/src/unraid-api/graph/resolvers/docker/docker-template-scanner.service.tsapi/src/unraid-api/main.ts
💤 Files with no reviewable changes (1)
- api/src/unraid-api/graph/resolvers/docker/docker-template-scanner.service.ts
| @@ -0,0 +1,97 @@ | |||
| import { describe, expect, it, vi } from 'vitest'; | |||
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat -n api/src/unraid-api/graph/resolvers/docker/docker-startup-tasks.spec.tsRepository: unraid/api
Length of output: 5004
Add afterEach hooks to restore real timers in both describe blocks.
Test isolation risk: if any test fails before vi.useRealTimers() executes, fake timers remain active for subsequent tests. Both describe blocks (lines 13–87 and 89–122) have tests using vi.useFakeTimers() without centralized teardown. Use afterEach to ensure reliable cleanup.
- First describe block: Tests at lines 14–42 and 44–75 restore at lines 41 and 74
- Second describe block: Test at lines 90–122 restores at line 121
Proposed fix
-import { describe, expect, it, vi } from 'vitest';
+import { afterEach, describe, expect, it, vi } from 'vitest';
describe('scheduleDockerStartupTasks', () => {
+ afterEach(() => {
+ vi.useRealTimers();
+ });
+
it('schedules docker startup work after the provided delay', async () => {
vi.useFakeTimers();
// ... test body ...
- vi.useRealTimers();
});
it('warns when a background docker startup task rejects', async () => {
vi.useFakeTimers();
// ... test body ...
- vi.useRealTimers();
});
});
describe('DockerStartupTasksListener', () => {
+ afterEach(() => {
+ vi.useRealTimers();
+ });
+
it('schedules docker startup work when the app ready event is emitted', async () => {
vi.useFakeTimers();
// ... test body ...
- vi.useRealTimers();
});
});🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@api/src/unraid-api/graph/resolvers/docker/docker-startup-tasks.spec.ts` at
line 1, Add an afterEach hook in each top-level describe block that calls
vi.useRealTimers() to guarantee timers are restored even if a test fails; locate
the two describe blocks in docker-startup-tasks.spec.ts (the blocks containing
tests that call vi.useFakeTimers()) and insert afterEach(() => {
vi.useRealTimers(); }) inside each block (alongside or after existing tests that
currently call vi.useRealTimers() manually) so fake timers are always cleaned
up.
There was a problem hiding this comment.
♻️ Duplicate comments (1)
api/src/unraid-api/main.ts (1)
39-41:⚠️ Potential issue | 🟠 MajorOnly mark PM2 ready as sent after confirming IPC send success.
On Line 39,
process.send('ready')can fail (returnfalse) or throw; Line 40 currently sets the guard unconditionally, which can suppress retries and leave PM2 waiting.🛡️ Suggested fix
- process.send('ready'); - readySignalState[READY_SIGNAL_SENT_SYMBOL] = true; - apiLogger.info('Sent PM2 ready signal while %s', reason); - return true; + try { + const sent = process.send('ready'); + if (sent === false) { + apiLogger.warn('PM2 ready signal was not accepted while %s', reason); + return false; + } + readySignalState[READY_SIGNAL_SENT_SYMBOL] = true; + apiLogger.info('Sent PM2 ready signal while %s', reason); + return true; + } catch (error: unknown) { + apiLogger.warn(error, 'Failed to send PM2 ready signal while %s', reason); + return false; + }Node.js process.send() behavior: does it return false when IPC is unavailable/backpressured, and can it throw synchronously? Please answer with the official Node.js docs link.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/src/unraid-api/main.ts` around lines 39 - 41, Wrap the process.send('ready') call in a try/catch and check its boolean return; only set readySignalState[READY_SIGNAL_SENT_SYMBOL] = true after process.send('ready') returns truthy, and on exception or falsy return log via apiLogger.error (including the thrown error or a descriptive message) and avoid setting the READY_SIGNAL_SENT_SYMBOL so retries can occur; update the apiLogger.info('Sent PM2 ready signal while %s', reason) to run only after confirming send success.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@api/src/unraid-api/main.ts`:
- Around line 39-41: Wrap the process.send('ready') call in a try/catch and
check its boolean return; only set readySignalState[READY_SIGNAL_SENT_SYMBOL] =
true after process.send('ready') returns truthy, and on exception or falsy
return log via apiLogger.error (including the thrown error or a descriptive
message) and avoid setting the READY_SIGNAL_SENT_SYMBOL so retries can occur;
update the apiLogger.info('Sent PM2 ready signal while %s', reason) to run only
after confirming send success.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 51e2b9eb-9f82-48ed-bc59-676ada8ea99a
📒 Files selected for processing (5)
api/src/unraid-api/app/app-lifecycle.events.tsapi/src/unraid-api/graph/resolvers/docker/docker-startup-tasks.spec.tsapi/src/unraid-api/graph/resolvers/docker/docker-startup-tasks.tsapi/src/unraid-api/graph/resolvers/docker/docker.module.tsapi/src/unraid-api/main.ts
✅ Files skipped from review due to trivial changes (1)
- api/src/unraid-api/graph/resolvers/docker/docker.module.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- api/src/unraid-api/graph/resolvers/docker/docker-startup-tasks.spec.ts
- api/src/unraid-api/graph/resolvers/docker/docker-startup-tasks.ts
- Purpose: make PM2 ready signaling resilient when the IPC channel rejects or throws during process.send.
- Before: the code marked READY_SIGNAL_SENT_SYMBOL as sent and logged success even if process.send('ready') failed.
- Problem: a false return or thrown exception could permanently suppress future ready-signal retries and misreport startup state.
- Now: the code wraps process.send in try/catch, only sets the sent guard after a truthy return, and logs warn/error on failure paths.
- How: signalProcessReady checks the boolean return value, preserves retryability by returning false on failure, and keeps apiLogger.info only on confirmed success.
|
This plugin has been deployed to Cloudflare R2 and is available for testing. |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
api/src/unraid-api/main.ts (1)
195-198: WrapAPP_READY_EVENTemission to prevent bootstrap failure from non-critical deferred work.Line 198 runs after
listen()and PM2 signal completion. Ifapp.get(EventEmitter2)fails to resolve or a listener throws synchronously,bootstrapNestServer()rejects from work that should be best-effort. Catch and log failures instead.Suggested fix
- app.get(EventEmitter2).emit(APP_READY_EVENT, appReadyEvent); + try { + app.get(EventEmitter2).emit(APP_READY_EVENT, appReadyEvent); + } catch (error: unknown) { + apiLogger.error(error, 'Failed to emit %s', APP_READY_EVENT); + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/src/unraid-api/main.ts` around lines 195 - 198, The APP_READY_EVENT emission in bootstrapNestServer() is doing non-critical work synchronously and may reject bootstrap if app.get(EventEmitter2) fails or a listener throws; wrap the retrieval and emit in a try/catch and log failures instead of letting them bubble. Concretely, around the code that constructs AppReadyEvent and calls app.get(EventEmitter2).emit(APP_READY_EVENT, appReadyEvent) catch errors from app.get(EventEmitter2) and from emit, use the existing logger (or app.get(Logger) / console) to warn or debug the error, and ensure the function continues without throwing so bootstrap completes even if deferred listeners fail.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@api/src/unraid-api/main.ts`:
- Around line 195-198: The APP_READY_EVENT emission in bootstrapNestServer() is
doing non-critical work synchronously and may reject bootstrap if
app.get(EventEmitter2) fails or a listener throws; wrap the retrieval and emit
in a try/catch and log failures instead of letting them bubble. Concretely,
around the code that constructs AppReadyEvent and calls
app.get(EventEmitter2).emit(APP_READY_EVENT, appReadyEvent) catch errors from
app.get(EventEmitter2) and from emit, use the existing logger (or
app.get(Logger) / console) to warn or debug the error, and ensure the function
continues without throwing so bootstrap completes even if deferred listeners
fail.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: dd80aff9-6c26-46aa-b6ab-884d22785aa3
📒 Files selected for processing (1)
api/src/unraid-api/main.ts
Summary
@Timeoutdecorators with a dedicated startup task helper that logs failures without blocking readinessTesting
pnpm --dir api test src/unraid-api/graph/resolvers/docker/docker-startup-tasks.spec.tspnpm --dir api type-checkSummary by CodeRabbit
Chores
New Features
Tests