Problem
SQLWarehouseConnector.ensureWarehouseRunning() runs an independent poll loop per caller. When N analytics queries hit a cold/stopped warehouse at the same time (typical dashboard: one SSE stream per chart), each request:
- Calls warehouses.get on its own backoff schedule
- Can call warehouses.start independently (didStart is per-call, not per-warehouse)
- Only shares work after one caller observes RUNNING, via the 30s _recentlyRunning cache
That cache does not dedupe concurrent cold-start bursts — exactly when control-plane load is worst.
Location: packages/appkit/src/connectors/sql-warehouse/client.ts (ensureWarehouseRunning, ~L370–481), invoked from packages/appkit/src/plugins/analytics/analytics.ts (_handleQueryRoute, ~L283–294).
Impact
| Dimension | Effect |
| --- | --- |
| Performance | N charts → ~N× warehouses.get polling (+ up to N× warehouses.start) for the same warehouseId |
| Security / abuse | Authenticated clients can amplify privileged workspace API traffic by opening many concurrent /api/analytics/query/* SSE connections |
| Cost | With autoStartWarehouse: true (default), redundant start + poll RPCs during cold start |
Not introduced by appkit-ui #416 (frontend indicator). #416 improves UX during the wait but does not change backend fan-out.
Current mitigation (insufficient)
- _recentlyRunning + WAREHOUSE_RUNNING_CACHE_TTL_MS (30s): short-circuits subsequent requests after a prior success
- WarehousePollBackoff jitter: spreads poll times across loops, but does not merge them
- CacheManager.getOrExecute singleflight: applies to SQL results, not readiness (readiness is intentionally outside the query cache in analytics.ts)
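A minimal simulation of the gap left by these mitigations. FakeWarehouses and ensureRunningNaive are hypothetical names, not appkit code: the fake only counts control-plane calls, and the function models nothing but the TTL cache and the per-call didStart flag. Every caller in a burst misses the cache before any of them has observed RUNNING, so each one starts and polls on its own.

```typescript
type State = "STOPPED" | "STARTING" | "RUNNING";

// Hypothetical stand-in for the workspace SDK; it only counts calls.
class FakeWarehouses {
  getCalls = 0;
  startCalls = 0;
  private state: State = "STOPPED";
  private getsSinceStart = 0;

  async get(): Promise<State> {
    this.getCalls++;
    // Pretend the warehouse needs a few polls after start before it is up.
    if (this.state === "STARTING" && ++this.getsSinceStart >= 3) {
      this.state = "RUNNING";
    }
    return this.state;
  }

  async start(): Promise<void> {
    this.startCalls++;
    if (this.state === "STOPPED") this.state = "STARTING";
  }
}

const TTL_MS = 30_000; // mirrors WAREHOUSE_RUNNING_CACHE_TTL_MS (30s)
let recentlyRunningAt = 0; // simplified model of _recentlyRunning

// Assumption: no backoff or timeout, so the loop structure stays visible.
async function ensureRunningNaive(api: FakeWarehouses): Promise<void> {
  if (Date.now() - recentlyRunningAt < TTL_MS) return; // cache hit
  let didStart = false; // per call, not per warehouse
  for (;;) {
    const state = await api.get();
    if (state === "RUNNING") {
      recentlyRunningAt = Date.now();
      return;
    }
    if (state === "STOPPED" && !didStart) {
      didStart = true;
      await api.start();
    }
  }
}
```

In this model, five concurrent callers issue five start calls and at least two get calls each, while a sixth call made after the burst returns from the cache without touching the API.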
Proposed fix
Add per-warehouseId in-flight deduplication on SQLWarehouseConnector, mirroring CacheManager.inFlightRequests:
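// Sketch: on SQLWarehouseConnector
interface ReadinessInFlightEntry {
  promise: Promise<void>; // settles when the owner's poll loop finishes
  subscribers: Set<(update: WarehouseStatusUpdate) => void>; // onStatus callbacks of all waiters
  refCount: number; // waiters still interested in the result
  sharedController: AbortController; // aborted only when refCount reaches 0
}

class SQLWarehouseConnector {
  // One entry per warehouseId while a readiness check is in flight
  private _readinessInFlight = new Map<string, ReadinessInFlightEntry>();
  // ...existing members
}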
Owner (first caller for warehouseId):
- Runs the existing poll loop
- Issues one warehouses.get / one warehouses.start sequence
- Broadcasts each WarehouseStatusUpdate to all subscribers
Joiners (concurrent callers):
- Register onStatus in subscribers
- await shared promise
- Ref-count abort: shared work aborts only when all waiters have abandoned (same pattern as CacheManager.getOrExecute)
On RUNNING: set _recentlyRunning, resolve promise, clear in-flight entry.
On error / timeout: reject all waiters, clear entry so a later retry can start fresh.
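The owner/joiner flow above could look roughly like this. It is a sketch under assumptions, not the connector's code: ReadinessDeduper is a hypothetical standalone class, the poll parameter stands in for the existing poll loop, and the WarehouseStatusUpdate shape is guessed. Ref-counted abort and the _recentlyRunning update are left out to keep the dedup logic visible.

```typescript
type WarehouseStatusUpdate = { state: string }; // assumed shape, for illustration
type OnStatus = (update: WarehouseStatusUpdate) => void;

interface ReadinessInFlightEntry {
  promise: Promise<void>;
  subscribers: Set<OnStatus>;
}

// Hypothetical helper; in the proposal this state lives on SQLWarehouseConnector.
class ReadinessDeduper {
  private inFlight = new Map<string, ReadinessInFlightEntry>();

  // `poll` stands in for the existing poll loop and reports each status via `emit`.
  ensure(
    warehouseId: string,
    poll: (emit: OnStatus) => Promise<void>,
    onStatus?: OnStatus,
  ): Promise<void> {
    const existing = this.inFlight.get(warehouseId);
    if (existing) {
      // Joiner: subscribe and await the owner's promise.
      if (onStatus) existing.subscribers.add(onStatus);
      return existing.promise;
    }

    // Owner: run the single poll loop and broadcast every update.
    const subscribers = new Set<OnStatus>();
    if (onStatus) subscribers.add(onStatus);
    const promise = poll((update) => {
      for (const notify of subscribers) notify(update);
    }).finally(() => {
      // Clear on success and on failure so a later retry starts fresh.
      this.inFlight.delete(warehouseId);
    });
    this.inFlight.set(warehouseId, { promise, subscribers });
    return promise;
  }
}
```

A joiner that registers after the owner has already emitted a status misses that update in this sketch, which is why the late-joiner replay still needs handling.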
Edge cases
- Late joiner after terminal state: resolve immediately from _recentlyRunning or replay last emitted status
- Mixed options (timeoutMs, autoStart, signal): owner wins; document that concurrent callers for the same warehouse share owner options
- Telemetry: one sql.warehouseReady span per warehouse burst, attribute db.warehouse.waiter_count
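One possible shape for the ref-counted abort, which also yields a number usable for db.warehouse.waiter_count. RefCountedAbort and peakWaiters are hypothetical names, and reporting the peak (rather than the total) waiter count is an assumption; the issue does not specify which.

```typescript
// Hypothetical helper: aborts the shared work only after every waiter has left.
class RefCountedAbort {
  readonly shared = new AbortController();
  private active = 0;
  peakWaiters = 0; // candidate value for db.warehouse.waiter_count

  // Called once per waiter (owner and joiners alike).
  join(signal?: AbortSignal): void {
    this.active++;
    this.peakWaiters = Math.max(this.peakWaiters, this.active);
    if (!signal) return; // a waiter without a signal never abandons
    if (signal.aborted) {
      this.leave();
      return;
    }
    signal.addEventListener("abort", () => this.leave(), { once: true });
  }

  private leave(): void {
    this.active--;
    // Only the last waiter leaving cancels the shared poll loop.
    if (this.active === 0) this.shared.abort();
  }
}

// Usage: two waiters share one poll loop driven by `rc.shared.signal`.
const rc = new RefCountedAbort();
const first = new AbortController();
const second = new AbortController();
rc.join(first.signal);
rc.join(second.signal);
first.abort(); // shared work keeps running
second.abort(); // last waiter gone: shared work is aborted
```

A waiter that abandons early would still need to reject its own await while the shared promise continues for the others; that part is not shown here.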
Test plan
- ensureWarehouseRunning calls → exactly 1 warehouses.start, bounded warehouses.get count (mock SDK)
- [...] STARTING)
- _recentlyRunning still short-circuits without in-flight entry
- [...] warehouse_status then results
References
- packages/appkit/src/cache/index.ts (getOrExecute, inFlightRequests)