Skip to content

[BUG] SSE Connection Leak in useTaskSubscription Combined with Polling Race in Scans Page #1138

Description

@ionfwsrijan

Description

Two interconnected real-time update mechanisms for task/scan status suffer from race conditions, memory leaks, and race-to-reconnect bugs that compound each other.

Bug 1: SSE Connection Leak on Component Unmount

The useTaskSubscription hook manages a Server-Sent Events (SSE) connection via EventSource:

// useTaskSubscription.ts ~ line 138
es.onerror = () => {
  // ... exponential backoff logic ...
  reconnectTimerRef.current = setTimeout(() => {
    connectSSE();  // <-- If component unmounted, this creates orphan EventSource
  }, backoffDelay);
};

The EventSource API does not support AbortController. The only way to close it is es.close(). The cleanup function attempts to handle this:

// useTaskSubscription.ts ~ lines 57-60
function cleanupAll() {
  cleanupRef.current = true;
  if (esRef.current) {
    esRef.current.close();
    esRef.current = null;
  }
  clearTimeout(reconnectTimerRef.current);
}

However, there is a critical race:

  1. Component starts reconnecting (sets reconnectTimerRef).
  2. Component unmounts before the timeout firescleanupAll() is called → esRef.current is set to null.
  3. The previously scheduled setTimeout fires after cleanup and calls connectSSE().
  4. Inside connectSSE(), there is no guard checking cleanupRef.current before calling new EventSource(...).
  5. A new EventSource is created, but esRef.current is never set to it (because cleanupAll() already ran).
  6. The new EventSource is an orphan — never cleaned up, holds an open HTTP connection, and leaks memory.

Bug 2: Polling Race with Unbounded Concurrent Requests

The Scans component also polls via setInterval:

// Scans.tsx ~ lines 90-92
useEffect(() => {
  const interval = setInterval(loadTasks, 5000);
  return () => clearInterval(interval);
}, []);

The loadTasks function is async but setInterval does not await it:

// Scans.tsx ~ lines 129-164
const loadTasks = useCallback(async () => {
  requestSeqRef.current += 1;
  const currentSeq = requestSeqRef.current;
  
  abortRef.current?.abort();
  abortRef.current = new AbortController();
  
  try {
    const response = await fetch(url, { signal: abortRef.current.signal });
    const data = await response.json();
    
    if (currentSeq !== requestSeqRef.current) return;  // Stale response check
    
    setTasks(data.tasks);
  } catch (err) {
    if (err instanceof DOMException && err.name === 'AbortError') return;
    // Handle error...
  }
}, []);

If a single loadTasks call takes longer than 5 seconds:

  1. The next setInterval tick fires and calls loadTasks again.
  2. The NEW call aborts the PREVIOUS request (abortRef.current?.abort()).
  3. But the new request also takes >5s, and this pattern repeats.
  4. Under slow network conditions, the browser queues many concurrent fetch attempts, each aborting the previous one.
  5. The AbortError catch blocks work correctly, but bandwidth and connection pool slots are wasted.

Bug 3: Compound Effect — SSE + Polling Conflict

When both mechanisms are active:

  • The SSE provides real-time updates, but polling also fires every 5s.
  • When the SSE connection drops and reconnects, it triggers a full state refresh.
  • Simultaneously, the polling interval fires, causing a duplicate fetch.
  • If the SSE reconnect creates an orphan EventSource (Bug 1), that EventSource continues consuming bandwidth and emitting events that are never processed.

Impact

  • Memory leak: Each mount/unmount cycle of a component using useTaskSubscription can leak an EventSource + closure scope. Over time, accumulated EventSource objects consume memory and keep HTTP connections open.
  • Backend load: Abandoned EventSource connections appear as active SSE connections to the backend, causing unnecessary resource consumption.
  • Unnecessary network traffic: Duplicate polling + SSE requests waste bandwidth, especially on mobile or slow connections.
  • Visual glitches: Users may see task states flicker as stale polling responses overwrite fresh SSE data.

Proposed Fix

Fix 1: Guard Against Orphan EventSource in connectSSE

function connectSSE() {
  if (cleanupRef.current) return;  // <-- ADD THIS GUARD
  
  if (esRef.current) {
    esRef.current.close();
  }
  
  const es = new EventSource(url);
  esRef.current = es;
  
  es.onopen = () => { /* ... */ };
  es.onmessage = (event) => { /* ... */ };
  es.onerror = () => {
    es.close();
    esRef.current = null;
    
    if (cleanupRef.current) return;  // <-- ADD THIS GUARD
    
    const backoffDelay = Math.min(1000 * Math.pow(2, retryCountRef.current), 30000);
    retryCountRef.current += 1;
    
    reconnectTimerRef.current = setTimeout(() => {
      if (cleanupRef.current) return;  // <-- ADD THIS GUARD
      connectSSE();
    }, backoffDelay);
  };
}

Fix 2: Replace setInterval with Chained setTimeout

// Scans.tsx
const pollRef = useRef<number | null>(null);

const scheduleNextPoll = useCallback(() => {
  if (cleanupRef.current) return;
  pollRef.current = window.setTimeout(async () => {
    await loadTasks();
    scheduleNextPoll();  // Schedule next AFTER this one completes
  }, 5000);
}, [loadTasks]);

useEffect(() => {
  scheduleNextPoll();
  return () => {
    cleanupRef.current = true;
    if (pollRef.current !== null) {
      clearTimeout(pollRef.current);
    }
  };
}, [scheduleNextPoll]);

Fix 3: SSE-Only Mode with Polling as Degraded Fallback

Make the SSE the primary update mechanism and fall back to polling only when SSE is unavailable:

function useTaskSubscription() {
  const [sseConnected, setSseConnected] = useState(false);
  
  // SSE connection
  useEffect(() => {
    const es = connectSSEWithGuards();
    es.onopen = () => setSseConnected(true);
    es.onerror = () => setSseConnected(false);
    return () => { /* cleanup */ };
  }, []);
  
  // Polling as degraded fallback only
  useEffect(() => {
    if (sseConnected) return;  // Don't poll if SSE is working
    const interval = setInterval(fetchTasks, 15000);  // 15s instead of 5s
    return () => clearInterval(interval);
  }, [sseConnected, fetchTasks]);
}

Fix 4: Add Version Counter for Stale Callback Protection

Beyond the simple cleanupRef guard, use a version counter to ensure that even if an old callback somehow fires, it doesn't create a new connection:

const versionRef = useRef(0);

const connectSSE = useCallback(() => {
  const myVersion = ++versionRef.current;
  // ... in the reconnect setTimeout:
  if (myVersion !== versionRef.current || cleanupRef.current) return;
  connectSSE();
}, []);

Metadata

Metadata

Assignees

Labels

priority:mediumImportant issue with normal urgencytype:bugBug fix work category bonus label

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions