Skip to content

Conversation

@cezudas
Copy link
Contributor

@cezudas cezudas commented Oct 31, 2025

Fixes OPS-2896.

Screenshot 2025-11-03 at 09 00 23 505117582-e9018058-4dc3-4090-b857-6b14830b0587 505117575-85fe894d-8971-4c1e-8231-f2d7149cae3b

Copilot AI review requested due to automatic review settings October 31, 2025 08:21
@linear
Copy link

linear bot commented Oct 31, 2025

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements connection timeout handling for the AI assistant chat to prevent indefinite hanging on poor network connections. It introduces client-side monitoring with progressive warnings and server-side heartbeat improvements.

Key changes:

  • Added connection monitoring hook that tracks SSE activity and displays warnings/errors based on message gaps
  • Implemented timeout-aware fetch wrapper with AbortSignal combining for graceful cancellation
  • Modified server heartbeat to use typed stream messages instead of raw SSE comments

Reviewed Changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
packages/ui-components/src/components/assistant-ui/thread/thread.tsx Added props for slow warning and connection error states, passed through to UI components
packages/ui-components/src/components/assistant-ui/thread/connection-slow-warning.tsx New component displaying amber warning banner when connection is slow
packages/ui-components/src/components/assistant-ui/thread/connection-error.tsx New component displaying red error banner for connection failures
packages/ui-components/src/components/assistant-ui/assistant-ui-chat-container.tsx Threaded connection state props through container component
packages/ui-components/src/components/ai-chat-container/step-settings-assistant-ui-chat-container.tsx Threaded connection state props through step settings container
packages/shared/src/lib/ai/chat/index.ts Exported new constants module
packages/shared/src/lib/ai/chat/constants.ts Defined shared heartbeat interval constant
packages/server/api/src/app/ai/chat/chat-request-router.ts Updated heartbeat to use shared constant and typed stream messages
packages/react-ui/src/app/features/builder/ai-chat/step-settings-assistant-ui-chat.tsx Integrated connection monitoring states from hook
packages/react-ui/src/app/features/ai/lib/use-connection-monitoring.ts Core connection monitoring logic with progressive timeout handling
packages/react-ui/src/app/features/ai/lib/types.ts Type definitions for connection monitoring
packages/react-ui/src/app/features/ai/lib/connection-timeout-error.ts Custom error class for connection timeouts
packages/react-ui/src/app/features/ai/lib/chat-utils.ts Added fetch wrapper with timeout and SSE activity tracking
packages/react-ui/src/app/features/ai/lib/assistant-ui-chat-hook.ts Integrated connection monitoring and custom fetch with timeout
packages/react-ui/src/app/features/ai/assistant/assistant-ui-chat.tsx Wired connection monitoring states to UI

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +16 to +18
const RESPONSE_WARNING_MS = SSE_HEARTBEAT_INTERVAL_MS * 2;

const SSE_MESSAGE_GAP_TIMEOUT_MS = SSE_HEARTBEAT_INTERVAL_MS * 4;
Copy link

Copilot AI Oct 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These timeout multipliers (2x and 4x) are magic numbers. Consider extracting them as named constants with explanatory comments about why these specific multipliers were chosen, e.g., const WARNING_THRESHOLD_MULTIPLIER = 2; // Show warning after missing 2 heartbeats.

Suggested change
const RESPONSE_WARNING_MS = SSE_HEARTBEAT_INTERVAL_MS * 2;
const SSE_MESSAGE_GAP_TIMEOUT_MS = SSE_HEARTBEAT_INTERVAL_MS * 4;
// Show warning after missing 2 heartbeats
const WARNING_THRESHOLD_MULTIPLIER = 2;
const RESPONSE_WARNING_MS = SSE_HEARTBEAT_INTERVAL_MS * WARNING_THRESHOLD_MULTIPLIER;
// Consider connection lost after missing 4 heartbeats
const MESSAGE_GAP_TIMEOUT_MULTIPLIER = 4;
const SSE_MESSAGE_GAP_TIMEOUT_MS = SSE_HEARTBEAT_INTERVAL_MS * MESSAGE_GAP_TIMEOUT_MULTIPLIER;

Copilot uses AI. Check for mistakes.
return controller.signal;
};

const CONNECTION_TIMEOUT_MS = 15000 + SSE_HEARTBEAT_INTERVAL_MS;
Copy link

Copilot AI Oct 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hardcoded 15000ms added to the heartbeat interval is a magic number. Consider extracting it as a named constant with a comment explaining its purpose, e.g., const INITIAL_CONNECTION_GRACE_PERIOD_MS = 15000;.

Suggested change
const CONNECTION_TIMEOUT_MS = 15000 + SSE_HEARTBEAT_INTERVAL_MS;
// Grace period for initial connection before heartbeat timeout applies
const INITIAL_CONNECTION_GRACE_PERIOD_MS = 15000;
const CONNECTION_TIMEOUT_MS = INITIAL_CONNECTION_GRACE_PERIOD_MS + SSE_HEARTBEAT_INTERVAL_MS;

Copilot uses AI. Check for mistakes.
@alexandrudanpop
Copy link
Contributor

@cezudas should we also add this issue in ai-sdk? looks like something that should be resolved in the framework

cezudas added a commit that referenced this pull request Oct 31, 2025
Part of OPS-2896.

The new UI components are not actively rendered yet; this PR is a
pre-requisite for
[PR-1548.](#1548)
@cezudas
Copy link
Contributor Author

cezudas commented Oct 31, 2025

@cezudas should we also add this issue in ai-sdk? looks like something that should be resolved in the framework

I just rolled back to res.write(`: heartbeat\n\n`) and added wrapResponseForSSETracking to the frontend, which intercepts any SSE message, including:
: heartbeat comments (every 15 seconds from server)
: connection-established comments
data: {...} actual data messages


const SSE_MESSAGE_GAP_TIMEOUT_MS = SSE_HEARTBEAT_INTERVAL_MS * 4;

export const useConnectionMonitoring = ({
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cezudas is there anything we can do to make this cleaner and easier to reason about? I see a lot of ref and useEffect, but I wonder how we can make it more maintainable in the long run. Right now it's pretty hard to read and reason about.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i know there might be caveats I might be missing, but maybe the code could be something like:

export const useConnectionMonitoring = ({ chatStatus, messages, stopChat }) => {
  const [warning, setWarning] = useState<'slow' | 'error' | null>(null);
  const lastActivityRef = useRef(Date.now());
  
  // Update activity timestamp when messages arrive
  useEffect(() => {
    if (isActivelyLoading(chatStatus)) {
      lastActivityRef.current = Date.now();
      setWarning(null);
    }
  }, [chatStatus, messages.length]);
  
  // Single monitoring loop
  useEffect(() => {
    if (!isActivelyLoading(chatStatus)) {
      setWarning(null);
      return;
    }
    
    const interval = setInterval(() => {
      const elapsed = Date.now() - lastActivityRef.current;
      
      if (elapsed > SSE_MESSAGE_GAP_TIMEOUT_MS) {
        stopChat();
        setWarning('error');
        clearInterval(interval);
      } else if (elapsed > RESPONSE_WARNING_MS) {
        setWarning('slow');
      }
    }, 1000);
    
    return () => clearInterval(interval);
  }, [chatStatus, stopChat]);
  
  return { warning, clearWarning: () => setWarning(null) };
};

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your proposed solution is much cleaner, but there are some caveats:

Important:

  1. SSE activity tracking: Your version relies only on messages.length changing, but the current implementation also tracks raw SSE activity (heartbeats, partial chunks) via wrapResponseForSSETracking. This is important because:
  • Server might send heartbeat comments that don't create new messages
  • Large messages arrive in chunks
  1. Precision timing: The interval approach can drift and be less precise. If you need to show a warning at exactly 10s, the interval might check at 9s, then 10s, then 11s.

Less important:

  1. Polling overhead: Checking every 1000ms adds unnecessary work when the connection is healthy. The current implementation only schedules checks when needed.

  2. Tab visibility: Current implementation pauses monitoring when tab is hidden (battery/performance optimization).

Comment on lines +34 to +35
const isShowingSlowWarningRef = useRef<boolean>(isShowingSlowWarning);
const connectionErrorRef = useRef<string | null>(connectionError);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need both ref and useState here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need both because of stale closures in async callbacks:

  1. useState is needed for React to re-render the UI when warnings/errors change
  2. useRef is needed because:
  • onSSEActivity is registered globally and called from the fetch wrapper
  • Timer callbacks (setTimeout) capture values at creation time
  • Adding these states to dependencies would cause constant re-registration and timer recreation
    The refs act as a "current value pointer" that callbacks can read without being recreated.

@sonarqubecloud
Copy link

sonarqubecloud bot commented Nov 4, 2025

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants