Skip to content

Conversation

@shadowfax92
Copy link
Contributor

@shadowfax92 shadowfax92 commented Nov 12, 2025

Why refactor the controller bootstrap?

We were spinning up overlapping controllers every time Chrome poked the service worker. Refactoring into a real singleton (BrowserOSController) and managing it from background/index.ts means one KeepAlive, one socket, and deterministic restarts. It's boring by design, which is exactly what we wanted.

How do we prevent future "extension not connected" loops?

We taught ControllerBridge to track all extension sockets and only send work through a selected primary. When the primary dies, we promote the next client and keep going. Pending requests are rejected loudly so agents know to retry, and standbys disconnecting no longer nuke the bridge.

Why strip the backoff logic?

Exponential backoff sounded fancy, but in practice it just delayed recovery when the controller was killed intentionally. A flat 30s retry is predictable, debuggable, and avoids random jitter that made logs harder to read.

Why collapse env/config sprawl?

We realized keeping both config/constants.ts and config/environment.ts was garbage duplication, so we deleted the unused env loader entirely and made WEBSOCKET_CONFIG the single source of truth. That let us set a fixed reconnectIntervalMs (30s) and drop the exponential backoff math—far fewer moving parts, same resiliency.

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Nov 12, 2025

Greptile Overview

Greptile Summary

This PR implements a comprehensive fix for extension disconnection issues by refactoring the BrowserOS controller architecture. The changes simplify WebSocket reconnection logic, implement a multi-client bridge with failover capabilities, and introduce better environment configuration management.

Key architectural changes include:

  • WebSocket Connection Simplification: Replaced exponential backoff with a fixed 30-second reconnection interval to ensure more predictable reconnection behavior
  • Multi-Client Architecture: Transformed ControllerBridge from single-connection to multi-client with primary/secondary failover, ensuring continuity when connections drop
  • Controller Refactoring: Extracted BrowserOSController into a separate class with robust lifecycle management, response queuing for disconnected states, and comprehensive error handling
  • Environment Management: Removed complex environment configuration system in favor of simpler approach, added .env.prod support, and provided default configuration URL

The refactoring addresses reliability issues by implementing response queuing, automatic client promotion on disconnect, and better separation of concerns between controller logic and lifecycle management.

Important Files Changed

Filename Score Overview
.gitignore 5/5 Added .env.prod to prevent production environment files from being committed
packages/controller-ext/manifest.json 5/5 Removed trailing newline for formatting consistency
.env.example 4/5 Added default config URL and environment file usage instructions
packages/controller-ext/src/config/environment.ts 2/5 Completely removed 196-line environment configuration system
packages/controller-ext/src/config/constants.ts 4/5 Simplified WebSocket reconnection from exponential backoff to fixed 30s interval
packages/controller-ext/src/websocket/WebSocketClient.ts 4/5 Removed exponential backoff logic, simplified reconnection mechanism
packages/controller-ext/src/background/BrowserOSController.ts 4/5 New controller class with response queuing, error handling, and connection management
packages/controller-server/src/ControllerBridge.ts 4/5 Transformed from single-client to multi-client architecture with failover
packages/controller-ext/src/background/index.ts 4/5 Refactored to focus on controller lifecycle management with singleton pattern

Confidence score: 3/5

  • This PR addresses real connectivity issues but introduces significant architectural changes that require careful testing
  • Score reflects concerns about the complete removal of environment configuration system and potential impact of simplified reconnection logic
  • Pay close attention to the deleted environment.ts file and the new multi-client bridge architecture for potential runtime issues

Sequence Diagram

sequenceDiagram
    participant Extension as "BrowserOS Extension"
    participant WebSocketClient as "WebSocket Client"
    participant Server as "Controller Bridge Server"
    participant BrowserOSController as "BrowserOS Controller"
    participant ActionRegistry as "Action Registry"

    Note over Extension,ActionRegistry: Extension Startup & Connection
    Extension->>BrowserOSController: "Initialize controller"
    BrowserOSController->>WebSocketClient: "Create WebSocket client (port)"
    BrowserOSController->>ActionRegistry: "Register all actions"
    BrowserOSController->>WebSocketClient: "connect()"
    
    WebSocketClient->>Server: "WebSocket connection request"
    Server->>Server: "Register client & set as primary"
    Server-->>WebSocketClient: "Connection established"
    WebSocketClient-->>BrowserOSController: "Connection status: CONNECTED"
    
    Note over Extension,ActionRegistry: Heartbeat Mechanism
    loop Every 20s
        WebSocketClient->>Server: "ping message"
        Server-->>WebSocketClient: "pong response"
    end

    Note over Extension,ActionRegistry: Request Processing Flow
    Server->>WebSocketClient: "Action request message"
    WebSocketClient->>BrowserOSController: "handleIncomingMessage()"
    BrowserOSController->>BrowserOSController: "processRequest() - validate & track"
    BrowserOSController->>ActionRegistry: "dispatch(action, payload)"
    ActionRegistry->>ActionRegistry: "Execute specific action"
    ActionRegistry-->>BrowserOSController: "Action response"
    BrowserOSController->>WebSocketClient: "send(response)"
    WebSocketClient->>Server: "Response message"

    Note over Extension,ActionRegistry: Connection Failure & Recovery
    Server->>Server: "Connection lost"
    Server->>Server: "Clear pending requests"
    WebSocketClient->>WebSocketClient: "Detect connection failure"
    WebSocketClient->>WebSocketClient: "Start reconnection timer"
    WebSocketClient->>Server: "Reconnection attempt"
    Server->>Server: "Re-register client as primary"
    Server-->>WebSocketClient: "Reconnection successful"
    WebSocketClient->>BrowserOSController: "Flush queued responses"
Loading

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additional Comments (1)

  1. packages/controller-ext/src/config/environment.ts

    logic: Complete removal of environment configuration system eliminates type safety and validation. Ensure replacement configuration approach provides equivalent safety and that all dependent code is updated. What configuration system is replacing this comprehensive environment management, and have all imports of these types and functions been updated throughout the codebase?

9 files reviewed, 3 comments

Edit Code Review Agent Settings | Greptile

Comment on lines +183 to +184
this.wsClient.onMessage((message: ProtocolResponse) => {
this.handleIncomingMessage(message);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: Type mismatch - receiving ProtocolResponse but processing as request message. Should this be ProtocolRequest since you're checking for rawMessage.action on line 195?

Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/controller-ext/src/background/BrowserOSController.ts
Line: 183:184

Comment:
**logic:** Type mismatch - receiving ProtocolResponse but processing as request message. Should this be ProtocolRequest since you're checking for rawMessage.action on line 195?

How can I resolve this? If you propose a fix, please make it concise.

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
@shadowfax92 shadowfax92 merged commit 878939a into main Nov 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants