```{contents}
```
## WebSockets


**WebSocket** is a persistent, full-duplex communication protocol that allows **bi-directional real-time data exchange** between client and server over a single TCP connection.

It enables:

* Live chat systems
* Real-time dashboards
* Streaming LLM responses
* Multiplayer applications
* Collaborative tools

---

### Where WebSockets Fit in the Architecture

```
Client ↔ WebSocket Server ↔ LLM / Database / Services
```

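Unlike a plain HTTP request/response exchange, the connection stays open after the initial handshake, so either side can send a message at any time.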
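A WebSocket connection begins as an ordinary HTTP request that asks to be upgraded; once the server agrees, the same TCP connection carries WebSocket frames. As a minimal sketch of that handshake step, the function below derives the `Sec-WebSocket-Accept` response header from the client's `Sec-WebSocket-Key` as described in RFC 6455. The name `accept_key` is illustrative only; in practice the server library performs this step for you.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

def accept_key(client_key: str) -> str:
    """Return the Sec-WebSocket-Accept value for a client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((client_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")

# Sample key taken from the handshake example in RFC 6455.
print(accept_key("dGhlIHNhbXBsZSBub25jZQ=="))
```

After the server replies with `101 Switching Protocols` and this header, both sides stop speaking HTTP and keep the socket open for frames.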

---

### Why WebSockets for LLM Systems

* Low latency streaming
* Two-way communication
* Supports long conversations
* Efficient for continuous updates

---

### FastAPI WebSocket Server

#### Demonstration

```python
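from fastapi import FastAPI, WebSocket, WebSocketDisconnect

app = FastAPI()

@app.websocket("/ws")
async def websocket_endpoint(ws: WebSocket):
    # Complete the handshake before sending or receiving.
    await ws.accept()

    try:
        while True:
            # Wait for the next text frame and echo it back.
            message = await ws.receive_text()
            await ws.send_text(f"Echo: {message}")
    except WebSocketDisconnect:
        # Raised by receive_text() when the client closes the connection.
        pass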
```

---

### Browser Client Example

#### Demonstration

```html
<script>
const ws = new WebSocket("ws://localhost:8000/ws");

ws.onopen = () => {
  console.log("Connected");
  // Send only after the connection is open; calling send() earlier throws.
  ws.send("Hello Server");
};

ws.onmessage = (event) => console.log("Received:", event.data);
</script>
```

---

### WebSocket + LLM Streaming

#### Demonstration

```python
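# Assumes `app` from the server example above and an `llm` object with an
# async `astream()` method (as on LangChain chat models), defined elsewhere.
@app.websocket("/chat")
async def chat(ws: WebSocket):
    await ws.accept()

    # Read one prompt, then forward each chunk as soon as it is generated.
    prompt = await ws.receive_text()

    async for chunk in llm.astream(prompt):
        await ws.send_text(chunk.content)

    # Close cleanly so the client knows the response is complete.
    await ws.close()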
```
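When chunks arrive as bare strings, the client cannot tell a finished reply from a pause. One possible approach, sketched here under assumptions, is to wrap each chunk in a small JSON envelope and send an explicit end marker. The message shape (`type`, `content`), the helper `stream_reply`, and the `fake_llm` generator are all hypothetical stand-ins so the example runs without a model or a server.

```python
import asyncio
import json
from typing import AsyncIterator, Awaitable, Callable

async def stream_reply(
    chunks: AsyncIterator[str],
    send_text: Callable[[str], Awaitable[None]],
) -> int:
    """Forward each chunk as a JSON "token" message, then send a "done" marker."""
    count = 0
    async for chunk in chunks:
        await send_text(json.dumps({"type": "token", "content": chunk}))
        count += 1
    await send_text(json.dumps({"type": "done"}))
    return count

async def fake_llm(prompt: str) -> AsyncIterator[str]:
    """Hypothetical stand-in for a streaming model: yields the prompt word by word."""
    for word in prompt.split():
        await asyncio.sleep(0)  # yield control, as a real network call would
        yield word

async def demo() -> list:
    sent = []

    async def send_text(message: str) -> None:
        sent.append(message)  # a real endpoint would pass ws.send_text here

    await stream_reply(fake_llm("hello from the model"), send_text)
    return sent

messages = asyncio.run(demo())
```

In a FastAPI endpoint, `ws.send_text` could be passed as `send_text`, and the browser would stop rendering when it sees the `done` message.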

---

### Scaling WebSockets

```
Load Balancer → WebSocket Servers → Redis Pub/Sub → Workers
```
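The reason for the Pub/Sub layer is that each client is connected to only one WebSocket server, so a message produced elsewhere has to be fanned out to every server that might hold the target connection. The sketch below illustrates that fan-out with an in-memory class; `InMemoryPubSub` is a hypothetical toy stand-in, not the Redis client API, and a real deployment would replace it with Redis Pub/Sub.

```python
import asyncio
from collections import defaultdict

class InMemoryPubSub:
    """Toy stand-in for Redis Pub/Sub: fans each message out to all subscribers."""

    def __init__(self) -> None:
        # Maps a channel name to the set of subscriber queues.
        self._subscribers = defaultdict(set)

    def subscribe(self, channel: str) -> asyncio.Queue:
        queue: asyncio.Queue = asyncio.Queue()
        self._subscribers[channel].add(queue)
        return queue

    def unsubscribe(self, channel: str, queue: asyncio.Queue) -> None:
        self._subscribers[channel].discard(queue)

    async def publish(self, channel: str, message: str) -> int:
        """Deliver the message to every subscriber; return how many received it."""
        queues = list(self._subscribers[channel])
        for queue in queues:
            await queue.put(message)
        return len(queues)

async def demo() -> list:
    bus = InMemoryPubSub()
    # Each queue represents one WebSocket server process holding client connections.
    server_a = bus.subscribe("room:1")
    server_b = bus.subscribe("room:1")
    delivered = await bus.publish("room:1", "hello room")
    return [delivered, await server_a.get(), await server_b.get()]

received = asyncio.run(demo())
```

Each WebSocket server would run a loop that reads from its subscription and forwards messages to the clients it holds, which is what lets the load balancer send a client to any server.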

---

### WebSocket vs SSE

| Feature    | WebSocket   | SSE                      |
| ---------- | ----------- | ------------------------ |
| Direction  | Two-way     | Server → Client          |
| Complexity | Higher      | Lower                    |
| Use cases  | Chat, games | Streaming, notifications |
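
Part of why SSE counts as lower complexity is that it is plain text sent over a regular HTTP response with the `text/event-stream` content type. As a rough illustration, the helper below encodes one event in that format; the name `format_sse` is an assumption made for this sketch, not part of any library.

```python
from typing import Optional

def format_sse(data: str, event: Optional[str] = None) -> str:
    """Encode one Server-Sent Events message in the text/event-stream format."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # Multi-line payloads need one "data:" field per line.
    lines.extend(f"data: {part}" for part in data.split("\n"))
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"

frame = format_sse("Hello\nworld", event="token")
```

There is no equivalent framing for client-to-server messages, which is why the table lists SSE as one-way.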

---

### Mental Model

```
WebSocket = Permanent phone call between client and server
```

---

### Key Takeaways

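* Enables real-time, two-way communication over a single long-lived connection
* Well suited to interactive AI applications
* More complex than SSE but more powerful
* A natural fit for chat-style LLM interfaces where the client also sends messages mid-session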