Day 42 of the TechFromZero series — a browser-to-browser video chat in ~250 lines of code, zero backend.
🌐 Live demo: https://webrtc-from-zero.vercel.app

📺 Open it in two browser tabs, follow the 3-step handshake, and watch your own face appear on both panes.
| Primitive | What it does |
|---|---|
| getUserMedia | Browser asks for the webcam + mic. Returns a MediaStream. |
| RTCPeerConnection | The actual P2P pipe. You add your tracks, the other side gets them. |
| Signaling (SDP + ICE) | Each peer describes itself in an SDP blob. They swap blobs ANY way they like (HTTP, WebSocket, paper plane, … here: copy-paste). |
WebRTC doesn't care how the two peers exchange those blobs. Copy-paste is enough, which is why this app needs no signaling server of its own.
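To make the "swap blobs any way you like" idea concrete, here is a minimal sketch of the copy-paste channel. It assumes the blob is the `{ type, sdp }` pair that `RTCPeerConnection.localDescription` exposes, serialized as JSON. The names `SignalBlob`, `encodeSignal` and `decodeSignal` are made up for this example and are not taken from the repo.

```typescript
// Hypothetical shape of the pasted blob: the { type, sdp } pair of a
// session description. Assumed here, not confirmed against the repo.
interface SignalBlob {
  type: "offer" | "answer";
  sdp: string;
}

// Turn a description into text that can go into a textarea.
function encodeSignal(desc: SignalBlob): string {
  return JSON.stringify({ type: desc.type, sdp: desc.sdp });
}

// Parse pasted text and reject anything that is not a blob of the expected kind.
function decodeSignal(text: string, expected: "offer" | "answer"): SignalBlob {
  let parsed: unknown;
  try {
    parsed = JSON.parse(text.trim());
  } catch {
    throw new Error("Pasted text is not valid JSON");
  }
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("Pasted JSON is not an object");
  }
  const { type, sdp } = parsed as { type?: unknown; sdp?: unknown };
  if (type !== expected) {
    throw new Error(`Expected an ${expected}, got ${String(type)}`);
  }
  // Every SDP document starts with the version line "v=0".
  if (typeof sdp !== "string" || !sdp.startsWith("v=0")) {
    throw new Error("Missing or malformed sdp field");
  }
  return { type: expected, sdp };
}
```

Validating the `type` up front catches the most likely copy-paste mistake (pasting your own offer back into the answer box) before it reaches `setRemoteDescription`.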
Clone the repo and step through history one commit at a time. Each commit adds exactly one idea:
| Commit | What it adds |
|---|---|
| 1 | Next.js 16 + React 19 + TypeScript + Tailwind 4 scaffold |
| 2 | getUserMedia → local webcam preview + stop button |
| 3 | RTCPeerConnection + remote pane + addTrack / ontrack |
| 4 | Caller: createOffer + ICE gather + dump SDP into a textarea |
| 5 | Callee: paste offer → createAnswer → dump answer |
| 6 | Caller: paste answer → handshake completes → frames flow |
| 7 | Live connection-state badge (idle → connecting → connected) |
| 8 | Mute / camera-off / hang-up controls + final polish |
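Commits 7 and 8 are small enough to sketch in a few lines. The snippet below is one possible take, not the repo's code: `badgeFor` maps the six standard `connectionState` values onto the three badge labels from the table, and `setKindEnabled` flips `enabled` on audio or video tracks. Collapsing `disconnected`, `failed` and `closed` back to `idle` is an assumption of this sketch, and `ToggleableTrack` is a hypothetical cut-down view of `MediaStreamTrack`.

```typescript
// The six values RTCPeerConnection.connectionState can take.
type PeerState =
  | "new" | "connecting" | "connected"
  | "disconnected" | "failed" | "closed";

// The three badge labels from the commit table.
type Badge = "idle" | "connecting" | "connected";

// Assumption: every state that is not live or in progress shows as "idle".
function badgeFor(state: PeerState): Badge {
  switch (state) {
    case "connecting":
      return "connecting";
    case "connected":
      return "connected";
    default:
      return "idle";
  }
}

// Hypothetical subset of MediaStreamTrack: just the fields used below.
interface ToggleableTrack {
  kind: string;
  enabled: boolean;
}

// Mute or camera-off without renegotiating: a disabled track keeps the
// connection alive but sends silence (audio) or black frames (video).
function setKindEnabled(
  tracks: ToggleableTrack[],
  kind: "audio" | "video",
  enabled: boolean,
): void {
  for (const track of tracks) {
    if (track.kind === kind) track.enabled = enabled;
  }
}
```

In the browser this would be wired up roughly as `pc.onconnectionstatechange = () => setBadge(badgeFor(pc.connectionState))` and `setKindEnabled(stream.getTracks(), "audio", false)`, where `setBadge` stands in for whatever state setter the UI uses.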
git clone https://github.com/dev48v/webrtc-from-zero
cd webrtc-from-zero
npm install
npm run dev

Open http://localhost:3000 in two browser windows (Chrome / Firefox / Safari all work).
- Both tabs click ▶ Start camera and allow webcam access.
- Tab A clicks 1️⃣ Create offer → copies the JSON blob → pastes into Tab B.
- Tab B clicks 2️⃣ Accept offer + create answer → copies back → Tab A pastes.
- Tab A clicks 3️⃣ Accept answer + connect — the dot turns green, you see your own face on both tabs.
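The three numbered buttons above map onto three short async functions. What follows is a sketch under assumptions, not the repo's implementation: `PeerLike` is a hypothetical, cut-down stand-in for the `RTCPeerConnection` surface so the flow can be read (and run) outside a browser, and the function names are invented. In the app the same calls would be made on a real `RTCPeerConnection`.

```typescript
interface Desc {
  type: "offer" | "answer";
  sdp: string;
}

// Hypothetical subset of RTCPeerConnection used by the handshake.
interface PeerLike {
  iceGatheringState: string;
  localDescription: Desc | null;
  onicegatheringstatechange: (() => void) | null;
  createOffer(): Promise<Desc>;
  createAnswer(): Promise<Desc>;
  setLocalDescription(d: Desc): Promise<void>;
  setRemoteDescription(d: Desc): Promise<void>;
}

// With copy-paste signaling there is no channel for trickling candidates,
// so wait until gathering is complete and the candidates are in the SDP.
function waitForIceComplete(pc: PeerLike): Promise<void> {
  if (pc.iceGatheringState === "complete") return Promise.resolve();
  return new Promise<void>((resolve) => {
    pc.onicegatheringstatechange = () => {
      if (pc.iceGatheringState === "complete") {
        pc.onicegatheringstatechange = null;
        resolve();
      }
    };
  });
}

function requireLocal(pc: PeerLike): Desc {
  if (!pc.localDescription) throw new Error("setLocalDescription has not run yet");
  return pc.localDescription;
}

// Step 1 (Tab A): create the offer and return the blob to copy.
async function createOfferBlob(caller: PeerLike): Promise<Desc> {
  await caller.setLocalDescription(await caller.createOffer());
  await waitForIceComplete(caller);
  return requireLocal(caller);
}

// Step 2 (Tab B): apply the pasted offer and return the answer to copy back.
async function acceptOfferCreateAnswer(callee: PeerLike, offer: Desc): Promise<Desc> {
  await callee.setRemoteDescription(offer);
  await callee.setLocalDescription(await callee.createAnswer());
  await waitForIceComplete(callee);
  return requireLocal(callee);
}

// Step 3 (Tab A): apply the pasted answer; after this the peers can connect.
async function acceptAnswer(caller: PeerLike, answer: Desc): Promise<void> {
  await caller.setRemoteDescription(answer);
}
```

Note that the blob is read from `localDescription` after gathering finishes rather than taken straight from `createOffer()`: the fresh offer carries no ICE candidates yet, so pasting it too early would leave the other tab with no address to reach.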
This demo is intentionally trimmed to focus on the WebRTC primitives. To ship a real call app you would:
| Concern | Add |
|---|---|
| Skip the copy-paste step | A signaling channel (WebSocket, Firebase Realtime, Ably, Pusher, …). The server only relays JSON — it never sees the media. |
| Cross-NAT calls reliably | A TURN server (e.g. coturn, Twilio Network Traversal, Cloudflare Calls). STUN alone fails on roughly 15% of corporate networks. |
| Multi-party calls | Either a mesh (each peer has N-1 connections) or an SFU (LiveKit, mediasoup, Cloudflare Calls) that forwards streams. |
| Screen share | navigator.mediaDevices.getDisplayMedia() — drop into the same addTrack pipeline. |
| Recording | MediaRecorder on the local stream, or pipe the SFU output to a recorder. |
The handshake itself doesn't change. The 8 commits above are the floor — everything else is layers on top.
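As an illustration of how little changes when you add TURN, here is a sketch of a config builder. The returned object has the same `iceServers` shape that the `RTCPeerConnection` constructor accepts. The STUN URL is Google's well-known public endpoint, which is assumed (not confirmed) to be the one the demo uses; the TURN URL and credentials in the usage note are placeholders, and `buildRtcConfig` is a name invented for this example.

```typescript
// Local mirrors of the RTCIceServer / RTCConfiguration fields used here.
interface IceServer {
  urls: string | string[];
  username?: string;
  credential?: string;
}

interface RtcConfig {
  iceServers: IceServer[];
}

// Hypothetical holder for whatever your TURN provider hands out.
interface TurnCredentials {
  url: string;
  username: string;
  credential: string;
}

// Google's public STUN endpoint (assumed to match the demo's setup).
const GOOGLE_STUN: IceServer = { urls: "stun:stun.l.google.com:19302" };

// STUN only by default; pass TURN credentials to add a relay fallback.
function buildRtcConfig(turn?: TurnCredentials): RtcConfig {
  const iceServers: IceServer[] = [GOOGLE_STUN];
  if (turn) {
    iceServers.push({
      urls: turn.url,
      username: turn.username,
      credential: turn.credential,
    });
  }
  return { iceServers };
}
```

Usage would look like `new RTCPeerConnection(buildRtcConfig({ url: "turn:turn.example.com:3478", username: "demo", credential: "secret" }))`. ICE tries direct and STUN-derived candidates alongside the relay and prefers them when they work, so listing a TURN server does not force every call through it.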
- Next.js 16 — App Router, React 19, TypeScript
- Tailwind CSS 4
- Plain browser RTCPeerConnection + MediaStream: no peer.js, no SDK
- Google's public STUN server (free for development)
No paid services. No API keys. No backend.
MIT