OpenAI Realtime Translation SDK — WebSocket + WebRTC transports, React hook, zero-dependency core.
```bash
npm install @ekaone/rendition
```

| Transport | When to use |
|---|---|
| `"webrtc"` | Browser mic capture: audio flows via `RTCPeerConnection`, no manual audio piping |
| `"websocket"` | Server-side pipelines: Twilio, SIP, broadcast ingest, media workers |

`createTranslationSession()` auto-selects the transport when none is given: `"webrtc"` in the browser, `"websocket"` in Node.
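
The auto-selection rule can be pictured with a small sketch. This is not the library's actual detection logic, which the README does not show; `pickTransport` and `SketchTransport` are hypothetical names, and the sketch assumes the presence of a global `RTCPeerConnection` constructor is what distinguishes a browser from Node.

```typescript
type SketchTransport = "webrtc" | "websocket";

// Hypothetical helper: treats any environment that exposes an
// RTCPeerConnection constructor as a browser. The real factory may
// use a different check.
function pickTransport(
  env: { RTCPeerConnection?: unknown } = globalThis as { RTCPeerConnection?: unknown },
): SketchTransport {
  return typeof env.RTCPeerConnection === "function" ? "webrtc" : "websocket";
}

// Passing the environment explicitly keeps the rule easy to test.
const inNodeLikeEnv = pickTransport({});
const inBrowserLikeEnv = pickTransport({ RTCPeerConnection: class {} });
console.log(inNodeLikeEnv, inBrowserLikeEnv);
```

Passing `transport` explicitly, as the examples in this README do, bypasses any such detection.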
```bash
npm install react react-dom
```

```tsx
import { useTranslationSession } from "@ekaone/rendition/react";

function TranslationWidget() {
  const {
    status,
    outputTranscript,
    inputTranscript,
    isSpeaking,
    mediaStream,
    connect,
    disconnect,
    error,
  } = useTranslationSession({
    apiKey: ephemeralClientSecret, // from your backend; never ship the raw key to the browser
    targetLanguage: "es", // BCP-47 tag
    transport: "webrtc", // mic → WebRTC → OpenAI
  });

  // Mute or unmute by toggling every audio track on the local mic stream.
  function toggleMic() {
    if (!mediaStream) return;
    for (const track of mediaStream.getAudioTracks()) {
      track.enabled = !track.enabled;
    }
  }

  return (
    <div>
      <p>Status: {status} {isSpeaking && "(speaking)"}</p>
      <button onClick={connect} disabled={status !== "idle" && status !== "closed"}>Start</button>
      <button onClick={disconnect} disabled={status === "idle" || status === "closed"}>Stop</button>
      <button onClick={toggleMic} disabled={!mediaStream}>Mute / Unmute</button>
      {error && <p>Error: {error}</p>}
      <p><strong>Source:</strong> {inputTranscript}</p>
      <p><strong>Translation:</strong> {outputTranscript}</p>
    </div>
  );
}
```

| Option | Type | Default | Description |
|---|---|---|---|
| `apiKey` | `string` | required | OpenAI API key or ephemeral client secret |
| `targetLanguage` | `string` | required | BCP-47 output language tag, e.g. `"es"`, `"fr"`, `"ja"` |
| `transport` | `"webrtc" \| "websocket"` | `"webrtc"` in browser | Transport to use |
| `safetyIdentifier` | `string` | none | Hashed user ID sent as `OpenAI-Safety-Identifier` header |
| `model` | `string` | `"gpt-realtime-translate"` | Model override |
| `wsEndpoint` | `string` | OpenAI default | Custom WebSocket endpoint |
| `onAudioDelta` | `(delta: string) => void` | none | Override audio playback (WebSocket path) |
| `autoConnect` | `boolean` | `false` | Connect automatically on mount |

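
The `safetyIdentifier` option expects a hashed user ID. The README does not say which hash to use, so the following is only a sketch under assumptions: `hashUserId` is a hypothetical helper, SHA-256 with hex output is an assumed choice, and the optional salt is an illustration. It relies on Node's built-in `node:crypto` module, so it belongs on the server.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: derives a stable, non-reversible identifier from an
// internal user ID. The hash algorithm and the salt handling are assumptions,
// not requirements stated by the library.
function hashUserId(userId: string, salt = ""): string {
  return createHash("sha256").update(salt + userId).digest("hex");
}

// The result is a 64-character hex string that can be passed as safetyIdentifier.
const exampleSafetyId = hashUserId("user-1234", "app-specific-salt");
console.log(exampleSafetyId.length);
```

A keyed hash such as HMAC may be preferable when user IDs are guessable; the point here is only that the raw ID never leaves your backend.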
| Field | Type | Description |
|---|---|---|
| `status` | `TranslationSessionStatus` | One of `idle`, `connecting`, `ready`, `translating`, `closing`, `closed`, `error` |
| `outputTranscript` | `string` | Accumulated translated transcript |
| `inputTranscript` | `string` | Accumulated source transcript |
| `isSpeaking` | `boolean` | `true` while translated transcript deltas are actively arriving (~700 ms idle reset) |
| `mediaStream` | `MediaStream \| null` | Local mic stream (WebRTC only); use it to mute or replace the mic track |
| `connect` | `() => Promise<void>` | Start the session |
| `disconnect` | `() => Promise<void>` | Gracefully close |
| `appendAudio` | `(b64: string) => void` | Append audio manually (WebSocket path) |
| `error` | `string \| null` | Last error message |

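
The `isSpeaking` behaviour (true while deltas arrive, reset after roughly 700 ms of silence) can be sketched as a small idle-reset tracker. This is an illustration, not the hook's implementation: `SpeakingTracker` is a hypothetical name, and timestamps are passed in explicitly so the logic does not depend on real timers.

```typescript
// Hypothetical sketch of an idle-reset flag driven by explicit timestamps.
class SpeakingTracker {
  private lastDeltaAt: number | null = null;
  private readonly idleMs: number;

  constructor(idleMs = 700) {
    this.idleMs = idleMs;
  }

  // Call whenever a translated transcript delta arrives.
  noteDelta(nowMs: number): void {
    this.lastDeltaAt = nowMs;
  }

  // True only if a delta arrived less than idleMs before nowMs.
  isSpeaking(nowMs: number): boolean {
    return this.lastDeltaAt !== null && nowMs - this.lastDeltaAt < this.idleMs;
  }
}

const tracker = new SpeakingTracker();
tracker.noteDelta(1000);
console.log(tracker.isSpeaking(1500), tracker.isSpeaking(1700));
```

In a real component the same idea is usually driven by a timer that is restarted on every delta.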
```ts
import { createTranslationSession } from "@ekaone/rendition";

const session = await createTranslationSession({
  apiKey: process.env.OPENAI_API_KEY,
  targetLanguage: "fr",
  transport: "websocket",
});

session.on("session.output_transcript.delta", (evt) => {
  process.stdout.write(evt.delta);
});

session.on("session.output_audio.delta", (evt) => {
  // write evt.delta (base64 PCM16) to your media output
});

session.appendAudio(base64Pcm16Chunk);
await session.close();
```

Server-side variant using the `ws` package: it supports the `Authorization` and `OpenAI-Safety-Identifier` headers that the browser `WebSocket` cannot set.

```bash
npm install ws
```

```ts
import { createNodeWebSocketSession } from "@ekaone/rendition";

const session = await createNodeWebSocketSession({
  apiKey: process.env.OPENAI_API_KEY,
  targetLanguage: "ja",
  safetyIdentifier: "hashed-user-id",
});
```

Audio helpers are available via a dedicated subpath so they do not bloat the session protocol surface:

```ts
import {
  float32ToBase64Pcm16,
  base64Pcm16ToFloat32,
  playPcm16Delta,
} from "@ekaone/rendition/utils";

// Web Audio Float32 → base64 PCM16 (for appendAudio)
const b64 = float32ToBase64Pcm16(float32Array);

// base64 PCM16 → Float32 (for custom playback)
const float32 = base64Pcm16ToFloat32(b64);

// Play a delta chunk directly via Web Audio API
playPcm16Delta(audioContext, b64, 24000);
```

The same helpers are also re-exported from the main `@ekaone/rendition` entry for backwards compatibility.

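
For readers curious what a Float32 ↔ base64 PCM16 conversion involves, here is a minimal standalone sketch. It is not the library's source: `encodePcm16` and `decodePcm16` are hypothetical names, and the sketch assumes little-endian 16-bit samples and an environment with global `btoa`/`atob` (browsers and recent Node versions).

```typescript
// Hypothetical encoder: clamps each sample to [-1, 1], scales it to a signed
// 16-bit integer, writes it little-endian, and base64-encodes the bytes.
function encodePcm16(samples: Float32Array): string {
  const bytes = new Uint8Array(samples.length * 2);
  const view = new DataView(bytes.buffer);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i]));
    view.setInt16(i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true);
  }
  let binary = "";
  for (let i = 0; i < bytes.length; i++) binary += String.fromCharCode(bytes[i]);
  return btoa(binary);
}

// Hypothetical decoder: the inverse of encodePcm16.
function decodePcm16(b64: string): Float32Array {
  const binary = atob(b64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
  const view = new DataView(bytes.buffer);
  const out = new Float32Array(bytes.length / 2);
  for (let i = 0; i < out.length; i++) {
    const v = view.getInt16(i * 2, true);
    out[i] = v < 0 ? v / 0x8000 : v / 0x7fff;
  }
  return out;
}

// Round trip: full-scale and zero samples survive exactly; other values
// come back with a small quantisation error.
const roundTripped = decodePcm16(encodePcm16(new Float32Array([0, 1, -1, 0.5])));
console.log(Array.from(roundTripped));
```

In application code, prefer the exported helpers; the sketch only illustrates the data format that `appendAudio` and the audio delta events carry.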
| Event | Payload | Description |
|---|---|---|
| `session.output_audio.delta` | `{ delta: string }` | Base64 PCM16 translated audio chunk |
| `session.output_transcript.delta` | `{ delta: string }` | Translated transcript delta |
| `session.input_transcript.delta` | `{ delta: string }` | Source transcript delta |
| `session.closed` | none | Session fully flushed and closed |
| `session.created` | `{ session }` | Session created |
| `session.updated` | `{ session }` | Session config updated |
| `error` | `{ error }` | API error |

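
An event table like this maps naturally onto a typed emitter, where the event name determines the payload type. The sketch below is an illustration rather than the library's emitter: `TinyEmitter` and `SketchEvents` are hypothetical, and only three of the events are modelled.

```typescript
// Hypothetical event map covering a subset of the events listed above.
type SketchEvents = {
  "session.output_transcript.delta": { delta: string };
  "session.input_transcript.delta": { delta: string };
  "session.closed": undefined;
};

type AnyHandler = (payload: any) => void;

// Minimal typed emitter: on() returns an unsubscribe function.
class TinyEmitter<Events extends Record<string, unknown>> {
  private handlers = new Map<keyof Events, Set<AnyHandler>>();

  on<K extends keyof Events>(event: K, handler: (payload: Events[K]) => void): () => void {
    const set = this.handlers.get(event) ?? new Set<AnyHandler>();
    set.add(handler);
    this.handlers.set(event, set);
    return () => {
      set.delete(handler);
    };
  }

  emit<K extends keyof Events>(event: K, payload: Events[K]): void {
    this.handlers.get(event)?.forEach((handler) => handler(payload));
  }
}

// Accumulate translated text the same way outputTranscript is described.
const sketchEmitter = new TinyEmitter<SketchEvents>();
let translated = "";
sketchEmitter.on("session.output_transcript.delta", (evt) => {
  translated += evt.delta;
});
sketchEmitter.emit("session.output_transcript.delta", { delta: "Bon" });
sketchEmitter.emit("session.output_transcript.delta", { delta: "jour" });
console.log(translated);
```

With this shape, a handler for a misspelled event name or a wrong payload field fails at compile time instead of at runtime.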
```text
idle → connecting → ready → translating → closing → closed
     ↘ error
```

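
The lifecycle can also be expressed as a transition check. This is a sketch built only from the diagram: `SketchStatus` and `canTransition` are hypothetical names, and the rule that any state other than `closed` or `error` may move to `error` is an assumption, since the diagram does not say which states the error branch leaves from.

```typescript
type SketchStatus =
  | "idle"
  | "connecting"
  | "ready"
  | "translating"
  | "closing"
  | "closed"
  | "error";

// The linear path shown in the diagram.
const HAPPY_PATH: readonly SketchStatus[] = [
  "idle",
  "connecting",
  "ready",
  "translating",
  "closing",
  "closed",
];

// Assumption: "error" is reachable from every state except "closed" and "error".
function canTransition(from: SketchStatus, to: SketchStatus): boolean {
  if (to === "error") return from !== "closed" && from !== "error";
  const i = HAPPY_PATH.indexOf(from);
  return i !== -1 && HAPPY_PATH[i + 1] === to;
}

console.log(canTransition("idle", "connecting"), canTransition("idle", "ready"));
```

Reconnecting after `closed` is not modelled here, because the diagram does not show it.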
WebSocket teardown follows the spec:

1. Send `session.close`
2. Keep reading events; do not close the socket yet
3. Receive `session.closed`, then close the socket

Skipping step 2 drops buffered translated audio still draining from the session.

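
The three teardown steps can be sketched against a minimal socket abstraction. This is an illustration, not the library's code: `SocketLike` and `closeGracefully` are hypothetical, and the JSON message shape `{ type: "session.close" }` is an assumed wire format.

```typescript
// Hypothetical minimal socket surface, enough to show the ordering.
interface SocketLike {
  send(data: string): void;
  close(): void;
  onmessage: ((event: { data: string }) => void) | null;
}

function closeGracefully(
  socket: SocketLike,
  onEvent: (evt: { type: string }) => void,
): Promise<void> {
  return new Promise((resolve) => {
    socket.onmessage = (event) => {
      const evt = JSON.parse(event.data) as { type: string };
      // Step 2: keep delivering events (for example trailing audio deltas).
      onEvent(evt);
      if (evt.type === "session.closed") {
        // Step 3: only now is it safe to close the socket.
        socket.close();
        resolve();
      }
    };
    // Step 1: ask the session to flush and close (assumed message shape).
    socket.send(JSON.stringify({ type: "session.close" }));
  });
}
```

A production version would likely add a timeout so that a missing `session.closed` event cannot keep the socket open indefinitely.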
| Import path | Contents |
|---|---|
| `@ekaone/rendition` | `createTranslationSession`, `createNodeWebSocketSession`, `WebSocketSession`, `WebRTCSession`, audio utils, all types |
| `@ekaone/rendition/react` | `useTranslationSession`, hook types |
| `@ekaone/rendition/utils` | `float32ToBase64Pcm16`, `base64Pcm16ToFloat32`, `playPcm16Delta` |

```text
src/
├── types/                      # Shared TypeScript types
├── core/
│   ├── emitter.ts              # Typed zero-dep event emitter
│   ├── audio.ts                # PCM16 ↔ Float32, Web Audio playback
│   ├── websocket-session.ts    # Browser + Node WebSocket sessions
│   ├── webrtc-session.ts       # Browser WebRTC session
│   └── factory.ts              # createTranslationSession()
├── react/
│   ├── useTranslationSession.ts
│   └── index.ts
├── utils/
│   └── index.ts                # Audio helper re-exports
└── index.ts                    # Core barrel
```

MIT © Eka Prasetia