Place WhatsApp voice calls from Node.js.
Wraps WhatsApp Web's official VoIP WASM stack and uses Baileys for authentication and signaling. Audio (MP3, WAV, or Float32Array) is encoded with Opus and sent over the live RTP session.
Author: ShellTear
- ✅ Outbound 1:1 voice calls
- ✅ Stream audio from MP3/WAV files
- ✅ Receive remote audio as Float32Array
- ✅ Mute / unmute / hang up
- ❌ Group calls
- ❌ Video
- ❌ Inbound calls
- Node.js ≥ 20
- ffmpeg on PATH (used to decode/resample audio sources)
- A linked WhatsApp account (you'll scan a QR on first run)
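The requirements above can be checked before the first run. The following is a minimal sketch, not part of baileys-caller: `nodeMajor`, `hasFfmpeg`, and `checkPrerequisites` are hypothetical helper names, and the ffmpeg probe simply assumes that `ffmpeg -version` exits with status 0 when the binary is reachable on PATH.

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical helper: extract the major version from "20.11.0" or "v20.11.0".
export function nodeMajor(version: string): number {
  const match = /^v?(\d+)/.exec(version);
  return match ? Number(match[1]) : Number.NaN;
}

// Hypothetical helper: true when `ffmpeg -version` can be spawned and exits cleanly.
export function hasFfmpeg(): boolean {
  const result = spawnSync("ffmpeg", ["-version"], { stdio: "ignore" });
  return result.error === undefined && result.status === 0;
}

// Returns a list of human-readable problems; an empty list means both checks passed.
export function checkPrerequisites(): string[] {
  const problems: string[] = [];
  const major = nodeMajor(process.versions.node);
  if (!(major >= 20)) {
    problems.push(`Node.js 20 or newer is required, found ${process.versions.node}`);
  }
  if (!hasFfmpeg()) {
    problems.push("ffmpeg was not found on PATH");
  }
  return problems;
}
```

Calling `checkPrerequisites()` at startup and printing any returned messages gives a clearer failure than an error surfacing later during audio decoding.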
This package isn't published on npm. Pull it in directly from git:
git clone https://github.com/SheIITear/baileys-caller
cd baileys-caller
npm install
npm run build

You can also depend on it from another project via a git URL in package.json:
{
"dependencies": {
"baileys-caller": "git+https://github.com/SheIITear/baileys-caller.git",
"@whiskeysockets/baileys": "^7.0.0-rc11"
}
}

@whiskeysockets/baileys is a peer dependency; install it in your project alongside this one.
import { VoipClient } from "baileys-caller";
const client = new VoipClient({ authDir: "./auth" });
await client.connect(); // first run prints a QR for WhatsApp > Linked Devices
const call = await client.call("12345678901", {
audioSource: "./hello.mp3",
});
call.on("ringing", () => console.log("ringing"));
call.on("connected", () => console.log("connected"));
call.on("audio", (pcm) => { /* 16 kHz mono Float32Array from the peer */ });
call.on("ended", (reason) => console.log("ended:", reason));
await call.waitForEnd();
client.disconnect();

Run the bundled example from a clone:
npx tsx examples/call.mts ./auth 12345678901 ./hello.mp3

The `VoipClient` constructor accepts these options:

| Option | Type | Description |
|---|---|---|
| `authDir` | `string` | Baileys multi-file auth state directory |
`client.connect()` connects to WhatsApp. On first run a QR code is printed; scan it from WhatsApp > Settings > Linked Devices. Subsequent runs reuse authDir.
`client.call(phoneNumber, options)` places an outbound call. phoneNumber is digits only (e.g. "12345678901").
| Option | Type | Description |
|---|---|---|
| `audioSource` | `string \| "silence"` | Path to MP3/WAV, or "silence" for an empty stream |
| `durationMs` | `number?` | Auto-hangup after N ms |
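Because the phone number must contain digits only, user-entered numbers usually need their formatting stripped first. The sketch below shows one way to do that; `toDigits` is a hypothetical helper, not part of the library. It only removes formatting characters and does not add a missing country code.

```typescript
// Hypothetical helper: strip everything except ASCII digits from a phone number.
// Assumption: the caller already included the country code (as in "12345678901").
export function toDigits(input: string): string {
  const digits = input.replace(/\D/g, "");
  if (digits.length === 0) {
    throw new Error(`no digits found in "${input}"`);
  }
  return digits;
}
```

The result can then be passed as the first argument of `client.call()`, for example `client.call(toDigits("+1 (234) 567-8901"), { audioSource: "silence" })`.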
`client.disconnect()` closes the WhatsApp socket and releases resources.
The call object returned by `client.call()` extends EventEmitter.
| Event | Payload | When |
|---|---|---|
| `ringing` | none | Remote device is ringing |
| `connected` | none | Call answered, media flowing |
| `audio` | `Float32Array` | 16 kHz mono PCM frame from the remote peer |
| `ended` | `string` | Call ended (hangup, timeout, rejected) |
| `error` | `Error` | Fatal error |
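Frames delivered by the `audio` event often need converting before they can be written to a WAV file or passed to another audio tool. The sketch below converts a frame to 16-bit PCM and computes its duration. Both helpers are hypothetical, and the code assumes samples are normalized to the range [-1, 1], which the table above does not state; out-of-range values are clamped.

```typescript
// Sample rate of incoming frames, as listed for the audio event.
const SAMPLE_RATE_HZ = 16000;

// Hypothetical helper: convert normalized float samples to signed 16-bit PCM.
// Assumption: samples lie in [-1, 1]; anything outside is clamped.
export function floatToInt16(frame: Float32Array): Int16Array {
  const out = new Int16Array(frame.length);
  frame.forEach((sample, i) => {
    const clamped = Math.max(-1, Math.min(1, sample));
    // Negative values scale to -32768, positive values to 32767.
    out[i] = clamped < 0 ? Math.round(clamped * 32768) : Math.round(clamped * 32767);
  });
  return out;
}

// Hypothetical helper: duration of one mono frame in milliseconds.
export function frameDurationMs(frame: Float32Array): number {
  return (frame.length * 1000) / SAMPLE_RATE_HZ;
}
```

Inside a `call.on("audio", ...)` handler, `floatToInt16(pcm)` yields samples suitable for a 16-bit mono WAV writer, and summing `frameDurationMs(pcm)` tracks how much audio has been received.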
- `call.end(): void`: hang up
- `call.mute(muted: boolean): void`: toggle outgoing mute
- `call.waitForEnd(): Promise<string>`: resolves with the end reason
- `call.callId: string`
- Baileys handles WhatsApp authentication, encryption, and signaling stanzas.
- The WhatsApp Web VoIP WASM stack runs in-process to negotiate the call, encode/decode Opus, and manage the RTP/SRTP session.
- A pthread pool of worker_threads mirrors the browser's Web Worker pool the WASM expects.
- Outbound audio is decoded with ffmpeg, resampled to 16 kHz mono, fed into the WASM, and delivered to the relay.
- Inbound audio is exposed as Float32Array chunks via the audio event.
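The outbound decode step can be illustrated with a standalone sketch. The exact ffmpeg invocation used by the library is not documented here, so the arguments below are an assumption: they ask ffmpeg for raw 32-bit little-endian float samples, mono, at 16 kHz on stdout. `decodeArgs` and `bytesToFloat32` are hypothetical helpers, and the byte conversion assumes a little-endian host.

```typescript
// Hypothetical helper: ffmpeg arguments that decode any input file to
// raw f32le PCM, 1 channel, 16 kHz, written to stdout.
export function decodeArgs(inputPath: string): string[] {
  return [
    "-hide_banner", "-loglevel", "error",
    "-i", inputPath,
    "-f", "f32le", // raw 32-bit float, little-endian
    "-ac", "1",    // mono
    "-ar", "16000", // 16 kHz
    "pipe:1",
  ];
}

// Hypothetical helper: reinterpret raw f32le bytes as samples.
// Bytes are copied into a fresh buffer so the view is always 4-byte aligned;
// a trailing partial sample (1 to 3 bytes) is dropped.
export function bytesToFloat32(bytes: Uint8Array): Float32Array {
  const usable = bytes.length - (bytes.length % 4);
  const copy = new Uint8Array(usable);
  copy.set(bytes.subarray(0, usable));
  return new Float32Array(copy.buffer, 0, usable / 4);
}
```

With Node's `child_process.spawn("ffmpeg", decodeArgs(path))`, the stdout chunks can be concatenated and passed through `bytesToFloat32` to obtain samples in the same format the list above describes.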
authDir stores Baileys session keys after the first QR scan. Treat it like a credential — anyone with that directory can act as your linked device.
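One way to act on this is to restrict filesystem permissions on the directory. The sketch below is an illustration only: `lockDownAuthDir` is a hypothetical helper, it assumes a POSIX filesystem (permission bits behave differently on Windows), and it does not replace keeping the directory out of version control and backups.

```typescript
import { chmodSync, mkdirSync, readdirSync, statSync } from "node:fs";
import { join } from "node:path";

// Hypothetical helper: make the auth directory readable by its owner only.
// Returns the resulting permission bits of the directory (e.g. 0o700).
export function lockDownAuthDir(dir: string): number {
  // Create the directory if needed, then force owner-only access.
  mkdirSync(dir, { recursive: true, mode: 0o700 });
  chmodSync(dir, 0o700);
  // Restrict any session files already present to owner read/write.
  for (const entry of readdirSync(dir, { withFileTypes: true })) {
    if (entry.isFile()) {
      chmodSync(join(dir, entry.name), 0o600);
    }
  }
  return statSync(dir).mode & 0o777;
}
```

Calling `lockDownAuthDir("./auth")` before constructing the client keeps other local users from reading the session keys.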
The WASM binary and its loader (whatsapp.wasm, loader.js, worker-modules.js) live under assets/wasm/. To refresh them from a current WhatsApp Web session:
npm run fetch-wasm

MIT © ShellTear