From Tamil பின்மொழி (pin mozhi) -- "afterword"
An AI-powered CLI for testing SIP and WebRTC voice endpoints. Describe what you want to test in plain English, and Pinmoli handles the protocol details -- INVITE flows, codec negotiation, RTP streaming, failure analysis.
Think "Postman for Voice", but conversational.
$ pinmoli
Pinmoli - SIP/WebRTC Testing Agent
You: Test sip:+15551234567@trunk.example.com with INVITE, wait 15 seconds for a response
Pinmoli: Running INVITE test against sip:+15551234567@trunk.example.com...
[sip_test] INVITE sip:+15551234567@trunk.example.com
├─ 100 Trying (12ms)
├─ 180 Ringing (45ms)
├─ 200 OK (1203ms) — codec: PCMU/8000
├─ ACK sent
├─ RTP: sent 150 packets (voice-hello, 3.0s)
├─ RTP: waiting 15s for agent response...
├─ RTP: received 1247 packets (15.0s)
└─ BYE sent, 200 OK
Call completed successfully. The agent answered after 1.2s and spoke for
the full 15-second window. Codec negotiated: PCMU/8000 (G.711 u-law).
- Natural language interface -- describe tests in plain English, the AI agent translates to protocol operations
- Full SIP call flows -- OPTIONS pings, INVITE with SDP offer/answer, REGISTER with auth, ACK, BYE
- WebRTC via WHIP -- connect to any WHIP endpoint (LiveKit, Cloudflare, Janus), negotiate ICE/DTLS/SRTP, send and receive audio
- Bidirectional RTP audio -- send pre-generated or custom speech, receive and measure agent responses
- DTMF send and receive (RFC 4733) -- send telephone-event RTP packets during active calls, detect incoming DTMF from the remote side
- Runtime speech synthesis -- generate custom TTS audio on the fly with espeak
- Failure analysis -- pattern-matched diagnostics with actionable recovery steps
- Test persistence -- save and reload test configurations (SQLite with FTS5)
- Works with any SIP or WebRTC endpoint -- LiveKit, Daily.co, Twilio, Cloudflare, Asterisk, FreeSWITCH, or any RFC 3261/WHIP-compliant server
- Automatic packet capture -- every session captures SIP signaling and RTP media to pcap (Wireshark-ready) and fails fast if the capture volume is not mounted
- Real codec negotiation -- PCMU (G.711 u-law), PCMA (G.711 A-law), G722 (wideband), opus. Transcodes audio at send time to match the negotiated codec
- Runs in Docker -- all dependencies (ffmpeg, espeak, tcpdump, tini) included, no local setup required
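To make the codec negotiation bullet concrete, here is a minimal sketch of building an SDP audio offer and reading back the codec an answer selects. It is not Pinmoli's `sdp.ts`: the function names, the session name, and the dynamic payload types 111 (Opus) and 101 (telephone-event) are assumptions chosen for illustration. The static payload types 0, 8, and 9 come from RFC 3551.

```typescript
// Hypothetical sketch; names and structure are illustrative, not Pinmoli's sdp.ts.
type CodecName = "PCMU" | "PCMA" | "G722" | "opus";

interface CodecInfo {
  payloadType: number; // RTP payload type number
  rtpmap: string; // encoding name / clock rate [/ channels]
}

// PCMU, PCMA and G722 use static payload types from RFC 3551.
// Opus has no static type; 111 is an assumed dynamic value.
const CODECS: Record<CodecName, CodecInfo> = {
  PCMU: { payloadType: 0, rtpmap: "PCMU/8000" },
  PCMA: { payloadType: 8, rtpmap: "PCMA/8000" },
  G722: { payloadType: 9, rtpmap: "G722/8000" }, // SDP advertises 8000 for G722
  opus: { payloadType: 111, rtpmap: "opus/48000/2" },
};

const TELEPHONE_EVENT_PT = 101; // assumed dynamic payload type for RFC 4733 DTMF

// Build an audio-only offer listing codecs in preference order.
function buildAudioOffer(localIp: string, rtpPort: number, codecs: CodecName[]): string {
  const payloadTypes = codecs.map((c) => CODECS[c].payloadType);
  const lines = [
    "v=0",
    `o=- 0 0 IN IP4 ${localIp}`,
    "s=pinmoli-sketch",
    `c=IN IP4 ${localIp}`,
    "t=0 0",
    `m=audio ${rtpPort} RTP/AVP ${[...payloadTypes, TELEPHONE_EVENT_PT].join(" ")}`,
    ...codecs.map((c) => `a=rtpmap:${CODECS[c].payloadType} ${CODECS[c].rtpmap}`),
    `a=rtpmap:${TELEPHONE_EVENT_PT} telephone-event/8000`,
    `a=fmtp:${TELEPHONE_EVENT_PT} 0-15`,
    "a=sendrecv",
  ];
  return lines.join("\r\n") + "\r\n";
}

// Read the codec an answer selects: the first payload type on its m=audio line.
function negotiatedCodec(answerSdp: string): string | undefined {
  const lines = answerSdp.split(/\r?\n/);
  const media = lines.find((l) => l.startsWith("m=audio "));
  if (!media) return undefined;
  const firstPt = media.split(" ")[3]; // m=audio <port> <proto> <pt> ...
  if (firstPt === undefined) return undefined;
  const prefix = `a=rtpmap:${firstPt} `;
  const rtpmap = lines.find((l) => l.startsWith(prefix));
  if (rtpmap) return rtpmap.slice(prefix.length);
  // Static payload types may appear without an rtpmap line.
  return Object.values(CODECS).find((c) => String(c.payloadType) === firstPt)?.rtpmap;
}
```

Offering `["PCMA", "PCMU"]` yields an `m=audio` line ending in `8 0 101`. If the answer's `m=audio` line starts with payload type 0, the sketch reports `PCMU/8000`, the same form as the codec shown in the sample session above.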
Pinmoli is built on pi, the same open-source agent framework that powers OpenClaw. Where OpenClaw uses pi to build a general-purpose personal AI assistant (messaging gateway, file operations, shell commands across 50+ integrations), Pinmoli takes the opposite approach: a domain-restricted agent that does exactly one thing -- SIP/WebRTC testing -- and does it well.
The key difference is scope. OpenClaw embeds pi-coding-agent to give an LLM full access to read, write, edit, and bash tools across an entire system. Pinmoli uses only pi-agent-core and pi-ai with a locked-down tool allowlist of 7 voice-testing tools. The LLM cannot touch the filesystem, run shell commands, or do anything outside voice protocol testing.
pi-ai                              pi-tui
Multi-provider LLM abstraction     Terminal UI with diff rendering
Anthropic, OpenAI, Google,         Editor, Markdown, Box, Text
Bedrock, Mistral, Groq, ...        Keyboard input, layout engine
    │                                  │
    ▼                                  ▼
pi-agent-core                      Pinmoli TUI (src/ui/tui.ts)
Agent loop, tool execution,        Wraps pi-tui for interactive mode
event subscription, AbortSignal    Falls back to raw Terminal for tests
    │
    ▼
PinmoliAgent (src/agent/runtime.ts)
Domain-restricted system prompt
7-tool allowlist, event routing
Pinmoli uses three pi packages:
| Package | Role in Pinmoli |
|---|---|
| @mariozechner/pi-agent-core | Agent loop -- receives user input, calls LLM, executes tools, streams events back |
| @mariozechner/pi-ai | LLM provider abstraction -- swap between Gemini, Claude, GPT with one config change |
| @mariozechner/pi-tui | Terminal rendering -- differential updates, editor with autocomplete, flicker-free output |
User input
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ TUI (src/ui/tui.ts) │
│ pi-tui Editor → reads input → sends to agent │
│ Agent events → streamed back → rendered in real time │
│ Ctrl+C: abort current operation / clear input / quit │
└──────────────────────┬───────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ Agent (src/agent/runtime.ts) │
│ pi-agent-core Agent with pi-ai model │
│ │
│ System prompt constrains LLM to SIP testing only: │
│ "You are Pinmoli, a SIP/WebRTC testing assistant. │
│ You ONLY help test voice protocols. │
│ You CANNOT edit files, run bash, or access the filesystem." │
│ │
│ Tool allowlist enforced by registry (src/tools/registry.ts): │
│ sip_test, webrtc_test, generate_audio, analyze_failure, │
│ save_test, load_test, list_tests │
└──────────────────────┬───────────────────────────────────────────┘
│ LLM decides which tool to call
▼
┌──────────────────────────────────────────────────────────────────┐
│ Tools (src/tools/*.ts) │
│ │
│ sip_test ─────► SIP Engine (async generator, streams events) │
│ webrtc_test ──► WebRTC Engine (WHIP signaling, werift stack) │
│ generate_audio ► ffmpeg/espeak (sine, DTMF, silence, speech) │
│ analyze_failure ► Pattern matching on event history │
│ save/load/list ► SQLite with FTS5 (src/storage/db.ts) │
└──────────────────────┬───────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ SIP Engine (src/sip/engine.ts) │
│ │
│ async function* runSipTest(config): AsyncGenerator<SipEvent> │
│ │
│ ┌─ protocol.ts ── SIP message builder (INVITE, ACK, BYE) │
│ ├─ sdp.ts ─────── SDP offer/answer (opus, PCMU, PCMA, G722) │
│ ├─ rtp-receiver.ts ── RTP/DTMF send/receive on UDP socket │
│ ├─ dtmf.ts ───── RFC 4733 encode/decode, DtmfDetector │
│ └─ audio.ts ───── Sample resolution (WAV files, generated) │
│ │
│ Yields events as they happen: │
│ SIP messages, RTP stats, DTMF, diagnostics, codec negotiation │
└──────────────────────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ WebRTC Engine (src/webrtc/engine.ts) │
│ │
│ async function* runWebRtcTest(config): AsyncGenerator<TestEvent>│
│ │
│ ┌─ whip.ts ───── WHIP signaling (RFC 9725: POST offer→answer) │
│ ├─ audio-frames.ts ── PCM16 frame chunking + WAV save │
│ └─ werift ────── Pure TS WebRTC stack (ICE/DTLS/SRTP/RTP) │
│ │
│ Yields events as they happen: │
│ WHIP signaling, ICE/DTLS, RTP stats, DTMF, agent audio │
└──────────────────────────────────────────────────────────────────┘
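The WHIP step in the diagram above (POST an SDP offer, receive an SDP answer) can be sketched as two pure functions. This is an illustration under assumptions, not the contents of `whip.ts`: the function and type names are hypothetical. The HTTP details follow the WHIP specification: the offer is sent as `application/sdp`, and an accepting server replies `201 Created` with the answer in the body and a `Location` header naming the session resource.

```typescript
// Hypothetical sketch of WHIP signaling; not Pinmoli's whip.ts.
interface WhipRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

interface WhipSession {
  answerSdp: string;
  resourceUrl: string; // send DELETE here to end the session
}

// Describe the HTTP request that carries the SDP offer.
function buildWhipRequest(endpoint: string, offerSdp: string, bearerToken?: string): WhipRequest {
  const headers: Record<string, string> = { "Content-Type": "application/sdp" };
  if (bearerToken) headers["Authorization"] = `Bearer ${bearerToken}`;
  return { url: endpoint, method: "POST", headers, body: offerSdp };
}

// Interpret the server's reply to the offer.
function parseWhipResponse(
  endpoint: string,
  status: number,
  location: string | null,
  body: string,
): WhipSession {
  if (status !== 201) throw new Error(`WHIP offer rejected with HTTP ${status}`);
  if (!location) throw new Error("WHIP response is missing the Location header");
  // Location may be relative, so resolve it against the endpoint URL.
  return { answerSdp: body, resourceUrl: new URL(location, endpoint).toString() };
}
```

Keeping request construction separate from the network call makes both halves easy to unit test. In practice the request object would be passed to `fetch(req.url, req)` and the reply's status, `Location` header, and body handed to `parseWhipResponse`.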
The SIP engine is an async function* that yields events as they happen -- a SIP 100 Trying at 12ms, a 200 OK at 1200ms, RTP packet counts every second. The TUI renders each event the moment it arrives. No buffering, no callbacks, no polling.
// The engine yields events in real time
for await (const event of runSipTest(config)) {
  tui.render(event); // instant display
}
This design makes the engine usable outside the TUI too -- pipe events to NDJSON, feed them into a test assertion, or stream them over a websocket.
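As a rough illustration of consuming the same event stream outside the TUI, the sketch below uses a stand-in generator instead of `runSipTest`. The `DemoEvent` shape and all function names are hypothetical; the real `SipEvent` type is defined in the Pinmoli source and may differ.

```typescript
// Hypothetical event shape; the real SipEvent type may differ.
interface DemoEvent {
  type: string;
  elapsedMs: number;
  detail?: string;
}

// Stand-in for runSipTest: yields a fixed sequence of events.
async function* fakeSipTest(): AsyncGenerator<DemoEvent> {
  yield { type: "sip_response", elapsedMs: 12, detail: "100 Trying" };
  yield { type: "sip_response", elapsedMs: 45, detail: "180 Ringing" };
  yield { type: "sip_response", elapsedMs: 1203, detail: "200 OK" };
}

// One JSON document per line is all NDJSON requires.
function toNdjsonLine(event: DemoEvent): string {
  return JSON.stringify(event) + "\n";
}

// Consumer 1: stream events to any sink (stdout, a file, a websocket send).
async function pipeToNdjson(
  events: AsyncIterable<DemoEvent>,
  write: (line: string) => void,
): Promise<void> {
  for await (const event of events) write(toNdjsonLine(event));
}

// Consumer 2: a test assertion helper that stops at the first match.
async function sawResponse(events: AsyncIterable<DemoEvent>, status: string): Promise<boolean> {
  for await (const event of events) {
    // Returning early also closes the generator, ending the stream.
    if (event.detail?.startsWith(status)) return true;
  }
  return false;
}
```

Both consumers accept any `AsyncIterable`, so the stand-in generator could be swapped for the real engine without changing the consumer code.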
General-purpose agents (like OpenClaw) give the LLM access to bash, file I/O, and the full system. That power makes sense for a personal assistant. For a SIP testing tool, it's a liability -- you don't want an LLM accidentally rm -rf-ing your project while trying to debug a codec mismatch.
Pinmoli's agent can only call 7 tools, all voice-testing related. The system prompt explicitly forbids filesystem access, and the tool registry enforces the allowlist at runtime. The LLM stays in its lane.
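A runtime allowlist of this kind can be sketched in a few lines. The tool names below are the seven listed in this README; the `ToolRegistry` class, the `Tool` interface, and the error messages are assumptions for illustration and may not match `src/tools/registry.ts`.

```typescript
// The seven tool names listed in this README.
const ALLOWED_TOOLS = [
  "sip_test",
  "webrtc_test",
  "generate_audio",
  "analyze_failure",
  "save_test",
  "load_test",
  "list_tests",
] as const;

type AllowedToolName = (typeof ALLOWED_TOOLS)[number];

// Hypothetical tool shape, for illustration only.
interface Tool {
  name: string;
  execute: (args: unknown) => Promise<unknown>;
}

function isAllowed(name: string): name is AllowedToolName {
  return (ALLOWED_TOOLS as readonly string[]).includes(name);
}

class ToolRegistry {
  private readonly tools = new Map<AllowedToolName, Tool>();

  // Refuse to register anything outside the allowlist.
  register(tool: Tool): void {
    const name = tool.name;
    if (!isAllowed(name)) throw new Error(`Tool "${name}" is not on the allowlist`);
    this.tools.set(name, tool);
  }

  // Check again at call time, so a name produced by the LLM cannot bypass the list.
  async call(name: string, args: unknown): Promise<unknown> {
    if (!isAllowed(name)) throw new Error(`Tool "${name}" is not on the allowlist`);
    const tool = this.tools.get(name);
    if (!tool) throw new Error(`Tool "${name}" is allowed but not registered`);
    return tool.execute(args);
  }
}
```

Checking at both registration and call time means the system prompt is not the only safeguard: even if the model asks for `bash`, the call is rejected before any code runs.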
- Docker
- An LLM provider credential (GCP service account key for Gemini, or an API key for Anthropic/OpenAI)
Pull the published image and run with any supported LLM provider:
docker pull ghcr.io/arakoodev/pinmoli:latest
Anthropic:
docker run --rm -it --network host \
  -v $(pwd)/captures:/app/captures \
  -e ANTHROPIC_API_KEY=sk-ant-... \
  ghcr.io/arakoodev/pinmoli
OpenAI:
docker run --rm -it --network host \
  -v $(pwd)/captures:/app/captures \
  -e OPENAI_API_KEY=sk-... \
  ghcr.io/arakoodev/pinmoli
Google Gemini (API key):
docker run --rm -it --network host \
  -v $(pwd)/captures:/app/captures \
  -e GEMINI_API_KEY=... \
  ghcr.io/arakoodev/pinmoli
Google Vertex AI (service account):
docker run --rm -it --network host \
  -v $(pwd)/captures:/app/captures \
  -v /path/to/key.json:/credentials.json:ro \
  ghcr.io/arakoodev/pinmoli --service-account /credentials.json
Groq:
docker run --rm -it --network host \
  -v $(pwd)/captures:/app/captures \
  -e GROQ_API_KEY=gsk_... \
  ghcr.io/arakoodev/pinmoli
The provider is auto-detected from whichever env var you set. Use --provider to override.
The image is published automatically on every push to main via GitHub Actions. Tagged releases (v*) produce versioned images (e.g., ghcr.io/arakoodev/pinmoli:0.2.0).
For contributors building from source:
git clone git@github.com:arakoodev/pinmoli.git
cd pinmoli
docker compose build
docker compose up -d
# Start the TUI
docker compose exec pinmoli npx tsx src/cli.ts
# With a GCP service account
docker compose exec pinmoli npx tsx src/cli.ts \
  --service-account /app/secrets/my-key.json
The source directory is bind-mounted, so code changes are reflected immediately.
You're in. Type a test request. Every example below has a corresponding integration test in test/integration/readme-prompts.test.ts.
SIP basics:
You: Send OPTIONS to sip:trunk.example.com
You: INVITE sip:+15551234567@sip.livekit.cloud with opus and PCMU
You: Register at sip:pbx.example.com with username admin password secret
Codec negotiation:
You: Test with PCMA codec -- I want to verify A-law support
You: Call the agent using G722 and wait 20 seconds for a response
You: Test sip:pbx.example.com offering only PCMA and PCMU, see which it picks
DTMF and IVR navigation:
You: Call sip:+15551234567@trunk.example.com and press 1-2-3-# after the greeting
You: Call sip:+18005551234@trunk.example.com, press 1 for sales, then 0 for operator
You: Connect via WebRTC to https://agent.example.com/whip and enter PIN 1234#
Speech generation:
You: Generate speech saying "What is the weather today?" then call the agent
You: Generate a 1000Hz sine wave for 5 seconds, then test the endpoint
You: Make the greeting say "Por favor espere" in Spanish, then test
Bidirectional conversations:
You: Call sip:agent@example.com, listen for 5 seconds first, then send my greeting
You: INVITE sip:agent@livekit.cloud, send the greeting, wait 30 seconds for a response
WebRTC:
You: Test the WHIP endpoint at https://my-agent.example.com/whip with bearer token abc123
Failure analysis:
You: Why did it fail?
You: What went wrong? (after a 488 codec mismatch)
Save, load, and batch:
You: Save this test as "production-health-check"
You: Show me all saved tests, then run one
You: Compare sip:trunk-us.example.com and sip:trunk-eu.example.com
You: Test these servers: sip:a.example.com, sip:b.example.com, sip:c.example.com
Troubleshooting:
You: Try again with 15 second timeout
Advanced combos:
You: Generate speech "Hello, I need billing support", call with PCMA, then press 2 for billing
You: Test sip:agent@broken-trunk.com, analyze the failure, fix it with TCP, save the config
If you just want to run SIP tests programmatically without the conversational TUI:
docker compose exec pinmoli npx tsx -e "
import { runSipTest } from './src/sip/engine.js';
for await (const event of runSipTest({
  uri: 'sip:trunk.example.com',
  method: 'OPTIONS',
  codecs: ['PCMU']
})) { console.log(JSON.stringify(event)); }
"
Pinmoli auto-detects your LLM provider from environment variables. Set one and go:
| Provider | --provider | Env var | Default model |
|---|---|---|---|
| Anthropic | anthropic | ANTHROPIC_API_KEY | claude-sonnet-4-5 |
| OpenAI | openai | OPENAI_API_KEY | gpt-4o |
| Google Gemini | google | GEMINI_API_KEY | gemini-2.5-flash |
| Google Vertex AI | google-vertex | --service-account <path> | gemini-2.5-flash |
| Groq | groq | GROQ_API_KEY | llama-3.3-70b-versatile |
| OpenRouter | openrouter | OPENROUTER_API_KEY | anthropic/claude-sonnet-4.5 |
# Just set the env var — provider is auto-detected
ANTHROPIC_API_KEY=sk-ant-... pinmoli
# Or be explicit
pinmoli --provider openai --model gpt-4o
# Override the default model
pinmoli --provider anthropic --model claude-haiku-4-5
# Vertex AI (service account)
pinmoli --service-account /path/to/key.json
# Switch provider at runtime via slash command
/model anthropic claude-sonnet-4-5
/model google gemini-2.5-pro
/model # show current provider/model
pinmoli [options]
  --provider <name>          LLM provider (anthropic, openai, google, google-vertex, groq, openrouter)
  --model <id>               Model ID (default depends on provider)
  --service-account <path>   GCP service account JSON (implies google-vertex)
  --help                     Show usage
| Variable | Provider |
|---|---|
| ANTHROPIC_API_KEY | Anthropic |
| OPENAI_API_KEY | OpenAI |
| GEMINI_API_KEY | Google Gemini |
| GROQ_API_KEY | Groq |
| OPENROUTER_API_KEY | OpenRouter |
| GOOGLE_APPLICATION_CREDENTIALS | Google Vertex AI (with GOOGLE_CLOUD_PROJECT, GOOGLE_CLOUD_LOCATION) |
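The mapping from environment variables to providers can be sketched as a table lookup. The variable names, provider IDs, and default models are taken from the tables in this README. The function name, the precedence when several variables are set (table order here), and the error text are assumptions; Pinmoli's actual resolution logic may differ.

```typescript
type Env = Record<string, string | undefined>;

interface ProviderChoice {
  provider: string;
  model: string;
}

// Rows copied from the README tables; the ordering is an assumed precedence.
const PROVIDER_TABLE: ReadonlyArray<{ envVar: string; provider: string; defaultModel: string }> = [
  { envVar: "ANTHROPIC_API_KEY", provider: "anthropic", defaultModel: "claude-sonnet-4-5" },
  { envVar: "OPENAI_API_KEY", provider: "openai", defaultModel: "gpt-4o" },
  { envVar: "GEMINI_API_KEY", provider: "google", defaultModel: "gemini-2.5-flash" },
  { envVar: "GOOGLE_APPLICATION_CREDENTIALS", provider: "google-vertex", defaultModel: "gemini-2.5-flash" },
  { envVar: "GROQ_API_KEY", provider: "groq", defaultModel: "llama-3.3-70b-versatile" },
  { envVar: "OPENROUTER_API_KEY", provider: "openrouter", defaultModel: "anthropic/claude-sonnet-4.5" },
];

// An explicit provider (as with --provider) wins; otherwise the first set variable decides.
function detectProvider(env: Env, override: { provider?: string; model?: string } = {}): ProviderChoice {
  const wanted = override.provider;
  const entry = wanted
    ? PROVIDER_TABLE.find((p) => p.provider === wanted)
    : PROVIDER_TABLE.find((p) => Boolean(env[p.envVar]));
  if (!entry) {
    throw new Error(wanted ? `Unknown provider "${wanted}"` : "No LLM provider credentials found");
  }
  // An explicit model (as with --model) replaces the provider's default.
  return { provider: entry.provider, model: override.model ?? entry.defaultModel };
}
```

Called as `detectProvider(process.env)`, the sketch returns the provider for whichever variable is set. Passing the environment in as a parameter keeps the function easy to test without mutating global state.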
The default docker-compose.yml uses network_mode: host so SIP and RTP traffic reaches the network directly. Modify if your setup requires bridged networking with explicit port mapping.
Pinmoli exposes 7 tools to the AI agent. You don't call these directly -- you describe what you want and the agent picks the right tool. See SKILLS.md for full parameter reference.
| Tool | Purpose |
|---|---|
| sip_test | Run OPTIONS, INVITE, or REGISTER against a SIP endpoint. Supports DTMF send/receive via dtmfDigits. |
| webrtc_test | Connect to a WHIP endpoint, negotiate ICE/DTLS/SRTP, send audio, capture agent response. Supports DTMF. |
| generate_audio | Create custom audio samples (sine, DTMF dual-tone, silence, TTS speech) |
| analyze_failure | Diagnose a failed test and suggest fixes |
| save_test | Save a test configuration by name |
| load_test | Reload and run a saved test |
| list_tests | List all saved test configurations |
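For readers curious what the DTMF support in `sip_test` and `webrtc_test` puts on the wire, the following sketch encodes and decodes the 4-byte telephone-event payload defined in RFC 4733 (event code, end bit, volume, duration). It is an independent illustration with hypothetical names, not the code in `dtmf.ts`.

```typescript
// RFC 4733 event codes for the 16 DTMF digits.
const DTMF_EVENT_CODES: Record<string, number> = {
  "0": 0, "1": 1, "2": 2, "3": 3, "4": 4, "5": 5, "6": 6, "7": 7, "8": 8, "9": 9,
  "*": 10, "#": 11, A: 12, B: 13, C: 14, D: 15,
};

interface DtmfEvent {
  digit: string;
  end: boolean; // E bit: set on the final packet(s) of the event
  volume: number; // 0..63, power level in -dBm0
  duration: number; // in RTP timestamp units (8000 per second for telephone-event/8000)
}

// Layout: event (8 bits) | E (1) R (1) volume (6) | duration (16, big-endian)
function encodeDtmf(ev: DtmfEvent): Uint8Array {
  const code = DTMF_EVENT_CODES[ev.digit.toUpperCase()];
  if (code === undefined) throw new RangeError(`Not a DTMF digit: ${ev.digit}`);
  if (ev.volume < 0 || ev.volume > 63) throw new RangeError("volume must be 0..63");
  if (ev.duration < 0 || ev.duration > 0xffff) throw new RangeError("duration must fit in 16 bits");
  const payload = new Uint8Array(4);
  const view = new DataView(payload.buffer);
  view.setUint8(0, code);
  view.setUint8(1, (ev.end ? 0x80 : 0) | ev.volume); // R bit stays zero
  view.setUint16(2, ev.duration); // DataView writes big-endian by default
  return payload;
}

function decodeDtmf(payload: Uint8Array): DtmfEvent {
  if (payload.byteLength < 4) throw new RangeError("telephone-event payload must be at least 4 bytes");
  const view = new DataView(payload.buffer, payload.byteOffset, payload.byteLength);
  const code = view.getUint8(0);
  const digit = Object.keys(DTMF_EVENT_CODES).find((d) => DTMF_EVENT_CODES[d] === code);
  if (digit === undefined) throw new RangeError(`Unsupported event code: ${code}`);
  const flags = view.getUint8(1);
  return { digit, end: (flags & 0x80) !== 0, volume: flags & 0x3f, duration: view.getUint16(2) };
}
```

For example, the final packet of a 200 ms press of "5" at volume 10 has a duration of 1600 timestamp units and encodes to the bytes `05 8a 06 40`.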
| Sample | Description | Duration |
|---|---|---|
| voice-hello | "Hello, this is a test call from Pinmoli" | ~3s |
| sine-440hz | 440 Hz sine wave | 3s |
| sine-1000hz | 1000 Hz sine wave | 3s |
| dtmf-123 | DTMF tones 1-2-3 | 1.5s |
| silence | Silence | 3s |
All samples are PCMU @ 8 kHz mono (G.711 u-law), the baseline codec that nearly every SIP endpoint supports.
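The sample durations translate directly into RTP packet counts. The arithmetic below assumes a 20 ms packetization interval, which this README does not state explicitly but which matches the 150 packets reported for the 3.0 s voice-hello sample in the session output at the top. The function names are illustrative.

```typescript
// Assumed packetization interval; 20 ms is a common default for G.711.
const DEFAULT_PTIME_MS = 20;

// Number of RTP packets needed to carry a clip of the given length.
function rtpPacketCount(durationSeconds: number, ptimeMs: number = DEFAULT_PTIME_MS): number {
  return Math.ceil((durationSeconds * 1000) / ptimeMs);
}

// Audio samples carried by each packet at a given clock rate.
function samplesPerPacket(clockRateHz: number, ptimeMs: number = DEFAULT_PTIME_MS): number {
  return (clockRateHz * ptimeMs) / 1000;
}

// G.711 (PCMU/PCMA) uses one byte per sample, so payload bytes equal samples.
function g711PayloadBytes(ptimeMs: number = DEFAULT_PTIME_MS): number {
  return samplesPerPacket(8000, ptimeMs);
}
```

Under that assumption a 3 s sample is 150 packets of 160 payload bytes each, and the 1.5 s dtmf-123 sample is 75 packets.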
Ask the agent to generate custom speech:
You: Generate speech saying "Please transfer me to billing"
You: Now call sip:+15551234567@trunk.example.com with that audio
Or generate tones:
You: Generate a 1000Hz sine wave for 5 seconds, then test the endpoint
Every Pinmoli session automatically captures all SIP signaling and RTP media traffic to a pcap file. Open it in Wireshark for protocol-level debugging.
The entrypoint.sh runs tcpdump in the background for the entire session:
- Captures port 5060 (SIP) and UDP ports 10000-65535 (RTP/SRTP)
- Saves to /app/captures/pinmoli-YYYYMMDD-HHMMSS.pcap inside the container
- Stops automatically when the session ends (EXIT trap)
Docker Compose (development): Captures appear at ./captures/ automatically — the source directory is bind-mounted.
Docker Run (GHCR image): Mount a volume so captures persist after the container exits:
# Create the captures directory (first time only)
mkdir -p captures
# Mount it when running Pinmoli
docker run --rm -it --network host \
  -v $(pwd)/captures:/app/captures \
  -e ANTHROPIC_API_KEY=sk-ant-... \
  ghcr.io/arakoodev/pinmoli
After the session, your captures are in ./captures/:
ls captures/
# pinmoli-20260305-143022.pcap
# Open in Wireshark
wireshark captures/pinmoli-20260305-143022.pcap
Previous captures from earlier runs are preserved; each new session creates a new pcap file with a unique timestamp.
If you don't need packet capture (e.g., CI/CD), set PINMOLI_NO_CAPTURE=1:
docker run --rm -it --network host \
  -e PINMOLI_NO_CAPTURE=1 \
  -e ANTHROPIC_API_KEY=sk-ant-... \
  ghcr.io/arakoodev/pinmoli
pinmoli/
├── src/
│ ├── cli.ts # Entry point, REPL loop
│ ├── agent/runtime.ts # PinmoliAgent wraps pi-agent-core
│ ├── ui/
│ │ ├── tui.ts # PinmoliTUI wraps pi-tui
│ │ ├── tool-output.ts # Collapsible tool result rendering
│ │ └── test-terminal.ts # Test-mode Terminal implementation
│ ├── tools/
│ │ ├── registry.ts # 7-tool allowlist enforcement
│ │ ├── index.ts # Tool registration
│ │ ├── sip-test.ts # SIP test execution (async generator)
│ │ ├── webrtc-test.ts # WebRTC test execution (WHIP + werift)
│ │ ├── generate-audio.ts # Audio generation (ffmpeg, espeak)
│ │ ├── analyze-failure.ts # Diagnostic pattern matching
│ │ └── save/load/list-tests.ts
│ ├── sip/
│ │ ├── engine.ts # SIP test orchestration (async generator)
│ │ ├── protocol.ts # SIP message building
│ │ ├── sdp.ts # SDP offer/answer builder
│ │ ├── rtp-receiver.ts # RTP/DTMF packet send/receive
│ │ ├── codec.ts # Codec table, transcoding (PCMU↔PCMA), lookup
│ │ ├── dtmf.ts # RFC 4733 encode/decode, DtmfDetector
│ │ └── audio.ts # Audio sample resolution
│ ├── webrtc/
│ │ ├── engine.ts # WebRTC test orchestration (async generator)
│ │ ├── whip.ts # WHIP signaling client (RFC 9725)
│ │ └── audio-frames.ts # PCM16 frame chunking + WAV save
│ ├── storage/db.ts # SQLite + FTS5 persistence
│ ├── validation/schemas.ts # TypeBox schemas
│ └── commands/service-account.ts
├── audio-samples/ # Pre-generated PCMU WAV files
├── test/
│ ├── unit/ # Protocol, SDP, RTP, DTMF, storage, tools, lint, WebRTC
│ ├── integration/ # TUI flows, e2e, bidirectional RTP, speech
│ └── live/ # Tests against real SIP and WebRTC endpoints
├── eslint-plugin-pinmoli.cjs # 14 lint rules from real bugs
├── Dockerfile # Alpine + Node 20 + ffmpeg + espeak + tcpdump + tini
├── docker-compose.yml
└── entrypoint.sh
All tests run inside Docker.
# Start the container
docker compose up -d
# Run all tests
docker compose exec pinmoli npx vitest run
# Unit tests only (~1s)
docker compose exec pinmoli npx vitest run test/unit/
# Integration tests
docker compose exec pinmoli npx vitest run test/integration/
# Live tests (hits real SIP endpoints, requires network)
docker compose exec pinmoli npx vitest run test/live/
# Type-check
docker compose exec pinmoli npx tsc --noEmit
# Lint
docker compose exec pinmoli npm run lint
Only one process can bind the SIP port. Kill the conflicting process inside the container:
docker compose exec pinmoli sh -c 'kill $(lsof -ti:5060)'
- NAT/firewall -- the host must be reachable on the RTP port advertised in SDP. Private IPs (WSL2 172.x, Docker 172.x) are not routable from the internet.
- No agent running -- the remote SIP endpoint accepted the call but has no worker to generate audio.
- Run from a host with a public IP or use Docker with network_mode: host.
This is usually a synthetic 503 generated by the sip npm library when the remote drops the TCP connection (e.g., LiveKit agent timeout). It's not a real SIP 503. Common causes:
- AI agent worker not running on the remote side
- Malformed SDP or unroutable IPs in headers
- Missing ACK after 200 OK
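Pattern-matched diagnostics of the kind described in this README can be sketched as a list of rules evaluated against the event history. The rules below only restate causes mentioned in this document (488 codec mismatch, 503 after a dropped connection, answered calls with no RTP) plus the standard 401/407 authentication challenges. The event shape, rule format, and wording are assumptions and are not taken from `analyze-failure.ts`.

```typescript
// Hypothetical, simplified event shape for illustration.
interface CallEvent {
  kind: "sip_response" | "rtp_stats";
  status?: number; // SIP status code, for sip_response events
  packetsReceived?: number; // for rtp_stats events
}

interface Diagnosis {
  cause: string;
  nextStep: string;
}

interface Rule {
  matches: (events: readonly CallEvent[]) => boolean;
  diagnosis: Diagnosis;
}

const hasStatus = (events: readonly CallEvent[], ...codes: number[]): boolean =>
  events.some((e) => e.kind === "sip_response" && e.status !== undefined && codes.includes(e.status));

const gotAudio = (events: readonly CallEvent[]): boolean =>
  events.some((e) => e.kind === "rtp_stats" && (e.packetsReceived ?? 0) > 0);

const RULES: readonly Rule[] = [
  {
    matches: (ev) => hasStatus(ev, 488),
    diagnosis: { cause: "Codec mismatch (488 Not Acceptable Here)", nextStep: "Retry offering PCMU or PCMA" },
  },
  {
    matches: (ev) => hasStatus(ev, 401, 407),
    diagnosis: { cause: "Authentication required", nextStep: "Supply a username and password" },
  },
  {
    matches: (ev) => hasStatus(ev, 503),
    diagnosis: {
      cause: "503, possibly synthetic after the remote dropped the connection",
      nextStep: "Check that the remote agent worker is running and that SDP carries routable IPs",
    },
  },
  {
    // Simplification: any 200 is treated as the INVITE's final response.
    matches: (ev) => hasStatus(ev, 200) && !gotAudio(ev),
    diagnosis: {
      cause: "Call answered but no RTP received",
      nextStep: "Check NAT/firewall reachability of the advertised RTP port",
    },
  },
];

// Return every diagnosis whose rule matches the recorded events.
function analyzeFailure(events: readonly CallEvent[]): Diagnosis[] {
  return RULES.filter((rule) => rule.matches(events)).map((rule) => rule.diagnosis);
}
```

Keeping rules as data makes it straightforward to add a new pattern each time a new failure mode is observed, without touching the matching loop.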
Check that your credentials are configured:
# If using a volume-mounted service account, verify it's accessible inside the container
docker compose exec pinmoli ls -la /app/secrets/my-key.json
# Or pass credentials via environment variable
docker run --rm -it --network host \
  -e ANTHROPIC_API_KEY=sk-ant-... \
  ghcr.io/arakoodev/pinmoli
# Fork and clone
git clone https://github.com/your-fork/pinmoli.git
cd pinmoli
# Build the container
docker compose build
# Run tests (must pass before submitting a PR)
docker compose up -d
docker compose exec pinmoli npx vitest run
docker compose exec pinmoli npx tsc --noEmit
docker compose exec pinmoli npm run lint
All commands run inside Docker -- the container includes ffmpeg, espeak, and other dependencies that may not be installed locally.
MIT