Generate realistic AI-powered customer service phone conversations for testing Voice Intelligence, analytics pipelines, and contact center workflows.
This is a GitHub Template - Click the "Use this template" button above to create your own repository with this code!
A production-grade system for generating realistic synthetic call data using Twilio Programmable Voice and Segment CDP. Features random customer-agent pairing for realistic scenarios (including challenging interactions), AI-powered conversations with OpenAI, Voice Intelligence transcription, and ML-based customer profiling with churn risk and propensity scores.
Architecture: Built with production-grade patterns including comprehensive test coverage for core TwiML functions, retry logic with exponential backoff, circuit breakers, and webhook signature validation.
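As an illustration of the retry pattern named above, here is a minimal sketch of retry with exponential backoff. The names `backoffDelayMs`, `retryWithBackoff`, and the option names are hypothetical and are not taken from this repository; the actual utilities may be structured differently.

```javascript
// Hypothetical helpers for illustration; not the repository's actual utilities.

// Delay before the next retry: base * 2^attempt, capped at maxDelayMs.
function backoffDelayMs(attempt, baseDelayMs = 100, maxDelayMs = 5000) {
  return Math.min(baseDelayMs * 2 ** attempt, maxDelayMs);
}

function defaultSleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Runs `operation` until it resolves or `maxAttempts` is exhausted,
// then rethrows the last error.
async function retryWithBackoff(operation, options = {}) {
  const {
    maxAttempts = 3,
    baseDelayMs = 100,
    maxDelayMs = 5000,
    sleep = defaultSleep,
  } = options;
  let lastError;
  for (let attempt = 0; attempt < maxAttempts; attempt += 1) {
    try {
      return await operation(attempt);
    } catch (error) {
      lastError = error;
      const isLastAttempt = attempt === maxAttempts - 1;
      if (!isLastAttempt) {
        await sleep(backoffDelayMs(attempt, baseDelayMs, maxDelayMs));
      }
    }
  }
  throw lastError;
}
```

Passing `sleep` as an option keeps the helper testable without real timers. A circuit breaker would typically wrap a helper like this and stop calling the operation after repeated failures.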
Generates realistic synthetic call data for testing, development, and analytics:
- Random Pairing - Creates realistic scenarios including challenging interactions (frustrated customers with inexperienced agents)
- AI Conversations - OpenAI-powered realistic agent-customer conversations with Voice Intelligence transcription
- Customer Profiling - Creates and updates Segment CDP profiles with ML scores
- ML Analytics - Calculates churn risk, propensity to buy, and satisfaction scores
- Complete Pipeline - End-to-end automation from pairing to analytics
- One-Command Deployment - Pre-deployment checks + deploy + post-deployment validation
- Production Testing - Smoke tests against real Twilio and Segment APIs
- Comprehensive Test Coverage - 634 tests across unit, integration, and E2E
- Realistic Pairing - Random customer-agent matching creates diverse, realistic scenarios
- Segment CDP Integration - Automatic profile creation and ML score updates
- Twilio Serverless - Conference webhooks and AI conversation orchestration
- Backend: Node.js 18+, Twilio Serverless Functions
- AI: OpenAI gpt-4o-mini, Twilio Voice Intelligence
- Data: Segment CDP, Twilio Sync
- Testing: Jest (634 tests), Newman (Postman)
- CI/CD: GitHub Actions
- Code Quality: ESLint, Prettier
Per 100 synthetic calls (assuming 2-minute average conversation, 5-minute maximum):
| Service | Usage | Cost |
|---|---|---|
| Twilio Voice | 200 minutes @ $0.013/min | ~$2.60 |
| Twilio Voice Intelligence | 200 minutes @ $0.02/min | ~$4.00 |
| OpenAI gpt-4o-mini | ~1M tokens @ $0.15/1M input, $0.60/1M output | ~$0.30 |
| Twilio Sync | Included in usage | Free tier |
| Segment CDP | Up to 10K MTUs/month | Free tier |
| Total | | ~$6.90 per 100 calls |
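The arithmetic behind the table can be sketched as a small estimator. The rates are copied from the table and should be treated as estimates, not current pricing; the function name `estimateCostUsd` and the assumption that OpenAI cost scales only with call count are illustrative.

```javascript
// Rates copied from the cost table; treat them as estimates, not current pricing.
const RATES = {
  voicePerMinute: 0.013,
  voiceIntelligencePerMinute: 0.02,
  openAiPer100Calls: 0.3, // assumption: scales with call count only
};

// Estimated total in USD, rounded to cents (hypothetical helper).
function estimateCostUsd(calls, avgMinutesPerCall = 2) {
  const minutes = calls * avgMinutesPerCall;
  const voice = minutes * RATES.voicePerMinute;
  const intelligence = minutes * RATES.voiceIntelligencePerMinute;
  const openAi = (calls / 100) * RATES.openAiPer100Calls;
  return Math.round((voice + intelligence + openAi) * 100) / 100;
}

console.log(estimateCostUsd(100)); // 6.9
```

Changing `avgMinutesPerCall` shows how strongly the per-minute Twilio charges dominate the total.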
Budget Planning:
- MAX_DAILY_CALLS=1000 (default) = ~$69.00/day maximum
- MAX_DAILY_CALLS=100 = ~$6.90/day for testing
- Adjust MAX_DAILY_CALLS in .env to control spending
Cost Controls Built-in:
- Auto-termination at 5 minutes - Prevents runaway conversation costs
- Rate limiting - MAX_DAILY_CALLS prevents accidental overspending
- Efficient model - gpt-4o-mini is optimized for cost and performance
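To make the rate-limiting control concrete, here is a minimal sketch of a daily call cap. It is an in-memory illustration only: `createDailyCallLimiter` is a hypothetical name, and the repository may track counts differently (for example in Twilio Sync). The default of 1000 follows the documented default for MAX_DAILY_CALLS.

```javascript
// Hypothetical in-memory guard; not the repository's implementation.
function createDailyCallLimiter(
  maxDailyCalls = Number(process.env.MAX_DAILY_CALLS) || 1000,
  now = () => new Date()
) {
  let currentDay = null;
  let count = 0;

  // Returns true if another call is allowed today, false once the cap is hit.
  return function tryAcquire() {
    const today = now().toISOString().slice(0, 10); // UTC date, YYYY-MM-DD
    if (today !== currentDay) {
      currentDay = today; // new day: reset the counter
      count = 0;
    }
    if (count >= maxDailyCalls) {
      return false;
    }
    count += 1;
    return true;
  };
}
```

The counter resets at UTC midnight and is lost on restart, so a real deployment would likely need shared, persistent state.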
Cost-Saving Tips:
- Use shorter conversations for testing (conferences auto-terminate at 5 minutes)
- Start with MAX_DAILY_CALLS=10 during development
- Monitor OpenAI usage at platform.openai.com/usage
- Use Twilio's free trial credits for initial testing
- Using This Template
- Quick Start
- Development Tools
- Detailed Setup
- Development Workflow
- Testing Strategy
- CI/CD Pipeline
- Project Structure
- Example Use Cases
- Contributing
- Quick Start Guide - 5-minute setup for trying the system
- Deployment Guide - Production deployment with advanced configuration
- Architecture - System architecture and data flow diagrams
- API Documentation - Complete API reference
- Error Handling Guide - Error handling patterns and debugging
- Segment Setup - Configure Segment CDP integration
- Event Streams Setup - Configure Twilio Event Streams
- Sync Setup - Configure Twilio Sync for state management
This repository is a GitHub Template. Create your own synthetic call data generator in 3 steps:
- Click "Use this template" at the top of this page
- Name your repository (e.g., my-call-data-generator)
- Choose visibility (public or private)
- Click "Create repository from template"
# Clone YOUR new repository (not this template!)
git clone https://github.com/YOUR-USERNAME/YOUR-REPO-NAME.git
cd YOUR-REPO-NAME
# Create TwiML App
twilio api:core:applications:create --friendly-name "Synthetic Call Generator"
# Copy the SID for .env
# Install dependencies
npm install
# Configure environment
cp .env.example .env
# Edit .env with credentials (see Quick Start section for details)
# Update phone numbers in assets/customers.json with YOUR Twilio numbers
# Deploy to Twilio
npm run deploy
# Generate your first test call
curl -X POST "https://YOUR-DOMAIN.twil.io/create-conference" \
-u "$TWILIO_ACCOUNT_SID:$TWILIO_AUTH_TOKEN" \
-d "agentPhoneNumber=YOUR_PHONE_NUMBER"
That's it! Check Twilio Console → Voice Intelligence → Transcripts to see your synthetic conversation.
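For scripting the same request from Node.js 18+ instead of curl, a sketch like the following may help. The `/create-conference` path and the `agentPhoneNumber` field come from the curl command; the helper name `buildCreateConferenceRequest` is hypothetical, and the response format is not documented here, so the sketch stops at building the request.

```javascript
// Hypothetical helper: builds the URL and fetch options for the request
// that the curl command sends (HTTP Basic auth, form-encoded body).
function buildCreateConferenceRequest({ domain, accountSid, authToken, agentPhoneNumber }) {
  const credentials = Buffer.from(`${accountSid}:${authToken}`).toString('base64');
  return {
    url: `https://${domain}/create-conference`,
    options: {
      method: 'POST',
      headers: {
        Authorization: `Basic ${credentials}`,
        'Content-Type': 'application/x-www-form-urlencoded',
      },
      // URLSearchParams percent-encodes the leading "+" of the phone number.
      body: new URLSearchParams({ agentPhoneNumber }).toString(),
    },
  };
}

// Usage (Node.js 18+ ships a global fetch):
// const { url, options } = buildCreateConferenceRequest({
//   domain: 'YOUR-DOMAIN.twil.io',
//   accountSid: process.env.TWILIO_ACCOUNT_SID,
//   authToken: process.env.TWILIO_AUTH_TOKEN,
//   agentPhoneNumber: process.env.AGENT_PHONE_NUMBER,
// });
// const response = await fetch(url, options);
```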
Update Personas: Edit assets/customers.json and assets/agents.json to match your business:
- Customer pain points and technical proficiency
- Agent characteristics and competence levels
- Introduction scripts and conversation patterns
Adjust Call Behavior:
- Duration: Modify timeLimit in functions/utils/add-participant.js (default: 5 minutes)
- AI Model: Change OpenAI model in functions/respond.js (default: gpt-4o-mini)
- Speech Recognition: Adjust speechModel in functions/transcribe.js (default: experimental_conversations)
Add Integrations:
- Segment CDP: See docs/segment-setup-guide.md
- Kinesis Streaming: See docs/event-streams-setup.md
Before installing, create a TwiML Application using the Twilio CLI:
# Create TwiML App (requires Twilio CLI)
twilio api:core:applications:create --friendly-name "Synthetic Call Generator"
# Copy the SID (starts with AP...) for the next step
# Install dependencies
npm install
# Create .env file
cp .env.example .env
Edit .env with your credentials:
# Get from https://console.twilio.com
TWILIO_ACCOUNT_SID=ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
TWILIO_AUTH_TOKEN=your_auth_token_here
OPENAI_API_KEY=sk-...
# From step 1 above
TWIML_APP_SID=APxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
# Required: Skip webhook validation for TwiML App calls
SKIP_WEBHOOK_VALIDATION=true
# Your Twilio phone numbers (find at console.twilio.com → Phone Numbers)
AGENT_PHONE_NUMBER=+1234567890
CUSTOMER_PHONE_NUMBER=+1234567890
# Sync Service SID (see docs/sync-setup-guide.md)
SYNC_SERVICE_SID=ISxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
# Optional: Segment CDP integration
SEGMENT_WRITE_KEY=your_segment_write_key_here
The persona files contain example phone numbers. Replace them with YOUR Twilio numbers:
# Update persona phone numbers with YOUR Twilio numbers from the prerequisites
# Edit assets/customers.json - replace the PhoneNumber fields
# Agents automatically use AGENT_PHONE_NUMBER from .env
# Run pre-deployment checks (validates env, tests, APIs)
npm run pre-deploy
Expected: ALL CHECKS PASSED (7/7) (some tests may skip if optional services not configured)
# Deploy with automatic validation
npm run deploy
This runs:
- Pre-deployment checks
- Twilio serverless deployment
- Post-deployment validation
# Create your first synthetic conference
node src/main.js
What happens:
- Pairs a customer with an agent (random for realistic scenarios)
- Creates Segment CDP profiles
- Generates Twilio conference with AI conversation
- Updates profiles with ML scores (churn risk, propensity, satisfaction)
Pairing Strategies (configurable):
- random (default) - Random pairing for diverse scenarios (frustrated customer + inexperienced agent = sparks fly!)
- frustrated - Match difficult customers with experienced agents
- patient - Patient customers with any agent
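The three strategies can be sketched roughly as follows. The strategy names come from the list above, but the persona fields `mood` and `experience` and the function name `pairCustomerWithAgent` are assumptions made for illustration; the real persona files and the code under src/pairing/ may use different fields and logic.

```javascript
// Hypothetical sketch; `mood` and `experience` are assumed persona fields.
function pairCustomerWithAgent(customers, agents, strategy = 'random', rng = Math.random) {
  const pick = (items) => items[Math.floor(rng() * items.length)];

  let customerPool = customers;
  let agentPool = agents;

  if (strategy === 'frustrated') {
    // Difficult customers are matched with experienced agents.
    customerPool = customers.filter((c) => c.mood === 'frustrated');
    agentPool = agents.filter((a) => a.experience === 'experienced');
  } else if (strategy === 'patient') {
    // Patient customers may be matched with any agent.
    customerPool = customers.filter((c) => c.mood === 'patient');
  } else if (strategy !== 'random') {
    throw new Error(`Unknown pairing strategy: ${strategy}`);
  }

  if (customerPool.length === 0 || agentPool.length === 0) {
    throw new Error(`No candidates available for strategy: ${strategy}`);
  }
  return { customer: pick(customerPool), agent: pick(agentPool) };
}
```

Injecting `rng` makes the pairing deterministic in tests while defaulting to `Math.random` in normal use.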
For detailed instructions, see docs/quick-start.md
Note: If calls fail with 'busy' status, ensure you've updated assets/customers.json with YOUR Twilio phone numbers (step 3).
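A quick format check can catch placeholder or malformed numbers before a deploy. The sketch below assumes assets/customers.json parses to an array of objects with a `PhoneNumber` field; the array shape and the helper name `findInvalidPhoneNumbers` are assumptions. It only checks E.164 formatting and cannot confirm that a number belongs to your Twilio account.

```javascript
// E.164: a "+" followed by up to 15 digits, the first of which is not 0.
const E164_PATTERN = /^\+[1-9]\d{1,14}$/;

// Returns the position and value of every entry whose PhoneNumber
// is missing or not in E.164 format (hypothetical helper).
function findInvalidPhoneNumbers(customers) {
  return customers
    .map((customer, index) => ({ index, value: customer.PhoneNumber }))
    .filter(({ value }) => typeof value !== 'string' || !E164_PATTERN.test(value));
}

// Usage, assuming the file is a JSON array:
// const customers = require('./assets/customers.json');
// console.log(findInvalidPhoneNumbers(customers));
```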
# Pre-deployment validation (env, tests, credentials, data files)
npm run pre-deploy
# Safe deployment with all checks
npm run deploy:safe
# Post-deployment validation
npm run post-deploy
# Smoke test (validates real APIs without deploying)
npm run smoke-test
# Run all tests (634 tests, 26 suites)
npm test
# Watch mode
npm run test:watch
# Coverage report
npm run test:coverage
# E2E tests only
npm run test:e2e
# Start local Twilio serverless development server
npm run dev
# Validate customer and agent data
node scripts/validate-customers.js
node scripts/validate-agents.js
Get your tokens:
- GitHub token: github.com/settings/tokens
- Twilio credentials: console.twilio.com
- OpenAI API key: platform.openai.com/api-keys
- Anthropic API key: console.anthropic.com
# Start Twilio Functions locally
npm run dev
# In another terminal, run tests in watch mode
npm run test:watch
# Create GitHub issues from your todos
npm run create-issue from-todos
If you prefer manual setup or encounter issues:
Required:
- Node.js β₯18.0.0 (nodejs.org)
- npm β₯8.0.0 (comes with Node.js)
- Git (git-scm.com)
- Twilio Account with:
- Account SID and Auth Token (console.twilio.com)
- A Sync Service SID (setup guide)
- Two or more phone numbers with voice capability (search and buy)
- TwiML App (created during setup below)
- OpenAI API Key (platform.openai.com/api-keys)
- Twilio CLI installed (docs)
Optional:
- Segment Write Key for CDP integration (setup guide)
- Voice Intelligence SID for advanced transcription (Twilio Console)
- Python 3.8+ with uv package manager (for Python development)
- Install Node.js dependencies:
npm install
- Install Python dependencies (if using Python):
uv sync --group test --group dev
- Install global tools:
npm install -g twilio-cli newman
twilio plugins:install @twilio-labs/plugin-serverless
- Authenticate with Twilio:
twilio login
- Set up environment variables:
cp .env.example .env
# Edit .env with your credentials
# Development
npm run dev # Start local Twilio Functions server
npm run build # Run linting, tests, and formatting checks
# Testing
npm test # Run all Jest tests
npm run test:watch # Run tests in watch mode
npm run test:coverage # Generate coverage report
npm run test:api # Run Newman API tests
uv run pytest # Run Python tests (if applicable)
# Code Quality
npm run lint # Check code quality
npm run lint:fix # Fix linting issues automatically
npm run format # Format code with Prettier
npm run format:check # Check if code is formatted
# Deployment
npm run twilio:deploy # Deploy to Twilio production
npm run twilio:deploy:dev # Deploy to development environment
We practice strict TDD with comprehensive coverage:
- Write failing test (Red)
- Write minimal code to pass (Green)
- Refactor while keeping tests green
- Unit Tests: Individual function testing (Jest)
- Integration Tests: Component interactions and regression prevention
- E2E Tests: Full pipeline validation with real Twilio APIs
- API Tests: End-to-end validation (Newman)
- Coverage Target: >80% for all test types
Critical regression tests protect against production issues:
Fast static code analysis (~200ms) that validates:
- Using max_completion_tokens (not deprecated max_tokens)
- No unsupported temperature parameter for gpt-5-nano
- Prevents 400 BadRequest errors from OpenAI
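A static check of this kind can be approximated by scanning source text for the disallowed parameter names. This is a simplified sketch, not the repository's test: `findDisallowedOpenAiParams` is a hypothetical name, and a plain regex scan can produce false positives (for example on comments).

```javascript
// Hypothetical static check: reports disallowed OpenAI request parameters
// found in a source string.
function findDisallowedOpenAiParams(sourceText) {
  const problems = [];
  // "max_completion_tokens" does not contain "max_tokens", so it is not flagged.
  if (/\bmax_tokens\b/.test(sourceText)) {
    problems.push('max_tokens');
  }
  // Flags object-literal usage such as `temperature: 0.7`.
  if (/\btemperature\s*:/.test(sourceText)) {
    problems.push('temperature');
  }
  return problems;
}

// Usage, assuming the OpenAI call lives in functions/respond.js:
// const fs = require('fs');
// const source = fs.readFileSync('functions/respond.js', 'utf8');
// console.log(findDisallowedOpenAiParams(source));
```

Because it only reads files, a check like this runs in milliseconds and needs no API credentials.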
npm test tests/integration/openai-api-parameters.test.js
Full E2E test (~6-7 minutes) that validates:
- Transcripts contain real AI conversations (not error messages)
- Multi-speaker dialogue (agent + customer)
- Contextual customer responses (not generic errors)
- Agent introductions are captured correctly
npm test tests/e2e/transcript-content-validation.test.js
# All tests
npm test
# Watch mode for development
npm run test:watch
# Coverage report
npm run test:coverage
# Fast regression tests only (recommended for CI)
npm test tests/integration/
# API tests (Newman/Postman)
npm run test:api
newman run postman/collection.json -e postman/environment.json
The CI/CD pipeline (.github/workflows/test.yml) automatically:
- Test Node.js - Runs Jest tests with coverage
- Code Quality - ESLint and Prettier validation
- API Testing - Validates endpoints with Newman
Set these in your repository settings → Secrets:
TWILIO_ACCOUNT_SID # Your Twilio Account SID
TWILIO_AUTH_TOKEN # Your Twilio Auth Token
The GITHUB_TOKEN is automatically provided by GitHub Actions.
- Development: npm run twilio:deploy:dev
- Production: npm run twilio:deploy:prod
twilio-synthetic-call-data-generator/
├── .github/
│   ├── workflows/test.yml              # CI/CD pipeline
│   ├── ISSUE_TEMPLATE/                 # Bug/feature templates
│   └── PULL_REQUEST_TEMPLATE.md
├── functions/                          # Twilio Serverless Functions
│   ├── voice-handler.js                # Conference participant routing
│   ├── transcribe.js                   # Speech-to-text capture
│   ├── respond.js                      # OpenAI response generation
│   ├── conference-status-webhook.js
│   ├── transcription-webhook.js
│   └── utils/                          # Shared utilities
├── src/                                # Core application
│   ├── main.js                         # Entry point
│   ├── personas/                       # Customer/agent loaders
│   ├── pairing/                        # Pairing strategies
│   ├── orchestration/                  # Conference creation
│   └── segment/                        # CDP integration
├── scripts/                            # Deployment & validation
│   ├── pre-deployment-check.js
│   ├── post-deployment-validation.js
│   └── smoke-test.js
├── tests/                              # 634 tests (unit/integration/e2e)
├── docs/                               # Documentation
├── postman/                            # API test collections
├── customers.json                      # Customer personas
├── package.json                        # Dependencies & scripts
└── README.md                           # This file
# Generate 100 synthetic calls with random pairing
node scripts/generate-bulk-calls.js --count 100 --cps 1
# Results: Recordings, transcripts, Voice Intelligence insights
# → Feeds into Segment CDP → Data warehouse → BI tools
# Generate diverse scenarios (frustrated + inexperienced agent, etc.)
npm run start # Creates random pairings
# Extract Voice Intelligence operator results
# → Sentiment analysis, PII detection, call classification
# → Use for supervised ML training data
# Deploy new TwiML function
npm run deploy
# Validate with E2E tests
npm run smoke-test
# Generate synthetic calls to test behavior
node src/main.js
- Fork the repository
- Create feature branch: git checkout -b feature/amazing-feature
- Follow TDD: Write tests first, then implementation
- Run checks: npm run build (linting + tests + formatting)
- Commit changes: Use conventional commits
- Push and create Pull Request
- Tests Required: Comprehensive test coverage for all code
- TDD Approach: Red → Green → Refactor
- Code Quality: Must pass ESLint + Prettier
- Documentation: Update relevant docs
- No Secrets: Never commit credentials
- Twilio Functions: twilio.com/docs/serverless
- Jest Testing: jestjs.io
- Newman API Testing: learning.postman.com/docs/running-collections/using-newman-cli
- GitHub Actions: docs.github.com/actions
MIT License - see LICENSE file for details.
Ready to build? Start with git clone and npm run setup - you'll be ready to party!