A production-grade system for generating realistic synthetic call data using Twilio Programmable Voice and Segment CDP. Features random customer-agent pairing for realistic scenarios (including challenging interactions), AI-powered conversations with OpenAI, Voice Intelligence transcription, and ML-based customer profiling with churn risk and propensity scores.
Architecture: Built with production-grade patterns including comprehensive test coverage for core TwiML functions, retry logic with exponential backoff, circuit breakers, and webhook signature validation.
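As a rough illustration of the retry pattern mentioned above, here is a minimal sketch of retry with exponential backoff. The helper name `retryWithBackoff` and its option names are hypothetical and not taken from the project source; the actual implementation may differ.

```javascript
// Hypothetical helper: retry an async operation with exponential backoff.
// Names and defaults are illustrative assumptions, not the project's API.
async function retryWithBackoff(operation, options = {}) {
  const {
    maxAttempts = 3, // total tries, including the first one
    baseDelayMs = 200, // wait before the second attempt
    factor = 2, // multiplier applied after each failure
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  } = options;

  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation(attempt);
    } catch (error) {
      lastError = error;
      if (attempt === maxAttempts) break;
      // Delay grows as baseDelayMs * factor^(attempt - 1).
      await sleep(baseDelayMs * factor ** (attempt - 1));
    }
  }
  // Every attempt failed: surface the most recent error to the caller.
  throw lastError;
}
```

The `sleep` option is injectable so tests can skip real waiting. A circuit breaker would typically wrap a helper like this and stop calling the operation after repeated failures.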
Generates realistic synthetic call data for testing, development, and analytics:
- Random Pairing - Creates realistic scenarios including challenging interactions (frustrated customers with inexperienced agents)
- AI Conversations - OpenAI-powered realistic agent-customer conversations with Voice Intelligence transcription
- Customer Profiling - Creates and updates Segment CDP profiles with ML scores
- ML Analytics - Calculates churn risk, propensity to buy, and satisfaction scores
- Complete Pipeline - End-to-end automation from pairing to analytics
- One-Command Deployment - Pre-deployment checks + deploy + post-deployment validation
- Production Testing - Smoke tests against real Twilio and Segment APIs
- Comprehensive Test Coverage - 634 tests across unit, integration, and E2E
- Realistic Pairing - Random customer-agent matching creates diverse, realistic scenarios
- Segment CDP Integration - Automatic profile creation and ML score updates
- Twilio Serverless - Conference webhooks and AI conversation orchestration
- Backend: Node.js 18+, Twilio Serverless Functions
- AI: OpenAI GPT-5-nano, Twilio Voice Intelligence
- Data: Segment CDP, Twilio Sync
- Testing: Jest (634 tests), Newman (Postman)
- CI/CD: GitHub Actions
- Code Quality: ESLint, Prettier
Per 100 synthetic calls (assuming 2-minute average conversation, 5-minute maximum):
| Service | Usage | Cost |
|---|---|---|
| Twilio Voice | 200 minutes @ $0.013/min | ~$2.60 |
| Twilio Voice Intelligence | 200 minutes @ $0.02/min | ~$4.00 |
| OpenAI GPT-5-nano | ~1M tokens @ $0.05/1M input, $0.40/1M output | ~$0.15 |
| Twilio Sync | Included in usage | Free tier |
| Segment CDP | Up to 10K MTUs/month | Free tier |
| Total | | ~$6.75 per 100 calls |
Budget Planning:
- `MAX_DAILY_CALLS=1000` (default) = ~$67.50/day maximum
- `MAX_DAILY_CALLS=100` = ~$6.75/day for testing
- Adjust `MAX_DAILY_CALLS` in `.env` to control spending
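The budget figures can be reproduced with a small calculation. This sketch uses the per-minute prices from the cost table and treats the OpenAI line as a flat estimate per 100 calls; the function name `estimateCost` is hypothetical.

```javascript
// Prices copied from the cost table; adjust them if your rates differ.
const PRICES = {
  voicePerMinute: 0.013,
  voiceIntelligencePerMinute: 0.02,
  openAiPer100Calls: 0.15, // the table's rough estimate, not a metered rate
};

// Estimate total spend in dollars for a number of calls (hypothetical helper).
function estimateCost(calls, avgMinutesPerCall = 2) {
  const minutes = calls * avgMinutesPerCall;
  const voice = minutes * PRICES.voicePerMinute;
  const intelligence = minutes * PRICES.voiceIntelligencePerMinute;
  const openAi = (calls / 100) * PRICES.openAiPer100Calls;
  // Round to cents to hide floating-point noise.
  return Math.round((voice + intelligence + openAi) * 100) / 100;
}

console.log(estimateCost(100)); // 6.75
console.log(estimateCost(1000)); // 67.5
```

Calls that run to the 5-minute maximum cost more than this average-based estimate.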
Cost Controls Built-in:
- Auto-termination at 5 minutes - Prevents runaway conversation costs
- Rate limiting - `MAX_DAILY_CALLS` prevents accidental overspending
- Efficient model - GPT-5-nano is 20x cheaper than GPT-4
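To illustrate the rate-limiting idea, here is a minimal in-memory sketch of a daily call cap driven by `MAX_DAILY_CALLS`. The class name `DailyCallLimiter` is hypothetical, and the project's real limiter may work differently (for example, by persisting counts in Twilio Sync, since in-memory state does not survive across serverless invocations).

```javascript
// Hypothetical in-memory daily call counter; an assumption for illustration only.
class DailyCallLimiter {
  constructor(maxDailyCalls = Number(process.env.MAX_DAILY_CALLS) || 1000) {
    this.maxDailyCalls = maxDailyCalls;
    this.day = null; // UTC date string of the current counting window
    this.count = 0;
  }

  // Returns true and records the call if today's budget allows it.
  tryAcquire(now = new Date()) {
    const day = now.toISOString().slice(0, 10); // e.g. "2025-01-31" (UTC)
    if (day !== this.day) {
      // A new day started: reset the counter.
      this.day = day;
      this.count = 0;
    }
    if (this.count >= this.maxDailyCalls) return false;
    this.count += 1;
    return true;
  }
}
```

A caller would check `limiter.tryAcquire()` before creating each conference and skip the call when it returns `false`.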
Cost-Saving Tips:
- Use shorter conversations for testing (conferences auto-terminate at 5 minutes)
- Start with `MAX_DAILY_CALLS=10` during development
- Monitor OpenAI usage at platform.openai.com/usage
- Use Twilio's free trial credits for initial testing
- Quick Start
- Development Tools
- Detailed Setup
- Development Workflow
- Testing Strategy
- CI/CD Pipeline
- Project Structure
- Example Use Cases
- Contributing
- Quick Start Guide - 5-minute setup for trying the system
- Deployment Guide - Production deployment with advanced configuration
- Architecture - System architecture and data flow diagrams
- API Documentation - Complete API reference
- Error Handling Guide - Error handling patterns and debugging
- Segment Setup - Configure Segment CDP integration
- Event Streams Setup - Configure Twilio Event Streams
- Sync Setup - Configure Twilio Sync for state management
# Install dependencies
npm install
# Create .env file
cp .env.example .env
Edit .env with your credentials:
# Get from https://console.twilio.com
TWILIO_ACCOUNT_SID=ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
TWILIO_AUTH_TOKEN=your_auth_token_here
# Get from https://app.segment.com → Sources → Node.js
SEGMENT_WRITE_KEY=your_segment_write_key_here
# Run pre-deployment checks (validates env, tests, APIs)
npm run pre-deploy
Expected: ALL CHECKS PASSED (7/7)
# Deploy with automatic validation
npm run deploy
This runs:
- Pre-deployment checks
- Twilio serverless deployment
- Post-deployment validation
# Create your first synthetic conference
node src/main.js
What happens:
- Pairs a customer with an agent (random for realistic scenarios)
- Creates Segment CDP profiles
- Generates Twilio conference with AI conversation
- Updates profiles with ML scores (churn risk, propensity, satisfaction)
Pairing Strategies (configurable):
- `random` (default) - Random pairing for diverse scenarios (frustrated customer + inexperienced agent = sparks fly!)
- `frustrated` - Match difficult customers with experienced agents
- `patient` - Patient customers with any agent
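The three strategies could be sketched roughly as follows. The persona fields `demeanor` and `experienceYears`, the 3-year threshold, and the function names are assumptions made for illustration; the real persona schema and pairing code may differ.

```javascript
// Pick one element from a non-empty list; `random` is injectable for testing.
function pickRandom(items, random = Math.random) {
  if (items.length === 0) throw new Error('No candidates available for pairing');
  return items[Math.floor(random() * items.length)];
}

// Hypothetical pairing function. Field names (demeanor, experienceYears)
// and the experience threshold are illustrative assumptions.
function pairCustomerWithAgent(customers, agents, strategy = 'random', random = Math.random) {
  switch (strategy) {
    case 'random':
      // Any customer with any agent, including mismatched pairs.
      return { customer: pickRandom(customers, random), agent: pickRandom(agents, random) };
    case 'frustrated': {
      // Difficult customers are routed to experienced agents.
      const difficult = customers.filter((c) => c.demeanor === 'frustrated');
      const experienced = agents.filter((a) => a.experienceYears >= 3);
      return { customer: pickRandom(difficult, random), agent: pickRandom(experienced, random) };
    }
    case 'patient': {
      // Patient customers can be paired with any agent.
      const patient = customers.filter((c) => c.demeanor === 'patient');
      return { customer: pickRandom(patient, random), agent: pickRandom(agents, random) };
    }
    default:
      throw new Error(`Unknown pairing strategy: ${strategy}`);
  }
}
```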
For detailed instructions, see docs/quick-start.md
# Pre-deployment validation (env, tests, credentials, data files)
npm run pre-deploy
# Safe deployment with all checks
npm run deploy:safe
# Post-deployment validation
npm run post-deploy
# Smoke test (validates real APIs without deploying)
npm run smoke-test
# Run all tests (634 tests, 26 suites)
npm test
# Watch mode
npm run test:watch
# Coverage report
npm run test:coverage
# E2E tests only
npm run test:e2e
# Start local Twilio serverless development server
npm run dev
# Validate customer and agent data
node scripts/validate-customers.js
node scripts/validate-agents.js
Get your tokens:
- GitHub token: github.com/settings/tokens
- Twilio credentials: console.twilio.com
- OpenAI API key: platform.openai.com/api-keys
- Anthropic API key: console.anthropic.com
# Start Twilio Functions locally
npm run dev
# In another terminal, run tests in watch mode
npm run test:watch
# Create GitHub issues from your todos
npm run create-issue from-todos
If you prefer manual setup or encounter issues:
Required:
- Node.js ≥18.0.0 (nodejs.org)
- npm ≥8.0.0 (comes with Node.js)
- Git (git-scm.com)
Optional (for Python development):
- Python 3.8+
- uv package manager: `curl -LsSf https://astral.sh/uv/install.sh | sh`

Setup steps:
- Install Node.js dependencies: `npm install`
- Install Python dependencies (if using Python): `uv sync --group test --group dev`
- Install global tools: `npm install -g twilio-cli newman`, then `twilio plugins:install @twilio-labs/plugin-serverless`
- Authenticate with Twilio: `twilio login`
- Set up environment variables: `cp .env.example .env`, then edit `.env` with your credentials
# Development
npm run dev # Start local Twilio Functions server
npm run build # Run linting, tests, and formatting checks
# Testing
npm test # Run all Jest tests
npm run test:watch # Run tests in watch mode
npm run test:coverage # Generate coverage report
npm run test:api # Run Newman API tests
uv run pytest # Run Python tests (if applicable)
# Code Quality
npm run lint # Check code quality
npm run lint:fix # Fix linting issues automatically
npm run format # Format code with Prettier
npm run format:check # Check if code is formatted
# Deployment
npm run twilio:deploy # Deploy to Twilio production
npm run twilio:deploy:dev # Deploy to development environment
We practice strict TDD with comprehensive coverage:
- Write failing test (Red)
- Write minimal code to pass (Green)
- Refactor while keeping tests green
- Unit Tests: Individual function testing (Jest)
- Integration Tests: Component interactions and regression prevention
- E2E Tests: Full pipeline validation with real Twilio APIs
- API Tests: End-to-end validation (Newman)
- Coverage Target: >80% for all test types
Critical regression tests protect against production issues:
Fast static code analysis (~200ms) that validates:
- Using `max_completion_tokens` (not deprecated `max_tokens`)
- No unsupported `temperature` parameter for gpt-5-nano
- Prevents 400 BadRequest errors from OpenAI
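A static check of this kind can be approximated with a couple of regular expressions run over the source text. This is only a sketch: the function name `findUnsupportedOpenAiParams` is hypothetical, and the project's actual test may be implemented differently.

```javascript
// Hypothetical static check: scan source text for OpenAI request
// parameters that the checks above flag as problematic.
function findUnsupportedOpenAiParams(source) {
  const rules = [
    {
      // Matches the key "max_tokens:" but not "max_completion_tokens:".
      pattern: /\bmax_tokens\s*:/,
      message: 'use max_completion_tokens instead of max_tokens',
    },
    {
      pattern: /\btemperature\s*:/,
      message: 'temperature is not supported for gpt-5-nano',
    },
  ];
  // Return the message of every rule whose pattern appears in the source.
  return rules.filter((rule) => rule.pattern.test(source)).map((rule) => rule.message);
}

const sample = "const req = { model: 'gpt-5-nano', max_completion_tokens: 150 };";
console.log(findUnsupportedOpenAiParams(sample)); // []
```

Because it only reads text and makes no network calls, a check like this stays fast enough to run on every commit.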
npm test tests/integration/openai-api-parameters.test.js
Full E2E test (~6-7 minutes) that validates:
- Transcripts contain real AI conversations (not error messages)
- Multi-speaker dialogue (agent + customer)
- Contextual customer responses (not generic errors)
- Agent introductions are captured correctly
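The content checks above might look something like the following sketch. It assumes a simplified transcript shape (`[{ speaker, text }]`) and a placeholder list of error phrases; neither is taken from the project, and real Voice Intelligence transcripts carry more fields.

```javascript
// Assumed transcript shape: [{ speaker: 'agent' | 'customer', text: string }].
// These phrases are illustrative placeholders, not the project's actual list.
const ERROR_PHRASES = ['application error', 'an error occurred'];

// Hypothetical validator: returns a list of problems (empty means valid).
function validateTranscript(sentences) {
  const problems = [];

  // Multi-speaker check: both roles must appear at least once.
  const speakers = new Set(sentences.map((s) => s.speaker));
  for (const role of ['agent', 'customer']) {
    if (!speakers.has(role)) problems.push(`missing ${role} dialogue`);
  }

  // Content check: flag lines that look like error messages.
  for (const { speaker, text } of sentences) {
    const lower = text.toLowerCase();
    if (ERROR_PHRASES.some((phrase) => lower.includes(phrase))) {
      problems.push(`${speaker} line looks like an error message: "${text}"`);
    }
  }
  return problems;
}
```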
npm test tests/e2e/transcript-content-validation.test.js
# All tests
npm test
# Watch mode for development
npm run test:watch
# Coverage report
npm run test:coverage
# Fast regression tests only (recommended for CI)
npm test tests/integration/
# API tests (Newman/Postman)
npm run test:api
newman run postman/collection.json -e postman/environment.json
The CI/CD pipeline (.github/workflows/test.yml) automatically:
- Test Node.js - Runs Jest tests with coverage
- Code Quality - ESLint and Prettier validation
- API Testing - Validates endpoints with Newman
Set these in your repository settings → Secrets:
TWILIO_ACCOUNT_SID # Your Twilio Account SID
TWILIO_AUTH_TOKEN # Your Twilio Auth Token
The GITHUB_TOKEN is automatically provided by GitHub Actions.
- Development: `npm run twilio:deploy:dev`
- Production: `npm run twilio:deploy:prod`
twilio-synthetic-call-data-generator/
├── .github/
│   ├── workflows/test.yml # CI/CD pipeline
│   ├── ISSUE_TEMPLATE/ # Bug/feature templates
│   └── PULL_REQUEST_TEMPLATE.md
├── functions/ # Twilio Serverless Functions
│   ├── voice-handler.js # Conference participant routing
│   ├── transcribe.js # Speech-to-text capture
│   ├── respond.js # OpenAI response generation
│   ├── conference-status-webhook.js
│   ├── transcription-webhook.js
│   └── utils/ # Shared utilities
├── src/ # Core application
│   ├── main.js # Entry point
│   ├── personas/ # Customer/agent loaders
│   ├── pairing/ # Pairing strategies
│   ├── orchestration/ # Conference creation
│   └── segment/ # CDP integration
├── scripts/ # Deployment & validation
│   ├── pre-deployment-check.js
│   ├── post-deployment-validation.js
│   └── smoke-test.js
├── tests/ # 634 tests (unit/integration/e2e)
├── docs/ # Documentation
├── postman/ # API test collections
├── customers.json # Customer personas
├── package.json # Dependencies & scripts
└── README.md # This file
# Generate 100 synthetic calls with random pairing
node scripts/generate-bulk-calls.js --count 100 --cps 1
# Results: Recordings, transcripts, Voice Intelligence insights
# → Feeds into Segment CDP → Data warehouse → BI tools
# Generate diverse scenarios (frustrated + inexperienced agent, etc.)
npm run start # Creates random pairings
# Extract Voice Intelligence operator results
# → Sentiment analysis, PII detection, call classification
# → Use for supervised ML training data
# Deploy new TwiML function
npm run deploy
# Validate with E2E tests
npm run smoke-test
# Generate synthetic calls to test behavior
node src/main.js
- Fork the repository
- Create feature branch: `git checkout -b feature/amazing-feature`
- Follow TDD: Write tests first, then implementation
- Run checks: `npm run build` (linting + tests + formatting)
- Commit changes: Use conventional commits
- Push and create Pull Request
- Tests Required: Comprehensive test coverage for all code
- TDD Approach: Red → Green → Refactor
- Code Quality: Must pass ESLint + Prettier
- Documentation: Update relevant docs
- No Secrets: Never commit credentials
- Twilio Functions: twilio.com/docs/serverless
- Jest Testing: jestjs.io
- Newman API Testing: learning.postman.com/docs/running-collections/using-newman-cli
- GitHub Actions: docs.github.com/actions
MIT License - see LICENSE file for details.
Ready to build? Start with `git clone` and `npm run setup` - you'll be ready to party!