Internal development repository for Twilio Synthetic Call Data Generator - includes full development history, internal docs, and AI coordination files

wittyreference/mc-twilio-synthetic-call-data-generator

Twilio Synthetic Call Data Generator

A production-grade system for generating realistic synthetic call data using Twilio Programmable Voice and Segment CDP. Features random customer-agent pairing for realistic scenarios (including challenging interactions), AI-powered conversations with OpenAI, Voice Intelligence transcription, and ML-based customer profiling with churn risk and propensity scores.

Architecture: Built with production-grade patterns including comprehensive test coverage for core TwiML functions, retry logic with exponential backoff, circuit breakers, and webhook signature validation.
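As an illustration of the retry pattern named above, here is a minimal sketch of retry with exponential backoff. It is not taken from this repository: `withRetry`, `backoffDelayMs`, and the default delays are hypothetical names and values chosen for the example.

```javascript
// Hypothetical helper: delay before the retry that follows attempt number
// `attempt` (0-based). The delay doubles each time and is capped at maxMs.
function backoffDelayMs(attempt, baseMs = 200, maxMs = 5000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls `fn` until it resolves, or until `retries` extra attempts are used up.
// The last error is rethrown if every attempt fails.
async function withRetry(fn, { retries = 3, baseMs = 200, maxMs = 5000 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < retries) {
        await sleep(backoffDelayMs(attempt, baseMs, maxMs));
      }
    }
  }
  throw lastError;
}
```

A caller would wrap any flaky network call, for example `await withRetry(() => sendProfileUpdate(profile))`, where `sendProfileUpdate` stands in for whatever request needs protecting.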

🎯 What It Does

Generates realistic synthetic call data for testing, development, and analytics:

  1. Random Pairing - Creates realistic scenarios including challenging interactions (frustrated customers with inexperienced agents)
  2. AI Conversations - OpenAI-powered realistic agent-customer conversations with Voice Intelligence transcription
  3. Customer Profiling - Creates and updates Segment CDP profiles with ML scores
  4. ML Analytics - Calculates churn risk, propensity to buy, and satisfaction scores
  5. Complete Pipeline - End-to-end automation from pairing to analytics

🚀 Features

  • ✅ One-Command Deployment - Pre-deployment checks + deploy + post-deployment validation
  • ✅ Production Testing - Smoke tests against real Twilio and Segment APIs
  • ✅ Comprehensive Test Coverage - 634 tests across unit, integration, and E2E
  • ✅ Realistic Pairing - Random customer-agent matching creates diverse, realistic scenarios
  • ✅ Segment CDP Integration - Automatic profile creation and ML score updates
  • ✅ Twilio Serverless - Conference webhooks and AI conversation orchestration

🛠 Tech Stack

  • Backend: Node.js 18+, Twilio Serverless Functions
  • AI: OpenAI GPT-5-nano, Twilio Voice Intelligence
  • Data: Segment CDP, Twilio Sync
  • Testing: Jest (634 tests), Newman (Postman)
  • CI/CD: GitHub Actions
  • Code Quality: ESLint, Prettier

💰 Cost Estimation

Per 100 synthetic calls (assuming 2-minute average conversation, 5-minute maximum):

| Service | Usage | Cost |
|---|---|---|
| Twilio Voice | 200 minutes @ $0.013/min | ~$2.60 |
| Twilio Voice Intelligence | 200 minutes @ $0.02/min | ~$4.00 |
| OpenAI GPT-5-nano | ~1M tokens @ $0.05/1M input, $0.40/1M output | ~$0.15 |
| Twilio Sync | Included in usage | Free tier |
| Segment CDP | Up to 10K MTUs/month | Free tier |
| Total | | ~$6.75 per 100 calls |

Budget Planning:

  • MAX_DAILY_CALLS=1000 (default) = ~$67.50/day maximum
  • MAX_DAILY_CALLS=100 = ~$6.75/day for testing
  • Adjust MAX_DAILY_CALLS in .env to control spending
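The budget figures above follow directly from the per-minute rates in the cost table. The sketch below reproduces that arithmetic so you can plug in other call volumes. The function name and the per-call OpenAI figure (the table's ~$0.15 per 100 calls, divided by 100) are assumptions for illustration, and actual pricing may differ.

```javascript
// Rates copied from the cost table; openAiPerCall is derived from
// "~$0.15 per 100 calls" and is only an approximation.
const RATES = {
  voicePerMin: 0.013,
  voiceIntelligencePerMin: 0.02,
  openAiPerCall: 0.0015,
};

// Hypothetical helper: estimated USD cost for `calls` calls of
// `avgMinutes` each, rounded to the nearest cent.
function estimateCostUsd(calls, avgMinutes = 2, rates = RATES) {
  const minutes = calls * avgMinutes;
  const total =
    minutes * (rates.voicePerMin + rates.voiceIntelligencePerMin) +
    calls * rates.openAiPerCall;
  return Math.round(total * 100) / 100;
}

console.log(estimateCostUsd(100));  // 6.75
console.log(estimateCostUsd(1000)); // 67.5
```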

Cost Controls Built-in:

  • ✅ Auto-termination at 5 minutes - Prevents runaway conversation costs
  • ✅ Rate limiting - MAX_DAILY_CALLS prevents accidental overspending
  • ✅ Efficient model - GPT-5-nano is 20x cheaper than GPT-4
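To make the rate-limiting control concrete, here is a rough sketch of a daily call cap driven by MAX_DAILY_CALLS. It is an in-memory illustration only, not the repository's implementation: `createDailyLimiter` is a hypothetical name, the counter resets on the UTC date, and a real deployment would need shared storage so the count survives restarts.

```javascript
// Hypothetical in-memory limiter. `now` is injectable so the reset
// behaviour can be exercised without waiting for midnight.
function createDailyLimiter(maxDailyCalls, now = () => new Date()) {
  let day = null;
  let used = 0;
  return function tryAcquire() {
    const today = now().toISOString().slice(0, 10); // UTC date, YYYY-MM-DD
    if (today !== day) {
      day = today;
      used = 0;
    }
    if (used >= maxDailyCalls) return false;
    used += 1;
    return true;
  };
}

// Fall back to the documented default of 1000 if the variable is
// missing or not a number.
const configuredMax = Number.parseInt(process.env.MAX_DAILY_CALLS ?? '1000', 10);
const acquireCallSlot = createDailyLimiter(
  Number.isNaN(configuredMax) ? 1000 : configuredMax
);
```

Before starting each call, the caller checks `acquireCallSlot()` and skips the call when it returns `false`.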

Cost-Saving Tips:

  • Use shorter conversations for testing (conferences auto-terminate at 5 minutes)
  • Start with MAX_DAILY_CALLS=10 during development
  • Monitor OpenAI usage at platform.openai.com/usage
  • Use Twilio's free trial credits for initial testing

⚡ Quick Start (5 Minutes)

1. Install & Configure

# Install dependencies
npm install

# Create .env file
cp .env.example .env

Edit .env with your credentials:

# Get from https://console.twilio.com
TWILIO_ACCOUNT_SID=ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
TWILIO_AUTH_TOKEN=your_auth_token_here

# Get from https://app.segment.com → Sources → Node.js
SEGMENT_WRITE_KEY=your_segment_write_key_here
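If you want a quick sanity check of these values before running the project's own validation, something like the following sketch can help. It is not the repository's pre-deployment script: `findEnvProblems` is a hypothetical helper, and it assumes the variables have already been loaded into `process.env` by whatever mechanism you use.

```javascript
const REQUIRED_ENV = ['TWILIO_ACCOUNT_SID', 'TWILIO_AUTH_TOKEN', 'SEGMENT_WRITE_KEY'];

// Hypothetical helper: returns a list of human-readable problems,
// or an empty list when the three variables look usable.
function findEnvProblems(env = process.env) {
  const problems = [];
  for (const name of REQUIRED_ENV) {
    if (!env[name] || env[name].trim() === '') {
      problems.push(`${name} is not set`);
    }
  }
  // Twilio Account SIDs are "AC" followed by 32 hexadecimal characters.
  const sid = env.TWILIO_ACCOUNT_SID;
  if (sid && !/^AC[0-9a-fA-F]{32}$/.test(sid)) {
    problems.push('TWILIO_ACCOUNT_SID does not look like an Account SID');
  }
  return problems;
}

for (const problem of findEnvProblems()) {
  console.error(`Config problem: ${problem}`);
}
```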

2. Validate Setup

# Run pre-deployment checks (validates env, tests, APIs)
npm run pre-deploy

Expected: ✓ ALL CHECKS PASSED (7/7)

3. Deploy to Twilio

# Deploy with automatic validation
npm run deploy

This runs:

  1. Pre-deployment checks
  2. Twilio serverless deployment
  3. Post-deployment validation

4. Generate Synthetic Calls

# Create your first synthetic conference
node src/main.js

What happens:

  • Pairs a customer with an agent (random for realistic scenarios)
  • Creates Segment CDP profiles
  • Generates Twilio conference with AI conversation
  • Updates profiles with ML scores (churn risk, propensity, satisfaction)

Pairing Strategies (configurable):

  • random (default) - Random pairing for diverse scenarios (frustrated customer + inexperienced agent = sparks fly! 🔥)
  • frustrated - Match difficult customers with experienced agents
  • patient - Patient customers with any agent
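The three strategies above can be pictured as a small selection function. The sketch below is an assumption-laden illustration, not the code in src/pairing/: the persona fields `demeanor` and `experienceYears`, the five-year threshold for "experienced", and the function names are all hypothetical.

```javascript
// Picks one element; `rng` is injectable so results can be made repeatable.
function pickRandom(items, rng = Math.random) {
  return items[Math.floor(rng() * items.length)];
}

// Hypothetical pairing function covering the three documented strategies.
// Returns undefined for a side when no persona matches the filter.
function pairCustomerWithAgent(customers, agents, strategy = 'random', rng = Math.random) {
  switch (strategy) {
    case 'frustrated': {
      const difficult = customers.filter((c) => c.demeanor === 'frustrated');
      const experienced = agents.filter((a) => a.experienceYears >= 5);
      return { customer: pickRandom(difficult, rng), agent: pickRandom(experienced, rng) };
    }
    case 'patient': {
      const patient = customers.filter((c) => c.demeanor === 'patient');
      return { customer: pickRandom(patient, rng), agent: pickRandom(agents, rng) };
    }
    case 'random':
      return { customer: pickRandom(customers, rng), agent: pickRandom(agents, rng) };
    default:
      throw new Error(`Unknown pairing strategy: ${strategy}`);
  }
}
```

Passing a seeded or fixed `rng` makes pairings reproducible in tests, while the default `Math.random` gives the varied scenarios described above.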

📖 For detailed instructions, see docs/quick-start.md


🛠 Development Tools

Deployment Automation

# Pre-deployment validation (env, tests, credentials, data files)
npm run pre-deploy

# Safe deployment with all checks
npm run deploy:safe

# Post-deployment validation
npm run post-deploy

# Smoke test (validates real APIs without deploying)
npm run smoke-test

Testing

# Run all tests (634 tests, 26 suites)
npm test

# Watch mode
npm run test:watch

# Coverage report
npm run test:coverage

# E2E tests only
npm run test:e2e

Development

# Start local Twilio serverless development server
npm run dev

# Validate customer and agent data
node scripts/validate-customers.js
node scripts/validate-agents.js

Get your tokens:

4. Start Development

# Start Twilio Functions locally
npm run dev

# In another terminal, run tests in watch mode
npm run test:watch

# Create GitHub issues from your todos
npm run create-issue from-todos

📖 Detailed Setup

If you prefer manual setup or encounter issues:

Prerequisites

Required:

Optional (for Python development):

  • Python 3.8+
  • uv package manager: curl -LsSf https://astral.sh/uv/install.sh | sh

Manual Installation Steps

  1. Install Node.js dependencies:

    npm install
  2. Install Python dependencies (if using Python):

    uv sync --group test --group dev
  3. Install global tools:

    npm install -g twilio-cli newman
    twilio plugins:install @twilio-labs/plugin-serverless
  4. Authenticate with Twilio:

    twilio login
  5. Set up environment variables:

    cp .env.example .env
    # Edit .env with your credentials

🔄 Development Workflow

Core Commands

# Development
npm run dev                # Start local Twilio Functions server
npm run build              # Run linting, tests, and formatting checks

# Testing
npm test                   # Run all Jest tests
npm run test:watch         # Run tests in watch mode
npm run test:coverage      # Generate coverage report
npm run test:api           # Run Newman API tests
uv run pytest              # Run Python tests (if applicable)

# Code Quality
npm run lint               # Check code quality
npm run lint:fix           # Fix linting issues automatically
npm run format             # Format code with Prettier
npm run format:check       # Check if code is formatted

# Deployment
npm run twilio:deploy      # Deploy to Twilio production
npm run twilio:deploy:dev  # Deploy to development environment

🧪 Testing Strategy

Test-Driven Development (TDD)

We practice strict TDD with comprehensive coverage:

  1. Write failing test (Red)
  2. Write minimal code to pass (Green)
  3. Refactor while keeping tests green

Test Types & Coverage

  • Unit Tests: Individual function testing (Jest)
  • Integration Tests: Component interactions and regression prevention
  • E2E Tests: Full pipeline validation with real Twilio APIs
  • API Tests: End-to-end validation (Newman)
  • Coverage Target: >80% for all test types

Regression Prevention Tests

Critical regression tests protect against production issues:

OpenAI API Compatibility (tests/integration/openai-api-parameters.test.js)

Fast static code analysis (~200ms) that validates:

  • ✅ Using max_completion_tokens (not deprecated max_tokens)
  • ✅ No unsupported temperature parameter for gpt-5-nano
  • ✅ Prevents 400 BadRequest errors from OpenAI

npm test tests/integration/openai-api-parameters.test.js
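For readers unfamiliar with this style of test, a static check of this kind can be as simple as scanning source text for the parameter names. The sketch below shows the idea only. It is not the contents of the test file above, `findOpenAiParamProblems` is a hypothetical name, and a regex scan like this can produce false positives (for example, on comments).

```javascript
// Hypothetical static check: scans source text of the code that builds
// the OpenAI request and returns a list of problems found.
function findOpenAiParamProblems(sourceText) {
  const problems = [];
  if (/\bmax_tokens\b/.test(sourceText)) {
    problems.push('uses deprecated max_tokens');
  }
  if (/\btemperature\s*:/.test(sourceText)) {
    problems.push('sets temperature, which gpt-5-nano does not support');
  }
  if (!/\bmax_completion_tokens\b/.test(sourceText)) {
    problems.push('does not set max_completion_tokens');
  }
  return problems;
}

const example = "{ model: 'gpt-5-nano', max_completion_tokens: 150 }";
console.log(findOpenAiParamProblems(example).length); // 0
```

Because it only reads text and makes no network calls, a check like this stays fast enough to run on every commit.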

Transcript Content Validation (tests/e2e/transcript-content-validation.test.js)

Full E2E test (~6-7 minutes) that validates:

  • ✅ Transcripts contain real AI conversations (not error messages)
  • ✅ Multi-speaker dialogue (agent + customer)
  • ✅ Contextual customer responses (not generic errors)
  • ✅ Agent introductions are captured correctly

npm test tests/e2e/transcript-content-validation.test.js
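The content checks listed above boil down to a few assertions over transcript sentences. The following is a simplified sketch of two of them (multi-speaker dialogue and no error text). The transcript shape (an array of `{ speaker, text }` objects), the error patterns, and the function name are assumptions for illustration and do not reflect the actual test or the Voice Intelligence response format.

```javascript
// Assumed error phrases; a real test would use whatever its own
// failure modes actually produce.
const ERROR_PATTERNS = [/application error/i, /\bbad ?request\b/i];

// Hypothetical validator: returns a list of problems, empty when the
// transcript looks like a genuine two-party conversation.
function validateTranscript(sentences) {
  const problems = [];
  const speakers = new Set(sentences.map((s) => s.speaker));
  if (speakers.size < 2) {
    problems.push('expected at least two speakers');
  }
  for (const { text } of sentences) {
    if (ERROR_PATTERNS.some((pattern) => pattern.test(text))) {
      problems.push(`looks like an error message: "${text}"`);
    }
  }
  return problems;
}
```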

Running Tests

# All tests
npm test

# Watch mode for development
npm run test:watch

# Coverage report
npm run test:coverage

# Fast regression tests only (recommended for CI)
npm test tests/integration/

# API tests (Newman/Postman)
npm run test:api
newman run postman/collection.json -e postman/environment.json

🚀 CI/CD Pipeline

GitHub Actions Workflow

The CI/CD pipeline (.github/workflows/test.yml) automatically:

  1. Test Node.js - Runs Jest tests with coverage
  2. Code Quality - ESLint and Prettier validation
  3. API Testing - Validates endpoints with Newman

Required GitHub Secrets

Set these in your repository settings → Secrets:

TWILIO_ACCOUNT_SID    # Your Twilio Account SID
TWILIO_AUTH_TOKEN     # Your Twilio Auth Token

The GITHUB_TOKEN is automatically provided by GitHub Actions.

Deployment Environments

  • Development: npm run twilio:deploy:dev
  • Production: npm run twilio:deploy:prod

📁 Project Structure

twilio-synthetic-call-data-generator/
├── .github/
│   ├── workflows/test.yml      # CI/CD pipeline
│   ├── ISSUE_TEMPLATE/         # Bug/feature templates
│   └── PULL_REQUEST_TEMPLATE.md
├── functions/                  # Twilio Serverless Functions
│   ├── voice-handler.js        # Conference participant routing
│   ├── transcribe.js           # Speech-to-text capture
│   ├── respond.js              # OpenAI response generation
│   ├── conference-status-webhook.js
│   ├── transcription-webhook.js
│   └── utils/                  # Shared utilities
├── src/                        # Core application
│   ├── main.js                 # Entry point
│   ├── personas/               # Customer/agent loaders
│   ├── pairing/                # Pairing strategies
│   ├── orchestration/          # Conference creation
│   └── segment/                # CDP integration
├── scripts/                    # Deployment & validation
│   ├── pre-deployment-check.js
│   ├── post-deployment-validation.js
│   └── smoke-test.js
├── tests/                      # 634 tests (unit/integration/e2e)
├── docs/                       # Documentation
├── postman/                    # API test collections
├── customers.json              # Customer personas
├── package.json                # Dependencies & scripts
└── README.md                   # This file

🎯 Example Use Cases

Generate Test Data for Analytics Pipeline

# Generate 100 synthetic calls with random pairing
node scripts/generate-bulk-calls.js --count 100 --cps 1

# Results: Recordings, transcripts, Voice Intelligence insights
# → Feeds into Segment CDP → Data warehouse → BI tools
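
Assuming `--cps` means calls per second (the usual meaning of CPS in Twilio), the pacing behind a bulk run can be sketched as below. This makes no claim about how scripts/generate-bulk-calls.js is written: `runPaced` and `startCall` are hypothetical names, and the sketch only spaces out call starts without waiting for each call to finish.

```javascript
const pause = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical pacing loop: starts `count` calls at roughly `cps` per
// second and resolves with one { ok, value } or { ok, error } per call.
async function runPaced(count, cps, startCall) {
  const intervalMs = 1000 / cps;
  const outcomes = [];
  for (let i = 0; i < count; i++) {
    // Failures are captured per call so one bad call does not abort the batch.
    const outcome = Promise.resolve()
      .then(() => startCall(i))
      .then(
        (value) => ({ ok: true, value }),
        (error) => ({ ok: false, error })
      );
    outcomes.push(outcome);
    if (i < count - 1) await pause(intervalMs);
  }
  return Promise.all(outcomes);
}
```

With `count = 100` and `cps = 1`, the loop takes about 99 seconds to start every call, which keeps the start rate within whatever limit the account allows.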

Train ML Models on Customer Service Data

# Generate diverse scenarios (frustrated + inexperienced agent, etc.)
npm run start  # Creates random pairings

# Extract Voice Intelligence operator results
# → Sentiment analysis, PII detection, call classification
# → Use for supervised ML training data

Test Voice Application Changes

# Deploy new TwiML function
npm run deploy

# Validate with E2E tests
npm run smoke-test

# Generate synthetic calls to test behavior
node src/main.js

🤝 Contributing

  1. Fork the repository
  2. Create feature branch: git checkout -b feature/amazing-feature
  3. Follow TDD: Write tests first, then implementation
  4. Run checks: npm run build (linting + tests + formatting)
  5. Commit changes: Use conventional commits
  6. Push and create Pull Request

Development Standards

  • Tests Required: Comprehensive test coverage for all code
  • TDD Approach: Red → Green → Refactor
  • Code Quality: Must pass ESLint + Prettier
  • Documentation: Update relevant docs
  • No Secrets: Never commit credentials

📄 License

MIT License - see LICENSE file for details.


Ready to build? Start with git clone and npm run setup - you'll be ready to party! 🚀
