Arya036/Call_Center_Compliance_API

Call Center Compliance API (Track 3)

Production-style FastAPI service for the Track 3 hackathon problem.

It accepts one Base64 MP3 call at a time, runs multi-stage AI analysis, and returns structured compliance and business intelligence JSON.

Known Deployment Behavior on Free Tiers: On cold starts, the homepage (/) may briefly return an internal server error; use /health for uptime checks and POST /api/call-analytics for functional verification.

Problem Alignment

This implementation is built to satisfy the Track 3 contract:

  • One MP3 file per request via Base64
  • Mandatory header authentication using x-api-key
  • Multi-stage pipeline: transcription -> NLP analysis -> metric extraction
  • Structured response with transcript, summary, sop_validation, analytics, and keywords
  • Strict enum handling for payment and rejection categories
  • Transcript indexing in vector storage for semantic retrieval evidence

Scoring-Aware Features

Mapped to the published rubric:

  1. API availability and reliability
  • FastAPI endpoint with request validation and auth checks
  • Health check endpoint and graceful error payloads
  2. Transcript and summary quality
  • Sarvam codemix STT for Hinglish/Tanglish
  • Transcript cleanup and quality scoring
  • Engine fallback path for low-quality transcript cases
  3. SOP validation quality
  • Five mandatory SOP stages: greeting, identification, problemStatement, solutionOffering, closing
  • Deterministic complianceScore recompute from booleans
  4. Analytics correctness
  • paymentPreference constrained to: EMI, FULL_PAYMENT, PARTIAL_PAYMENT, DOWN_PAYMENT
  • rejectionReason constrained to: HIGH_INTEREST, BUDGET_CONSTRAINTS, ALREADY_PAID, NOT_INTERESTED, NONE
  • sentiment constrained to: Positive, Negative, Neutral
  5. Keywords and traceability
  • Keyword list extracted from transcript-level context
  • Safe fallback keyword generation when NLP provider is unavailable
  6. Vector storage evidence
  • Transcripts indexed in ChromaDB for semantic search
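
The safe fallback keyword generation listed above could be as simple as a word-frequency count over the transcript. The sketch below is an illustration only: `fallback_keywords` and the `STOPWORDS` set are hypothetical names, not taken from the repository, and it handles only Latin-script tokens.

```python
import re
from collections import Counter

# Hypothetical stopword list for illustration; not taken from the repository.
STOPWORDS = {
    "the", "and", "for", "you", "your", "can", "will", "this", "that",
    "are", "was", "have", "has", "with", "not", "but",
}


def fallback_keywords(transcript: str, limit: int = 5) -> list[str]:
    """Return the most frequent non-stopword tokens, most common first."""
    # Keep alphabetic tokens of three or more letters, lowercased.
    tokens = re.findall(r"[a-zA-Z]{3,}", transcript.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return [word for word, _ in counts.most_common(limit)]
```

Because this path needs no network call, it can still produce a keywords field when the NLP provider is unreachable.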

Architecture

API Flow Diagram

flowchart LR
    A[Client or Evaluator] --> B[POST /api/call-analytics]
    B --> C[Check x-api-key]
    C -->|Invalid| Z1[401 Unauthorized]
    C -->|Valid| D[Validate request body]
    D -->|Invalid base64 or tiny audio| Z2[400 Bad Request]
    D -->|Valid| E[Decode Base64 MP3]
    E --> F[Write temp .mp3 file]
    F --> G[Transcription pipeline]

    G -->|Transcription failed| H1[Return structured error payload]
    G -->|Transcript ready| I[NLP analysis pipeline]

    I --> J[Build response fields]
    J --> K[Insert row in SQLite]
    J --> L[Index transcript in Chroma vector store]
    K --> M[Return JSON response]
    L --> M

    subgraph Response Shape
      M1[status]
      M2[language]
      M3[transcript]
      M4[summary]
      M5[sop_validation]
      M6[analytics]
      M7[keywords]
    end

    M --> M1
    M --> M2
    M --> M3
    M --> M4
    M --> M5
    M --> M6
    M --> M7

STT Engine Strategy

STT Strategy Diagram

flowchart TD
  A[Temp audio file + language] --> B[Run Sarvam codemix STT]
  B --> C{Sarvam success?}

  C -->|No| F1{Fallback enabled?}
  F1 -->|No| Z1[Transcription failure]
  F1 -->|Yes| F2[Try fallback candidates]
  F2 --> W[faster-whisper if enabled]
  F2 --> R[Reverie if enabled and configured]
  W --> P[Quality score each result]
  R --> P
  P --> Q{Any fallback success?}
  Q -->|No| Z1
  Q -->|Yes| S[Pick best-quality transcript]
  S --> T[Post-process transcript]
  T --> U[Return success result]

  C -->|Yes| D[Post-process Sarvam transcript]
  D --> E[Compute quality score]
  E --> G{Dual-engine check needed?}
  G -->|No| U
  G -->|Yes, low quality| F2

  T --> V[Squash long repetitions]
  T --> X[Entity correction]
  T --> Y[Quality warnings + metadata]
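
The selection logic in the diagram can be sketched as follows. Everything here is illustrative: `SttResult`, `score_quality`, `transcribe`, and the `min_quality` threshold are hypothetical, the quality score is a toy heuristic, and each engine is modelled as a plain callable rather than a real Sarvam, faster-whisper, or Reverie client. One deliberate assumption: a low-quality primary transcript stays in the candidate pool, so it can still be returned if every fallback fails.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SttResult:
    engine: str
    text: str
    quality: float  # 0.0 (unusable) to 1.0 (clean); scoring method is assumed


# An engine takes an audio path and returns a transcript, or None on failure.
Engine = Callable[[str], Optional[str]]


def score_quality(text: str) -> float:
    """Toy score: ratio of distinct words to total words (penalises loops)."""
    words = text.split()
    return len(set(words)) / len(words) if words else 0.0


def transcribe(
    audio_path: str,
    primary: tuple[str, Engine],
    fallbacks: list[tuple[str, Engine]],
    min_quality: float = 0.6,
) -> Optional[SttResult]:
    """Run the primary engine; consult fallbacks on failure or low quality."""
    candidates: list[SttResult] = []
    name, engine = primary
    text = engine(audio_path)
    if text:
        result = SttResult(name, text, score_quality(text))
        if result.quality >= min_quality:
            return result  # good enough, no dual-engine check needed
        candidates.append(result)
    # Pass an empty list here when fallback is disabled by configuration.
    for name, engine in fallbacks:
        text = engine(audio_path)
        if text:
            candidates.append(SttResult(name, text, score_quality(text)))
    # Pick the best-scoring transcript, or report failure with None.
    return max(candidates, key=lambda r: r.quality) if candidates else None
```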

API Contract

Endpoint

  • Method: POST
  • Path: /api/call-analytics
  • Header: x-api-key: YOUR_SECRET_API_KEY
  • Content-Type: application/json

Request Body

{
  "language": "Tamil",
  "audioFormat": "mp3",
  "audioBase64": "<base64-mp3>"
}
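
A client can assemble this request with the standard library. The sketch below only builds the request object; `build_request` is a hypothetical helper name, and the URL and key are supplied by the caller.

```python
import base64
import json
import urllib.request


def build_request(
    url: str, api_key: str, mp3_bytes: bytes, language: str
) -> urllib.request.Request:
    """Build a POST request carrying one Base64-encoded MP3 file."""
    payload = {
        "language": language,
        "audioFormat": "mp3",
        "audioBase64": base64.b64encode(mp3_bytes).decode("ascii"),
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )
```

To send it, pass the result to `urllib.request.urlopen(request, timeout=...)` and parse the response body with `json.load`. A generous timeout is advisable because transcription and LLM analysis run inside the request.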

Successful Response Shape

{
  "status": "success",
  "language": "Tamil",
  "transcript": "...",
  "summary": "...",
  "sop_validation": {
    "greeting": true,
    "identification": false,
    "problemStatement": true,
    "solutionOffering": true,
    "closing": true,
    "complianceScore": 0.8,
    "adherenceStatus": "NOT_FOLLOWED",
    "explanation": "..."
  },
  "analytics": {
    "paymentPreference": "PARTIAL_PAYMENT",
    "rejectionReason": "BUDGET_CONSTRAINTS",
    "sentiment": "Neutral"
  },
  "keywords": ["...", "..."]
}
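
In the example above, four of the five SOP stages are true, which is consistent with a score of 4 / 5 = 0.8. A deterministic recompute along those lines might look like the sketch below. The equal weighting, the `recompute_compliance` name, and the "FOLLOWED" status value are assumptions; only "NOT_FOLLOWED" appears in this document.

```python
SOP_STAGES = (
    "greeting",
    "identification",
    "problemStatement",
    "solutionOffering",
    "closing",
)


def recompute_compliance(sop: dict) -> dict:
    """Derive complianceScore and adherenceStatus from the stage booleans."""
    # Only a literal True counts; missing or non-boolean values count as failed.
    passed = sum(1 for stage in SOP_STAGES if sop.get(stage) is True)
    score = round(passed / len(SOP_STAGES), 2)
    # "FOLLOWED" is an assumed label for the all-stages-passed case.
    status = "FOLLOWED" if passed == len(SOP_STAGES) else "NOT_FOLLOWED"
    return {**sop, "complianceScore": score, "adherenceStatus": status}
```

Recomputing the score from the booleans, rather than trusting a model-generated number, keeps the two fields from contradicting each other.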

Tech Stack

  • Python 3.11+
  • FastAPI + Uvicorn
  • Sarvam AI STT (primary)
  • faster-whisper (local fallback)
  • Reverie STT adapter (optional fallback)
  • Groq Llama for summary/SOP/analytics/keywords
  • SQLite (aiosqlite)
  • ChromaDB vector store

Repository Structure

.
├── main.py
├── config.py
├── requirements.txt
├── Procfile
├── nixpacks.toml
├── .env.example
├── README.md
├── services/
│   ├── transcription.py
│   ├── analysis.py
│   ├── payment_classifier.py
│   └── sop_validator.py
├── database/
│   ├── __init__.py
│   └── db.py
├── vector_store/
│   ├── __init__.py
│   └── store.py
├── models/
│   ├── __init__.py
│   └── schemas.py
├── scripts/
│   ├── test_api.py
│   ├── test_analysis.py
│   ├── test_transcription.py
│   └── benchmark_variants.py
├── templates/
│   └── dashboard.html
└── static/
    └── style.css

Local Setup

  1. Clone repo
git clone <your-repo-url>
cd GUVI_hack
  2. Install dependencies
pip install -r requirements.txt
  3. Create env file
cp .env.example .env
  4. Set required variables in .env
  • API_KEY
  • SARVAM_API_KEY
  • GROQ_API_KEY

Optional:

  • ENABLE_DUAL_ENGINE_STT
  • ENABLE_FASTER_WHISPER
  • ENABLE_REVERIE_FALLBACK
  • REVERIE_STT_URL
  • REVERIE_API_KEY
  5. Run server
uvicorn main:app --host 0.0.0.0 --port 8000
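
For reference, the variables above could be read along these lines. This is a sketch of one common pattern, not the contents of config.py: `env_flag`, `require_env`, and the accepted truthy spellings are assumptions, and the real defaults for the ENABLE_* flags are not stated here.

```python
import os

# Assumed set of spellings treated as "enabled".
TRUE_VALUES = {"1", "true", "yes", "on"}


def env_flag(name: str, default: bool = False) -> bool:
    """Read a boolean feature flag such as ENABLE_FASTER_WHISPER."""
    raw = os.environ.get(name)
    if raw is None or raw.strip() == "":
        return default
    return raw.strip().lower() in TRUE_VALUES


def require_env(name: str) -> str:
    """Read a mandatory variable such as API_KEY, failing fast if unset."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value
```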

Test Commands

Run full contract and pipeline checks:

python scripts/test_api.py

Optional benchmark sweep:

python scripts/benchmark_variants.py

Deployment (Render)

  1. Push repo to GitHub.
  2. In Render, create a new Web Service from your GitHub repository.
  3. Use these service settings:
  • Runtime: Python 3
  • Build Command: pip install -r requirements.txt
  • Start Command: uvicorn main:app --host 0.0.0.0 --port $PORT
  • Health Check Path: /health
  4. Add environment variables from .env.example, at minimum:
  • API_KEY
  • SARVAM_API_KEY
  • GROQ_API_KEY
  5. For faster judge-time responses on free tier, set:
  • ENABLE_STT_FALLBACK=false
  • ENABLE_DUAL_ENGINE_STT=false
  • ENABLE_FASTER_WHISPER=false
  • ENABLE_NLP_CONSISTENCY_CHECK=false
  • ENABLE_REVERIE_FALLBACK=false
  6. Deploy and verify:
  • GET /health returns status ok
  • POST /api/call-analytics returns valid structured JSON

Render Troubleshooting Note

  • If you open /api/call-analytics directly in a browser, you may see: {"detail":"Method Not Allowed"}
  • This is expected behavior. The endpoint is POST-only by contract.
  • In some free-tier cold-start windows, the homepage (/) can briefly show an internal server error while the service is waking up.
  • This does not imply the API contract endpoint is unavailable; confirm service health at /health and test functionality with POST /api/call-analytics.
  • Use a client like Postman, Hoppscotch, or curl with:
    • method: POST
    • header: x-api-key
    • JSON body: language, audioFormat, audioBase64
  • For uptime verification in browser, use /health.

Notes for Evaluators

  • No hardcoded response payloads are used.
  • Responses are generated from provided Base64 audio.
  • Enum values are normalized to allowed categories.
  • Vector indexing is performed per processed call for semantic search evidence.
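
Enum normalization of the kind described above can be sketched as a lookup that tolerates case and spacing differences and falls back to a caller-supplied default. The allowed values come from this README; `normalize_enum` and the choice of defaults are hypothetical.

```python
import re

PAYMENT_PREFERENCES = {"EMI", "FULL_PAYMENT", "PARTIAL_PAYMENT", "DOWN_PAYMENT"}
REJECTION_REASONS = {
    "HIGH_INTEREST",
    "BUDGET_CONSTRAINTS",
    "ALREADY_PAID",
    "NOT_INTERESTED",
    "NONE",
}
SENTIMENTS = {"Positive", "Negative", "Neutral"}


def normalize_enum(raw: object, allowed: set[str], default: str) -> str:
    """Map free-form model output onto one of the allowed values."""
    if not isinstance(raw, str):
        return default
    # "partial payment" and "Partial-Payment" both become "PARTIAL_PAYMENT".
    key = re.sub(r"[\s\-]+", "_", raw.strip()).upper()
    for value in allowed:
        if value.upper() == key:
            return value  # return the canonical spelling, e.g. "Neutral"
    return default
```

For example, `normalize_enum("too expensive", REJECTION_REASONS, "NONE")` returns "NONE" instead of leaking an out-of-contract value into the response.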

AI Tools Used

This project used AI-assisted development tools during implementation and documentation.

  • GitHub Copilot (GPT-5.3-Codex) for code suggestions, debugging support, and README improvements
  • ChatGPT for design discussion, prompt shaping, and architecture clarification

All generated code was reviewed and validated in the repository before use.

Known Limitations

  • Large transcripts can exceed Groq on-demand token limits, in which case the analysis falls back to safe default values.
  • First-request cold starts on free hosting tiers can increase latency.
  • External provider latency (Sarvam STT or LLM APIs) can affect end-to-end response time.
  • Vector embedding model download can slow the first indexing operation on a fresh instance.

About

Call Center Compliance API: Base64 MP3 in, Hinglish/Tanglish transcription + SOP validation + payment/rejection analytics + keyword extraction with vector-search evidence.
