Production-style FastAPI service for the Track 3 hackathon problem.
It accepts one Base64-encoded MP3 call recording per request, runs multi-stage AI analysis, and returns structured compliance and business-intelligence JSON.
Known Deployment Behavior on Free Tiers: On cold starts, the homepage (/) may briefly return an internal server error; use /health for uptime checks and POST /api/call-analytics for functional verification.
This implementation is built to satisfy the Track 3 contract:
- One MP3 file per request via Base64
- Mandatory header authentication using x-api-key
- Multi-stage pipeline: transcription -> NLP analysis -> metric extraction
- Structured response with transcript, summary, sop_validation, analytics, and keywords
- Strict enum handling for payment and rejection categories
- Transcript indexing in vector storage for semantic retrieval evidence
Mapped to the published rubric:
- API availability and reliability
  - FastAPI endpoint with request validation and auth checks
  - Health check endpoint and graceful error payloads
- Transcript and summary quality
  - Sarvam codemix STT for Hinglish/Tanglish
  - Transcript cleanup and quality scoring
  - Engine fallback path for low-quality transcript cases
- SOP validation quality
  - Five mandatory SOP stages: greeting, identification, problemStatement, solutionOffering, closing
  - Deterministic complianceScore recompute from booleans
- Analytics correctness
  - paymentPreference constrained to: EMI, FULL_PAYMENT, PARTIAL_PAYMENT, DOWN_PAYMENT
  - rejectionReason constrained to: HIGH_INTEREST, BUDGET_CONSTRAINTS, ALREADY_PAID, NOT_INTERESTED, NONE
  - sentiment constrained to: Positive, Negative, Neutral
- Keywords and traceability
  - Keyword list extracted from transcript-level context
  - Safe fallback keyword generation when NLP provider is unavailable
- Vector storage evidence
  - Transcripts indexed in ChromaDB for semantic search
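The deterministic complianceScore recompute mentioned in the rubric mapping could look roughly like the sketch below. The function name `recompute_compliance` is hypothetical, and the rule that adherenceStatus is `FOLLOWED` only when all five stages pass is an assumption (this README only shows a `NOT_FOLLOWED` example); the actual logic lives in the service code.

```python
from typing import Mapping

# The five mandatory SOP stages named in the rubric mapping.
SOP_STAGES = (
    "greeting",
    "identification",
    "problemStatement",
    "solutionOffering",
    "closing",
)


def recompute_compliance(flags: Mapping[str, bool]) -> dict:
    """Derive the score from the stage booleans instead of trusting an LLM-supplied number."""
    # Missing or non-boolean values count as "not passed".
    passed = sum(1 for stage in SOP_STAGES if flags.get(stage) is True)
    score = round(passed / len(SOP_STAGES), 2)
    # Assumption: FOLLOWED only when every stage passed.
    status = "FOLLOWED" if passed == len(SOP_STAGES) else "NOT_FOLLOWED"
    return {"complianceScore": score, "adherenceStatus": status}
```

With four of five stages true, as in the sample response in this README, the sketch yields a score of 0.8 and `NOT_FOLLOWED`.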
flowchart LR
A[Client or Evaluator] --> B[POST /api/call-analytics]
B --> C[Check x-api-key]
C -->|Invalid| Z1[401 Unauthorized]
C -->|Valid| D[Validate request body]
D -->|Invalid base64 or tiny audio| Z2[400 Bad Request]
D -->|Valid| E[Decode Base64 MP3]
E --> F[Write temp .mp3 file]
F --> G[Transcription pipeline]
G -->|Transcription failed| H1[Return structured error payload]
G -->|Transcript ready| I[NLP analysis pipeline]
I --> J[Build response fields]
J --> K[Insert row in SQLite]
J --> L[Index transcript in Chroma vector store]
K --> M[Return JSON response]
L --> M
subgraph Response Shape
M1[status]
M2[language]
M3[transcript]
M4[summary]
M5[sop_validation]
M6[analytics]
M7[keywords]
end
M --> M1
M --> M2
M --> M3
M --> M4
M --> M5
M --> M6
M --> M7
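The "Validate request body" and "Decode Base64 MP3" steps in the flow above can be sketched with the standard library alone. This is an illustration, not the service's actual code: `decode_audio` and the `MIN_AUDIO_BYTES` threshold are hypothetical, and the real endpoint presumably maps such failures to the 400 response shown in the diagram.

```python
import base64
import binascii

# Hypothetical threshold; the real "tiny audio" cutoff is not stated in this README.
MIN_AUDIO_BYTES = 1024


def decode_audio(audio_base64: str, min_bytes: int = MIN_AUDIO_BYTES) -> bytes:
    """Decode the audioBase64 field, rejecting malformed or suspiciously small payloads."""
    try:
        # validate=True rejects characters outside the Base64 alphabet
        # instead of silently discarding them.
        raw = base64.b64decode(audio_base64, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise ValueError("audioBase64 is not valid Base64") from exc
    if len(raw) < min_bytes:
        raise ValueError("decoded audio is too small to be a real recording")
    return raw
```

A caller would catch `ValueError` and translate it into a client error rather than letting it surface as a 500.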
flowchart TD
A[Temp audio file + language] --> B[Run Sarvam codemix STT]
B --> C{Sarvam success?}
C -->|No| F1{Fallback enabled?}
F1 -->|No| Z1[Transcription failure]
F1 -->|Yes| F2[Try fallback candidates]
F2 --> W[faster-whisper if enabled]
F2 --> R[Reverie if enabled and configured]
W --> P[Quality score each result]
R --> P
P --> Q{Any fallback success?}
Q -->|No| Z1
Q -->|Yes| S[Pick best-quality transcript]
S --> T[Post-process transcript]
T --> U[Return success result]
C -->|Yes| D[Post-process Sarvam transcript]
D --> E[Compute quality score]
E --> G{Dual-engine check needed?}
G -->|No| U
G -->|Yes, low quality| F2
T --> V[Squash long repetitions]
T --> X[Entity correction]
T --> Y[Quality warnings + metadata]
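The selection logic in the transcription flow above can be approximated as follows. Everything here is a sketch under assumptions: the engines are passed in as plain callables, `score` stands in for the project's quality scorer, `min_quality` is an invented threshold, and keeping a low-quality primary transcript in the candidate pool is a simplification the diagram does not confirm.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical engine signature: audio path in, transcript out, None on failure.
Engine = Callable[[str], Optional[str]]


def transcribe_with_fallback(
    audio_path: str,
    primary: Engine,
    fallbacks: Sequence[Engine],
    score: Callable[[str], float],
    min_quality: float = 0.6,
    fallback_enabled: bool = True,
) -> Optional[str]:
    """Run the primary engine, then fall back only on failure or low quality."""
    candidates: List[str] = []
    first = primary(audio_path)
    if first:
        # A good-enough primary transcript short-circuits the fallback path.
        if score(first) >= min_quality or not fallback_enabled:
            return first
        candidates.append(first)  # assumption: still eligible to win
    elif not fallback_enabled:
        return None  # primary failed and no fallback is allowed
    for engine in fallbacks:
        text = engine(audio_path)
        if text:
            candidates.append(text)
    # Pick the best-scoring transcript, or report failure if nothing succeeded.
    return max(candidates, key=score) if candidates else None
```

Post-processing (repetition squashing, entity correction, warnings) would then run on whichever transcript is returned.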
- Method: POST
- Path: /api/call-analytics
- Header: x-api-key: YOUR_SECRET_API_KEY
- Content-Type: application/json
Request body:
{
  "language": "Tamil",
  "audioFormat": "mp3",
  "audioBase64": "<base64-mp3>"
}

Response body:
{
  "status": "success",
  "language": "Tamil",
  "transcript": "...",
  "summary": "...",
  "sop_validation": {
    "greeting": true,
    "identification": false,
    "problemStatement": true,
    "solutionOffering": true,
    "closing": true,
    "complianceScore": 0.8,
    "adherenceStatus": "NOT_FOLLOWED",
    "explanation": "..."
  },
  "analytics": {
    "paymentPreference": "PARTIAL_PAYMENT",
    "rejectionReason": "BUDGET_CONSTRAINTS",
    "sentiment": "Neutral"
  },
  "keywords": ["...", "..."]
}

- Python 3.11+
- FastAPI + Uvicorn
- Sarvam AI STT (primary)
- faster-whisper (local fallback)
- Reverie STT adapter (optional fallback)
- Groq Llama for summary/SOP/analytics/keywords
- SQLite (aiosqlite)
- ChromaDB vector store
.
├── main.py
├── config.py
├── requirements.txt
├── Procfile
├── nixpacks.toml
├── .env.example
├── README.md
├── services/
│ ├── transcription.py
│ ├── analysis.py
│ ├── payment_classifier.py
│ └── sop_validator.py
├── database/
│ ├── __init__.py
│ └── db.py
├── vector_store/
│ ├── __init__.py
│ └── store.py
├── models/
│ ├── __init__.py
│ └── schemas.py
├── scripts/
│ ├── test_api.py
│ ├── test_analysis.py
│ ├── test_transcription.py
│ └── benchmark_variants.py
├── templates/
│ └── dashboard.html
└── static/
  └── style.css
- Clone repo
    git clone <your-repo-url>
    cd GUVI_hack
- Install dependencies
    pip install -r requirements.txt
- Create env file
    cp .env.example .env
- Set required variables in .env
  - API_KEY
  - SARVAM_API_KEY
  - GROQ_API_KEY
  Optional:
  - ENABLE_DUAL_ENGINE_STT
  - ENABLE_FASTER_WHISPER
  - ENABLE_REVERIE_FALLBACK
  - REVERIE_STT_URL
  - REVERIE_API_KEY
- Run server
    uvicorn main:app --host 0.0.0.0 --port 8000

Run full contract and pipeline checks:
    python scripts/test_api.py

Optional benchmark sweep:
    python scripts/benchmark_variants.py

- Push repo to GitHub.
- In Render, create a new Web Service from your GitHub repository.
- Use these service settings:
  - Runtime: Python 3
  - Build Command: pip install -r requirements.txt
  - Start Command: uvicorn main:app --host 0.0.0.0 --port $PORT
  - Health Check Path: /health
- Add environment variables from .env.example, at minimum:
  - API_KEY
  - SARVAM_API_KEY
  - GROQ_API_KEY
- For faster judge-time responses on free tier, set:
  - ENABLE_STT_FALLBACK=false
  - ENABLE_DUAL_ENGINE_STT=false
  - ENABLE_FASTER_WHISPER=false
  - ENABLE_NLP_CONSISTENCY_CHECK=false
  - ENABLE_REVERIE_FALLBACK=false
- Deploy and verify:
  - GET /health returns status ok
  - POST /api/call-analytics returns valid structured JSON
- If you open /api/call-analytics directly in a browser, you may see: {"detail":"Method Not Allowed"}
- This is expected behavior. The endpoint is POST-only by contract.
- In some free-tier cold-start windows, the homepage (/) can briefly show an internal server error while the service is waking up.
- This does not imply the API contract endpoint is unavailable; confirm service health at /health and test functionality with POST /api/call-analytics.
- Use a client like Postman, Hoppscotch, or curl with:
  - method: POST
  - header: x-api-key
  - JSON body: language, audioFormat, audioBase64
- For uptime verification in browser, use /health.
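For a scripted check instead of Postman or curl, a request matching the contract above can be assembled with the Python standard library. This is a minimal sketch: `build_request` is a hypothetical helper, and the base URL and API key are placeholders you would replace with your own deployment values.

```python
import base64
import json
import urllib.request


def build_request(base_url: str, api_key: str, audio_bytes: bytes,
                  language: str = "Tamil") -> urllib.request.Request:
    """Assemble the POST request described by the API contract (nothing is sent here)."""
    payload = {
        "language": language,
        "audioFormat": "mp3",
        "audioBase64": base64.b64encode(audio_bytes).decode("ascii"),
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/call-analytics",
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


# To actually send it (a generous timeout helps with free-tier cold starts):
#   with open("call.mp3", "rb") as f:
#       req = build_request("http://localhost:8000", "YOUR_SECRET_API_KEY", f.read())
#   with urllib.request.urlopen(req, timeout=120) as resp:
#       print(json.load(resp))
```

Building the request separately from sending it keeps the payload shape easy to inspect before any network call is made.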
- No hardcoded response payloads are used.
- Responses are generated from provided Base64 audio.
- Enum values are normalized to allowed categories.
- Vector indexing is performed per processed call for semantic search evidence.
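Enum normalization of the kind described above might be sketched as follows. The allowed values come from this README; the helper name `normalize_enum`, the matching rules, and the choice of default for unrecognized input are assumptions for illustration only.

```python
from typing import Iterable, Optional

# Allowed categories as listed in this README.
PAYMENT_PREFERENCES = ("EMI", "FULL_PAYMENT", "PARTIAL_PAYMENT", "DOWN_PAYMENT")
REJECTION_REASONS = ("HIGH_INTEREST", "BUDGET_CONSTRAINTS", "ALREADY_PAID",
                     "NOT_INTERESTED", "NONE")
SENTIMENTS = ("Positive", "Negative", "Neutral")


def normalize_enum(value: Optional[str], allowed: Iterable[str], default: str) -> str:
    """Map loosely formatted model output onto an allowed value, else return the default."""
    if not isinstance(value, str):
        return default
    # "partial payment", "Partial-Payment" and " PARTIAL_PAYMENT " all collapse
    # to the same lookup key.
    key = "_".join(value.replace("-", " ").split()).upper()
    lookup = {item.upper(): item for item in allowed}
    return lookup.get(key, default)
```

For example, `normalize_enum("partial payment", PAYMENT_PREFERENCES, "EMI")` resolves to `PARTIAL_PAYMENT`, while free text that matches no category falls back to the supplied default.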
This project used AI-assisted development tools during implementation and documentation.
- GitHub Copilot (GPT-5.3-Codex) for code suggestions, debugging support, and README improvements
- ChatGPT for design discussion, prompt shaping, and architecture clarification
All generated code was reviewed and validated in the repository before use.
- Large transcripts can exceed Groq on-demand token limits and trigger fallback-safe analysis defaults.
- First-request cold starts on free hosting tiers can increase latency.
- External provider latency (Sarvam STT or LLM APIs) can affect end-to-end response time.
- Vector embedding model download can slow the first indexing operation on a fresh instance.

