Problem
A pattern of abuse has been identified where users on unlimited plans use Omi exclusively to transcribe pre-recorded content (audiobooks, podcasts, TV audio) 24/7. The top speech consumer forwards ~55x the median user's daily speech to the transcription service. Among the top 10 consumers by speech hours, 6 are flagged for similar non-conversational usage patterns (e.g., single 8+ hour sessions transcribing published books).
This usage generates transcription costs that far exceed subscription revenue per user — the top abuser's daily transcription cost alone is ~3.5x their daily subscription value.
Usage Distribution (observed)
| Percentile | Daily speech (VAD-measured) |
| --- | --- |
| Median | 12.7 min |
| P95 | 2.1 hours |
| P99 | 3.4 hours |
| Top abuser | 11.7 hours |
The intended use case is personal conversations and meetings. Audiobook/podcast transcription is not the product's purpose.
Proposed: 3-Part Fair-Use System
Part 1 — Soft cap detection (speech hours, NOT connected time)
Detect when a user exceeds rolling speech-hour thresholds. Speech hours = VAD-measured real speech (speech_ratio × session_duration), not raw connection time.
| Window | Soft cap |
| --- | --- |
| Daily | 2h real speech |
| 3-day rolling | 8h |
| Weekly rolling | 10h |
These are detection triggers, not hard blocks. With P95 daily speech at 2.1 hours, the 2h/day cap catches only about the top 5% of users.
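The soft cap check can be sketched as follows. This is a minimal sketch, not the actual implementation: the `Session` record and its field names are hypothetical, and the "Daily" window is assumed to mean the trailing 24 hours rather than a calendar day.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Soft caps from the table above, in hours of VAD-measured speech.
# Assumption: every window, including "daily", is a trailing window ending at `now`.
SOFT_CAPS = {
    "daily": (timedelta(days=1), 2.0),
    "3-day rolling": (timedelta(days=3), 8.0),
    "weekly rolling": (timedelta(days=7), 10.0),
}


@dataclass
class Session:
    # Hypothetical record shape; the field names are assumptions.
    ended_at: datetime
    duration_hours: float  # raw connected time
    speech_ratio: float    # fraction of the session that VAD classified as speech


def speech_hours(session: Session) -> float:
    """Speech hours = speech_ratio x session_duration, not connected time."""
    return session.speech_ratio * session.duration_hours


def exceeded_soft_caps(sessions: list[Session], now: datetime) -> list[str]:
    """Return the names of all windows whose soft cap is exceeded at `now`."""
    hit = []
    for name, (window, cap_hours) in SOFT_CAPS.items():
        total = sum(
            speech_hours(s) for s in sessions if now - window < s.ended_at <= now
        )
        if total > cap_hours:
            hit.append(name)
    return hit
```

A 12-hour connection with a `speech_ratio` of 0.1 counts as 1.2 speech hours and triggers nothing, which is the point of measuring speech rather than connected time.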
Part 2 — LLM-based purpose detection (triggered on soft cap hit)
When any soft cap is hit, run LLM analysis on the user's recent conversations to classify usage purpose.
Inputs: Structured conversation data (title, overview, category) from recent sessions.
Detect wrong-purpose usage:
- Audiobook transcription
- Podcast transcription
- Pre-recorded content (TV, movies, lectures from media)
- Any non-personal-conversation content
Do NOT flag:
- Legitimate high usage (back-to-back meetings, all-day conference, long brainstorm)
- Personal conversations of any length
- Live lectures or classes the user is attending in person
Output: abuse_score (0–1), abuse_type (enum), evidence (which conversations triggered, why)
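One possible shape for this output, with validation of the LLM's JSON reply, is sketched below. The `AbuseType` members are assumptions derived from the categories listed above, and the `conversation_id` and `reason` evidence fields are hypothetical names; the issue does not define the enum values or the evidence format.

```python
import json
from dataclasses import dataclass
from enum import Enum


class AbuseType(str, Enum):
    # Assumed enum values, derived from the "wrong-purpose" categories above.
    NONE = "none"
    AUDIOBOOK = "audiobook"
    PODCAST = "podcast"
    PRERECORDED_MEDIA = "prerecorded_media"
    OTHER_NON_CONVERSATION = "other_non_conversation"


@dataclass
class Evidence:
    conversation_id: str  # hypothetical field: which conversation triggered
    reason: str           # hypothetical field: why it was flagged


@dataclass
class PurposeVerdict:
    abuse_score: float
    abuse_type: AbuseType
    evidence: list[Evidence]


def parse_verdict(raw: str) -> PurposeVerdict:
    """Parse and validate the classifier's JSON reply; raise ValueError if malformed."""
    data = json.loads(raw)
    score = float(data["abuse_score"])
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"abuse_score out of range: {score}")
    return PurposeVerdict(
        abuse_score=score,
        abuse_type=AbuseType(data["abuse_type"]),  # ValueError on an unknown type
        evidence=[
            Evidence(e["conversation_id"], e["reason"])
            for e in data.get("evidence", [])
        ],
    )
```

Rejecting out-of-range scores and unknown types at parse time keeps a malformed LLM reply from silently feeding the enforcement stage.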
Part 3 — Graduated enforcement
| Stage | Trigger | Action |
| --- | --- | --- |
| Warning | First soft cap hit + abuse_score > 0.7 | In-app notification explaining fair use policy |
| Throttle | Second violation within 7 days | Increase VAD silence threshold (reduce speech forwarded to transcription), reduce transcription to standard tier |
| Restrict | Third violation within 30 days | Hard cap at soft cap limits until next billing cycle, require manual review for continued access |
Enforcement must not degrade experience for legitimate users. Only users who (a) exceed soft caps AND (b) score high on abuse detection should be affected.
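The escalation logic could look like the sketch below. It rests on two interpretations that the issue does not spell out: a "violation" is taken to mean a soft cap hit combined with abuse_score > 0.7, and the 7-day and 30-day windows are counted backwards from the current violation. The function and type names are hypothetical.

```python
from datetime import datetime, timedelta
from enum import Enum

ABUSE_THRESHOLD = 0.7  # from the Warning trigger: abuse_score > 0.7


class Stage(Enum):
    NONE = "none"
    WARNING = "warning"
    THROTTLE = "throttle"
    RESTRICT = "restrict"


def next_stage(
    prior_violations: list[datetime],
    now: datetime,
    soft_cap_hit: bool,
    abuse_score: float,
) -> Stage:
    """Pick the enforcement stage for the current event.

    Assumption: only a soft cap hit AND a high abuse score counts as a violation,
    so legitimate heavy users never enter the ladder.
    """
    if not (soft_cap_hit and abuse_score > ABUSE_THRESHOLD):
        return Stage.NONE

    in_30d = sum(1 for t in prior_violations if now - t <= timedelta(days=30))
    in_7d = sum(1 for t in prior_violations if now - t <= timedelta(days=7))

    if in_30d >= 2:   # this is the third violation within 30 days
        return Stage.RESTRICT
    if in_7d >= 1:    # this is the second violation within 7 days
        return Stage.THROTTLE
    return Stage.WARNING
```

Under this reading, a user whose only prior violation is more than 7 days old receives another warning rather than a throttle.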
Acceptance Criteria
Related
- VAD gate metrics already track speech_ratio per session — can be aggregated for soft cap detection
- DG usage API provides per-user hourly consumption data for validation
by AI for @beastoin