Fair-use anti-abuse: soft caps + LLM purpose detection + graduated enforcement #5746

@beastoin

Problem

Some users on unlimited plans use Omi exclusively to transcribe pre-recorded content (audiobooks, podcasts, TV audio) around the clock. The top speech consumer forwards ~55x the median user's daily speech to the transcription service. Of the top 10 consumers by speech hours, 6 are flagged for similar non-conversational usage patterns (e.g., single 8+ hour sessions transcribing published books).

This usage generates transcription costs that far exceed subscription revenue per user: the top abuser's daily transcription cost alone is ~3.5x their daily subscription value.

Usage Distribution (observed)

| Percentile | Daily speech (VAD-measured) |
|---|---|
| Median | 12.7 min |
| P95 | 2.1 hours |
| P99 | 3.4 hours |
| Top abuser | 11.7 hours |

The intended use case is personal conversations and meetings. Audiobook/podcast transcription is not the product's purpose.

Proposed: 3-Part Fair-Use System

Part 1 — Soft cap detection (speech hours, NOT connected time)

Detect when a user exceeds rolling speech-hour thresholds. Speech hours = VAD-measured real speech (speech_ratio × session_duration), not raw connection time.

| Window | Soft cap |
|---|---|
| Daily | 2h real speech |
| 3-day rolling | 8h |
| Weekly rolling | 10h |

These are detection triggers, not hard blocks. The 2h/day cap catches only roughly the top 5% of users (P95 daily speech is 2.1 hours).
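A minimal sketch of how the soft-cap check could work, assuming per-session VAD metrics are available as simple records. The `Session` shape, the `exceeded_caps` helper, attributing a session to a window by its end time, and treating the daily cap as a rolling 24 hours are all assumptions made for illustration, not the existing backend schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Soft caps from the proposal: window name -> (window length, cap in speech hours).
SOFT_CAPS = {
    "daily": (timedelta(days=1), 2.0),
    "3-day": (timedelta(days=3), 8.0),
    "weekly": (timedelta(days=7), 10.0),
}


@dataclass
class Session:  # hypothetical record shape, not the real schema
    ended_at: datetime
    duration_hours: float  # raw connected time
    speech_ratio: float    # VAD-measured fraction of the session that is speech


def speech_hours(session: Session) -> float:
    """Real speech = speech_ratio x session_duration."""
    return session.speech_ratio * session.duration_hours


def exceeded_caps(sessions: list[Session], now: datetime) -> list[str]:
    """Return the name of every rolling window whose soft cap is exceeded."""
    hit = []
    for name, (window, cap) in SOFT_CAPS.items():
        # Assumption: a session counts toward a window if it ended inside it.
        total = sum(
            speech_hours(s) for s in sessions if now - window < s.ended_at <= now
        )
        if total > cap:
            hit.append(name)
    return hit
```

Because the total is built from `speech_ratio`, a device left connected for 12 hours in a quiet room contributes very little, which is the point of measuring speech hours rather than connected time.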

Part 2 — LLM-based purpose detection (triggered on soft cap hit)

When any soft cap is hit, run LLM analysis on the user's recent conversations to classify usage purpose.

Inputs: Structured conversation data (title, overview, category) from recent sessions.

Detect wrong-purpose usage:

  • Audiobook transcription
  • Podcast transcription
  • Pre-recorded content (TV, movies, lectures from media)
  • Any non-personal-conversation content

Do NOT flag:

  • Legitimate high usage (back-to-back meetings, all-day conference, long brainstorm)
  • Personal conversations of any length
  • Live lectures or classes the user is attending in person

Output: abuse_score (0–1), abuse_type (enum), evidence (which conversations triggered, why)
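The classifier output could be validated against a small schema before it is stored. The sketch below is illustrative only: the `AbuseType` values are derived from the categories listed above, and `Evidence`, `AbuseVerdict`, and `parse_verdict` are hypothetical names. It assumes the LLM response has already been decoded from JSON into a dict.

```python
from dataclasses import dataclass, field
from enum import Enum


class AbuseType(Enum):  # hypothetical values, derived from the categories above
    NONE = "none"
    AUDIOBOOK = "audiobook"
    PODCAST = "podcast"
    PRERECORDED_MEDIA = "prerecorded_media"
    OTHER_NON_CONVERSATION = "other_non_conversation"


@dataclass
class Evidence:
    conversation_id: str  # which conversation triggered the verdict
    reason: str           # why it was considered wrong-purpose usage


@dataclass
class AbuseVerdict:
    abuse_score: float
    abuse_type: AbuseType
    evidence: list[Evidence] = field(default_factory=list)

    def __post_init__(self):
        # Reject scores outside the documented 0-1 range.
        if not 0.0 <= self.abuse_score <= 1.0:
            raise ValueError("abuse_score must be in [0, 1]")


def parse_verdict(raw: dict) -> AbuseVerdict:
    """Validate a classifier response that was already decoded from JSON."""
    return AbuseVerdict(
        abuse_score=float(raw["abuse_score"]),
        abuse_type=AbuseType(raw["abuse_type"]),  # raises ValueError if unknown
        evidence=[Evidence(**e) for e in raw.get("evidence", [])],
    )
```

Failing loudly on an out-of-range score or an unknown type keeps malformed model output from silently becoming an enforcement decision.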

Part 3 — Graduated enforcement

| Stage | Trigger | Action |
|---|---|---|
| Warning | First soft cap hit + abuse_score > 0.7 | In-app notification explaining fair use policy |
| Throttle | Second violation within 7 days | Increase VAD silence threshold (reduce speech forwarded to transcription), reduce transcription to standard tier |
| Restrict | Third violation within 30 days | Hard cap at soft cap limits until next billing cycle, require manual review for continued access |

Enforcement must not degrade experience for legitimate users. Only users who (a) exceed soft caps AND (b) score high on abuse detection should be affected.
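One way the stage selection could be expressed is sketched below. The proposal does not say how the 7-day and 30-day windows are anchored, so this sketch assumes they are trailing windows counted back from the current violation, and that a violation means a soft cap hit combined with abuse_score > 0.7. `Stage` and `enforcement_stage` are hypothetical names.

```python
from datetime import datetime, timedelta
from enum import Enum

ABUSE_SCORE_THRESHOLD = 0.7  # from the Warning trigger in the proposal


class Stage(Enum):
    NONE = "none"
    WARNING = "warning"
    THROTTLE = "throttle"
    RESTRICT = "restrict"


def enforcement_stage(
    cap_exceeded: bool,
    abuse_score: float,
    prior_violations: list[datetime],
    now: datetime,
) -> Stage:
    """Pick the enforcement stage for the current soft-cap event."""
    # Both conditions must hold, so legitimate heavy users are never affected.
    if not cap_exceeded or abuse_score <= ABUSE_SCORE_THRESHOLD:
        return Stage.NONE

    def count_within(days: int) -> int:
        window = timedelta(days=days)
        return sum(1 for t in prior_violations if timedelta(0) <= now - t <= window)

    # Assumption: trailing windows that end at the current violation.
    if count_within(30) >= 2:  # the current event is the third within 30 days
        return Stage.RESTRICT
    if count_within(7) >= 1:   # the current event is the second within 7 days
        return Stage.THROTTLE
    return Stage.WARNING
```

Under these assumptions a single prior violation older than 7 days results in another warning rather than a throttle; whether that is the desired behavior is a policy decision the issue leaves open.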

Acceptance Criteria

  • Backend tracks per-user rolling speech hours (daily, 3-day, weekly) from VAD metrics
  • Soft cap detection fires event when any threshold is exceeded
  • LLM purpose classifier analyzes recent conversations on cap hit
  • Abuse verdicts are stored per-user with evidence trail
  • Warning notification delivered in-app on first offense
  • Throttle mechanism reduces transcription forwarding on second offense
  • Restrict mechanism enforces hard cap on third offense
  • Legitimate high-usage users are not affected (false positive rate < 1%)
  • Dashboard or admin view for reviewing flagged users

Related

  • VAD gate metrics already track speech_ratio per session — can be aggregated for soft cap detection
  • DG usage API provides per-user hourly consumption data for validation

by AI for @beastoin

Labels: backend (Backend Task, python), enhancement (New feature or request), maintainer (Lane: High-risk, cross-system changes), p1 (Priority: Critical, score 22-29)
