
Revert "Sync: fair-use tracking with lock-on-exhaustion and soft cap gates" #5875

Merged
beastoin merged 1 commit into main from revert-5863-fix/sync-fair-use-gate-5854
Mar 21, 2026

Conversation

@beastoin
Collaborator

Reverts #5863

@beastoin beastoin merged commit 7b9e004 into main Mar 21, 2026
3 checks passed
@beastoin beastoin deleted the revert-5863-fix/sync-fair-use-gate-5854 branch March 21, 2026 09:08
@greptile-apps
Contributor

greptile-apps bot commented Mar 21, 2026

Greptile Summary

This PR reverts #5863 ("Sync: fair-use tracking with lock-on-exhaustion and soft cap gates"), removing the fair-use enforcement gates that were added to the /v1/sync-local-files endpoint, while simultaneously restoring the batched DG-usage-tracking approach originally introduced in #5854 to transcribe.py.

Key changes:

  • routers/sync.py: Hard-restriction pre-check (is_hard_restricted), credit-exhaustion lock (should_lock/is_locked), VAD speech-duration accumulation, soft-cap checks, and the record_usage call are all removed. The sync path now processes uploads without any fair-use gating.
  • models/conversation.py: is_locked is removed from CreateConversation (the input DTO); the persisted Conversation model retains the field and defaults it to False, so there is no data-layer breakage.
  • routers/transcribe.py: Per-chunk record_dg_usage_ms calls are replaced with a local accumulator (dg_usage_ms_pending) that is flushed every 60 s by the periodic loop and at session end in the finally block, restoring the batching optimisation from #5854 ("Sync API: add subscription gate and fair-use tracking to prevent free-rider abuse").
  • utils/fair_use.py: source parameter removed from record_speech_ms (was log-only, no Redis-key effect).
  • Tests/test.sh: Two test files covering the reverted sync fair-use logic are deleted, and references in test.sh are cleaned up accordingly.
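
The batching described for routers/transcribe.py can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the UsageAccumulator class, its method names, and the injected record callback are hypothetical stand-ins for dg_usage_ms_pending and record_dg_usage_ms, and the clock is passed in explicitly so the behaviour is easy to check.

```python
from typing import Callable

FLUSH_INTERVAL_S = 60.0  # the summary states a 60 s periodic flush


class UsageAccumulator:
    """Hypothetical stand-in for the local dg_usage_ms_pending counter."""

    def __init__(self, record: Callable[[int], None]) -> None:
        self._record = record      # assumed to play the role of record_dg_usage_ms
        self.pending_ms = 0        # usage accumulated since the last flush
        self._last_flush = 0.0     # time of the last periodic flush

    def add(self, chunk_ms: int) -> None:
        # Per-chunk work is a local addition only, with no remote write.
        self.pending_ms += chunk_ms

    def maybe_flush(self, now: float) -> bool:
        # Periodic path: write at most once per interval.
        if now - self._last_flush < FLUSH_INTERVAL_S:
            return False
        self.flush()
        self._last_flush = now
        return True

    def flush(self) -> None:
        # Session-end path: write whatever is still pending, if anything.
        if self.pending_ms > 0:
            self._record(self.pending_ms)
            self.pending_ms = 0
```

The point of the pattern is that many audio chunks collapse into one write per interval, at the cost of needing an unconditional final flush so the tail of a session is not lost.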

Notable concern: The session-end flush of dg_usage_ms_pending in transcribe.py is guarded by if not use_custom_stt, but the Soniox and Speechmatics code paths also accumulate into dg_usage_ms_pending. For custom-STT sessions that end before the 60 s periodic flush fires, that accumulated usage is silently dropped. The periodic flush is intentionally placed before the use_custom_stt guard (with an explicit comment), but the session-end flush does not mirror that arrangement.

Confidence Score: 4/5

  • Safe to merge; the revert is clean and the only non-trivial concern is a pre-existing batching edge case in transcribe.py that affects DG usage accounting for short custom-STT sessions.
  • The sync revert is straightforward and well-scoped. The transcribe.py batching restoration is a known pattern from #5854 ("Sync API: add subscription gate and fair-use tracking to prevent free-rider abuse") with an explicit design intent. The one real issue, dg_usage_ms_pending not being flushed at session end for use_custom_stt sessions shorter than 60 s, affects fair-use accounting accuracy but not user-facing functionality or data integrity. It warrants a follow-up fix but is not a blocker.
  • backend/routers/transcribe.py — session-end flush of dg_usage_ms_pending is skipped for use_custom_stt=True paths.

Important Files Changed

Filename | Overview
--- | ---
backend/routers/sync.py | Reverts fair-use gates from sync_local_files: removes hard-restriction pre-check, should_lock/is_locked propagation, speech-duration VAD accumulation, soft-cap checks, and the record_usage call. Removes corresponding imports. Clean revert with no residual references.
backend/routers/transcribe.py | Restores the #5854 batching approach for DG budget tracking: accumulates dg_usage_ms_pending locally and flushes every 60 s (periodic loop) or at session end (finally block). Session-end flush is inside if not use_custom_stt, so Soniox/Speechmatics sessions shorter than 60 s will silently drop their accumulated usage. Also, two non-Deepgram code paths carry a misleading "DG usage" comment.
backend/models/conversation.py | Removes is_locked: bool = False from CreateConversation (the input DTO). The persisted Conversation model still retains is_locked and defaults to False, so existing data and all other code paths (routers, DB queries) remain unaffected.
backend/utils/fair_use.py | Removes the source parameter from record_speech_ms (it was for log traceability only and had no effect on Redis keys). Safe change; callers in transcribe.py already call without the parameter.
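
As a rough illustration of why dropping is_locked from the input DTO does not break persisted data, the sketch below uses plain dataclasses as stand-ins for the real models. Only the class names and the is_locked default come from the summary above; the id and transcript fields and the persist helper are hypothetical.

```python
from dataclasses import asdict, dataclass


@dataclass
class CreateConversation:
    """Input DTO after the revert: no is_locked field."""
    transcript: str  # hypothetical field, for illustration only


@dataclass
class Conversation:
    """Persisted model: still carries is_locked, defaulting to False."""
    id: str          # hypothetical field
    transcript: str  # hypothetical field
    is_locked: bool = False


def persist(conversation_id: str, dto: CreateConversation) -> Conversation:
    # The DTO no longer supplies is_locked, so the model default applies.
    return Conversation(id=conversation_id, **asdict(dto))
```

Under these assumptions, every conversation created through the DTO ends up unlocked, while records that already store is_locked keep their value.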

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["POST /v1/sync-local-files"] --> B["Decode & VAD segment files"]
    B --> C["process_segment per path"]
    C --> D{"Closest existing\nconversation?"}
    D -- Yes --> E["Update conversation\nsegments & timestamps"]
    D -- No --> F["Create new conversation\nvia process_conversation"]
    E --> G["Return updated/new\nmemory IDs"]
    F --> G

    subgraph "REVERTED (was in #5863)"
        R1["is_hard_restricted check → 429"]
        R2["should_lock = not has_transcription_credits"]
        R3["record_speech_ms(uid, speech_ms, source='sync')"]
        R4["check_soft_caps + trigger_classifier"]
        R5["is_locked=True on CreateConversation"]
        R6["record_usage(uid, transcription_seconds=...)"]
    end

    subgraph "transcribe.py — restored batching (#5854)"
        T1["Audio chunk received"]
        T1 --> T2["dg_usage_ms_pending += chunk_ms\n(DG / Soniox / Speechmatics)"]
        T2 --> T3{"60 s timer fires?"}
        T3 -- Yes --> T4["record_dg_usage_ms(uid, pending)\npending = 0"]
        T3 -- No --> T5{"Session end\n(finally block)?"}
        T5 -- "use_custom_stt=False" --> T4
        T5 -- "use_custom_stt=True\n&& session < 60s" --> T6["⚠️ usage DROPPED"]
    end

Comments Outside Diff (1)

  1. backend/routers/transcribe.py, line 2217

    P2 Misleading "DG" comment in non-Deepgram paths

    The comment # Accumulate DG usage locally, flushed every 60s (#5854) appears identically in the Soniox (line ~2340) and Speechmatics (line ~2360) paths, where the audio is being sent to a non-Deepgram provider. The acronym "DG" specifically refers to Deepgram, so this label is misleading for the Soniox/Speechmatics paths.

    Consider updating the comment to be provider-agnostic, for example:

Last reviewed commit: "Revert "Sync: fair-u..."

Comment on lines +2673 to +2676
# Flush pending DG usage accumulator (#5854)
if FAIR_USE_ENABLED and FAIR_USE_RESTRICT_DAILY_DG_MS > 0 and dg_usage_ms_pending > 0:
    record_dg_usage_ms(uid, dg_usage_ms_pending)
    dg_usage_ms_pending = 0

P2 dg_usage_ms_pending not flushed for use_custom_stt sessions shorter than 60 s

The session-end flush of dg_usage_ms_pending is nested inside if not use_custom_stt and last_usage_record_timestamp:. However, the Soniox and Speechmatics code paths in flush_stt_buffer both accumulate into dg_usage_ms_pending regardless of whether use_custom_stt is True.

This means:

  • For sessions using Soniox or Speechmatics where use_custom_stt=True and the session ends before the 60 s periodic flush fires, any accumulated dg_usage_ms_pending is silently discarded and never written to Redis.
  • The periodic flush (line ~438) is intentionally placed before the use_custom_stt guard with a comment # Placed before use_custom_stt guard so all STT paths get flushed — the session-end flush is missing the same treatment.

A straightforward fix is to move the dg_usage_ms_pending flush outside the if not use_custom_stt block, mirroring the periodic flush's intent:

# Flush pending DG/STT usage accumulator (#5854) — covers all STT paths
if FAIR_USE_ENABLED and FAIR_USE_RESTRICT_DAILY_DG_MS > 0 and dg_usage_ms_pending > 0:
    record_dg_usage_ms(uid, dg_usage_ms_pending)
    dg_usage_ms_pending = 0

if not use_custom_stt and last_usage_record_timestamp:
    ...
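
To make the difference between the two arrangements concrete, here is a small self-contained simulation. It is a sketch under simplifying assumptions: end_session and its flush_inside_guard switch are hypothetical, the FAIR_USE_ENABLED and FAIR_USE_RESTRICT_DAILY_DG_MS checks are treated as already satisfied, and the last_usage_record_timestamp condition is omitted.

```python
from typing import Callable


def end_session(
    pending_ms: int,
    use_custom_stt: bool,
    record: Callable[[int], None],
    flush_inside_guard: bool,
) -> int:
    """Simulate the session-end flush and return the milliseconds written."""
    written = 0
    if flush_inside_guard:
        # Arrangement flagged in the review: flush nested under the guard.
        if not use_custom_stt:
            if pending_ms > 0:
                record(pending_ms)
                written = pending_ms
    else:
        # Suggested arrangement: flush first, for every STT path.
        if pending_ms > 0:
            record(pending_ms)
            written = pending_ms
        if not use_custom_stt:
            pass  # remaining non-custom-STT session-end work would go here
    return written
```

With 5000 ms pending and use_custom_stt=True, the nested arrangement writes nothing, while the hoisted arrangement writes the full 5000 ms. For use_custom_stt=False both behave the same.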
