
feat(asr): add per-document speaker name persistence #77

Merged: jedzill4 merged 9 commits into release/v2.0.0 from feat/speaker-names on Apr 21, 2026

@jedzill4 (Contributor) commented Apr 21, 2026

Summary

  • Adds speaker_names: dict[str, str] JSON column to the audio_transcription table (Alembic migration included; upgrade/downgrade verified)
  • Wraps POST /asr/validation/document/{id} body in a new ASRValidationRequest schema {annotations, speaker_names}. This is a breaking change from the previous bare-list body (acceptable for release/v2.0.0)
  • GET /asr/validation/document/{id} now returns speaker_names in the ASRDocument response
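The new request shape can be pictured roughly as follows. This is a stdlib-only sketch, not the project's actual schema (which lives in aymurai/meta/api_interfaces.py and is not shown in this PR page). The `parse_validation_body` helper is hypothetical, and `ASRParagraphRequest` is reduced to the single `text` field listed in the class diagram.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ASRParagraphRequest:
    text: str


@dataclass
class ASRValidationRequest:
    annotations: list[ASRParagraphRequest]
    # None means "field omitted"; {} is a distinct, meaningful value.
    speaker_names: Optional[dict[str, str]] = None


def parse_validation_body(body: object) -> ASRValidationRequest:
    """Hypothetical helper: turn a decoded JSON body into the request object."""
    if not isinstance(body, dict):
        # The previous bare-list body no longer matches the contract.
        raise ValueError("expected an object with an 'annotations' key")
    annotations = [
        ASRParagraphRequest(text=item["text"]) for item in body["annotations"]
    ]
    return ASRValidationRequest(
        annotations=annotations,
        speaker_names=body.get("speaker_names"),
    )
```

A client that still sends the old bare list would hit the `ValueError` branch in this sketch; in the real API the equivalent failure would presumably surface as a request validation error.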

Behaviour

| speaker_names in POST body | Effect |
| --- | --- |
| omitted (null) | Existing map preserved |
| {} | Clears all names |
| {"0": "Jueza", ...} | Full replacement |
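The three cases above reduce to a single `None` check. A minimal sketch, assuming the stored map and the incoming value are plain dicts; `resolve_speaker_names` is a hypothetical name, and "Fiscal" and "Defensa" are made-up values for illustration.

```python
from typing import Optional


def resolve_speaker_names(
    existing: dict[str, str], incoming: Optional[dict[str, str]]
) -> dict[str, str]:
    """Return the map to persist, following the three cases in the table."""
    if incoming is None:
        return existing  # omitted: keep what is stored
    return dict(incoming)  # {} clears; a non-empty map replaces wholesale


stored = {"0": "Jueza", "1": "Fiscal"}
print(resolve_speaker_names(stored, None))  # → {'0': 'Jueza', '1': 'Fiscal'}
print(resolve_speaker_names(stored, {}))  # → {}
print(resolve_speaker_names(stored, {"0": "Defensa"}))  # → {'0': 'Defensa'}
```

Note that the last case is a replacement, not a merge: key "1" is dropped because the incoming map does not mention it.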

Test Plan

  • 8/8 integration tests pass (test/api/endpoints/routers/asr/)
  • Alembic upgrade/downgrade verified against SQLite
  • ruff format + ruff check clean on all changed files

Notes

  • Migration uses server_default=sa.text("'{}'") — correct for SQLite. If PostgreSQL support is added later, the cast should be updated to ::json.
  • Keys in speaker_names are strings by design (JSON object keys must be strings).
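Both notes can be checked with the standard library alone. The sketch below is not the Alembic migration itself; it issues roughly equivalent SQL directly against an in-memory SQLite database, with a simplified two-column stand-in for the table and illustrative row values ('doc-1', 'hearing.wav').

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in for the pre-migration table; only two columns are modelled here.
conn.execute("CREATE TABLE audio_transcription (id TEXT PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO audio_transcription VALUES ('doc-1', 'hearing.wav')")

# Rough equivalent of adding a non-null JSON column with server default '{}'.
conn.execute(
    "ALTER TABLE audio_transcription "
    "ADD COLUMN speaker_names JSON NOT NULL DEFAULT '{}'"
)

# Rows that existed before the column was added read back the default.
(raw,) = conn.execute("SELECT speaker_names FROM audio_transcription").fetchone()
backfilled = json.loads(raw)

# JSON object keys are always strings, so integer keys do not survive a round trip.
round_tripped = json.loads(json.dumps({0: "Jueza"}))
```

Here `backfilled` is an empty dict and `round_tripped` has the string key "0", which is why clients should send and expect string speaker IDs as keys.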

Summary by Sourcery

Add per-document speaker name support to ASR validation, including persistence in the database and exposure via the ASR validation API.

New Features:

  • Allow clients to provide and update per-speaker display names alongside ASR validation annotations via a new ASRValidationRequest payload.
  • Expose persisted speaker name mappings on ASR validation read responses through the ASRDocument schema.

Enhancements:

  • Extend audio transcription creation and update logic to manage optional speaker name maps while preserving existing values when omitted.

Tests:

  • Add and update ASR endpoint integration tests to cover the new validation request shape and speaker name persistence and retrieval semantics.

Chores:

  • Add an Alembic migration and model changes to store speaker name maps on audio_transcription records as a JSON column.


sourcery-ai Bot commented Apr 21, 2026

Reviewer's Guide

Adds persistent per-document speaker name support to ASR transcriptions by introducing a JSON speaker_names column, exposing it via the ASRDocument API, and extending the validation POST endpoint to accept both annotations and speaker_names through a new request schema, with corresponding CRUD updates, migration, and tests.

Sequence diagram for POST /asr/validation/document/{id} with ASRValidationRequest and speaker_names persistence

sequenceDiagram
    actor Client
    participant ASRRouter as ASRTranscribeRouter
    participant DBSession
    participant AudioTranscriptionRecord

    Client->>ASRRouter: POST /asr/validation/document/{document_id}\nbody ASRValidationRequest
    ASRRouter->>DBSession: get_session()
    ASRRouter->>DBSession: query AudioTranscription by document_id
    DBSession-->>ASRRouter: AudioTranscriptionRecord

    ASRRouter->>AudioTranscriptionRecord: set validation from payload.annotations
    alt speaker_names is not None
        ASRRouter->>AudioTranscriptionRecord: set speaker_names = payload.speaker_names
    else speaker_names is None
        ASRRouter->>AudioTranscriptionRecord: keep existing speaker_names
    end

    ASRRouter->>DBSession: add(record)
    ASRRouter->>DBSession: commit()
    ASRRouter->>DBSession: refresh(record)
    ASRRouter-->>Client: 204 No Content

Sequence diagram for GET /asr/validation/document/{id} returning ASRDocument with speaker_names

sequenceDiagram
    actor Client
    participant ASRRouter as ASRTranscribeRouter
    participant DBSession
    participant AudioTranscription as AudioTranscriptionRecord
    participant ASRDocument

    Client->>ASRRouter: GET /asr/validation/document/{document_id}
    ASRRouter->>DBSession: get_session()
    ASRRouter->>DBSession: query AudioTranscription by document_id
    DBSession-->>ASRRouter: AudioTranscriptionRecord

    ASRRouter->>ASRDocument: construct\n document_id = record.id\n document = record.validation or record.transcription\n speaker_names = record.speaker_names or {}
    ASRRouter-->>Client: ASRDocument JSON
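The fallback rules in the construct step can be sketched as below, assuming the handler uses `or` exactly as the diagram suggests. `Record` and `build_document` are hypothetical stand-ins, and paragraphs are modelled as plain dicts rather than ASRParagraph objects.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    """Hypothetical stand-in for an audio_transcription row."""

    id: str
    transcription: list[dict]
    validation: Optional[list[dict]] = None
    speaker_names: Optional[dict[str, str]] = None


def build_document(record: Record) -> dict:
    return {
        "document_id": record.id,
        # Prefer the validated paragraphs once they exist.
        "document": record.validation or record.transcription,
        # Always return a dict so clients never receive null.
        "speaker_names": record.speaker_names or {},
    }
```

One consequence of using `or`: an empty validation list is falsy, so it also falls back to the raw transcription.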

ER diagram for updated audio_transcription table with speaker_names JSON column

erDiagram
    audio_transcription {
        uuid id
        string name
        json transcription
        json validation
        json speaker_names
    }

Class diagram for ASR validation request and audio transcription models with speaker_names

classDiagram
    class ASRParagraphRequest {
        +str text
    }

    class ASRValidationRequest {
        +list~ASRParagraphRequest~ annotations
        +dict~str,str~ speaker_names
    }

    class ASRParagraph {
        +str text
        +int speaker
        +str to_txt()
    }

    class ASRDocument {
        +list~ASRParagraph~ document
        +UUID document_id
        +dict~str,str~ speaker_names
        +str to_txt()
        +from_transcription(transcription AudioTranscriptionRead) ASRDocument
    }

    class AudioTranscriptionBase {
        +UUID id
        +str name
        +list~ASRParagraph~ transcription
        +list~ASRParagraph~ validation
        +dict~str,str~ speaker_names
    }

    class AudioTranscription {
    }

    class AudioTranscriptionUpdate {
        +str name
        +list~ASRParagraph~ transcription
        +list~ASRParagraph~ validation
        +dict~str,str~ speaker_names
    }

    class AudioTranscriptionRead {
    }

    class AudioTranscriptionCRUD {
        +audio_transcription_create_or_update(name str, transcription list~ASRParagraph~, session Session, speaker_names dict~str,str~) AudioTranscription
    }

    ASRValidationRequest "1" --> "*" ASRParagraphRequest : contains
    ASRDocument "1" --> "*" ASRParagraph : contains
    AudioTranscriptionBase <|-- AudioTranscription
    AudioTranscriptionBase <|-- AudioTranscriptionUpdate
    AudioTranscriptionBase <|-- AudioTranscriptionRead
    AudioTranscriptionCRUD --> AudioTranscription : creates_updates
    ASRDocument --> AudioTranscriptionRead : from_transcription

File-Level Changes

Change: Introduce speaker_names persistence on AudioTranscription and plumb it through CRUD and read models.
Details:
  • Add a non-null JSON speaker_names field with default {} to AudioTranscriptionBase and include it in AudioTranscriptionUpdate and AudioTranscriptionRead models.
  • Extend audio_transcription_create_or_update to accept an optional speaker_names dict that initializes it on create and conditionally updates it on existing records.
  • Create an Alembic migration that adds the speaker_names JSON column with an empty-object server default and drops it on downgrade.
Files:
  • aymurai/database/meta/audio_transcription.py
  • aymurai/database/crud/audio_transcription.py
  • aymurai/database/versions/56103c9aa43c_add_speaker_names_to_audio_transcription.py

Change: Expose speaker_names on ASR API models and wire it into GET /asr/validation/document/{id}.
Details:
  • Add speaker_names dict field with default {} to ASRDocument and populate it from transcription.speaker_names (falling back to {}).
  • Ensure asr_read_document_validation returns ASRDocument with speaker_names sourced from the AudioTranscription record, defaulting to {} when absent.
Files:
  • aymurai/meta/api_interfaces.py
  • aymurai/api/endpoints/routers/asr/transcribe.py

Change: Change POST /asr/validation/document/{id} contract to a structured ASRValidationRequest that carries annotations and optional speaker_names, and implement the update semantics.
Details:
  • Introduce ASRValidationRequest with annotations list and optional speaker_names field, documenting its None/{} semantics.
  • Update asr_save_document_validation to accept ASRValidationRequest, rewrite record.validation from payload.annotations, and conditionally overwrite record.speaker_names only when payload.speaker_names is not None.
Files:
  • aymurai/meta/api_interfaces.py
  • aymurai/api/endpoints/routers/asr/transcribe.py

Change: Extend ASR integration tests to cover the new speaker_names behavior and updated POST request shape.
Details:
  • Update existing validation POST test to use the new ASRValidationRequest-shaped body with annotations key.
  • Add tests that verify persisting, preserving, and clearing speaker_names via POST /asr/validation/document/{id}.
  • Add tests that verify speaker_names is included in GET /asr/validation/document/{id} responses and defaults to {} when not set.
Files:
  • test/api/endpoints/routers/asr/test_transcribe.py
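The create-or-update behaviour described for the CRUD layer can be sketched with an in-memory store. This is an illustration only: the real audio_transcription_create_or_update takes a database session, and looking records up by name is an assumption made here, as is the `create_or_update` name.

```python
from typing import Optional

# In-memory stand-in for the table, keyed by name (an assumption of this sketch).
_STORE: dict[str, dict] = {}


def create_or_update(
    name: str,
    transcription: list[dict],
    speaker_names: Optional[dict[str, str]] = None,
) -> dict:
    record = _STORE.get(name)
    if record is None:
        # Create: an omitted map starts out empty.
        record = {
            "name": name,
            "transcription": transcription,
            "speaker_names": speaker_names or {},
        }
        _STORE[name] = record
        return record
    # Update: always refresh the transcription, touch names only when given.
    record["transcription"] = transcription
    if speaker_names is not None:
        record["speaker_names"] = speaker_names
    return record
```

Re-running a transcription without passing `speaker_names` therefore leaves previously saved names untouched, while passing `{}` clears them.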


@sourcery-ai sourcery-ai Bot left a comment


Hey - I've left some high level feedback:

  • In asr_read_document_validation you manually construct ASRDocument instead of using the existing ASRDocument.from_transcription helper, which could lead to divergence if the response schema evolves; consider delegating to the factory for consistency.
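What delegating to the factory might look like is sketched below. The internals of the real `ASRDocument.from_transcription` are not shown in this PR, so the field mapping here is an assumption, and `TranscriptionRow` and `read_document_validation` are hypothetical stand-ins for AudioTranscriptionRead and the handler.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TranscriptionRow:
    """Hypothetical stand-in for AudioTranscriptionRead."""

    id: str
    transcription: list[dict]
    validation: Optional[list[dict]] = None
    speaker_names: Optional[dict[str, str]] = None


@dataclass
class ASRDocument:
    document_id: str
    document: list[dict]
    speaker_names: dict[str, str] = field(default_factory=dict)

    @classmethod
    def from_transcription(cls, row: TranscriptionRow) -> "ASRDocument":
        # Single place that knows how a row maps onto the response.
        return cls(
            document_id=row.id,
            document=row.validation or row.transcription,
            speaker_names=row.speaker_names or {},
        )


def read_document_validation(row: TranscriptionRow) -> ASRDocument:
    # The handler delegates instead of repeating the field mapping.
    return ASRDocument.from_transcription(row)
```

With this layout, a field added to the response later only needs to be mapped in `from_transcription`, which is the divergence risk the comment points at.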

@codacy-production

Not up to standards ⛔


@jedzill4 jedzill4 merged commit ce7ca99 into release/v2.0.0 Apr 21, 2026
1 check passed
@jedzill4 jedzill4 deleted the feat/speaker-names branch April 21, 2026 19:50
dmazzini added a commit that referenced this pull request May 27, 2026
879309c8 Feat/entity manager mention feedback (#81)
4d2de106 Fix/responsive home layout (#80)
986e68d2 Fix/homogenize file check ui (#77)
046f8ab9 fix(file-annotator): fix upward autoscroll on search previous navigation (#76)
2ecf75dc feat(dependencies): add dnd-kit packages for drag-and-drop functionality

git-subtree-dir: frontend
git-subtree-split: 879309c841d8072babc4d06f1686d11cf8cbd03f