Hearing Transcription Pipeline #1840

@Mephistic

Description

Problem

We have built a way to generate summaries of the hearing transcriptions produced by AssemblyAI. We now want to connect that work to the web flow so that summaries are generated automatically for new transcriptions.

Success Criteria

  • All newly generated transcriptions should be automatically passed to the summary function (by a Firebase Functions document trigger, either in Python directly or in TypeScript using the ML Wrapper API)
    • Depending on which transcription fields we need, we might need to modify the hearing transcription flow slightly (because we save the individual paragraphs breakdown separately from the transcriptions themselves, which could be a problem if we need those paragraphs to generate the summary).
  • Backfill script to generate summaries for all pre-existing transcriptions that do not have one
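
As a rough illustration of the trigger-side logic, here is a minimal Python sketch of what a handler might do once it receives a new transcription document. It is deliberately independent of the Firebase SDK so the logic can be tested on its own. The field names (`summary`, `text`), the shape of the paragraphs data, and the `summarize` callable are all assumptions for illustration, not the project's actual schema or summary function.

```python
from typing import Callable, Mapping, Optional, Sequence

# Hypothetical type: stands in for whatever summary function the pipeline exposes.
Summarizer = Callable[[str], str]


def build_summary_input(paragraphs: Sequence[Mapping[str, str]]) -> str:
    """Join non-empty paragraph texts in order, separated by blank lines.

    Assumes each paragraph record has a "text" field (hypothetical name).
    """
    texts = [p.get("text", "").strip() for p in paragraphs]
    return "\n\n".join(t for t in texts if t)


def handle_new_transcription(
    transcription: Mapping[str, object],
    paragraphs: Sequence[Mapping[str, str]],
    summarize: Summarizer,
) -> Optional[dict]:
    """Return the fields to write back to the document, or None to skip.

    Skips documents that already have a summary, so the handler stays
    idempotent if the trigger fires more than once for the same document.
    """
    if transcription.get("summary"):  # "summary" is an assumed field name
        return None
    text = build_summary_input(paragraphs)
    if not text:
        return None  # nothing to summarize yet
    return {"summary": summarize(text)}
```

A document trigger (in either runtime) would load the transcription and its separately stored paragraphs, call `handle_new_transcription`, and write the returned fields back when the result is not `None`. Keeping the paragraphs as an explicit argument makes the open question above visible: if the summary needs them, the trigger has to fetch them from wherever the transcription flow saves them.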

Blockers

  • The ML Wrapper API isn't available yet, so the pipeline part of this may be blocked until it is.
  • We also still need to do codebase splitting for Firebase Functions, which requires some care to deploy. The work isn't complicated, but it requires renaming existing functions so that TypeScript and Python functions can coexist in the same project, which is non-trivial in Firebase.
  • Before we can backfill summaries, we also need to run the backfill for the hearing transcriptions themselves, and that script has not been written yet.
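
The ordering constraint in the last blocker can be sketched as a small planning step: hearings without a transcription must go through the transcription backfill first, and only hearings that already have a transcription but no summary are eligible for the summary backfill. This is a sketch under assumed field names (`id`, `transcriptionId`, `summary`), which are hypothetical and may not match the real documents.

```python
from typing import Iterable, List, Mapping, NamedTuple


class BackfillPlan(NamedTuple):
    needs_transcription: List[str]  # blocked until the transcription backfill runs
    needs_summary: List[str]        # ready for the summary backfill


def plan_backfill(hearings: Iterable[Mapping[str, object]]) -> BackfillPlan:
    """Split hearings into the two backfill queues.

    Hearings that already have both a transcription and a summary are skipped.
    """
    needs_transcription: List[str] = []
    needs_summary: List[str] = []
    for hearing in hearings:
        hearing_id = str(hearing["id"])
        if not hearing.get("transcriptionId"):  # assumed field name
            needs_transcription.append(hearing_id)
        elif not hearing.get("summary"):  # assumed field name
            needs_summary.append(hearing_id)
    return BackfillPlan(needs_transcription, needs_summary)
```

Running the summary backfill only over `needs_summary` keeps it safe to re-run after each pass of the transcription backfill, since hearings move from the first queue to the second as transcriptions land.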

Metadata

Assignees

No one assigned

    Labels

    Transcriptions, backend (Backend Development), needs scope (Issues that need estimations/requirements/scoping)
