Transcript processing from STT services to standardized formats.

zevaverbach/tpro

tpro

Transcript Processing! tpro takes transcripts produced by speech-to-text (STT) services and converts them to standardized formats.

Installation and Usage

Non-pip Requirement: Stanford NER JAR

  • download and unzip the Stanford NER distribution
  • put these files in /usr/local/bin/:
    • stanford-ner.jar
    • classifiers/english.all.3class.distsim.crf.ser.gz
  • you might have to update Java on Linux
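
A quick way to confirm the files above are in place before running tpro is a small check like the one below. This is only a sketch and not part of tpro: the name `missing_ner_files` is hypothetical, and the paths are taken from the list above.

```python
from pathlib import Path

# Files listed in the requirement above, relative to the install directory.
REQUIRED_NER_FILES = (
    "stanford-ner.jar",
    "classifiers/english.all.3class.distsim.crf.ser.gz",
)


def missing_ner_files(base_dir="/usr/local/bin"):
    """Return the required Stanford NER files that are absent from base_dir."""
    base = Path(base_dir)
    return [name for name in REQUIRED_NER_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_ner_files()
    if missing:
        print("Missing Stanford NER files:", ", ".join(missing))
    else:
        print("Stanford NER files found.")
```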

Pip

$ pip install tpro

Usage

$ tpro --help

Usage: tpro [OPTIONS] TRANSCRIPT_DATA_PATH OUTPUT_PATH
            [amazon|gentle|speechmatics|google] [universal|vo]

Options:
  -p, --print-output    pretty print the transcript, breaks pipeability
  --language-code TEXT  specify language, defaults to en-US.
  --help                Show this message and exit.
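
To call tpro from a Python script, the help text above can be turned into an argument list. The sketch below is illustrative only: `build_tpro_command` and `run_tpro` are hypothetical helpers, not part of tpro, and they use only the arguments and options shown in the `--help` output.

```python
import subprocess

# Choices copied from the --help output above.
SERVICES = ("amazon", "gentle", "speechmatics", "google")
OUTPUT_FORMATS = ("universal", "vo")


def build_tpro_command(transcript_path, output_path, service, output_format,
                       language_code=None, print_output=False):
    """Build the argv list for a tpro invocation (hypothetical helper)."""
    if service not in SERVICES:
        raise ValueError(f"unknown service: {service!r}")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unknown output format: {output_format!r}")
    cmd = ["tpro"]
    if print_output:
        cmd.append("--print-output")
    if language_code is not None:
        cmd += ["--language-code", language_code]
    # Positional arguments follow the order given in the usage line.
    cmd += [str(transcript_path), str(output_path), service, output_format]
    return cmd


def run_tpro(*args, **kwargs):
    """Run tpro and raise CalledProcessError if it exits non-zero."""
    return subprocess.run(build_tpro_command(*args, **kwargs), check=True)
```

For example, `build_tpro_command("transcript.json", "out.json", "amazon", "universal")` yields the equivalent of `tpro transcript.json out.json amazon universal`. The file names here are placeholders.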

STT Services

Planned

Output Formats

Planned

  • Draft.js JSON
  • Word (.doc, .docx)
  • text files
  • SRT (subtitles)
