transcription-diff

A small Python library to find differences between audio and transcriptions.

Example (audio as mp4 to allow an embed):

sphere.mp4

from transcription_diff.text_diff import transcription_diff, render_text_diff

diff = transcription_diff("You can go pretty far in life if you're a perfect sphere in a vacuum", "sphere.mp4")
print(render_text_diff(diff))

! Well
You can go pretty far in life
! when
+ if
you're a perfect sphere in a vacuum

Mechanism

The library relies on openai-whisper to perform Automatic Speech Recognition (ASR), unguided by the transcription.
It then compares the expected transcription to the output of Whisper, ignoring superfluous characters.
It returns the result in a simple structure that keeps the original text format of the transcription.
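
The comparison step can be illustrated with a minimal sketch. This is not the library's implementation: it assumes that "superfluous characters" means casing and punctuation, and it uses the standard-library difflib in place of whatever alignment the library performs. The names normalize and word_diff are hypothetical.

```python
import difflib
import re


def normalize(text: str) -> list[str]:
    """Lowercase the text and drop punctuation, keeping apostrophes inside words.

    Assumption: this is only one plausible reading of "superfluous characters".
    """
    return re.findall(r"[a-z0-9']+", text.lower())


def word_diff(expected: str, recognized: str) -> list[tuple[str, list[str], list[str]]]:
    """Return the non-matching spans between the expected and recognized words.

    Each item is (operation, expected_words, recognized_words), where operation
    is one of difflib's opcodes: "replace", "delete" or "insert".
    """
    exp, rec = normalize(expected), normalize(recognized)
    matcher = difflib.SequenceMatcher(a=exp, b=rec, autojunk=False)
    return [
        (tag, exp[i1:i2], rec[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


if __name__ == "__main__":
    expected = "You can go pretty far in life if you're a perfect sphere in a vacuum"
    # Hypothetical ASR output, written by hand to mirror the example above.
    recognized = "Well, you can go pretty far in life when you're a perfect sphere in a vacuum."
    for operation, exp_words, rec_words in word_diff(expected, recognized):
        print(operation, exp_words, rec_words)
```

Because the sketch compares spellings rather than sounds, a homophone such as "their" for "there" would be reported as a replacement even when the audio is correct.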

Limitations

Only a single hypothesis is considered for the ASR output, so a hypothesis that would satisfy the expected transcription may be missed.
The ASR output is not in the phoneme space, which makes homophones prone to false positives.
Rare words unknown to Whisper must be passed explicitly to the function, and even then Whisper is not guaranteed to recognize them.
Currently, only up to 30 seconds of audio are annotated per sample.
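
Given the 30-second limit, it can help to find over-long samples before running the diff. The sketch below is an assumption-laden helper and not part of the library: it only handles uncompressed WAV files through the standard-library wave module, and the names MAX_SECONDS, wav_duration_seconds and exceeds_limit are hypothetical.

```python
import wave

# Limit taken from the Limitations list above.
MAX_SECONDS = 30.0


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a WAV file in seconds (frame count / sample rate)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def exceeds_limit(path: str, max_seconds: float = MAX_SECONDS) -> bool:
    """Tell whether a WAV file is longer than the given limit."""
    return wav_duration_seconds(path) > max_seconds
```

Other containers, such as the mp4 used in the example, would need a different reader.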

Installation

pip install transcription-diff

Short term TODOs

Phoneme-level comparison
User handling of model cache
Support for audio longer than 30 seconds

Long shot TODOs

More robust support for non-English languages
Inverse normalization support for fewer false positives

CorentinJ/transcription-diff
