Feature/Speech to text transcription #495

wants to merge 5 commits into
base: master



c-w commented Dec 12, 2019

This pull request is based on the work of @tayciryahmed in #121 and implements speech-to-text transcription in Doccano.

To keep things simple, the implementation currently uses HTML5 audio rather than something more sophisticated such as wavesurfer. This can be improved later if the need arises. A new alt+p keyboard shortcut plays or pauses the audio player.

[Image: animation showing speech-to-text transcription]

For ease of use, speech-to-text data can be imported either by posting audio files (MP3, WAV, etc.) or by uploading a JSONL manifest that encodes the audio as data URIs or as URLs pointing to the audio files.
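
As a rough illustration of the manifest option, the sketch below builds one JSONL line per audio clip, with the audio embedded as a base64 data URI. The key names `audio` and `meta` and both helper functions are assumptions made for this example; the PR description does not spell out the manifest schema, so check the importer before relying on them.

```python
import base64
import json
import mimetypes


def audio_to_data_uri(payload: bytes, filename: str) -> str:
    """Encode raw audio bytes as a base64 data URI (RFC 2397)."""
    # Guess the MIME type from the file extension; fall back to a generic type.
    mime_type, _ = mimetypes.guess_type(filename)
    mime_type = mime_type or "application/octet-stream"
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def manifest_line(audio: str, filename: str) -> str:
    """Serialize one manifest record as a single JSON line.

    `audio` may be a data URI or a plain URL to the audio file.
    The "audio" and "meta" keys are hypothetical names for this sketch.
    """
    return json.dumps({"audio": audio, "meta": {"filename": filename}})


if __name__ == "__main__":
    # Placeholder bytes stand in for real audio content.
    uri = audio_to_data_uri(b"\x00\x01\x02", "clip.mp3")
    print(manifest_line(uri, "clip.mp3"))
    # A URL-based record uses the same shape (example.com is a placeholder).
    print(manifest_line("https://example.com/clip.wav", "clip.wav"))
```

Data URIs keep the manifest self-contained at the cost of file size, while URLs keep it small but require the audio to be reachable when it is played.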

To make it easier to identify and distinguish audio files, the document left-navigation has been updated to display a file name (instead of file content) if the meta.filename attribute is set.
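
The fallback rule described above can be sketched as follows. This is only a Python illustration of the logic, not the code from this PR; the `nav_label` function, the `text` key, and the truncation length are assumptions made for the example.

```python
def nav_label(document: dict, max_length: int = 50) -> str:
    """Return the label to show for a document in the left navigation.

    Prefers meta.filename when it is set; otherwise falls back to a
    truncated preview of the document content. The "text" key and the
    default max_length are hypothetical choices for this sketch.
    """
    meta = document.get("meta") or {}
    filename = meta.get("filename")
    if filename:
        return filename
    return document.get("text", "")[:max_length]


if __name__ == "__main__":
    print(nav_label({"text": "data:audio/mpeg;base64,AAEC", "meta": {"filename": "clip.mp3"}}))
    print(nav_label({"text": "A plain text document", "meta": {}}))
```

For audio documents the content is a data URI or URL, which is hard to tell apart at a glance, so the file name is the more useful label.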

Resolves #95

c-w force-pushed the CatalystCode:feature/speech-to-text branch from c4f6fb2 to 21e1dc9 on Dec 12, 2019