General:

  • Update documentation
    • Upload instructions, save options, exports

Known issues with workarounds:

  • Export text strips paragraph breaks. For now, use copy and paste instead
  • Large audio files cause text-audio sync issues. Compress them first (constant bit rate, 8 kbps, 8 kHz sample rate, mono), e.g. with Audacity for single files or ffmpeg for multiple: find . -name "*.mp3" -exec sh -c 'ffmpeg -i "$1" -codec:a libmp3lame -b:a 8k -ac 1 -ar 8000 "${1%.mp3}_min.mp3"' _ {} \;
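
The batch compression workaround can also be wrapped in a small script. This is a minimal sketch, not a tested part of the project: the function names `compressed_name` and `compress_all` and the `_min` output suffix are assumptions made for illustration, while the ffmpeg settings (libmp3lame, 8 kbps, mono, 8 kHz) are the ones recommended above.

```shell
#!/bin/sh
# Hypothetical helper: build the output path by inserting "_min"
# before the .mp3 extension (the suffix is an assumption).
compressed_name() {
    printf '%s\n' "${1%.mp3}_min.mp3"
}

# Hypothetical helper: compress every .mp3 under a directory
# (default: current directory) to 8 kbps, mono, 8 kHz.
# Files already ending in _min.mp3 are skipped so they are not re-encoded.
# Note: file names containing newlines are not handled by this loop.
compress_all() {
    dir="${1:-.}"
    find "$dir" -type f -name '*.mp3' ! -name '*_min.mp3' |
    while IFS= read -r src; do
        out="$(compressed_name "$src")"
        # -nostdin keeps ffmpeg from reading the piped file list;
        # -y overwrites an existing output file without prompting.
        ffmpeg -nostdin -y -i "$src" \
            -codec:a libmp3lame -b:a 8k -ac 1 -ar 8000 "$out"
    done
}
```

After sourcing the script, `compress_all ./recordings` would write a compressed copy next to each original, leaving the source files untouched.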

Bugs:

  • Live demo loads transcript over autosaved one
  • Speakers get messed up on long audio
  • Play button disappears on local deployment
  • Annotations popup gets stuck when using ‘remove’
  • Only some elements in annotation panel are saved

Priority features:

  • Optimise for deployment on local server
  • Autosave indicator
  • Compatibility with anything other than two speakers (untested)

Partially implemented:

  • Compatibility with Mozilla DeepSpeech

  • User sets confidence level

Long-term features:

  • Call transcription APIs within app
  • Google, MS, IBM compatibility (look up data standards)
  • Audio processing within app (ffmpeg?)

Really long term features:

  • Train a custom DeepSpeech model with user corrections
  • Natural language understanding