Script that processes a directory of audio files, such as telephone call recordings, and creates JSON transcripts containing the complete output from Whisper.

PiotrEsse/Whisper-to-json-dir

Speech to Text Conversion and JSON Output

This script converts speech to text using the Whisper library and saves each transcription, together with file metadata, in five formats: JSON, TXT, TSV, SRT, and VTT.

Requirements

  • Python 3.x
  • Whisper library (pip install -U openai-whisper)
  • ffmpeg available on the system PATH (Whisper uses it to decode audio)

Usage

  1. Ensure you have Python installed on your system.
  2. Install the Whisper library using pip: pip install -U openai-whisper.
  3. Place your audio files in a directory and update the directory variable in the script to point to that directory.
  4. Choose the Whisper model by updating the model variable in the script. Available models are: "tiny", "base", "small", "medium", "large".
  5. Run the script.
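
Steps 3 and 4 amount to setting two variables near the top of the script. The following is a minimal sketch of what that might look like, with a sanity check to run before starting a long job. The check_settings helper and the example path are hypothetical and not part of the script; the model list is copied from step 4.

```python
import os

# Model names listed in this README; the installed Whisper version may offer more.
AVAILABLE_MODELS = ("tiny", "base", "small", "medium", "large")

directory = "example_audio"  # folder containing the audio files (example path)
model = "base"               # one of AVAILABLE_MODELS


def check_settings(directory, model):
    """Return a list of problems with the settings; an empty list means OK."""
    problems = []
    if not os.path.isdir(directory):
        problems.append(f"directory not found: {directory}")
    if model not in AVAILABLE_MODELS:
        problems.append(f"unknown model: {model}")
    return problems
```

Calling check_settings(directory, model) before transcription catches a mistyped path or model name early, rather than after the model has started loading.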

Description

  • The script iterates over all the files in the specified directory.
  • It checks if each file is an audio file based on its extension.
  • Supported file extensions are: .mp4, .mp3, .wav, .amr, .aac, .ogg, .m4a.
  • The script transcribes each audio file using the chosen Whisper model.
  • It adds the filename, creation date, and modification date as metadata to the transcription result.
  • The transcription result is then saved in the following formats:
    • JSON: .json
    • Text: .txt
    • Tab-separated values: .tsv
    • SubRip subtitle format: .srt
    • WebVTT subtitle format: .vtt
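
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual contents of speech_to_text.py: the function names and metadata keys are hypothetical, and only the JSON and TXT outputs are written here (the TSV, SRT, and VTT outputs are left out to keep the sketch short). It assumes the openai-whisper package, whose load_model() and transcribe() calls return a dict with "text" and "segments" entries.

```python
import json
import os
from datetime import datetime

AUDIO_EXTENSIONS = {".mp4", ".mp3", ".wav", ".amr", ".aac", ".ogg", ".m4a"}


def is_audio_file(filename):
    """Decide by extension only (case-insensitive); the content is not inspected."""
    return os.path.splitext(filename)[1].lower() in AUDIO_EXTENSIONS


def file_metadata(path):
    """Collect filename and timestamps (key names are hypothetical)."""
    stat = os.stat(path)
    return {
        "filename": os.path.basename(path),
        # Note: st_ctime is creation time on Windows, but the last
        # metadata-change time on most Unix systems.
        "creation_date": datetime.fromtimestamp(stat.st_ctime).isoformat(),
        "modification_date": datetime.fromtimestamp(stat.st_mtime).isoformat(),
    }


def transcribe_directory(directory, model_name="base", language="pl"):
    """Transcribe every audio file in `directory`, writing .json and .txt next to it."""
    import whisper  # imported here so the helpers above work without Whisper installed

    model = whisper.load_model(model_name)
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not (os.path.isfile(path) and is_audio_file(name)):
            continue
        result = model.transcribe(path, language=language)
        result.update(file_metadata(path))
        base = os.path.splitext(path)[0]
        with open(base + ".json", "w", encoding="utf-8") as f:
            json.dump(result, f, ensure_ascii=False, indent=2)
        with open(base + ".txt", "w", encoding="utf-8") as f:
            f.write(result["text"].strip() + "\n")
```

Writing the JSON with ensure_ascii=False keeps Polish characters readable in the output file instead of escaping them.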

File Structure

  • speech_to_text.py: The main Python script.
  • README.md: This file, with instructions and information about the script.
  • example_audio/: A sample directory containing audio files for testing purposes.

Notes

  • The language for transcription is set to Polish ("pl"). Change the language parameter in the transcribe() function call if you need a different language.
  • Whisper decodes audio through ffmpeg, so make sure your ffmpeg installation supports the format of your files.
  • Transcription time grows with recording length and model size, so long audio files may take a while to process, particularly without a GPU.
  • Choose the Whisper model ("tiny", "base", "small", "medium", or "large") according to your desired trade-off between accuracy and speed, and update the model variable in the script accordingly. Smaller models run faster; larger models are generally more accurate.
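
As an illustration of the language note above, a small helper can build the keyword arguments for the transcribe() call. This is a sketch only: the transcribe_kwargs name and the "auto" convention are hypothetical and not part of the script. It relies on the fact that Whisper detects the language itself when no language argument is passed.

```python
def transcribe_kwargs(language="pl"):
    """Build keyword arguments for model.transcribe().

    Passing None, "" or "auto" (a convention of this sketch, not of Whisper)
    omits the language so that Whisper falls back to automatic detection.
    """
    if language in (None, "", "auto"):
        return {}
    return {"language": language}


# Example use inside the script (model and path are defined elsewhere):
# result = model.transcribe(path, **transcribe_kwargs("en"))
```

Fixing the language, as the script does for Polish, skips the detection step and avoids misdetection on short or noisy recordings.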
