Pipeline for capturing audio from an input microphone and transcribing it to text

realaryann/Speech-To-Text-Py

Speech-To-Text-Py

Python package to record input audio from a sound device and transcribe it into text for further operations

Installation and Usage

  • Clone the repository into any folder of your choice
  • Go through each file and update the absolute paths to match your file structure
  • Inside the input_saver directory, run ./input.sh
  • After the process finishes, check results/test.txt for the transcription of the recording
  • The voskcall.py file drives the vosk-transcriber; edit it with care
  • This package is designed to work with https://github.com/realaryann/Keyword-Select-Service (a service that extracts keywords)

Demo (GIF): working pipeline of recording, transcribing, and storing

Required Python Libraries

  • sounddevice
  • numpy
  • scipy
  • vosk

Changing Sound Device

  • Navigate to input_saver/find_device.py
  • In a terminal, find the name of your sound device by running python3 -m sounddevice
  • Set the DEVICE_NAME variable to the name of that device
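
The lookup performed in find_device.py is not shown here. As a minimal sketch of how a device name could be resolved to an index, assume the device list has the shape returned by `sounddevice.query_devices()` (dicts with `name` and `max_input_channels` keys). The helper names and the sample `DEVICE_NAME` value are hypothetical.

```python
from typing import Iterable, Mapping, Optional

# Hypothetical value; use the name printed by `python3 -m sounddevice`.
DEVICE_NAME = "USB Audio"


def find_input_device(devices: Iterable[Mapping], name: str) -> Optional[int]:
    """Return the index of the first input-capable device whose name contains `name`."""
    for index, device in enumerate(devices):
        has_input = device.get("max_input_channels", 0) > 0
        if has_input and name.lower() in device.get("name", "").lower():
            return index
    return None


def find_on_this_machine(name: str = DEVICE_NAME) -> Optional[int]:
    """Query the real device list; needs sounddevice and PortAudio installed."""
    import sounddevice as sd  # imported lazily so the helper above works without it

    return find_input_device(sd.query_devices(), name)
```

Matching on a case-insensitive substring is a design choice of this sketch; an exact match may be safer when several devices share a prefix.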

Program Input

  • Input_Saver currently takes string input from the command line
  • Input Mapping
    1. R: Record for 5 seconds
    2. T: Record for 10 seconds
    3. Y: Record for 20 seconds
    4. Z: Exit program
  • The program will continue looping until valid input is entered
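
The mapping and retry loop above can be sketched as follows. This is an illustration, not the code in input_saver: the function names are hypothetical, and accepting lowercase keys and surrounding whitespace is an assumption.

```python
from typing import Optional

# Key-to-duration mapping taken from the list above; "Z" exits.
DURATIONS = {"R": 5, "T": 10, "Y": 20}
EXIT_KEY = "Z"


def parse_choice(raw: str) -> Optional[int]:
    """Return seconds to record, 0 for exit, or None for invalid input."""
    key = raw.strip().upper()
    if key == EXIT_KEY:
        return 0
    return DURATIONS.get(key)


def prompt_until_valid(read=input) -> int:
    """Keep asking until a valid key is entered."""
    while True:
        choice = parse_choice(read("R=5s, T=10s, Y=20s, Z=exit: "))
        if choice is not None:
            return choice
```

Passing `read` as a parameter keeps the loop testable without a terminal; by default it uses the built-in `input`.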

Configuring Output

  • Navigate to input_saver/voskhear.py
  • Update the newf and filesrc variables (WARNING: incorrect values can break the pipeline)
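
For orientation, here is a hedged sketch of what a transcription step could look like with the vosk Python API. It assumes that filesrc is the path of the recorded WAV file and newf is the path of the output text file; the README does not state their roles. The repository itself may call the vosk-transcriber command instead, and `model_dir` and both function names are hypothetical.

```python
import json
import wave
from pathlib import Path
from typing import Iterable


def join_results(results: Iterable[str]) -> str:
    """Join the "text" fields of Vosk JSON result strings into one transcript."""
    parts = (json.loads(r).get("text", "") for r in results)
    return " ".join(p for p in parts if p)


def transcribe(filesrc: str, newf: str, model_dir: str) -> str:
    """Transcribe a mono 16-bit PCM WAV at `filesrc` and write the text to `newf`."""
    from vosk import KaldiRecognizer, Model  # lazy import: needs vosk installed

    results = []
    with wave.open(filesrc, "rb") as wav:
        recognizer = KaldiRecognizer(Model(model_dir), wav.getframerate())
        while True:
            data = wav.readframes(4000)
            if not data:
                break
            # AcceptWaveform returns true when an utterance is complete.
            if recognizer.AcceptWaveform(data):
                results.append(recognizer.Result())
        results.append(recognizer.FinalResult())

    text = join_results(results)
    Path(newf).parent.mkdir(parents=True, exist_ok=True)
    Path(newf).write_text(text + "\n", encoding="utf-8")
    return text
```

A call such as `transcribe("recording.wav", "results/test.txt", "model")` would then write the transcript to the file named in the installation steps, provided a Vosk model has been unpacked into the `model` directory.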

Cleaning Results

  • Navigate to the input_saver directory
  • Run ./clean.sh there to clear out previous recordings and results

Full Release with tracking capabilities

(See the complete branch.)

Order of execution for input_saver:
  1. python3 inputaudio.py
  2. python3 voskhear.py
