realaryann / Speech-To-Text-Py Public


Pipeline for recording audio files from an input microphone and converting them to transcribed text


Repository contents

__pycache__
log
recordings
results
README.md
clean.sh
find_device.py
input.sh
inputaudio.py
inputaudio_test.py
voskcall.py
voskhear.py


Speech-To-Text-Py

Python package to record input audio from a sound device and transcribe it into text for further operations

Installation and Usage

Clone the entire repository into any folder of your choice
Go through each file and change the absolute paths to match your file structure
Inside the input_saver directory, run ./input.sh
After the process finishes, check results/test.txt for the transcription of the recording
The voskcall.py file is used for working with the vosk-transcriber and must be altered with care
This package is designed to work with https://github.com/realaryann/Keyword-Select-Service (a service that extracts keywords)

Working pipeline of recording, transcribing, and storing

Required Python Libraries

sounddevice
numpy
scipy
vosk
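
Before running the pipeline, it can help to confirm that the four libraries above are importable. The following is a minimal sketch, not part of this repository: the helper name `missing_libraries` is hypothetical, and it only checks that each module can be found, not that it works with your hardware.

```python
from importlib.util import find_spec

# Libraries listed in this README.
REQUIRED = ("sounddevice", "numpy", "scipy", "vosk")


def missing_libraries(names=REQUIRED):
    """Return the names from `names` that cannot be found by this interpreter."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_libraries()
    if missing:
        print("Missing libraries:", ", ".join(missing))
        print("Try: python3 -m pip install " + " ".join(missing))
    else:
        print("All required libraries are installed.")
```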

Changing Sound Device

Open input_saver/find_device.py
In a terminal, check the name of your sound device by running python3 -m sounddevice
Set the DEVICE_NAME variable to the name of your device
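
The steps above amount to looking up a device by name. As a rough sketch of how such a lookup could work (this is not the code in find_device.py; the function names and the sample device list are assumptions for illustration), the helper below matches a name against entries shaped like those returned by `sounddevice.query_devices()`:

```python
def find_input_device(devices, device_name):
    """Return the index of the first input-capable device whose name contains device_name."""
    wanted = device_name.lower()
    for index, device in enumerate(devices):
        if wanted in device["name"].lower() and device["max_input_channels"] > 0:
            return index
    return None  # no matching input device


def query_system_devices():
    """Return the devices sounddevice can see (requires the sounddevice package)."""
    import sounddevice as sd  # imported lazily so find_input_device works without it

    return list(sd.query_devices())


if __name__ == "__main__":
    # Hypothetical sample data; on a real machine pass query_system_devices() instead.
    sample = [
        {"name": "HDMI Output", "max_input_channels": 0},
        {"name": "USB Microphone", "max_input_channels": 1},
    ]
    DEVICE_NAME = "USB Microphone"  # hypothetical; use a name printed by python3 -m sounddevice
    print(find_input_device(sample, DEVICE_NAME))
```

The substring, case-insensitive match is a design choice of this sketch; an exact match may be safer if several devices share part of a name.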

Program Input

Input_Saver currently takes string input from the command line

Input Mapping

R: Record for 5 seconds
T: Record for 10 seconds
Y: Record for 20 seconds
Z: Exit program

The program will continue looping until valid input is entered
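
The mapping and retry loop described above can be sketched as follows. This is an illustration rather than the code in inputaudio.py: the names `DURATIONS`, `EXIT_KEY`, and `prompt_for_action` are hypothetical, and accepting lowercase letters and surrounding whitespace is an assumption of this sketch.

```python
# Mapping taken from this README: key -> recording length in seconds.
DURATIONS = {"R": 5, "T": 10, "Y": 20}
EXIT_KEY = "Z"


def prompt_for_action(read=input):
    """Loop until a valid key is entered; return seconds to record, or None to exit."""
    while True:
        choice = read("Enter R, T, Y, or Z: ").strip().upper()
        if choice == EXIT_KEY:
            return None
        if choice in DURATIONS:
            return DURATIONS[choice]
        # Invalid input: report it and ask again.
        print(f"Unrecognized input {choice!r}; please try again.")
```

Calling `prompt_for_action()` with no arguments reads from the terminal; the `read` parameter exists only so the loop can be exercised without a keyboard.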

Configuring Output

Open input_saver/voskhear.py
Replace the values of the newf and filesrc variables (WARNING: this could break everything!)

Cleaning Results

Navigate to the parent directory (./input_saver)
Run ./clean.sh from that directory to clear out previous recordings and results
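
For readers who prefer to do the cleanup from Python, here is a hedged sketch of an equivalent step. It is not a translation of clean.sh, whose contents are not shown here; it assumes that cleaning means deleting the files directly inside the recordings and results folders, and the function names are hypothetical.

```python
from pathlib import Path


def clear_directory(directory):
    """Delete the regular files directly inside `directory`; return how many were removed."""
    removed = 0
    for entry in list(Path(directory).iterdir()):
        if entry.is_file():
            entry.unlink()
            removed += 1
    return removed


def clean(base_dir, folders=("recordings", "results")):
    """Clear each listed folder under base_dir, skipping folders that do not exist."""
    counts = {}
    for folder in folders:
        target = Path(base_dir) / folder
        counts[folder] = clear_directory(target) if target.is_dir() else 0
    return counts
```

For example, `clean("input_saver")` would return a dictionary with the number of files removed from each folder. Subdirectories are left untouched in this sketch.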

Full Release with tracking capabilities

(Check the complete branch)

Order of execution for input_saver

python3 inputaudio.py
python3 voskhear.py
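
The two stages above (record, then transcribe) can be sketched in a few functions. This is an illustration under stated assumptions, not the contents of inputaudio.py or voskhear.py: the function names, the 16 kHz mono 16-bit format, and the `model_path` argument (a locally downloaded Vosk model directory) are all choices of this sketch.

```python
import json
import wave


def record_wav(path, seconds, samplerate=16000, device=None):
    """Record mono 16-bit audio for `seconds` and save it as a WAV file."""
    import sounddevice as sd  # lazy imports: only needed when actually recording
    from scipy.io import wavfile

    frames = int(seconds * samplerate)
    audio = sd.rec(frames, samplerate=samplerate, channels=1, dtype="int16", device=device)
    sd.wait()  # block until the recording is finished
    wavfile.write(path, samplerate, audio)


def text_from_results(results):
    """Join the "text" fields of Vosk JSON result strings into one transcript."""
    parts = (json.loads(result).get("text", "") for result in results)
    return " ".join(part for part in parts if part)


def transcribe_wav(path, model_path):
    """Feed a mono 16-bit WAV file to Vosk and return the recognized text."""
    from vosk import KaldiRecognizer, Model  # lazy import: only needed when transcribing

    results = []
    with wave.open(path, "rb") as wav:
        recognizer = KaldiRecognizer(Model(model_path), wav.getframerate())
        while True:
            data = wav.readframes(4000)
            if not data:
                break
            # AcceptWaveform returns True when a complete utterance is available.
            if recognizer.AcceptWaveform(data):
                results.append(recognizer.Result())
        results.append(recognizer.FinalResult())
    return text_from_results(results)
```

A run might call `record_wav("recordings/test.wav", 5)` followed by `transcribe_wav("recordings/test.wav", model_path)` and write the returned string to results/test.txt; the recording file name here is hypothetical.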

