LarissaGuder/whisper-datastream

Requirements

Python 3.8.10

For Whisper

pip install git+https://github.com/openai/whisper.git

pip install setuptools-rust

choco install ffmpeg

Requirements for Spark

pip install pyspark

All dependencies

pip3 install -r requirements.txt

How it works

In the code folder, we have 3 files.

You can start spark_stream at any time. It waits for JSON files to appear in the files folder.

In another terminal, run load_and_split_audio. It creates .wav files in the audio folder. These audio files are then processed by generate_array. When you run it, it creates JSON files containing the np.array of each audio file, and these files feed the stream input.
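The hand-off from .wav files to JSON files can be sketched as follows. This is a minimal illustration, not the repository's generate_array code: the function name wav_to_json, the JSON field names (file, sample_rate, audio), and the assumption of mono 16-bit PCM input are all hypothetical.

```python
import json
import os
import tempfile
import wave

import numpy as np


def wav_to_json(wav_path, out_dir):
    """Write the samples of a mono 16-bit PCM .wav file to a JSON file in out_dir.

    Hypothetical helper: the field names below are assumptions, not the
    repository's actual format.
    """
    with wave.open(wav_path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        sample_rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())

    # Interpret the bytes as little-endian int16 and scale to float32 in [-1.0, 1.0).
    samples = np.frombuffer(frames, dtype="<i2").astype(np.float32) / 32768.0

    record = {
        "file": os.path.basename(wav_path),
        "sample_rate": sample_rate,
        "audio": samples.tolist(),  # a NumPy array is not JSON-serializable as-is
    }

    os.makedirs(out_dir, exist_ok=True)
    final_name = os.path.splitext(record["file"])[0] + ".json"
    final_path = os.path.join(out_dir, final_name)

    # Write to a dot-prefixed temporary file, then rename it into place, so a
    # process watching out_dir should not pick up a half-written JSON file.
    fd, tmp_path = tempfile.mkstemp(prefix=".", suffix=".tmp", dir=out_dir)
    with os.fdopen(fd, "w") as tmp:
        json.dump(record, tmp)
    os.replace(tmp_path, final_path)
    return final_path
```

Writing to a temporary name and renaming is a design choice for the sketch: a streaming reader that watches a directory generally expects each file to appear complete, and a rename within the same directory is the usual way to achieve that.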

How to reproduce the results

In the code folder, check that the directory path in the generate_array file is set to dir_path = r'../input/mp3/'. Start the spark_stream_evaluate script first. It starts the Spark engine and saves the transcribed results to the output_txt directory. Then run the generate_array script and wait. When spark_stream_evaluate has finished processing, run the evaluate script, which calculates the results.

The current results are:
Precision: 0.41
Recall: 0.33
F1-Score: 0.37
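For reference, the three metrics relate to each other as in the sketch below. It assumes entity-level scoring over sets of (text, label) pairs with exact matching. The evaluate script may score differently, so the function names and the matching rule here are illustrative assumptions only.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f1(predicted, gold):
    """Score predicted entities against gold entities by exact set match.

    Hypothetical helper: both arguments are iterables of hashable items,
    for example (entity_text, entity_label) tuples.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall, f1_score(precision, recall)
```

With the reported values, f1_score(0.41, 0.33) is about 0.366, which rounds to the 0.37 stated above.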

About

Transcription and NER in streaming environment