
Transcribe Worker

This worker uses OpenAI Whisper to transcribe audio to text.
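
At its core, Whisper turns an audio file into text in a few lines. A minimal standalone sketch, assuming the openai-whisper package (and ffmpeg) is installed and a sample.wav file exists:

import whisper

# Load a model checkpoint (downloaded on first use)
model = whisper.load_model("base")

# Transcribe an audio file to text
result = model.transcribe("sample.wav")
print(result["text"])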

Getting started

Follow the setup instructions in the Whisper repository.

Dependencies

Install conda/miniconda if you haven't already.

Using conda environment file

You can use the provided environment.yml file to get started quickly.

conda env create -f environment.yml
conda activate transcribe

Manually installing dependencies

conda create --name transcription python=3.10
conda activate transcription
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
pip install -r requirements.txt
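
After installing, a quick sanity check that PyTorch sees the GPU (a sketch; run it from any Python shell in the activated environment):

import torch

print(torch.__version__)          # e.g. a 2.x build
print(torch.cuda.is_available())  # True if the CUDA 12.1 build found a GPU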


Secrets / Environment Variables

Copy the template from .env.template into .env and grab the needed secrets from Auth0. Make sure the environment variables for MinIO, Postgres, and Redis match whatever is configured in backend/docker/local/docker-compose.yml or its environment file.
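
How the worker reads these depends on its implementation; a common pattern is python-dotenv, sketched below with a hypothetical variable name (check .env.template for the real keys):

import os
from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # reads .env from the current working directory

# REDIS_URL is a hypothetical name; use the keys defined in .env.template
redis_url = os.environ["REDIS_URL"]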

Running the Worker

Run it manually with python worker.py or use the Dockerfile (TBD) to start it as a container. When starting it manually, the Prisma client needs to be generated first. To do that, run the following:

prisma db push # Pushes the current `schema.prisma` to the database
prisma generate # Generates the client incl. types
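
Once generated, the client can be used from Python code. A minimal sketch, assuming the worker uses Prisma Client Python:

import asyncio
from prisma import Prisma  # importing fails if `prisma generate` has not been run

async def main() -> None:
    db = Prisma()
    await db.connect()
    # ... query the models defined in the Prisma schema here ...
    await db.disconnect()

asyncio.run(main())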

Configuring OpenAI Whisper

In processor.py (line 117) you can further configure Whisper to your liking: simply modify the arguments passed to transcribe; see the Whisper docs for details on the available options.

self.model.transcribe(audio=whisper_audio)
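
For example, a few commonly tuned options (a sketch; these are all standard transcribe parameters in openai-whisper):

result = self.model.transcribe(
    audio=whisper_audio,
    language="en",         # skip auto language detection if the language is known
    task="transcribe",     # or "translate" to translate into English
    temperature=0.0,       # greedy, deterministic decoding
    word_timestamps=True,  # include per-word timing in the segments
)
text = result["text"]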