Celery task runner Docker App for YouTube Transcriptions with simple Next.js frontend

seandearnaley/queuetube-whisper-transcriber

queuetube-whisper-transcriber

queuetube-whisper-transcriber is a Docker-based application for queueing and transcribing YouTube videos with the Whisper ASR (Automatic Speech Recognition) system. This repository contains both the backend API and the frontend needed to run it.

Folder Structure

queuetube-whisper-transcriber/
|-- .vscode/
|-- .yarn/
|-- app/
|   |-- api.py
|   |-- audio_tools.py
|   |-- download_processor.py
|   |-- transcription_processor.py
|   |-- whisper_transcriber.py
|-- frontend/
|   |-- .yarn/
|   |-- pages/
|   |   |-- index.js
|   |   |-- index.module.css
|   |-- package.json
|   |-- yarn.lock
|-- .dockerignore
|-- .pylintrc
|-- docker-compose.yml
|-- Dockerfile
|-- LICENSE
|-- poetry.lock
|-- pyproject.toml
|-- README.md

Components

Backend (app/)

The backend of the application is responsible for handling API requests, downloading YouTube videos, and processing their transcriptions.

  • api.py: Defines the FastAPI application that handles API routes.
  • audio_tools.py: Contains functions for audio manipulation and conversion.
  • download_processor.py: Downloads YouTube videos and extracts information.
  • transcription_processor.py: Processes untranscribed videos for transcription.
  • whisper_transcriber.py: Integrates Whisper for transcribing audio.
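
The two-stage flow described above (download, then transcription) can be sketched with plain in-process queues. This is an illustrative model only, not the repository's code: the Job class, the download and run_pipeline functions, and the /tmp audio path are all hypothetical, and the real application uses Celery workers rather than queue.Queue.

```python
from dataclasses import dataclass
from queue import Queue
from typing import Callable, List, Optional


@dataclass
class Job:
    """Hypothetical record that travels through both stages."""
    url: str
    audio_path: Optional[str] = None
    transcript: Optional[str] = None


def download(job: Job) -> Job:
    # Stand-in for the download stage: a real worker would fetch the audio.
    # Here we only derive a made-up file path from the video id in the URL.
    video_id = job.url.rsplit("=", 1)[-1]
    job.audio_path = "/tmp/" + video_id + ".wav"
    return job


def run_pipeline(urls: List[str], transcribe: Callable[[str], str]) -> List[Job]:
    download_queue: Queue = Queue()
    transcription_queue: Queue = Queue()

    for url in urls:
        download_queue.put(Job(url))

    # Stage 1: every downloaded job is handed to the transcription queue.
    while not download_queue.empty():
        transcription_queue.put(download(download_queue.get()))

    # Stage 2: transcribe each queued job with the supplied callable.
    finished: List[Job] = []
    while not transcription_queue.empty():
        job = transcription_queue.get()
        job.transcript = transcribe(job.audio_path)
        finished.append(job)
    return finished
```

Passing the transcription step in as a callable keeps the sketch independent of Whisper, so the flow can be exercised with a stub such as lambda path: "text for " + path.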

Frontend (frontend/)

The frontend provides a web interface for interacting with the transcription system.

  • pages/index.js: Main entry point for the frontend.
  • pages/index.module.css: Stylesheet for the frontend.
  • package.json: Defines frontend dependencies and scripts.

Configuration Files

  • .dockerignore: Specifies files and folders to exclude when building Docker images.
  • .pylintrc: Configuration for the Python linting tool.
  • docker-compose.yml: Defines Docker services and their configurations.
  • Dockerfile: Docker image build instructions.
  • LICENSE: The license under which the software is distributed.
  • poetry.lock: Lock file generated by Poetry for managing dependencies.
  • pyproject.toml: Configuration file for Poetry, specifying project dependencies.

Running the Application

To run the application, you'll need Docker and Docker Compose installed on your system. Follow these steps:

  1. Clone the repository: git clone https://github.com/seandearnaley/queuetube-whisper-transcriber.git
  2. Navigate to the cloned directory: cd queuetube-whisper-transcriber
  3. Build the Docker image: docker build -t queuetube-transcriber .
  4. Start the Docker containers: docker-compose up

This will launch the API server, Redis for queuing, Celery workers for transcription and download, and the frontend development server.

Accessing the Application

  • API: Access the FastAPI API at http://localhost:8000.
  • Frontend: Access the frontend interface at http://localhost:3000.
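
Once the containers are up, a quick reachability check of the two addresses above could look like the following. This is a minimal sketch using only the Python standard library; the is_reachable helper is a hypothetical name, not part of this repository.

```python
import urllib.error
import urllib.request


def is_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if an HTTP GET to `url` answers with a 2xx or 3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # URLError covers refused connections and HTTP error statuses,
        # OSError covers socket timeouts, ValueError covers malformed URLs.
        return False
```

For example, is_reachable("http://localhost:8000/docs") should return True when the API is running, assuming the FastAPI app keeps its default interactive documentation route enabled, and is_reachable("http://localhost:3000") does the same for the frontend.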

Notes

  • The application leverages the Whisper ASR system for transcription.
  • Videos are queued for transcription using Celery.
  • The WhisperTranscriber class handles the transcription process.
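
To give a rough idea of what a Whisper wrapper class can look like, here is a minimal sketch. It is not the repository's WhisperTranscriber: the class name WhisperTranscriberSketch, its constructor parameters, and the choice of the "base" model are assumptions made for illustration. It relies on the openai-whisper package's whisper.load_model and model.transcribe calls, and accepts an injected model so it can be tried without Whisper installed.

```python
from typing import Any, Optional


class WhisperTranscriberSketch:
    """Hypothetical wrapper; the actual class in app/ may differ."""

    def __init__(self, model_name: str = "base", model: Optional[Any] = None):
        self.model_name = model_name
        # An already-loaded model (or a test double) can be injected here.
        self._model = model

    def _load(self) -> Any:
        if self._model is None:
            # Imported lazily so the class can be defined without the
            # openai-whisper package being installed.
            import whisper
            self._model = whisper.load_model(self.model_name)
        return self._model

    def transcribe(self, audio_path: str) -> str:
        # model.transcribe returns a dict; the full text is under "text".
        result = self._load().transcribe(audio_path)
        return result["text"].strip()
```

Loading the model lazily means the cost of reading the weights is paid on the first transcription rather than at import time, which is one reasonable choice for a worker process.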

Feel free to explore and modify the code according to your needs. If you have any questions or need further assistance, don't hesitate to reach out. Happy coding!
