Update README
FluxCapacitor2 committed Jul 21, 2023
1 parent 4f11f89 commit ac29298
Showing 3 changed files with 24 additions and 4 deletions.
Binary file added .github/readme_images/app_dark.png
Binary file added .github/readme_images/app_light.png
28 changes: 24 additions & 4 deletions README.md
@@ -1,13 +1,33 @@
# `whisper-asr-webapp`

[![Docker](https://github.com/FluxCapacitor2/whisper-asr-webapp/actions/workflows/docker.yml/badge.svg)](https://github.com/FluxCapacitor2/whisper-asr-webapp/actions/workflows/docker.yml)
![GitHub last commit (branch)](https://img.shields.io/github/last-commit/FluxCapacitor2/whisper-asr-webapp/main)

A web app for automatic speech recognition using OpenAI's Whisper model running locally.

![](/.github/readme_images/app_dark.png#gh-dark-mode-only)
![](/.github/readme_images/app_light.png#gh-light-mode-only)

## Features

- Customize the model, language, and initial prompt
- Enable per-word timestamps (visible in downloaded JSON output)
- Runs Whisper locally
- Pre-packaged into a single Docker image
- View timestamped transcripts in the app
- Download transcripts in plain text, VTT, SRT, TSV, or JSON formats
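
As an illustration of the export formats listed above, here is a minimal sketch of turning timestamped segments into SRT. It assumes each segment is a dict with `start`, `end` (seconds), and `text` keys, as in Whisper's own transcription result; the helper names `srt_timestamp` and `segments_to_srt` are hypothetical and are not taken from this repository.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Render segments (assumed keys: start, end, text) as numbered SRT cues."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(cues)
```

VTT differs mainly in its `WEBVTT` header and in using a period instead of a comma before the milliseconds, so the same loop could be adapted for it.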

## Architecture

-The frontend is built with Svelte and builds to static HTML, CSS, and JS. It makes requests to the backend, which is on a separate port but has permissive CORS headers.
+The frontend is built with Svelte and builds to static HTML, CSS, and JS.

The backend is built with FastAPI. The main endpoint, `/transcribe`, pipes an uploaded file into ffmpeg, then into Whisper. Once transcription is complete, it's returned as a JSON payload.

In a containerized environment, the static assets from the frontend build are served by the same FastAPI (Uvicorn) server that handles transcription.
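
The "upload, then ffmpeg, then Whisper" flow described above can be sketched with the standard library alone. This is not the project's actual code: the ffmpeg flags are an assumption chosen to produce the 16 kHz mono PCM that Whisper models expect, and the function names are hypothetical. Running `decode_to_pcm` requires `ffmpeg` on the `PATH`.

```python
import array
import subprocess
import sys

SAMPLE_RATE = 16_000  # Whisper models operate on 16 kHz mono audio


def ffmpeg_decode_command(sample_rate: int = SAMPLE_RATE) -> list[str]:
    """Build an ffmpeg command that reads audio from stdin and writes raw PCM to stdout."""
    return [
        "ffmpeg",
        "-i", "pipe:0",          # read the uploaded file from stdin
        "-f", "s16le",           # raw signed 16-bit little-endian samples
        "-ac", "1",              # downmix to mono
        "-acodec", "pcm_s16le",
        "-ar", str(sample_rate),
        "pipe:1",                # write the decoded audio to stdout
    ]


def decode_to_pcm(uploaded: bytes) -> bytes:
    """Pipe the uploaded bytes through ffmpeg and return the raw PCM output."""
    result = subprocess.run(
        ffmpeg_decode_command(), input=uploaded, capture_output=True, check=True
    )
    return result.stdout


def pcm_to_floats(pcm: bytes) -> list[float]:
    """Convert signed 16-bit little-endian PCM to floats in [-1.0, 1.0)."""
    samples = array.array("h")
    samples.frombytes(pcm)
    if sys.byteorder == "big":
        samples.byteswap()  # the stream is little-endian regardless of host
    return [s / 32768.0 for s in samples]
```

In a real backend the normalized samples would typically be held in a NumPy `float32` array and handed to the model's `transcribe` method; that step is omitted here to keep the sketch dependency-free.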

## Running

-1. Clone the repository.
-2. Run `docker compose up`. This will build and run both the frontend and backend.
-3. Open `http://localhost:5000` in your web browser.
+1. Pull and run the image with Docker.
+- Run in an interactive terminal: `docker run --rm -it -p 8000:8000 -v whisper_models:/root/.cache/whisper ghcr.io/fluxcapacitor2/whisper-asr-webapp:main`
+- Run in the background: `docker run -d -p 8000:8000 -v whisper_models:/root/.cache/whisper ghcr.io/fluxcapacitor2/whisper-asr-webapp:main`
+2. Visit http://localhost:8000 in a web browser
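
When the container is started in the background, it can be handy to wait until the app answers before opening the browser. The following is a small sketch using only the standard library; `wait_for_app` is a hypothetical helper, and the default URL simply mirrors the port mapping in the `docker run` commands above.

```python
import time
import urllib.error
import urllib.request


def wait_for_app(
    url: str = "http://localhost:8000",
    timeout: float = 60.0,
    interval: float = 1.0,
) -> bool:
    """Poll the URL until it returns any HTTP response or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            # The server responded, even if with an error status.
            return True
        except (urllib.error.URLError, OSError):
            # Not reachable yet; retry after a short pause.
            time.sleep(interval)
    return False


if __name__ == "__main__":
    print("ready" if wait_for_app() else "not reachable")
```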
