Melvin ASR

Melvin ASR is an application serving REST and WebSocket endpoints for the transcription of audio files.

REST API: The API is based on HTTP requests and handles the transcription of files in an asynchronous workflow: the user sends an audio file in a first request and retrieves the transcription with a second request once the transcript is ready. See REST Documentation
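The two-request workflow implies that a client has to poll until the transcript is ready. The following is a minimal sketch of such a polling loop, not the project's client code. The function `wait_for_transcript` and its `fetch` callback are hypothetical names; `fetch` stands in for whatever second HTTP request the REST Documentation specifies and is assumed to return `None` while the transcript is still pending.

```python
import time
from typing import Callable, Optional


def wait_for_transcript(
    fetch: Callable[[], Optional[dict]],
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call `fetch` repeatedly until it returns a transcript.

    `fetch` is a caller-supplied function (an assumption of this sketch)
    that performs the second request and returns None while the
    transcript is not ready yet.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("transcript was not ready in time")
        # Wait before asking again to avoid hammering the service.
        sleep(interval_s)
```

Passing the request as a callback keeps the loop independent of the actual endpoint paths and response format, which are defined in the REST Documentation and not assumed here.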

WebSocket API: The API provides streaming capabilities. See WebSocket Documentation

Getting Started

Prerequisites

Before you begin, ensure you have installed the following tools:

Python 3.10
Docker & Docker Compose
Visual Studio Code
ffmpeg

Run docker compose

Clone this repository to your local machine:

git clone https://github.com/shuffle-project/melvin-asr.git

Build and run the app using Docker Compose from the root directory:
```
docker-compose up
```
Access the REST-API at http://localhost:8393
Access the WebSocket-API at http://localhost:8394. This is built upon Python's websockets package.
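As a rough illustration of talking to the streaming endpoint, the sketch below sends audio bytes in chunks using the same `websockets` package. It rests on several assumptions: the helper names `iter_chunks` and `stream_audio` are hypothetical, the chunk size is arbitrary, and the idea that the server answers with a message after receiving binary frames is a guess. The actual message protocol is described in the WebSocket Documentation. Note that the `websockets` client expects a `ws://` URI, for example `ws://localhost:8394`.

```python
import asyncio
from typing import Iterator, Optional


def iter_chunks(data: bytes, chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield consecutive slices of `data`, each at most `chunk_size` bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


async def stream_audio(uri: str, audio: bytes, reply_timeout_s: float = 5.0) -> Optional[object]:
    """Send `audio` in binary frames and return the first reply, if any.

    Assumption: the server accepts raw binary frames and eventually
    sends a message back. Check the WebSocket Documentation for the
    real protocol.
    """
    # Imported lazily so iter_chunks works without the package installed.
    import websockets

    async with websockets.connect(uri) as ws:
        for chunk in iter_chunks(audio):
            await ws.send(chunk)  # bytes are sent as binary frames
        try:
            return await asyncio.wait_for(ws.recv(), timeout=reply_timeout_s)
        except asyncio.TimeoutError:
            return None
```

With the Docker Compose stack running, a call could look like `asyncio.run(stream_audio("ws://localhost:8394", audio_bytes))`, where `audio_bytes` is audio data you loaded yourself.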

Local Development

Besides the local Docker Compose stack, there is an option to run both services directly on your local machine.

Install dependencies

pip install -r ./requirements.txt

Run locally

In a local development environment, the WebSocket and the Flask API are started separately.

python app.py

Research

To optimize ASR, several proofs of concept have been carried out to find out which solutions work most efficiently. Take a look at the following pages:

Configuration

The service is configured in the config.yml and config.local.yml files. config.local.yml is used for local development, config.yml for Docker.

These files are read by the src/helper/config.py module, which provides the configuration to the service logic.
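To make the split between the two files concrete, here is a small sketch of how a loader could pick one of them. It is not the code in src/helper/config.py. The functions `select_config_file` and `load_config` and the `local` flag are hypothetical, and how the project actually decides between the files is not stated here. Parsing uses PyYAML's `yaml.safe_load`, which is an assumed dependency.

```python
from pathlib import Path


def select_config_file(local: bool, root: Path = Path(".")) -> Path:
    """Return the config path for local development or for Docker."""
    name = "config.local.yml" if local else "config.yml"
    return root / name


def load_config(local: bool, root: Path = Path(".")) -> dict:
    """Parse the selected YAML file into a dictionary."""
    # PyYAML is a third-party package; imported lazily so the path
    # selection above can be used without it.
    import yaml

    with select_config_file(local, root).open(encoding="utf-8") as fh:
        # An empty file parses to None, so fall back to an empty dict.
        return yaml.safe_load(fh) or {}
```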

Linting & Testing

The project uses Ruff for linting and formatting code and Pytest for unit tests. See Test Documentation

Deployment

The project is delivered and deployed as a Docker container. Depending on whether a GPU or CPU is used, different factors come into play. See Deployment Documentation

Code Integration

We maintain our code following trunk-based development. This means we work on feature branches and integrate them into one trunk, the main branch. Please keep your side branches small, and bring them back to main as soon as possible.

License

This project is licensed under the MIT License - see the LICENSE file for details.
