CTC LSTM

Spoken word recognition using LSTMs trained with CTC (connectionist temporal classification).
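For readers new to CTC, here is a minimal sketch of the greedy collapse step that turns per-frame labels into text: merge consecutive repeats, then drop the blank label. This is a generic illustration, not necessarily how main.py decodes; the blank index, the `ALPHABET` list, and the function name `ctc_greedy_collapse` are assumptions made for the example.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank label (hypothetical)


def ctc_greedy_collapse(frame_labels, blank=BLANK):
    """Merge consecutive repeated labels, then remove blanks."""
    merged = [label for label, _ in groupby(frame_labels)]
    return [label for label in merged if label != blank]


# Hypothetical alphabet: index 0 is the blank, the rest are characters.
ALPHABET = ["-", "c", "a", "t"]

# Most likely label per audio frame, e.g. the argmax over the LSTM outputs.
frames = [1, 1, 0, 2, 2, 2, 0, 0, 3]
decoded = "".join(ALPHABET[i] for i in ctc_greedy_collapse(frames))
print(decoded)
```

A blank between two identical labels keeps them separate, which is how doubled letters survive the collapse: `[1, 0, 1]` becomes `[1, 1]`.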

Instructions

Create a virtual environment: python -m venv venv
Install the required packages: ./venv/bin/pip install -r requirements.txt
Train the model: ./venv/bin/python main.py train (takes a few hours and needs around 20 GB of disk space and 5 GB of memory)
- Or download my pre-trained model (25 epochs, not very accurate) from here and move it to target/model-final.ckpt
Test the final model: ./venv/bin/python main.py test
Infer text from flac: ./venv/bin/python main.py infer audio.flac
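
To transcribe several files in one go, the infer command above can be wrapped in a short script. The sketch below assumes that main.py infer prints the recognized text to standard output, which this README does not confirm; `infer_command` and `infer_directory` are hypothetical helper names, not part of this repository.

```python
import subprocess
from pathlib import Path

PYTHON = "./venv/bin/python"  # interpreter from the virtual environment


def infer_command(audio_path):
    """Build the argument list for the documented infer command."""
    return [PYTHON, "main.py", "infer", str(audio_path)]


def infer_directory(directory):
    """Run inference on every .flac file in a directory.

    Returns a dict mapping each file path to whatever the command
    printed on stdout (assumed here to be the recognized text).
    """
    results = {}
    for audio_path in sorted(Path(directory).glob("*.flac")):
        completed = subprocess.run(
            infer_command(audio_path),
            capture_output=True,
            text=True,
            check=True,  # raise if main.py exits with an error
        )
        results[str(audio_path)] = completed.stdout.strip()
    return results
```

Run it from the repository root so that the relative paths to main.py and the virtual environment resolve, for example `infer_directory("recordings")` for a hypothetical recordings folder.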