A TensorFlow implementation of Baidu's DeepSpeech architecture
Type Name Latest commit message Commit time
.github Add lock bot config Dec 28, 2018
bin Introducing utils.helpers for miscellaneous helper functions Jan 14, 2020
data Add 8kHz training test coverage Jan 10, 2020
doc Ensure properly link to TensorFlow r1.15 Jan 22, 2020
examples Remove example code Dec 10, 2019
images Updating Geometry Dec 2, 2019
native_client Adjust Buffer length to account for element size inside the JS binding Jan 27, 2020
taskcluster Ensure properly link to TensorFlow r1.15 Jan 22, 2020
util Fix whitespace Jan 27, 2020
.cardboardlint.yml Update cardboardlint configuration Oct 4, 2019
.compute Ensure properly link to TensorFlow r1.15 Jan 22, 2020
.gitattributes Remove old versions of decoder binary files Nov 8, 2018
.gitignore Sphinx doc Sep 24, 2019
.gitmodules Use submodule for building contrib examples into docs Dec 10, 2019
.pylintrc Remove alphabet param usage Nov 5, 2019
.readthedocs.yml Re-enable readthedocs.io Sep 24, 2019
.taskcluster.yml Move to TC Community Nov 5, 2019
.travis.yml Add pylint CI Apr 11, 2019
CODE_OF_CONDUCT.md Add Mozilla Code of Conduct file Mar 29, 2019
CONTRIBUTING.rst Move from Markdown to reStructuredText Oct 4, 2019
DeepSpeech.py Fix whitespace Jan 27, 2020
Dockerfile Switch TF dependency to r1.15 branch Jan 12, 2020
GRAPH_VERSION Embed alphabet directly in model Nov 5, 2019
ISSUE_TEMPLATE.md Create an issue template Nov 27, 2017
LICENSE Added LICENSE Sep 20, 2016
README.rst Bump version to v0.6.1 Jan 10, 2020
RELEASE.rst Move from Markdown to reStructuredText Oct 4, 2019
SUPPORT.rst Move from Markdown to reStructuredText Oct 4, 2019
VERSION Bump version to v0.6.1 Jan 10, 2020
bazel.patch Proper re-use of Bazel cache Jan 31, 2018
build-python-wheel.yml-DISABLED_ENABLE_ME_TO_REBUILD_DURING_PR Move to ARMbian Buster Aug 21, 2019
evaluate.py Update evaluate.py Jan 21, 2020
evaluate_tflite.py Fix linter errors Jan 12, 2020
requirements.txt Switch TF dependency to r1.15 branch Jan 12, 2020
requirements_eval_tflite.txt Update evaluate_tflite requirements Jan 12, 2020
stats.py Introducing utils.helpers for miscellaneous helper functions Jan 14, 2020
transcribe.py Separate process per file; less log noise Nov 20, 2019

README.rst

Project DeepSpeech


DeepSpeech is an open-source speech-to-text engine that uses a model trained with machine learning techniques based on Baidu's Deep Speech research paper. Project DeepSpeech uses Google's TensorFlow to make the implementation easier.

NOTE: This documentation applies to the v0.6.1 version of DeepSpeech only. If you're using a different release, you must use the documentation for the corresponding version, which you can select with GitHub's branch switcher button.

To install and use deepspeech, all you have to do is:

# Create and activate a virtualenv
virtualenv -p python3 $HOME/tmp/deepspeech-venv/
source $HOME/tmp/deepspeech-venv/bin/activate

# Install DeepSpeech
pip3 install deepspeech

# Download pre-trained English model and extract
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.6.1/deepspeech-0.6.1-models.tar.gz
tar xvf deepspeech-0.6.1-models.tar.gz

# Download example audio files
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.6.1/audio-0.6.1.tar.gz
tar xvf audio-0.6.1.tar.gz

# Transcribe an audio file
deepspeech --model deepspeech-0.6.1-models/output_graph.pbmm --lm deepspeech-0.6.1-models/lm.binary --trie deepspeech-0.6.1-models/trie --audio audio/2830-3980-0043.wav
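
The example archive contains more than one audio file, so the single command above can be wrapped in a loop. The following is a minimal sketch, not part of DeepSpeech itself: `transcribe_all`, `MODEL_DIR`, `AUDIO_DIR` and `DRY_RUN` are hypothetical names introduced here, and the default paths assume the archives were extracted in the current directory as shown above.

```shell
#!/bin/sh
# Sketch: run the deepspeech CLI on every .wav file in a directory.
# MODEL_DIR and AUDIO_DIR are assumptions based on the archives extracted above.
MODEL_DIR="${MODEL_DIR:-deepspeech-0.6.1-models}"
AUDIO_DIR="${AUDIO_DIR:-audio}"

# transcribe_all is a hypothetical helper, not a DeepSpeech command.
# When DRY_RUN is set to a non-empty value, each command is printed
# instead of being executed.
transcribe_all() {
    for wav in "$AUDIO_DIR"/*.wav; do
        # Skip the unexpanded pattern when the directory has no .wav files
        [ -e "$wav" ] || continue
        ${DRY_RUN:+echo} deepspeech \
            --model "$MODEL_DIR/output_graph.pbmm" \
            --lm "$MODEL_DIR/lm.binary" \
            --trie "$MODEL_DIR/trie" \
            --audio "$wav"
    done
}
```

After sourcing the snippet, call `transcribe_all` to transcribe every file. Setting `DRY_RUN=1` first prints the commands, which is a cheap way to check the paths before loading the model.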

A pre-trained English model is available and can be downloaded as shown in the commands above. A package with some example audio files is also available for download from our release notes.

Inference runs faster on a supported NVIDIA GPU on Linux. See the release notes to find which GPUs are supported. To run deepspeech on a GPU, install the GPU-specific package:

# Create and activate a virtualenv
virtualenv -p python3 $HOME/tmp/deepspeech-gpu-venv/
source $HOME/tmp/deepspeech-gpu-venv/bin/activate

# Install DeepSpeech CUDA-enabled package
pip3 install deepspeech-gpu

# Transcribe an audio file.
deepspeech --model deepspeech-0.6.1-models/output_graph.pbmm --lm deepspeech-0.6.1-models/lm.binary --trie deepspeech-0.6.1-models/trie --audio audio/2830-3980-0043.wav

Please ensure you have the required CUDA dependencies.
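
If the same setup script has to work on machines with and without a GPU, the package name can be chosen at install time. The sketch below is only a heuristic: `pick_package` is a hypothetical helper, and it assumes that a working `nvidia-smi` on the `PATH` indicates a usable NVIDIA driver. It does not verify that the required CUDA dependencies are installed or that the GPU is one of the supported models.

```shell
#!/bin/sh
# Sketch: choose between the CPU and GPU pip packages.
# pick_package is a hypothetical helper, not part of DeepSpeech.
pick_package() {
    # Assumption: nvidia-smi exists and exits successfully only when
    # an NVIDIA driver is installed and can see a GPU.
    if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi >/dev/null 2>&1; then
        echo deepspeech-gpu
    else
        echo deepspeech
    fi
}
```

Inside an activated virtualenv this could be used as `pip3 install "$(pick_package)"`. The CUDA dependencies still need to be checked separately.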

See the output of deepspeech -h for more information on the use of deepspeech. If you experience problems running deepspeech, please check the required runtime dependencies.

