Tool to collect and review sentences for Common Voice

Common Voice Sentence Collector

Get involved

  • Fork the project and test that you can run the environment locally following the instructions below.
  • Is everything working as expected? If not, submit a new issue.
  • Review the pending issues on the MVP milestone.
  • Create a new PR to fix any of the existing issues in the MVP milestone.
  • Get involved in the development discussion topic and ask any questions.


Local Development

cp .env_template .env
docker-compose up

Once Kinto has fully started, create an admin account with the password "password" by running the following in a separate terminal window:

curl --header "Content-Type: application/json" \
  --request PUT \
  --data '{"data": {"password": "password"}}' \

If you want to change the password, please also change the KINTO_PASSWORD in .env.
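
To keep the account password and KINTO_PASSWORD in step, the value can be read from .env rather than typed twice. The following is a minimal sketch, assuming .env holds plain NAME=value lines without quoting; read_env_var is a hypothetical helper, not part of the repository.

```shell
# Print the value of NAME from a dotenv-style FILE (lines of NAME=value).
# Assumption: values are unquoted; if NAME appears twice, the last one wins.
read_env_var() {
  file=$1
  name=$2
  sed -n "s/^${name}=//p" "$file" | tail -n 1
}

# Example (not executed here): reuse the value when creating the account.
# password=$(read_env_var .env KINTO_PASSWORD)
```

The helper prints nothing when the variable is missing, so an empty result is a signal to check .env before creating the account.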

Now we can install the dependencies and initialize the database:

yarn init-db

If you get an error along the lines of Error: ENOENT: no such file or directory, scandir '/directory/sentence-collector/voice-web/server/data' you can safely ignore it for now. This folder is used to gather statistics and metadata from the local Common Voice instance. You can develop most of the features for the collector without having that repository around.

Finally, start the frontend with yarn. Make sure that you are in the root directory of the repository.

yarn start

The sentence collector is now accessible through http://localhost:1234.
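
The setup above relies on docker-compose, curl and yarn being installed. A small check like the one below can report which of them are missing before you start; missing_tools is a hypothetical helper introduced only for illustration.

```shell
# Print each command from the argument list that is not found on PATH.
# Prints nothing when every command is available.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example (not executed here):
# missing_tools docker-compose curl yarn
```

An empty result means all listed commands were found.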


Deployment

The website is hosted on GitHub Pages. Contributors with write access to the repository can deploy to production by running the following command. Please note that you will need to provide a GitHub token, which is used for the release notes. Read more about tokens on GitHub.

GITHUB_TOKEN=... yarn run deploy

This assumes that your origin is pointing to this repository. If not, you can specify the remote name with:

GITHUB_TOKEN=... yarn run deploy -- -o <remotename>

This will also create release notes on GitHub.
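
The two deploy variants differ only in the optional remote name, and both need GITHUB_TOKEN. A possible wrapper, sketched under the assumption that you want to check the token before calling yarn, could look like this; deploy_command and require_token are hypothetical names.

```shell
# Print the deploy command, adding "-- -o <remotename>" when a remote is given.
deploy_command() {
  if [ -n "${1:-}" ]; then
    printf 'yarn run deploy -- -o %s\n' "$1"
  else
    printf 'yarn run deploy\n'
  fi
}

# Return non-zero with a message on stderr if GITHUB_TOKEN is unset or empty.
require_token() {
  if [ -z "${GITHUB_TOKEN:-}" ]; then
    echo "GITHUB_TOKEN is not set" >&2
    return 1
  fi
}
```

Printing the command instead of running it makes it easy to review what would be executed before deploying to production.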



Exporting sentences

Make sure you have the voice-web repository cloned locally first.

git clone

Anyone with a locally working setup can export sentences to be added to Common Voice. Make sure to have your .env file correctly set up, including the correct path to the Common Voice (voice-web) repository as well as the Kinto credentials depending on the environment.

If you want to run the export against the local instance, remove the SC_SYSTEM environment variable below.

SC_SYSTEM=production yarn run export

This will export all the approved sentences for languages currently active in Common Voice and put them into sentence-collector.txt files in the corresponding locale folder of the Common Voice repository. After the script has run, you can verify the output by running git status in the voice-web repository.
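
In addition to git status, it can help to see how many sentences each exported file contains. The sketch below assumes one sentence per line in each sentence-collector.txt; summarize_export is a hypothetical helper and takes the directory to search as its only argument.

```shell
# For every sentence-collector.txt below DIR, print "<path>: <line count>".
# Assumption: each line of the file holds exactly one sentence.
summarize_export() {
  find "$1" -type f -name 'sentence-collector.txt' | sort |
    while IFS= read -r file; do
      printf '%s: %s\n' "$file" "$(wc -l < "$file" | tr -d ' ')"
    done
}

# Example (not executed here):
# summarize_export voice-web
```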

Exporting to the official repository

  1. Make sure you have forked the voice-web repository under your own GitHub account.
  2. Clone voice-web locally and add your fork as a remote for exports:
git clone
cd voice-web
git remote add fork

The full set of steps to export to your fork (repeat these each time you want to make an updated export):

cd voice-web
## Making sure our master branch is up to date
git checkout master
git pull origin master
git push fork master
## Removing the branch from the previous export (both commands report an error if the branch does not exist yet)
git push --delete fork sentence-collector-export
git branch -D sentence-collector-export
## Creating a new branch just for the export
git checkout -b sentence-collector-export
cd ..
## Creating the export
SC_SYSTEM=production yarn run export
## Committing the export to our fork
cd voice-web
git add .
git commit -am "Sentence Collector - validated sentences export - 2019-02-13-13-28"
git push fork sentence-collector-export
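
The commit message in the steps above ends in a hard-coded timestamp. If you repeat the export regularly, the message can be generated instead. This is a minimal sketch that assumes the YYYY-MM-DD-HH-MM format shown in the example; export_commit_message is a hypothetical helper.

```shell
# Print the export commit message with the current local time appended
# in YYYY-MM-DD-HH-MM format.
export_commit_message() {
  printf 'Sentence Collector - validated sentences export - %s\n' \
    "$(date +%Y-%m-%d-%H-%M)"
}

# Example (not executed here):
# git commit -am "$(export_commit_message)"
```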

Now you will be able to create a manual pull request using the following URL:

Adding a new user

You can add as many users as you want. To do so, call the accounts endpoint again:

curl --header "Content-Type: application/json" \
  --request PUT \
  --data '{"data": {"password": "THIS_IS_YOUR_PASSWORD"}}' \

where USERNAME is your username and THIS_IS_YOUR_PASSWORD is your password.

To create a user "Bob" with the password "mozilla":

curl --header "Content-Type: application/json" \
  --request PUT \
  --data '{"data": {"password": "mozilla"}}' \
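
The curl calls above differ only in the password inside the JSON body. The sketch below builds that body from an argument and escapes double quotes and backslashes so that such passwords do not break the JSON. It does not handle control characters, and json_escape and account_payload are hypothetical helpers.

```shell
# Escape backslashes and double quotes so the value can sit inside a JSON string.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Print the request body used by the accounts endpoint for the given password.
account_payload() {
  printf '{"data": {"password": "%s"}}\n' "$(json_escape "$1")"
}

# Example (not executed here): pass the result to curl with
# --data "$(account_payload mozilla)"
```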