fe33918 Mar 19, 2015
85 lines (52 sloc) 3.31 KB


In the ideal world, this project is based entirely on community contributions. Your help is most appreciated.

Adding an audio sample

Find an audio sample in a known language, and propose that it be added to the index. Good samples are:

  • Primarily in one language
  • Of a few minutes in length, no more than an hour
  • From a known source, so that it can be correctly attributed
  • (Ideally) In a language that doesn't yet have much coverage

You can propose an audio sample by opening a Github issue with a link to the page for the sample, making sure to identify its language.

For the tech savvy, feel free to open a pull request directly with a proposed JSON record for the new sample. Be sure your contribution survives a make audit.

Correcting a sample's language

Please raise a Github issue about it.

Annotating samples

As well as contributing sound files, it helps to know which parts of the sound files are ok to use as examples of a language. This is called annotation.

You can contribute an annotation by raising a Github issue, just like for a sample. But, this takes a lot of time. It's much better to do a number at once, using the built-in annotation tool. This section is about using that tool. To begin with, we need to install it.

Installing the codebase

A built-in console annotation tool is provided for OS X and Linux. The instructions for getting started on OS X are as follows:

Firstly, install Homebrew, which will provide the brew command.

Next, we need to fetch the index.

brew install git
git clone
cd wide-language-index
make env

The index just says what media files are available, now we need to get the files themselves.

make fetch

Now we have the sound files and we're ready to go. Start an annotation session by running:

Using the annotation tool

You can start an annotation session at any time in the Terminal by going into the wide-language-index/ directory and running:

make annotate

The tool will guide you on making annotations. You can stop at any time by pressing CTRL-C.

Contributing your annotations

You can contribute your annotations back using a Github fork-and-pull workflow. You can read more about this type of workflow on Github. We'll step you through it here.

Step 1: sign up for Github. Doing this will get you a username. We'll use myusername as an example from here on out.

Step 2: fork wide-language-index on Github. Do this by visiting and clicking the "Fork" button on the top-right.

Step 3: locally commit your changes. In the Terminal, run:

cd wide-language-index
git add index
git commit -m 'Made annotations.'

Step 4: push your local changes to Github

git remote add myusername
git push -u myusername master

Step 5: make a pull request in Github. In your web browser, go to (remember to change myusername for yours). Click the "Make pull request" button, and follow the prompts. Once you've made a pull-request, it will be reviewed and accepted.