## Using the Dunya API

This notebook demonstrates how to download data using the CompMusic Python library (https://github.com/MTG/pycompmusic), which includes a client library for accessing Dunya. `pycompmusic` is already installed in the Docker image and is ready to use.

To download sounds from Dunya, you need a user account and an API authentication key (token). Please register here: https://dunya.compmusic.upf.edu/social/register/ 
To get your API token, log in to Dunya and go to your profile page, where the token is displayed.

This example demonstrates:
 * downloading a single file using a recording's MusicBrainz ID
 * downloading files of a CompMusic dataset (https://github.com/MTG/otmm_makam_recognition_dataset)
    
The [MusicBrainz ID](https://musicbrainz.org/doc/MusicBrainz_Identifier) for a recording is the UUID at the end of the URL of its MusicBrainz page. For example, the recording https://musicbrainz.org/recording/e666ec52-b752-492d-9423-24e1c7bffbc7 has the MusicBrainz ID `e666ec52-b752-492d-9423-24e1c7bffbc7`.
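
As an illustration of the rule above, here is a minimal sketch that pulls the MusicBrainz ID out of a recording URL using only the standard library. The helper name `extract_mbid` is hypothetical and is not part of `pycompmusic`; it simply looks for a UUID at the end of the string.

```python
import re
import uuid

# Matches the canonical 8-4-4-4-12 hexadecimal UUID layout at the end of a
# string, optionally followed by a trailing slash.
_UUID_AT_END = re.compile(
    r'([0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})/?$'
)


def extract_mbid(url):
    """Return the MusicBrainz ID found at the end of `url` (hypothetical helper)."""
    match = _UUID_AT_END.search(url.strip())
    if match is None:
        raise ValueError('No MusicBrainz ID found in {!r}'.format(url))
    # uuid.UUID validates the value and normalises it to lowercase
    return str(uuid.UUID(match.group(1)))


print(extract_mbid(
    'https://musicbrainz.org/recording/e666ec52-b752-492d-9423-24e1c7bffbc7'
))
```

Because the pattern is anchored at the end of the string, the same helper accepts both `http://` and `https://` URLs, as well as a bare ID.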

In [None]:
# Set your token here from https://dunya.compmusic.upf.edu/user/profile/
token = '...yourAPITokenGoesHere...'
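
If the notebook is going to be shared or committed, it may be preferable not to paste the token into a cell. The following is a minimal sketch that reads it from an environment variable instead. The variable name `DUNYA_TOKEN` and the helper `load_token` are assumptions made for this example; neither Dunya nor `pycompmusic` defines them.

```python
import os


def load_token(env_var='DUNYA_TOKEN'):
    """Read the API token from an environment variable (hypothetical helper).

    `DUNYA_TOKEN` is a name chosen for this sketch, not one defined by Dunya.
    """
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(
            'Set the {} environment variable to your Dunya API token'.format(env_var)
        )
    return token


# Usage, once the variable has been set in the environment:
# token = load_token()
```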

In [None]:
import collections
import json
import os
from compmusic import dunya

dunya.set_token(token)

#### Downloading a single file
Example: the recording https://musicbrainz.org/recording/e666ec52-b752-492d-9423-24e1c7bffbc7


In [None]:
musicbrainzid = 'e666ec52-b752-492d-9423-24e1c7bffbc7'
data_dir = '../data/compMusicDatasets/turkishMakam/'
_ = dunya.makam.download_mp3(musicbrainzid, data_dir)

#### Downloading a set of files
Example: Audio from the following dataset https://github.com/MTG/otmm_makam_recognition_dataset

The OTMM Makam Recognition Dataset comes with a JSON file listing a number of recordings which exist in Dunya, along with some additional metadata. This file has been copied to this repository.

We are going to download two audio files for each makam.
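
The selection logic can be tried without network access first. The sketch below groups a few records by makam and keeps at most two per group. The records are invented placeholders: only the `makam` and `mbid` keys are taken from the code in this notebook, and the values (including the `placeholder-…` IDs) are not real entries from the dataset. The helper name `select_per_group` is also hypothetical.

```python
import collections

# Invented placeholder records; real entries come from annotations.json
records = [
    {'makam': 'Hicaz', 'mbid': 'http://musicbrainz.org/recording/placeholder-1'},
    {'makam': 'Rast', 'mbid': 'http://musicbrainz.org/recording/placeholder-2'},
    {'makam': 'Hicaz', 'mbid': 'http://musicbrainz.org/recording/placeholder-3'},
    {'makam': 'Hicaz', 'mbid': 'http://musicbrainz.org/recording/placeholder-4'},
]


def select_per_group(records, key, limit):
    """Group `records` by `record[key]` and keep the first `limit` of each group."""
    groups = collections.defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    # Slicing keeps the original order and tolerates groups smaller than `limit`
    return {name: items[:limit] for name, items in groups.items()}


selected = select_per_group(records, 'makam', 2)
for makam, items in selected.items():
    print(makam, len(items))
```

Slicing with `items[:limit]` means a makam with fewer than two recordings simply contributes what it has, rather than raising an error.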

In [None]:
# Reading the dataset description file which contains a list of references to audio
with open(os.path.join(data_dir, 'annotations.json')) as fp:
    collectionFiles = json.load(fp)

# Collecting the list of makams in this dataset
makams = collections.defaultdict(list)
for file in collectionFiles:
    makam = file['makam']
    makams[makam].append(file)

# Create sub-directories for makams and download a few files for each makam
num_files_per_makam = 2

print('Downloading files for {} makams'.format(len(makams)))
for makam, files in makams.items():
    print(' {}'.format(makam))
    makam_dir = os.path.join(data_dir, makam)
    os.makedirs(makam_dir, exist_ok=True)
    
    for file in files[:num_files_per_makam]:
        # The 'mbid' field holds a full MusicBrainz URL (http or https); keep only the trailing ID
        musicbrainzid = file['mbid'].rstrip('/').rsplit('/', 1)[-1]
        dunya.makam.download_mp3(musicbrainzid, makam_dir)

print('Sub-folders and files created in {}'.format(data_dir))
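
To check the result of the loop above, one option is to count the audio files in each sub-folder. This is a minimal sketch that assumes the downloaded files end in `.mp3`, which is a guess based on the function name `download_mp3` and not something confirmed here. The helper `count_mp3_files` is hypothetical.

```python
import os


def count_mp3_files(root_dir):
    """Return {sub-folder name: number of .mp3 files} for `root_dir` (hypothetical helper)."""
    counts = {}
    for entry in sorted(os.listdir(root_dir)):
        sub_dir = os.path.join(root_dir, entry)
        if not os.path.isdir(sub_dir):
            continue  # skip plain files such as annotations.json
        # Assumption: downloaded audio files carry the .mp3 extension
        counts[entry] = sum(
            1 for name in os.listdir(sub_dir) if name.lower().endswith('.mp3')
        )
    return counts


# Usage, after the download cell has run:
# for makam, count in count_mp3_files(data_dir).items():
#     print(' {}: {} file(s)'.format(makam, count))
```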