# Arab-Andalusian Collection - Corpus
This notebook downloads and analyses the Arab-Andalusian music corpus in Dunya. The corpus comprises 164 recordings together with their Dunya metadata. Most recordings also have note transcriptions stored as an XML score, plus further metadata from MusicBrainz. You can select all or part of the corpus, download the data, and compute the pitch profile, pitch distribution and tonic frequency of each recording.

## Initialization (MANDATORY)
This cell loads all the required libraries. 
It then checks whether the Dunya metadata for the Arab-Andalusian corpus has already been downloaded and, if not, downloads it. 
Finally, it creates an object to manage the Dunya metadata.

#### NB: Before running this cell, remember to add your Dunya token to the constants.py file in the "utilities" directory.

In [10]:
from utilities.recordingcomputation import *
from utilities.dunyautilities import *
from utilities.metadataStatistics import *
from utilities.constants import *
from gui.gui_corpora import *

# download metadata from Dunya
if not check_dunya_metadata():
    print("Downloading metadata from Dunya...")
    collect_metadata()

# create an object with all the well-structured metadata
print("Analyzing Dunya Metadata...")
cm = CollectionMetadata()
print("Collection of metadata created")

Analyzing Dunya Metadata...

of pandas will change to not sort by default.
To accept the future behavior, pass 'sort=False'.
  df = pd.concat([df, new_row])

Collection of metadata created


## Select a list of recordings

The widget in this cell lets you select a subset of the Arab-Andalusian corpus. The options in the first row filter the recordings by nawba, tab, mizan and form. All these values come from the Dunya metadata. 

The options in the second row filter the recordings according to whether the mp3 file, the XML score and the MusicBrainz metadata have been downloaded.
You can also restrict the selection to recordings that have, or have not, already been analysed. 

Finally, the resulting list of recordings can be adjusted manually using the checkboxes.

In [13]:
selector = SelectionGui(cm, 10)

VBox(children=(Label(value='   SELECT CHARACTERISTICS: '), HBox(children=(VBox(children=(Label(value='   ṭāb‘'…
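The kind of filtering the widget performs can be pictured with a few lines of plain Python. This is only a minimal sketch: the record layout, the field names and the placeholder values below are assumptions made for illustration, and `select_recordings` is a hypothetical helper, not part of the notebook's utilities or of the Dunya API.

```python
# Hypothetical records: field names and values are placeholders,
# not the actual Dunya metadata schema.
recordings = [
    {"mbid": "rec-1", "nawba": "nawba-A", "tab": "tab-X", "mizan": "mizan-1", "form": "form-1"},
    {"mbid": "rec-2", "nawba": "nawba-A", "tab": "tab-Y", "mizan": "mizan-2", "form": "form-1"},
    {"mbid": "rec-3", "nawba": "nawba-B", "tab": "tab-X", "mizan": "mizan-1", "form": "form-2"},
]


def select_recordings(records, **criteria):
    """Return the ids of the records that match every given criterion.

    A criterion set to None means "any value", like an unset dropdown.
    """
    active = {key: value for key, value in criteria.items() if value is not None}
    return [
        record["mbid"]
        for record in records
        if all(record.get(key) == value for key, value in active.items())
    ]


# Combine two criteria; leaving one as None does not restrict the result.
selected = select_recordings(recordings, nawba="nawba-A", tab=None, form="form-1")
print(selected)
```

Each criterion narrows the list, so leaving every option unset returns the whole collection.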

To extract the list of recordings from the interface above, run the next cell. 
The resulting list contains the MusicBrainz IDs of all the selected recordings.

In [14]:
rmbid_list1 = selector.get_rmbid_list()
print("Number of tracks: " + str(len(rmbid_list1)))

Number of tracks: 22


## Download mp3, score and Musicbrainz metadata

This function checks that the list of MusicBrainz IDs is correct. Any incorrect IDs are shown.

In [15]:
check_before_download(rmbid_list1, cm)

List of recordings in Dunya: 
['c20e4852-d140-4909-acab-e850c0e7d8e8', 'b3d92934-0946-4f2d-8183-312450d7e45e', '97223154-d5c2-4c37-8e6c-4c998056a674', '1630e9c2-3c01-4959-a633-7dbacbc7616e', 'd4cadf34-1074-44ce-9928-f438198d5d6d', '2d2683c4-4b3e-4430-a254-c828427bdcc7', '3e5a82a2-d806-45cc-876e-6fa8a2b5a61d', 'cb85269e-ab6c-4226-aea1-8226be1fe86c', '4342021d-03a1-4727-8c0f-3c23180ef374', '679856bc-132f-4982-b04e-cbf6b5b1129b', '9b546274-eea6-459f-a0c2-918f0997fa2b', '6fe7108c-4e4f-457b-a363-ccf505bdee9a', '44183247-4857-40cd-82bc-b4e9e3f458f1', '92de6fc8-a040-4500-bd94-73e9ee39f189', '33423585-e406-40ec-ba28-88b0768cb668', 'f461045b-50bc-4b20-a731-66fbd3a264ae', 'e3003cd0-430a-4481-a33d-c15c22da2404', 'e22549ae-4a0c-43ef-87f4-e0f81ed49d58', '5eb3c226-d289-40fe-a9f4-697568eb37d5', 'a1eac726-208e-4c24-bd57-e34e9e93dcd3', '9727ddbe-eb79-461b-9861-8a60336b17f6', '023b4a37-1ab4-4593-b03a-850ee0db8350']

List of uncorrect recordings: 
[]
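MusicBrainz IDs are UUIDs, so a purely syntactic check can be run before any request is sent. The sketch below uses only Python's standard `uuid` module; `split_valid_mbids` is a hypothetical helper introduced here for illustration. It does not reproduce what `check_before_download` does, and it cannot tell whether a well-formed ID actually exists in Dunya.

```python
import uuid


def split_valid_mbids(mbids):
    """Split a list of strings into well-formed and malformed MBIDs.

    This only checks the UUID syntax; it does not query Dunya or MusicBrainz.
    """
    valid, invalid = [], []
    for mbid in mbids:
        try:
            parsed = uuid.UUID(mbid)
        except (ValueError, AttributeError, TypeError):
            invalid.append(mbid)
            continue
        # uuid.UUID also accepts braces or missing hyphens, so require
        # the canonical hyphenated form explicitly.
        if str(parsed) == mbid.lower():
            valid.append(mbid)
        else:
            invalid.append(mbid)
    return valid, invalid


valid, invalid = split_valid_mbids(
    ["c20e4852-d140-4909-acab-e850c0e7d8e8", "not-a-valid-id"]
)
print("Well-formed:", valid)
print("Malformed:", invalid)
```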


The following interface downloads the recordings selected above. You can choose which types of data to download.

In [17]:
downloader = DownloadGui(rmbid_list1, cm)

VBox(children=(HBox(children=(Label(value='Select data type:', layout=Layout(width='50%')), Label(value='Selec…


Downloading data for recording 679856bc-132f-4982-b04e-cbf6b5b1129b
 - Score downloaded

Downloading data for recording 9b546274-eea6-459f-a0c2-918f0997fa2b
 - Score downloaded

Downloading data for recording 5eb3c226-d289-40fe-a9f4-697568eb37d5
 - Score downloaded

Downloading data for recording cb85269e-ab6c-4226-aea1-8226be1fe86c
 - Score downloaded

Downloading data for recording 1630e9c2-3c01-4959-a633-7dbacbc7616e
 - Score downloaded

Downloading data for recording 023b4a37-1ab4-4593-b03a-850ee0db8350
 - Score downloaded

Downloading data for recording 2d2683c4-4b3e-4430-a254-c828427bdcc7
 - Score downloaded

Downloading data for recording 4342021d-03a1-4727-8c0f-3c23180ef374
 - Score downloaded

Downloading data for recording 44183247-4857-40cd-82bc-b4e9e3f458f1
 - Score downloaded

Downloading data for recording 9727ddbe-eb79-461b-9861-8a60336b17f6
 - Score downloaded

Downloading data for recording f461045b-50bc-4b20-a731-66fbd3a264ae
 - Score downloaded

Downloading data for

## Compute recordings 
The following interface analyses a list of recordings. You can choose whether to run the pitch analysis and whether to convert the mp3 files into wav files. The pitch analysis produces JSON files containing the pitch profile, the pitch distribution and the tonic frequency. The results can also be exported in txt format, so that a recording can be studied with the spectrogram tool of Sonic Visualiser.

#### NB: a new list of recordings can be created using the selection widget

In [None]:
analyzer = ComputationGui(rmbid_list1, cm)
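To make the term "pitch distribution" concrete, the sketch below builds a normalised histogram of a pitch track, expressed in cents relative to a reference frequency. It is an illustration under stated assumptions, not the algorithm used by the notebook's utilities: the reference (for example the tonic) is taken as given rather than estimated, frames with a value of 0 Hz are assumed to be unvoiced, and the function name and the 10-cent bin width are choices made here.

```python
import math
from collections import Counter


def pitch_distribution(pitch_hz, ref_hz, bin_cents=10.0):
    """Return {bin start in cents: share of voiced frames} for a pitch track.

    pitch_hz: pitch values in Hz, one per frame (0 is assumed to mean unvoiced).
    ref_hz: reference frequency in Hz, e.g. a previously estimated tonic.
    """
    voiced = [f for f in pitch_hz if f > 0]
    if not voiced:
        raise ValueError("the pitch track contains no voiced frames")

    counts = Counter()
    for f in voiced:
        # Distance from the reference in cents (1200 cents = one octave).
        cents = 1200.0 * math.log2(f / ref_hz)
        # Lower edge of the bin this frame falls into.
        bin_start = math.floor(cents / bin_cents) * bin_cents
        counts[bin_start] += 1

    total = len(voiced)
    return {b: c / total for b, c in sorted(counts.items())}


# Toy pitch track: three frames at the reference, one an octave above.
distribution = pitch_distribution([0, 220, 220, 220, 440, 0], ref_hz=220)
print(distribution)
```

Expressing the histogram in cents relative to the reference makes distributions of recordings with different tonics directly comparable.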

## Test