## 1. Dataset Creation

In [1]:
%load_ext autoreload
%autoreload 2

This notebook is a guide to creating your Carnatic Music Instrument dataset. We will load the dataset using the mirdata API, extract the relevant sections and instruments, apply any necessary processing steps, and store the result in an intuitive and accessible format.

Typical Carnatic Music ensembles contain a wide range of instruments. For this task we are going to focus on:

- Voice
- Violin
- Mridangam

You can refer to the instrumentation section of the [compIAM tutorial](https://mtg.github.io/IAM-tutorial-ismir22/indian_art_music/carnatic-music.html) for more information.

The final dataset will be a collection of short audio clips corresponding to each of these instruments. They will be organised so that each clip can be retrieved according to the instrument it contains, the performer, the raga and a unique identifier (for reproducibility later).

It is up to you to fill in each subsection with the relevant code to perform that task. If possible, try to split the sections amongst the project group so you can work in parallel. When the task is complete, you should try to abstract the code into .py files so that it can be run without a Python notebook.

### Explore Dataset

You can access the Saraga Carnatic dataset using the [mirdata API](https://github.com/mir-dataset-loaders/mirdata). You should already have the dataset downloaded on your machine in the mirdata repository.

In [2]:
import mirdata

In [7]:
data_home = '/Volumes/MyPassport/mir_datasets2/saraga1.5_carnatic/'

In [8]:
saraga = mirdata.initialize('saraga_carnatic', data_home=data_home)
saraga.validate()

RecursionError: maximum recursion depth exceeded

You can choose a random track using `.choice_track()`. This returns a Track object.

In [None]:
example_track = saraga.choice_track()

You can load all tracks and information to a dict using `.load_tracks()`

In [None]:
all_tracks = saraga.load_tracks()

INFO: Metadata file /Volumes/MyPassport/mir_datasets/saraga1.5_carnatic/saraga1.5_carnatic/Sumitra Nitin at Arkay by Sumitra Nitin/Dorakuna/Dorakuna.json not found.
INFO: Metadata file /Volumes/MyPassport/mir_datasets/saraga1.5_carnatic/saraga1.5_carnatic/Sumitra Nitin at Arkay by Sumitra Nitin/Ganamuda Panam/Ganamuda Panam.json not found.
INFO: Metadata file /Volumes/MyPassport/mir_datasets/saraga1.5_carnatic/saraga1.5_carnatic/Sumitra Nitin at Arkay by Sumitra Nitin/Chidambara Natarajam/Chidambara Natarajam.json not found.
INFO: Metadata file /Volumes/MyPassport/mir_datasets/saraga1.5_carnatic/saraga1.5_carnatic/Sumitra Nitin at Arkay by Sumitra Nitin/Vandalum/Vandalum.json not found.
INFO: Metadata file /Volumes/MyPassport/mir_datasets/saraga1.5_carnatic/saraga1.5_carnatic/Sumitra Nitin at Arkay by Sumitra Nitin/Thiruveragane Saveri Varnam/Thiruveragane Saveri Varnam.json not found.
INFO: Metadata file /Volumes/MyPassport/mir_datasets/saraga1.5_carnatic/saraga1.5_carnatic/Sumitra Ni

INFO: Metadata file /Volumes/MyPassport/mir_datasets/saraga1.5_carnatic/saraga1.5_carnatic/Live at Kamarajar Hall by Sanjay Subrahmanyan/Soundararajam/Sanjay Subrahmanyan - Soundararajam.json not found.
INFO: Metadata file /Volumes/MyPassport/mir_datasets/saraga1.5_carnatic/saraga1.5_carnatic/Live at Kamarajar Hall by Sanjay Subrahmanyan/Payum Oli Nee Yenakku/Sanjay Subrahmanyan - Payum Oli Nee Yenakku.json not found.
INFO: Metadata file /Volumes/MyPassport/mir_datasets/saraga1.5_carnatic/saraga1.5_carnatic/Live at Kamarajar Hall by Sanjay Subrahmanyan/Shatre Vilagi Irum/Sanjay Subrahmanyan - Shatre Vilagi Irum.json not found.
INFO: Metadata file /Volumes/MyPassport/mir_datasets/saraga1.5_carnatic/saraga1.5_carnatic/Live at Kamarajar Hall by Sanjay Subrahmanyan/Maname Ramanai Paada/Sanjay Subrahmanyan - Maname Ramanai Paada.json not found.
INFO: Metadata file /Volumes/MyPassport/mir_datasets/saraga1.5_carnatic/saraga1.5_carnatic/Akkarai Sisters at Arkay by Akkarai Sisters/Apparama Bhak

INFO: Metadata file /Volumes/MyPassport/mir_datasets/saraga1.5_carnatic/saraga1.5_carnatic/Kanakadurga Venkatesh at Arkay by Kanakadurga Venkatesh/Vara Leela Gana Lola/Vara Leela Gana Lola.json not found.
INFO: Metadata file /Volumes/MyPassport/mir_datasets/saraga1.5_carnatic/saraga1.5_carnatic/Mahati at Arkay by Mahati/Chinnanchiru Kiliye/Chinnanchiru Kiliye.json not found.
INFO: Metadata file /Volumes/MyPassport/mir_datasets/saraga1.5_carnatic/saraga1.5_carnatic/Mahati at Arkay by Mahati/Gopi Gopala Bala/Gopi Gopala Bala.json not found.
INFO: Metadata file /Volumes/MyPassport/mir_datasets/saraga1.5_carnatic/saraga1.5_carnatic/Mahati at Arkay by Mahati/Pavamana Suttudu Pattu/Pavamana Suttudu Pattu.json not found.
INFO: Metadata file /Volumes/MyPassport/mir_datasets/saraga1.5_carnatic/saraga1.5_carnatic/Mahati at Arkay by Mahati/Kannallavo Swami/Kannallavo Swami.json not found.
INFO: Metadata file /Volumes/MyPassport/mir_datasets/saraga1.5_carnatic/saraga1.5_carnatic/Mahati at Arkay by

INFO: Metadata file /Volumes/MyPassport/mir_datasets/saraga1.5_carnatic/saraga1.5_carnatic/Prema Rangarajan at Arkay by Prema Rangarajan/Thamarai Kangal/Thamarai Kangal.json not found.
INFO: Metadata file /Volumes/MyPassport/mir_datasets/saraga1.5_carnatic/saraga1.5_carnatic/Prema Rangarajan at Arkay by Prema Rangarajan/Gam Ganapate/Gam Ganapate.json not found.
INFO: Metadata file /Volumes/MyPassport/mir_datasets/saraga1.5_carnatic/saraga1.5_carnatic/Prema Rangarajan at Arkay by Prema Rangarajan/Karunajaladhe/Karunajaladhe.json not found.
INFO: Metadata file /Volumes/MyPassport/mir_datasets/saraga1.5_carnatic/saraga1.5_carnatic/Modhumudi Sudhakar at Arkay by Modhumudi Sudhakar/Ghandhamu Poyyaruga/Ghandhamu Poyyaruga.json not found.
INFO: Metadata file /Volumes/MyPassport/mir_datasets/saraga1.5_carnatic/saraga1.5_carnatic/Modhumudi Sudhakar at Arkay by Modhumudi Sudhakar/Koluvaiyunnade/Koluvaiyunnade.json not found.
INFO: Metadata file /Volumes/MyPassport/mir_datasets/saraga1.5_carnatic

INFO: Metadata file /Volumes/MyPassport/mir_datasets/saraga1.5_carnatic/saraga1.5_carnatic/Sanjay at Sastri Hall by Sanjay Subrahmanyan/Va vela va/Sanjay Subrahmanyan - Va vela va.json not found.
INFO: Metadata file /Volumes/MyPassport/mir_datasets/saraga1.5_carnatic/saraga1.5_carnatic/Kuldeep Pai at Arkay by Kuldeep Pai/Gange Maam Pahi/Gange Maam Pahi.json not found.
INFO: Metadata file /Volumes/MyPassport/mir_datasets/saraga1.5_carnatic/saraga1.5_carnatic/Kuldeep Pai at Arkay by Kuldeep Pai/Arul Seya Vendum Ayya/Arul Seya Vendum Ayya.json not found.
INFO: Metadata file /Volumes/MyPassport/mir_datasets/saraga1.5_carnatic/saraga1.5_carnatic/Kuldeep Pai at Arkay by Kuldeep Pai/Kaalaharanamelara/Kaalaharanamelara.json not found.
INFO: Metadata file /Volumes/MyPassport/mir_datasets/saraga1.5_carnatic/saraga1.5_carnatic/Kuldeep Pai at Arkay by Kuldeep Pai/Vasudeva Sutam Devam/Vasudeva Sutam Devam.json not found.
INFO: Metadata file /Volumes/MyPassport/mir_datasets/saraga1.5_carnatic/saraga

INFO: Metadata file /Volumes/MyPassport/mir_datasets/saraga1.5_carnatic/saraga1.5_carnatic/Manda Sudharani at Arkay by Manda Sudharani/Sarasamukhi Sakala/Sarasamukhi Sakala.json not found.
INFO: Metadata file /Volumes/MyPassport/mir_datasets/saraga1.5_carnatic/saraga1.5_carnatic/Srividya Janakiraman at Arkay by Srividya Janakiraman/Parvathi Ninu Ne Nera Nammithi/Parvathi Ninu Ne Nera Nammithi.json not found.
INFO: Metadata file /Volumes/MyPassport/mir_datasets/saraga1.5_carnatic/saraga1.5_carnatic/Srividya Janakiraman at Arkay by Srividya Janakiraman/Nee Sari evvaramma/Nee Sari evvaramma.json not found.
INFO: Metadata file /Volumes/MyPassport/mir_datasets/saraga1.5_carnatic/saraga1.5_carnatic/Srividya Janakiraman at Arkay by Srividya Janakiraman/Jaya Jaya Janaki Kantha/Jaya Jaya Janaki Kantha.json not found.
INFO: Metadata file /Volumes/MyPassport/mir_datasets/saraga1.5_carnatic/saraga1.5_carnatic/Srividya Janakiraman at Arkay by Srividya Janakiraman/Santhamu leka Sowkyamu ledu/Santham

This returns a dict of `unique track identifier` : `track` object for each track.

Track objects contain the filepaths of all audio and metadata files associated with the chosen track, along with some information about the recording itself (such as artist names and instruments). Remember that for many recordings we have several audio files relevant to our task: the final mix plus four individual microphone recordings.


The path of the final mixed performance:

In [None]:
example_track.audio_path

'/Volumes/MyPassport/mir_datasets2/saraga1.5_carnatic/saraga1.5_carnatic/Sumithra Vasudev at Arkay by Sumithra Vasudev/Mati Matiki/Mati Matiki.mp3.mp3'

The path of the vocal microphone:

In [None]:
example_track.audio_vocal_path

'/Volumes/MyPassport/mir_datasets2/saraga1.5_carnatic/saraga1.5_carnatic/Sumithra Vasudev at Arkay by Sumithra Vasudev/Mati Matiki/Mati Matiki.multitrack-vocal.mp3'

The path of the violin microphone:

In [None]:
example_track.audio_violin_path

'/Volumes/MyPassport/mir_datasets2/saraga1.5_carnatic/saraga1.5_carnatic/Sumithra Vasudev at Arkay by Sumithra Vasudev/Mati Matiki/Mati Matiki.multitrack-violin.mp3'

And two mridangam microphones (one for each head):

In [None]:
example_track.audio_mridangam_left_path

'/Volumes/MyPassport/mir_datasets2/saraga1.5_carnatic/saraga1.5_carnatic/Sumithra Vasudev at Arkay by Sumithra Vasudev/Mati Matiki/Mati Matiki.multitrack-mridangam-left.mp3'

In [None]:
example_track.audio_mridangam_right_path

'/Volumes/MyPassport/mir_datasets2/saraga1.5_carnatic/saraga1.5_carnatic/Sumithra Vasudev at Arkay by Sumithra Vasudev/Mati Matiki/Mati Matiki.multitrack-mridangam-right.mp3'

Navigate to these files and listen to the audio. What do you notice? Are the recordings the same intensity? Are there any undesirable artifacts such as leakage or noise?

Take note: the `mirdata` `Track` object will not have an `audio_vocal_path` (or violin or mridangam) attribute if there are no multi-microphone recordings for the given track. Can you use this information to determine how many tracks we have multi-microphone recordings for? (HINT: You can check if an object has a specific attribute using the `hasattr` function: `hasattr(obj, "<attribute_to_check_for>")`.)

In [None]:
# How many tracks with multitrack recordings?
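
One possible way to count these tracks is sketched below. It assumes that a missing microphone recording shows up either as a missing attribute or as an attribute set to `None` (worth verifying on your mirdata version). The helper names `has_multitrack` and `count_multitrack` are illustrative and not part of mirdata.

```python
# Attribute names taken from the Track examples shown above.
STEM_ATTRS = (
    "audio_vocal_path",
    "audio_violin_path",
    "audio_mridangam_left_path",
    "audio_mridangam_right_path",
)


def has_multitrack(track, attrs=STEM_ATTRS):
    """True if every stem attribute exists and is not None or empty."""
    return all(getattr(track, attr, None) for attr in attrs)


def count_multitrack(tracks):
    """Count tracks in a {track_id: Track} dict that have all stems."""
    return sum(1 for track in tracks.values() if has_multitrack(track))
```

With the dict returned by `.load_tracks()`, this could be called as `count_multitrack(all_tracks)`.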

Another important path is the metadata_path:

In [None]:
metadata_path = example_track.metadata_path

Here you will find information relating to the recording, such as artist names, instruments and raga.

Can you create some functions to explore these tracks and metadata? Perhaps it would be useful to know that JSON can be loaded in python using the `json` library:

In [None]:
import json

# metadata_path was defined in the previous cell
with open(metadata_path, 'r') as f:
    loaded_json = json.load(f)

In [None]:
def get_metadata(track_id):
    """
    For <track_id>, return a dataframe of associated metadata
    """
    # code here
    return metadata

def get_performer(track_id):
    """
    For <track_id>, return the performer
    """
    # code here
    return performer

def get_performance(track_id):
    """
    For <track_id>, return the performance name
    """
    # code here
    return performance

def get_raga(track_id):
    """
    For <track_id>, return the raga name
    """
    # code here
    return raga


def get_tonic(track_id):
    """
    For <track_id>, return the tonic in hertz
    """
    # code here
    return tonic
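
As a starting point for these functions, here is a minimal sketch that reads one metadata file and pulls out a few fields. The key names `raaga`, `album_artists` and `title`, and the idea that the first two hold lists of dicts with a `name` entry, are assumptions about the JSON layout; print `loaded_json.keys()` on a real file and adjust accordingly. The helpers `read_metadata`, `first_name` and `summarise_metadata` are hypothetical names.

```python
import json


def read_metadata(metadata_path):
    """Load one metadata JSON file into a dict."""
    with open(metadata_path, "r", encoding="utf-8") as f:
        return json.load(f)


def first_name(entries):
    """Return the 'name' of the first dict in a list, else None."""
    if isinstance(entries, list) and entries and isinstance(entries[0], dict):
        return entries[0].get("name")
    return None


def summarise_metadata(meta):
    """Pull out the fields used in this notebook.

    The keys 'raaga', 'album_artists' and 'title' are ASSUMED;
    check them against a real metadata file before relying on this.
    """
    return {
        "raga": first_name(meta.get("raaga")),
        "performer": first_name(meta.get("album_artists")),
        "performance": meta.get("title"),
    }
```

Returning `None` for missing fields, rather than raising, makes it easier to run over the whole collection and count how many tracks lack a given annotation.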

How many ragas/performers/performances are available? How does that break down across performances for which we have multi-track recordings and those for which we don't?

In [None]:
# get dataset statistics
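
One way to organise these counts is sketched below. It assumes you have already built one record per track as a dict with the keys `raga`, `performer` and `multitrack` (a boolean); that record layout and the function name `dataset_statistics` are hypothetical.

```python
from collections import Counter


def dataset_statistics(records):
    """Count unique ragas and performers, overall and for multitrack records.

    `records` is a list of dicts with keys 'raga', 'performer' and
    'multitrack' (bool). This layout is an assumption for illustration.
    """
    def unique(key, rows):
        return len({row[key] for row in rows if row.get(key) is not None})

    multi = [row for row in records if row.get("multitrack")]
    return {
        "n_tracks": len(records),
        "n_multitrack": len(multi),
        "n_ragas": unique("raga", records),
        "n_ragas_multitrack": unique("raga", multi),
        "n_performers": unique("performer", records),
        "n_performers_multitrack": unique("performer", multi),
        "tracks_per_raga": Counter(row.get("raga") for row in records),
    }
```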

### Load Audio

The mirdata API returns paths to the audio files associated with each track. Can you create some loaders to load audio for a given track id?

**Hint**: The `librosa` library contains functions to load audio from file into an array of amplitude values: `y, sr = librosa.load(audio_path, sr=44100)`. Here `sr` is the sampling rate of the audio, i.e. how many amplitude samples there are per second (typically 44100 Hz). It is important to remember this resolution when converting between the number of elements in the returned array and time in the track.
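
The conversion between array positions and time is plain arithmetic, sketched here with two hypothetical helper functions:

```python
def samples_to_seconds(n_samples, sr=44100):
    """Duration in seconds of n_samples at sampling rate sr."""
    return n_samples / sr


def seconds_to_samples(seconds, sr=44100):
    """Index of the sample closest to a time in seconds."""
    return int(round(seconds * sr))
```

For example, at 44100 Hz an array of 44100 elements lasts one second, and the element at 2.5 seconds sits at index 110250.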

In [None]:
def load_mixed_audio(track_id):
    """
    For <track_id>, return the loaded audio
    """
    # code here
    return audio_array

def load_violin_audio(track_id):
    """
    For <track_id>, return the isolated violin track
    """
    # code here
    return audio_array

def load_voice_audio(track_id):
    """
    For <track_id>, return the isolated voice track
    """
    # code here
    return audio_array

def load_mridangam_audio(track_id):
    """
    For <track_id>, return the isolated mridangam track
    """
    # code here
    return audio_array
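
Since the four loaders differ only in which path they read, one option is a single generic helper that the others call. The sketch below assumes `tracks` is the dict returned by `.load_tracks()`; the name `load_track_audio` and the `load_fn` argument (there so the function can be exercised without audio files) are hypothetical.

```python
def load_track_audio(tracks, track_id, path_attr="audio_path", sr=44100, load_fn=None):
    """Load the audio stored under <path_attr> for <track_id>.

    tracks:    dict of {track_id: Track}
    path_attr: e.g. "audio_path", "audio_violin_path", "audio_vocal_path"
    load_fn:   callable(path, sr=...) -> (array, sr); defaults to librosa.load
    """
    path = getattr(tracks[track_id], path_attr, None)
    if not path:
        raise ValueError(f"Track {track_id!r} has no {path_attr}")
    if load_fn is None:
        import librosa  # imported lazily so the helper itself has no hard dependency
        load_fn = librosa.load
    audio_array, _ = load_fn(path, sr=sr)
    return audio_array
```

`load_violin_audio(track_id)` could then simply return `load_track_audio(all_tracks, track_id, "audio_violin_path")`.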

### Listen to Audio

Let's write some functions to listen and visualise these audio arrays in the notebook.

**Hint**: You should find `IPython.display.Audio` useful for playing audio inline in a Jupyter notebook.

**Hint2**: Using the `matplotlib` library you can plot in two dimensions like so:

```
import matplotlib.pyplot as plt

plt.plot(x, y)
```
More information on enhancing these plots (e.g. with titles, axis labels and gridlines) can be found [here](https://matplotlib.org/stable/gallery/lines_bars_and_markers/simple_plot.html).

In [None]:
def plot_waveform(audio_array):
    """
    Plot waveform for <audio_array> using matplotlib.pyplot
    """
    pass


def play_audio(audio_array):
    """
    Generate audio player for <audio_array> using the IPython library
    """
    pass
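
A possible shape for these two functions is sketched below. It adds an `sr` argument (not in the stubs above) so that the x-axis can be shown in seconds, and the helper `time_axis` is a hypothetical name. `matplotlib` and `IPython` are imported inside the functions, so they are only needed when plotting or playing.

```python
import numpy as np


def time_axis(n_samples, sr=44100):
    """Time in seconds of each sample in an array of length n_samples."""
    return np.arange(n_samples) / sr


def plot_waveform(audio_array, sr=44100):
    """Plot amplitude against time in seconds."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(10, 3))
    ax.plot(time_axis(len(audio_array), sr), audio_array)
    ax.set_xlabel("Time (s)")
    ax.set_ylabel("Amplitude")
    ax.grid(True)
    return ax


def play_audio(audio_array, sr=44100):
    """Return an inline audio player; display it as the last line of a cell."""
    from IPython.display import Audio

    return Audio(audio_array, rate=sr)
```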

Are there any important observations about the mixed or isolated instrument tracks? What is the quality like? Do you hear all of the instruments clearly? Are there any differences between the audio of the individual instrument tracks?

### Processing

Are the isolated vocal tracks sufficiently isolated? Libraries like [`spleeter`](https://github.com/deezer/spleeter) can help separate singing sources from background instruments. Does it help here?

In [None]:
def separate_voice(audio_path, isolated_audio_output_path):
    """
    Apply spleeter source separation to input audio
    """
    pass

How does the quality compare? Does spleeter work effectively? Do we lose any important information?

### Tagging Audio

We want to tag our audios with whether or not a particular instrument is sounding. We can do this by identifying non-silent regions in the isolated tracks and tagging the mixed tracks with the instrument. The `librosa` library contains functionality for identifying silent regions in audio (`librosa.effects.split`).

In [None]:
# Load audios for each instrument track using previously defined functions

In [None]:
# Plot samples of audios (remember the relationship between elements and sampling rate)

Define a function to identify silent regions in an audio array. Look at the documentation for `librosa.effects.split` ([here](https://librosa.org/doc/main/generated/librosa.effects.split.html)).

**Hint** - The `top_db` parameter sets how far below the reference level (in dB) a region must fall to count as silent, so a lower value treats louder regions as "silent". Experiment with this value and compare the results with the audio plots. Do they correspond to what you see and hear?

**Remember** - `librosa.effects.split` returns NON-silent intervals.

In [None]:
def detect_silence(audio_array):
    """
    Return array of 0 and 1 (is silent/is not silent) for input <audio_array>. Returned array should
    be equal in length to input array
    """
    return is_silent
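
Because `librosa.effects.split` returns (start, end) sample intervals rather than a per-sample array, the main work is turning those intervals into a mask. A minimal sketch follows; the default `top_db=40` is an arbitrary starting value to tune by ear, and `intervals_to_mask` is a hypothetical helper name.

```python
import numpy as np


def intervals_to_mask(intervals, n_samples):
    """Array of length n_samples: 1 inside any (start, end) interval, else 0."""
    mask = np.zeros(n_samples, dtype=int)
    for start, end in intervals:
        mask[start:end] = 1
    return mask


def detect_silence(audio_array, top_db=40):
    """Per-sample array, 0 where silent and 1 where not silent."""
    import librosa  # imported lazily; only needed when real audio is processed

    non_silent = librosa.effects.split(audio_array, top_db=top_db)
    return intervals_to_mask(non_silent, len(audio_array))
```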

Do these regions correspond to what you hear when playing the audio with `play_audio` or what you see with `plot_waveform`?

In [None]:
# Combine the two mridangam silence arrays: sounding if either head is sounding
mridangam = [int(x or y) for x, y in zip(y_mridangam_left, y_mridangam_right)]

### Extracting Samples

We should now have all the tools necessary to load and annotate audio. We now want to extract small snippets of audio from the mixed tracks across the dataset and annotate each of these snippets as containing voice, mridangam, violin or none of the above (a single snippet should be able to have more than one tag).

It is important that we have examples for all combinations of tags (violin, voice, mridangam, none). Each sample should be of the same length (what should that length be? think about the two extreme cases of very very short and very long, what problems would arise in each of these cases).

Each sample should have a unique identifier (index). The information relating to their tags should be stored in a metadata DataFrame where you can also find information about the performance.

These should all be saved in individual audio files.

Let us try with just one track to begin with...

1. For a certain track id, load all audio files (mix, violin, etc...)

In [None]:
# mix_array =
# vocal_array =
# violin_array =
# mridangam_left_array =
# mridangam_right_array =

2. Create a silent/non-silent array using `detect_silence()` defined earlier.

      **Remember**: The mridangam has two tracks corresponding to it, you must combine them to identify whether either is sounding

In [None]:
# violin_silence =
# vocal_silence =
# mridangam_silence =

3. Split mixed audio into small chunks using [numpy array indexing](https://numpy.org/doc/stable/user/basics.indexing.html) (the size of these chunks should be informed by the literature)

4. Determine from your silent/non-silent arrays in Step 2 whether the chunk contains each instrument (voice, violin, mridangam)

In [None]:
# Slice silence arrays identically to mix slice and determine yes/no does chunk contain instrument

5. Save each audio with a unique index.

    **Hint**: Audio arrays can be saved to file using the `soundfile` library:
    `sf.write('<filename>.wav', <audio_array>, <sampling rate>)`
    
    **Remember**: Each audio chunk needs to be assigned a unique index so that it can be managed correctly later on. Feel free to use numbers, hashes or UUIDs.

In [None]:
# store audio with soundfile

6. Add row to metadata table containing relevant track information, index, and instrument annotations.

    **Hint** - A `pandas` dataframe is a suitable place to store information relating to track and instrument annotations. You can create one using:

    `import pandas as pd`

    `df = pd.DataFrame(columns=<list of column names>)`
    
    Add new rows using `pd.concat` (`DataFrame.append` was removed in pandas 2.0):
    
    `df = pd.concat([df, pd.DataFrame([<dict of column_name: value>])], ignore_index=True)`
    
    And save using:
    
    `df.to_csv('<path.csv>', index=False)`
    
    **Remember** - This table should include the metadata relating to the track, the unique chunk index and a column indicating whether or not it includes each instrument
    

In [None]:
# metadata dataframe

7. Repeat for many tracks and many chunks. You have now written the code to do this for one track/chunk. Let's combine it and apply it to a large number of tracks/chunks, storing each chunk with a unique index and a row in the metadata dataframe.
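
Steps 3 to 6 could be sketched roughly as follows. Everything here is illustrative: the function names, the `min_fraction=0.5` rule for deciding that a chunk "contains" an instrument, the column names, and the choice of `uuid4` for identifiers are all assumptions to adapt. The `masks` argument is a dict mapping each instrument name to its 0/1 array from Step 2.

```python
import os
import uuid

import numpy as np

INSTRUMENTS = ("voice", "violin", "mridangam")


def chunk_bounds(n_samples, chunk_seconds, sr=44100):
    """(start, end) sample indices of consecutive, full-length chunks."""
    chunk_len = int(chunk_seconds * sr)
    if chunk_len <= 0:
        raise ValueError("chunk_seconds * sr must be at least one sample")
    return [(s, s + chunk_len)
            for s in range(0, n_samples - chunk_len + 1, chunk_len)]


def combine_masks(left, right):
    """1 where either mridangam head is sounding."""
    return np.logical_or(left, right).astype(int)


def contains_instrument(mask, start, end, min_fraction=0.5):
    """True if at least min_fraction of the chunk is non-silent."""
    return bool(np.mean(mask[start:end]) >= min_fraction)


def save_chunk(chunk, out_dir, sr=44100, write_fn=None):
    """Write one chunk to <out_dir>/<uuid>.wav and return (chunk_id, path)."""
    if write_fn is None:
        import soundfile as sf  # imported lazily; only needed when writing files
        write_fn = sf.write
    os.makedirs(out_dir, exist_ok=True)
    chunk_id = uuid.uuid4().hex
    path = os.path.join(out_dir, f"{chunk_id}.wav")
    write_fn(path, chunk, sr)
    return chunk_id, path


def make_row(chunk_id, track_id, start, end, masks, min_fraction=0.5):
    """One metadata row: identifiers plus a 0/1 column per instrument."""
    row = {"index": chunk_id, "track_id": track_id,
           "start_sample": start, "end_sample": end}
    for name in INSTRUMENTS:
        row[name] = int(contains_instrument(masks[name], start, end, min_fraction))
    return row
```

In a loop over `chunk_bounds(len(mix_array), chunk_seconds)`, each chunk `mix_array[start:end]` would be passed to `save_chunk` and its row appended to a plain Python list. Building the DataFrame once at the end with `pd.DataFrame(rows)` avoids growing it row by row.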

### Load Dataset

With our dataset created and saved in an intuitive and accessible format, let's create some loaders to load the files and get metadata.

In [None]:
def load_sample(index):
    """
    Load sample with index, <index>
    """
    return sample

def get_metadata(index):
    """
    Get metadata for sample with index, <index>
    """
    return metadata
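
A minimal sketch of such loaders is given below. It assumes each sample was saved as `<index>.wav` in one directory and that the metadata CSV has an `index` column; both are assumptions about how you chose to store the dataset. The standard-library `csv` module is used to keep the example dependency-free (so every value comes back as a string); `pandas.read_csv` would work just as well. Extra arguments for the dataset location are added, and the metadata function is named `get_sample_metadata` here to avoid clashing with the earlier `get_metadata`.

```python
import csv
import os


def sample_path(index, dataset_dir):
    """Path of the audio file for sample <index>."""
    return os.path.join(dataset_dir, f"{index}.wav")


def load_sample(index, dataset_dir):
    """Return (audio_array, sampling_rate) for sample <index>."""
    import soundfile as sf  # imported lazily; only needed when reading audio

    return sf.read(sample_path(index, dataset_dir))


def get_sample_metadata(index, metadata_csv):
    """Return the metadata row for sample <index> as a dict of strings."""
    with open(metadata_csv, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            if row["index"] == str(index):
                return row
    raise KeyError(index)
```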

Typically, when datasets are presented, they are accompanied by some statistics detailing their size and constituent parts. What statistics can you report about our dataset? Think about: number of seconds, performers, performances, instruments, ragas, file sizes, etc.

In [None]:
# stats

### Reproducible Code

Jupyter notebooks are great for experimenting, especially when visualisation or audio playback is required. However, they are not great for reproducibility or source control. Can you abstract the code created here into .py file(s) so that it can be run in future without having to open the notebook?
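
One common pattern is a script with a small command-line interface, so that every parameter of a run is recorded in the command that launched it. The sketch below uses the standard-library `argparse`; the option names and default values are hypothetical placeholders, not recommended settings.

```python
import argparse


def build_parser():
    """Command-line options for a dataset-creation script (names are illustrative)."""
    parser = argparse.ArgumentParser(description="Create the instrument dataset")
    parser.add_argument("--data-home", required=True,
                        help="directory containing the downloaded dataset")
    parser.add_argument("--output-dir", default="dataset",
                        help="where audio chunks and the metadata CSV are written")
    parser.add_argument("--chunk-seconds", type=float, default=3.0,
                        help="length of each extracted sample (placeholder default)")
    parser.add_argument("--top-db", type=float, default=40.0,
                        help="silence threshold passed to librosa.effects.split")
    return parser


def main(argv=None):
    """Parse arguments; the notebook's functions would be called from here."""
    args = build_parser().parse_args(argv)
    # e.g. initialise mirdata with args.data_home, then loop over tracks
    return args
```

In a real script you would finish the file with the usual `if __name__ == "__main__": main()` guard, so that the functions can also be imported from other modules without running anything.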