(melody-extraction)=
# Pitch extraction

As seen in the melodic introduction, predominant and vocal pitch are highly relevant features for the melodic analysis of Carnatic and Hindustani Music.

In [None]:
## Importing compiam to the project
import compiam

# Import extras and suppress warnings to keep the tutorial clean
from pprint import pprint
import numpy as np
import warnings
warnings.filterwarnings('ignore')

Let's first print the tools available for extracting pitch from Indian Art Music recordings.

In [None]:
pprint(compiam.melody.pitch_extraction.list_tools())

In an Indian Art Music context, this task has mainly been approached through *heuristic-based approaches* {cite}`rao_pitch_2010, salamon_pitch_2012`, which have remained in use in recent years.

Let's extract the pitch from an audio sample using Melodia {cite}`salamon_pitch_2012`. We first need to install `essentia`, which is the optional dependency required to load this tool.

In [None]:
%pip install essentia

In [None]:
# Importing and initializing a melodia instance
from compiam.melody.pitch_extraction import Melodia
melodia = Melodia()  

# Running extraction for an example track
melodia_pitch_track = melodia.extract("../audio/testing_samples/test_1.wav")

print("Shape of the output pitch:", np.shape(melodia_pitch_track))
# Column 0 holds the time-stamps, column 1 holds the pitch values
print("First 5 time-stamps:", melodia_pitch_track[:5, 0])
print("Last 5 pitch values:", melodia_pitch_track[-5:, 1])
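The cell above treats the output as a two-column array, with time-stamps in column 0 and pitch values in column 1. As a minimal sketch of how such an array can be inspected, the example below uses a small synthetic track instead of a real extraction. The names `toy_pitch_track` and `summarize_pitch_track` are hypothetical, and the convention that a pitch of 0 Hz marks an unvoiced frame is an assumption that should be checked against the tool's documentation.

```python
import numpy as np

# Synthetic stand-in for an extracted pitch track (hypothetical values):
# column 0 = time-stamps in seconds, column 1 = pitch in Hz.
# ASSUMPTION: a pitch of 0.0 marks an unvoiced frame.
toy_pitch_track = np.array([
    [0.00, 0.0],
    [0.01, 220.0],
    [0.02, 221.5],
    [0.03, 0.0],
    [0.04, 246.9],
])


def summarize_pitch_track(pitch_track):
    """Return hop size (s), ratio of voiced frames, and median voiced pitch (Hz)."""
    times, freqs = pitch_track[:, 0], pitch_track[:, 1]
    # Median spacing between consecutive time-stamps
    hop = float(np.median(np.diff(times)))
    # Frames with a positive pitch value are considered voiced
    voiced = freqs > 0
    voiced_ratio = float(np.mean(voiced))
    median_f0 = float(np.median(freqs[voiced])) if voiced.any() else float("nan")
    return hop, voiced_ratio, median_f0


hop, voiced_ratio, median_f0 = summarize_pitch_track(toy_pitch_track)
print(f"hop: {hop:.3f} s, voiced: {voiced_ratio:.0%}, median f0: {median_f0:.1f} Hz")
```

If the extracted track follows the same layout, passing it to the same function should give a quick sanity check of the hop size and of how much of the recording was detected as voiced.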

Both the experiments in the original paper and the [MIREX campaign](https://nema.lis.illinois.edu/nema_out/mirex2011/results/ame/indian08/sg1results.html) found that Melodia works reasonably well on Indian Art Music samples. However, recent deep learning (DL) models have claimed the state of the art for the task of pitch extraction.

**Maybe we can use a Carnatic-trained version of one of these models to extract the pitch?** Let's now import a DL model that learns to automatically extract the predominant melody from audio recordings. The documentation indicates that this model is based on `tensorflow`, so we must install this dependency before importing it.

In [None]:
%pip install tensorflow==2.7.2

In [None]:
from compiam.melody.pitch_extraction import FTANetCarnatic

Let's first disable GPU usage, since we assume that in most cases no CUDA-capable GPU is available. We import `tensorflow` and set the visible GPU devices to none.

```{note}
If you have a GPU available to run the model, get its index (probably 0 if you only have a single GPU) and replace ``tf.config.set_visible_devices([], "GPU")`` with ``os.environ["CUDA_VISIBLE_DEVICES"] = "0"``, set before `tensorflow` is imported.
```

We also disable the `tensorflow` warnings in order to keep the tutorial clean.

In [None]:
# Disabling tensorflow warnings and debugging info
import os 
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3" 

# Importing tensorflow and disabling GPU usage
import tensorflow as tf
tf.config.set_visible_devices([], "GPU")

In this case, we are only interested in inference. Therefore, we might be able to load FTANet as an already trained model. For that, let's print the models with available weights to load in `compiam`.

In [None]:
pprint(compiam.list_models())

**Cool! FTANet tuned to Carnatic Music is there.** Therefore, let's load it and run inference on an example track.

In [None]:
# Initializing an FTANet instance
ftanet_carnatic = compiam.load_model("melody:ftanet-carnatic")

# Predict!
ftanet_pitch_track = ftanet_carnatic.predict("../audio/testing_samples/test_1.wav")

Let's visualise the extracted pitch tracks on top of the spectrogram of the input signal.

In [None]:
import librosa
import librosa.display
import numpy as np
import matplotlib.pyplot as plt

y, sr = librosa.load("../audio/testing_samples/test_1.wav")
fig, ax = plt.subplots(nrows=1, ncols=1, sharex=True)
D = librosa.amplitude_to_db(np.abs(librosa.stft(y)), ref=np.max)
img = librosa.display.specshow(D, y_axis='linear', x_axis='time', sr=sr, ax=ax)
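# Plot pitch against its time-stamps (column 0) so the tracks align with the spectrogram
ax.plot(melodia_pitch_track[:, 0], melodia_pitch_track[:, 1], color="white", label="Melodia")
ax.plot(ftanet_pitch_track[:, 0], ftanet_pitch_track[:, 1], color="orange", label="FTANet-Carnatic")
ax.legend()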
plt.show()
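The plot gives a visual impression of how the two pitch tracks differ. As a rough numerical complement, the sketch below measures the frame-wise difference between two tracks in cents. It runs on synthetic data: `cents_difference`, `toy_a`, and `toy_b` are hypothetical names, and the code assumes two-column `[time, pitch in Hz]` arrays with increasing time-stamps, possibly different hop sizes, and a pitch of 0 marking unvoiced frames.

```python
import numpy as np


def cents_difference(track_a, track_b):
    """Per-frame pitch difference (track_b relative to track_a) in cents.

    The result lives on track_a's time grid. Frames that are unvoiced
    (pitch <= 0) in either track are returned as NaN.
    """
    times_a, f0_a = track_a[:, 0], track_a[:, 1]
    times_b = track_b[:, 0]
    # Nearest-neighbour lookup: map each time in A to the closest frame index in B.
    # This avoids interpolating pitch across voiced/unvoiced boundaries.
    idx = np.rint(np.interp(times_a, times_b, np.arange(len(times_b)))).astype(int)
    f0_b = track_b[idx, 1]
    both_voiced = (f0_a > 0) & (f0_b > 0)
    diff = np.full(len(times_a), np.nan)
    # 1200 cents per octave
    diff[both_voiced] = 1200.0 * np.log2(f0_b[both_voiced] / f0_a[both_voiced])
    return diff


# Synthetic tracks with different hop sizes (hypothetical values)
toy_a = np.array([[0.00, 220.0], [0.01, 220.0], [0.02, 0.0], [0.03, 440.0]])
toy_b = np.array([[0.00, 220.0], [0.015, 230.0], [0.03, 440.0]])

diff = cents_difference(toy_a, toy_b)
print("Difference in cents per frame:", diff)
print("Mean absolute difference (voiced in both):", np.nanmean(np.abs(diff)))
```

Under the same assumptions, calling `cents_difference(melodia_pitch_track, ftanet_pitch_track)` would show where the two extractors disagree. Large values near multiples of 1200 cents would typically point to octave errors.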