# Baseline CAD2 Task1

## Lyrics Intelligibility



Anecdotal mishearings of lyrics are very common, and many examples can be found on the Internet, from websites dedicated to misheard lyrics to stand-up comedy routines that exploit them for humorous effect.

However, mishearing lyrics is an important problem for people with hearing loss {cite}`greasley2020music`.

In this second round of Cadenza Challenges, we are presenting a challenge where entrants need to process a pop/rock music signal to increase the intelligibility of the lyrics with the least possible loss of audio quality.

For a full description of the challenge, see the [Cadenza website](https://cadenzachallenge.org/docs/cadenza2/intro).

This tutorial walks through the process of running the lyrics intelligibility baseline using the shell interface.

## Create the environment

We first need to install the Clarity package. For this, we use the tagged release for the challenge, **v0.6.0**, which is the tag cloned in the cell below.

### Setting the Location of the Project

For convenience, we are setting an environment variable with the location of the root working directory of the project. This variable will be used in various places throughout the tutorial. Please change this value to reflect where you have installed this notebook on your system.

In [None]:
import os
os.environ["NBOOKROOT"] = os.getcwd()
os.getenv("NBOOKROOT")

In [1]:
from IPython.display import clear_output

import os
import sys

print("Cloning git repo...")
!git clone --depth 1 --branch v0.6.0 https://github.com/claritychallenge/clarity.git


Repository installed


In [None]:
%cd clarity
%pip install -e .

sys.path.append(f'{os.getenv("NBOOKROOT")}/clarity')

clear_output()
print("Repository installed")

## Get the demo data

We will be using music audio and listener metadata.

In [None]:
import gdown

!gdown 10SfuZR7yVlVO6RwNUc3kPeJHGiwpN3VS
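!mv cadenza_data_demo.tar.xz recipes/cad1/task1/baseline
# The archive was moved above, so extract it from (and into) the baseline directory
!tar -xvf recipes/cad1/task1/baseline/cadenza_data_demo.tar.xz -C recipes/cad1/task1/baseline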

clear_output()
print("Data installed")

Data installed
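
Before moving on, it can be worth checking that the extracted data has the layout the recipe expects. The sketch below is an optional sanity check, not part of the baseline: the helper `missing_dirs` is hypothetical, the two sub-directories are taken from the `config.yaml` inspected later in this tutorial, and the dataset root shown is an assumption that should be adjusted to wherever the archive was extracted.

```python
from pathlib import Path


def missing_dirs(root, expected=("task1/metadata", "task1/audio/musdb18hq")):
    """Return the expected sub-directories that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_dir()]


# Assumed location of the demo data relative to the clarity repository root;
# change this if the archive was extracted somewhere else.
demo_root = Path("recipes/cad1/task1/baseline/cadenza_data_demo/cad1")

missing = missing_dirs(demo_root)
print("Missing directories:", missing if missing else "none")
```

An empty list means both directories were found under the chosen root.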


## Changing the Working Directory

Next, we change the working directory to the location of the shell scripts we wish to run.

In [None]:
%cd {os.environ['NBOOKROOT']}/clarity/recipes/cad1/task1/baseline
%pwd

/content/clarity/recipes/cad1/task1/baseline


'/content/clarity/recipes/cad1/task1/baseline'

## Inspecting Existing Configuration

All of the included shell scripts take their configurable variables from the YAML files in the same directory as the script. Typically these are named <code>config.yaml</code>; however, other names may be used if more than one shell script is in a directory.

We can inspect the contents of the config file using <code>!cat</code>:

In [None]:
!cat config.yaml

path:
  root: ./
  metadata_dir: ${path.root}/task1/metadata
  music_dir: ${path.root}/task1/audio/musdb18hq
  music_train_file: ${path.metadata_dir}/musdb18.train.json
  music_valid_file: ${path.metadata_dir}/musdb18.valid.json
  listeners_train_file: ${path.metadata_dir}/listeners.train.json
  listeners_valid_file: ${path.metadata_dir}/listeners.valid.json
  exp_folder: ./exp # folder to store enhanced signals and final results

nalr:
  nfir: 220
  fs: 44100

apply_compressor: False
compressor:
  threshold: 0.35
  attenuation: 0.1
  attack: 50
  release: 1000
  rms_buffer_size: 0.064

soft_clip: True

separator:
  device: ~

evaluate:
  set_random_seed: True
  small_test: False
  batch_size: 5  # Number of batches
  batch: 0       # Batch number to evaluate

# hydra config
hydra:
  run:
    dir: ${path.exp_folder}

The general organisation of the config files is hierarchical, with property labels depending on the script in question. The config file for the enhance and evaluate recipes contains configurable parameters for both scripts. These include:
- Paths for the locations of audio files, metadata and the export location for generated files
- Parameters for the NAL-R fitting
- Parameters for the automatic gain control (AGC) compressor used in the baseline enhancer
- Parameters for the challenge evaluator
- Parameters necessary for Hydra to run
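
The `evaluate.batch_size` and `evaluate.batch` entries suggest that the evaluation set is divided into a number of batches, with one batch scored per run. The sketch below illustrates that idea in plain Python. The function `select_batch`, the splitting rule and the example items are assumptions made for illustration; they are not taken from the recipe's code.

```python
def select_batch(items, n_batches, batch):
    """Return the batch-th of n_batches near-equal contiguous chunks of items."""
    if not 0 <= batch < n_batches:
        raise ValueError("batch must satisfy 0 <= batch < n_batches")
    base, extra = divmod(len(items), n_batches)
    # The first `extra` chunks each receive one additional item.
    start = batch * base + min(batch, extra)
    end = start + base + (1 if batch < extra else 0)
    return items[start:end]


# Hypothetical song/listener pairs, split into 5 batches as in the config.
pairs = [f"song{i:02d}_listener{i:02d}" for i in range(12)]
chunks = [select_batch(pairs, n_batches=5, batch=b) for b in range(5)]
print([len(chunk) for chunk in chunks])  # [3, 3, 2, 2, 2]
```

Under this assumption, running the evaluation once for each value of `evaluate.batch` from 0 to 4 would cover the whole set.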

The <code>path.root</code> parameter defaults to the root of the baseline and must be overridden with the dataset root path when the Python script is called from the command line.

For example:

```
user:~$ python mypythonscript.py path.root='/path/to/project'
```

In this notebook we will use the environment variable <code>$NBOOKROOT</code> which we defined at the start of the tutorial.

Note that the <code>path.root</code> argument string has no trailing slash. If you inspect a variable such as <code>path.metadata_dir</code>, you will see that the separating slash is already included in its definition:

```
path:
  root: ./
  metadata_dir: ${path.root}/task1/metadata

```
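
To see what a trailing slash would do, the sketch below mimics the `${...}` substitution with a few lines of standard-library Python. It is a simplified stand-in for illustration only, not how Hydra resolves values internally, and the `resolve` helper is hypothetical.

```python
import re


def resolve(template, values):
    """Replace each ${key} in template with values[key] (single pass, no nesting)."""
    return re.sub(r"\$\{([^}]+)\}", lambda match: values[match.group(1)], template)


# Definition of path.metadata_dir as it appears in config.yaml
template = "${path.root}/task1/metadata"

without_slash = resolve(template, {"path.root": "/path/to/project"})
with_slash = resolve(template, {"path.root": "/path/to/project/"})

print(without_slash)  # /path/to/project/task1/metadata
print(with_slash)     # /path/to/project//task1/metadata
```

A doubled slash is usually harmless on POSIX systems, but leaving the trailing slash off keeps the resolved paths tidy.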

Parameters are overridden on the command line using dot-separated keys that follow the hierarchy of the config file. For the following entry in a <code>config.yaml</code> file:
```
A:
  B:
    parameter_0: some_value
    parameter_1: some_other_value
```
The CLI syntax to override those values would be:

```
user:~$ python myscript.py A.B.parameter_0="new_value" A.B.parameter_1="another_new_value"
```
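
The effect of such dot-indexed overrides on the nested configuration can be sketched in plain Python. This is only an illustration of the idea: `apply_overrides` is a hypothetical helper, it keeps every value as a string, and Hydra's own override parser handles types, quoting and much more.

```python
def apply_overrides(config, overrides):
    """Apply 'a.b.c=value' strings to a nested dict in place and return it."""
    for item in overrides:
        dotted_key, _, value = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = config
        for name in parents:
            # Walk down the hierarchy, creating levels that do not exist yet.
            node = node.setdefault(name, {})
        node[leaf] = value
    return config


config = {"A": {"B": {"parameter_0": "some_value", "parameter_1": "some_other_value"}}}
apply_overrides(
    config,
    ["A.B.parameter_0=new_value", "A.B.parameter_1=another_new_value"],
)
print(config)
```

Each dot in the key steps one level down the hierarchy, which is why `A.B.parameter_0` reaches the entry nested under `A` and then `B`.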

## Shell Scripts

Typically, as stated above, all the work is done within Python, with configurable variables supplied by a <code>yaml</code> file that Hydra parses inside the Python code.

The code is executed from the command line, and new values for configuration variables are supplied as arguments to override the defaults.

---
### Additional steps for Colab Notebooks
This version of the tutorial is designed to run on Google Colab. In this environment, the editable installation of the clarity repository is by default not visible to the standard Python interpreter, even though the installation cell above makes the clarity tools visible to the IPython interpreter.

As such, we need to make sure that the standard Python interpreter called by the shell magic below has the location of the clarity packages in its <code>PYTHONPATH</code> variable.

For local environments, this step may not be necessary.

In [None]:
%env PYTHONPATH=$PYTHONPATH:/content/clarity

env: PYTHONPATH=$PYTHONPATH:/content/clarity
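
The output above shows that the literal text `$PYTHONPATH` was stored rather than the variable's previous value. An alternative, sketched here as an assumption rather than the tutorial's method, is to build the value in Python and assign it to `os.environ`, which processes started from the notebook inherit. The helper `prepend_to_pythonpath` is hypothetical.

```python
import os


def prepend_to_pythonpath(new_dir, env):
    """Return env's PYTHONPATH with new_dir placed first, dropping empty entries."""
    existing = [p for p in env.get("PYTHONPATH", "").split(os.pathsep) if p]
    if new_dir not in existing:
        existing.insert(0, new_dir)
    return os.pathsep.join(existing)


# /content/clarity is the Colab location used in this tutorial; adjust locally.
os.environ["PYTHONPATH"] = prepend_to_pythonpath("/content/clarity", os.environ)
print(os.environ["PYTHONPATH"])
```

Using `os.pathsep` keeps the sketch portable, since the separator is `:` on POSIX systems and `;` on Windows.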


---
We are now ready to run the prepared Python script. However, the standard configuration is designed to work with the full clarity dataset, so we redirect the script to the demo data folders by overriding the appropriate configuration parameter.

In [None]:
%%shell
python enhance.py \
path.root=../cadenza_data_demo/cad1

[2023-03-03 11:30:24,122][torchaudio.utils.download][INFO] - The local file (/root/.cache/torch/hub/torchaudio/models/hdemucs_high_musdbhq_only.pt) exists. Skipping the download.
[2023-03-03 11:30:24,614][__main__][INFO] - [001/001)] Processing Actions - One Minute Smile for L5076...


Now we have the enhanced output. Below, we can load and play the audio to listen to examples of the results.

In [None]:
from pathlib import Path

import IPython.display as ipd
from scipy.io import wavfile

audio_path = Path("exp/enhanced_signals")
# The enhanced .wav files sit two directory levels below audio_path
audio_files = sorted(audio_path.glob("*/*/*.wav"))

for file_to_play in audio_files:
    sample_rate, signal = wavfile.read(file_to_play)
    # Keep only the 30 seconds between 0:30 and 1:00.
    # Samples are on the first axis for both mono and multichannel files,
    # so the same slice works in either case.
    signal = signal[30 * sample_rate : 60 * sample_rate]
    print(file_to_play.name)
    # ipd.Audio expects (channels, samples), hence the transpose
    ipd.display(ipd.Audio(signal.T, rate=sample_rate))

Now that we have the enhanced audio, we can use the evaluate recipe to generate HAAQI scores for the signals. The evaluation is run in the same way as the enhancement script.

In [None]:
%%shell
python evaluate.py \
path.root=../cadenza_data_demo/cad1

We hope that this tutorial has been useful and has explained how to run the recipe scripts with the Hydra configuration system. The same approach can be applied to all of the recipes included in the repository.