# Task1

This tutorial walks through running the CAD1 Task1 baseline using the shell interface.


The Python and shell scripts included in the repository make use of <a href='https://hydra.cc/'>Hydra</a> and <a href='https://hydra.cc/docs/plugins/submitit_launcher/'>Submitit</a>, two technologies that streamline the configuration and parallel operation of Python code in both local and high-performance computing (HPC) environments.

Using Hydra for configuration makes it easy to redirect the existing shell scripts to new audio data and to modify the parameters of the recipe.

## Cloning the Clarity Repository
We first need to install the Clarity package.

In [12]:
!pip install git+https://github.com/groadabike/cadenza_webinar_may2023

Collecting git+https://github.com/groadabike/cadenza_webinar_may2023
  Cloning https://github.com/groadabike/cadenza_webinar_may2023 to /tmp/pip-req-build-n8jn60ms
  Running command git clone --filter=blob:none --quiet https://github.com/groadabike/cadenza_webinar_may2023 /tmp/pip-req-build-n8jn60ms
  Resolved https://github.com/groadabike/cadenza_webinar_may2023 to commit ebdab9277f7fb4d93cf730e7f52d20da1f703ad8
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [6 lines of output]
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/tmp/pip-req-build-n8jn60ms/setup.py", line 3, in <module>
          with open('README.md') as f:
      Fi

## Get the demo data

We will be using music audio and listener metadata.

In [None]:
import gdown
from IPython.display import clear_output

!gdown 10SfuZR7yVlVO6RwNUc3kPeJHGiwpN3VS
!mv cadenza_data_demo.tar.xz recipes/cad1/task1/baseline
!tar -xvf cadenza_data_demo.tar.xz

clear_output()
print("Data installed")

## Changing the Working Directory

Next, we change the working directory to the location of the shell scripts we wish to run.

In [None]:
import os

%cd {os.environ['NBOOKROOT']}/clarity/recipes/cad1/task1/baseline
%pwd

## Inspecting Existing Configuration

All of the included shell scripts take their configurable variables from the YAML files in the same directory as the script. These files are typically named <code>config.yaml</code>, although other names may be used if a directory contains more than one shell script.

We can inspect the contents of the config file using <code>!cat</code>:

In [None]:
!cat config.yaml

The config files are organised hierarchically, with property labels depending on the script in question. The config file for the enhance and evaluate recipes contains configurable parameters for both scripts. These include:
- Paths for the locations of audio files, metadata and the export location for generated files
- Parameters for the NAL-R fitting
- Parameters for the automatic gain control (AGC) compressor used in the baseline enhancer
- Parameters for the challenge evaluator
- Parameters necessary for Hydra to run

The <code>path.root</code> parameter defaults to the root of the baseline and must be overridden with the dataset root path when the Python script is called from the command line.

For example:

```
user:~$ python mypythonscript.py path.root='/path/to/project' 
```

In this notebook we will use the environment variable <code>$NBOOKROOT</code> which we defined at the start of the tutorial.

Note the lack of a trailing slash in the <code>path.root</code> argument string. If you inspect a variable such as <code>path.metadata_dir</code>, you will see that the separating slash is already included in the line.

```
path:
  root: ./
  metadata_dir: ${path.root}/task1/metadata

```
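The effect of a trailing slash can be illustrated with a small stand-in for the `${...}` substitution. This is only a sketch: the `resolve` helper below is hypothetical and is not how Hydra resolves interpolations internally. It simply performs the same textual replacement on the `metadata_dir` template shown above.

```python
import re


def resolve(template: str, values: dict) -> str:
    """Replace each ${key} in the template with values[key] (toy resolver)."""
    return re.sub(r"\$\{([^}]+)\}", lambda m: values[m.group(1)], template)


# Template copied from the metadata_dir entry in the config excerpt
template = "${path.root}/task1/metadata"

without_slash = resolve(template, {"path.root": "/path/to/project"})
with_slash = resolve(template, {"path.root": "/path/to/project/"})

print(without_slash)  # /path/to/project/task1/metadata
print(with_slash)     # /path/to/project//task1/metadata
```

Passing the root without a trailing slash keeps the resulting path free of doubled separators.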

The general form for overriding a parameter in the CLI is dot-indexed: each level of nesting in the config file becomes one dot-separated component of the key. For the following entry in a <code>config.yaml</code> file:
```
A:
  B:
    parameter_0: some_value
    parameter_1: some_other_value
```
The CLI syntax to override those values would be:

```
user:~$ python myscript.py A.B.parameter_0="new_value" A.B.parameter_1="another_new_value"
```
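To make the mapping between dotted keys and the nested config concrete, here is a minimal sketch using plain dictionaries. The `apply_overrides` function is a hypothetical illustration and not part of Hydra. It also keeps every value as a string, which is a simplification made for this example.

```python
import copy


def apply_overrides(config: dict, overrides: list) -> dict:
    """Return a copy of config with 'a.b.c=value' style overrides applied."""
    result = copy.deepcopy(config)
    for item in overrides:
        dotted_key, value = item.split("=", 1)
        *parents, leaf = dotted_key.split(".")
        node = result
        for name in parents:
            node = node[name]  # raises KeyError if the path does not exist
        node[leaf] = value
    return result


config = {"A": {"B": {"parameter_0": "some_value", "parameter_1": "some_other_value"}}}
updated = apply_overrides(
    config,
    ["A.B.parameter_0=new_value", "A.B.parameter_1=another_new_value"],
)
print(updated)
```

Each dot in the key walks one level down the hierarchy, and the final component names the entry that is replaced.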

## Shell Scripts 

As stated above, the work is typically done within Python, with configurable variables supplied by a <code>yaml</code> file that Hydra parses inside the Python code.

This code is executed from the CLI, and new configuration values are supplied as arguments to override the defaults.

---
### Additional steps for Colab Notebooks
This version of the tutorial is designed to run on Google Colab. By default, the editable installation of the Clarity repository is not visible to the standard Python interpreter in this environment, even though the installation cell above makes the Clarity tools visible to the IPython interpreter.

As such, we need to make sure that the standard Python interpreter called by the shell magic below has the location of the Clarity packages in its <code>PYTHONPATH</code> variable.

For local environments, this step may not be necessary.

In [None]:
%env PYTHONPATH=$PYTHONPATH:/content/clarity
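Outside a notebook, the same adjustment can be made from Python when preparing the environment for a child process. The sketch below assumes the repository lives at `/content/clarity`, as in the cell above. The `with_extra_pythonpath` helper is hypothetical and introduced only for illustration.

```python
import os


def with_extra_pythonpath(env: dict, extra: str) -> dict:
    """Return a copy of env with `extra` appended to PYTHONPATH."""
    updated = dict(env)
    current = updated.get("PYTHONPATH", "")
    # os.pathsep is ':' on POSIX systems and ';' on Windows
    updated["PYTHONPATH"] = os.pathsep.join(p for p in (current, extra) if p)
    return updated


env = with_extra_pythonpath({"PYTHONPATH": "/existing"}, "/content/clarity")
print(env["PYTHONPATH"])
```

The returned dictionary can be passed as the `env` argument of `subprocess.run`, so the current process environment is left unchanged.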

---
We are now ready to run the prepared Python script. However, the standard configuration is designed to work with the full dataset. We can redirect the script to the demo data folders by overriding <code>path.root</code>.

In [None]:
%%shell
python enhance.py \
path.root=../cadenza_data_demo/cad1

Now we have the enhanced output. Below, we can load and play the audio to listen to examples of the results.

In [None]:
from pathlib import Path

import IPython.display as ipd
from scipy.io import wavfile

# Collect the enhanced .wav files, which sit two folders below this directory
audio_path = Path("exp/enhanced_signals")
audio_files = list(audio_path.glob("*/*/*.wav"))

for file_to_play in audio_files:
    sample_rate, signal = wavfile.read(file_to_play)
    # Keep only the 30-second excerpt between 0:30 and 1:00.
    # Samples are on the first axis for both mono (n_samples,) and
    # multi-channel (n_samples, n_channels) signals, so one slice covers both.
    signal = signal[30 * sample_rate : 60 * sample_rate]
    print(file_to_play.name)
    # ipd.Audio expects (n_channels, n_samples), hence the transpose
    ipd.display(ipd.Audio(signal.T, rate=sample_rate))
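The 30-second slice in the cell above silently returns a shorter or empty excerpt when a track lasts less than 60 seconds. As a sketch, the hypothetical helper below makes the excerpt bounds explicit so that short tracks can be detected before playback. The 44100 Hz sample rate is an assumption used only for the example values.

```python
def excerpt_bounds(n_samples: int, sample_rate: int,
                   start_s: float = 30.0, duration_s: float = 30.0) -> tuple:
    """Return (start, stop) sample indices, clamped to the signal length."""
    start = min(int(start_s * sample_rate), n_samples)
    stop = min(start + int(duration_s * sample_rate), n_samples)
    return start, stop


print(excerpt_bounds(44100 * 90, 44100))  # (1323000, 2646000): full 30 s excerpt
print(excerpt_bounds(44100 * 40, 44100))  # (1323000, 1764000): only 10 s available
print(excerpt_bounds(44100 * 20, 44100))  # (882000, 882000): empty excerpt
```

When `start == stop`, the track is too short for the requested excerpt and can be skipped or played from the beginning instead.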

Now that we have the enhanced audio, we can use the evaluate recipe to generate HAAQI scores for the signals. The evaluation is run in the same manner as the enhancement script.

In [None]:
%%shell
python evaluate.py \
path.root=../cadenza_data_demo/cad1 

We hope that this tutorial has been useful and has explained how to run the recipe scripts with the Hydra configuration system. The same approach can be applied to all of the recipes included in the repository.