Merge pull request #367 from claritychallenge/jpb/cec3-baseline

Jpb/cec3 baseline
claritychallenge · Apr 2, 2024 · 66ed680 · 66ed680
2 parents ec763ff + 87fd89a
commit 66ed680
Showing 10 changed files with 600 additions and 7 deletions.
diff --git a/README.md b/README.md
@@ -18,6 +18,7 @@
 [![pre-commit.ci status](https://results.pre-commit.ci/badge/github/claritychallenge/clarity/main.svg)](https://results.pre-commit.ci/latest/github/claritychallenge/clarity/main)
 [![Downloads](https://pepy.tech/badge/pyclarity)](https://pepy.tech/project/pyclarity)
 
+[![PyPI](https://img.shields.io/static/v1?label=CEC3%20Challenge%20-%20pypi&message=v0.5.0&color=orange)](https://pypi.org/project/pyclarity/0.5.0/)
 [![PyPI](https://img.shields.io/static/v1?label=ICASSP%202024%20Cadenza%20Challenge%20-%20pypi&message=v0.4.1&color=orange)](https://pypi.org/project/pyclarity/0.4.1/)
 [![PyPI](https://img.shields.io/static/v1?label=CAD1%20and%20CPC2%20Challenges%20-%20pypi&message=v0.3.4&color=orange)](https://pypi.org/project/pyclarity/0.3.4/)
 [![PyPI](https://img.shields.io/static/v1?label=ICASSP%202023%20Challenge%20-%20pypi&message=v0.2.1&color=orange)](https://pypi.org/project/pyclarity/0.2.1/)
@@ -36,15 +37,15 @@ In this repository, you will find code to support all Clarity and Cadenza Challe
 
 ## Current Events
 
-- The ICASSP 2024 Cadenza Challenge (CAD_ICASSP_2024) is now open. :fire::fire:
+- The 3rd Clarity Enhancement Challenge is now open. :fire::fire:
+  - Visit the [challenge website](https://claritychallenge.org/docs/cec3/cec3_intro) for more details.
+  - Join the [Clarity Challenge Group](https://groups.google.com/g/clarity-challenge) to keep up-to-date on developments.
+- The ICASSP 2024 Cadenza Challenge (CAD_ICASSP_2024) will be presented at ICASSP 2024.
   - Join the [Cadenza Challenge Group](https://groups.google.com/g/cadenza-challenge) to keep up-to-date on developments.
   - Visit the Cadenenza Challenge [website](https://cadenzachallenge.org/) for more details.
 - The first Cadenza Challenge (CAD1) is closed.
   - Subjective Evaluation is underway. :new:
-- The 2nd Clarity Prediction Challenge (CPC2) is now open.   :fire::fire:
-  - Join the [Clarity Challenge Group](https://groups.google.com/g/clarity-challenge) to keep up-to-date on developments.
-  - Visit the Clarity Challenge [website](https://claritychallenge.org/) for more details.
-  - Evaluation tools and a baseline system will be available shortly.
+- The 2nd Clarity Prediction Challenge (CPC2) is now closed.
 - The 4th Clarity Workshop will be held as a satellite event of Interspeech 2023. For details visit the [workshop website](https://claritychallenge.org/clarity2023-workshop/).
 
 ## Installation
@@ -89,13 +90,14 @@ pip install -e git+https://github.com/claritychallenge/clarity.git@main
 
 Current challenge
 
-- [The ICASSP 2024 Cadenza Challenge](./recipes/cad_icassp_2024)
+- [The 3rd Clarity Enhancement Challenge](./recipes/cec3)
 
 Previous challenges
 
+- [The ICASSP 2024 Cadenza Challenge](./recipes/cad_icassp_2024)
 - [The 1st Cadenza Challenge (CAD1)](./recipes/cad1)
 - [The 2nd Clarity Prediction Challenge (CPC2)](./recipes/cpc2)
-- [The ICASSP 2023 Enhancement Challenge](./recipes/icassp_2023)
+- [The ICASSP 2023 Clarity Enhancement Challenge](./recipes/icassp_2023)
 - [The 2nd Clarity Enhancement Challenge (CEC2)](./recipes/cec2)
 - [The 1st Clarity Prediction Challenge (CPC1)](./recipes/cpc1)
 - [The 1st Clarity Enhancement Challenge (CEC1)](./recipes/cec1)

diff --git a/recipes/cec3/README.md b/recipes/cec3/README.md
@@ -0,0 +1,205 @@
+# The 3rd Clarity Enhancement Challenge (CEC3)
+
+Clarity challenge code for the 3rd Clarity Enhancement Challenge.
+
+For more information please visit the [challenge website](https://claritychallenge.org/docs/cec3/cec3_intro).
+
+Clarity tutorials are [now available](https://claritychallenge.github.io/clarity_CC_doc/tutorials). The tutorials introduce the Clarity installation, how to interact with Clarity metadata, and also provide examples of baseline systems and evaluation tools.
+
+## Data structure
+
+The 3rd Clarity Enhancement Challenge consists of three separate tasks each with its own training and evaluation data. Details for how to obtain the data can be found on the [challenge website](https://claritychallenge.org/docs/cec3/cec3_data).
+
+The data is distributed as one or more separate packages in `tar.gz` format.
+
+Unpack all packages under the same root directory using
+
+```bash
+tar -xvzf <PACKAGE_NAME>
+```
+
+The initially released data is in the package `clarity_CEC3_data.v1_0.tar.gz` and has the following structure:
+
+```text
+clarity_CEC3_data
+|── manifest
+|── task1
+|   |── clarity_data
+|   |   |── dev
+|   |   |   |── scenes
+|   |   |   └── speaker_adapt
+|   |   |── metadata
+|   |   └── train
+|   └── hrir
+|       └── HRIRs_MAT
+|── task2
+|   └── clarity_data
+|       |── dev
+|       |   |── interferers
+|       |   |── scenes
+|       |   |── speaker_adapt
+|       |   └── targets
+|       |── metadata
+|       └── train
+|           |── interferers
+|           |── scenes
+|           └── targets
+└── task3
+
+
+```
+
+## Baseline
+
+In the `baseline/` folder, we provide code for running the baseline enhancement system and performing the objective evaluation. The same system can be used for all three tasks by setting the configuration appropriately.
+
+The scripts are controlled by three variables.
+
+- `task` - The task to evaluate. This can be `task1`, `task2` or `task3`.
+- `path.root` - The root directory where your Clarity data is stored.
+- `path.exp` - A directory that will be used to store intermediate files and the final evaluation results.
+
+These can be set in the `config.yaml` file or provided on the command line. In the examples below they are set on the command line.
+
+### Enhancement
+
+The baseline enhancement simply takes the 6-channel hearing aid inputs and reduces this to a stereo hearing aid output by passing through the 'front' microphone signal of the left and right ear.
+
+To run the enhancement, provide the variables on the command line, e.g.,
+
+```bash
+python enhance.py task=task1 path.root=/Users/jon/clarity_CEC3_data path.exp=/Users/jon/exp
+```
+
+Here, `/Users/jon` should be replaced with the paths to your Clarity data root and your experiment folder.
+
+The folder `enhanced_signals` will appear in the `exp` folder. Note, the experiment folder will be created if it does not already exist.
+
+### Evaluation
+
+The `evaluate.py` script first passes the signals through a provided hearing aid amplification stage, using a NAL-R [[1](#references)] fitting and a simple automatic gain compressor. The amplification is determined by the audiograms of the listeners in the scene-listener pairs defined in `clarity_data/metadata/scenes_listeners.dev.json` for the development set. After amplification, the script calculates the better-ear HASPI [[2](#references)].
+
+```bash
+python evaluate.py
+```
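The better-ear score described above can be illustrated with a minimal sketch. It assumes that "better-ear" means taking the higher of the left-ear and right-ear HASPI values for each scene-listener pair. The function name and the example scores are hypothetical and are not taken from `evaluate.py`.

```python
from statistics import mean


def better_ear(left_score: float, right_score: float) -> float:
    """Return the higher of the two per-ear scores (assumed definition)."""
    return max(left_score, right_score)


# Hypothetical (left, right) HASPI values for three scene-listener pairs
per_ear_scores = [(0.25, 0.5), (0.75, 0.125), (0.5, 0.5)]

# One better-ear score per pair, then the mean over all pairs
pair_scores = [better_ear(left, right) for left, right in per_ear_scores]
mean_score = mean(pair_scores)
```

Only the better of the two ears contributes to each pair's score, so a system is not penalised for a poor result at one ear if the other ear is intelligible.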
+
+The full evaluation set is 7500 scene-listener pairs and will take a long time to run, i.e., around 8 hours on a MacBook Pro. A standard small set which uses 1/15 of the data has been defined. This takes around 30 minutes to evaluate and can be run with,
+
+```bash
+python evaluate.py task=task1 path.root=/Users/jon/clarity_CEC3_data path.exp=/Users/jon/exp evaluate.small_test=True
+```
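To make the size of the small set concrete, here is a sketch of one way a 1/15 subset could be drawn. It assumes the subset keeps every 15th scene-listener pair; the real rule is implemented in `make_scene_listener_list` and may differ. The scene and listener identifiers are made up for illustration.

```python
def select_pairs(pairs, small_test=False, step=15):
    """Return all pairs, or every `step`-th pair when small_test is set.

    The every-15th rule is an assumption made for this sketch.
    """
    return pairs[::step] if small_test else pairs


# Hypothetical list of 7500 (scene, listener) pairs
all_pairs = [(f"S{i:05d}", f"L{i % 3:04d}") for i in range(7500)]

full_set = select_pairs(all_pairs)
small_set = select_pairs(all_pairs, small_test=True)
n_small = len(small_set)  # 7500 / 15 pairs
```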
+
+Alternatively, see the section below, 'Running with multiple threads', for how to run with multiple threads or on an HPC system.
+
+The evaluation script will generate a CSV file containing the HASPI scores for each sample. This can be found in `<path.exp>/scores`.
+
+### Reporting results
+
+Once the evaluation script has finished running, the final result can be reported with
+
+```bash
+python report_score.py task=task1 path.root=/Users/jon/clarity_CEC3_data path.exp=/Users/jon/exp
+```
+
+Or if you have run the small evaluation
+
+```bash
+python report_score.py task=task1 path.root=/Users/jon/clarity_CEC3_data path.exp=/Users/jon/exp evaluate.small_test=True
+```
+
+The scores for Task 1 and Task 2 should be as follows.
+
+Task 1
+
+```text
+Evaluation set size: 7500
+Mean HASPI score: 0.22178678134846783
+
+                 SNR     haspi
+SNR
+(-12, -9] -10.498088  0.052545
+(-9, -6]   -7.541468  0.080589
+(-6, -3]   -4.477046  0.143096
+(-3, 0]    -1.432494  0.239527
+(0, 3]      1.470118  0.352110
+(3, 6]      4.492380  0.477001
+```
+
+Task 2
+
+```text
+Evaluation set size: 7500
+Mean HASPI score: 0.18643217215546573
+
+                 SNR     haspi
+SNR
+(-12, -9] -10.545927  0.034330
+(-9, -6]   -7.552687  0.055647
+(-6, -3]   -4.538335  0.096237
+(-3, 0]    -1.455963  0.178413
+(0, 3]      1.434074  0.296364
+(3, 6]      4.507484  0.432177
+```
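The per-SNR breakdown in the tables above groups scores into 3 dB wide, right-closed bins such as `(-12, -9]`. The following is a rough sketch of that grouping using only the standard library. It is not the code in `report_score.py`, and the function names and example rows are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def snr_bin(snr, low=-12, high=6, width=3):
    """Return the right-closed (lo, hi] bin containing snr, or None."""
    for lo in range(low, high, width):
        if lo < snr <= lo + width:
            return (lo, lo + width)
    return None  # outside the (-12, 6] range


def mean_haspi_by_bin(rows):
    """Average the HASPI scores of (snr, haspi) rows within each SNR bin."""
    grouped = defaultdict(list)
    for snr, haspi in rows:
        bin_edges = snr_bin(snr)
        if bin_edges is not None:
            grouped[bin_edges].append(haspi)
    return {edges: mean(scores) for edges, scores in sorted(grouped.items())}


# Hypothetical (SNR in dB, HASPI) rows
rows = [(-10.5, 0.25), (-10.0, 0.75), (1.0, 0.25), (2.0, 0.5), (4.5, 0.5)]
table = mean_haspi_by_bin(rows)
```

Because the bins are closed on the right, an SNR of exactly -9 dB falls in `(-12, -9]` rather than `(-9, -6]`.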
+
+## Tips
+
+### Configuring with Hydra
+
+The code uses [Hydra](https://hydra.cc) for configuration management. The configuration file is `config.yaml` in the `baseline` folder. The `task`, `path.root` and `path.exp` variables can be set in this file to avoid having to set them on every command line. Simply replace the `???` entries with the appropriate values.
+
+You can make alternative configurations and store them in separate `yaml` files. These can then be used to override the default configuration, e.g.,
+
+```bash
+python enhance.py --config-name my_task1_config.yaml
+```
+
+You can get help on any of the commands with
+
+```bash
+python enhance.py --help
+```
+
+And specific help on Hydra usage with
+
+```bash
+python enhance.py --hydra-help
+```
+
+### Running with multiple threads
+
+The `evaluate.py` script can be sped up by running with multiple processes, i.e., each process evaluates a separate block of scenes and generates its own CSV file. The `report_score.py` script then combines these CSV files to produce a single result.
+
+To do this we can use the Hydra `--multirun` flag and set multiple values for `evaluate.first_scene`. For example, to run 4 parallel jobs we can split the 7500 scenes into 4 blocks of 1875 scenes each and run with,
+
+```bash
+python evaluate.py evaluate.first_scene="0,1875,3750,5625" evaluate.n_scenes=1875 --multirun
+```
+
+Hydra has a Python-like syntax for specifying ranges, so the above command is equivalent to
+
+```bash
+python evaluate.py evaluate.first_scene="range(0,7500,1875)" evaluate.n_scenes=1875 --multirun
+```
+
+If we wanted to split into jobs with just 100 scenes per job we could use
+
+```bash
+python evaluate.py evaluate.first_scene="range(0,7500,100)" evaluate.n_scenes=100 --multirun
+```
+
+Hydra will launch these jobs using the configuration found in `hydra/launcher/cec3_submitit_local.yaml`.
+
+The same approach can be used to run jobs on a SLURM cluster using configuration in `hydra/launcher/cec3_submitit_slurm.yaml`.
+
+```bash
+python evaluate.py hydra/launcher=cec3_submitit_slurm evaluate.first_scene="range(0,7500,100)" evaluate.n_scenes=100 --multirun
+```
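The relationship between `evaluate.first_scene` and `evaluate.n_scenes` in these commands can be checked with a short sketch. It assumes that each job evaluates the scenes from `first_scene` up to, but not including, `first_scene + n_scenes`. The helper names are hypothetical and are not part of the recipe.

```python
def block_starts(n_total, n_scenes):
    """First-scene index of each block, as in range(0, n_total, n_scenes)."""
    return list(range(0, n_total, n_scenes))


def first_scene_override(n_total, n_scenes):
    """Build the comma-separated override string for a multirun sweep."""
    starts = block_starts(n_total, n_scenes)
    return 'evaluate.first_scene="' + ",".join(str(s) for s in starts) + '"'


def covered_scenes(n_total, n_scenes):
    """Count the scenes covered by all blocks (assumed half-open blocks)."""
    return sum(
        min(start + n_scenes, n_total) - start
        for start in block_starts(n_total, n_scenes)
    )


four_jobs = first_scene_override(7500, 1875)
n_small_jobs = len(block_starts(7500, 100))
```

For the sweep to cover every scene exactly once, `evaluate.n_scenes` should equal the step used in the range.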
+
+!!!Note In the examples above it is assumed that the `task`, `path.root` and `path.exp` variables are set in the `config.yaml` file.
+
+!!!Note Hydra has plugin support for other job launchers. See the [Hydra documentation for more information](https://hydra.cc/docs/intro/).
+
+## References
+
+- [1] Byrne, Denis, and Harvey Dillon. "The National Acoustic Laboratories' (NAL) new procedure for selecting the gain and frequency response of a hearing aid." Ear and Hearing 7.4 (1986): 257-265.
+- [2] Kates, James M., and Kathryn H. Arehart. "The hearing-aid speech perception index (HASPI)." Speech Communication 65 (2014): 75-93.
diff --git a/recipes/cec3/__init__.py b/recipes/cec3/__init__.py
diff --git a/recipes/cec3/baseline/__init__.py b/recipes/cec3/baseline/__init__.py
diff --git a/recipes/cec3/baseline/config.yaml b/recipes/cec3/baseline/config.yaml
@@ -0,0 +1,40 @@
+task: ??? # This can be set to 'task1', 'task2' or 'task3'
+
+path:
+  root: ??? # root folder for clarity data
+  exp: ??? # folder to store enhanced signals and final results
+  scenes_folder: ${path.root}/${task}/clarity_data/dev/scenes
+  metadata_dir: ${path.root}/${task}/clarity_data/metadata
+  scenes_listeners_file: ${path.metadata_dir}/scenes_listeners.dev.json
+  listeners_file: ${path.metadata_dir}/listeners.json
+  scenes_file: ${path.metadata_dir}/scenes.dev.json
+
+nalr:
+  nfir: 220
+  sample_rate: 48000
+
+compressor:
+  threshold: 0.35
+  attenuation: 0.1
+  attack: 50
+  release: 1000
+  rms_buffer_size: 0.064
+
+soft_clip: True
+
+evaluate:
+  set_random_seed: True
+  small_test: False
+  first_scene: 0
+  n_scenes: 0
+
+# hydra config
+hydra:
+  run:
+    dir: ${path.exp}
+  sweep:
+    dir: ${path.exp}/multirun/${now:%Y-%m-%d}/${now:%H-%M-%S}
+    subdir: ${hydra.job.num}
+
+defaults:
+  - override hydra/launcher: cec3_submitit_local
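The `${path.root}` and `${task}` interpolations in the `path` section expand to concrete locations once the three `???` values are supplied. The sketch below reproduces the resulting paths with `pathlib` for illustration only; it does not use Hydra or OmegaConf, and the function name is hypothetical.

```python
from pathlib import PurePosixPath


def resolve_paths(root, task):
    """Mirror the path interpolations in config.yaml for a given root and task."""
    clarity_data = PurePosixPath(root) / task / "clarity_data"
    metadata_dir = clarity_data / "metadata"
    return {
        "scenes_folder": clarity_data / "dev" / "scenes",
        "metadata_dir": metadata_dir,
        "scenes_listeners_file": metadata_dir / "scenes_listeners.dev.json",
        "listeners_file": metadata_dir / "listeners.json",
        "scenes_file": metadata_dir / "scenes.dev.json",
    }


# Example values taken from the README commands
paths = resolve_paths("/Users/jon/clarity_CEC3_data", "task1")
scenes_folder = str(paths["scenes_folder"])
```

Changing only `task` switches every derived path to the corresponding task folder, which is why the same scripts can serve all three tasks.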
diff --git a/recipes/cec3/baseline/enhance.py b/recipes/cec3/baseline/enhance.py
@@ -0,0 +1,72 @@
+""" Run the dummy enhancement. """
+
+import json
+import logging
+import pathlib
+
+import hydra
+import numpy as np
+from omegaconf import DictConfig
+from scipy.io import wavfile
+from tqdm import tqdm
+
+from clarity.utils.audiogram import Listener
+from recipes.icassp_2023.baseline.evaluate import make_scene_listener_list
+
+logger = logging.getLogger(__name__)
+
+
+@hydra.main(config_path=".", config_name="config")
+def enhance(cfg: DictConfig) -> None:
+    """Run the dummy enhancement."""
+
+    enhanced_folder = pathlib.Path(cfg.path.exp) / "enhanced_signals"
+    enhanced_folder.mkdir(parents=True, exist_ok=True)
+
+    with open(cfg.path.scenes_listeners_file, encoding="utf-8") as fp:
+        scenes_listeners = json.load(fp)
+
+    listener_dict = Listener.load_listener_dict(cfg.path.listeners_file)
+
+    # Make list of all scene listener pairs that will be run
+    scene_listener_pairs = make_scene_listener_list(
+        scenes_listeners, cfg.evaluate.small_test
+    )
+
+    for scene, listener_id in tqdm(scene_listener_pairs):
+        sample_rate, signal_ch1 = wavfile.read(
+            pathlib.Path(cfg.path.scenes_folder) / f"{scene}_mix_CH1.wav"
+        )
+
+        _, signal_ch2 = wavfile.read(
+            pathlib.Path(cfg.path.scenes_folder) / f"{scene}_mix_CH2.wav"
+        )
+
+        _, signal_ch3 = wavfile.read(
+            pathlib.Path(cfg.path.scenes_folder) / f"{scene}_mix_CH3.wav"
+        )
+
+        # Convert to 32-bit floating point scaled between -1 and 1
+        signal_ch1 = (signal_ch1 / 32768.0).astype(np.float32)
+        signal_ch2 = (signal_ch2 / 32768.0).astype(np.float32)
+        signal_ch3 = (signal_ch3 / 32768.0).astype(np.float32)
+
+        signal = (signal_ch1 + signal_ch2 + signal_ch3) / 3
+
+        # pylint: disable=unused-variable
+        listener = listener_dict[listener_id]  # noqa: F841
+
+        # Note: The audiograms are stored in the listener object,
+        # but they are not needed for the baseline
+
+        # Baseline averages the CH1, CH2 and CH3 microphone signals
+        # and writes the result out as the enhanced signal
+
+        wavfile.write(
+            enhanced_folder / f"{scene}_{listener_id}_enhanced.wav", sample_rate, signal
+        )
+
+
+# pylint: disable=no-value-for-parameter
+if __name__ == "__main__":
+    enhance()
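The signal handling in `enhance()` comes down to two steps: scaling 16-bit PCM samples to floats and averaging the three channel signals. Here is a simplified sketch of those steps using plain Python lists and mono samples. The recipe itself operates on NumPy arrays read with `wavfile.read`, and the sample values and helper names here are hypothetical.

```python
def int16_to_float(samples):
    """Scale 16-bit PCM integers to floats in the range [-1.0, 1.0)."""
    return [s / 32768.0 for s in samples]


def average_channels(channels):
    """Average equally long channel signals sample by sample."""
    n_channels = len(channels)
    return [sum(frame) / n_channels for frame in zip(*channels)]


# Hypothetical int16 samples standing in for the CH1, CH2 and CH3 signals
ch1 = int16_to_float([16384, -32768, 0])
ch2 = int16_to_float([16384, 0, 0])
ch3 = int16_to_float([16384, 16384, 0])

mixed = average_channels([ch1, ch2, ch3])
```

The division by 32768 assumes the input files are 16-bit PCM; files stored in another sample format would need a different scale factor.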