# Tech Arena 2024

The objective of this challenge is to develop a model that generates a set of head-related transfer functions (HRTFs) from a set of pinna pictures of the left and right ears. The model should be able to accept an arbitrary number of pinna images (the same number for each ear), and output an HRTF set with a predefined angle resolution.

For model training, a dataset of pinna images and corresponding HRTFs is provided for a total of 100 human subjects (90 for training and 10 for testing). The pinna images are taken from different viewpoints around each ear.
Pinna pictures are provided as PNG images, while the corresponding HRTFs are provided in SOFA format.

The model will be evaluated on 3 different tasks:
1. Task 0: using 19 pictures (at a 10 degree resolution from 0 to 180 degrees)
2. Task 1: using 7 pictures (at a 30 degree resolution from 0 to 180 degrees)
3. Task 2: using 3 pictures (at 30, 60 and 90 degrees only)
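
The three tasks differ only in the viewing angles that are available per ear. A minimal sketch of that mapping, assuming the angles are expressed in whole degrees; the name `TASK_ANGLES` is hypothetical and is not part of the provided code:

```python
# Hypothetical lookup table (not part of the provided utils):
# viewing angles in degrees that are available for each task.
TASK_ANGLES = {
    0: list(range(0, 181, 10)),  # 10 degree steps from 0 to 180
    1: list(range(0, 181, 30)),  # 30 degree steps from 0 to 180
    2: [30, 60, 90],             # three fixed angles
}

for task_id, angles in TASK_ANGLES.items():
    print(f"Task {task_id}: {len(angles)} pictures per ear at {angles}")
```

The lengths of the three lists match the picture counts listed above (19, 7 and 3).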

## Dataset
The default data source for this challenge is the [SONICOM](https://www.sonicom.eu/) dataset. The pinna pictures are provided as part of this project, but the HRTFs must be downloaded manually from the SONICOM webpage. Additional datasets can also be used for training.

To prepare your data, download all relevant SOFA files from the [SONICOM FTP server](https://transfer.ic.ac.uk:9090/#/2022_SONICOM-HRTF-DATASET/). To do this, use the search functionality of the web interface to find all files ending in `_FreeFieldCompMinPhase_NoITD_48kHz.sofa`. Select all the retrieved files, download and unzip them, and place them in a folder named `SONICOM_HRTF` inside the `data` folder.

_Please note that a previous version of this notebook used a different folder structure. We refer to the old folder structure as `v1` and to the new folder structure as `v2`._

Together with the provided pinna pictures, the `data` folder should have the following structure:
```
/data/SONICOM_HRTF/KEMAR/(...)
                  /P0001/(...)
                  /P0002/(...)
                  (...)
                  /P0244/(...)
     /SONICOM_TestData_pics/
     /SONICOM_TrainingData_pics/
     /Average_HRTFs.sofa
```
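
Before training, it can help to verify that the download landed in the right place. The following is a minimal sketch, not part of the provided code: the helper names `find_hrtf_files` and `missing_entries` are hypothetical, and the recursive search is an assumption, since the layout inside each subject folder is not specified here.

```python
from pathlib import Path

# File name suffix of the SOFA files selected on the SONICOM server.
SOFA_SUFFIX = "_FreeFieldCompMinPhase_NoITD_48kHz.sofa"


def find_hrtf_files(data_root):
    """Return all matching SOFA files below <data_root>/SONICOM_HRTF, sorted by path."""
    hrtf_dir = Path(data_root) / "SONICOM_HRTF"
    if not hrtf_dir.is_dir():
        return []
    # Search recursively because the depth inside each subject folder is unknown.
    return sorted(p for p in hrtf_dir.rglob(f"*{SOFA_SUFFIX}") if p.is_file())


def missing_entries(data_root):
    """Return the expected top-level entries that are absent from data_root."""
    expected = [
        "SONICOM_HRTF",
        "SONICOM_TestData_pics",
        "SONICOM_TrainingData_pics",
        "Average_HRTFs.sofa",
    ]
    return [name for name in expected if not (Path(data_root) / name).exists()]


print("Missing entries:", missing_entries("./data"))
print("SOFA files found:", len(find_hrtf_files("./data")))
```

An empty list of missing entries and a non-zero file count suggest that the `data` folder matches the structure shown above.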

## Delivery format

The delivery should include the codebase in one of the allowed languages (C, C++, Python, or Java), as well as a command line app that can be run on a Windows 64-bit system. The command line interface should be callable like this:

```
>> <my_app> -l IMAGE_PATH [IMAGE_PATH ...] -r IMAGE_PATH [IMAGE_PATH ...] -o SOFA_PATH
```

The required options `-l` and `-r` are followed by one or more file paths of left and right pinna images, respectively. The required option `-o` is followed by a single path for the SOFA HRTF output file.
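
For a Python submission, the interface described above could be parsed with the standard `argparse` module. This is only a sketch and not necessarily how `inference.py` does it: the function names `build_parser` and `parse_args` are hypothetical, and the equal-count check is an assumption derived from the requirement that each ear receives the same number of images.

```python
import argparse


def build_parser():
    """Create a parser for: <my_app> -l IMG [IMG ...] -r IMG [IMG ...] -o SOFA_PATH."""
    parser = argparse.ArgumentParser(description="Predict an HRTF set from pinna images.")
    parser.add_argument("-l", dest="left", nargs="+", required=True,
                        metavar="IMAGE_PATH", help="left pinna image paths")
    parser.add_argument("-r", dest="right", nargs="+", required=True,
                        metavar="IMAGE_PATH", help="right pinna image paths")
    parser.add_argument("-o", dest="output", required=True,
                        metavar="SOFA_PATH", help="path of the SOFA output file")
    return parser


def parse_args(argv=None):
    """Parse argv (defaults to sys.argv[1:]) and enforce equal image counts per ear."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if len(args.left) != len(args.right):
        # parser.error prints the usage message and exits with status 2.
        parser.error("-l and -r must receive the same number of images")
    return args


# Example call with an explicit argument list instead of the real command line.
demo = parse_args(["-l", "left_0.png", "left_1.png",
                   "-r", "right_0.png", "right_1.png",
                   "-o", "prediction.sofa"])
print(demo.left, demo.right, demo.output)
```

With `nargs="+"`, each of `-l` and `-r` collects every path up to the next option, which matches the `IMAGE_PATH [IMAGE_PATH ...]` notation.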

**Please note: Submissions that do not provide this command line app cannot be considered.**

We provide an example inference script in `inference.py` that illustrates the expected command line interface. It can be run as follows:

In [None]:
%pip install sofar numpy imageio torch

In [None]:
!python inference.py -l ./data/SONICOM_TestData_pics/P0002_left_0.png ./data/SONICOM_TestData_pics/P0002_left_1.png -r ./data/SONICOM_TestData_pics/P0002_right_0.png ./data/SONICOM_TestData_pics/P0002_right_1.png -o ./data/output/prediction.sofa
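
The image paths in the command above appear to follow a `<subject>_<ear>_<index>.png` pattern. The sketch below parses such names; the pattern is inferred only from this example command, so treat it as an assumption, and note that `parse_image_name` is a hypothetical helper rather than part of the provided code.

```python
import re
from pathlib import Path

# Assumed naming pattern, inferred from the example command: P0002_left_0.png
IMAGE_NAME = re.compile(r"^(?P<subject>[^_]+)_(?P<ear>left|right)_(?P<index>\d+)\.png$")


def parse_image_name(path):
    """Split an image path into (subject, ear, index); raise ValueError if it does not match."""
    match = IMAGE_NAME.match(Path(path).name)
    if match is None:
        raise ValueError(f"Unexpected image file name: {path}")
    return match["subject"], match["ear"], int(match["index"])


print(parse_image_name("./data/SONICOM_TestData_pics/P0002_left_0.png"))
```

Sorting paths by the returned tuple keeps the images of each ear in a stable order before they are passed to `-l` and `-r`.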

## Training

Participants are free to use any of the allowed programming languages (C, C++, Python, or Java).
For Python, some data access classes are provided to facilitate model training and evaluation. These classes are provided in the `utils.py` module and are briefly explained here.

### Prerequisites

The required dependencies can be installed as follows:

In [None]:
%pip install sofar numpy imageio torch tqdm

### Data access

In [None]:
from utils import SonicomDatabase
from torch.utils.data import DataLoader
import tqdm

Training and test data can be accessed through the `SonicomDatabase` class. The `sonicom_root` parameter should point to the project's `data` folder. To load the training data, set the `training_data` flag to `True`; to load the test data, set it to `False`.

In [None]:
sonicom_root = './data'
sd = SonicomDatabase(sonicom_root, training_data=True, folder_structure='v2')

A PyTorch `DataLoader` can be used to iterate through the data as follows:

In [None]:
train_dataloader = DataLoader(sd, batch_size=1, shuffle=False)
# The DataLoader batches the (images, hrtf) pairs returned by SonicomDatabase.__getitem__.
for images, hrtf in tqdm.tqdm(train_dataloader):
    print(f'Image size: {images.shape} and HRTF size: {hrtf.shape}')

The images are returned as a tensor of shape `(batch size, number of ears, number of images per ear, image height, image width)`. The dimensions of `hrtf` are `(batch size, number of directions, number of ears, spectrum length)`.
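
A small sanity check can make these two layouts explicit. The sketch below works on plain shape tuples (a `torch.Size` behaves like a tuple); the helper name `check_batch_shapes` is hypothetical, and the sizes in the example call are placeholders, not the real dataset dimensions.

```python
def check_batch_shapes(images_shape, hrtf_shape):
    """Validate the documented layouts and return the named dimensions."""
    if len(images_shape) != 5:
        raise ValueError("images must be (batch, ears, images per ear, height, width)")
    if len(hrtf_shape) != 4:
        raise ValueError("hrtf must be (batch, directions, ears, spectrum length)")

    batch, ears, n_images, height, width = images_shape
    hrtf_batch, n_directions, hrtf_ears, spectrum_length = hrtf_shape

    # Both tensors describe the same subjects and the same ears.
    if batch != hrtf_batch:
        raise ValueError("batch sizes of images and hrtf differ")
    if ears != hrtf_ears:
        raise ValueError("number of ears differs between images and hrtf")

    return {
        "batch": batch,
        "ears": ears,
        "images_per_ear": n_images,
        "image_size": (height, width),
        "directions": n_directions,
        "spectrum_length": spectrum_length,
    }


# Placeholder sizes for illustration only.
print(check_batch_shapes((1, 2, 19, 64, 64), (1, 100, 2, 129)))
```

Inside the training loop, the same check could be called as `check_batch_shapes(images.shape, hrtf.shape)`.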

## Evaluation

We provide an evaluation metric to compare HRTF sets in `metrics.py`.
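
For intuition, one common definition of spectral distortion is the root-mean-square difference between two magnitude spectra on a decibel scale, averaged over all directions and ears. The dependency-free sketch below implements that definition. It is an assumption for illustration only: the implementation in `metrics.py` may differ (for example in frequency range or averaging), and the function names used here are hypothetical.

```python
import math


def spectral_distortion_db(reference, estimate, eps=1e-12):
    """RMS difference in dB between two magnitude spectra of equal length."""
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("spectra must be non-empty and of equal length")
    squared = [
        # eps avoids log10(0) for empty frequency bins.
        (20.0 * math.log10((abs(r) + eps) / (abs(e) + eps))) ** 2
        for r, e in zip(reference, estimate)
    ]
    return math.sqrt(sum(squared) / len(squared))


def mean_spectral_distortion(reference_set, estimate_set):
    """Average the per-spectrum distortion over paired spectra (e.g. directions and ears)."""
    if len(reference_set) != len(estimate_set) or len(reference_set) == 0:
        raise ValueError("spectrum sets must be non-empty and of equal length")
    values = [spectral_distortion_db(r, e) for r, e in zip(reference_set, estimate_set)]
    return sum(values) / len(values)


# An estimate that is ten times too large in every bin is off by 20 dB.
print(spectral_distortion_db([1.0, 2.0, 4.0], [10.0, 20.0, 40.0]))
```

Under this definition, identical spectra give a distortion of 0 dB, and lower values indicate a closer match.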

The following code cell shows how to evaluate a model using the mean spectral distortion on the test data for all three tasks. The results of the three tasks are summed into an overall score, and the submission with the lowest overall score wins.


In [None]:
from inference import BaselineHRTFPredictor
from metrics import MeanSpectralDistortion
import torch
import numpy as np

predictor = BaselineHRTFPredictor()
metric = MeanSpectralDistortion()
results = {}

for task in range(3):
    sd = SonicomDatabase(sonicom_root, training_data=False, task_id=task, folder_structure='v2')
    test_dataloader = DataLoader(sd, batch_size=1, shuffle=False)
    
    total_error = []  # one spectral distortion value per test subject
    for image_batch, hrtf_batch in tqdm.tqdm(test_dataloader):
        for images, ground_truth_hrtf in zip(image_batch, hrtf_batch):
            predicted_hrtf = torch.as_tensor(sd._compute_HRTF(predictor.predict(images).Data_IR))
            total_error.append(metric.get_spectral_distortion(ground_truth_hrtf, predicted_hrtf))
    results[task] = np.mean(total_error)  # average over the test subjects of this task

results
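
The overall result used for ranking is the sum of the three per-task values. A minimal sketch, assuming `results` maps the task ids 0, 1 and 2 to one number each as in the cell above; `overall_score` is a hypothetical helper, and the values in the example call are placeholders, not measured results.

```python
def overall_score(results):
    """Sum the per-task results; requires exactly the task ids 0, 1 and 2."""
    expected_tasks = {0, 1, 2}
    if set(results) != expected_tasks:
        raise ValueError(f"expected results for tasks {sorted(expected_tasks)}")
    return sum(float(results[task]) for task in sorted(expected_tasks))


# Placeholder values for illustration only.
print(overall_score({0: 1.0, 1: 2.0, 2: 3.0}))
```

In the notebook, this could be called as `overall_score(results)` after the evaluation loop has finished.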