# HEAR 2021 Evaluation Example

This notebook provides an example of downloading a pre-processed task and running evaluation using the Wav2Vec2 baseline model.

## 1. Install Dependencies
**Note: You may have to restart the runtime after installation.**

In [1]:
!pip install heareval
!pip install hearbaseline

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting heareval
  Downloading heareval-2021.1.2-py3-none-any.whl (37 kB)
Collecting pynvml
  Downloading pynvml-11.4.1-py3-none-any.whl (46 kB)
Collecting torchinfo
  Downloading torchinfo-1.7.0-py3-none-any.whl (22 kB)
Collecting numpy==1.19.2
  Downloading numpy-1.19.2-cp37-cp37m-manylinux2010_x86_64.whl (14.5 MB)
Collecting numba==0.48
  Downloading numba-0.48.0-1-cp37-cp37m-manylinux2014_x86_64.whl (3.5 MB)
Collecting sed-eval
  Downloading sed_eval-0.2.1.tar.gz (21 kB)
Collecting pytorch-lightning>=1.6
  Downloading pytorch_lightning-1.6.4-py3-none-any.whl (585 kB)
Collecting spotty
  Downloading spotty-1.3.3-py3-none-any.whl (12

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting hearbaseline
  Downloading hearbaseline-2021.1.0-py3-none-any.whl (26 kB)
Collecting transformers
  Downloading transformers-4.19.2-py3-none-any.whl (4.2 MB)
Collecting speechbrain
  Downloading speechbrain-0.5.11-py3-none-any.whl (408 kB)
Collecting torchopenl3
  Downloading torchopenl3-1.0.0-py2.py3-none-any.whl (14 kB)
Collecting torchcrepe
  Downloading torchcrepe-0.0.16-py3-none-any.whl (72.3 MB)
Collecting hyperpyyaml
  Downloading HyperPyYAML-1.0.1.tar.gz (14 kB)
Collecting sentencepiece
  Downloading sentencepiece-0.1.96-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.2 MB)
Collecting huggingface-hub
  Downloadi

## 2. Download the Dataset

We download the Mridangam Tonic dataset at a sample rate of 16000 Hz as an example.

In [2]:
!wget https://github.com/hearbenchmark/hear2021-sample-datasets/raw/main/hear2021-mridangam_tonic-v1.5-full-16000.tar.gz
!tar -xzf hear2021-mridangam_tonic-v1.5-full-16000.tar.gz

--2022-06-03 21:05:18--  https://github.com/hearbenchmark/hear2021-sample-datasets/raw/main/hear2021-mridangam_tonic-v1.5-full-16000.tar.gz
Resolving github.com (github.com)... 140.82.113.4
Connecting to github.com (github.com)|140.82.113.4|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://raw.githubusercontent.com/hearbenchmark/hear2021-sample-datasets/main/hear2021-mridangam_tonic-v1.5-full-16000.tar.gz [following]
--2022-06-03 21:05:18--  https://raw.githubusercontent.com/hearbenchmark/hear2021-sample-datasets/main/hear2021-mridangam_tonic-v1.5-full-16000.tar.gz
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 62455362 (60M) [application/octet-stream]
Saving to: ‘hear2021-mridangam_tonic-v1.5-full-16000.tar.gz’


2022-0
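
Before extracting, it can help to check what the downloaded archive contains. The sketch below uses Python's standard `tarfile` module to list the first few entries. `list_archive_members` and its `limit` parameter are hypothetical names introduced here for illustration; they are not part of `heareval`.

```python
import tarfile
from typing import List


def list_archive_members(archive_path: str, limit: int = 10) -> List[str]:
    """Return the names of the first `limit` entries in a .tar.gz archive."""
    names: List[str] = []
    with tarfile.open(archive_path, mode="r:gz") as archive:
        # Iterating reads entry headers one at a time, so we can stop early
        # without scanning the whole archive.
        for member in archive:
            names.append(member.name)
            if len(names) >= limit:
                break
    return names
```

For example, `list_archive_members("hear2021-mridangam_tonic-v1.5-full-16000.tar.gz")` shows where the files will land after `tar -xzf`; the next step assumes they end up under `./tasks/`.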

## 3. Compute Embeddings Using Wav2Vec2

In [3]:
!python3 -m heareval.embeddings.runner hearbaseline.wav2vec2 --tasks-dir ./tasks/

Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
2022-06-03 21:05:31.459094: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:39] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
embeddings
Importing hearbaseline.wav2vec2
Downloading: 100% 212/212 [00:00<00:00, 361kB/s]
Downloading: 100% 1.73k/1.73k [00:00<00:00, 3.23MB/s]
Downloading: 100% 1.18G/1.18G [00:21<00:00, 58.3MB/s]
Some weights of the model checkpoint at facebook/wav2vec2-large-100k-voxpopuli were not used when initializing Wav2Vec2Model: ['quantizer.codevectors', 'project_q.bias', 'quantizer.weight_proj.bias', 'project_hid.bias', 'quantizer.weight_proj.weight', 'project_hid.weight', 'project_q.weight']
- This IS expected if you are initializing Wav2Vec2Model from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a Be

## 4. Evaluate
Train and predict with a shallow downstream classifier using the computed embeddings.

In [4]:
# Required for determinism
%env CUBLAS_WORKSPACE_CONFIG=:4096:8

# Train and evaluate classifier using hearbaseline embeddings
!python3 -m heareval.predictions.runner embeddings/hearbaseline.wav2vec2/*

Streaming output truncated to the last 5000 lines.
Epoch 344: 100% 7/7 [00:00<00:00, 78.93it/s, loss=0.0451, v_num=8, val_loss=0.565, val_top1_acc=0.806, val_d_prime=3.220, val_aucroc=0.958, val_mAP=0.830]
Epoch 347:   0% 0/7 [00:00<?, ?it/s, loss=0.045, v_num=8, val_loss=0.565, val_top1_acc=0.806, val_d_prime=3.220, val_aucroc=0.958, val_mAP=0.830]
Validation: 0it [00:00, ?it/s]
Validation:   0% 0/2 [00:00<?, ?it/s]
  x = self.activation(x)

Epoch 347: 100% 7/7 [00:00<00:00, 79.40it/s, loss=0.0444, v_num=8, val_loss=0.558, val_top1_acc=0.810, val_d_prime=3.220, val_aucroc=0.958, val_mAP=0.831]
Epoch 350:   0% 0/7 [00:00<?, ?it/s, loss=0.0438, v_num=8, val_loss=0.558, val_top1_acc=0.810, val_d_prime=3.220, val_aucroc=0.958, val_mAP=0.831]
Validation: 0it [00:00, ?it/s]
Validation:   0% 0/2 [00:00<?, ?it/s]
  x = self.activation(x)

Epoch 350: 100% 7/7 [00:00<00:00, 51.18it/s, loss=0.0438, v_num=8, val_loss=0.570, val_top1_acc=0.801, val_d_prime=3.230, val_auc
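
The `CUBLAS_WORKSPACE_CONFIG` setting in the cell above matters because PyTorch's deterministic mode rejects some cuBLAS operations on CUDA 10.2 or later unless this variable is set. How `heareval` enables determinism internally is not shown in this notebook; the sketch below only illustrates how a standalone PyTorch script might do the equivalent. The seed value is an arbitrary choice for illustration.

```python
import os

# Set this before any CUDA work starts so that cuBLAS picks it up.
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

try:
    import torch
except ImportError:  # torch is optional for this sketch
    torch = None

if torch is not None:
    torch.manual_seed(0)  # arbitrary seed, chosen for illustration
    # Raise an error if an op without a deterministic implementation is used.
    torch.use_deterministic_algorithms(True)
```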

## 5. Results

In [5]:
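import json
# Load the scores file produced in the previous step; `with` closes the file afterwards
with open("embeddings/hearbaseline.wav2vec2/mridangam_tonic-v1.5-full/test.predicted-scores.json") as f:
    results = json.load(f)
results["aggregated_scores"]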

{'epoch_mean': 258.2,
 'epoch_std': 122.77296119260137,
 'test_aucroc_mean': 0.9675908106543016,
 'test_aucroc_std': 0.005460301748738511,
 'test_d_prime_mean': 3.2592987690515245,
 'test_d_prime_std': 0.1784747735771514,
 'test_loss_mean': 0.5222560465335846,
 'test_loss_std': 0.05447612981155485,
 'test_mAP_mean': 0.8766347712036768,
 'test_mAP_std': 0.0230810459628224,
 'test_score_mean': 0.8290078163146972,
 'test_score_std': 0.01902561269892375,
 'test_top1_acc_mean': 0.8290078163146972,
 'test_top1_acc_std': 0.01902561269892375,
 'time_in_min_mean': 0.453901363213857,
 'time_in_min_std': 0.1486920540179773,
 'validation_score_mean': 0.8300140738487244,
 'validation_score_std': 0.014690384121085987}

In [6]:
results["aggregated_scores"]['test_score_mean']

0.8290078163146972
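
Since each metric in `aggregated_scores` comes as a `_mean`/`_std` pair, a compact "mean ± std" view can be easier to read. The sketch below assumes only that key naming pattern, as seen in the output above; `summarize_scores` is a hypothetical helper, not part of `heareval`, and the example values are copied from that output.

```python
from typing import Dict, List


def summarize_scores(scores: Dict[str, float]) -> List[str]:
    """Pair each `<metric>_mean` with `<metric>_std` as 'metric: mean ± std'."""
    lines: List[str] = []
    for key in sorted(scores):
        if not key.endswith("_mean"):
            continue
        metric = key[: -len("_mean")]
        std = scores.get(metric + "_std")
        if std is None:
            # No matching std entry: report the mean alone.
            lines.append(f"{metric}: {scores[key]:.4f}")
        else:
            lines.append(f"{metric}: {scores[key]:.4f} ± {std:.4f}")
    return lines


# Subset of the aggregated scores printed above.
example = {
    "test_top1_acc_mean": 0.8290078163146972,
    "test_top1_acc_std": 0.01902561269892375,
    "test_aucroc_mean": 0.9675908106543016,
    "test_aucroc_std": 0.005460301748738511,
}
for line in summarize_scores(example):
    print(line)
# test_aucroc: 0.9676 ± 0.0055
# test_top1_acc: 0.8290 ± 0.0190
```

In the notebook, the same helper could be applied directly as `summarize_scores(results["aggregated_scores"])`.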