# Dataset Generator

We need to generate a dataset of reflections, which first requires downloading source data from the internet. Make sure you have already executed the appropriate cells in Cepbimo - Preamble.

## Anechoic Data

To generate a dataset of reflections, we start with anechoic data. A dataset of anechoic orchestral recordings is available thanks to [(Pätynen et al., 2008)](#patynen-lokki-08).
***
> 1. [Mozart's An aria of Donna Elvira](https://mediatech.aalto.fi/images/research/virtualacoustics/recordings/mozart_mp3.zip)
>   - Flute, clarinet, bassoon, french horns (1-2), violin (I), violin (II), viola, cello, contrabass, soprano (soloist)
> 2. [Beethoven's Symphony no. 7, I movement, bars 1-53](https://mediatech.aalto.fi/images/research/virtualacoustics/recordings/beethoven_mp3.zip)
>   - Flutes (1-2), oboes (1-2), clarinets (1-2), bassoon (1-2), french horns (1-2),
>     trumpets (1-3), timpani, violin (I), violin (II), viola, cello, contrabass
> 3. [Bruckner's Symphony no. 8, II movement, bars 1-61](https://mediatech.aalto.fi/images/research/virtualacoustics/recordings/bruckner_mp3.zip)
>   - Flutes (1-3), oboes (1-3), clarinets (1-3), bassoon (1-3),
>     french horns (1-8), trumpets (1-3), trombones (1-3), tuba, timpani,
>     violin (I) (two divisi), violin (II) (two divisi), viola (two divisi),
>     cello (two divisi), contrabass (two divisi)
> 4. [Mahler's Symphony no. 1, IV movement, bars 1-85](https://mediatech.aalto.fi/images/research/virtualacoustics/recordings/mahler_mp3.zip)
>   - Piccolo (1-2) (fl1), flutes (1-2) (fl3), oboes (1-4), clarinets (1-4),
>     bassoon (1-3), french horns (1-7), trumpets (1-4), tuba, timpani,
>     percussions (1-2), violin I (two divisi), violin II (two divisi),
>     viola, cello, contrabass
***
For more information on the anechoic data we use, see the [anechoic recordings page](https://research.cs.aalto.fi//acoustics/virtual-acoustics/research/acoustic-measurement-and-analysis/85-anechoic-recordings.html).

In [1]:
# @title Anechoic Data
# @markdown Anechoic data recorded by [(Pätynen et al., 2008)](#patynen-lokki-08)
# @markdown > Links to anechoic recordings
file_links = [
              'https://mediatech.aalto.fi/images/research/virtualacoustics/recordings/beethoven_mp3.zip',
              'https://mediatech.aalto.fi/images/research/virtualacoustics/recordings/mozart_mp3.zip',
              'https://mediatech.aalto.fi/images/research/virtualacoustics/recordings/bruckner_mp3.zip',
              'https://mediatech.aalto.fi/images/research/virtualacoustics/recordings/mahler_mp3.zip']
# @markdown > Download files if they don't exist
for link in file_links:
  # -nc (no-clobber) skips any file that already exists locally
  ! wget -nc {link}

'wget' is not recognized as an internal or external command,
operable program or batch file.
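
The error output above is what a Windows command prompt prints when `wget` is not installed, so the download cell did nothing in that run. As a hedged alternative, the sketch below does the same job with the Python standard library only. The helper name `download_if_missing` and its `opener` parameter are hypothetical names introduced for illustration; they are not part of the original notebook.

```python
import urllib.request
from pathlib import Path


def download_if_missing(url, dest_dir=".", opener=urllib.request.urlretrieve):
    """Download `url` into `dest_dir` unless the file is already there.

    Mirrors the intent of `wget -nc`: an existing file is never overwritten.
    Returns (local_path, downloaded), where `downloaded` is False when the
    file already existed and no request was made.
    """
    # Use the last URL path component as the local file name.
    target = Path(dest_dir) / url.rsplit("/", 1)[-1]
    if target.exists():
        return target, False
    target.parent.mkdir(parents=True, exist_ok=True)
    # `opener` is injectable (an assumption of this sketch) so the function
    # can be exercised without network access.
    opener(url, str(target))
    return target, True


# Usage inside the notebook (performs real downloads, so left commented out):
# for link in file_links:
#     print(download_if_missing(link))
```

Note that, like `wget -nc`, this only checks whether the file exists. An interrupted download leaves a partial file that would be skipped on the next run, so delete it manually before retrying.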


## Head-related Transfer Functions

Next we need a dataset of head-related transfer functions (HRTF). A set of HRTFs is available thanks to [(Gardner & Martin, 1994)](#gardner-martin-94) and MIT Media Lab.
***
For more information, see [KEMAR HRTF measurements](https://sound.media.mit.edu/resources/KEMAR.html).

In [None]:
# @title HRTF Data
# @markdown HRTF data recorded by [(Gardner & Martin, 1994)](#gardner-martin-94)
! wget -nc https://sound.media.mit.edu/resources/KEMAR/full.zip

## Extracting Data

Extract the archived data to the `data/` directory.
***
**Warning: Overwrites existing files**

In [None]:
# @title Extracting Data
# @markdown Extract zipped data to `data/` directory.
# @markdown > List of files to unzip
zip_files = [
             ['beethoven_mp3.zip', 'anechoic/beethoven'],
             ['mozart_mp3.zip', 'anechoic/mozart'],
             ['bruckner_mp3.zip', 'anechoic/bruckner'],
             ['mahler_mp3.zip', 'anechoic/mahler'],
             ['full.zip', 'hrtfs']]
# @markdown > Unzip each archive to its subdirectory of `data/`
for archive, subdir in zip_files:
  # -o overwrites existing files without prompting
  !unzip -o {archive} -d data/{subdir}
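
On systems without the `unzip` binary, the same extraction can be approximated with Python's `zipfile` module. This is a minimal sketch, not the notebook's own method: the function name `extract_archives` and its `data_dir` parameter are hypothetical, and it assumes the `[archive, subdirectory]` pair layout of `zip_files` shown above.

```python
import zipfile
from pathlib import Path


def extract_archives(zip_files, data_dir="data"):
    """Extract each [archive, subdirectory] pair into data_dir/subdirectory.

    Existing files are overwritten, matching `unzip -o`.
    Archives that are not found are skipped and returned as a list.
    """
    missing = []
    for archive, subdir in zip_files:
        if not Path(archive).is_file():
            missing.append(archive)
            continue
        dest = Path(data_dir) / subdir
        dest.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dest)
    return missing


# Usage inside the notebook (left commented out; requires the downloaded zips):
# not_found = extract_archives(zip_files)
# print("Missing archives:", not_found)
```

Returning the list of missing archives, rather than raising on the first one, makes it easier to see at a glance which downloads failed.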

## References

* Gardner, B., & Martin, K. (1994). HRTF measurements of a KEMAR dummy-head microphone. MIT Media Lab. Perceptual Computing-Technical Report, 280, 1-7.
<a name='gardner-martin-94'></a>

* Pätynen, J., Pulkki, V., & Lokki, T. (2008). Anechoic recording system for symphony orchestra. Acta Acustica united with Acustica, 94(6), 856-865.
<a name='patynen-lokki-08'></a>