<a href="https://colab.research.google.com/github/rahiakela/audio-processing-research-and-practice/blob/main/fundamentals-of-music-processing/01-basics/01_multimedia_basics.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Multimedia Basics

<p>
In this notebook, we give a short overview of how to integrate multimedia objects (in particular, audio, image, and video objects) into a Jupyter notebook. Rather than being comprehensive, we only present a selection of possibilities as used in the other FMP notebooks. In particular, we discuss two alternatives: a direct integration of image, video, and audio elements using HTML tags, and an integration using the module <code>IPython.display</code>.
</p>

**Reference**

[Basics](https://www.audiolabs-erlangen.de/resources/MIR/FMP/B/B.html)

[Multimedia Basics](https://www.audiolabs-erlangen.de/resources/MIR/FMP/B/B_Multimedia.html)

## Setup

In [None]:
import os
import IPython.display as ipd
import librosa
import numpy as np
import pandas as pd
%matplotlib inline

In [None]:
!wget https://github.com/rahiakela/audio-processing-research-and-practice/raw/main/fundamentals-of-music-processing/01-basics/data/data.zip

!unzip data.zip
!rm -rf data.zip

## Audio Objects

### Audio: HTML `<audio>` Tag

The HTML `<audio>` tag defines an in-browser audio player that plays back a specified audio file (MP3, WAV, OGG); see [here](https://www.w3schools.com/Tags/tag_audio.asp) for details. Note that the functionality and the visual appearance of the audio player depend on the browser used. The `<audio>` tag can be used within a markdown cell and does not require any Python.

<audio src="../data/B/FMP_B_Note-C4_Piano.mp3" type="audio/mpeg" controls="controls"></audio>

### Audio: Using  <code>IPython.display.Audio</code>

An alternative is to use the module <code>IPython.display</code>, which is an application programming interface (API) for displaying various tools in IPython. As for audio, the following class is available ([`IPython` version 6.0 or higher](https://ipython.readthedocs.io/en/stable/api/generated/IPython.display.html)):

`IPython.display.Audio(data=None, filename=None, url=None, embed=None, rate=None, autoplay=False, normalize=True, *, element_id=None)`

<div class="alert alert-block alert-warning">
<strong>Warning:</strong> By default, <code>IPython.display.Audio</code> normalizes the audio (dividing by the maximum over all sample values) before playback. This may be unwanted for applications where the volume of the audio should be kept at its original level. For examples, see the <a href="../B/B_PythonAudio.html">FMP notebook on Audio</a>.
</div> 
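To make the effect of this normalization concrete, the following minimal sketch reproduces the scaling with NumPy only. The test signal (a quiet 440 Hz sinusoid) and all variable names are assumptions chosen for illustration; they are not taken from the FMP notebooks.

```python
import numpy as np

fs = 4000                                    # sampling rate in Hz (illustrative choice)
t = np.arange(fs) / fs                       # time stamps for one second of audio
x_quiet = 0.1 * np.sin(2 * np.pi * 440 * t)  # quiet sinusoid with peak amplitude of about 0.1

# Peak normalization: divide by the maximum absolute sample value
peak = np.max(np.abs(x_quiet))
x_normalized = x_quiet / peak

print(f"Peak before normalization: {peak:.3f}")
print(f"Peak after normalization:  {np.max(np.abs(x_normalized)):.3f}")
```

After this scaling, the quiet signal plays back at full level. To keep the original level, one can pass `normalize=False` to `IPython.display.Audio`; according to the IPython documentation, the samples must then lie in the range from -1 to 1.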

When used in a code cell, <code>IPython.display.Audio</code> creates an in-browser audio player. The following two options are conceptually different:

* When using the keyword argument `filename`, the audio file is loaded from the specified path and **embedded** into the notebook (with default `embed=True`). 
* When using the keyword argument `url`, the player is **linked** to the audio file by the specified URL (with default `embed=False`). 

Note that if you want the audio to be playable later with no internet connection (or with no local audio file available), you need to embed the audio file into the notebook. This can be done using the first option. The following example illustrates the difference between the two options. 

In [None]:
path_filename = os.path.join(".", "data", "FMP_B_Note-C4_Piano.mp3")

audio_element_filename = ipd.Audio(filename=path_filename)
print(f"Size of <audio> tag (with embedded audio file): {len(audio_element_filename._repr_html_().encode('utf8'))} Bytes")
ipd.display(audio_element_filename)

audio_element_url = ipd.Audio(url=path_filename)
print(f"Size of <audio> tag (with linked audio file): {len(audio_element_url._repr_html_().encode('utf8'))} Bytes")
ipd.display(audio_element_url)
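An embedded audio file is stored inside the `<audio>` tag as a base64-encoded data URI, so the tag is roughly one third larger than the file itself. The following sketch illustrates this overhead using only the standard library; the payload is a placeholder of zero bytes (an assumption for illustration), not a real audio file.

```python
import base64

# Placeholder payload standing in for the bytes of an audio file (assumption)
payload = bytes(30_000)

# Base64 maps every 3 input bytes to 4 output characters
encoded = base64.b64encode(payload)
growth = len(encoded) / len(payload)

print(f"Raw size:      {len(payload)} bytes")
print(f"Base64 size:   {len(encoded)} bytes")
print(f"Growth factor: {growth:.3f}")
```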

### Audio: WAV and MP3

Embedding audio files may lead to very large Jupyter notebooks (and to large files when exported as HTML). This holds in particular when embedding uncompressed audio encoded as WAV files.

For example, encoding a song of five to ten minutes in CD quality (44100 Hz, stereo) may easily lead to a file size of more than 50 MB.

Therefore, to reduce the size, one may consider the following:

* Trim audio files to have short durations.
* Reduce the sampling rate.
* Convert to mono. 
* Use the MP3 audio coding format.
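The size estimate and the effect of the first three measures can be checked with a short calculation. The sketch below assumes uncompressed PCM audio with 16 bits (2 bytes) per sample and ignores the small WAV header; the function name `wav_size_bytes` and the reduced settings (30 seconds, 22050 Hz, mono) are hypothetical choices for illustration.

```python
def wav_size_bytes(duration_s, fs=44100, channels=2, bytes_per_sample=2):
    """Approximate size of uncompressed PCM audio in bytes (header ignored)."""
    return int(duration_s * fs * channels * bytes_per_sample)

# Five minutes in CD quality (44100 Hz, stereo, 16 bit)
cd_quality = wav_size_bytes(5 * 60)

# Trimmed to 30 seconds, sampling rate halved, converted to mono
reduced = wav_size_bytes(30, fs=22050, channels=1)

print(f"CD quality, 5 min: {cd_quality / 1e6:.2f} MB")
print(f"Reduced, 30 s:     {reduced / 1e6:.2f} MB")
```

Halving the sampling rate and converting to mono each halve the size, while trimming scales it linearly with the duration.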

The following example shows the difference in file size of a WAV and MP3 audio file.

In [None]:
path_filename_wav = os.path.join(".", "data", "FMP_B_Note-C4_Piano.wav")

audio_element_wav = ipd.Audio(filename=path_filename_wav)
print(f"Size of <audio> tag (with embedded WAV file): {len(audio_element_wav._repr_html_().encode('utf8'))} Bytes")
ipd.display(audio_element_wav)

path_filename_mp3 = os.path.join(".", "data", "FMP_B_Note-C4_Piano.mp3")
audio_element_mp3 = ipd.Audio(filename=path_filename_mp3)
print(f"Size of <audio> tag (with embedded MP3 file): {len(audio_element_mp3._repr_html_().encode('utf8'))} Bytes")
ipd.display(audio_element_mp3)

### Audio: Waveform-Based Signals

One may also use `IPython.display.Audio` to embed waveform-based audio signals (either mono or stereo). 

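The following code example shows how to read a WAV file and an MP3 file using the Python package `librosa`.

Note that in both cases, the audio files are converted into waveform representations. As a result, the embedded audio is of comparable size in both cases, regardless of the format of the source file.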

In [None]:
x_wav, fs_wav = librosa.load(path_filename_wav, sr=None)
audio_wav = ipd.Audio(data=x_wav, rate=fs_wav)
print(f"Size of <audio> tag (coming from WAV): {len(audio_wav._repr_html_().encode('utf8'))} Bytes")
ipd.display(audio_wav)

x_mp3, fs_mp3 = librosa.load(path_filename_mp3, sr=None)
audio_mp3 = ipd.Audio(data=x_mp3, rate=fs_mp3)
print(f"Size of <audio> tag (coming from MP3): {len(audio_mp3._repr_html_().encode('utf8'))} Bytes")
ipd.display(audio_mp3)

The next example shows how to generate a stereo audio signal and how to embed it into the Jupyter notebook. 

For explanations of the code example, we refer to the [FMP notebook on waveforms](../C1/C1S3_Waveform.html).

<div class="alert alert-block alert-warning">
<strong>Warning:</strong> Depending on the web browser, only specific sampling rates may be supported for audio playback. In the following example, we use the sampling rate <code>fs = 4000</code>, which seems to work for most browsers.
</div> 

In [None]:
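fs = 4000      # sampling rate in Hz
duration = 4   # duration in seconds
# Sample times n / fs; endpoint=False keeps the spacing at exactly 1 / fs
t = np.linspace(0, duration, fs * duration, endpoint=False)
signal_left = np.sin(2 * np.pi * 200 * t)    # 200 Hz sinusoid for the left channel
signal_right = np.sin(2 * np.pi * 600 * t)   # 600 Hz sinusoid for the right channel
signal_stereo = [signal_left, signal_right]  # one waveform per channel
ipd.Audio(data=signal_stereo, rate=fs)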
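For multichannel data, the IPython documentation describes the expected layout as one row per channel, that is, shape `(channels, samples)`. Some audio readers return stereo data as `(samples, channels)` instead; in that case, a transpose is needed before passing the array on. The following sketch uses a hypothetical array `x_samples_first` to illustrate the conversion.

```python
import numpy as np

fs = 4000                      # sampling rate in Hz
num_samples = 4 * fs           # four seconds of audio
n = np.arange(num_samples)

# Hypothetical stereo signal stored as (samples, channels)
x_samples_first = np.zeros((num_samples, 2))
x_samples_first[:, 0] = np.sin(2 * np.pi * 200 * n / fs)  # left channel
x_samples_first[:, 1] = np.sin(2 * np.pi * 600 * n / fs)  # right channel

# Transpose to (channels, samples), one row per channel
x_channels_first = x_samples_first.T

print(x_samples_first.shape)   # (16000, 2)
print(x_channels_first.shape)  # (2, 16000)
```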

## Image Objects

### Image: HTML `<img>` Tag

Similar to audio, there are many ways to integrate image objects into a Jupyter notebook. First of all, one can use the [`<img>` tag](https://www.w3schools.com/Tags/tag_img.asp) within a markdown cell without requiring any Python. The following figure shows a self-similarity matrix (SSM) of a recording of Brahms' Hungarian Dance No. 5, see Section 4.2.2 of <a href="http://www.music-processing.de/">[Müller, FMP, Springer 2015].</a>

<img src="https://github.com/rahiakela/audio-processing-research-and-practice/blob/main/fundamentals-of-music-processing/01-basics/data/FMP_B_Brahms-SSM.png?raw=1" width="300px" align="middle" alt="C0">

HTML also allows for showing animated GIFs. This simple format encodes a number of images or frames, which are presented in a specific order to create a short animation. Using animated GIFs is a nice way to illustrate processing pipelines. For example, the following animated GIF shows the previous SSM in its original form along with a version after applying smoothing as well as thresholding and scaling.

<img src="https://github.com/rahiakela/audio-processing-research-and-practice/blob/main/fundamentals-of-music-processing/01-basics/data/FMP_B_Brahms-SSM.gif?raw=1" width="300px" alt="SSM">

### Image: Using  <code>IPython.display.Image</code>

Similar to the audio case, an alternative is to use the module <code>IPython.display</code> to create an image given the path to a PNG/JPEG/GIF file. As for images, the following class is available:

`IPython.display.Image(data=None, url=None, filename=None, format=None, embed=None, width=None, height=None, retina=False, unconfined=False, metadata=None)`

Again, there are two options, which either embed or link an image object:

* When using the keyword argument `filename`, the image file is loaded from the specified path and **embedded** into the notebook (with default `embed=True`). 
* When using the keyword argument `url`, the data is **linked** by the specified URL (with default `embed=False`). 

Here are some examples: