<div> 
<div><h1>Python Audio</h1></div>
</div>

<br/>

<p>
There are several ways to read and write <strong>audio files</strong> in Python, using different packages. This notebook lists some options and discusses their advantages and disadvantages. 

## LibROSA

One option is to use librosa's functions [`librosa.load`](https://librosa.github.io/librosa/generated/librosa.core.load.html) and `librosa.output.write_wav`. 

* By default, `librosa.load` resamples the audio to 22050 Hz. Setting `sr=None` keeps the native sampling rate.
* Loaded audio is always converted to float in the range of $[-1, 1]$.
* `librosa.load` is able to read mp3 files when [`ffmpeg`](https://ffmpeg.org/) is available.
* `librosa.output.write_wav` always uses the data type of the numpy array (e.g. 64-bit float).

Note that some of the librosa functionality for reading and writing audio files may be dropped in later versions as 
[discussed in this thread](https://github.com/librosa/librosa/issues/509).

## PySoundFile

The audio library [`PySoundFile`](https://pysoundfile.readthedocs.io/en/0.9.0/), which is supported by several other libraries, also provides functions for reading and writing sound files. In particular, it contains the functions [`soundfile.read`](https://pysoundfile.readthedocs.io/en/latest/#soundfile.read) and [`soundfile.write`](https://pysoundfile.readthedocs.io/en/latest/#soundfile.write). 

* By default, the loaded audio is converted to float in the range of $[-1, 1]$. This can be changed with the `dtype` keyword.
* When writing, it uses signed 16-bit PCM (`subtype='PCM_16'`) by default.
* There are no resampling options.
* There is no option to read mp3 files.


## SciPy

SciPy offers the [`scipy.io.wavfile`](https://docs.scipy.org/doc/scipy/reference/io.html#module-scipy.io.wavfile) module, which also provides functions for reading and writing wav files. However, not all variants of the wav format are supported. For example, 24-bit integer wav files are not allowed. Furthermore, certain metadata fields in a wav file may also lead to errors.
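
As a minimal sketch of the module's basic usage, the following round trip writes a 16-bit integer signal and reads it back. The test tone, the sampling rate `Fs`, and the file name `sine_scipy.wav` are assumptions made for illustration. Note that integer data is returned as integers, so any scaling to a float range is left to the caller.

```python
import os
import tempfile

import numpy as np
from scipy.io import wavfile

# Synthesize one second of a 440 Hz sine tone as 16-bit integer samples.
Fs = 22050
t = np.arange(Fs) / Fs
x_int16 = np.round(0.5 * 32767 * np.sin(2 * np.pi * 440 * t)).astype(np.int16)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'sine_scipy.wav')  # hypothetical file name
    wavfile.write(path, Fs, x_int16)      # int16 input is stored as 16-bit PCM
    Fs_read, x_read = wavfile.read(path)  # returns the tuple (rate, data)

# Manual conversion of the integer samples to floats within [-1, 1).
x_float = x_read.astype(np.float64) / 32768
print(Fs_read, x_read.dtype, x_float.min(), x_float.max())
```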

## Normalized Audio Playback 

We introduced the class

`IPython.display.Audio(data=None, filename=None, url=None, embed=None, rate=None, autoplay=False, normalize=True, *, element_id=None)`

for audio playback ([`IPython` version 6.0 or higher](https://ipython.readthedocs.io/en/stable/api/generated/IPython.display.html)). By default, this class **normalizes** the audio (dividing by the maximum absolute value over all samples) before playback. This may be unwanted in applications where the volume of the audio should be kept at its original level. To avoid normalization, one has to set the parameter `normalize=False`. However, this requires that all samples of the audio lie within the range between $-1$ and $1$. In the following code cell, we give illustrative examples for the two options.
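
Before turning to playback itself, the arithmetic behind the two options can be sketched with NumPy alone. The signal, the sampling rate `Fs`, and the helper `can_play_unnormalized` are hypothetical names introduced for illustration; the sketch only mimics the peak normalization described above and does not call IPython.

```python
import numpy as np

# Quiet test tone: one second of a 440 Hz sine with peak amplitude of about 0.1.
Fs = 8000
t = np.arange(Fs) / Fs
x = 0.1 * np.sin(2 * np.pi * 440 * t)

# Peak normalization as described for normalize=True:
# divide by the maximum absolute sample value, so the peak magnitude becomes 1.
x_normalized = x / np.max(np.abs(x))

def can_play_unnormalized(signal):
    """Hypothetical helper: check the range requirement for normalize=False."""
    return bool(np.all(np.abs(signal) <= 1))

print(np.max(np.abs(x)), np.max(np.abs(x_normalized)))
print(can_play_unnormalized(x), can_play_unnormalized(3 * x_normalized))
```

With normalization, the quiet tone is played back at full scale, so its original level is lost. Passing `x` with `normalize=False` would keep the level, which is only possible because all of its samples lie within $[-1, 1]$.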