3D Audio Panner

This code provides a Python implementation of a 3D Audio Panner with a GUI, using the head-related impulse responses (HRIR) recorded in the CIPIC HRTF Database by the CIPIC Interface Laboratory at UC Davis.

Getting Started

The code has been tested with Python 3.7. The next section lists the prerequisites for running the program.

Prerequisites

The code depends on the following external libraries: NumPy, SciPy, Pillow and PyAudio. These can be installed with pip, Python's built-in package manager. See Python's tutorial on installing packages for more information. In short, the libraries can be installed as follows:

pip install numpy
pip install scipy
pip install pillow
pip install pyaudio

Usage:

Run HRTF_convolver.py.
Select a subject from the CIPIC database. For the best experience, select a subject whose anthropometric measurements are similar to your own.

Note: Due to storage limitations, the repository includes only 4 subjects from the database. The full database is ~170 MB and has 45 subjects. It can be downloaded for free from the CIPIC webpage. Make sure to download the MATLAB version of the database. To use it, simply replace the folder CIPIC_hrtf_database with the one you downloaded.
Press Play to start playing the default sound file. Make sure you are using headphones as your audio output. Move the Azimuth and Elevation sliders to position the sound in 3D space. You can load your own audio file with File/Load audio file. There are also other sound samples in the folder resources/sounds.

IMPORTANT: For now, the only supported format is a mono WAV file with a 44100 Hz sample rate and 16-bit depth.
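A file can be checked against these requirements before loading it. The following is a minimal sketch using only Python's standard `wave` module; the function name `check_wav_format` is hypothetical and is not part of HRTF_convolver.py.

```python
import wave


def check_wav_format(path):
    """Return a list of format problems; an empty list means the file
    is mono, 44100 Hz, 16-bit (hypothetical helper, not from the repo)."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1:
            problems.append(f"expected 1 channel, found {wav.getnchannels()}")
        if wav.getframerate() != 44100:
            problems.append(f"expected 44100 Hz, found {wav.getframerate()} Hz")
        # getsampwidth() returns bytes per sample, so 16 bits is 2 bytes.
        if wav.getsampwidth() != 2:
            problems.append(f"expected 16-bit, found {8 * wav.getsampwidth()}-bit")
    return problems
```

Calling `check_wav_format("my_sound.wav")` (the file name is only an example) returns an empty list when the file can be used as is.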

You can save the audio rendered at the currently selected Azimuth/Elevation pair with File/Save audio file. Lastly, you can choose to use a crossover so that low frequencies are not spatialized, since low frequencies are largely non-directional. Go to Settings/Change cutoff frequency to set the desired frequency. By default, the crossover is set at 200 Hz.
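One way to picture the crossover is as a band split: the signal is separated into a low band and a high band at the cutoff frequency, and only the high band is spatialized. The sketch below assumes a 4th-order Butterworth filter pair from SciPy; the filter type, the order and the function name `split_bands` are assumptions for illustration, not necessarily what the program uses.

```python
import numpy as np
from scipy.signal import butter, sosfilt


def split_bands(x, cutoff_hz=200.0, fs=44100, order=4):
    """Split a mono signal into (low, high) bands at cutoff_hz.

    Assumption: Butterworth low-pass and high-pass filters of the same order.
    """
    sos_low = butter(order, cutoff_hz, btype="lowpass", fs=fs, output="sos")
    sos_high = butter(order, cutoff_hz, btype="highpass", fs=fs, output="sos")
    return sosfilt(sos_low, x), sosfilt(sos_high, x)
```

Under this scheme the high band would be filtered with the HRIRs, and the low band would then be added unchanged to both the left and right channels.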

Implementation details

Before sound reaches the auditory system, it is filtered by the diffraction and reflection properties of the head, pinna, and torso. This information is captured in the head-related transfer function (HRTF), a pair of functions (one for each ear) that characterizes how an ear receives a sound from a point in space. The HRTF is highly dependent on the location of the sound source relative to the listener, which is the main reason we are able to localize the sound source. A pair of HRTFs for the two ears can be used to synthesize a binaural sound that seems to come from a particular point in space.

The CIPIC database provides head-related impulse responses (HRIRs), that is, the inverse Fourier transforms of the HRTFs. Positioning a sound source in a virtual space using HRIRs consists of convolving the mono signal with the HRIRs for the left and right ears:

$x_{L, R}(t) = h_{L, R}(t) * x(t)$
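The equation above maps directly to two one-dimensional convolutions. The following is a minimal offline sketch with NumPy; the function name `spatialize` and the argument names are hypothetical, and the HRIR pair is assumed to be already loaded as two 1-D arrays.

```python
import numpy as np


def spatialize(x, hrir_left, hrir_right):
    """Convolve a mono signal with a left/right HRIR pair.

    Returns an array of shape (len(x) + len(hrir) - 1, 2),
    with the left channel in column 0 and the right channel in column 1.
    """
    left = np.convolve(x, hrir_left)
    right = np.convolve(x, hrir_right)
    return np.stack([left, right], axis=1)
```

Feeding a unit impulse as `x` returns the HRIR pair itself, which is a quick way to sanity-check the channel order.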

HRIR Interpolation

The CIPIC database has HRIRs for a finite set of points in space. Nevertheless, it is desirable to pan the audio source smoothly through space, without audible jumps from one point to another. Therefore, HRIRs must be interpolated for the points in space at which none were recorded. For this panner, the interpolation approach outlined by H. Gamper in this paper (available for free) is used.

In short, the method consists of triangulating the measured set of points (every combination of azimuth and elevation). These points then become the vertices of the triangles in the triangulation. Lastly, the interpolated HRIR at any point X inside a triangle is computed as the weighted sum of the HRIRs measured at the vertices of that triangle, using the barycentric coordinates of X as interpolation weights.
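The triangulation and barycentric weighting described above can be sketched with `scipy.spatial.Delaunay`. This is an illustration under stated assumptions, not the program's actual code: the function name `interpolate_hrir` is hypothetical, the measured directions are assumed to be given as 2-D (azimuth, elevation) points, and the HRIRs as an array with one row per measured point.

```python
import numpy as np
from scipy.spatial import Delaunay


def interpolate_hrir(tri, hrirs, point):
    """Interpolate an HRIR at `point` = (azimuth, elevation).

    tri:   Delaunay triangulation of the measured (azimuth, elevation) points
    hrirs: array of shape (n_points, n_taps), one HRIR per measured point
    """
    point = np.asarray(point, dtype=float)
    simplex = int(tri.find_simplex(point))
    if simplex < 0:
        raise ValueError("point lies outside the measured grid")
    # tri.transform holds, per triangle, the affine map to barycentric coords.
    T = tri.transform[simplex]
    b = T[:2] @ (point - T[2])              # first two barycentric weights
    weights = np.append(b, 1.0 - b.sum())   # the three weights sum to 1
    vertices = tri.simplices[simplex]       # indices of the triangle's corners
    return weights @ hrirs[vertices]        # weighted sum of the vertex HRIRs
```

A typical use would build the triangulation once with `tri = Delaunay(points)` and then call `interpolate_hrir` every time the sliders move. At a measured point the weights reduce to 1 for that vertex and 0 for the others, so the measured HRIR is returned unchanged.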

Real-time convolution

It is also desirable that filtering is performed in real time, so that the audio changes as the user moves the azimuth/elevation sliders. This procedure is not straightforward: since the convolution of signals of length L and M has length L+M-1, it is not possible to simply concatenate the output blocks. Another issue that must be taken into account is the time-domain aliasing (wrap-around) introduced by the circular convolution that the FFT computes.

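To overcome these issues, the overlap-save method is implemented. In short, this method breaks the input audio signal into chunks of L new samples, each preceded by the last M-1 samples of the previous chunk. Each resulting chunk of L+M-1 samples is transformed into the frequency domain with the FFT and multiplied by the DFT of the impulse response (which is equivalent to circular convolution in the time domain). The product is transformed back to the time domain, and only the last L samples of the resulting L+M-1 samples are kept, since the first M-1 samples are corrupted by wrap-around.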
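The steps above can be sketched offline with NumPy. This is a minimal illustration with a fixed impulse response; the function name `overlap_save` and the default block length are assumptions, and in the real-time case the impulse response would change from block to block as the sliders move.

```python
import numpy as np


def overlap_save(x, h, block_len=1024):
    """Filter x with impulse response h using the overlap-save method.

    Returns the first len(x) samples of the linear convolution of x and h.
    """
    M = len(h)
    L = block_len
    N = L + M - 1                      # FFT size: L new samples + M-1 of history
    H = np.fft.rfft(h, N)              # DFT of the zero-padded impulse response
    n_blocks = -(-len(x) // L)         # ceil(len(x) / L)
    # M-1 zeros act as history for the first block; trailing zeros complete
    # the last block.
    padded = np.concatenate([np.zeros(M - 1), x, np.zeros(n_blocks * L - len(x))])
    y = np.empty(n_blocks * L)
    for k in range(n_blocks):
        segment = padded[k * L : k * L + N]
        block = np.fft.irfft(np.fft.rfft(segment) * H, N)
        # The first M-1 output samples are corrupted by wrap-around: drop them.
        y[k * L : (k + 1) * L] = block[M - 1 :]
    return y[: len(x)]
```

The output blocks can be concatenated directly because each one contains only samples that are free of circular aliasing, which is what makes the method suitable for block-by-block playback.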

License

This project is licensed under the MIT License - see the LICENSE.md file for details.
