
Omnitone: Spatial Audio on the Web

Omnitone is a robust implementation of an FOA (first-order ambisonic) decoder with binaural rendering, written with the Web Audio API. Its decoding process uses multiple gain nodes for the ambisonic gain matrix and convolutions for HRTF binaural rendering, so the heavy processing runs on the browser's native audio nodes.

See Omnitone in action:

The implementation of Omnitone is based on the Google spatial media specification. The input audio stream must use the ACN channel layout with SN3D normalization.
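To make the layout requirement concrete, here is a minimal sketch of converting one first-order frame from the older FuMa convention (W, X, Y, Z, with W attenuated by 1/sqrt(2)) to ACN ordering with SN3D normalization (W, Y, Z, X). The helper name fumaToAcnSn3d is hypothetical and not part of Omnitone; it only illustrates what a correctly configured input looks like.

```javascript
// Hypothetical helper, not an Omnitone API.
// Converts one first-order frame from FuMa (W, X, Y, Z) to ACN/SN3D (W, Y, Z, X).
function fumaToAcnSn3d(frame) {
  var w = frame[0];
  var x = frame[1];
  var y = frame[2];
  var z = frame[3];
  // FuMa scales W by 1/sqrt(2); SN3D keeps W at unity gain.
  // At first order, X, Y and Z carry the same gain in both conventions.
  return [w * Math.SQRT2, y, z, x];
}

var acnFrame = fumaToAcnSn3d([Math.SQRT1_2, 0.1, 0.2, 0.3]);
console.log(acnFrame);
```

If the source material is already in ACN/SN3D (often called AmbiX), no conversion of this kind is needed.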

How it works

Omnitone is a high-level library that abstracts the various technical layers of spatial audio processing. The input audio stream can be either a media element (a video or audio tag) or a multichannel Web Audio source. The rotation of the sound field can also be linked to a mobile phone's sensor or to on-screen user interaction.

Omnitone Diagram


Installation

Omnitone is designed for use in web projects, so installing it via NPM is recommended. Alternatively, you can clone or download this repository and include the library script file directly.

npm install omnitone


Usage

The first step is to include the library file in an HTML document.

<script src="omnitone.min.js"></script>

The decoder requires an <audio> or <video> element and an AudioContext. The following example shows how to set up the context and the element for Omnitone.

// Prepare an audio element to supply the ambisonic source audio.
var audioElement = document.createElement('audio');
audioElement.src = 'resources/4ch-spatial-audio-file.wav';

// Create an AudioContext and an Omnitone decoder.
var audioContext = new AudioContext();
var decoder = Omnitone.createFOADecoder(audioContext, audioElement);

// Initialize and then start playing the audio element.
decoder.initialize().then(function () {
  audioElement.play();
}, function (onInitializationError) {
  console.error(onInitializationError);
});

The decoder constructor accepts the context and the element as arguments. Omnitone uses HRIRs from the Google spatial media repository, but you can use a custom set of HRIR files as well. Initializing a decoder instance returns a promise that resolves when the resources (i.e. the impulse responses) are fully loaded.

The rotation matrix (3x3, row-major) in the decoder can be updated inside the graphics render loop. This operation rotates the entire sound field. The rotation matrix is commonly derived from the quaternion of the orientation sensor on a VR headset or smartphone. Omnitone also converts the coordinate system from WebGL space to audio space internally, so you do not need to transform the matrix manually.

// Rotate the sound field.
decoder.setRotationMatrix(rotationMatrix);
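Since the rotation matrix is commonly derived from an orientation quaternion, the following sketch shows one standard way to build a 3x3 row-major matrix from a unit quaternion. The helper quaternionToMatrix3 is hypothetical and not part of Omnitone, and how the quaternion is obtained from the sensor is left out.

```javascript
// Hypothetical helper, not an Omnitone API.
// Builds a 3x3 row-major rotation matrix from a unit quaternion (x, y, z, w).
function quaternionToMatrix3(x, y, z, w) {
  return [
    1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w),
    2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w),
    2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)
  ];
}

// Example: a 90-degree turn about the Y axis.
var halfAngle = Math.PI / 4;
var rotationMatrix = quaternionToMatrix3(0, Math.sin(halfAngle), 0, Math.cos(halfAngle));
console.log(rotationMatrix);
// In a browser, a matrix like this would be handed to the rotation setter
// on each pass of the render loop.
```

The quaternion must be normalized before conversion; otherwise the result is not a pure rotation.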

If you prefer to work with 4x4 rotation matrix (e.g. Three.js camera), you can use the following method instead.

// Rotate the sound field based on a Three.js camera object.
decoder.setRotationMatrixFromCamera(camera.matrix);

Use the setMode method to change the decoder's mode. This is useful when the media source is not spatially encoded (e.g. stereo or mono) or when you want to reduce CPU usage or power consumption by disabling the decoder.

// Mono or regular multi-channel layouts.
decoder.setMode('bypass');

// Ambisonically decoded audio stream.
decoder.setMode('ambisonic');

// Disable decoding completely (audio processing disabled).
decoder.setMode('off');
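As a small illustration of when each mode applies, the sketch below picks a mode from the channel count of the source. It assumes the mode names 'bypass', 'ambisonic' and 'off'; chooseDecoderMode is a hypothetical helper, and treating every 4-channel source as ambisonic is a simplification.

```javascript
// Hypothetical helper, not an Omnitone API. The mode names are assumed.
// Chooses a decoder mode for a source with the given channel count.
function chooseDecoderMode(channelCount, processingEnabled) {
  if (!processingEnabled) {
    // Skip audio processing entirely to save CPU and power.
    return 'off';
  }
  // A first-order ambisonic stream carries exactly 4 channels (W, Y, Z, X).
  // Anything else (mono, stereo, ...) is passed through undecoded.
  return channelCount === 4 ? 'ambisonic' : 'bypass';
}

console.log(chooseDecoderMode(4, true));
console.log(chooseDecoderMode(2, true));
console.log(chooseDecoderMode(4, false));
```

The returned string would then be passed to setMode on the decoder.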

Advanced Usage

Omnitone also provides building blocks for first-order ambisonic decoding and binaural rendering. The FOADecoder is just a ready-made object built with those components. You can create them and connect them together to build your own decoding mechanism.


FOADecoder

FOADecoder is a ready-made FOA decoder and binaural renderer. If you need spatialization without any custom configuration, this is the simplest way to use Omnitone.

var decoder = Omnitone.createFOADecoder(context, element, {
  postGainDB: 0,
  channelMap: [0, 1, 2, 3]
});
  • context (AudioContext): an AudioContext object.
  • element (MediaElement): a target video or audio element for streaming.
  • options (Object): options for the decoder.
    • HRTFSetUrl (String): base URL for the cube HRTF sets.
    • postGainDB (Number): post-decoding gain compensation in dB.
    • channelMap (Array): a custom channel map.
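The postGainDB option is expressed in decibels, while Web Audio gain values are linear amplitude factors. The sketch below shows the conventional conversion between the two. The helper dbToLinear is hypothetical, and that Omnitone applies the option in exactly this way internally is an assumption.

```javascript
// Hypothetical helper, not an Omnitone API.
// Converts a gain in decibels to a linear amplitude factor.
function dbToLinear(decibels) {
  return Math.pow(10, decibels / 20);
}

console.log(dbToLinear(0));   // unity gain
console.log(dbToLinear(-20)); // one tenth of the amplitude
console.log(dbToLinear(6));   // roughly double the amplitude
```

With the default of 0 dB the decoded signal passes through at unity gain.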


FOARouter

FOARouter is useful when you need to change the channel layout of the incoming multichannel audio stream. This is necessary because the channel layout changes depending on the audio codec in the browser.

var router = Omnitone.createFOARouter(context, channelMap);
  • context (AudioContext): an AudioContext object.
  • channelMap (Array): an array that represents the target channel layout.


router.setChannelMap([0, 1, 2, 3]); // 4-ch AAC in Chrome (default).
router.setChannelMap([1, 2, 0, 3]); // 4-ch AAC in Safari.
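To show what a channel map does to the data, here is a sketch that reorders a single frame of samples in plain arrays. It assumes that entry i of the map names the input channel feeding output channel i; the router's actual convention may be the inverse, so treat this as one possible reading. applyChannelMap is a hypothetical helper, not part of Omnitone.

```javascript
// Hypothetical helper, not an Omnitone API.
// Assumption: channelMap[i] is the input channel that feeds output channel i.
function applyChannelMap(frame, channelMap) {
  return channelMap.map(function (inputIndex) {
    return frame[inputIndex];
  });
}

var inputFrame = [10, 20, 30, 40];
console.log(applyChannelMap(inputFrame, [0, 1, 2, 3])); // → [10, 20, 30, 40]
console.log(applyChannelMap(inputFrame, [1, 2, 0, 3])); // → [20, 30, 10, 40]
```

In the library itself the rerouting happens on live audio channels inside the Web Audio graph rather than on arrays.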


FOARotator

FOARotator is a sound field rotator for first-order ambisonic decoding. It also performs the coordinate transformation between the world space and the audio space.

var rotator = Omnitone.createFOARotator(context);
  • context (AudioContext): an AudioContext object.


rotator.setRotationMatrix([1, 0, 0, 0, 1, 0, 0, 0, 1]); // 3x3 row-major matrix.
rotator.setRotationMatrix4([1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1]); // 4x4 row-major matrix.
  • rotationMatrix (Array): 3x3 row-major matrix.
  • rotationMatrix4 (Array): 4x4 row-major matrix.
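As a rough illustration of how the two matrix shapes relate: for a pure rotation, the upper-left 3x3 block of a 4x4 transform holds the same rotation. The sketch assumes row-major storage as documented above. matrix4ToMatrix3 is a hypothetical helper, not part of Omnitone, and the library may handle the 4x4 input differently internally.

```javascript
// Hypothetical helper, not an Omnitone API.
// Extracts the upper-left 3x3 block from a 4x4 row-major matrix.
function matrix4ToMatrix3(m4) {
  return [
    m4[0], m4[1], m4[2],
    m4[4], m4[5], m4[6],
    m4[8], m4[9], m4[10]
  ];
}

var identity4 = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1];
console.log(matrix4ToMatrix3(identity4)); // → [1, 0, 0, 0, 1, 0, 0, 0, 1]
```

Note that some graphics libraries store matrices in column-major order, in which case the indices above would pick out the transpose.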


FOAPhaseMatchedFilter

FOAPhaseMatchedFilter is a pair of low-pass and high-pass filters (LP/HP) that share a crossover frequency. It compensates the gain of high-frequency content without introducing a phase difference.

var filter = Omnitone.createFOAPhaseMatchedFilter(context);
  • context (AudioContext): an AudioContext object.


FOAVirtualSpeaker

FOAVirtualSpeaker is a virtual speaker abstraction that combines the decoding gain coefficients and HRTF convolution for a first-order ambisonic audio stream. Note that a speaker instance connects directly to the destination node of the AudioContext, so you cannot apply additional audio processing after this component.

var speaker = Omnitone.createFOAVirtualSpeaker(context, options);
  • context (AudioContext): an AudioContext object.
  • options (Object): options for speaker.
    • coefficients: decoding coefficients for (W,X,Y,Z).
    • IR: stereo IR buffer for HRTF convolution.
    • gain: post-gain for the speaker.
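The role of the decoding coefficients can be sketched as a weighted sum: each virtual speaker feed is a linear combination of the four ambisonic channels, scaled by the post-gain, before it is convolved with the stereo IR. The helper decodeToSpeaker and the coefficient values below are illustrative assumptions, not Omnitone's actual decoding matrix.

```javascript
// Hypothetical helper, not an Omnitone API.
// Mixes one 4-channel ambisonic frame into a single virtual speaker sample.
function decodeToSpeaker(frame, coefficients, gain) {
  var sum = 0;
  for (var i = 0; i < 4; i++) {
    // Weight each ambisonic channel by its decoding coefficient.
    sum += coefficients[i] * frame[i];
  }
  // Apply the speaker's post-gain to the mixed sample.
  return gain * sum;
}

// Illustrative values only.
var sample = decodeToSpeaker([1, 2, 3, 4], [0.5, 0.5, 0.5, 0.5], 2);
console.log(sample); // → 10
```

In the library this mixing is performed by gain nodes on the audio stream, and the result is then passed through the HRTF convolution.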


speaker.enable();   // activate the speaker.
speaker.disable();  // deactivate the speaker.

Deactivating a virtual speaker can save CPU power. Running multiple HRTF convolutions is computationally expensive, so disabling a speaker can help when binaural rendering is not necessary.


Building

Omnitone uses WebPack to build the minified library and to manage dependencies.

npm install         # install dependencies.
npm run build       # build a non-minified library.
npm run watch       # recompile whenever any source file changes.
npm run build-all   # build a minified library and copy static resources.


Testing

Currently there is no sane way of testing web applications built with the Web Audio API in an automated setting (e.g. Travis CI). To run the unit tests locally, start a web server and load the index HTML page in the test directory.

Audio Codec Compatibility

Omnitone is designed to run in any browser that supports the Web Audio API. However, it does not address the incompatibilities among the media codecs that browsers support. At the time of writing, the decoding of compressed multichannel audio via <video> or <audio> elements is not fully supported by the majority of mobile browsers.

Related Resources


Special thanks to Julius Kammerl, Dillon Cower, Boris Smus and Brandon Jones for their help on this project. We are also grateful to Tim Fain and Jaunt VR for their permission to use their beautiful VR content in the demo.


If you have found an error in this library, please file an issue at:

Patches are encouraged, and may be submitted by forking this project and submitting a pull request through GitHub. See CONTRIBUTING for more detail.


Copyright 2016 Google Inc. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.