# Recording audio

## Background

For recording audio you need a browser that has getUserMedia implemented.

As you can see at Can I Use, you can use any browser except Internet Explorer and Safari. Chrome on iOS doesn't support getUserMedia either.

You have to allow access to your microphone by clicking 'Allow' in the small message bar that appears after you've clicked the record button. You have to grant this access again in every new browser session.
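
heartbeat requests microphone access for you when you start recording, but if you want to check upfront whether the browser supports getUserMedia at all, a minimal sketch using the plain (vendor-prefixed) browser API could look like this; it is not part of heartbeat's API:

    // not part of heartbeat's API: a plain feature check for getUserMedia,
    // including the vendor prefixes that some browsers still use
    navigator.getUserMedia = navigator.getUserMedia ||
                             navigator.webkitGetUserMedia ||
                             navigator.mozGetUserMedia;

    if(navigator.getUserMedia === undefined){
        console.log('recording audio is not supported in this browser');
    }else{
        navigator.getUserMedia({audio: true},
            function(stream){
                // the user clicked 'Allow', the stream can now be recorded
                console.log('microphone access granted');
            },
            function(error){
                // the user clicked 'Deny', or no input device was found
                console.log('microphone access denied', error);
            }
        );
    }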

This blog post by Remus from NuSoft has helped me a lot. In it, Remus explains step by step how to record audio and encode the recording to mp3 using Recorder.js and an asm.js version of the LAME mp3 encoder made by Andreas Krennmair. In heartbeat, however, I have used Kobu Gurkan's build of it.

As soon as you start recording, the audio stream from your microphone is routed to a webworker by a ScriptProcessorNode. The ScriptProcessorNode processes the audio data in chunks of 8192 samples; you can adjust this buffer size if needed. The webworker collects the incoming audio data in a plain javascript array.

As soon as you stop recording, the webworker converts the samples in this plain javascript array to a Float32Array and returns the array to the main javascript thread. There the array gets converted to an AudioBuffer that can be played back by the AudioContext.
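
The sketch below is not heartbeat's actual source, just a rough outline of that routing; the webworker script name recorder-worker.js is hypothetical:

    // rough outline of the routing described above, not heartbeat's actual code
    var context = new AudioContext();
    var worker = new Worker('recorder-worker.js'); // hypothetical worker script
    var bufferSize = 8192; // in samples, configurable

    navigator.getUserMedia({audio: true}, function(stream){
        // route the microphone stream into a ScriptProcessorNode
        var source = context.createMediaStreamSource(stream);
        var processor = context.createScriptProcessor(bufferSize, 1, 1);

        processor.onaudioprocess = function(event){
            // copy each chunk of samples and hand it over to the webworker,
            // which collects the chunks in a plain javascript array
            var samples = event.inputBuffer.getChannelData(0);
            worker.postMessage(new Float32Array(samples));
        };

        source.connect(processor);
        processor.connect(context.destination);
    }, function(error){
        console.log(error);
    });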

## Getting the recording

The AudioBuffer is stored in a recording object that contains the following keys:

  • id (recording id, necessary to retrieve the recording data)
  • arraybuffer (raw wav data)
  • audiobuffer (raw wav data converted to AudioBuffer instance)
  • wav
    • blob (binary wav file as blob)
    • base64 (base64-encoded wav file)
    • dataUrl (wav file as data URI)
  • waveform
    • images (array containing 1 or more HTML IMG elements)
    • dataUrls (array containing 1 or more data URIs)

After you've called encodeAudioRecording() and passed 'mp3' as the encoding type, the recording object also contains:

  • mp3
    • blob
    • base64
    • dataUrl

The recording object gets stored in heartbeat's internal storage by its id: sequencer.storage.audio.recordings[recordId]

You can retrieve this recording object by the recording id that is returned by both song.startRecording() and song.stopRecording():

    // create a track and provide a name (optional)
    var track = sequencer.createTrack('my audio track');

    // make it possible to record audio to this track
    track.recordEnabled = 'audio';

    // start your recording, store the recording id
    var recordId = song.startRecording();

    // record something and stop recording again
    var recordId2 = song.stopRecording();

    // prints "true"
    console.log(recordId === recordId2);

    // then wait for the recorded_events event to fire
    song.addEventListener('recorded_events', function(){

        // retrieve the recording directly from the storage object
        var recording = sequencer.storage.audio.recordings[recordId];

        // you can also use
        var recording = track.getAudioRecordingData(recordId);

        // or
        var recording = song.getAudioRecordingData(recordId);
    });

There is another, slightly more elaborate way of retrieving the recording object: via the recordings object (plural) that gets passed as an argument when the recorded_events event fires:

    song.addEventListener('recorded_events', function(recordings){
        // the recordings object is a plain javascript object that contains all recordings organized per track,
        // the track name is used as key, the value is an array containing all recorded events

        // get the audio event: when recording audio, only one audio event per track is created
        var audioEvent = recordings[track.name][0];

        // the sampleId of the event is the same as the id of the recording, so now we can retrieve the recording object
        var recording = track.getAudioRecordingData(audioEvent.sampleId);

        // or you can retrieve it from the storage object directly as well:
        var recording = sequencer.storage.audio.recordings[audioEvent.sampleId];

    });
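
Once you have the recording object, you can use its keys directly. For example, the following sketch uses only standard browser APIs to play back the wav file, offer it as a download and show the first waveform image:

    var recording = sequencer.storage.audio.recordings[recordId];

    // play back the recorded wav file in a plain audio element
    var audio = new Audio(recording.wav.dataUrl);
    audio.play();

    // or offer the wav file as a download
    var link = document.createElement('a');
    link.href = URL.createObjectURL(recording.wav.blob);
    link.download = 'recording-' + recording.id + '.wav';
    link.textContent = 'download recording';
    document.body.appendChild(link);

    // show the first generated waveform image
    document.body.appendChild(recording.waveform.images[0]);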

## Latency

If you are recording audio you have to deal with latency. Latency can be caused by many factors; in our case the most likely ones are:

  • Buffering audio data in order to ensure a gap-less audio stream from your microphone to your computer.
  • Converting analog audio from your microphone to digital audio; this happens in the A/D converters of your sound card.
  • Routing digital audio from your sound card back to the native browser process; the speed of this depends on your drivers, for instance CoreAudio or ASIO. If you are on Linux it also depends on whether or not you're using the RT kernel.
  • Routing the signal onwards to the javascript main thread.
  • Then, as described above, the audio data is sent to a webworker, which converts it to a Float32Array that can be used to create a wav file.
  • The array is sent back to the javascript main thread, which then converts it to an AudioBuffer.

Professional DAWs like Cubase, Ardour or Logic usually only have to deal with the latency that is caused by buffering. That latency can be calculated very easily, and the DAW will compensate for it automatically:

    var sampleRate = 44100; // value in Hz, other common values are 48000, 96000 and even 192000
    var bufferSize = 8192; // the number of samples that are buffered
    var millisPerSample = (1/sampleRate) * 1000; // the duration of one single sample in milliseconds
    var latency = bufferSize * millisPerSample;

    console.log(latency + ' milliseconds'); // prints approximately 185.76 milliseconds

Note that a buffer of 8192 samples is very large; in your DAW you'd probably use a value between 64 and 256 samples, which results in a latency of roughly 1.5 and 5.8 milliseconds respectively.
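
Plugging those buffer sizes into the formula above confirms these numbers:

    var millisPerSample = (1/44100) * 1000;

    console.log(64 * millisPerSample); // prints approximately 1.45 milliseconds
    console.log(256 * millisPerSample); // prints approximately 5.80 milliseconds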

In heartbeat I use a buffer of 8192 samples (configurable) and I have tried to compensate for the latency automatically, but unfortunately that didn't work. The latency turned out to vary between subsequent recordings; in most cases the second and following recordings all have the same, somewhat smaller latency than the first recording. My first thought was that it had to do with the time it takes to click the "allow" button to grant access to your microphone, but that had nothing to do with it.

So I added a function that lets you adjust the latency per recording and a function that lets you adjust the latency for all recordings. The first function takes three arguments:

    song.setAudioRecordingLatency(recordId, value, function(data){
        // data is the object that contains all updated recording data, see above
    });

And the second takes only one argument:

    song.adjustLatencyForAllRecordings(value);

The value is the amount of latency compensation: it has to be a positive integer and it is the number of milliseconds that gets removed from the start of the recording. This is a non-destructive action, so the removed audio data is not lost and you can re-adjust the latency to a lower value at any time.
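
As a sketch, you could take the theoretical buffer latency from the calculation above as a starting value and then fine-tune it by ear, as described in the next section:

    // use the theoretical latency of the 8192-sample buffer as a starting point
    var value = Math.round((8192/44100) * 1000); // 186 milliseconds

    // adjust a single recording, the callback receives the updated recording data
    song.setAudioRecordingLatency(recordId, value, function(data){
        console.log(data);
    });

    // or apply the same compensation to all recordings in the song
    song.adjustLatencyForAllRecordings(value);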

## Test and adjust the latency on your system

Usually when you record audio via a microphone you put on (closed) headphones to avoid audio feedback, but for this test I would like you to deliberately record the output of your computer. You can use this example. Make sure that the metronome is on, click on "start record" and record a few bars of only the tick of the metronome. Now play back the recording you just made. You'll notice that the recorded metronome tick is not in sync with the live metronome tick: this is your latency. You can align the recording with the metronome by using the latency slider. You can use the waveform image for some visual guidance: if the first peak is aligned with the left side of the waveform image, your recording should be in sync with the metronome.

Note: soon an audio editor will be added to heartbeat; in this editor you will be able to drag the waveform back and forth to adjust both the latency and the snap point (used for quantizing audio events).

## Creating waveforms

As mentioned above, the recording object that is created after you've finished an audio recording also contains an array of one or more data URIs of waveform images. The waveform images are generated with default settings, but you can also control how the waveform images are created by passing a configuration object to the track:

    track.setWaveformConfig({
        height: 200,
        width: 800,
        //density: 0.001,
        sampleStep: 1,
        color: '#71DE71',
        bgcolor: '#000'
    });

The waveform image is created by plotting the PCM sample values on a canvas element and drawing lines between the dots. The density parameter determines the distance in pixels between the dots. This means that the width of the waveform depends on the length of the wav file.

However, if you set a value for width, the value of density gets calculated by dividing the specified width by the number of samples; this way we get a fixed width that is independent of the length of the wav file. This means that the parameters width and density are mutually exclusive, whereby the density parameter will be overruled if both are set.

The parameter height sets the height of the generated image and behaves the same in both cases described above.

Note that because the waveform is plotted on a canvas element, the values you set for width and height must be less than or equal to the maximum width and height of a canvas element, which is 32767 pixels; see this post at Stack Overflow.

The parameter sampleStep is the step size between two plotted samples. If you set it to 1, all samples are plotted; if you set it to 10, for instance, only the samples 1, 11, 21, 31, and so on get plotted. Especially with long wav files it is recommended to set sampleStep to a higher value to save processing time.

The parameters color and bgcolor set the color of the waveform and the color of the background respectively.
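
To make these parameters a bit more concrete, here is a simplified sketch of how such a waveform could be plotted on a canvas; this is not heartbeat's actual drawing code:

    // simplified sketch, not heartbeat's actual drawing code
    function drawWaveform(samples, config){
        var canvas = document.createElement('canvas');
        var ctx = canvas.getContext('2d');
        var sampleStep = config.sampleStep || 1;
        // width overrules density: density = width / number of samples
        var density = config.width ? config.width/samples.length : config.density;

        canvas.width = Math.ceil(samples.length * density);
        canvas.height = config.height;

        // fill the background
        ctx.fillStyle = config.bgcolor;
        ctx.fillRect(0, 0, canvas.width, canvas.height);

        // plot the samples (range -1 to 1) and connect them with lines
        ctx.strokeStyle = config.color;
        ctx.beginPath();
        for(var i = 0; i < samples.length; i += sampleStep){
            var x = i * density;
            var y = (1 - samples[i]) * 0.5 * canvas.height;
            if(i === 0){
                ctx.moveTo(x, y);
            }else{
                ctx.lineTo(x, y);
            }
        }
        ctx.stroke();

        // comparable to the data URIs in the waveform key of the recording object
        return canvas.toDataURL();
    }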

Because the images are generated in the main javascript thread, everything else has to wait for the process to complete. Fortunately, in the future it will be possible to transfer a canvas object to a webworker so that waveform images can be created in a background process; see the W3C specification. As of today, no browser has implemented it yet; see the bottom of this MDN page.