
packaging lamejs as an AudioWorklet #48

Closed
mreinstein opened this issue Dec 14, 2017 · 12 comments


mreinstein commented Dec 14, 2017

Chrome is about to land AudioWorklet and deprecate ScriptProcessorNode.

It would be awesome if lamejs could be used as a normal WebAudio node.

https://www.chromestatus.com/features/4588498229133312

zhuker (Owner) commented Dec 14, 2017

lamejs is just a bit-manipulation library: it transforms WAV audio bits into MP3 bits.
Integrations into other APIs are welcome as pull requests.

mreinstein (Author) commented Dec 14, 2017

lamejs is just a bit manipulation library it transforms wave audio bits into mp3 bits

@zhuker yeah, I get that. Maybe I misunderstand, but it seems that lamejs essentially takes Int16Array data as input and produces Uint8Array-encoded data as output. Is that right? I think the AudioWorklet API takes Float32Array data as both input and output.

integration into apis are welcome as pull requests

I started working on this in a branch, but ran into the aforementioned issue. Would be happy to send a PR if/when it works!
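A minimal sketch of the conversion such a wrapper would need (the helper name is hypothetical; lamejs's encodeBuffer() takes Int16Array samples):

```javascript
// Hypothetical helper: convert the Float32Array samples WebAudio produces
// into the Int16Array samples lamejs's encodeBuffer() expects.
function floatTo16BitPCM(float32) {
  const int16 = new Int16Array(float32.length);
  for (let i = 0; i < float32.length; i++) {
    // Clamp to [-1, 1], then scale asymmetrically: int16 ranges
    // from -32768 (0x8000) to 32767 (0x7fff).
    const s = Math.max(-1, Math.min(1, float32[i]));
    int16[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return int16;
}
```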

zhuker (Owner) commented Dec 14, 2017 via email

mreinstein (Author) commented Dec 14, 2017

lamejs should be the terminating node in an audio pipeline. Shouldn't it?

The audio pipeline in my use case:

                          ┌---------------------┐
                      ┌-->|watson speech-to-text|
┌---┐    ┌------┐     |   └---------------------┘
|mic├--->|lamejs├-----┤
└---┘    └------┘     |   ┌---------------------------┐
                      └-->|indexeddb (browser storage)|
                          └---------------------------┘

MP3 is a stream of bytes, hence uint8; outputting it as float32 makes no sense

Yeah, this is what I'm struggling with. Unless I'm mistaken, per https://webaudio.github.io/web-audio-api/#defining-a-valid-audioworkletprocessor, AudioWorklet outputs are Float32Arrays. I'm trying to figure out how to package this in a sensible way as an AudioWorklet so that lamejs can be used as a normal WebAudio node.
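One way to square that circle, sketched below under assumptions (the class and processor names are made up, and a real worklet would construct its encoder in the module scope rather than take it as a constructor argument; only Mp3Encoder's encodeBuffer() call is actual lamejs API), is to leave the worklet's audio outputs untouched and ship the MP3 bytes out through the node's MessagePort:

```javascript
// Sketch: an AudioWorklet-style processor that encodes each 128-frame render
// quantum and posts the MP3 bytes over the MessagePort, since worklet outputs
// themselves must remain Float32Array audio.
// The fallback base class only lets the sketch run outside a real
// AudioWorkletGlobalScope (e.g. for testing in Node).
const BaseProcessor = globalThis.AudioWorkletProcessor ??
  class { constructor() { this.port = { postMessage() {} }; } };

class Mp3EncoderProcessor extends BaseProcessor {
  constructor(encoder) {
    super();
    // In a real worklet the encoder (e.g. new lamejs.Mp3Encoder(2, 44100, 128))
    // would be created in the worklet module's scope, not injected like this.
    this.encoder = encoder;
  }
  static floatTo16(f32) {
    const i16 = new Int16Array(f32.length);
    for (let i = 0; i < f32.length; i++) {
      const s = Math.max(-1, Math.min(1, f32[i]));
      i16[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
    }
    return i16;
  }
  process(inputs /* , outputs, parameters */) {
    const [channels] = inputs; // first input: one Float32Array(128) per channel
    if (channels && channels.length > 0) {
      const left = Mp3EncoderProcessor.floatTo16(channels[0]);
      const right = Mp3EncoderProcessor.floatTo16(channels[1] ?? channels[0]);
      const mp3buf = this.encoder.encodeBuffer(left, right);
      if (mp3buf.length > 0) this.port.postMessage(mp3buf);
    }
    return true; // keep the processor alive
  }
}
// In a real worklet module: registerProcessor('mp3-encoder', Mp3EncoderProcessor);
```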

@guest271314

Did you solve this?

@mreinstein (Author)

That was 5 years ago; I haven't been working on audio processing lately.

Audio Worklet support has gotten pretty decent now though. It should be pretty feasible in theory.

@guest271314

I achieved the requirement using the details in e18447f.

Is the issue resolved?

@mreinstein (Author)

I guess I can take a look and see if I can make that work via an AudioWorklet.

@guest271314

This is what I am doing with raw PCM input that I simultaneously stream with MediaStreamTrackGenerator and record with lamejs, which I modified to be a module export. In pertinent part:

async importEncoder() {
    if (this.mimeType.includes('mp3')) {
      const lamejs = (await import('./lame.min.js')).default;
      this.mp3encoder = new lamejs.Mp3Encoder(2, 44100, 128);
      this.mp3Data = [];
    } else if (this.mimeType.includes('opus')) {
      const { Decoder, Encoder, tools, Reader, injectMetadata } = (await import('./ts-ebml.min.js'));
      Object.assign(this, { Decoder, Encoder, tools, Reader, injectMetadata });
    }
 }
const int8 = new Int8Array(441 * 4);
const { value, done } = await this.inputReader.read();
// value: raw PCM from parec -d @DEFAULT_MONITOR@
if (!done) int8.set(new Int8Array(value));
const int16 = new Int16Array(int8.buffer);
// https://stackoverflow.com/a/35248852
const channels = [new Float32Array(441), new Float32Array(441)];
for (let i = 0, j = 0, n = 1; i < int16.length; i++) {
  const int = int16[i];
  // Int16Array values are already signed, so scale negatives by 0x8000
  // and positives by 0x7fff to map into [-1, 1].
  const float = int < 0 ? int / 0x8000 : int / 0x7fff;
  // deinterleave
  channels[(n = ++n % 2)][!n ? j++ : j - 1] = float;
}
// var floatSamples = new Float32Array(44100); // Float sample from an external source
const left = channels.shift();
const right = channels.shift();
let leftChannel, rightChannel;
if (this.mimeType.includes('mp3')) {
  const sampleBlockSize = 441;
  leftChannel = new Int32Array(left.length);
  rightChannel = new Int32Array(right.length);
  for (let i = 0; i < left.length; i++) {
    leftChannel[i] = left[i] < 0 ? left[i] * 32768 : left[i] * 32767;
    rightChannel[i] = right[i] < 0 ? right[i] * 32768 : right[i] * 32767;
  }
}
const data = new Float32Array(882);
data.set(left, 0);
data.set(right, 441);
const frame = new AudioData({
  timestamp,
  data,
  sampleRate: 44100,
  format: 'f32-planar',
  numberOfChannels: 2,
  numberOfFrames: 441,
});
this.duration += frame.duration;
await this.audioWriter.write(frame);
if (this.mimeType.includes('mp3')) {
  const mp3buf = this.mp3encoder.encodeBuffer(leftChannel, rightChannel);
  if (mp3buf.length > 0) {
    this.mp3Data.push(mp3buf);
  }
}
if (this.mimeType.includes('mp3')) {
  const mp3buf = this.mp3encoder.flush(); // finish writing mp3
  if (mp3buf.length > 0) {
    this.mp3Data.push(new Int8Array(mp3buf));
  }
  this.resolve(new Blob(this.mp3Data, { type: 'audio/mp3' }));
}

In an AudioWorklet we can use a top-level import and set sampleBlockSize to 128 (the render quantum).
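To make the 128-frame point concrete, a sketch of the per-quantum encode step (the function name is made up, and the stub encoder in the test only stands in for lamejs.Mp3Encoder; inside a worklet, process() hands you Float32Array(128) channels directly):

```javascript
// Inside an AudioWorklet each process() call delivers one render quantum of
// 128 frames per channel, so sampleBlockSize effectively becomes 128.
const RENDER_QUANTUM = 128;

function encodeQuantum(encoder, left, right, mp3Data) {
  const l = new Int16Array(RENDER_QUANTUM);
  const r = new Int16Array(RENDER_QUANTUM);
  for (let i = 0; i < RENDER_QUANTUM; i++) {
    // Standard float -> int16 scaling: negatives by 0x8000, positives by 0x7fff.
    l[i] = left[i] < 0 ? left[i] * 0x8000 : left[i] * 0x7fff;
    r[i] = right[i] < 0 ? right[i] * 0x8000 : right[i] * 0x7fff;
  }
  const mp3buf = encoder.encodeBuffer(l, r);
  if (mp3buf.length > 0) mp3Data.push(mp3buf);
}
```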

@guest271314

You should be able to incorporate the changes from https://github.com/guest271314/AudioWorkletStream. FWIW, for speech synthesis processing you can also utilize https://github.com/guest271314/native-messaging-espeak-ng. I am currently updating https://github.com/guest271314/captureSystemAudio for MP3 support. Next I will substitute https://github.com/davedoesdev/webm-muxer.js for MediaRecorder.

@mreinstein (Author)

Having played around a little with AudioWorklets just now, I can say with more confidence that my original ask just doesn't really make sense.

WebAudio Nodes are intended to operate on Float32Arrays, both as input and output. If one were to package lamejs as an audio worklet, it would have to follow this format.

None of the existing WebAudio graph nodes can accept lamejs-encoded MP3, so it only really makes sense as a terminating node. My original graph diagram visualizes this: I was piping from lamejs to watson-speech-to-text and a local storage sink. Neither of these destinations benefits from being represented as a WebAudio node.

@guest271314

Having played around a little with AudioWorklets just now, I can say with more confidence that my original ask just doesn't really make sense.

Yes, it does make sense.

WebAudio Nodes are intended to operate on Float32Arrays, both as input and output. If one were to package lamejs as an audio worklet, it would have to follow this format.

Not necessarily; you can parse and convert the data to the expected TypedArray.

None of the existing webaudio graph nodes can accept lamejs encoded mp3.

Technically it can, via an HTML <audio> element with MediaElementAudioSourceNode, or captureStream() connected to MediaStreamAudioDestinationNode, or MediaStreamAudioSourceNode connected to AudioWorkletNode.

My original graph diagram visualizes this: I was piping from lamejs to watson-speech-to-text and a local storage sink. Neither of these destinations benefit from being represented as a webaudio node.

The benefit is flexibility and fidelity, particularly for speech-to-text. Though Mozilla Voice does use MP3.

You can certainly pipe a MediaStreamTrack through AudioWorkletNode to encode the stream in "real-time" and send to other destinations and save simultaneously.

The requirement is possible.
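The simultaneous stream-and-record wiring described above might look like this on the main thread (a sketch only: the module file name 'mp3-worklet.js' and processor name 'mp3-encoder' are assumptions, and the function is defined here without being run):

```javascript
// Sketch: route a MediaStreamTrack through an AudioWorkletNode so the audio
// keeps flowing through the graph while the worklet posts MP3 chunks back
// over its MessagePort.
async function recordTrackAsMp3(track) {
  const ctx = new AudioContext();
  await ctx.audioWorklet.addModule('mp3-worklet.js'); // assumed worklet module
  const source = ctx.createMediaStreamSource(new MediaStream([track]));
  const node = new AudioWorkletNode(ctx, 'mp3-encoder'); // assumed processor name
  const chunks = [];
  node.port.onmessage = ({ data }) => chunks.push(data);
  source.connect(node).connect(ctx.destination); // audio stays audible downstream
  // Later, after stopping: new Blob(chunks, { type: 'audio/mp3' })
  return { ctx, node, chunks };
}
```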
