
packaging lamejs as an AudioWorklet #48

Closed
mreinstein opened this issue Dec 14, 2017 · 12 comments


mreinstein commented Dec 14, 2017

Chrome is about to land AudioWorklet and deprecate ScriptProcessorNode.

It would be awesome if lamejs could be used as a normal WebAudio node.

https://www.chromestatus.com/features/4588498229133312

zhuker (Owner) commented Dec 14, 2017

lamejs is just a bit-manipulation library: it transforms WAV audio bits into MP3 bits.
Integrations into other APIs are welcome as pull requests.

mreinstein (Author) commented Dec 14, 2017

lamejs is just a bit manipulation library it transforms wave audio bits into mp3 bits

@zhuker yeah, I get that. Maybe I misunderstand, but it seems that lamejs essentially takes Int16Array data as input and produces Uint8Array-encoded data as output. Is that right? I think the AudioWorklet API takes Float32Array data as both input and output.

integration into apis are welcome as pull requests

I started working on this in a branch, but ran into the aforementioned issue. Would be happy to send a PR if/when it works!
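A minimal sketch of the conversion such a wrapper would need (the helper name is hypothetical; lamejs's encodeBuffer() takes Int16Array samples):

```javascript
// Hypothetical helper: convert the Float32Array samples WebAudio produces
// into the Int16Array samples lamejs's encodeBuffer() expects.
function floatTo16BitPCM(float32) {
  const int16 = new Int16Array(float32.length);
  for (let i = 0; i < float32.length; i++) {
    // Clamp to [-1, 1], then scale asymmetrically: int16 ranges
    // from -32768 (0x8000) to 32767 (0x7fff).
    const s = Math.max(-1, Math.min(1, float32[i]));
    int16[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return int16;
}
```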

zhuker (Owner) commented Dec 14, 2017 via email

mreinstein (Author) commented Dec 14, 2017

lamejs should be the terminating node in an audio pipeline. Shouldn't it?

The audio pipeline in my use case:

                          ┌---------------------┐
                      ┌-->|watson speech-to-text|
┌---┐    ┌------┐     |   └---------------------┘
|mic├--->|lamejs├-----┤
└---┘    └------┘     |   ┌---------------------------┐
                      └-->|indexeddb (browser storage)|
                          └---------------------------┘

MP3 is a stream of bytes, hence uint8; outputting it as float32 makes no sense

Yeah, this is what I'm struggling with. Unless I'm mistaken, per https://webaudio.github.io/web-audio-api/#defining-a-valid-audioworkletprocessor, AudioWorklet outputs are Float32Arrays. I'm trying to figure out how to package this in a sensible way as an AudioWorklet so that lamejs can be used as a normal WebAudio node.
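One way to square that circle, sketched below under assumptions (the class and processor names are made up, and a real worklet would construct its encoder in the module scope rather than take it as a constructor argument; only Mp3Encoder's encodeBuffer() call is actual lamejs API), is to leave the worklet's audio outputs untouched and ship the MP3 bytes out through the node's MessagePort:

```javascript
// Sketch: an AudioWorklet-style processor that encodes each 128-frame render
// quantum and posts the MP3 bytes over the MessagePort, since worklet outputs
// themselves must remain Float32Array audio.
// The fallback base class only lets the sketch run outside a real
// AudioWorkletGlobalScope (e.g. for testing in Node).
const BaseProcessor = globalThis.AudioWorkletProcessor ??
  class { constructor() { this.port = { postMessage() {} }; } };

class Mp3EncoderProcessor extends BaseProcessor {
  constructor(encoder) {
    super();
    // In a real worklet the encoder (e.g. new lamejs.Mp3Encoder(2, 44100, 128))
    // would be created in the worklet module's scope, not injected like this.
    this.encoder = encoder;
  }
  static floatTo16(f32) {
    const i16 = new Int16Array(f32.length);
    for (let i = 0; i < f32.length; i++) {
      const s = Math.max(-1, Math.min(1, f32[i]));
      i16[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
    }
    return i16;
  }
  process(inputs /* , outputs, parameters */) {
    const [channels] = inputs; // first input: one Float32Array(128) per channel
    if (channels && channels.length > 0) {
      const left = Mp3EncoderProcessor.floatTo16(channels[0]);
      const right = Mp3EncoderProcessor.floatTo16(channels[1] ?? channels[0]);
      const mp3buf = this.encoder.encodeBuffer(left, right);
      if (mp3buf.length > 0) this.port.postMessage(mp3buf);
    }
    return true; // keep the processor alive
  }
}
// In a real worklet module: registerProcessor('mp3-encoder', Mp3EncoderProcessor);
```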

@guest271314

Did you solve this?

@mreinstein (Author)

That was 5 years ago; I haven't been working on audio processing lately.

Audio Worklet support has gotten pretty decent now though. It should be pretty feasible in theory.

@guest271314

I achieved the requirement using the details in e18447f.

Is the issue resolved?

@mreinstein (Author)

I guess I can take a look and see if I can make that work via an AudioWorklet.

@guest271314

This is what I am doing with raw PCM input that I simultaneously stream with MediaStreamTrackGenerator and record with lamejs, which I modified to be a module export. In pertinent part:

async importEncoder() {
    if (this.mimeType.includes('mp3')) {
      const lamejs = (await import('./lame.min.js')).default;
      this.mp3encoder = new lamejs.Mp3Encoder(2, 44100, 128);
      this.mp3Data = [];
    } else if (this.mimeType.includes('opus')) {
      const { Decoder, Encoder, tools, Reader, injectMetadata } = (await import('./ts-ebml.min.js'));
      Object.assign(this, { Decoder, Encoder, tools, Reader, injectMetadata });
    }
 }
const int8 = new Int8Array(441 * 4);
const { value, done } = await this.inputReader.read();
// value: raw PCM from parec -d @DEFAULT_MONITOR@
if (!done) int8.set(new Int8Array(value));
const int16 = new Int16Array(int8.buffer);
// https://stackoverflow.com/a/35248852
const channels = [new Float32Array(441), new Float32Array(441)];
for (let i = 0, j = 0, n = 1; i < int16.length; i++) {
  const int = int16[i];
  // Int16Array values are already signed, so scale negatives by 0x8000
  // and positives by 0x7fff to map into [-1, 1].
  const float = int < 0 ? int / 0x8000 : int / 0x7fff;
  // deinterleave
  channels[(n = ++n % 2)][!n ? j++ : j - 1] = float;
}
// var floatSamples = new Float32Array(44100); // Float sample from an external source
const left = channels.shift();
const right = channels.shift();
let leftChannel, rightChannel;
if (this.mimeType.includes('mp3')) {
  const sampleBlockSize = 441;
  leftChannel = new Int32Array(left.length);
  rightChannel = new Int32Array(right.length);
  for (let i = 0; i < left.length; i++) {
    leftChannel[i] = left[i] < 0 ? left[i] * 32768 : left[i] * 32767;
    rightChannel[i] = right[i] < 0 ? right[i] * 32768 : right[i] * 32767;
  }
}
const data = new Float32Array(882);
data.set(left, 0);
data.set(right, 441);
const frame = new AudioData({
  timestamp,
  data,
  sampleRate: 44100,
  format: 'f32-planar',
  numberOfChannels: 2,
  numberOfFrames: 441,
});
this.duration += frame.duration;
await this.audioWriter.write(frame);
if (this.mimeType.includes('mp3')) {
  const mp3buf = this.mp3encoder.encodeBuffer(leftChannel, rightChannel);
  if (mp3buf.length > 0) {
    this.mp3Data.push(mp3buf);
  }
}
if (this.mimeType.includes('mp3')) {
  const mp3buf = this.mp3encoder.flush(); // finish writing mp3
  if (mp3buf.length > 0) {
    this.mp3Data.push(new Int8Array(mp3buf));
  }
  this.resolve(new Blob(this.mp3Data, { type: 'audio/mp3' }));
}

In an AudioWorklet we can use a top-level import and set sampleBlockSize to 128 (the render quantum).
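To make the 128-frame point concrete, a sketch of the per-quantum encode step (the function name is made up, and the stub encoder in the test only stands in for lamejs.Mp3Encoder; inside a worklet, process() hands you Float32Array(128) channels directly):

```javascript
// Inside an AudioWorklet each process() call delivers one render quantum of
// 128 frames per channel, so sampleBlockSize effectively becomes 128.
const RENDER_QUANTUM = 128;

function encodeQuantum(encoder, left, right, mp3Data) {
  const l = new Int16Array(RENDER_QUANTUM);
  const r = new Int16Array(RENDER_QUANTUM);
  for (let i = 0; i < RENDER_QUANTUM; i++) {
    // Standard float -> int16 scaling: negatives by 0x8000, positives by 0x7fff.
    l[i] = left[i] < 0 ? left[i] * 0x8000 : left[i] * 0x7fff;
    r[i] = right[i] < 0 ? right[i] * 0x8000 : right[i] * 0x7fff;
  }
  const mp3buf = encoder.encodeBuffer(l, r);
  if (mp3buf.length > 0) mp3Data.push(mp3buf);
}
```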

@guest271314

You should be able to incorporate the changes from https://github.com/guest271314/AudioWorkletStream. FWIW, for speech synthesis processing you can also utilize https://github.com/guest271314/native-messaging-espeak-ng. I am currently updating https://github.com/guest271314/captureSystemAudio for MP3 support. Next I will substitute https://github.com/davedoesdev/webm-muxer.js for MediaRecorder.

@mreinstein (Author)

Having played around a little with AudioWorklets just now, I can say with more confidence that my original ask just doesn't really make sense.

WebAudio Nodes are intended to operate on Float32Arrays, both as input and output. If one were to package lamejs as an audio worklet, it would have to follow this format.

None of the existing WebAudio graph nodes can accept lamejs-encoded MP3, so it only really makes sense as a terminating node. My original graph diagram visualizes this: I was piping from lamejs to watson-speech-to-text and a local storage sink. Neither of these destinations benefits from being represented as a WebAudio node.

@guest271314

Having played around a little with AudioWorklets just now, I can say with more confidence that my original ask just doesn't really make sense.

Yes, it does make sense.

WebAudio Nodes are intended to operate on Float32Arrays, both as input and output. If one were to package lamejs as an audio worklet, it would have to follow this format.

Not necessarily; you can parse and convert the data to the expected TypedArray.

None of the existing webaudio graph nodes can accept lamejs encoded mp3.

Technically it can, via an HTML <audio> element with MediaElementAudioSourceNode, or captureStream() connected to MediaStreamAudioDestinationNode, or MediaStreamAudioSourceNode connected to AudioWorkletNode.

My original graph diagram visualizes this: I was piping from lamejs to watson-speech-to-text and a local storage sink. Neither of these destinations benefit from being represented as a webaudio node.

The benefit is flexibility and fidelity, particularly for speech-to-text. Though Mozilla Voice does use MP3.

You can certainly pipe a MediaStreamTrack through AudioWorkletNode to encode the stream in "real-time" and send to other destinations and save simultaneously.

The requirement is possible.
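The simultaneous stream-and-record wiring described above might look like this on the main thread (a sketch only: the module file name 'mp3-worklet.js' and processor name 'mp3-encoder' are assumptions, and the function is defined here without being run):

```javascript
// Sketch: route a MediaStreamTrack through an AudioWorkletNode so the audio
// keeps flowing through the graph while the worklet posts MP3 chunks back
// over its MessagePort.
async function recordTrackAsMp3(track) {
  const ctx = new AudioContext();
  await ctx.audioWorklet.addModule('mp3-worklet.js'); // assumed worklet module
  const source = ctx.createMediaStreamSource(new MediaStream([track]));
  const node = new AudioWorkletNode(ctx, 'mp3-encoder'); // assumed processor name
  const chunks = [];
  node.port.onmessage = ({ data }) => chunks.push(data);
  source.connect(node).connect(ctx.destination); // audio stays audible downstream
  // Later, after stopping: new Blob(chunks, { type: 'audio/mp3' })
  return { ctx, node, chunks };
}
```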
