live streaming wav is inconsistent / stops #1

Open
benoitmercusot opened this issue Jan 3, 2021 · 27 comments

Comments

@benoitmercusot

Hi, first of all I would like to thank you for this POC / script. It is exactly what I was looking for. Amazing work.
But I ran into an issue when I started to stream a "live stream": a raw audio stream from a Node server (using node-portaudio). It works well for a few seconds (maybe a minute), then the sound goes mute. Do you think it's buffer related, or a network issue? For information, I'm trying to broadcast audio without latency from node-portaudio to the browser. I thought your script was a good way to start!
Thanks for your time. Benoit

@guest271314
Owner

What is the input audio stream format and what is logged at the console?

@benoitmercusot
Author

benoitmercusot commented Jan 3, 2021

Hey. I'm using https://www.npmjs.com/package/node-portaudio. It says raw audio, but your WAV codec works just fine. I can't log or find anything that shows me why it drops :/
I was just wondering if it's related to the header of my stream (no length...), so maybe that's the problem.

PS : working on the port-message branch.

@guest271314
Owner

If you have control over the input stream you can remove the WAV header and use s16le to avoid removal of the first 44 bytes. Is the input 1 channel or 2 channels? What is the sampling rate?

@benoitmercusot
Author

Here is my input for now (using my iPhone earphones / mic). Works like a charm but stops.
const ai = new portAudio.AudioInput({
  channelCount: 2,
  sampleFormat: portAudio.SampleFormat16Bit,
  sampleRate: 44100,
  deviceId: -1 // Use -1 or omit the deviceId to select the default device
});

I tried other branches of your repo :) Playback with the Wasm memory version seems to be limited to the duration the allocated memory allows. Not sure I understand this "and use s16le to avoid removal of the first 44 bytes", but very nice of you to answer.

@guest271314
Owner

According to https://www.npmjs.com/package/node-portaudio

// Note that this does not strip the WAV header so a click will be heard at the beginning
const rs = fs.createReadStream('steam_48000.wav');

Can you upload the WAV file here so that we can test using the same code?

@guest271314
Owner

Not sure I understand this "and use s16le to avoid removal of the first 44 bytes", but very nice of you to answer.

See http://www.topherlee.com/software/pcm-tut-wavformat.html, https://github.com/guest271314/AudioWorkletStream/blob/message-port-post-message/audioWorklet.js#L12.
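
To make the header suggestion concrete, here is a minimal sketch of dropping a WAV header and decoding s16le (signed 16-bit little-endian) samples. The helper names `stripWavHeader` and `s16leToFloat32` are hypothetical and not part of AudioWorkletStream, and the sketch assumes a canonical 44-byte header; WAV files with extra chunks have longer headers.

```javascript
// Hypothetical helpers for illustration; not taken from the repository.
const WAV_HEADER_BYTES = 44; // size of a canonical PCM WAV header (assumption)

// Drop the header only when the chunk starts with the ASCII tag "RIFF".
function stripWavHeader(chunk) {
  const isRiff =
    chunk.length >= WAV_HEADER_BYTES &&
    chunk[0] === 0x52 && chunk[1] === 0x49 &&
    chunk[2] === 0x46 && chunk[3] === 0x46;
  return isRiff ? chunk.subarray(WAV_HEADER_BYTES) : chunk;
}

// Convert interleaved s16le bytes into one Float32Array per channel.
function s16leToFloat32(bytes, channelCount) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const frames = Math.floor(bytes.byteLength / (2 * channelCount));
  const channels = Array.from(
    { length: channelCount },
    () => new Float32Array(frames)
  );
  for (let frame = 0; frame < frames; frame++) {
    for (let ch = 0; ch < channelCount; ch++) {
      // `true` selects little-endian byte order.
      const sample = view.getInt16((frame * channelCount + ch) * 2, true);
      channels[ch][frame] = sample / 32768; // scale to [-1, 1)
    }
  }
  return channels;
}
```

If the server sends headerless s16le from the start, only the second helper is needed and no bytes have to be skipped on the first chunk.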

@benoitmercusot
Author

Well, it's actually not a file but a live stream from my mic. If you are interested and have time, I can send you a link by private message?

@benoitmercusot
Author

Not sure I understand this "and use s16le to avoid removal of the first 44 bytes", but very nice of you to answer.

See http://www.topherlee.com/software/pcm-tut-wavformat.html, https://github.com/guest271314/AudioWorkletStream/blob/message-port-post-message/audioWorklet.js#L12.

Interesting. Thanks. I actually removed this part // accumulate 344 * 512 * 1.5 of data (to achieve real time, maybe that's what causing latency)
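
As a rough check on how much latency that accumulation step adds, the arithmetic can be worked through as below. This is a sketch that assumes the input settings quoted earlier in the thread (44100 Hz, 2 channels, 16-bit samples) and the 128-frame render quantum that AudioWorklet uses; the variable names are illustrative.

```javascript
// Assumed stream parameters, taken from the AudioInput settings in this thread.
const sampleRate = 44100;
const channelCount = 2;
const bytesPerSample = 2; // 16-bit

// Bytes consumed per second of playback.
const bytesPerSecond = sampleRate * channelCount * bytesPerSample; // 176400

// One AudioWorklet render quantum is 128 frames.
const bytesPerQuantum = 128 * channelCount * bytesPerSample; // 512

// 344 quanta is roughly one second (44100 / 128 is about 344.5),
// so 344 * 512 * 1.5 is roughly 1.5 seconds of audio.
const prebufferBytes = 344 * 512 * 1.5; // 264192
const prebufferSeconds = prebufferBytes / bytesPerSecond;

console.log(prebufferSeconds.toFixed(2)); // "1.50"
```

Under those assumptions, removing the accumulation step removes about one and a half seconds of start-up delay, at the cost of having no cushion when chunks arrive late.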

@guest271314
Owner

You should be able to record the microphone output per the NPM documentation. To capture microphone directly see also guest271314/SpeechSynthesisRecorder#17 (comment), https://github.com/guest271314/setUserMediaAudioSource, https://github.com/guest271314/captureSystemAudio.

I actually removed this part // accumulate 344 * 512 * 1.5 of data (to achieve real time, maybe that's what causing latency)

This is capable of streaming without waiting for accumulation of data https://github.com/guest271314/webtransport/blob/main/webTransportAudioWorkletWebAssemblyMemoryGrow.js.

@benoitmercusot
Author

benoitmercusot commented Jan 3, 2021

Actually I'm not trying to record but to stream :) (server to browser, i.e. sound card input on the Node server to the browser client). Your last link looks very nice. Do you think I can use it with an endless WAV/raw stream?

@guest271314
Owner

It is difficult to test and verify "endless", see https://bugs.chromium.org/p/chromium/issues/detail?id=1161429. During testing I streamed 8 hours of audio yesterday.

Capturing entire system output or specific application audio output is possible by creating a virtual microphone and setting the source to an application or user-defined stream. At the browser, capturing with navigator.mediaDevices.getUserMedia({audio: true}) avoids the need to read and write bytes individually; after configuration the task can be reduced to HTMLMediaElement.srcObject = mediaStream.

The WebAssembly.Memory.grow() version of the code in this repository has implementation restrictions at Chrome which differ for 32-bit and 64-bit systems, see wasmerio/wasmer-php#121. According to the Chrome issue the limit is 4GB for 64-bit; I have not yet tested the maximum. A ring buffer could be written which overwrites the previously used indexes of the SharedArrayBuffer; that is a TODO.
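
For illustration only, a ring buffer of the kind described could look roughly like the sketch below. The class name `ByteRingBuffer` and its methods are hypothetical and not code from this repository. It is single-threaded; sharing the storage between a worker and an AudioWorkletProcessor through a SharedArrayBuffer would additionally need Atomics for the counters.

```javascript
// Hypothetical single-threaded ring buffer; already-read slots get overwritten.
class ByteRingBuffer {
  constructor(capacity) {
    this.bytes = new Uint8Array(capacity);
    this.capacity = capacity;
    this.written = 0; // total bytes written so far
    this.read = 0;    // total bytes read so far
  }

  // Bytes written but not yet read.
  get available() {
    return this.written - this.read;
  }

  // Copy as many bytes of `chunk` as currently fit; return how many were taken.
  push(chunk) {
    const free = this.capacity - this.available;
    const count = Math.min(free, chunk.length);
    for (let i = 0; i < count; i++) {
      this.bytes[(this.written + i) % this.capacity] = chunk[i];
    }
    this.written += count;
    return count;
  }

  // Fill `target` with up to target.length unread bytes; return how many were copied.
  pull(target) {
    const count = Math.min(this.available, target.length);
    for (let i = 0; i < count; i++) {
      target[i] = this.bytes[(this.read + i) % this.capacity];
    }
    this.read += count;
    return count;
  }
}
```

With this shape the memory footprint stays fixed at `capacity` bytes regardless of stream duration, so the limit on `grow()` no longer bounds how long the stream can play.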

@benoitmercusot
Author

Thank you very much for all the explanations. Yes, navigator.mediaDevices.getUserMedia({audio: true}) does not work for what I want, since I want the server to be the source :/ I'll take a look at all your links. All this is still very complicated and confusing for me! 🤪 You look way ahead of everyone on the internet regarding these specific APIs!! Have a good evening.
Benoit.

@guest271314
Owner

For an "infinite" or "endless" audio stream I would try using WebTransport for the ReadableStream to avoid restrictions on the ServiceWorker approach; in that case the server handles the quic-transport protocol, see https://github.com/guest271314/webtransport/.
opus_stream_sw.zip

@benoitmercusot
Author

Thanks a ton. I'll look into that!
Benoit

@guest271314
Owner

Is this issue resolved?

@benoitmercusot
Author

Hehe. I'm not that fast. I have to understand all the sources you gave me :)

@benoitmercusot
Author

Hi @guest271314, I was reading this whole thread: wasmerio/wasmer-php#121. It looks like exactly what I was trying to achieve (except using Node instead of PHP passthru). Did you make any progress on this, regarding memory grow / duration? Thank you.
Benoit.

@guest271314
Owner

The Native Messaging, PHP passthru() version is essentially the precursor to the QuicTransport and WebTransport versions. Since you read that issue you noted the Chrome bug which limits WebAssembly.Memory.grow() to 4GB on a 64-bit system. When I was testing that code I was on a 32-bit system. I have not yet tried to reach the maximum on 64-bit. It should be possible to use more than one Memory, SharedArrayBuffer or Typed Array instance, and/or overwrite the indexes that have already been parsed, using a "circular buffer" approach. I suggest testing the maximum on your system.
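
To put the 4GB figure into playback time, a back-of-the-envelope estimate is sketched below. It assumes uncompressed PCM at the settings quoted earlier in the thread (44100 Hz, 2 channels, 16-bit), that the whole memory is used for audio, and that "4GB" means 4 GiB; the helper name is hypothetical.

```javascript
// Hypothetical helper: how long a byte budget lasts for uncompressed PCM.
function pcmSecondsForBytes(totalBytes, sampleRate, channelCount, bytesPerSample) {
  return totalBytes / (sampleRate * channelCount * bytesPerSample);
}

const FOUR_GIB = 4 * 1024 ** 3; // assumption: "4GB" read as 4 GiB
const maxSeconds = pcmSecondsForBytes(FOUR_GIB, 44100, 2, 2);

console.log((maxSeconds / 3600).toFixed(2)); // "6.76" hours
```

A stream with a lower sampling rate or a single channel would last proportionally longer in the same memory.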

Creating a virtual microphone device and using MediaStream eliminates the need to do that (count bytes). However, it is edifying to be able to achieve either approach.

@benoitmercusot
Author

Hi @guest271314, I'm now having fun with your MessagePort.postMessage() branch. From what I understand, the playback time should be limited by the Uint8Array size of the AudioWorkletProcessor. I still have inconsistency in the playback (stops occur after a few seconds, sometimes a minute), but I suspect my WAV stream is too inconsistent (too big?). The appendBuffers log shows huge variation in the array length, so there must be an issue here. Still digging!
Benoit.

@benoitmercusot
Author

benoitmercusot commented Jan 4, 2021

I think I found a hack (ugly?). The issues indeed occurred when the index was lower than the offset.

// magic "if" hack 😅
// Only read while unread bytes remain (offset = bytes read, index = bytes written).

if (this.offset < this.index) {

  for (let i = 0; i < 512; i++, this.offset++) {
    if (this.offset === this.uint8.length) {
      console.log(this.uint8);
      break;
    }
    uint8[i] = this.uint8[this.offset];
  }
  const uint16 = new Uint16Array(uint8.buffer);
  CODECS.get(this.codec)(uint16, channels);

}
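
A variation on this guard, sketched under assumptions: instead of skipping the read entirely, the read can be clamped to the bytes actually written and the remainder padded with zeros, which decode to silence in s16le. The function `readQuantum` and the shape of `state` are hypothetical; `state` only mirrors the `uint8`, `offset` and `index` fields used in the snippet above.

```javascript
// Hypothetical helper; not the repository's process() implementation.
function readQuantum(state, out) {
  // Bytes written by the producer but not yet consumed.
  const unread = state.index - state.offset;
  // Never read past what has been written, and never more than `out` holds.
  const count = Math.max(0, Math.min(unread, out.length));
  out.set(state.uint8.subarray(state.offset, state.offset + count));
  // Zero bytes are silent samples, so a short or empty read plays silence.
  out.fill(0, count);
  state.offset += count;
  return count;
}
```

Because `offset` only advances by the number of bytes really copied, the reader cannot overtake the writer, and playback resumes as soon as new data arrives.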

@guest271314
Owner

The offset is the bytes read, the index is the bytes written.

Well, it's actually not a file but a live stream from my mic.

One solution

navigator.mediaDevices.getUserMedia({audio: true})
.then(stream => {
  // do stuff with stream: MediaStream
});

@benoitmercusot
Author

benoitmercusot commented Jan 5, 2021

Hi! My "hack" works like a charm. Before that, playback stopped every time the offset went beyond the index (I suppose it makes sense). Now it never stops.
MediaStream was not a solution because the final use is to stream from the input card of another device. For now everything works as expected! Thanks a lot for all your "WIPs".

@benoitmercusot
Author

You should be able to record the microphone output per the NPM documentation. To capture microphone directly see also guest271314/SpeechSynthesisRecorder#17 (comment), https://github.com/guest271314/setUserMediaAudioSource, https://github.com/guest271314/captureSystemAudio.

I actually removed this part // accumulate 344 * 512 * 1.5 of data (to achieve real time, maybe that's what causing latency)

This is capable of streaming without waiting for accumulation of data https://github.com/guest271314/webtransport/blob/main/webTransportAudioWorkletWebAssemblyMemoryGrow.js.

Hi there!
If you have a minute, could you show me how to implement this (i.e. where does that text / bytes input come from)?

@guest271314
Owner

For the WebTransport version "text" input originates in the browser at function call

webTransportAudioWorkletMemoryGrow('hello world')

Sending to quic-transport URL

    let data = encoder.encode(text);
    await writer.write(data);
    console.log('writer close', await writer.close());

input_data here https://github.com/guest271314/webtransport/blob/main/quic_transport_server_tts.py#L138 is the same text in the process

data = subprocess.run(['./tts.sh', input_data], stdout=subprocess.PIPE)

"$1" is the input text passed in this case to espeak-ng https://github.com/guest271314/webtransport/blob/main/tts.sh#L2

espeak-ng -m --stdout "$1" # TODO pass, set options

the response is payload https://github.com/guest271314/webtransport/blob/main/quic_transport_server_tts.py#L140

self.connection.send_stream_data(response_id, payload, True)

the ReadableStream from the quic-transport server https://github.com/guest271314/webtransport/blob/main/webTransportAudioWorkletWebAssemblyMemoryGrow.js#L183

const { readable } = stream;

that we pipeTo() https://github.com/guest271314/webtransport/blob/main/webTransportAudioWorkletWebAssemblyMemoryGrow.js#L184 a WritableStream(), in this case grow() a WebAssembly.Memory (SharedArrayBuffer) instance if necessary (see https://github.com/guest271314/webtransport/blob/main/webTransportAudioWorkletWebAssemblyMemoryGrow.js#L8 for minimum and maximum values corresponding to audio; feel free to verify the values used, as there was no manual for how to accurately derive those values. I learned Python and the bytes necessary per second by doing).
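
The "grow if necessary" step can be sketched as follows. This is an illustrative reconstruction, not the code at the links above: `appendToMemory` and `createMemoryWritable` are hypothetical names, and the sketch uses a non-shared WebAssembly.Memory for simplicity, whereas the linked code uses a shared one.

```javascript
const WASM_PAGE_BYTES = 65536; // size of one WebAssembly memory page

// Copy `chunk` into `memory` at `offset`, growing the memory first when the
// chunk does not fit. Returns the offset for the next write.
function appendToMemory(memory, offset, chunk) {
  const needed = offset + chunk.length;
  const current = memory.buffer.byteLength;
  if (needed > current) {
    const extraPages = Math.ceil((needed - current) / WASM_PAGE_BYTES);
    memory.grow(extraPages); // throws a RangeError beyond the declared maximum
  }
  // Re-read memory.buffer after grow(): a non-shared buffer is replaced.
  new Uint8Array(memory.buffer).set(chunk, offset);
  return needed;
}

// Wrap the helper in a WritableStream so a ReadableStream can be piped into it.
function createMemoryWritable(memory) {
  let offset = 0;
  return new WritableStream({
    write(chunk) {
      offset = appendToMemory(memory, offset, chunk);
    },
  });
}
```

Usage would then be along the lines of `readable.pipeTo(createMemoryWritable(memory))`, with `memory` created by `new WebAssembly.Memory({ initial, maximum })` using values sized for the expected audio.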

@guest271314
Owner

To install aioquic

python3 -m pip install aioquic

create the necessary certificates https://github.com/guest271314/webtransport/blob/main/quic_transport_server_tts.py#L40, launch Chrome or Chromium with the appropriate flags found in the same comment block.

Note, I commented, do not use ALLOWED_ORIGINS https://github.com/guest271314/webtransport/blob/main/quic_transport_server_tts.py#L94 for the ability to run the code at console on any site.

@benoitmercusot
Author

Awesome, thanks a lot.

@guest271314
Owner

Relevant to running the code at console at any origin, there is still a restriction on doing so using AudioWorklet due to the design being an ECMAScript module, thus GitHub blocks loading. Once you get the code running and test at console on this very page you will perhaps gather why I filed WebAudio/web-audio-api-v2#109.
