compliant AudioBufferSourceNode and AudioBuffer #67

Merged
merged 38 commits into orottier:main from feature/audio-buffer-source-node on Dec 22, 2021

Conversation

@b-ma (Collaborator) commented Dec 14, 2021

Hey,

Here is a working draft. There are still a number of points that need to be improved, but the thing is working and has a clean user-facing API.

Most of the new code lives in src/audio_buffer.rs and src/node/audio_buffer_source.rs, but I had to add a few things here and there.

I have put a lot of comments in the code, but to summarise the things I'm not very happy with:

  • some types and structs are not really clean for now, but I'm pretty sure you will have some ideas on how to improve that
  • decodeAudioData only parses WAV files and doesn't resample, but that's OK for testing and validating the principles
  • I didn't write many unit tests against the AudioBufferSourceNode, but from the examples I'm quite confident there is no huge problem

I have put 2 new examples:

  • cargo run --release --example trigger_soundfile which showcases most of the API
  • cargo run --release --example granular which shows that it works quite well (stable ~2.4% CPU on my computer; I'm quite happy with this one :)

Let me know what you think

@b-ma b-ma requested a review from orottier December 14, 2021 17:53
@orottier (Owner) commented Dec 15, 2021

Hi @b-ma, thanks for all the work. Stay tuned, I need some more time.

@orottier (Owner) commented

Hi @b-ma, thanks for the PR. You've surely put a lot of effort in it.

As I understand it, your PR contains the following:

  1. a new definition for AudioBuffer: an Rc<Vec<Arc<Vec<f32>>>> with some metadata (len, channels)
  2. the new AudioBufferSourceNode for playing that buffer, working like this (sketched below):
  • set up a node with a crossbeam channel to the renderer to send the actual audio buffer
  • ship the buffer to the renderer via this channel
  • in the renderer, try to receive the buffer
  • render the buffer piecewise, with possible resampling, seeking, looping etc.

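For reference, a rough sketch of the shape being described; the names below are illustrative stand-ins, not the crate's actual items:

use std::rc::Rc;
use std::sync::Arc;
use crossbeam_channel::Sender;

// control-thread side: one Arc per channel, so channels are shared rather than copied
struct AudioBufferSketch {
    internal_data: Rc<Vec<Arc<Vec<f32>>>>, // one Arc<Vec<f32>> per channel
    sample_rate: f32,
    length: usize, // frames per channel
}

// hypothetical message shipped to the render thread over the crossbeam channel
enum BufferMessage {
    SetBuffer(Vec<Arc<Vec<f32>>>),
}

struct AudioBufferSourceSketch {
    sender: Sender<BufferMessage>,
}

impl AudioBufferSourceSketch {
    fn set_buffer(&self, buffer: &AudioBufferSketch) {
        // cloning the Arcs is cheap: the sample data itself is not copied
        let channels: Vec<Arc<Vec<f32>>> = buffer.internal_data.iter().cloned().collect();
        let _ = self.sender.send(BufferMessage::SetBuffer(channels));
    }
}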
Regarding 1) the current implementation of AudioBuffer is a Vec<Arc<Vec<f32>>> with some metadata. Why did you add the extra Rc? Is it because of the 'acquire the content' operations? I don't think we should port that to Rust: in Rust there is the concept of ownership, so we do not need to acquire/copy any data.

Regarding 2) I don't directly see what the difference is with the current implementation. If we are going to add hundreds of new lines of code, I think there should be a very clear benefit to it. The old AudioBufferSourceNode also takes a buffer, ships it to the render thread, and renders/resamples/loops on the render thread.

All in all I don't really see how your implementation differs from the current implementation (although you have added some methods from the spec that are missing in the current implementation). But maybe I am missing something, can you explain what difference in architecture you are trying to achieve?

I'm going to add a few comments on the code to help our mutual understanding! All in all this is a good way to discuss the project's internals

src/node/audio_buffer_source.rs
attributes: SharedAttributes,
channel_config: ChannelConfig,
sender: Sender<BufferChannelsMessage>,
detune: AudioParam, // has constraints, no a-rate
@orottier (Owner):

These two params are not actually used in rendering, am I right? (Also not in the current implementation.)

@b-ma (Collaborator, Author) replied Dec 16, 2021:
Yes they are: the computed_playback_rate is calculated at line 431 and used to update the renderer's internal state (mainly the playhead position) at each sample (line 576).
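For context, the formula the spec defines for that value, as a minimal sketch (illustrative names, not the PR's actual code):

// per the Web Audio spec: computedPlaybackRate = playbackRate * 2^(detune / 1200)
fn computed_playback_rate(playback_rate: f32, detune: f32) -> f32 {
    playback_rate * (detune / 1200.).exp2()
}

// the renderer advances its playhead by this amount (scaled for the buffer's
// sample rate) for every output sample, which is what makes playbackRate and
// detune audible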

@b-ma (Collaborator, Author) commented Dec 16, 2021

OK, lots of things to say, so:

Regarding 1) the current implementation of AudioBuffer is a Vec<Arc<Vec<f32>>> with some metadata. Why did you add the extra Rc? Is it because of the 'acquire the content' operations? I don't think we should port that to Rust: in Rust there is the concept of ownership, so we do not need to acquire/copy any data.

Several things there:

  • I really believe there is no huge difference between buffer::AudioBuffer and audio_buffer::AudioBuffer, except that the second one follows the spec API. Very probably the two could be merged somehow; I just made a new thing so that it was simpler for me to think about the requirements and internal logic of the buffer and source.
  • The additional Rc seems required to me because several sources can share the same AudioBuffer; it's perfectly fine, and important to be able to do, something such as (in JS-ish pseudo-code):
const buffer = getSomeOneSecondBufferSomehow(); 
const now = audioContext.currentTime;
for (let i = 0; i < 100; i++) {
   const src = audioContext.createBufferSource();
   src.buffer = buffer;
   src.start(now + i * 0.01);
}

Here you have 100 sources that access the same AudioBuffer at the same time, so somehow the buffer's internal_data needs to be shared between these sources; that's why I put this extra Rc instead of copying the Vec of channels. Maybe it is not really necessary, I'm really not sure about that, but if you have a sound file with more than one channel, it sounds cheaper to me to clone an Rc than to copy an entire Vec.

  • The concept of "acquire the content" as I understand it (maybe I didn't implement it the proper way in Rust, but I'm pretty confident in my understanding of the idea) derives from that possibility. It basically means that once a source has started its rendering, the memory it will access and read from cannot be corrupted or changed in any way (e.g. by a call to copyToChannel or getChannelData):
const buffer = getSomeOneSecondBufferSomehow(); 
const now = audioContext.currentTime;
for (let i = 0; i < 100; i++) {
   if (i == 50) {
      buffer.copyToChannel(someArray, 0);
      buffer.copyToChannel(someArray, 1);
   }
   const src = audioContext.createBufferSource();
   src.buffer = buffer;
   src.start(now + i * 0.01);
}

Here, the first 50 sources (that are still playing) will continue to read the audioBuffer data as it was when they started their rendering. That's why I added the creation of a new Arc and Vec in copy_to_channel, and the manual clone of the Arc of channels when the source starts.
As I wrote in the comments, I'm not sure this last step is necessary, but as I couldn't find any clear information about how clone behaves with nested structures, I just did it explicitly (i.e. if you clone an Rc<Vec<Arc<Vec<f32>>>>, do you clone only the Rc, or the Rc and the Arcs? If the second is true, I guess the code in AudioBufferSource::acquire_buffer_channels could perfectly well be removed).
The important thing here, rather than implementation details or language-specific concepts, is that from the point of view of an AudioBufferSourceNode, once it has started to play, the underlying data it accesses cannot be modified.

Edit [spec]: This operation returns immutable channel data to the invoker.
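A minimal sketch of that idea (illustrative names, not the PR's exact code): mutating the buffer swaps in a fresh Arc, so a source that already cloned the old Arcs keeps reading unchanged data.

use std::sync::Arc;

struct Channels {
    data: Vec<Arc<Vec<f32>>>,
}

impl Channels {
    // a started source "acquires" the content by cloning the Arcs it will read
    fn acquire(&self) -> Vec<Arc<Vec<f32>>> {
        self.data.iter().cloned().collect()
    }

    // writing replaces the Arc rather than mutating through it, so previously
    // acquired clones still point at the old, untouched samples
    fn copy_to_channel(&mut self, source: &[f32], channel: usize) {
        let mut new_data = self.data[channel].as_ref().clone();
        let n = source.len().min(new_data.len());
        new_data[..n].copy_from_slice(&source[..n]);
        self.data[channel] = Arc::new(new_data);
    }
}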

Regarding 2) I don't directly see what the difference is with the current implementation. If we are going to add hundreds of new lines of code, I think there should be a very clear benefit to it. The old AudioBufferSourceNode also takes a buffer, ships it to the render thread, and renders/resamples/loops on the render thread.

The main differences (which are quite important, because they are what the AudioBufferSourceNode is primarily made for), from what I can see, are that:

  • the old AudioBufferSourceNode works at block rate and cannot pitch, because it just renders one block at a time (from what I can see).
  • the new one
    • is sub-sample time accurate (start, stop and loop),
    • can pitch / detune and even play backward because it has direct access to the full buffer,
    • follows the spec (which I guess is important here :),
    • and does all that in a very cheap and predictable way, as it is fundamentally a table lookup into the underlying buffer (which is already resampled to the right sample rate) with a bit of interpolation (see the sketch below).

All these features are really important if you want to go beyond simply playing an audio file (for which you'd use MediaElementSourceNode) and do serious real-time sample-based synthesis (e.g. sequencer, granular synthesis, concatenative synthesis, etc.).
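A rough sketch of that per-sample lookup (illustrative only; the real renderer also handles looping, start/stop offsets and multi-channel buffers):

// render one output sample from `channel` at the fractional `playhead`,
// then advance the playhead by the computed playback rate
fn render_sample(channel: &[f32], playhead: &mut f64, computed_playback_rate: f64) -> f32 {
    let index = *playhead as usize;
    let frac = (*playhead - index as f64) as f32;

    let s0 = channel.get(index).copied().unwrap_or(0.);
    let s1 = channel.get(index + 1).copied().unwrap_or(0.);
    // linear interpolation between the two neighbouring frames
    let sample = s0 + frac * (s1 - s0);

    // a negative rate moves the playhead backwards, which is how reverse
    // playback falls out of the same code path
    *playhead += computed_playback_rate;
    sample
}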

@orottier (Owner) commented

Hi @b-ma, thanks for your initial replies.

Edit: I just now see your reply from 11 minutes ago, I need to read that carefully.

Sidenote:
I hope you realize I greatly appreciate the effort you put in contributing to this project. No matter the outcome of this PR, you are making us question the current state of the project and thus pushing quality improvements all over the place. So, thanks!
End sidenote.

I have added a commit with a very rudimentary forward/backward scrubbing example using the original implementation.

It is very flawed:

  • decode_audio_buffer does not exist yet, so I need to put it in a media element
  • this means I need to hardcode the buffer duration (3.0 seconds)
  • since there's only 1 media element, I need to stop the previous scrub before I can start the next one, causing clicks and a not-nice sound
  • something is off with the volume, I almost blew my eardrums out!

I will put some more work in it later, I think I can safely port your decode_audio_buffer code and shoehorn it into the current implementation. Then I can work on making my version of the example better!

@orottier (Owner) commented

The main differences (which are quite important because it's why the AudioBufferSourceNode is primarily made for) from what I can see is that:

the old AudioBufferSourceNode works at block rate, cannot pitch because it just renders one block at a time (from what I can see).
the new one …

These are very valid points, I see now. I will get back

@b-ma (Collaborator, Author) commented Dec 16, 2021

Hey, for your volume problem, I guess it's probably related to how the decoding is done in media::WavDecoder,

let bits = wav.spec().bits_per_sample as f32;

which should be u16::MAX as f32. But I'm not sure I understand why you are trying to recreate the scrubbing example with the MediaElementSourceNode. I don't think this node should be used for that kind of synthesis; it's meant for piping a stream from an HTML <audio> tag (which is the one that should hold the start, stop and seek API, cf. https://developer.mozilla.org/en-US/docs/Web/API/HTMLAudioElement) into the Web Audio API, to add some effects or the like (e.g. reverb, equalization). It handles a very different set of use cases (e.g. creating VLC-like players, etc.) with completely different constraints (for example, in a VLC-like app you would never want to fully load your whole playlist and/or possibly very long files into memory), and there I have the impression that the current implementation does the job quite well.
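Purely as an illustration of the decoding point (this is not the crate's WavDecoder; it assumes the hound crate's reader API, which the wav.spec() call suggests, and the exact full-scale divisor is the detail under discussion), normalization to f32 could look roughly like:

use hound::WavReader;

// if the divisor is the bit count (16.0), samples land around ±2048, far outside
// [-1.0, 1.0]; dividing by a full-scale constant keeps them in range
fn decode_interleaved_to_f32(path: &str) -> Vec<f32> {
    let mut reader = WavReader::open(path).unwrap();
    let bits = reader.spec().bits_per_sample;
    let full_scale = (1i64 << (bits - 1)) as f32; // 32768.0 for 16-bit PCM

    reader
        .samples::<i32>()
        .map(|s| s.unwrap() as f32 / full_scale)
        .collect()
}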

I think we are kind of falling back into the discussion of #64 here.

@orottier (Owner) commented

To keep stuff a bit organized, I propose to discuss 'acquiring the content' in a later issue. Okay? Let's focus on building a good AudioBufferSourceNode renderer in this PR.

@orottier (Owner) commented

I think this is going in a great direction! I will try to answer your question inline in the diff

src/buffer.rs
//
// assert_eq!(buffer.sample_rate().0, 96_000);
// ```
pub(crate) fn resample(&mut self, sample_rate: SampleRate) {
@orottier (Owner):
We should definitely replace this version later with the library we are already using

@b-ma (Collaborator, Author):
you mean the rubato lib here?

@orottier (Owner):
yes!

@b-ma (Collaborator, Author) commented Dec 18, 2021

Two things are left, I guess, to make the AudioBuffer fully compliant (added in #30):

  • buffer::ChannelData should be pub(crate), but it is currently used in examples/bench.rs
  • that would allow moving buffer::AudioBuffer::from_channels to pub(crate) once done

I preferred not to change that since I don't know if bench.rs is still relevant

@orottier (Owner) commented

Let's drop bench.rs, it was a silly way to benchmark in the past when I was trying out different data structures for the buffers

@b-ma (Collaborator, Author) commented Dec 18, 2021

Damn, that uncovered another problem I didn't see in tests/media.rs, where the SlowMedia iterator directly creates an AudioBuffer from ChannelData:

let channel_data = ChannelData::from(vec![self.value; RENDER_QUANTUM_SIZE]);
let buffer = AudioBuffer::from_channels(vec![channel_data], self.sample_rate);

I guess a good option here could be that the SlowMedia iterator returns plain Vecs of f32 samples, while media::MediaElement (or maybe node::MediaStreamRenderer) would be the one in charge of creating the AudioBuffer for buffering and resampling (see the sketch below). I guess that would make "external" media producers simpler and more generic, what do you think?

Maybe it's best if you have a look at that, as I'm not really familiar with these streaming nodes.
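A hypothetical shape for that split (all names invented here for illustration, not the crate's actual API): external producers yield raw channel data, and the media-element layer wraps it into AudioBuffers.

// stand-in for the crate's buffer type
struct AudioBuffer;

// an "external" producer only yields raw channel data, one inner Vec per channel
fn test_producer(value: f32, len: usize) -> impl Iterator<Item = Vec<Vec<f32>>> {
    std::iter::repeat(vec![vec![value; len]])
}

// the media-element / stream-renderer side turns that into AudioBuffers,
// doing whatever buffering and resampling it needs
fn wrap_into_buffers(
    producer: impl Iterator<Item = Vec<Vec<f32>>>,
    _target_sample_rate: f32,
) -> impl Iterator<Item = AudioBuffer> {
    // here an AudioBuffer would be built (and resampled if needed) from each chunk
    producer.map(|_channels| AudioBuffer)
}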

@orottier (Owner) commented

Sure, I'll have a look. Can you push the changes so far, even though the build/tests will fail?

@b-ma (Collaborator, Author) commented Dec 19, 2021

OK thanks, I just pushed my latest changes (with the build failing in tests/media.rs):

  • moved ChannelData and AudioBuffer::from_channels to pub(crate)
  • deleted bench.rs

@orottier (Owner) commented

Hey @b-ma I added my changes. Is this ready for final review now or are you still considering new changes?

@b-ma (Collaborator, Author) commented Dec 20, 2021

Hey, thanks! No, I think I'm good on my side.

@b-ma b-ma marked this pull request as ready for review December 20, 2021 15:30
src/buffer.rs
pub fn is_empty(&self) -> bool {
    self.data.is_empty()
}
// never used
@b-ma (Collaborator, Author):
delete or keep that around?

@orottier (Owner):
Clippy says we should keep it (as we also provide a len() method)

@b-ma (Collaborator, Author) commented Dec 20, 2021

I'll do a short pass on the comments tomorrow, there are a few typos and irrelevant/outdated notes here and there

Edit: that's done
Edit 2: I have some ideas for implementing more unit tests for the AudioBufferSourceNode; I will do that in a follow-up PR if that's OK with you.

@orottier (Owner) left a review comment:
Almost ready to merge I think. I left some nitpicks

// Then the number of frames copied from buffer to destination is max(0, min(Nb - k, Nf)).
// If this is less than Nf, then the remaining elements of destination are not modified.
let dest_length = destination.len();
let max_frame = (self.length() - offset).min(dest_length).max(0);
@orottier (Owner):
If offset is a very large number, self.length() - offset could cause an underflow. We should guard against it, as it causes a panic in debug mode.
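One minimal way to guard that (a sketch; the merged fix may differ):

// number of frames to copy: max(0, min(Nb - k, Nf)), without underflowing
// when the offset `k` exceeds the buffer length `Nb`
fn frames_to_copy(buffer_len: usize, offset: usize, dest_len: usize) -> usize {
    buffer_len.saturating_sub(offset).min(dest_len)
}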


src/control.rs
@@ -24,6 +26,8 @@ impl Scheduler {
        Self {

@orottier (Owner):
The Scheduler is the driver of the AudioScheduledSourceNode trait, which corresponds to https://www.w3.org/TR/webaudio/#AudioScheduledSourceNode.
We should probably move offset & duration to the Controller. The Controller (not part of the spec) is also used for MediaElements, but I will figure out a way to actually use these fields there later.


- moved `offset` and `duration` into `Controller`
- make sure `offset` can't be outside range in `AudioBuffer::copy_from_channel` and `AudioBuffer::copy_to_channel`
- reintroduce `ChannelData::is_empty` with `#[allow(dead_code)]`
@b-ma (Collaborator, Author) commented Dec 22, 2021

OK, changes are done!

@orottier orottier merged commit 8107493 into orottier:main Dec 22, 2021
@orottier (Owner) commented

Great work!

@b-ma (Collaborator, Author) commented Dec 22, 2021

Cool! And thanks for your help and patience on that.

@b-ma b-ma deleted the feature/audio-buffer-source-node branch November 4, 2023 06:47