Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed blocksize support in _Recorder.record((with pulseaudio) #11

Closed
wants to merge 3 commits into from

Conversation

mpariente
Copy link
Contributor

Here are the proposed changes suggested in #10

  • Added the optional argument fixed_blocksize (default False) in the _Recorder class
  • Added _Rercorder.flush method to empty and return the last unused chunk

Here is a working example

import soundcard as sc
import numpy as np

FS = 16000
CHUNK = 128
RECORD_SECONDS = 20
RECORD_STEP = int(float(FS * RECORD_SECONDS) / CHUNK)

default_speaker = sc.default_speaker()
default_mic = sc.default_microphone()

with default_mic.recorder(FS, blocksize=CHUNK, fix_blocksize=True) as mic,\
     default_speaker.player(FS, blocksize=CHUNK) as sp:
    for _ in range(RECORD_STEP):
        data = mic.record(CHUNK)
        if data.shape[0] != CHUNK:
            print('Not matching')
        sp.play(data)
last_chunk = mic.flush()

@bastibe
Copy link
Owner

bastibe commented Apr 18, 2018

Thank you for your pull request.

There is currently already an open pull request for this issue, #9, and @stvol will fix this issue very soon.

@mpariente
Copy link
Contributor Author

Thank you for your answer.

I saw the pull request but nothing happened for 3 weeks so I thought that I could take over.
Moreover, this PR enables the user to choose whether to use to this feature or not, and is tested.
Is there any problem with the code of this PR?
Thanks,

@bastibe
Copy link
Owner

bastibe commented Apr 18, 2018

I saw the pull request but nothing happened for 3 weeks so I thought that I could take over.

You couldn't know this, but @stvol is a colleague of mine, and we had a few discussions about this topic already. He has been away for a few days, that's why progress on that pull request stalled.

However, your pull request has made me think about this problem in more depth. My original issue with this kind of scheme was that a user could, hypothetically, open a Recorder, then immediately record a short bit of audio, then wait, then record again. In that case, I feared, we would keep a partial block from the first recording, and then append it to the later recording; If a buffer overflow happened in the meantime, this would concatenate two discontinuous pieces of audio and produce an artifact. I didn't actually want to go forward with this plan before finding a solution to this continuity problem.

It seemed like a difficult problem, since pulseaudio does not seem to expose any API for checking for buffer overflows while recording. Today I learned why (by trying it out): Pulse just buffers everything, and will never overrun as long as your memory doesn't fill up. I tried as much as several minutes without issues. This is incredibly convenient, and means that we can go ahead and fix this problem.

That said, let's talk API:

this PR enables the user to choose whether to use to this feature or not, and is tested.

I think I don't like how we have to set both blocksize and fix_blocksize. This is confusing.
I see two solutions to this:

  • Have record and record_chunk, where the former does all the buffering, while the latter just returns whatever pulse is returning (no buffering, no additional latency). The record method would then call record_chunk to get data from pulse.
  • allow numframes to be None in record, in which case it returns whatever pulse is returning (plus any buffered frames, if available). If numframes is set, do the buffering. (And probably factor out a hidden _record_chunk anyway to make the code cleaner).

What do you think about these options?

@bastibe bastibe mentioned this pull request Apr 18, 2018
@mpariente
Copy link
Contributor Author

My original issue with this kind of scheme was that a user could, hypothetically, open a Recorder, then immediately record a short bit of audio, then wait, then record again.

The solution I found to this problem was to implement the flush method. But if we can use deeper knowledge of pulseaudio to write cleaner code, it's obviously better.
Just to be sure that I understand what you explained on how pulseaudio works : If I record something, stop the recording, and (with the same object) start the recording a minute later, record will return the audio that happened a minute ago, right?
In this case there is no need to flush the remaining unreturned buffer.

So on the first option, the current record method would become record_chunk and the record method would call it and perform the buffering? I like that !

For the second option, it's possible that a user would want to request numframes samples and prefers to have variable size chunks rather than a little additional delay. And I think it wouldn't be possible with this option, or am I missing something?

Another question, what should be the value of blocksize for realtime recording/playback loop?
Should it be similar to numframes or much smaller?

Also, what drives the choice of the 1ms sleep here in the record code? Maybe the samples will be available before with a small blocksize.

Tell me if I can help or if I let @stvol and you to do what you will decide to do.
Anyway, thank you for keeping me in the loop!

@bastibe
Copy link
Owner

bastibe commented Apr 19, 2018

So on the first option, the current record method would become record_chunk and the record method would call it and perform the buffering? I like that !

exactly!

For the second option, it's possible that a user would want to request numframes samples and prefers to have variable size chunks rather than a little additional delay. And I think it wouldn't be possible with this option, or am I missing something?

Sorry for being unclear: I meant, if the user says numframes=1024 we will return exactly 1024 samples, and buffer in the background. If the user does not give numframes, we set a default value of numframes=None, and return whatever pulse returns.

Another question, what should be the value of blocksize for realtime recording/playback loop?
Should it be similar to numframes or much smaller?

I was thinking about this, too. It should probably be much smaller than your expected numframes. We'll have to experiment with the finished code to find a reasonable value.

Also, what drives the choice of the 1ms sleep here in the record code? Maybe the samples will be available before with a small blocksize.

In coreaudio, the only way you can access recorded data is in a callback. This loop just polls a queue that is filled by the callback. It's not a great solution. I am currently working on a better version that uses signals instead of polling, which should be much cleaner.

Tell me if I can help or if I let @stvol and you to do what you will decide to do.

If you want to, you can go ahead and implement either of the above solutions (numframes=None or record_chunk). @stvol said he'd be grateful if someone else took over.

@mpariente
Copy link
Contributor Author

In coreaudio, the only way you can access recorded data is in a callback.

I was talking about pulseaudio, and I guess it's not the same. I'm no expert in any of these three libraries, that's why I'm asking.

If you want to, you can go ahead and implement either of the above solutions (numframes=None or record_chunk). @stvol said he'd be grateful if someone else took over.

I'm going to (try to) implement the record_chunk option for pulseaudio, I'll let you know when I went forward.

@bastibe
Copy link
Owner

bastibe commented Apr 20, 2018

I was talking about pulseaudio, and I guess it's not the same. I'm no expert in any of these three libraries, that's why I'm asking.

Sorry about that. There are two places where a sleep-wait like this happens, and the more egregious place is in the coreaudio code (which I had open in a separate tab, and confused it with your link). The reason is much the same, though: waiting for new data.

I don't want to busy-wait, as this would saturate the CPU unnecessarily and steal resources that other threads might want to use. I guess the correct solution in this case would be to register a callback using pa_stream_set_read_callback, which triggers an Event, and wait for that event trigger instead of sleeping.

(This is untested, but something I intend to try out next week or so)

I'm going to (try to) implement the record_chunk option for pulseaudio, I'll let you know when I went forward.

Cool! I'll look forward to it!

@mpariente mpariente closed this Apr 24, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants