Cannot record sound with loopback if silence at start #166
Depending on the sound card, silence is either reported as no-data, or as silence. However, support for this in soundcard has not been published yet, as I didn't have a good test case yet. Could you try running your code against the current Git master of soundcard? I believe your issue should be fixed on there. And if it is, I will publish it as a new version as soon as you confirm that it's working as intended. |
@bastibe
Result
I cloned the current master of soundcard and ran the code written above on three output devices.
I also encountered the following warning at random times on all output devices, although the code otherwise worked as described above (the timing was random, but the warning tended to occur when the output device had been switched before running the code). C:\Users\user\workspace\clone\bastibe\SoundCard\soundcard\mediafoundation.py:750: SoundcardRuntimeWarning: data discontinuity in recording
warnings.warn("data discontinuity in recording", SoundcardRuntimeWarning) |
Oh, the endless vagaries of sound drivers on Windows. Regrettably, I can't debug this issue on my machine, as my sound card behaves like your Realtek. Could you check how this fails in I could imagine that Alternatively, you could try extending the empty-watcher to more than 10ms. I have seen Windows sound cards taking up to 4s to wake up in extreme cases, if that's the problem. Perhaps we need to wait until However, if so, I still don't know how to proceed in soundcard, as the API does not give an indication of how much silence there was. Soundcard operates on the assumption that you can get a fixed number of samples per second. WASAPI just refusing to return anything breaks that assumption. If you have a reasonable idea of how to deal with that, I'm all ears! |
@bastibe
The results of testing your suggestions
The value returned from GetNextPacketSize in _capture_available_frames()
Contrary to your expectation, GetNextPacketSize always returned 0.
Extending the empty-watcher to more than 4s
I extended the empty-watcher to 5s, and the code ended about 5s after it started.
Waiting until AUDCLNT_E_SERVICE_NOT_RUNNING clears
I don't know how to do this because I lack knowledge about audio. Sorry about that.
What I noticed
time.sleep() cannot sleep for 1ms
I noticed that time.sleep(0.001) actually sleeps for about 5-15ms rather than 1ms. This answer in stackoverflow says
The reason the code ends immediately and records silence when there is no sound at the start of the code on Pixel Buds A-Series
The behavior of SoundCard in this case is as follows.
|
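The observation about time.sleep(0.001) can be checked with a short measurement. This is a minimal sketch using only the standard library; the helper name measure_sleep_ms is hypothetical and not part of soundcard. The overshoot it reports depends on the operating system and its timer resolution, so the numbers will differ between machines.

```python
import time


def measure_sleep_ms(requested_s=0.001, repeats=20):
    """Return the measured durations (in ms) of repeated time.sleep(requested_s) calls."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        time.sleep(requested_s)
        # perf_counter() returns seconds; convert the difference to milliseconds.
        durations.append((time.perf_counter() - start) * 1_000)
    return durations


durations = measure_sleep_ms()
print(f"requested 1.000 ms, min {min(durations):.3f} ms, max {max(durations):.3f} ms")
```

If the minimum is consistently well above 1 ms, any polling loop built on time.sleep(0.001) wakes up far less often than the code suggests.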
That's very interesting, thank you! If I understand this correctly, it means that (some variants of) the Windows audio API just return no data when none is available. That is not in itself a problem, but it breaks the assumption of soundcard, which would rather return zeros than no data. We can fudge that by just making up some zeros if no data is available. However, the question then becomes: How many zeros should we return? Because the length of the output is how soundcard expresses how much time has passed. In this case, it is probably acceptable if the number of zeros is off by some margin of error. Ideally, we'd ask the audio driver for a current "time", but as far as I can tell, no such API is available. As a workaround, change _record_chunk:
def _record_chunk(self):
    # skip docstring for this example...
    start_time = 0 # in the real implementation, make this self.start_time so we don't skip processing time
    while not self._capture_available_frames():
        if start_time == 0:
            start_time = time.perf_counter_ns()
        now = time.perf_counter_ns()
        # no data for 50 ms: give up and return zeros.
        if now - start_time > 50_000_000:
            ppMixFormat = _ffi.new('WAVEFORMATEXTENSIBLE**')
            hr = self._ptr[0][0].lpVtbl.GetMixFormat(self._ptr[0], ppMixFormat)
            _com.check_error(hr)
            samplerate = ppMixFormat[0][0].nSamplesPerSec # in the real implementation, cache samplerate in self.
            num_samples = samplerate * (now - start_time) / 1_000_000
            return numpy.zeros([len(set(self.channelmap)) * num_samples], dtype='float32')
        time.sleep(0.001)
    # continue with the rest of the function below the while loop...
This should give you a reasonable estimate of the correct number of zeros. If this solves your problem, I'll code up a proper implementation. |
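The "how many zeros" question comes down to a unit conversion, which can be sketched in isolation. This is a minimal sketch, not soundcard's implementation: the helper name silence_sample_count is hypothetical, and it assumes the elapsed time is measured in nanoseconds (as returned by time.perf_counter_ns()), which is why it divides by 1_000_000_000 to get seconds. It also assumes the samples of all channels are interleaved in one flat buffer.

```python
def silence_sample_count(elapsed_ns, samplerate, num_channels):
    """Number of interleaved zero samples that cover elapsed_ns of silence.

    elapsed_ns   -- gap length in nanoseconds (e.g. a time.perf_counter_ns() difference)
    samplerate   -- frames per second
    num_channels -- channels per frame
    """
    # nanoseconds -> seconds is a division by 1e9; integer math keeps the result an int.
    frames = samplerate * elapsed_ns // 1_000_000_000
    return frames * num_channels


# A 50 ms gap at 48 kHz stereo: 2400 frames, 4800 interleaved samples.
print(silence_sample_count(50_000_000, 48_000, 2))
```

The result is an integer, so it can be passed directly as an array length, for example to numpy.zeros(..., dtype='float32').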
@bastibe
The result of testing your code on my machine
Debugging your code
I changed the following lines because they raised errors.
# your original code
samplerate = ppMixFormat[0][0].nSamplesPerSec
# modified code
samplerate = ppMixFormat[0][0].Format.nSamplesPerSec
# your original code
num_samples = samplerate * (now - start_time) / 1_000_000
# modified code
num_samples = int(samplerate * (now - start_time) / 1_000_000)
Result
Your code ended immediately and recorded only silence, because numpy.zeros() returned an array large enough to complete the recording at once.
The code which worked correctly
Code
_record_chunk()'s while loop in mediafoundation.py
start_time = 0 # in the real implementation, make this self.start_time so we don't skip processing time
while not self._capture_available_frames():
    if start_time == 0:
        start_time = time.perf_counter_ns()
    now = time.perf_counter_ns()
    # no data for 50 ms: give up and return zeros.
    if now - start_time > 50_000_000:
        ppMixFormat = _ffi.new('WAVEFORMATEXTENSIBLE**')
        hr = self._ptr[0][0].lpVtbl.GetMixFormat(self._ptr[0], ppMixFormat)
        _com.check_error(hr)
        samplerate = ppMixFormat[0][0].Format.nSamplesPerSec # in the real implementation, cache samplerate in self.
        num_samples_per_ms = samplerate / 1_000
        num_channels = len(set(self.channelmap))
        giveup_ms = 50
        return numpy.zeros(int(num_samples_per_ms * giveup_ms * num_channels), dtype='float32')
    # rewrote time.sleep(0.001), because time.sleep(0.001) cannot sleep for 1ms.
    remaining_time = 1
    sleep_ms = 1
    _start = time.perf_counter()
    while remaining_time > 0:
        elapsed_time = (time.perf_counter() - _start) * 1_000
        remaining_time = sleep_ms - elapsed_time
Test code
I added some code that prints info.
import soundcard as sc
import soundfile as sf
import time
OUTPUT_FILE_NAME = "out.wav" # output file name.
SAMPLE_RATE = 48_000 # [Hz]. sampling rate.
RECORD_SEC = 5 # [sec]. recording duration.
print(f"output device: {str(sc.default_speaker().name)}")
with sc.get_microphone(id=str(sc.default_speaker().name), include_loopback=True).recorder(samplerate=SAMPLE_RATE) as mic:
    _start_time: float = time.perf_counter()
    # record audio with loopback from default speaker.
    data = mic.record(numframes=SAMPLE_RATE*RECORD_SEC)
    # output info
    print("\n-- info --")
    print(f"len of data: {len(data)}")
    print(f"elapsed time: {time.perf_counter() - _start_time}s")
    print("-- -- -- --\n")
    sf.write(file=OUTPUT_FILE_NAME, data=data[:, 0], samplerate=SAMPLE_RATE)
Result
Initially, the code recorded silence and then recorded sound from YouTube.
soundcard_bug.mp4 |
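The busy-wait that replaces time.sleep(0.001) in the code above can be written as a small standalone helper. This is a minimal sketch using only the standard library; the name busy_sleep is hypothetical and not part of soundcard. It trades CPU time for timing precision, since the loop spins on time.perf_counter() instead of yielding to the scheduler.

```python
import time


def busy_sleep(duration_s):
    """Spin until at least duration_s seconds have elapsed on time.perf_counter()."""
    deadline = time.perf_counter() + duration_s
    while time.perf_counter() < deadline:
        pass  # intentionally burn CPU; time.sleep() may overshoot by many milliseconds


start = time.perf_counter()
busy_sleep(0.001)
elapsed_ms = (time.perf_counter() - start) * 1_000
print(f"busy_sleep(0.001) took {elapsed_ms:.3f} ms")
```

Spinning like this is probably acceptable for waits of about a millisecond, but it keeps one CPU core fully busy for as long as no audio data arrives.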
It worked fine on my three output devices! |
Perfect! Thank you for your feedback! |
@bastibe |
First of all, thank you for the amazing library.
It helps my projects a lot.
Behavior I encountered
I wrote the following program, which just records the speaker output for 5 seconds with loopback and saves it.
This program works if there is sound from the speaker at the start.
However, it doesn't work if there is silence at the start.
The behavior when there is silence at the start is as follows.
My environment
Error
There is no error.