High-resolution sampling #17

Closed
maelp opened this issue Feb 7, 2018 · 2 comments

Comments

maelp commented Feb 7, 2018

I was reading the paper https://arxiv.org/pdf/1712.03439.pdf, which mentions that they sample the IR at 1024 kHz, then resample to 128 kHz and apply a low-pass filter at 80 kHz, then resample to the audio rate of 16 kHz.

I have a few questions:

  • From their paper, it seems they use 1024 kHz sampling because of the distance between the microphones. Would we still expect better results in the single-microphone case if we do the computation at a higher resolution and then resample?

  • When I do the computation at 1024 kHz and resample to 16 kHz, the signal is not scaled in the same way. If I apply a scaling of 1024 / 16, I get comparable values for the max, although the resulting IRs do not look the same (they seem slightly "shifted" in time). Is there a better way to do the resampling?

  • Their paper shows how they do efficient OLA (overlap-add) computations. Is this what pyroomacoustics uses, and if not, would it take a lot of work to add it if it were needed?
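
A minimal sketch of why a factor close to 1024 / 16 = 64 can appear when downsampling, assuming the high-rate IR is built from one-sample impulses and the resampler uses a unit-gain anti-aliasing filter (as `scipy.signal.resample_poly` does by default). The filter preserves the DC gain of the signal, not its peak value, so a single-sample impulse comes out with a peak of roughly 1/64. The signal length and impulse position below are illustrative choices, not values from the paper.

```python
import numpy as np
from scipy.signal import resample_poly

fs_high = 1_024_000           # high-resolution rate mentioned in the issue (Hz)
fs_low = 16_000               # target audio rate (Hz)
factor = fs_high // fs_low    # 64

# Illustrative high-rate IR: a single unit impulse that lands exactly on
# low-rate sample 100 (i.e. high-rate sample 100 * 64).
ir_high = np.zeros(factor * 256)
ir_high[factor * 100] = 1.0

# Polyphase decimation with the default unit-DC-gain low-pass filter.
ir_low = resample_poly(ir_high, up=1, down=factor)

peak_index = int(np.argmax(np.abs(ir_low)))
peak_value = float(ir_low[peak_index])

print("peak index at 16 kHz:", peak_index)
print("peak value:", peak_value)
print("peak value * 64:", peak_value * factor)
```

`resample_poly` compensates for the delay of its internal filter, so the peak stays at the expected index. A causal low-pass filter followed by plain decimation would instead delay the IR by the filter's group delay, which may account for an apparent shift in time.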

Collaborator

fakufaku commented Feb 8, 2018

Hi @maelp, I think you are referring to the RIR creation method in this paper (ref [1] in the one you linked). Essentially, these guys are rediscovering 60+ years of signal processing on their own. I wouldn't recommend it as a reference for your own implementation.

In their method, they need to generate the RIR at a higher sampling rate because they use impulses that are rounded to the nearest sample. Pyroomacoustics doesn't have this problem because we generate fractional delays directly, so no rounding to the nearest sample is ever done. What comes out of the simulator is better than what you'll get using this upsampling method.
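
To illustrate the difference between the two approaches, here is a minimal sketch comparing an impulse rounded to the nearest sample with a fractional-delay impulse. It is not pyroomacoustics' actual code: the Hann-windowed sinc kernel, the helper names, and the numeric values are assumptions chosen for illustration.

```python
import numpy as np

def rounded_impulse(delay, length):
    """Unit impulse placed at the nearest integer sample."""
    h = np.zeros(length)
    h[int(round(delay))] = 1.0
    return h

def fractional_impulse(delay, length, half_width=40):
    """Hann-windowed sinc centred on a (possibly non-integer) delay."""
    x = np.arange(length) - delay
    window = np.where(np.abs(x) <= half_width,
                      0.5 * (1.0 + np.cos(np.pi * x / half_width)),
                      0.0)
    return np.sinc(x) * window

def low_frequency_delay(h):
    """Estimate the delay (in samples) from the phase of the first DFT bin."""
    n = len(h)
    return -np.angle(np.fft.rfft(h)[1]) * n / (2.0 * np.pi)

true_delay = 50.4   # samples; deliberately not an integer
length = 128

delay_rounded = low_frequency_delay(rounded_impulse(true_delay, length))
delay_fractional = low_frequency_delay(fractional_impulse(true_delay, length))

print("rounded impulse delay:   ", delay_rounded)
print("fractional impulse delay:", delay_fractional)
```

The rounded impulse carries a delay of exactly 50 samples, so the 0.4-sample remainder is lost, while the windowed sinc keeps the requested delay to within a small approximation error at the original sampling rate.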

Regarding computations, OLA is a completely standard method, and the STFT method of pyroomacoustics lets you do it. It is a good point, though, that this is not actually what we use to convolve the RIR in the room simulation. Instead we use scipy.signal.fftconvolve, which uses the FFT for the convolution (Real FFT filtering in the paper linked, Eq 4). So, going by their paper, an efficient implementation of OLA might be 2x as fast. The problem for us is that the OLA in stft is done in Python, so it might actually be slower than fftconvolve. This should be checked.
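
As a rough starting point for that check, here is a sketch of a block-wise overlap-add convolution in NumPy, compared against `scipy.signal.fftconvolve`. The `overlap_add_convolve` helper, the block size, and the synthetic signal and RIR are assumptions made for illustration. This is not the pyroomacoustics STFT code, and it only verifies that the two approaches agree numerically; it says nothing about speed.

```python
import numpy as np
from scipy.signal import fftconvolve

def overlap_add_convolve(x, h, block_size=1024):
    """Convolve x with h block by block using real FFTs (overlap-add)."""
    # FFT size: next power of two that holds one block's linear convolution.
    n_fft = 1
    while n_fft < block_size + len(h) - 1:
        n_fft *= 2

    H = np.fft.rfft(h, n_fft)              # filter spectrum, computed once
    y = np.zeros(len(x) + len(h) - 1)

    for start in range(0, len(x), block_size):
        block = x[start:start + block_size]
        # Filter the zero-padded block in the frequency domain.
        seg = np.fft.irfft(np.fft.rfft(block, n_fft) * H, n_fft)
        n_valid = len(block) + len(h) - 1
        # The tail of each block overlaps the next one and is added to it.
        y[start:start + n_valid] += seg[:n_valid]
    return y

# Synthetic test data: 1 s of noise at 16 kHz and a decaying-noise "RIR".
rng = np.random.default_rng(0)
signal = rng.standard_normal(16000)
rir = rng.standard_normal(2000) * np.exp(-np.arange(2000) / 400.0)

y_ola = overlap_add_convolve(signal, rir)
y_fft = fftconvolve(signal, rir)
max_error = float(np.max(np.abs(y_ola - y_fft)))

print("max abs difference:", max_error)
```

Timing both functions (for example with `timeit`) on realistic signal and RIR lengths would show whether a pure-Python block loop like this one is competitive with `fftconvolve`.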

Author

maelp commented Feb 8, 2018

Thank you

maelp closed this as completed Feb 8, 2018