STFT: streaming performance, output shapes #81

Closed
falseywinchnet opened this issue Dec 19, 2022 · 8 comments
Labels
question Further information is requested

Comments

@falseywinchnet

Hi, for my project https://github.com/falseywinchnet/streamcleaner
I had previously been using librosa's stft.
However, after I was able to make the ssqueezepy stft behave similarly by appending
128 samples to the input and then keeping all but the last 127 samples of the istft output, I switched over to ssqueezepy,
believing that the caveat described in librosa/librosa#1279
was important to consider for my purposes.

However, librosa's stft consumes only ~2% CPU, while ssqueezepy takes up 25% on a similar workload.
I'm concerned this is due to my crude attempt at making the two behave similarly. With that workaround, the STFT
representation is very close to librosa's in terms of MSE.

Is there a proper way to make ssqueezepy's stft generate a 257x376 complex representation from 48,000 samples,
return 48,000 samples from the inverse, and run with similar compute requirements?
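
As a sanity check on the 257x376 shape mentioned above, here is a small sketch of the output shape of a centered STFT, using the frame-count rule librosa documents for center=True. The helper name centered_stft_shape is hypothetical, and ssqueezepy's own frame-count rule is not stated in this thread and may differ.

```python
def centered_stft_shape(n_samples, n_fft, hop_length):
    """Shape (n_bins, n_frames) of a centered STFT of a real signal."""
    n_bins = 1 + n_fft // 2                  # non-negative frequencies only
    n_frames = 1 + n_samples // hop_length   # one frame per hop, plus the first
    return n_bins, n_frames


shape = centered_stft_shape(n_samples=48000, n_fft=512, hop_length=128)
print(shape)
```

With 48,000 samples, n_fft=512 and a hop of 128, this rule gives 257 bins and 376 frames.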

@falseywinchnet
Author

It was as I feared: it was due to my adding an extra frame.
I removed the extra samples and my algorithm appears to work exactly as it did before the change, perhaps even better.
However, it does generate an STFT of a slightly different size.

@falseywinchnet
Author

No, I was wrong. It still doesn't work right.
That is to say, I didn't bother checking whether it dropped samples; I assumed the reduction in CPU use was due to a corrected algorithm.
However, it now returns 47999 samples, which causes my program to quietly hang.

@falseywinchnet
Author

Let's just make this much simpler: dear John, is there a simple change I can make as a post-processing step on librosa's STFT that will deliver the changes recommended in librosa/librosa#1279?

@OverLordGoldDragon
Owner

Hello,

> is there a simple change I can make to librosa's handling as a post-processing step on their STFT which will deliver the changes recommended in librosa/librosa#1279

Yes, either by modifying librosa.stft or as a postprocessing step; either approach may need a one-sample adjustment to handle different configs.

> extra frame.

Note that librosa's output size is suboptimal.

> ssqueezepy on a similar workload takes up 25%

The default behavior is to use multiprocessing for speedup; it can be turned off via os.environ['SSQ_PARALLEL'] = '0'.
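
A minimal sketch of applying that setting from Python. Setting the variable at the top of the script, before any ssqueezepy calls, is an assumption made here to be safe; the thread does not say when the library reads it.

```python
import os

# Turn off ssqueezepy's parallel execution, per the suggestion above.
# Assumption: done early, before any ssqueezepy function is called.
os.environ['SSQ_PARALLEL'] = '0'

# ... import and call ssqueezepy as usual after this point ...
print(os.environ['SSQ_PARALLEL'])
```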

> Is there a proper way to make the stft of ssqueezepy

For inquiries like this, it helps to have minimal reproducing code showing the obtained vs. desired behavior.

@falseywinchnet
Author

Well, at this time I'm also being strongly cautioned that I need to start looking into what is called "online" STFT, where analysis (the STFT part) is done into a ring buffer and synthesis (the istft part) is done via overlap-add, potentially on different chunk sizes, so that my algorithm can become real-time.

However, I wanted to do the best job I could with simple one-second frames of 48000 samples, using an FFT size of 512 and a hop length of 128, before attempting such a massive change.
My first obstacle was that no matter whose alternative STFT implementation I used, it returned 47999 samples instead of 48000 when I attempted to generate a similar STFT and invert it.
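
For reference, the "online" approach mentioned above can be sketched in plain NumPy: a sliding input buffer for analysis and an overlap-add accumulator for synthesis, processing one hop at a time. This is a generic illustration, not how librosa or ssqueezepy work internally. The class name StreamingSTFT, the modify hook, and the choice of a square-root periodic Hann window are assumptions made for this sketch.

```python
import numpy as np

N_FFT, HOP = 512, 128  # sizes from the post above

# Square root of a periodic Hann window, applied at both analysis and
# synthesis, so the effective window is Hann. Shifted copies of a periodic
# Hann window at this hop sum to the constant N_FFT / (2 * HOP).
WINDOW = np.sqrt(0.5 - 0.5 * np.cos(2 * np.pi * np.arange(N_FFT) / N_FFT))
OLA_GAIN = N_FFT / (2 * HOP)


class StreamingSTFT:
    """Hypothetical helper: process audio one hop (HOP samples) at a time."""

    def __init__(self):
        self.in_buf = np.zeros(N_FFT)   # most recent N_FFT input samples
        self.out_buf = np.zeros(N_FFT)  # overlap-add accumulator

    def process(self, hop_samples, modify=None):
        # Analysis: slide the input buffer by one hop and transform it.
        self.in_buf = np.concatenate([self.in_buf[HOP:], hop_samples])
        spec = np.fft.rfft(self.in_buf * WINDOW)
        if modify is not None:
            spec = modify(spec)  # spectral processing would go here
        # Synthesis: inverse transform, window again, overlap-add.
        self.out_buf += np.fft.irfft(spec, n=N_FFT) * WINDOW
        out = self.out_buf[:HOP] / OLA_GAIN  # these samples are now complete
        self.out_buf = np.concatenate([self.out_buf[HOP:], np.zeros(HOP)])
        return out


rng = np.random.default_rng(0)
x = rng.standard_normal(48000)
stream = StreamingSTFT()
y = np.concatenate([stream.process(x[i:i + HOP]) for i in range(0, len(x), HOP)])

delay = N_FFT - HOP  # the output lags the input by this many samples
print(len(y), np.allclose(y[delay:], x[:len(x) - delay]))
```

With no spectral modification, the output is the input delayed by N_FFT - HOP samples, and the number of samples out always equals the number of samples in, which sidesteps the 47999 vs. 48000 question at the cost of latency.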

@OverLordGoldDragon
Owner

There's a 1-sample ambiguity due to integer rounding, hence N= must be specified in istft:

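import numpy as np
from ssqueezepy import stft, istft
x = np.ones(48000)
kw = dict(n_fft=512, hop_len=128)
# pass N so the inverse returns exactly len(x) samples
assert len(x) == len(istft(stft(x, **kw), **kw, N=len(x)))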
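
One general reason an inverse STFT needs the original length: when the number of frames comes from an integer division, several input lengths map to the same frame count, so the frame count alone cannot identify the length. The sketch below assumes the rule 1 + n // hop_len (a common centered-STFT convention). ssqueezepy's actual rule and padding details are not stated here, so the exact size of the ambiguity in that library may differ.

```python
def n_frames(n_samples, hop_len):
    # Assumed frame-count rule for a centered STFT (not taken from ssqueezepy).
    return 1 + n_samples // hop_len


hop_len = 128
target = n_frames(48000, hop_len)

# All signal lengths in a nearby range that yield the same number of frames.
lengths = [n for n in range(47900, 48200) if n_frames(n, hop_len) == target]
print(target, min(lengths), max(lengths))
```

Every length in that list produces an STFT of the same shape, which is why the inverse has to be told N explicitly.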

However, I just noticed there's a precision issue with float32 and time-localized windows (for which the default window=None often qualifies), since having one frame fewer in ssqueezepy involves dividing by the window tail. This only affects a small portion of the rightmost boundary, but it should still be documented or warned about.
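
A rough NumPy illustration of why dividing by a window tail is sensitive to single precision. It is a sketch under stated assumptions, not ssqueezepy's code: a periodic Hann window stands in for a time-localized window, a single frame is inverted by direct division, and float32 is emulated by rounding the spectrum to complex64.

```python
import numpy as np

n_fft = 512
rng = np.random.default_rng(0)
x = rng.standard_normal(n_fft)

# Periodic Hann as a stand-in for a time-localized window (assumption).
# Index 0 is skipped below because the window is exactly zero there.
w = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(n_fft) / n_fft)

spec64 = np.fft.rfft(x * w)
spec32 = spec64.astype(np.complex64)  # emulate single-precision STFT output


def invert(spec):
    # Undo the window by division; rounding errors grow where w is tiny.
    frame = np.fft.irfft(spec.astype(np.complex128), n=n_fft)
    return frame[1:] / w[1:]


err64 = np.abs(invert(spec64) - x[1:])
err32 = np.abs(invert(spec32) - x[1:])

tail = slice(0, 8)        # samples where the window is near zero
center = slice(248, 264)  # samples where the window is near one
print("float64 tail error:  ", err64[tail].max())
print("float32 tail error:  ", err32[tail].max())
print("float32 center error:", err32[center].max())
```

The rounding error in the spectrum is roughly uniform across the frame, so dividing by small window values amplifies it at the edges while leaving the center largely unaffected.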

@OverLordGoldDragon
Owner

From a feature standpoint, this also makes librosa's oversampling preferable for some configurations, yet still less so for others. Both libraries determine the number of STFT frames independently of the window shape, which isn't optimal, but isn't a big deal either. Maybe a future TODO.

@OverLordGoldDragon changed the title from "ssqueezepy performance" to "STFT: streaming performance, output shapes" on Jan 1, 2023

@OverLordGoldDragon
Owner

Appears resolved.
