
ISTFT #27

Open

keunwoochoi opened this issue Mar 14, 2019 · 24 comments
@keunwoochoi
Owner

https://gist.github.com/keunwoochoi/2f349e72cc941f6f10d4adf9b0d3f37e

It works, but I'm probably not going to make a PR to PyTorch (pytorch/pytorch#3775) because it's written in Python (torch.stft is written in C). Just for future reference/usage.
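In outline, it's irfft per frame plus overlap-add. A minimal sketch (not the gist's exact code; it assumes a one-sided complex spectrogram of shape (n_fft // 2 + 1, n_frames) and uses the newer torch.fft.irfft API):

import torch

def istft_overlap_add(spec, n_fft, hop_length, window):
    """irfft each frame back to the time domain, window it, overlap-add.
    No window-envelope normalization here; that comes up later in the thread."""
    n_frames = spec.shape[-1]
    frames = torch.fft.irfft(spec, n=n_fft, dim=0)  # (n_fft, n_frames), real
    signal = torch.zeros(n_fft + hop_length * (n_frames - 1))
    for t in range(n_frames):
        start = t * hop_length
        signal[start:start + n_fft] += frames[:, t] * window
    return signal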

@faroit
Collaborator

faroit commented Mar 14, 2019

Great. Did you test the reconstruction error?

@seungwonpark

I think lines 33-40 can be replaced with a transposed convolution. (I've done that in my private repository, but I also didn't make a PR to PyTorch, since it's not written in C like torch.stft.)
Perhaps I could open-source my private inverse-STFT implementation after some code refactoring. It is also based on librosa's istft implementation. Would that be helpful for us?
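For context, the transposed-convolution formulation looks roughly like this: bake the windowed inverse-DFT basis into a conv_transpose1d kernel so that the stride performs the overlap-add. A sketch under assumed conventions (real/imag stacked as channels, even n_fft), not the repo's actual code:

import math
import torch
import torch.nn.functional as F

def conv_istft(spec, n_fft, hop_length, window):
    """ISTFT as one conv_transpose1d: the kernel holds the windowed
    inverse-DFT basis and the stride does the overlap-add.
    spec: (batch, 2 * (n_fft // 2 + 1), n_frames), real/imag stacked.
    Window-envelope normalization is still needed afterwards."""
    n_bins = n_fft // 2 + 1
    k = torch.arange(n_fft, dtype=torch.float32)
    f = torch.arange(n_bins, dtype=torch.float32)
    phase = 2 * math.pi * f[:, None] * k[None, :] / n_fft
    # One-sided inverse DFT: double every bin except DC and Nyquist.
    scale = torch.full((n_bins, 1), 2.0 / n_fft)
    scale[0] = scale[-1] = 1.0 / n_fft
    # cos for the real channels, -sin for the imaginary channels.
    basis = torch.cat([torch.cos(phase), -torch.sin(phase)], dim=0)
    kernel = (basis * scale.repeat(2, 1) * window).unsqueeze(1)
    return F.conv_transpose1d(spec, kernel, stride=hop_length)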

@keunwoochoi
Owner Author

reconstruction error

@faroit Could you test it yourself? I did, and the error was 0. Not sure if it's for real..

transpose convolution

@seungwonpark Yes, though I think there are pros and cons. torch.rfft has a computational complexity of n log n, while a transposed convolution is n ** 2. But the transposed-conv layer might still be faster thanks to better parallelization -- although the current approach could also be reframed (with reshaping) to remove the loop, so that torch.rfft only executes a couple of times.

That said, we can simply time them and compare :) So yes, please share it! Even better if you can measure both.
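For the timing, a crude harness like the following would do (hypothetical helper; istft_overlap_add and conv_istft refer to the sketches above):

import time

def bench(fn, *args, n_runs=50):
    """Average wall-clock seconds per call, after one warm-up run."""
    fn(*args)  # warm-up
    t0 = time.perf_counter()
    for _ in range(n_runs):
        fn(*args)
    return (time.perf_counter() - t0) / n_runs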

@seungwonpark

@keunwoochoi Thanks for pointing out the difference! I'll let you know when I'm ready to share.

@seungwonpark

Here it is. I've also measured the time performance, and it looks like your implementation is much faster!
https://github.com/seungwonpark/istft-pytorch

By the way, I can't find the actual difference between them. The major difference is that mine uses ifft and yours uses irfft, so I tried using irfft instead of ifft in my implementation, but found no significant difference.

@faroit
Collaborator

faroit commented Mar 14, 2019

@keunwoochoi Cool, I tested yours in a small experiment where I ran the STFT of one library (librosa, scipy, tf, torch) against the ISTFT of another and measured the reconstruction RMSE.

STFT --> ISTFT RMSE
=================================
test_torch --> test_torch 0.21650636
test_torch --> test_scipy 1.08139886e-07
test_torch --> test_librosa 1.1001805855710549e-07
test_torch --> test_tf 1.9412305e-06
test_scipy --> test_torch 0.21650635
test_scipy --> test_scipy 7.398381e-08
test_scipy --> test_librosa 7.647041059862848e-08
test_scipy --> test_tf 1.9329398e-06
test_librosa --> test_torch 0.21650635
test_librosa --> test_scipy 2.0918270288343244e-16
test_librosa --> test_librosa 2.294981695912998e-16
test_librosa --> test_tf 1.9328936e-06
test_tf --> test_torch 0.21650644
test_tf --> test_scipy 4.216027e-06
test_tf --> test_librosa 4.220019682546531e-06
test_tf --> test_tf 4.694827e-06

I think this is because of the combination of the overlap (you used // 4 for the hop), the Hann window, and the missing normalization. Anyway, this is easy to fix...
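The standard fix is librosa-style: divide the overlap-added signal by the sum of the squared, overlapped windows. A minimal sketch of that envelope (same shape conventions as the sketch earlier in the thread; illustrative only):

import torch

def window_sumsquare(window, n_frames, hop_length, n_fft):
    """Envelope of squared, overlapped windows (librosa-style).
    Dividing the overlap-added signal by this removes the constant
    scaling error visible in the torch rows above."""
    env = torch.zeros(n_fft + hop_length * (n_frames - 1))
    w_sq = window ** 2
    for t in range(n_frames):
        start = t * hop_length
        env[start:start + n_fft] += w_sq
    return env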

@faroit
Collaborator

faroit commented Mar 14, 2019

@f0k By the way, do you know a good way to write a unit test for STFT/ISTFT errors due to windowing when masking was applied before the ISTFT, e.g. normalizing by the window instead of the squared window? I've always wondered how to catch these.
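One partial answer, judging from the numbers above: with a Hann window and hop = n_fft // 4, both candidate envelopes (sum of windows vs. sum of squared windows) are constant in the interior but differ (2.0 vs. 1.5), so even a plain round-trip test catches the wrong normalization as a constant scale error. A hedged sketch of such a test, reusing the hypothetical helpers from the earlier sketches:

import torch

def test_istft_roundtrip():
    n_fft, hop = 1024, 256
    window = torch.hann_window(n_fft)
    x = torch.randn(hop * 40 + n_fft)
    spec = torch.stft(x, n_fft, hop_length=hop, window=window,
                      center=False, return_complex=True)
    y = istft_overlap_add(spec, n_fft, hop, window)
    env = window_sumsquare(window, spec.shape[-1], hop, n_fft)
    y = y / env.clamp(min=1e-8)
    # Ignore the boundary half-windows (no pre-padding here).
    err = (x[n_fft:-n_fft] - y[n_fft:-n_fft]).abs().max()
    assert err < 1e-4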

@keunwoochoi
Owner Author

It's probably due to the boundary. I don't pre-pad; if you ignore the first and last half-windows in the eval, it should be much better for the moment.

@nkundiushuti

It would be interesting to compare with https://github.com/wslihgt/pyfasst/tree/master/pyfasst/tftransforms
I have used the STFT implementation of J. L. Durrieu, which uses numpy, as a baseline. I think madmom (the JKU Linz audio framework) also uses something similar.
Oh, by the way, you can find a tentative implementation of CQT using NSGT there, but it's not reliable, and I don't know how you could port it to PyTorch.

@faroit
Collaborator

faroit commented Mar 14, 2019

it would be interesting to compare with https://github.com/wslihgt/pyfasst/tree/master/pyfasst/tftransforms

Cool. There's also https://github.com/nils-werner/stft, which I used for quite a while. I think we should stick to scipy.signal and librosa since they are heavily tested. However, I'm super interested in spotting the differences and discussing which one has the best reconstruction performance... but maybe that's out of scope for this repo ;-)

@seungwonpark

Hi, we recently found out that the time-profiling code in my repo for comparing Choi's implementation (rfft) and mine (deconvolution with parallel ops) was wrong, and we've fixed it.

The new conclusion is: deconvolution is much faster!
Please check https://github.com/seungwonpark/istft-pytorch.

@faroit
Collaborator

faroit commented May 2, 2019

I think we should close this, then, and move this initiative over to pytorch/pytorch#3775.

@seungwonpark are you willing to help out over there?

@keunwoochoi
Owner Author

Yeah, we could close it, and maybe re-open it once we want to work on a wrapper for it.

@faroit faroit closed this as completed May 2, 2019
@keunwoochoi
Owner Author

@faroit Hey, so after closing this I think we sort of lost track. I don't think it's actually necessary to wait for the PyTorch ISTFT. There are still some implementation choices (e.g., deconv like @seungwonpark's vs. rfft as I've done), but maybe we can have our own tentative implementation with a less-tentative API? It seems the critical issues were resolved (faroit/stft-istft-experiments#1), and we can follow the API of STFT: https://github.com/keunwoochoi/torchaudio-contrib/blob/master/torchaudio_contrib/layers.py#L35.

@keunwoochoi keunwoochoi reopened this May 13, 2019
@keunwoochoi
Owner Author

  • Along with adding InverseSTFT, we need more tests for STFT, including a check that the STFT result is correct.

@faroit
Collaborator

faroit commented May 13, 2019

Okay, what about opening a PR with an API proposal to address ISTFT in pytorch/pytorch#3775? Then we could

  • implement ISTFT using the deconv "hack" here
  • switch to a cleaner ISTFT op later without changing the API, once available in pytorch

@keunwoochoi
Owner Author

opening up a PR

You mean a PR here? (Yes, let's do it.)
Or in PyTorch? (Would they care about accepting an API without an implementation? Also, we can easily change it later, at least in this repo.)

@faroit
Collaborator

faroit commented May 13, 2019

In PyTorch.

Yes, we could just add this ISTFT, but it's rather hacky (using conv layers), which would make it hard to keep clean and readable. And once we reached that point, someone would have to start from scratch using libtorch, so it wouldn't be particularly sustainable ;-)

@faroit faroit changed the title gist: inverse-STFT ISTFT May 13, 2019
@keunwoochoi
Owner Author

OK, I'm not really sure they'd care about an API-only proposal. Historically, they've been quite active in adopting what had already been implemented outside of PyTorch (in other words, somewhat blindly, if it seems good), at least for audio processing. So how about we do it here, maybe:

def istft(fft, hop, etc etc, mode='conv'):
    """Wrapper for ISTFT; dispatches on `mode`."""

def conv_istft(fft, hop, etc etc):
    """One of the options (the transposed-conv variant)."""
?

It wouldn't block follow-up development while we refine it.
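Fleshed out slightly (parameter names are placeholders, not a settled API), the wrapper would just dispatch, which is also what makes the later backend swap-out painless:

def istft(spec, n_fft, hop_length, window, mode='conv'):
    """Stable, STFT-mirroring API; `mode` picks a replaceable backend."""
    if mode == 'conv':
        return conv_istft(spec, n_fft, hop_length, window)
    raise ValueError('unknown istft mode: {}'.format(mode))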

@keunwoochoi
Owner Author

(continued)

...Or, BOTH?! Since we have control and flexibility here -- and the API for ISTFT would be pretty much the same as that of STFT -- let me add a tentative one (or persuade @seungwonpark to add one here :) while asking PyTorch. @faroit Could you open a PR for that?

@drscotthawley
Contributor

Minor clarification question: if the STFT coefficients are allowed to evolve and train via autograd (I assume they are, yes?), then is the idea for this ISTFT module to be either

(a) sort of a 'trainable' ISTFT which may not remain a perfect inverse of the trainable STFT, i.e., one that is independent of the STFT routine? (@seungwonpark's deconv version looks that way)
or
(b) a routine that is somehow mathematically tied to the (trainable) STFT and always guaranteed to invert it 'perfectly'?

I'm guessing (a), yes? Mainly because I have no idea how you'd do (b), I can think of plenty of uses for (a), and proper training could enforce arbitrary levels of inversion quality if you wanted it.

@keunwoochoi
Owner Author

I also guess it'd be (a) in this case, i.e., the provided ISTFT is a static, non-trained module.

@drscotthawley
Contributor

Ohhh, static & non-trained. Interesting. Thanks for the clarification.

@keunwoochoi
Owner Author

ISTFT is being developed in the main repo pytorch/audio#130
