Separate audio transformations into a different module? #16

Open
turian opened this issue Mar 7, 2021 · 4 comments

@turian
Contributor

turian commented Mar 7, 2021

This repo is very useful. However, I am doing repeated evaluations over a dataset with different splits, and it would be more convenient if there were a module that computed the audio representation. The audio distance module would then be much simpler, since it would just compute distances over calls to the audio representations.

This would also be useful because sometimes you want to retrieve the full pairwise matrix of distances, not just the mean distance.
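
Something like this, just as a sketch (the `STFTRepresentation` name and interface are made up here, not part of auraloss):

```python
import torch

class STFTRepresentation(torch.nn.Module):
    """Hypothetical standalone representation module: audio in, magnitudes out."""

    def __init__(self, fft_size=1024, hop_length=256, win_length=1024):
        super().__init__()
        self.fft_size = fft_size
        self.hop_length = hop_length
        self.win_length = win_length
        self.register_buffer("window", torch.hann_window(win_length))

    def forward(self, x):
        # x: (batch, samples) -> (batch, bins, frames) STFT magnitudes
        X = torch.stft(
            x,
            self.fft_size,
            self.hop_length,
            self.win_length,
            window=self.window,
            return_complex=True,
        )
        return X.abs()

# Compute representations once, reuse across splits; the distance module
# would then only ever see these tensors, never the raw audio.
```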

@turian
Contributor Author

turian commented Sep 11, 2022

For example, `torch.cdist`. And I find myself needing this again :)

@turian
Contributor Author

turian commented Sep 11, 2022

Or, as a workaround: if I pass in a batch of input audio and a batch of output audio, I get back an in x out tensor of distances.
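
i.e., something like this (sketch only; `rep` stands in for whatever the representation module would return, flattened per item):

```python
import torch

def rep(x, fft_size=1024, hop_length=256):
    # Stand-in representation: flattened STFT magnitudes per item.
    window = torch.hann_window(fft_size)
    X = torch.stft(x, fft_size, hop_length, window=window, return_complex=True)
    return X.abs().flatten(1)  # (batch, bins * frames)

a = rep(torch.randn(8, 16000))  # batch of 8 "input" clips
b = rep(torch.randn(5, 16000))  # batch of 5 "output" clips
dists = torch.cdist(a, b)       # (8, 5): the full in x out distance matrix
```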

@csteinmetz1
Owner

csteinmetz1 commented Sep 13, 2022

I am still interested in providing this kind of interface. We can imagine creating a loss by composing a transform (e.g., STFT) and a distance function (e.g., L2). This would also be great if we want to support pretrained audio representations as more complex transforms. We could separate the transforms into their own modules, which should enable the use case you are interested in. I'm not sure I have the bandwidth to work on this now, though, and I worry it needs to be done carefully so as not to negatively impact users of the current API.

For now, as you mentioned, the easiest thing we might be able to do is return the transformed inputs and targets as an additional return value, which you could then use in your downstream evaluation.
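
Roughly what I'm imagining (a sketch only; the names and signatures here are illustrative, not a concrete API proposal):

```python
import torch

class TransformDistanceLoss(torch.nn.Module):
    """Hypothetical composable loss: apply a transform, then a distance."""

    def __init__(self, transform, distance):
        super().__init__()
        self.transform = transform
        self.distance = distance

    def forward(self, input, target, return_transformed=False):
        x = self.transform(input)
        y = self.transform(target)
        loss = self.distance(x, y)
        if return_transformed:
            # The short-term workaround: expose the transformed tensors
            # so downstream evaluation can reuse them (e.g. with torch.cdist).
            return loss, x, y
        return loss

# e.g. an STFT-magnitude transform with an L2 distance:
# loss_fn = TransformDistanceLoss(stft_magnitude, torch.nn.functional.mse_loss)
```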

@turian
Contributor Author

turian commented Sep 13, 2022

Right. The main gnarly bit: if you have a multi-scale STFT representation, how do you correctly scale each FFT size so that the scores come out identical to the existing auraloss scores? (If you could provide any tips, I could try to kludge something together.)
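
For reference, here is the kind of kludge I mean (a sketch under assumptions: it averages a plain L2 over per-resolution log-magnitudes, so it will *not* reproduce auraloss's `MultiResolutionSTFTLoss` exactly; in particular, the spectral convergence term normalizes by each target's magnitude norm, which doesn't reduce to a simple `cdist`):

```python
import torch

def stft_logmag(x, fft_size, hop_length, win_length):
    # (batch, samples) -> (batch, bins * frames) flattened log-magnitudes
    window = torch.hann_window(win_length)
    X = torch.stft(x, fft_size, hop_length, win_length,
                   window=window, return_complex=True)
    return torch.log(X.abs() + 1e-8).flatten(1)

def pairwise_multi_scale(inputs, targets,
                         resolutions=((1024, 256, 1024),
                                      (2048, 512, 2048),
                                      (512, 128, 512))):
    # Average the per-resolution pairwise L2 matrices. Dividing each matrix
    # by sqrt(feature_dim) is one guess at making scores comparable across
    # FFT sizes; matching auraloss exactly is the open question here.
    mats = []
    for fft_size, hop_length, win_length in resolutions:
        a = stft_logmag(inputs, fft_size, hop_length, win_length)
        b = stft_logmag(targets, fft_size, hop_length, win_length)
        mats.append(torch.cdist(a, b) / a.shape[1] ** 0.5)
    return torch.stack(mats).mean(0)  # (n_in, n_out)
```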
