How to convert fbank features back to audio ? #15

Closed
linmou opened this issue Aug 17, 2022 · 3 comments
Labels
question Further information is requested

Comments

@linmou

linmou commented Aug 17, 2022

The fbank features reconstructed by SSAST are not straightforward to interpret. How can I transform them back into raw audio for further analysis?

@YuanGongND
Owner

Hi there,

The goal of the reconstruction loss here is only to force the model to learn a good audio representation. We did not intend the model to be a strong reconstructor. If you want to convert spectrograms back to waveforms, you will need a vocoder (not included in this repo).

-Yuan
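
For readers landing on this thread: besides a neural vocoder, a rough classical way to turn log-mel features into a waveform is to invert the mel filterbank with a pseudo-inverse and then estimate phase with Griffin-Lim. The sketch below is not part of this repo and is not the maintainer's suggestion. It assumes the features are natural-log mel power energies of shape `(frames, mel_bins)`, 16 kHz audio, a 10 ms hop, and a 512-point FFT. The function names (`mel_filterbank`, `fbank_to_audio`, and so on) are hypothetical, and the triangular filters only approximate Kaldi-style fbank, so expect noticeably degraded audio.

```python
import numpy as np

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular mel filters (HTK mel scale), shape (n_mels, n_fft // 2 + 1).
    Assumption: only an approximation of the Kaldi-style filterbank."""
    mel_max = 2595.0 * np.log10(1.0 + (sr / 2) / 700.0)
    mel_pts = np.linspace(0.0, mel_max, n_mels + 2)
    hz_pts = 700.0 * (10.0 ** (mel_pts / 2595.0) - 1.0)
    freqs = np.linspace(0.0, sr / 2, n_fft // 2 + 1)
    fb = np.zeros((n_mels, freqs.size))
    for i in range(n_mels):
        lo, mid, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def stft(x, n_fft, hop, window):
    """Centered STFT, returns complex array of shape (frames, n_fft // 2 + 1)."""
    x = np.pad(x, n_fft // 2, mode="reflect")
    n_frames = 1 + (len(x) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    return np.fft.rfft(x[idx] * window, axis=1)

def istft(spec, n_fft, hop, window):
    """Inverse STFT by windowed overlap-add with squared-window normalization."""
    frames = np.fft.irfft(spec, n=n_fft, axis=1) * window
    n_frames = spec.shape[0]
    out = np.zeros(n_fft + hop * (n_frames - 1))
    norm = np.zeros_like(out)
    for t in range(n_frames):
        out[t * hop : t * hop + n_fft] += frames[t]
        norm[t * hop : t * hop + n_fft] += window ** 2
    out /= np.maximum(norm, 1e-8)
    return out[n_fft // 2 : -(n_fft // 2)]  # drop the centering padding

def griffin_lim(mag, n_fft, hop, n_iter=32, seed=0):
    """Estimate a phase consistent with the given magnitude spectrogram."""
    rng = np.random.default_rng(seed)
    window = np.hanning(n_fft)
    phase = np.exp(2j * np.pi * rng.random(mag.shape))  # random initial phase
    for _ in range(n_iter):
        audio = istft(mag * phase, n_fft, hop, window)
        rebuilt = stft(audio, n_fft, hop, window)
        phase = rebuilt / np.maximum(np.abs(rebuilt), 1e-8)
    return istft(mag * phase, n_fft, hop, window)

def fbank_to_audio(log_mel, sr=16000, n_fft=512, hop=160, n_iter=32):
    """log_mel: (frames, n_mels) natural-log mel power (assumed). Returns 1-D audio."""
    mel_power = np.exp(log_mel)
    fb = mel_filterbank(sr, n_fft, log_mel.shape[1])
    # Least-squares inversion of mel = linear @ fb.T, clipped to be non-negative.
    lin_power = np.maximum(mel_power @ np.linalg.pinv(fb).T, 0.0)
    return griffin_lim(np.sqrt(lin_power), n_fft, hop, n_iter)
```

If your features were mean/std normalized before being fed to the model, undo that normalization before calling `fbank_to_audio`. The output has `hop * (frames - 1)` samples and an arbitrary overall gain, so rescale it before saving to a file.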

YuanGongND added the question label Aug 17, 2022
@linmou
Author

linmou commented Aug 19, 2022

Thanks for your warm reply.
Is there a vocoder you would recommend? I want to invert fbank features back to audio.

@YuanGongND
Owner

Hi there,

I am not familiar with vocoders; you can check the GitHub topic list: https://github.com/topics/vocoder. Note that most of these are designed for TTS (speech) rather than general audio.

-Yuan

linmou closed this as completed Jan 9, 2023