distorted spectrograms after model #3

thepowerfuldeez · 2021-08-31T19:55:29Z

Hi! I tried your pretrained checkpoint in Colab. In the first case the output spectrogram contains some extra values, and in the second case the harmonics are broken.
The first audio is 44100 Hz real speech (converted to 24 kHz and then upscaled to 48 kHz).
The second audio is the output of a text-to-speech system (22050 Hz, upscaled to 44100 Hz).

I don't hear any noticeable difference in either audio. Is this expected?

These are the spectrograms as shown in Audacity (mel scale). The upper one is before, the lower one is after.
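For reference when reading the spectrograms, the sample rates quoted above fix where genuinely new content can appear: a signal sampled at rate r carries frequencies only up to r/2 (the Nyquist limit). The sketch below is just that arithmetic for the two cases in this report. The function names are hypothetical, and it says nothing about what the model itself does inside those bands.

```python
from fractions import Fraction


def nyquist_hz(sample_rate_hz):
    """Highest frequency a signal sampled at this rate can represent."""
    return sample_rate_hz / 2


def generated_band_hz(input_rate_hz, output_rate_hz):
    """Band the low-rate input cannot contain but the high-rate output can.

    Hypothetical helper for illustration; not part of the repository.
    """
    return nyquist_hz(input_rate_hz), nyquist_hz(output_rate_hz)


def resample_ratio(src_rate_hz, dst_rate_hz):
    """Exact up/down ratio for a rational resampler, in lowest terms."""
    return Fraction(dst_rate_hz, src_rate_hz)


# The two cases from the report: (model input rate, model output rate).
cases = {
    "real speech": (24000, 48000),
    "text-to-speech": (22050, 44100),
}

for name, (rate_in, rate_out) in cases.items():
    low, high = generated_band_hz(rate_in, rate_out)
    print(f"{name}: new content can only appear between {low:.0f} and {high:.0f} Hz")

# Ratio for the 44100 Hz -> 24000 Hz conversion applied to the first audio.
print(resample_ratio(44100, 24000))
```

Under these numbers, energy above 12 kHz in the first output (11.025 kHz in the second) was necessarily produced by the model, while any change below those limits is a modification of content that was already in the input.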

zkx06111 · 2021-09-01T02:35:38Z

I'm not sure. Can you please share the audio file with me?

thepowerfuldeez · 2021-09-01T07:53:32Z

Alright.
before: https://voca.ro/1meOfwM2dIEw
after: https://voca.ro/11L19CihKeI6

zkx06111 · 2021-09-01T09:38:40Z

I think it's probably because our training data is not very noisy, while your input audio is noisy.

thepowerfuldeez · 2021-09-01T10:32:00Z

The second audio is generated by text-to-speech and is clean, yet you can see that its harmonics became distorted. I cannot share the second audio.
