
an analysis-synthesis example #10

Closed
zhouyong64 opened this issue Jun 17, 2021 · 3 comments

@zhouyong64

There is a lot of material in the tutorials, but I can't find an analysis-synthesis example for an existing wav file. I mean: given a wav file, first analyze it, then use the results of the analysis to synthesize a new wav file.

@TonyWangX
Member

TonyWangX commented Jun 17, 2021

Hello, thanks for the nice suggestion.

I have updated the tutorial notebooks s1_demostration_hn-nsf.ipynb and s2_demonstration_music_nsf.ipynb in the newfunctions branch: https://github.com/nii-yamagishilab/project-NN-Pytorch-scripts/tree/newfunctions/
Section 2.2 of each notebook now shows how to:

  1. Prepare input data: extract F0 and mel-spectrogram from the input waveform.
  2. Generate sample: synthesize a waveform given the extracted features.

Many good tools are available for F0 extraction; for simplicity I used pYAAPT: https://github.com/bjbschmitt/AMFM_decompy/blob/master/amfm_decompy/pYAAPT.py
The mel-spectrogram extractor for music data is based on librosa.
The mel-spectrogram extractor for speech data is based on numpy (for historical reasons).
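For readers who want to see what the numpy-based analysis step looks like, here is a minimal numpy-only mel-spectrogram sketch. The frame settings (16 kHz, 400-sample window, 160-sample hop, 80 mel bands) are illustrative assumptions, not the repo's actual configuration; to stay compatible with the pre-trained models you should use the repo's own extractors.

```python
# Minimal log mel-spectrogram sketch (numpy only).
# All frame settings below are assumptions for illustration.
import numpy as np

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular mel filterbank matrix of shape (n_mels, n_fft//2 + 1)."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):       # rising slope
            if center > left:
                fb[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):      # falling slope
            if right > center:
                fb[i - 1, k] = (right - k) / (right - center)
    return fb

def melspec(x, sr=16000, n_fft=400, hop=160, n_mels=80):
    """Log mel-spectrogram of a mono waveform x, shape (frames, n_mels)."""
    frames = [x[i:i + n_fft] * np.hanning(n_fft)
              for i in range(0, len(x) - n_fft + 1, hop)]
    mag = np.abs(np.fft.rfft(np.stack(frames), axis=1))  # (T, n_fft//2 + 1)
    return np.log(mag @ mel_filterbank(sr, n_fft, n_mels).T + 1e-10)

# Toy input: one second of a 440 Hz tone instead of a real wav file.
x = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
m = melspec(x)  # (98, 80) with these settings
```

For F0, the pYAAPT API linked above works on a `SignalObj` loaded from the wav file; in practice the two feature streams must be frame-aligned before synthesis, which the repo's notebooks handle for you.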

The dependencies should be covered by ./tutorials/env-cpu.yml
Please let me know whether the notebooks work on your computer.

@zhouyong64
Author

Thanks for your quick response. Should I use sub_get_mel.py to compute the mel? I just want to make sure I use the same mel features that your model was trained on.

@TonyWangX
Member

TonyWangX commented Jun 17, 2021

If you use the pre-trained speech model in this repo, then, yes, please use sub_get_mel.py so that the mel is compatible with the pre-trained speech model.

There are two ways to use it:

  1. Load it as a module and call get_melsp() to extract the mel. This is how s1_demostration_hn-nsf.ipynb works.
  2. Run it as a script to extract the mel and save it to disk: $ python sub_get_mel.py PATH_INPUT_WAV PATH_OUTPUT_MEL
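A note on the saved file: the exact on-disk format written by sub_get_mel.py is not stated in this thread. The sketch below assumes a raw float32 binary dump, a common convention in speech toolkits, reshaped with a known mel dimension at load time; verify against sub_get_mel.py itself before relying on it.

```python
# Round-trip a mel matrix through a raw float32 file.
# ASSUMPTION: flat float32 layout; check sub_get_mel.py for the real format.
import numpy as np
import os
import tempfile

n_mels = 80
mel = np.random.randn(98, n_mels).astype(np.float32)  # stand-in mel matrix

path = os.path.join(tempfile.mkdtemp(), "utt1.mel")
mel.tofile(path)                                      # flat float32 dump

# The mel dimension must be known to recover the 2-D shape.
loaded = np.fromfile(path, dtype=np.float32).reshape(-1, n_mels)
```

If the repo stores mels differently (e.g. with a header, or transposed), adjust the dtype, shape, and axis order accordingly.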

Finally, if you want to train a new model, you can of course use any feature-extraction tool you prefer.
