Pytorch synthesizer #472

Merged
282 commits merged into from Feb 14, 2021

@ghost ghost commented Aug 6, 2020

I have taken the tacotron model from fatchord/WaveRNN and integrated it with this repo (#447). Aside from the new synthesizer model format (.pt), this change should be completely transparent to the end user.

Major Changes

  • Toolbox no longer requires tensorflow 🎉
  • Synthesizer is tacotron1 instead of tacotron2
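
To illustrate the first point, here is a minimal sketch (not part of this PR) of how one could confirm that an environment has PyTorch available and does not depend on TensorFlow being installed. The helper names `installed` and `check_toolbox_env` are hypothetical; the only assumption about the toolbox is that `torch` is its required backend after this change.

```python
from importlib.util import find_spec


def installed(packages):
    """Map each top-level package name to whether it can be found for import."""
    return {name: find_spec(name) is not None for name in packages}


def check_toolbox_env(required=("torch",), not_required=("tensorflow",)):
    """Return (status, missing).

    status  -- dict of package name -> bool for every package checked
    missing -- list of required packages that were not found
    The default package lists are assumptions made for illustration.
    """
    status = installed(tuple(required) + tuple(not_required))
    missing = [name for name in required if not status[name]]
    return status, missing


if __name__ == "__main__":
    status, missing = check_toolbox_env()
    for name, found in status.items():
        print(f"{name}: {'found' if found else 'not found'}")
    if missing:
        print("Missing required packages:", ", ".join(missing))
```

`find_spec` only locates the package without importing it, so the check stays fast even when a heavy framework is installed.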

Pretrained Model

A download link and instructions are provided here: #472 (comment)

Task List

  • Inference
  • Training
  • Update preprocessing scripts to use synthesizer_pt
  • Cleanup files in synthesizer_pt
  • Match repo code style
  • Move synthesizer_pt to synthesizer (no more tensorflow)
  • Testing
  • Release pretrained model
  • Update documentation
  • Review
    • Retest as needed if code changes made
  • Merge into master branch of repo

blue-fish added 30 commits July 30, 2020 09:39
…es with WaveRNN synthesizer pretrained model
@ghost
Author

ghost commented Feb 7, 2021

Development of the pytorch synthesizer is complete. Please review the changes.

The pretrained model release consists of the pretrained encoder, along with synthesizer and vocoder models I have developed. Audio samples and model details: https://blue-fish.github.io/experiments/RTVC-7.html

@ghost ghost requested a review from CorentinJ February 7, 2021 17:41
@CorentinJ
Owner

Wow, amazing work. I'll do my best to find the time to review within this week.

@Garvit-32

Garvit-32 commented Feb 11, 2021

Hi @blue-fish, amazing work!
Can you tell me why you are using tacotron1 instead of tacotron2? Have you tested tacotron2?

@ghost
Author

ghost commented Feb 11, 2021

@Garvit-32
The main reason to use Taco1 is that it shares a codebase with the vocoder (fatchord/WaveRNN). The commonality makes it a lot easier to write the training script and integrate it with the rest of the repo. Now that the supporting infrastructure is in place, the model can be switched with relative ease.

I've already shared my thoughts on Taco1 vs Taco2 in #472 (comment). They're close in performance. I prefer Taco2.

Owner

@CorentinJ CorentinJ left a comment

Alright, well, I checked the code and played around with the toolbox. I think it all seems good.

Thank you again for your amazing work, feel free to merge whenever.

@121898

121898 commented Feb 26, 2021

Development of the pytorch synthesizer is complete. Please review the changes.

The pretrained model release consists of the pretrained encoder, along with synthesizer and vocoder models I have developed. Audio samples and model details: https://blue-fish.github.io/experiments/RTVC-7.html

Hi, thanks for the amazing work. May I know what the "Google" rows mean on your demo page? Thanks

@ghost
Author

ghost commented Feb 26, 2021

@121898 The "Google" rows are the audio samples from the paper arXiv:1806.04558. https://google.github.io/tacotron/publications/speaker_adaptation/index.html

This pull request was closed.