[GAN SVS] Add VISinger2, UHifiGAN, Avocodo #5123
Remove ReLU to avoid vanishing gradients.
Combine melody information in the pitch predictor.
This is not a bug but an improvement in VISinger2; I will add it later.
Note that there is a bug when changing the downsample parameters.
This change is for both gan_tts and gan_svs.
Hi @jerryuhoo, could you let me know when you have finished the development? Then I can also help fix the CI issue in the import test for you.
The listed models and functions are done, but I'm still investigating the performance gap. It is caused by either the posterior encoder or the vocoder, but I have not found the bug yet.
Sorry that I did not find time to fix the CI. Let's discuss the details in today's meeting.
Some code can be improved, such as the ddsp module (part of the ddsp code is unused) and the modules in MFD. For example, in visinger2_vocoder.py, we could perhaps use LogMelFbank instead of TorchSTFT, but LogMelFbank does not support domain="double", which considers both linear and log fbanks.
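To make the domain="double" idea concrete, here is a minimal sketch of one plausible reading: compute filterbank features from an STFT magnitude and return the linear and log versions stacked along the channel axis. The function name `double_domain_fbank`, its parameters, and the stacking axis are assumptions for illustration, not the code in visinger2_vocoder.py; the filterbank matrix is assumed to be precomputed by the caller.

```python
import torch


def double_domain_fbank(wav, fbank, n_fft=1024, hop_length=256, eps=1e-5):
    """Return linear and log filterbank features stacked on the channel axis.

    Hypothetical helper (not part of ESPnet).
    wav:   (batch, samples) waveform
    fbank: (n_bins, n_fft // 2 + 1) filterbank matrix, assumed precomputed
    """
    window = torch.hann_window(n_fft, device=wav.device)
    spec = torch.stft(
        wav,
        n_fft=n_fft,
        hop_length=hop_length,
        window=window,
        return_complex=True,
    )
    mag = spec.abs()  # (batch, n_fft // 2 + 1, frames)
    linear = torch.matmul(fbank, mag)  # (batch, n_bins, frames)
    log = torch.log(torch.clamp(linear, min=eps))  # clamp avoids log(0)
    # Assumption: "double" means both domains concatenated channel-wise.
    return torch.cat([linear, log], dim=1)  # (batch, 2 * n_bins, frames)


# Usage with a placeholder (random, non-mel) filterbank, for shape checking only.
wav = torch.randn(2, 4096)
fbank = torch.rand(80, 1024 // 2 + 1)
feats = double_domain_fbank(wav, fbank)
```

With 80 filterbank bins, the output has 160 channels: the first 80 are linear-domain values and the last 80 are their logarithms.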
You may add TODOs to the codebase. As this PR contains an important bug fix, I believe we can merge it quickly once the CI is fixed.
I've fixed the import test and a few comment errors for you.
Last request for this PR: can we fix the CI tests for the imported functions (https://github.com/espnet/espnet/actions/runs/5043122127/jobs/9044514638)? After that, I can merge it.
Codecov Report
@@            Coverage Diff            @@
##           master    #5123     +/-   ##
=========================================
  Coverage   74.99%   75.00%
=========================================
  Files         618      630     +12
  Lines       55603    56816   +1213
=========================================
+ Hits        41700    42614    +914
- Misses      13903    14202    +299
... and 7 files with indirect coverage changes
@@ -566,7 +566,7 @@ def apply_spectral_norm(self):
         """Apply spectral normalization module from all of the layers."""

         def _apply_spectral_norm(m: torch.nn.Module):
-            if isinstance(m, torch.nn.Conv2d):
+            if isinstance(m, torch.nn.Conv1d):
Hi @jerryuhoo, could you double-check issue #5215?
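For context on why the `isinstance` check in this hunk matters: a helper that tests for only one convolution type silently skips layers of the other type. The sketch below is a hypothetical standalone variant, not the code in this PR, that checks both `Conv1d` and `Conv2d` and reports how many layers were wrapped, so a mismatch becomes visible. It uses `torch.nn.utils.spectral_norm`; the function name and return value are assumptions for illustration.

```python
import torch


def apply_spectral_norm(model: torch.nn.Module) -> int:
    """Wrap every Conv1d/Conv2d in `model` with spectral normalization.

    Hypothetical helper: returns the number of layers that were wrapped,
    so the caller can detect the case where nothing matched.
    """
    wrapped = 0

    def _apply(m: torch.nn.Module):
        nonlocal wrapped
        if isinstance(m, (torch.nn.Conv1d, torch.nn.Conv2d)):
            # Modifies the module in place (adds `weight_orig` and `weight_u`).
            torch.nn.utils.spectral_norm(m)
            wrapped += 1

    # Module.apply visits every submodule and the module itself.
    model.apply(_apply)
    return wrapped


# Usage: a toy model mixing 1-D conv, 2-D conv, and a linear layer.
model = torch.nn.Sequential(
    torch.nn.Conv1d(1, 4, kernel_size=3),
    torch.nn.Conv2d(1, 4, kernel_size=(3, 1)),
    torch.nn.Linear(4, 4),
)
count = apply_spectral_norm(model)
```

Here `count` is 2: both convolution layers are wrapped and the linear layer is left untouched. A check restricted to a single type would have wrapped only one of them without raising an error.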
@ftshijt @A-Quarter-Mile
This is an update for VISinger that adds multiple modules.