-
Notifications
You must be signed in to change notification settings - Fork 338
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add VitsSVC implementation #14
Conversation
Conflicts: utils/mel.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two additional general comments:
- Add Amphion's copyright information for all the newly added files.
- Add some comments (descriptions and detail instructions) for some functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this file the same as egs/_template/run.sh
? You can create a soft link to it. It is easy for our future repair.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually not 🥲 vits module needs special Cython module initialization other than common run.sh
for svc task:
https://github.com/viewfinder-annn/AmphionPublic/blob/main/egs/svc/VitsSVC/run.sh#L9
So is it feasible to split Cython module initialization into another .sh file? In this case we can indicate this in readme file and reuse egs/_template/run.sh
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
return z, m, logs, x_mask | ||
|
||
class SynthesizerTrn(nn.Module): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this class same as
Amphion/models/tts/vits/vits.py
Line 151 in 42ca016
class SynthesizerTrn(nn.Module): |
We need to merge them into a common one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the internal encoding is different, one for text and acoustic condition the other, which affects forward/infer functions either. So in my opinion it can not be merged.
.gitignore
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this file necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, as discussed before, whisper extractor needs modules/whisper_extractor/assets/mel_filters.npz
to extract feature properly, while this file is initially ignored by the universal *.npz
configuration. Adding this line can make this file preserved by git version control.
Any audio samples to support this PR? |
Here are two samples which are converted from M4Singer dataset to Opencpop dataset, with original sample and SoVits4.1 model's output.
|
egs/svc/VitsSVC/run.sh
Outdated
export PYTHONIOENCODING=UTF-8 | ||
|
||
# monotonic_align | ||
cd $work_dir/modules/monotonic_align |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Modify the modules.vits
. Let @lmxue know it.
Add implementation of VITS-based model for Singing Voice Conversion task