Add VitsSVC implementation #14

Merged (18 commits) on Dec 8, 2023
Conversation

viewfinder-annn (Collaborator):
Add an implementation of a VITS-based model for the Singing Voice Conversion (SVC) task.

@RMSnow (Collaborator) left a comment:

Two additional general comments:

  1. Add Amphion's copyright header to all the newly added files.
  2. Add comments (descriptions and detailed instructions) to the key functions.

config/vitssvc.json — review thread (resolved)
Collaborator:
Is this file the same as egs/_template/run.sh? If so, you could create a soft link to it, which would make future maintenance easier.

viewfinder-annn (Collaborator, Author):

Actually not 🥲 The VITS module needs a special Cython module initialization that the common run.sh for the SVC task does not perform:
https://github.com/viewfinder-annn/AmphionPublic/blob/main/egs/svc/VitsSVC/run.sh#L9

Would it be feasible to split the Cython module initialization into a separate .sh file? In that case we could point to it in the README and reuse egs/_template/run.sh.
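As a sketch, the split proposed here might look like the following standalone script. The file name `build_monotonic_align.sh` and the exact directory layout are assumptions for illustration; the in-place Cython build is the standard `setup.py build_ext --inplace` step that VITS-style repos use for monotonic_align.

```shell
#!/bin/bash
# Hypothetical helper script, e.g. egs/svc/VitsSVC/build_monotonic_align.sh
# (name assumed), kept separate so the common egs/_template/run.sh can be
# reused unchanged by the VitsSVC recipe.
work_dir=$(pwd)
ma_dir="$work_dir/modules/monotonic_align"

if [ -d "$ma_dir" ]; then
    # Compile the Cython extension in place, as VITS requires.
    cd "$ma_dir"
    python setup.py build_ext --inplace
    cd "$work_dir"
else
    echo "monotonic_align not found under $ma_dir; skipping Cython build"
fi
```

The VitsSVC README could then instruct users to run this script once before launching the shared run.sh.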

models/svc/base/svc_dataset.py — review thread (resolved)
models/svc/vits/vits.py — review thread (resolved)
@RMSnow (Collaborator) left a comment:

Use Black for formatting your code. See this blog.

models/svc/vits/vits.py — review thread (resolved)

return z, m, logs, x_mask

class SynthesizerTrn(nn.Module):
Collaborator:

Is this class the same as

class SynthesizerTrn(nn.Module):

If so, we need to merge them into a common one.

viewfinder-annn (Collaborator, Author):

No, the internal encoding is different: one encodes text while the other encodes acoustic conditions, which affects the forward/infer functions as well. So in my opinion they cannot be merged.

Additional review threads on models/svc/vits/vits.py and models/svc/vits/vits_trainer.py (resolved)
.gitignore — review thread

Collaborator:

Is this file necessary?

viewfinder-annn (Collaborator, Author):

Yes. As discussed before, the whisper extractor needs modules/whisper_extractor/assets/mel_filters.npz to extract features properly, but that file is excluded by the blanket *.npz ignore pattern. Adding this line re-includes the file so it stays under git version control.
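As a sketch, the relevant .gitignore lines would look something like this (the surrounding rules in Amphion's actual .gitignore are assumed; note that a `!` re-include must appear after the broad pattern it overrides, and cannot resurrect a file whose parent directory is itself ignored):

```
# Ignore all NumPy archives by default...
*.npz
# ...but keep the mel filter bank that the whisper extractor depends on.
!modules/whisper_extractor/assets/mel_filters.npz
```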

@zhizhengwu (Collaborator):
Any audio samples to support this PR?

viewfinder-annn (Collaborator, Author) commented Dec 8, 2023:

> @zhizhengwu: Any audio samples to support this PR?

Here are two samples converted from the M4Singer dataset to the Opencpop target singer, alongside the original samples and the SoVits4.1 model's outputs.
The VitsSVC model uses ContentVec and Whisper features with HiFi-GAN as the generator, and is trained from scratch for 110k steps.
The SoVits4.1 model uses Whisper features with NSF-HiFiGAN as the generator, and is fine-tuned for 110k steps on a base model pretrained for 330k steps.

|  | Tenor-3 → opencpop_female1 | Alto-6 → opencpop_female1 |
| --- | --- | --- |
| Original | Tenor-3 | Alto-6 |
| SoVits 4.1 | Tenor-3_SoVits4.1_opencpop_female1 | Alto-6_SoVits4.1_opencpop_female1 |
| VitsSVC | Tenor-3_VitsSVC_opencpop_female1 | Alto-6_VitsSVC_opencpop_female1 |

egs/svc/VitsSVC/run.sh — review thread (resolved)
export PYTHONIOENCODING=UTF-8

# monotonic_align
cd $work_dir/modules/monotonic_align
Collaborator:

Modify modules.vits accordingly, and let @lmxue know about it.

@RMSnow RMSnow merged commit 554b791 into open-mmlab:main Dec 8, 2023