add core code of valle #4

lmxue · 2023-12-01T16:09:46Z

Vall-E is a zero-shot TTS architecture that uses a neural codec language model with discrete codes. This PR adds Vall-E support to Amphion.

bins/tts
config/valle.json
egs/tts/VALLE
models/tts/valle

RMSnow

I am leaving the comments below as improvements for a follow-up.

RMSnow · 2023-12-01T16:15:37Z

egs/tts/VALLE/prompt_examples/260_123440_000010_000004.normalized.txt

Replace the complex file name with an easy-to-understand one.

RMSnow · 2023-12-01T17:07:34Z

processors/content_extractor.py

-
+from utils.tokenizer import G2PModule, tokenize_text
+from utils.symbol_table import SymbolTable
+from text.g2p import preprocess_english, read_lexicon

 """
    Extractor for content features


Please add comments to the extract_phoneme-related code.

RMSnow · 2023-12-01T17:09:17Z

processors/content_extractor.py

@@ -539,3 +541,49 @@ def extract_utt_content_features_dataloader(cfg, metadata, num_workers):
                )
                for index, utt in enumerate(_metadata):
                    extractor.save_feature(utt, batch_content_features[index])
+
+    if cfg.preprocess.extract_phoneme:


The current code entangles SVC and TTS unnecessarily. Move lines 545-589 into a new function.
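A minimal sketch of the split suggested above, assuming the shared content-feature loop stays untouched and the phoneme step becomes one call into a TTS-only function. The names `extract_shared_content_features`, `extract_utt_phonemes`, and `preprocess_content`, the `Uid`/`Text` metadata keys, and the toy `phonemize` callable are hypothetical and not taken from this PR.

```python
from types import SimpleNamespace


def extract_shared_content_features(cfg, metadata):
    """Stand-in for the existing loop that SVC and TTS both rely on."""
    return [f"content:{utt['Uid']}" for utt in metadata]


def extract_utt_phonemes(cfg, metadata, phonemize):
    """TTS-only step, kept out of the shared loop (hypothetical name)."""
    return {utt["Uid"]: phonemize(utt["Text"]) for utt in metadata}


def preprocess_content(cfg, metadata, phonemize):
    results = {"content": extract_shared_content_features(cfg, metadata)}
    # The TTS-specific branch is a single call, so SVC code paths never see it.
    if getattr(cfg.preprocess, "extract_phoneme", False):
        results["phonemes"] = extract_utt_phonemes(cfg, metadata, phonemize)
    return results


# Toy usage: str.split stands in for a real G2P front end.
cfg = SimpleNamespace(preprocess=SimpleNamespace(extract_phoneme=True))
metadata = [{"Uid": "utt1", "Text": "hello world"}]
out = preprocess_content(cfg, metadata, phonemize=str.split)
```

With this shape, a config that leaves `extract_phoneme` unset returns only the shared features, so the SVC path does not need to know about phonemes at all.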

zhizhengwu · 2023-12-02T01:46:56Z

bins/tts/inference.py

-    print("args: ", args)
-
+    parser = build_parser()
+    VALLEInference.add_arguments(parser)


It looks like VALLEInference is used regardless of the model type.
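One possible way to address this, sketched under assumptions: register model-specific flags only after the model type is known, using a two-pass parse. The `--model_type` flag, its default value, the `--top_k` flag, the `ARGUMENT_HOOKS` registry, and the placeholder `VALLEInference` class are all hypothetical here; the PR may well read the model type from the config file instead.

```python
import argparse


class VALLEInference:
    """Placeholder for the real class; only the argument hook is sketched."""

    @staticmethod
    def add_arguments(parser):
        parser.add_argument("--top_k", type=int, default=10)  # hypothetical flag


# Hypothetical registry: model types without extra flags are simply absent.
ARGUMENT_HOOKS = {"VALLE": VALLEInference.add_arguments}


def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument("--model_type", default="FastSpeech2")  # assumed flag
    parser.add_argument("--text", help="Text to be synthesized")
    return parser


def parse_args(argv):
    parser = build_parser()
    # First pass: read only the model type and ignore flags not yet registered.
    known, _ = parser.parse_known_args(argv)
    hook = ARGUMENT_HOOKS.get(known.model_type)
    if hook is not None:
        hook(parser)
    # Second pass: strict parsing with the model-specific flags in place.
    return parser.parse_args(argv)
```

Calling `parse_args(["--model_type", "VALLE", "--top_k", "5"])` accepts the extra flag, while the same flag with any other model type is rejected by the strict second pass.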

zhizhengwu · 2023-12-02T01:48:14Z

bins/tts/preprocess.py

+    if 'test' not in types: 
+        types.append('test') 
+    if "eval" in dataset:
+        types = ["test"]


This repeats lines 32-39.
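If the same split-selection logic is needed in more than one place, it could live in a single helper that is called wherever it is required. The sketch below mirrors the behaviour of the quoted diff; the helper name `resolve_dataset_types` is hypothetical.

```python
def resolve_dataset_types(dataset, types):
    """Return the splits to process for one dataset (hypothetical helper)."""
    if "eval" in dataset:
        # Evaluation-only datasets have just a test split.
        return ["test"]
    resolved = list(types)  # copy so the caller's list is not mutated
    if "test" not in resolved:
        resolved.append("test")
    return resolved
```

Returning a new list instead of appending in place also avoids surprising the caller when the same `types` list is reused across datasets.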

zhizhengwu · 2023-12-02T01:48:32Z

bins/tts/preprocess.py

    metadata = []
    for dataset_type in types:
        dataset_output = os.path.join(output_path, dataset)
+        # dataset_file = os.path.join(dataset_output, "{}.json".format(dataset_type))
        dataset_file = os.path.join(dataset_output, "{}.json".format(dataset_type))


Does this duplicate line 78?

zhizhengwu · 2023-12-02T01:49:19Z

bins/tts/train.py

@@ -77,12 +93,13 @@ def main():
            new_datasets_list.extend(filter(None, new_datasets))
        cfg.dataset.extend(new_datasets_list)

-    # CUDA settings
+    # # CUDA settings


No need to add one more '#'

zhizhengwu · 2023-12-02T01:50:38Z

We should provide demos/samples in a PR

RMSnow · 2023-12-02T01:39:59Z

config/base.json

Move all TTS-related configs into a dedicated TTS base config JSON.
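To illustrate the layering this implies, here is a minimal sketch of a recursive overlay in which a TTS-level config adds or overrides keys on top of a shared base. How Amphion actually resolves config inheritance is not shown in this thread, so `merge_configs` and the example keys and values are assumptions for illustration only.

```python
import json


def merge_configs(base, override):
    """Recursively overlay `override` on `base` without mutating either."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_configs(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical contents: shared keys stay in the base, TTS-only keys move out.
base_cfg = json.loads('{"preprocess": {"sample_rate": 24000}}')
tts_cfg = json.loads('{"preprocess": {"extract_phoneme": true}}')
valle_cfg = merge_configs(base_cfg, tts_cfg)
```

Under this scheme the shared base never mentions TTS-only options, and other tasks that load only the base are unaffected by them.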

RMSnow · 2023-12-02T01:52:50Z

processors/acoustic_extractor.py

@@ -211,6 +212,11 @@ def __extract_utt_acoustic_features(dataset_output, cfg, utt):
            label = audio_to_label(wav, cfg.preprocess.bits)
            save_feature(dataset_output, cfg.preprocess.label_dir, uid, label)

+        if cfg.preprocess.extract_acoustic_token:


Please do not modify __extract_utt_acoustic_features any further; it is no longer a common extraction pipeline. See extract_utt_acoustic_features_tts (line 221) and extract_utt_acoustic_features_vocoder (line 233) for reference, and move all TTS acoustic feature extraction into the function at line 221.

zhizhengwu · 2023-12-02T02:06:06Z

bins/tts/inference.py

@@ -75,9 +73,9 @@ def build_parser():
    )
    parser.add_argument(
        "--text",
-        help="Text to be synthesized",
+        help="Text",


'Text to be synthesized' is more informative than 'Text'

add core code of valle

a0693ad

lmxue requested review from zhizhengwu, RMSnow, HeCheng0625 and VocodexElysium December 1, 2023 16:09

lmxue and others added 2 commits December 2, 2023 00:49

add modules related to valle model

9f9bfd9

Update optimizers.py

d333659

RMSnow approved these changes Dec 1, 2023

lmxue force-pushed the vall_dev1 branch 2 times, most recently from 02bff49 to d333659 Compare December 1, 2023 18:56

lmxue and others added 2 commits December 2, 2023 02:56

update models optimizer config egs to support valle

375ab7f

Update optimizers.py

0df6ed9

zhizhengwu reviewed Dec 2, 2023

RMSnow requested changes Dec 2, 2023

RMSnow added 2 commits December 2, 2023 10:00

Update scaling.py

dc8e097

Update optimizers.py

28d6750

zhizhengwu reviewed Dec 2, 2023

Update base.json

1b1c26d

RMSnow approved these changes Dec 2, 2023

Update inference.py

42fc01a

zhizhengwu mentioned this pull request Dec 2, 2023

Where is the code for VALLE training and inference? #3

Closed

RMSnow added 2 commits December 2, 2023 10:23

Update README.md

370250f

Update README.md

c071731

RMSnow merged commit eea8473 into open-mmlab:main Dec 2, 2023

RMSnow mentioned this pull request Dec 5, 2023

add_hifitts_data_processor #9

Closed

lmxue commented Dec 1, 2023 • edited by zhizhengwu
