Introduce expressivity_predict, and change pretssel_inference to expressivity_evaluate. #251

kauterry · 2023-12-07T00:58:38Z

This PR does the following:

Adds expressivity_predict, which runs SeamlessExpressive inference on a single audio path.
Packages the current batched expressivity evaluation script as expressivity_evaluate.
Fixes the pad_value bug in unit_extractor reported in #255 ("is it typo in UnitExtractor? self.collate = Collater(pad_value=2, pad_to_multiple=2)").
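
For readers unfamiliar with the two Collater arguments named in #255, here is a minimal plain-Python sketch of what a pad value and a pad-to-multiple setting do during batching. The helper `pad_batch` is hypothetical and is not fairseq2's Collater; it only illustrates that whatever pad_value is chosen ends up inside the batch, so a mistyped value feeds into the model unless the padding is masked out.

```python
from typing import List, Sequence, Tuple


def pad_batch(
    seqs: Sequence[Sequence[float]],
    pad_value: float = 0.0,
    pad_to_multiple: int = 1,
) -> Tuple[List[List[float]], List[int]]:
    """Right-pad sequences to a shared length that is a multiple of pad_to_multiple.

    Hypothetical helper for illustration only; this is not fairseq2's Collater.
    Returns the padded batch and the original sequence lengths.
    """
    if pad_to_multiple < 1:
        raise ValueError("pad_to_multiple must be >= 1")
    lengths = [len(s) for s in seqs]
    max_len = max(lengths, default=0)
    # Round the target length up to the next multiple of pad_to_multiple.
    target = -(-max_len // pad_to_multiple) * pad_to_multiple
    # Every position past a sequence's real length is filled with pad_value.
    padded = [list(s) + [pad_value] * (target - len(s)) for s in seqs]
    return padded, lengths


# Same argument values as the line quoted in #255.
batch, lengths = pad_batch([[0.1, 0.2, 0.3], [0.5]], pad_value=2, pad_to_multiple=2)
print(batch)
print(lengths)
```

The returned lengths are what a downstream padding mask would be built from; without such a mask, the filler values are indistinguishable from real samples.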

Testing:

expressivity_predict <input_audio_path> --tgt_lang spa --model_name seamless_expressivity --vocoder_name vocoder_pretssel --output_path spa_whisper.wav

2023-12-08 22:29:24,508 INFO -- seamless_communication.cli.expressivity.predict.predict: Running inference on device=device(type='cuda', index=0) with dtype=torch.float16.
Using the cached tokenizer of seamless_expressivity. Set force to True to download again.
Using the cached tokenizer of seamless_expressivity. Set force to True to download again.
/private/home/krs/miniconda3/envs/fairseq2/lib/python3.10/site-packages/torch/nn/utils/weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
2023-12-08 22:29:31,425 INFO -- seamless_communication.cli.expressivity.predict.predict: text_generation_opts=SequenceGeneratorOptions(beam_size=5, soft_max_seq_len=(1, 200), hard_max_seq_len=1024, step_processor=None, unk_penalty=0.0, len_penalty=1.0)
2023-12-08 22:29:31,425 INFO -- seamless_communication.cli.expressivity.predict.predict: unit_generation_opts=SequenceGeneratorOptions(beam_size=5, soft_max_seq_len=(25, 50), hard_max_seq_len=1024, step_processor=None, unk_penalty=0.0, len_penalty=1.0)
2023-12-08 22:29:31,425 INFO -- seamless_communication.cli.expressivity.predict.predict: unit_generation_ngram_filtering=False
2023-12-08 22:29:32,436 INFO -- seamless_communication.cli.expressivity.predict.predict: Saving expressive translated audio in spa
2023-12-08 22:29:32,463 INFO -- seamless_communication.cli.expressivity.predict.predict: Translated text in spa: ¿Por qué estás golpeando mi jukebox?

expressivity_evaluate eng_spa_100.tsv --task s2st --tgt_lang spa --output_path expressivity_whisper --ref_field tgt_text --model_name seamless_expressivity --vocoder_name vocoder_pretssel --duration_factor 1.0

Using the cached tokenizer of seamless_expressivity. Set force to True to download again.
Using the cached tokenizer of seamless_expressivity. Set force to True to download again.
2023-12-08 22:23:50,414 INFO -- seamless_communication.cli.expressivity.evaluate.evaluate: text_generation_opts=SequenceGeneratorOptions(beam_size=5, soft_max_seq_len=(1, 200), hard_max_seq_len=1024, step_processor=None, unk_penalty=0.0, len_penalty=1.0)
2023-12-08 22:23:50,415 INFO -- seamless_communication.cli.expressivity.evaluate.evaluate: unit_generation_opts=SequenceGeneratorOptions(beam_size=5, soft_max_seq_len=(25, 50), hard_max_seq_len=1024, step_processor=None, unk_penalty=0.0, len_penalty=1.0)
2023-12-08 22:23:50,415 INFO -- seamless_communication.cli.expressivity.evaluate.evaluate: unit_generation_ngram_filtering=False
/private/home/krs/miniconda3/envs/fairseq2/lib/python3.10/site-packages/torch/nn/utils/weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
100%|██████████████████████████████████████████████| 99/99 [00:40<00:00, 2.44it/s]
2023-12-08 22:24:32,111 INFO -- seamless_communication.cli.expressivity.evaluate.evaluate: Processed 99 hyps, 99 refs
2023-12-08 22:24:32,128 INFO -- seamless_communication.cli.expressivity.evaluate.evaluate: Output results in expressivity_whisper/eng_spa_100/generate-eng_spa_100.tsv

pytest -v --device cuda:0

========================== 21 passed in 95.65s (0:01:35) ==========================

src/seamless_communication/cli/streaming/evaluate.py

elbayadm

Approving for the tutorial

yilinyang7 · 2023-12-10T13:42:27Z

src/seamless_communication/cli/expressivity/predict/predict.py

+    gcmvn_mean = torch.tensor(_gcmvn_mean, device=device, dtype=dtype)
+    gcmvn_std = torch.tensor(_gcmvn_std, device=device, dtype=dtype)
+
+    wav, sample_rate = torchaudio.load(args.input)


I thought you'd want to use AudioDecoder?

I needed to use this since I'm resampling to 16 kHz if the user specifies a generic audio file.

yilinyang7 · 2023-12-10T13:46:09Z

I'd suggest to leverage this file: https://github.com/facebookresearch/seamless_communication/blob/main/demo/expressive/app.py

It does the same thing.

kauterry · 2023-12-10T19:24:02Z

I'd suggest to leverage this file: https://github.com/facebookresearch/seamless_communication/blob/main/demo/expressive/app.py

It does the same thing.

That file should actually leverage the expressivity/predict.py file, because in your suggestion we'll have the issue of circular imports. Please feel free to send a refactor PR.

yilinyang7 · 2023-12-10T22:28:01Z

That file should actually leverage the expressivity/predict.py file, because in your suggestion we'll have the issue of circular imports. Please feel free to send a refactor PR.

I don't think it's ideal to change those files (e.g. the HF demo and our public demo code), since they're up and running now.

Changing pretssel_inference to expressivity_evaluate.

5b04394

kauterry requested review from cndn, cbalioglu and yilinyang7 December 7, 2023 00:58

facebook-github-bot added the CLA Signed label Dec 7, 2023

Implement expressivity_predict to run SeamlessExpressive inference.

ba8a3de

kauterry mentioned this pull request Dec 9, 2023

is it typo in UnitExtractor? self.collate = Collater(pad_value=2, pad_to_multiple=2) #255

Closed

kauterry requested a review from elbayadm December 9, 2023 06:11

Don't hardcode --model_name, --vocoder_name in expressivity_predict.

6056bed

kauterry marked this pull request as ready for review December 9, 2023 06:36

ibanesh reviewed Dec 9, 2023

src/seamless_communication/cli/streaming/evaluate.py (outdated, resolved)

Revert addition of --gated-model-dir to streaming/evaluate.

4be0dd7

elbayadm approved these changes Dec 10, 2023

kauterry merged commit 6ab3787 into main Dec 10, 2023
1 check passed

kauterry deleted the expressivity_predict branch December 10, 2023 05:54

yilinyang7 reviewed Dec 10, 2023

gwenzek pushed a commit that referenced this pull request Jan 18, 2024

[unity.cpp] make dot/ and test_data folder before the test (#251)

2110c89

* make dot/ and test_data folder * linting * linting * Guil's comments --------- Co-authored-by: Tuan Tran <tuantran@devfair0436.h2.fair>
