
Proposal to integrate into 🤗 Hub #555

Merged · 7 commits merged into TensorSpeech:master on May 14, 2021

Conversation

@patrickvonplaten (Contributor) commented Apr 30, 2021

Hi TensorSpeech team! I hereby propose an integration with the HuggingFace model hub 🤗

This integration would allow you to freely download/upload models from/to the Hugging Face Hub: https://huggingface.co/.

Your users could then download model weights and other files directly within Python, without having to fetch them manually.
Taking your fastspeech_2_inference.ipynb example, the following diff shows how the code could change to download weights directly from the model hub.

import tensorflow as tf

-from tensorflow_tts.inference import AutoConfig
from tensorflow_tts.inference import TFAutoModel
from tensorflow_tts.inference import AutoProcessor

processor = AutoProcessor.from_pretrained(
-    pretrained_path="../tensorflow_tts/processor/pretrained/ljspeech_mapper.json"
+   pretrained_path="tensorspeech/fastspeech2_tts"
)

input_text = "i love you so much."
input_ids = processor.text_to_sequence(input_text)

-config = AutoConfig.from_pretrained("../examples/fastspeech2/conf/fastspeech2.v1.yaml")
fastspeech2 = TFAutoModel.from_pretrained(
-    config=config, 
-    pretrained_path="../examples/fastspeech2/checkpoints/model-150000.h5",
+   pretrained_path="tensorspeech/fastspeech2_tts",
    is_build=True,
    name="fastspeech2"
)

mel_before, mel_after, duration_outputs, _, _ = fastspeech2.inference(
    input_ids=tf.expand_dims(tf.convert_to_tensor(input_ids, dtype=tf.int32), 0),
    speaker_ids=tf.convert_to_tensor([0], dtype=tf.int32),
    speed_ratios=tf.convert_to_tensor([1.0], dtype=tf.float32),
    f0_ratios=tf.convert_to_tensor([1.0], dtype=tf.float32),
    energy_ratios=tf.convert_to_tensor([1.0], dtype=tf.float32),
)

As an example, I uploaded some FastSpeech2 weights to this repo on the HF hub: https://huggingface.co/patrickvonplaten/tf_tts_fast_speech_2.
If you'd like to add this feature to your library, we would of course change the organization name from patrickvonplaten to tensorspeech.

You can try it out by running the following code:

import tensorflow as tf

from tensorflow_tts.inference import TFAutoModel
from tensorflow_tts.inference import AutoProcessor

processor = AutoProcessor.from_pretrained(pretrained_path="patrickvonplaten/tf_tts_fast_speech_2")

input_text = "i love you so much."
input_ids = processor.text_to_sequence(input_text)

fastspeech2 = TFAutoModel.from_pretrained(
    pretrained_path="patrickvonplaten/tf_tts_fast_speech_2",
    is_build=True,
    name="fastspeech2"
)

mel_before, mel_after, duration_outputs, _, _ = fastspeech2.inference(
    input_ids=tf.expand_dims(tf.convert_to_tensor(input_ids, dtype=tf.int32), 0),
    speaker_ids=tf.convert_to_tensor([0], dtype=tf.int32),
    speed_ratios=tf.convert_to_tensor([1.0], dtype=tf.float32),
    f0_ratios=tf.convert_to_tensor([1.0], dtype=tf.float32),
    energy_ratios=tf.convert_to_tensor([1.0], dtype=tf.float32),
)

Besides freely storing your model weights, we also provide git version control and download statistics for your models :-) We can also provide you with a hosted inference API where users could try out your models directly on the website.

We've already integrated with a couple of other libraries - you can check them out here:

Sorry for the missing tests in the PR - I just made the minimal changes to show you how the integration with the HF hub could look :-) I'd also be more than happy to add you guys to a Slack channel where we could discuss further.

Cheers,
Patrick & Hugging Face team

Also cc @julien-c

@dathudeptrai dathudeptrai self-requested a review May 2, 2021 12:35
@dathudeptrai dathudeptrai self-assigned this May 2, 2021
@dathudeptrai dathudeptrai added the enhancement 🚀 (New feature or request) and Feature Request 🤗 (Feature support) labels May 2, 2021
@dathudeptrai (Collaborator)

@patrickvonplaten Thank you so much, this is a really great and useful feature :D. I have learned a lot from the huggingface transformers repo, and as you can see our repo has the same structure as the transformers repo, so it should be easy to integrate with huggingface_hub. I'm on vacation and will be back in a few days. :D.

@dathudeptrai dathudeptrai merged commit f53ecd9 into TensorSpeech:master May 14, 2021
@dathudeptrai (Collaborator)

@patrickvonplaten Merged :D. Can you tell me what the next steps are?

@patrickvonplaten (Contributor, Author)

Hey @dathudeptrai,

Awesome to see that the PR is merged 🥳 As a next step, I think we can create an organization on the hub here: https://huggingface.co/organizations/new (maybe called Tensorspeech?), and then, if you want, we can upload a bunch of your models and create a demo widget to showcase them 🙂

Also cc @julien-c , @osanseviero

@dathudeptrai (Collaborator)

@patrickvonplaten I just created the tensorspeech organization on the HF hub. Let me do the remaining jobs :D.

@osanseviero

@dathudeptrai thank you for creating the org! That's awesome.

There are some additional steps on our side. The two main things missing, I think, are:

  • Add a code snippet that shows how to use the model with TensorFlowTTS. Something along these lines, but for TensorFlowTTS.

(Screenshot: example code snippet shown on a model page)

  • Add a widget for TensorFlowTTS models. Users would input a sentence and we would provide the audio. This will be a great way to showcase the models! Something like this:

(Screenshot: example text-to-speech inference widget)

@dathudeptrai something that could be interesting is to implement a push_to_hub method. This would allow your users to easily share their models by uploading them to the hub. It would also facilitate creating automatic model cards, making sure all the tags are correct, and more. What do you think?
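
A rough sketch of what such a push_to_hub helper could look like, using huggingface_hub's HfApi (the helper name, file names, and repo layout here are assumptions for illustration, not the library's actual API):

from huggingface_hub import HfApi

def push_to_hub(config_path: str, weights_path: str, repo_id: str):
    """Hypothetical helper: upload a trained model's config and weights to the HF Hub.

    Assumes the user has already authenticated (e.g. via `huggingface-cli login`);
    the file names config.yml / model.h5 are illustrative, not a fixed convention.
    """
    api = HfApi()
    # Create the repo if it does not exist yet (no-op if it already does).
    api.create_repo(repo_id=repo_id, exist_ok=True)
    # Upload the config and checkpoint into the repo root.
    api.upload_file(path_or_fileobj=config_path, path_in_repo="config.yml", repo_id=repo_id)
    api.upload_file(path_or_fileobj=weights_path, path_in_repo="model.h5", repo_id=repo_id)

# Usage (hypothetical repo name):
# push_to_hub("conf/fastspeech2.v1.yaml", "checkpoints/model-150000.h5",
#             "tensorspeech/fastspeech2-ljspeech")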

@osanseviero

@dathudeptrai by looking at the examples and familiarizing myself with the library, I was wondering whether you have an idea of the example code snippet that will be shown to users. From what I see, there are two open questions:

  • After doing model.inference to generate the mel-spectrogram, we'll still need to do melgan.inference on top of it to get speech. Is this right, or is there a better approach to generate the speech? Alternatively, would it be ok if the code snippet only shows how to do the initial inference?
  • I see that the .inference method signature differs depending on which model we're using, which might make it a bit harder to implement things in a generic way that works for all of them. If you have an example function that deals with these differences, it would be greatly appreciated.

Thank you for the library! I've been playing with it and it's awesome!

@dathudeptrai (Collaborator) commented May 17, 2021

@osanseviero

After doing model.inference to generate the mel-spectrogram, we'll still need to do melgan.inference on top of it to get speech. Is this right, or is there a better approach to generate the speech? Alternatively, would it be ok if the code snippet only shows how to do the initial inference?

Yes, almost all TTS models are now two-stage (text2mel and mel2wav). We can combine them into one end-to-end model for the inference stage :D.
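
For illustration, a minimal sketch of that two-stage pipeline, continuing from the FastSpeech2 snippet above (the vocoder repo id is a placeholder, not a real checkpoint):

import tensorflow as tf

from tensorflow_tts.inference import TFAutoModel

# Stage 2 (mel2wav): load a vocoder the same way as the text2mel model.
# "tensorspeech/melgan_tts" is a hypothetical repo id used for illustration.
melgan = TFAutoModel.from_pretrained(
    pretrained_path="tensorspeech/melgan_tts",
    is_build=True,
    name="melgan"
)

# mel_after comes from fastspeech2.inference(...) as in the snippet above;
# the vocoder turns the mel-spectrogram into a waveform.
audio = melgan.inference(mel_after)[0, :, 0]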

I see that the .inference method signature differs depending on which model we're using, which might make it a bit harder to implement things in a generic way that works for all of them. If you have an example function that deals with these differences, it would be greatly appreciated.

Unlike transformers for NLP, where the inputs are almost always the same, text2mel inputs vary: a model can take extra inputs such as speaker_ids (for multi-speaker TTS), language_ids (for multilingual TTS), speaker_embeddings (for voice cloning), style embeddings (for emotional TTS), and inputs to adjust speed, f0, energy, and so on. But generally, we only need two inputs (input_ids and speaker_ids) :D.
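
One possible way to smooth over the varying signatures in a generic snippet (this helper is a sketch, not part of the library, and assumes model.inference exposes an inspectable Python signature):

import inspect

import tensorflow as tf

def generic_inference(model, input_ids, **optional_inputs):
    """Call model.inference with only the keyword arguments its signature accepts."""
    accepted = inspect.signature(model.inference).parameters
    kwargs = {k: v for k, v in optional_inputs.items() if k in accepted}
    return model.inference(
        input_ids=tf.expand_dims(tf.convert_to_tensor(input_ids, dtype=tf.int32), 0),
        **kwargs,
    )

# For FastSpeech2 the extras would include speaker_ids, speed_ratios, f0_ratios
# and energy_ratios; a model that lacks e.g. f0_ratios would simply never see it.
outputs = generic_inference(
    fastspeech2,
    input_ids,
    speaker_ids=tf.convert_to_tensor([0], dtype=tf.int32),
    f0_ratios=tf.convert_to_tensor([1.0], dtype=tf.float32),
)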

@osanseviero

Hi @dathudeptrai. We got some exciting news!

Last week our team worked on open-sourcing the code for adding code snippets, as well as running the inference API for other libraries. This lives in the huggingface_hub repo. This PR adds the code snippet as we discussed :) Your users will already benefit from being able to search for all TensorFlowTTS models.

@dathudeptrai (Collaborator)

@osanseviero Awesome! :D. I'm uploading all our models to https://huggingface.co/tensorspeech and will add model cards soon :D

@osanseviero

Awesome! I'm looking forward to seeing this :)

As a tip, you can use the tags text-to-mel and mel-to-wav so that the code snippets are more complete for your users. Example.

@patrickvonplaten patrickvonplaten deleted the add_tf_hub branch July 10, 2021 10:55