Add text to audio task by Vaibhavs10 · Pull Request #969 · huggingface/hub-docs

Vaibhavs10 · 2023-09-26T14:13:33Z

No description provided.

osanseviero · 2023-09-26T14:29:43Z

You will also need to update it in a couple of other places: https://github.com/search?q=repo%3Ahuggingface%2Fhub-docs+TEXT-to-speech+language%3ATypeScript&type=code&l=TypeScript

Vaibhavs10 · 2023-09-26T14:53:56Z

I think I added it to all the relevant ones. Do you mind giving it a look again please @osanseviero ?

osanseviero

Thank you! Please check that you have a similar set of files as in https://github.com/search?q=repo%3Ahuggingface%2Fhub-docs+TEXT-to-speech+language%3ATypeScript&type=code&l=TypeScript

You also need to add it to tasksData. It can lead to an undefined task page for now (or the same one as TTS?). In any case, if it does not get a page, let's open an issue for a follow-up task page.
You also need to specify which libraries (and here) support this task.
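
For illustration, a minimal sketch of what registering the new task in both places could look like. The type and constant names below are simplified, hypothetical stand-ins rather than the actual hub-docs code, and the library lists are example values only. Typing the maps with `Record` means the compiler flags any task that is missing from either map.

```typescript
// Hypothetical, simplified stand-ins for the real hub-docs structures.
type PipelineType = "text-to-speech" | "text-to-audio";

interface TaskPageData {
	description: string;
}

// A task without a dedicated page is registered as undefined for now,
// so the entry exists but no task page is rendered.
const TASKS_DATA: Record<PipelineType, TaskPageData | undefined> = {
	"text-to-speech": { description: "Generate natural-sounding speech from text." },
	"text-to-audio": undefined,
};

// Illustrative values only, not the project's actual library lists.
const TASKS_MODEL_LIBRARIES: Record<PipelineType, string[]> = {
	"text-to-speech": ["espnet", "transformers"],
	"text-to-audio": ["transformers"],
};

function hasTaskPage(task: PipelineType): boolean {
	return TASKS_DATA[task] !== undefined;
}

function librariesFor(task: PipelineType): string[] {
	return TASKS_MODEL_LIBRARIES[task];
}
```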

osanseviero

Awesome, things are looking good! You will also need to

Add an icon (similar to https://github.com/huggingface/hub-docs/blob/main/js/src/lib/components/Icons/IconTextToSpeech.svelte ) and specify it in https://github.com/huggingface/hub-docs/blob/main/js/src/lib/components/PipelineIcon/PipelineIcon.svelte#L27 (I suggest checking https://carbondesignsystem.com/ for icons)
You might also want to enable the widget, which involves modifying https://github.com/huggingface/hub-docs/blob/main/js/src/lib/components/InferenceWidget/InferenceWidget.svelte (most likely you can re-use TextToSpeechWidget)

You can preview how this would look in GitHub Codespaces, as specified in https://github.com/huggingface/moon-landing#codespace

Vaibhavs10 · 2023-09-26T17:38:27Z

Noice. Going to test this now!

Vaibhavs10 · 2023-09-26T17:46:48Z

Weird! I keep getting permission denied on codespace for anything I do. Is this the same for you too @osanseviero ?

Vaibhavs10 · 2023-09-27T11:39:17Z

Hey @mishig25

After your suggestions, I tried running npm ci and npm run prettier: no changes.
Then I tried npm run lint, which also resulted in no changes.

Send help! - welp!

mishig25 · 2023-09-27T14:31:23Z

Was the TextToAudio widget created in this PR?
If so, it would be a good idea to provide a description of what the API shape looks like, plus a screenshot and a model to test on. You can see an example here.

> After your suggestions, I tried running npm ci and npm run prettier - no changes
> then I tried npm run lint also resulted in no changes.

npm run format:all was the one

hub-docs/js/package.json

Line 14 in b1a591d

"format:all": "npm run prettier src && npm run lint:all",

Vaibhavs10 · 2023-09-27T16:46:28Z

Hey @mishig25 - I just copied the InferenceTextToSpeech widget over to text-to-audio.
So there are no new changes. I'll follow up with a separate PR to change the actual look.

Let me know if this is okay with you.

osanseviero · 2023-09-27T19:44:54Z

I think we should reuse TextToSpeechWidget rather than reimplementing it, the same way we reuse TabularDataWidget. The API input/output is exactly the same.
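
A rough sketch of the reuse pattern being suggested, assuming a simple lookup from pipeline tag to widget. The string identifiers below are hypothetical placeholders; the real project maps tags to Svelte components in InferenceWidget.svelte.

```typescript
// Hypothetical widget identifiers standing in for Svelte components.
type WidgetName = "TextToSpeechWidget" | "TabularDataWidget";

// Several tasks can point at the same widget when their API
// input/output shape is identical.
const WIDGET_BY_TASK: Record<string, WidgetName> = {
	"text-to-speech": "TextToSpeechWidget",
	"text-to-audio": "TextToSpeechWidget", // reused, no new widget needed
	"tabular-classification": "TabularDataWidget",
	"tabular-regression": "TabularDataWidget",
};

// Returns undefined for tasks that have no widget registered.
function widgetFor(task: string): WidgetName | undefined {
	return WIDGET_BY_TASK[task];
}
```

With this shape, enabling the widget for a new task is a one-line change to the map rather than a copied component.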

mishig25 · 2023-09-27T20:05:14Z

Btw, is the task already supported in api-inference? Is there a model or curl script to test on?

Vaibhavs10 · 2023-10-03T09:11:39Z

Hey @mishig25! Sorry for the delayed response; api-inference needed an update to make sure that the model can work.

Since the pipeline_tag text-to-audio is not yet supported, the models cannot be curled directly. However, changing the pipeline_tag to text-to-speech does the trick. You can test it here: https://huggingface.co/reach-vb/musicgen-small?text=lo-fi+music
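
For anyone who wants to script the test rather than use the model page, here is a minimal sketch of building the request. It assumes the hosted Inference API endpoint at api-inference.huggingface.co and a `{ inputs: ... }` JSON payload, modelled on the existing text-to-speech snippets; the function name and return shape are hypothetical and nothing here is specific to this PR.

```typescript
interface InferenceRequest {
	url: string;
	init: {
		method: "POST";
		headers: Record<string, string>;
		body: string;
	};
}

// Builds (but does not send) a request for a text-to-audio style model.
// Assumption: the endpoint and payload follow the text-to-speech snippets.
function buildTextToAudioRequest(modelId: string, prompt: string, accessToken: string): InferenceRequest {
	return {
		url: `https://api-inference.huggingface.co/models/${modelId}`,
		init: {
			method: "POST",
			headers: {
				Authorization: `Bearer ${accessToken}`,
				"Content-Type": "application/json",
			},
			body: JSON.stringify({ inputs: prompt }),
		},
	};
}

// Usage (needs network access and a valid token, so it is left commented out):
// const { url, init } = buildTextToAudioRequest("reach-vb/musicgen-small", "lo-fi music", token);
// const audio = await (await fetch(url, init)).blob();
```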

Do note that this is embarrassingly slow - will work on fixing that separately.

@osanseviero - I was thinking of making a follow-up PR to update the Icon for TTM, that's why I created those files. I think it'd be good to distinguish the two. However, if you feel strongly about this then I can revert the change. lmk.

I'll treat this as a priority today; it would be nice to get this merged soon.

Vaibhavs10 · 2023-10-03T09:15:31Z

Note that text-to-speech is an alias of the text-to-audio pipeline, so the behaviour would be exactly the same for both.
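
The alias relationship can be pictured as a small lookup that normalizes a task name before a pipeline is chosen. This is only an illustrative sketch with hypothetical names, not the actual pipeline code.

```typescript
// Hypothetical alias table: alias on the left, canonical task on the right.
const TASK_ALIASES: Record<string, string> = {
	"text-to-speech": "text-to-audio",
};

// Returns the canonical task name, or the input unchanged if it has no alias.
function resolveTask(task: string): string {
	return TASK_ALIASES[task] ?? task;
}
```

Both names resolve to the same canonical task, which is why switching the pipeline_tag between them does not change behaviour.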

osanseviero

OK from my side to follow up with the icon, but as mentioned before, we should reuse the widget; there is no need to implement a new one.

Vaibhavs10 · 2023-10-03T14:35:32Z

@osanseviero - Removed all the custom TextToAudio code. I am waiting for the checks to run.

Is there anything else left for this to be good to merge? @mishig25 @osanseviero

Vaibhavs10 · 2023-10-03T15:11:03Z

Both the checks have passed! 🤗

osanseviero

LGTM to merge if the TTA Inference API is working already

Vaibhavs10 · 2023-10-04T12:23:03Z

Good to merge, then?

For testing the inference API, feel free to head over to this URL: https://huggingface.co/reach-vb/musicgen-small

mishig25 · 2023-10-04T12:37:49Z

js/src/lib/components/PipelineIcon/PipelineIcon.svelte

 		"fill-mask": IconFillMask,
 		"sentence-similarity": IconSentenceSimilarity,
 		"text-to-speech": IconTextToSpeech,
+		"text-to-audio": IconTextToAudio,


If it is the exact same icon as IconTextToSpeech, why not just use IconTextToSpeech?

Good call: I fixed it in the latest commit.

osanseviero · 2023-10-04T12:38:20Z

It seems to be stuck loading or returning internal server errors, though.

Vaibhavs10 · 2023-10-04T16:24:57Z

@osanseviero -> The model is pretty much unusable on CPU (it takes an insane amount of time to generate).

I tested it from api-inference main - It returns a file as expected!

The plan is to enable these for GPU run-time once the pipeline is merged.

add text to audio task.

72e1af3

Vaibhavs10 requested a review from osanseviero September 26, 2023 14:13

Vaibhavs10 added 3 commits September 26, 2023 16:43

add text-to-audio to serveJS.ts

85ba8e2

add text-to-audio inputs snippet.

461896f

add text-to-audio to serveCurl.ts

0d33742

osanseviero reviewed Sep 26, 2023

add text-to-audio in tasks specific files.

2436113

osanseviero reviewed Sep 26, 2023

enabling text-to-audio widget.

4cdc62a

fix failing InferenceWidget test.

3099724

osanseviero requested a review from mishig25 September 27, 2023 09:45

format

425251c

osanseviero reviewed Oct 3, 2023

Vaibhavs10 added 4 commits October 3, 2023 16:26

remove TextToAudio specific copied code.

2c8a03f

resolve conflicts.

0ceda42

fix style.

fb56835

fix style. x2

dd68d2b

Vaibhavs10 requested a review from osanseviero October 4, 2023 10:16

osanseviero approved these changes Oct 4, 2023

mishig25 reviewed Oct 4, 2023

PipelineIconAudio -> PipelineIconSpeech.

0622593

osanseviero merged commit ff7586c into main Oct 5, 2023

osanseviero deleted the add_tta branch October 5, 2023 09:00
