-
Notifications
You must be signed in to change notification settings - Fork 79
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[HF][streaming][3/n] Text2Speech (no streaming, but updating docs on completion params) #854
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This was referenced Jan 10, 2024
rossdanlm
requested review from
saqadri,
rholinshead,
suyoglastmileai,
Ankush-lastmile and
jonathanlastmileai
as code owners
January 10, 2024 09:25
TSIA Adding streaming functionality to text summarization model parser ## Test Plan Rebase onto and test it with 11ace0a. Follow the README from AIConfig Editor https://github.com/lastmile-ai/aiconfig/tree/main/python/src/aiconfig/editor#dev, then run these command ```bash aiconfig_path=/Users/rossdancraig/Projects/aiconfig/cookbooks/Gradio/huggingface.aiconfig.json parsers_path=/Users/rossdancraig/Projects/aiconfig/cookbooks/Gradio/hf_model_parsers.py alias aiconfig="python3 -m 'aiconfig.scripts.aiconfig_cli'" aiconfig edit --aiconfig-path=$aiconfig_path --server-port=8080 --server-mode=debug_servers --parsers-module-path=$parsers_path ``` Then in AIConfig Editor run the prompt (it will be streaming format by default) https://github.com/lastmile-ai/aiconfig/assets/151060367/e91a1d8b-a3e9-459c-9eb1-2d8e5ec58e73
TSIA Adding streaming output support for text translation model parser. I also fixed a bug where we didn't pass in `"translation"` key into the pipeline ## Test Plan Rebase onto and test it: 5b74344. Follow the README from AIConfig Editor https://github.com/lastmile-ai/aiconfig/tree/main/python/src/aiconfig/editor#dev, then run these command ```bash aiconfig_path=/Users/rossdancraig/Projects/aiconfig/cookbooks/Gradio/huggingface.aiconfig.json parsers_path=/Users/rossdancraig/Projects/aiconfig/cookbooks/Gradio/hf_model_parsers.py alias aiconfig="python3 -m 'aiconfig.scripts.aiconfig_cli'" aiconfig edit --aiconfig-path=$aiconfig_path --server-port=8080 --server-mode=debug_servers --parsers-module-path=$parsers_path ``` With Streaming https://github.com/lastmile-ai/aiconfig/assets/151060367/d7bc9df2-2993-4709-bf9b-c5b7979fb00f Without Streaming https://github.com/lastmile-ai/aiconfig/assets/151060367/71eb6ab3-5d6f-4c5d-8b82-f3daf4c5e610
…completion params) Ok this one is weird. Today, streaming is only ever supported on text outputs in Transformers library. See `BaseStreamer` in here: https://github.com/search?q=repo%3Ahuggingface%2Ftransformers%20BaseStreamer&type=code In the future it may support other formats, but not yet. For example, OpenAI supports it: https://community.openai.com/t/streaming-from-text-to-speech-api/493784 Anyways, I basically here only did some updates to docs to clarify why completion params were null. Jonathan and I synced about this briefly ofline, but I forgot again so wanted to capture it here so no one forgets
saqadri
approved these changes
Jan 10, 2024
@@ -25,6 +25,8 @@ | |||
|
|||
# Step 1: define Helpers | |||
def refine_pipeline_creation_params(model_settings: Dict[str, Any]) -> List[Dict[str, Any]]: | |||
# There are from the transformers Github repo: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: These
, not there
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in #862
saqadri
added a commit
that referenced
this pull request
Jan 10, 2024
[HF][5/n] Image2Text: Allow base64 inputs for images Before we didn't allow base64, only URI (either local or http or https). This is good becuase our text2Image model parser outputs into a base64 format, so this will allow us to chain model prompts! ## Test Plan Rebase and test on 0d7ae2b. Follow the README from AIConfig Editor https://github.com/lastmile-ai/aiconfig/tree/main/python/src/aiconfig/editor#dev, then run these command ```bash aiconfig_path=/Users/rossdancraig/Projects/aiconfig/cookbooks/Gradio/huggingface.aiconfig.json parsers_path=/Users/rossdancraig/Projects/aiconfig/cookbooks/Gradio/hf_model_parsers.py alias aiconfig="python3 -m 'aiconfig.scripts.aiconfig_cli'" aiconfig edit --aiconfig-path=$aiconfig_path --server-port=8080 --server-mode=debug_servers --parsers-module-path=$parsers_path ``` Then in AIConfig Editor run the prompt (streaming not supported so just took screenshots) These are the images I tested (with bear being in base64 format) ![fox_in_forest](https://github.com/lastmile-ai/aiconfig/assets/151060367/ca7d1723-9e12-4cc8-9d8d-41fa9f466919) ![bear-eating-honey](https://github.com/lastmile-ai/aiconfig/assets/151060367/a947d89e-c02a-4c64-8183-ff1c85802859) <img width="1281" alt="Screenshot 2024-01-10 at 04 57 44" src="https://github.com/lastmile-ai/aiconfig/assets/151060367/ea60cbc5-e6ab-4bf2-82e7-17f3182fdc5c"> --- Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/lastmile-ai/aiconfig/pull/856). * __->__ #856 * #855 * #854 * #853 * #851
rossdanlm
pushed a commit
that referenced
this pull request
Jan 10, 2024
Small fixes from comments from Sarmad + me from these diffs: - #854 - #855 - #821 Main things I did - rename `refine_chat_completion_params` --> `chat_completion_params` - edit `get_text_output` to not check for `OutputDataWithValue` - sorted the init file to be alphabetical - fixed some typos/print statements - made some error messages a bit more intuitive with prompt name - sorted some imports - fixed old class name `HuggingFaceAutomaticSpeechRecognition` --> `HuggingFaceAutomaticSpeechRecognitionTransformer` ## Test Plan These are all small nits and shouldn't change functionality
This was referenced Jan 10, 2024
rossdanlm
added a commit
that referenced
this pull request
Jan 10, 2024
HF transformers: Small fixes nits Small fixes from comments from Sarmad + me from these diffs: - #854 - #855 - #821 Main things I did - rename `refine_chat_completion_params` --> `chat_completion_params` - edit `get_text_output` to not check for `OutputDataWithValue` - sorted the init file to be alphabetical - fixed some typos/print statements - made some error messages a bit more intuitive with prompt name - sorted some imports - fixed old class name `HuggingFaceAutomaticSpeechRecognition` --> `HuggingFaceAutomaticSpeechRecognitionTransformer` ## Test Plan These are all small nits and shouldn't change functionality
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
[HF][streaming][3/n] Text2Speech (no streaming, but updating docs on completion params)
Ok this one is weird. Today, streaming is only ever supported on text outputs in Transformers library. See
BaseStreamer
in here: https://github.com/search?q=repo%3Ahuggingface%2Ftransformers%20BaseStreamer&type=codeIn the future it may support other formats, but not yet. For example, OpenAI supports it: https://community.openai.com/t/streaming-from-text-to-speech-api/493784
Anyways, I basically here only did some updates to docs to clarify why completion params were null. Jonathan and I synced about this briefly ofline, but I forgot again so wanted to capture it here so no one forgets
This downloaded this file here: https://drive.google.com/file/d/1xP-uDVRe8X5peSyCh1-KrbVrVEbglBQy/view?usp=sharing
Stack created with Sapling. Best reviewed with ReviewStack.