[HF][streaming][4/n] Image2Text (no streaming, but lots of fixing) #855

Merged
merged 4 commits into main from pr855
Jan 10, 2024

Conversation

rossdanlm
Contributor

@rossdanlm rossdanlm commented Jan 10, 2024

[HF][streaming][4/n] Image2Text (no streaming, but lots of fixing)

This model parser does not support streaming (surprising!):

```
TypeError: ImageToTextPipeline._sanitize_parameters() got an unexpected keyword argument 'streamer'
```
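Not every Transformers pipeline accepts a `streamer` kwarg, which is exactly what this error is telling us. One defensive option (a hypothetical sketch, not what this PR does) is to check the pipeline's `_sanitize_parameters` signature before passing one:

```python
import inspect

def supports_streamer(pipeline_cls) -> bool:
    """Return True if the pipeline's _sanitize_parameters can accept a
    `streamer` keyword argument, either explicitly or via **kwargs."""
    params = inspect.signature(pipeline_cls._sanitize_parameters).parameters
    if "streamer" in params:
        return True
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())

# Stand-in mirroring ImageToTextPipeline's signature, which has neither a
# `streamer` parameter nor **kwargs -- hence the TypeError above.
class DummyImageToTextPipeline:
    def _sanitize_parameters(self, max_new_tokens=None, generate_kwargs=None):
        return {}, {}, {}

# Stand-in for a text-generation pipeline that forwards arbitrary kwargs.
class DummyTextPipeline:
    def _sanitize_parameters(self, **kwargs):
        return {}, {}, {}
```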

In general this is mostly fix-up work to make sure the parser behaves as expected. Things I fixed:

  1. Multiple images now work (they were accepted before, but each response wasn't processed individually; the entire response was attached instead)
  2. Responses are now constructed as pure text output
  3. Specified the two completion params that are supported (see https://github.com/huggingface/transformers/blob/701298d2d3d5c7bde45e71cce12736098e3f05ef/src/transformers/pipelines/image_to_text.py#L97-L102C13)
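The param filtering can be sketched roughly like this, assuming the two supported keys are `max_new_tokens` and `generate_kwargs` per the `_sanitize_parameters` signature linked above (hypothetical helper body; only the `refine_completion_params` name appears in the actual diff):

```python
# Keys ImageToTextPipeline accepts as forward params, per the linked
# transformers source; everything else is silently dropped.
SUPPORTED_COMPLETION_KEYS = {"max_new_tokens", "generate_kwargs"}

def refine_completion_params(model_settings: dict) -> dict:
    """Filter arbitrary model settings down to the params the pipeline supports."""
    return {
        key: value
        for key, value in model_settings.items()
        if key in SUPPORTED_COMPLETION_KEYS
    }
```

Dropping unknown keys (rather than raising) keeps aiconfig files portable across model parsers that support different settings.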

Next diff I will add support for b64-encoded image inputs --> we need to convert those to a PIL image, see https://github.com/huggingface/transformers/blob/701298d2d3d5c7bde45e71cce12736098e3f05ef/src/transformers/pipelines/image_to_text.py#L83
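That base64 → PIL conversion can be done along these lines (a sketch using Pillow; the helper name is made up):

```python
import base64
import io

from PIL import Image

def b64_to_pil(b64_string: str) -> Image.Image:
    """Decode a base64-encoded image string into a PIL Image."""
    # Strip an optional data-URI prefix like "data:image/png;base64,"
    if b64_string.strip().startswith("data:") and "," in b64_string:
        b64_string = b64_string.split(",", 1)[1]
    return Image.open(io.BytesIO(base64.b64decode(b64_string)))

# Round-trip demo: encode a tiny in-memory image, then decode it back.
buf = io.BytesIO()
Image.new("RGB", (2, 2), color="red").save(buf, format="PNG")
encoded = base64.b64encode(buf.getvalue()).decode("ascii")
img = b64_to_pil(encoded)
```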

Test Plan

Rebase onto 5f3b667 and test.

Follow the README from AIConfig Editor https://github.com/lastmile-ai/aiconfig/tree/main/python/src/aiconfig/editor#dev, then run these commands:

```bash
aiconfig_path=/Users/rossdancraig/Projects/aiconfig/cookbooks/Gradio/huggingface.aiconfig.json
parsers_path=/Users/rossdancraig/Projects/aiconfig/cookbooks/Gradio/hf_model_parsers.py
alias aiconfig="python3 -m 'aiconfig.scripts.aiconfig_cli'"
aiconfig edit --aiconfig-path=$aiconfig_path --server-port=8080 --server-mode=debug_servers --parsers-module-path=$parsers_path
```

Then in AIConfig Editor, run the prompt (streaming is not supported for this model, so I just took screenshots).

These are the images I tested:
![fox_in_forest](https://github.com/lastmile-ai/aiconfig/assets/151060367/ca7d1723-9e12-4cc8-9d8d-41fa9f466919)
![trex](https://github.com/lastmile-ai/aiconfig/assets/151060367/2f556ead-a808-4aea-9378-a2537c715e1f)

Before
<img width="1268" alt="Screenshot 2024-01-10 at 04 00 22" src="https://github.com/lastmile-ai/aiconfig/assets/151060367/4426f2b9-0b83-48e2-8af1-865f157ae12c">

After
<img width="1277" alt="Screenshot 2024-01-10 at 04 02 01" src="https://github.com/lastmile-ai/aiconfig/assets/151060367/2ed172a8-ed26-4c1b-9a9e-5c240376a278">


Stack created with Sapling. Best reviewed with ReviewStack.

Rossdan Craig rossdan@lastmileai.dev added 4 commits January 10, 2024 05:08
TSIA

Adding streaming functionality to text summarization model parser

## Test Plan
Rebase onto 11ace0a and test.

Follow the README from AIConfig Editor https://github.com/lastmile-ai/aiconfig/tree/main/python/src/aiconfig/editor#dev, then run these commands:
```bash
aiconfig_path=/Users/rossdancraig/Projects/aiconfig/cookbooks/Gradio/huggingface.aiconfig.json
parsers_path=/Users/rossdancraig/Projects/aiconfig/cookbooks/Gradio/hf_model_parsers.py
alias aiconfig="python3 -m 'aiconfig.scripts.aiconfig_cli'"
aiconfig edit --aiconfig-path=$aiconfig_path --server-port=8080 --server-mode=debug_servers --parsers-module-path=$parsers_path
```

Then in AIConfig Editor, run the prompt (it streams by default).


https://github.com/lastmile-ai/aiconfig/assets/151060367/e91a1d8b-a3e9-459c-9eb1-2d8e5ec58e73
TSIA

Adding streaming output support for the text translation model parser. I also fixed a bug where we didn't pass the `"translation"` key to the pipeline.
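The task-key fix can be sketched like this (a hypothetical helper; the actual parser code may build the key differently). Hugging Face translation pipelines are keyed by a task string such as `"translation_en_to_fr"`, with plain `"translation"` as the generic fallback:

```python
def build_translation_task(model_settings: dict) -> str:
    """Build the task string passed to transformers.pipeline().

    `src_lang` / `tgt_lang` are assumed setting names for illustration;
    with no language pair configured, fall back to the generic task.
    """
    src = model_settings.get("src_lang")
    tgt = model_settings.get("tgt_lang")
    if src and tgt:
        return f"translation_{src}_to_{tgt}"
    return "translation"
```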

## Test Plan
Rebase onto 5b74344 and test.

Follow the README from AIConfig Editor https://github.com/lastmile-ai/aiconfig/tree/main/python/src/aiconfig/editor#dev, then run these commands:
```bash
aiconfig_path=/Users/rossdancraig/Projects/aiconfig/cookbooks/Gradio/huggingface.aiconfig.json
parsers_path=/Users/rossdancraig/Projects/aiconfig/cookbooks/Gradio/hf_model_parsers.py
alias aiconfig="python3 -m 'aiconfig.scripts.aiconfig_cli'"
aiconfig edit --aiconfig-path=$aiconfig_path --server-port=8080 --server-mode=debug_servers --parsers-module-path=$parsers_path
```

With Streaming

https://github.com/lastmile-ai/aiconfig/assets/151060367/d7bc9df2-2993-4709-bf9b-c5b7979fb00f

Without Streaming

https://github.com/lastmile-ai/aiconfig/assets/151060367/71eb6ab3-5d6f-4c5d-8b82-f3daf4c5e610
…completion params)

Ok, this one is weird. Today, streaming is only ever supported for text outputs in the Transformers library. See `BaseStreamer` here: https://github.com/search?q=repo%3Ahuggingface%2Ftransformers%20BaseStreamer&type=code

In the future it may support other formats, but not yet. For comparison, OpenAI already supports streaming for text-to-speech: https://community.openai.com/t/streaming-from-text-to-speech-api/493784

Anyway, here I only updated the docs to clarify why the completion params were null. Jonathan and I synced briefly about this offline, but I forgot again, so I'm capturing it here so no one forgets.
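For reference, the streaming contract in Transformers is just the `BaseStreamer` interface: `put` receives newly generated tokens/text and `end` signals completion. A minimal stand-in that collects chunks looks like this (illustrative only, not the parser's actual code):

```python
class CollectingStreamer:
    """Minimal object satisfying the BaseStreamer contract used by
    transformers text generation: put() receives new chunks, end()
    signals that generation has finished."""

    def __init__(self):
        self.chunks = []
        self.finished = False

    def put(self, value):
        self.chunks.append(value)

    def end(self):
        self.finished = True

# Demo: simulate a generation loop feeding the streamer.
streamer = CollectingStreamer()
for piece in ("Once", " upon", " a time"):
    streamer.put(piece)
streamer.end()
```

This is why image-to-text can't stream: the pipeline never accepts such an object, so there is nowhere to receive incremental output.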
```diff
@@ -93,10 +103,11 @@ async def deserialize(
     await aiconfig.callback_manager.run_callbacks(CallbackEvent("on_deserialize_start", __name__, {"prompt": prompt, "params": params}))

     # Build Completion data
-    completion_params = self.get_model_settings(prompt, aiconfig)
+    model_settings = self.get_model_settings(prompt, aiconfig)
+    completion_params = refine_completion_params(model_settings)
```
Contributor
Good catch!

prompt.outputs = [output]
await aiconfig.callback_manager.run_callbacks(CallbackEvent("on_run_complete", __name__, {"result": prompt.outputs}))
prompt.outputs = outputs
print(f"{prompt.outputs=}")
Contributor
Remove print?

Contributor Author
Fixed in #862

Comment on lines +171 to +172
```python
# HuggingFace Text summarization does not support function
# calls so shouldn't get here, but just being safe
```
Contributor
nit: Hugging Face image-to-text...

Contributor Author
Fixed in #862

saqadri added a commit that referenced this pull request Jan 10, 2024
[HF][5/n] Image2Text: Allow base64 inputs for images

Before, we didn't allow base64, only URIs (local paths, http, or https). This is good because our text2Image model parser outputs base64, so this lets us chain model prompts!

## Test Plan

Rebase onto 0d7ae2b and test.

Follow the README from AIConfig Editor
https://github.com/lastmile-ai/aiconfig/tree/main/python/src/aiconfig/editor#dev,
then run these commands:
```bash
aiconfig_path=/Users/rossdancraig/Projects/aiconfig/cookbooks/Gradio/huggingface.aiconfig.json
parsers_path=/Users/rossdancraig/Projects/aiconfig/cookbooks/Gradio/hf_model_parsers.py
alias aiconfig="python3 -m 'aiconfig.scripts.aiconfig_cli'"
aiconfig edit --aiconfig-path=$aiconfig_path --server-port=8080 --server-mode=debug_servers --parsers-module-path=$parsers_path
```

Then in AIConfig Editor, run the prompt (streaming is not supported, so I just took screenshots).

These are the images I tested (with bear being in base64 format)

![fox_in_forest](https://github.com/lastmile-ai/aiconfig/assets/151060367/ca7d1723-9e12-4cc8-9d8d-41fa9f466919)

![bear-eating-honey](https://github.com/lastmile-ai/aiconfig/assets/151060367/a947d89e-c02a-4c64-8183-ff1c85802859)

<img width="1281" alt="Screenshot 2024-01-10 at 04 57 44"
src="https://github.com/lastmile-ai/aiconfig/assets/151060367/ea60cbc5-e6ab-4bf2-82e7-17f3182fdc5c">

---
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with
[ReviewStack](https://reviewstack.dev/lastmile-ai/aiconfig/pull/856).
* __->__ #856
* #855
* #854
* #853
* #851
@saqadri saqadri merged commit 19d7844 into main Jan 10, 2024
@rossdanlm rossdanlm deleted the pr855 branch January 10, 2024 18:38
rossdanlm pushed a commit that referenced this pull request Jan 10, 2024
Small fixes from comments by Sarmad and me on these diffs:

- #854
- #855
- #821

Main things I did:
- rename `refine_chat_completion_params` --> `chat_completion_params`
- edit `get_text_output` to not check for `OutputDataWithValue`
- sorted the init file to be alphabetical
- fixed some typos/print statements
- made some error messages a bit more intuitive with prompt name
- sorted some imports
- fixed old class name `HuggingFaceAutomaticSpeechRecognition` --> `HuggingFaceAutomaticSpeechRecognitionTransformer`

## Test Plan
These are all small nits and shouldn't change functionality
rossdanlm added a commit that referenced this pull request Jan 10, 2024
HF transformers: Small fixes nits

Small fixes from comments by Sarmad and me on these diffs:

- #854
- #855
- #821

Main things I did:
- rename `refine_chat_completion_params` --> `chat_completion_params`
- edit `get_text_output` to not check for `OutputDataWithValue`
- sorted the init file to be alphabetical
- fixed some typos/print statements
- made some error messages a bit more intuitive with prompt name
- sorted some imports
- fixed old class name `HuggingFaceAutomaticSpeechRecognition` -->
`HuggingFaceAutomaticSpeechRecognitionTransformer`

## Test Plan
These are all small nits and shouldn't change functionality
Successfully merging this pull request may close these issues.

Fast Follows for image2Text HF model parser