-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add support for multimodal on ChatMessages with ContentPart #7913
Conversation
… method was not used at init
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use the approach I suggested on this comment.
Pull Request Test Coverage Report for Build 9680828964Warning: This coverage report may be inaccurate.This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.
Details
💛 - Coveralls |
Hi @silvanocerza, Your comment for reference:
I replied to your comment a while ago remarking my concerns on how to know whether a string or 'ByteStream' is one or another data type, how do you suggest we solve that? I'm not a very experienced coder, so I do not know how to implement this using your method. If you are willing to give me some guidance I have no problem of implementing your solution, as I said, it results in a simpler API, but I cannot think of a way of implementing it. I will tag other members of our initial conversation so they can give their opinion on the matter. |
@CarlosFerLo The idea is to keep the abstraction level higher than what is proposed in this PR. If some Generator need to send the |
Okey, did not know what that field was thought for in the ByteStream. Just one last question, how can we represent image urls? If we go for the simple string, it is more natural, we might run into problems for differentiating urls and raw text, but if we go with a ByteStream we would find ourselves using a tool for a thing it wasn't meant to be used for, I believe. I will close this PR and start implementing your solution right away. |
|
@silvanocerza What if we add a prefix or something to the urls? To avoid duplicate information. |
Like |
Like a 'image_url' prefix. 'http' may appear in regular text. We could find a prefix that is not usual, and together with the url structure differentiate between ulrs and text. |
Related Issues
str
-only doesn't allow user to pass image #7848Proposed Changes:
I have added a new data class called 'ContentPart' as suggested in #7848 that holds any kind of content you might want to add to a 'ChatMessage', now only text, image urls and image in base 64 are implemented, as the original goal of this PR was to support Image passing to the OpenAI Chat API. This class allows for serialization and OpenAI specific formatting.
Added support for different content types on 'ChatMessage': 'str', 'ContentPart' and 'List[Union[str, ContentPart]]'. All serialization and OpenAI specific formatting was addressed.
Added support for these new content types for 'ChatMessage' on 'ChatPromptBuilder'.
How did you test it?
Added unit test for any new functionality for both 'ChatMessage' and 'ChatPromptBuilder', and created a new file containing unit tests for 'ContentPart'.
Notes for the reviewer
Checklist
fix:
,feat:
,build:
,chore:
,ci:
,docs:
,style:
,refactor:
,perf:
,test:
. ✅