.Net: Add overload ctor to ImageContent that takes a string "url" #4781
Comments
Another way is to remove the length limitation of
|
this was my initial thought, but I am not sure how this would work with validation that is done today in |
I think it is a bad idea to introduce this breaking change in the abstraction layer. |
I would also prefer one of the alternative designs, now that I think more about it. I still think it is too bad that Uri is a settable public property :/ |
@dersia This is exactly the reason why But from the Semantic Kernel point of view, we want developers to use So, answering your question - we don't want to introduce breaking changes, and a new property that keeps the image as base64 should work. Hope that makes sense, thank you! |
@dmytrostruk sounds good to me. I have started on the implementation (as I said, we need this sooner rather than later) and in this implementation I have gone down the non-breaking path. I also added overloads for also I was wondering if there would be a need for a stream version that would use a crypto stream to base64-encode on the fly while writing to the stream. This would be interesting for larger image files, but I don't think this is a concern as of now, right? |
Did you consider using
That would be awesome, thank you!
Yes, I think it's not a concern, at least for now. |
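For reference, the on-the-fly encoding idea raised above could look roughly like the following. This is only a minimal sketch and not something the PR implements: `WriteDataUri` and its parameters are hypothetical names, and the only real APIs used are the BCL's `CryptoStream` and `ToBase64Transform`.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper (not part of Semantic Kernel): writes
// "data:<mediaType>;base64,<payload>" to `output`, encoding `image`
// block by block instead of buffering the whole payload as one string.
static void WriteDataUri(Stream image, string mediaType, Stream output)
{
    // The data URI prefix is plain ASCII.
    byte[] prefix = Encoding.ASCII.GetBytes($"data:{mediaType};base64,");
    output.Write(prefix, 0, prefix.Length);

    // In Write mode, everything written to the CryptoStream is passed through
    // the transform and then on to `output`. leaveOpen keeps `output` usable.
    using var transform = new ToBase64Transform();
    using var base64 = new CryptoStream(output, transform, CryptoStreamMode.Write, leaveOpen: true);
    image.CopyTo(base64);
    // Disposing `base64` at the end of the method flushes the final block,
    // including any '=' padding.
}

// Usage with in-memory streams standing in for a file and a request body.
using var source = new MemoryStream(new byte[] { 1, 2, 3, 4, 5 });
using var target = new MemoryStream();
WriteDataUri(source, "image/png", target);
Console.WriteLine(Encoding.ASCII.GetString(target.ToArray()));
```

The benefit would mainly show up for large files, which matches the conclusion in the thread that it is not a concern for now.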
I will also adjust the tests to work with BinaryData. |
Added Tests Closes microsoft#4781
@dmytrostruk please find my PR #4919 |
Reopened PR. See #4919 for the previous PR. Reopening this PR from another feature branch makes rebasing easier.

### Motivation and Context

As described in #4781, right now there is no possibility in SK to add images as Data URIs to the ChatCompletion APIs, although the Azure OpenAI API and the OpenAI API both support this.

Fixes #4781

### Description

As per the discussion, added an overload to the ImageContent ctor that takes BinaryData. For backward compatibility we kept the ctor that takes a URI. The new ctor throws if the BinaryData is null or empty, or if no MediaType is provided.

I thought about allowing plain, non-base64-encoded Data URIs with BinaryData. The idea was to not encode to base64 if the MediaType is set to "text/plain", but then I decided that this is not needed, since `Uri` in general allows for Data URIs like `new Uri("data:text/plain;http://exmpaledomain.com")`, just not for Data URIs that are longer than 65520 bytes. I feel like that is OK for plain Data URIs. We can still add this if needed.

Also, as per the discussion in the issue, I did not add additional overloads for direct Stream support.

### Contribution Checklist

- [x] The code builds clean without any errors or warnings
- [x] The PR follows the [SK Contribution Guidelines](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md) and the [pre-submission formatting script](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md#development-scripts) raises no violations
- [x] All unit tests pass, and I have added new tests where possible
- [x] I didn't break anyone 😄

---------

Co-authored-by: Roger Barreto <19890735+RogerBarreto@users.noreply.github.com>
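The validation rules in the PR description (throw on null data, empty data, or a missing media type) can be illustrated with a small sketch. It is not the merged code: `ToImageDataUri` is a hypothetical name, and a plain `byte[]` plus a media type string stands in for `BinaryData` so that the example has no package dependencies.

```csharp
using System;

// Hypothetical stand-in for the checks described in the PR text.
// Assumption: byte[] + mediaType replaces BinaryData to keep this dependency-free.
static string ToImageDataUri(byte[] data, string mediaType)
{
    if (data is null)
    {
        throw new ArgumentNullException(nameof(data));
    }

    if (data.Length == 0)
    {
        throw new ArgumentException("Image data must not be empty.", nameof(data));
    }

    if (string.IsNullOrWhiteSpace(mediaType))
    {
        throw new ArgumentException("A media type such as image/png is required.", nameof(mediaType));
    }

    // Always base64-encode, matching the decision not to special-case text/plain.
    return $"data:{mediaType};base64,{Convert.ToBase64String(data)}";
}

// Usage: the first four bytes of a PNG signature as sample input.
string uri = ToImageDataUri(new byte[] { 0x89, 0x50, 0x4E, 0x47 }, "image/png");
Console.WriteLine(uri);
```

The sketch returns a string rather than a `Uri`, so the 65520 limit mentioned in the description does not apply to the encoded image.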
Thanks for merging this @RogerBarreto, however I thought we would wait for Azure/azure-rest-api-specs#27780 to be done so we can fully integrate this, because as of now it does not do anything :/ |
When using the Vision models of AzureOpenAI and OpenAI, the API allows for an image to be attached either as a URL or as a Data URI. In Semantic Kernel we do have `ImageContent` as a KernelContent-Component, but unfortunately, it takes only a `System.Uri` as the parameter for the object.

I suggest the following addition to `ImageContent`:

Proposed API
This would allow us to use Data URIs with embedded images for the Vision APIs.
Usage
Alternate Design
This would be a breaking change, since we change the type of the `Uri` property on `ImageContent`, so instead:

- Add `DataUri` to `ImageContent`, but this would also mean that we have to check which of the two properties is used and prefer one over the other if both are filled.
- Add a `private string field` that holds the DataUri; when the `Uri` ctor is used, we just store the Uri as a string in that field and only use the field at the implementation site.
- Have `ImageContent` take a ROS, byte[] or even just a Stream of the image and handle all the Base64 encoding in the `ImageContent` ctor (I think I would prefer a static Create method over a ctor for this option, since we can then read the stream async), but this would mean that we also need a second parameter with the image content type, and this would not allow for non-base64-encoded data URIs.

I am happy to work on the implementation, since I have to do it anyway, because we need the solution fairly soon.
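The last alternative, a static Create method that reads a Stream asynchronously and takes the content type as a second parameter, might be sketched as follows. This is purely illustrative: `CreateDataUriAsync` is a hypothetical name, and it returns the data URI string rather than an `ImageContent` instance so that the sketch does not depend on Semantic Kernel.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical async factory (not an existing Semantic Kernel API).
// Reads the whole image stream, then builds a base64 data URI from it.
static async Task<string> CreateDataUriAsync(
    Stream image, string mediaType, CancellationToken cancellationToken = default)
{
    if (image is null)
    {
        throw new ArgumentNullException(nameof(image));
    }

    if (string.IsNullOrWhiteSpace(mediaType))
    {
        throw new ArgumentException("A media type such as image/jpeg is required.", nameof(mediaType));
    }

    // Buffer the stream asynchronously; this async read is what a ctor cannot offer.
    using var buffer = new MemoryStream();
    await image.CopyToAsync(buffer, 81920, cancellationToken).ConfigureAwait(false);

    return $"data:{mediaType};base64,{Convert.ToBase64String(buffer.ToArray())}";
}

// Usage: a MemoryStream stands in for a file or network stream.
using var file = new MemoryStream(new byte[] { 0xFF, 0xD8, 0xFF });
string dataUri = await CreateDataUriAsync(file, "image/jpeg");
Console.WriteLine(dataUri);
```

As noted in the list above, this shape always base64-encodes, so it cannot produce plain, non-encoded data URIs.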