
Adding decoding of base64 image data for gemini pro 1.5 #3711

Merged
1 commit merged into BerriAI:main on May 20, 2024

Conversation

@hmcp22 commented on May 17, 2024

Adding decoding of base64 image data for gemini pro 1.5

Relevant issues

When passing a base64-encoded image to gemini-pro-1.5, the data URL is treated as a filesystem path, producing the following error:

Exception has occurred: APIConnectionError
[Errno 36] File name too long: '/home/hmcp22/hugo-repos/litellm/data:image/jpeg;base64,/9j/4AAQSkZJRgABAQEAS...
  File "/home/hmcp22/hugo-repos/litellm/litellm/main.py", line 1759, in completion
    model_response = gemini.completion(
                     ^^^^^^^^^^^^^^^^^^
  File "/home/hmcp22/hugo-repos/litellm/litellm/llms/gemini.py", line 147, in completion
    prompt = prompt_factory(
             ^^^^^^^^^^^^^^^
  File "/home/hmcp22/hugo-repos/litellm/litellm/llms/prompt_templates/factory.py", line 1505, in prompt_factory
    return _gemini_vision_convert_messages(messages=messages)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hmcp22/hugo-repos/litellm/litellm/llms/prompt_templates/factory.py", line 1359, in _gemini_vision_convert_messages
    raise e
  File "/home/hmcp22/hugo-repos/litellm/litellm/llms/prompt_templates/factory.py", line 1354, in _gemini_vision_convert_messages
    image = Image.open(img)
            ^^^^^^^^^^^^^^^

Type

🆕 New Feature
🐛 Bug Fix

Changes

Added code to load the image from base64 data in _gemini_vision_convert_messages
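The change makes the Gemini message converter recognize base64 data URLs instead of passing them to Image.open as file paths. A minimal sketch of the decoding step, using a hypothetical helper name (the actual change lives inside litellm's _gemini_vision_convert_messages):

```python
import base64
import io

from PIL import Image


def load_image_from_data_url(data_url: str) -> Image.Image:
    """Decode a base64 data URL (e.g. "data:image/jpeg;base64,...") into a PIL Image.

    Illustrative helper only; litellm's converter inlines equivalent logic.
    """
    # Split off the "data:image/...;base64," prefix to get the raw payload.
    _, encoded = data_url.split(",", 1)
    image_bytes = base64.b64decode(encoded)
    # Image.open accepts any file-like object, so wrap the decoded bytes.
    return Image.open(io.BytesIO(image_bytes))
```

The key difference from the failing code path is that Image.open receives an in-memory BytesIO object rather than the multi-kilobyte data URL string, which the OS rejected as an over-long filename.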

Code to test:

Set GEMINI_API_KEY
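Since the snippet below calls load_dotenv(), one way to set the key is a local .env file ("your-api-key-here" is a placeholder):

```shell
# Store the key in a .env file in the working directory;
# python-dotenv's load_dotenv() picks it up automatically.
echo 'GEMINI_API_KEY=your-api-key-here' > .env
```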

import base64

import litellm
from dotenv import load_dotenv

# Load GEMINI_API_KEY from a local .env file.
load_dotenv()


def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")


image_path = "landmark3.jpg"
base64_image = encode_image(image_path)

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Describe the image in a few sentences.",
            },
            {
                "type": "image_url",
                "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"},
            },
        ],
    }
]

response = litellm.completion(
    model="gemini/gemini-1.5-pro-latest",
    messages=messages,
)
content = response.get("choices", [{}])[0].get("message", {}).get("content")
print(content)

Screenshot of the result of running the above code: [screenshot attachment, 2024-05-17]


@ishaan-jaff ishaan-jaff merged commit 622e241 into BerriAI:main May 20, 2024
2 checks passed
@ishaan-jaff (Contributor) commented:

Hi @hmcp22, can we hop on a call sometime this week? I'd love to learn how we can improve litellm for you. What's the best email to send an invite to?

If it's easier here's a link to my cal https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
