# How to pass multi-modal data directly to models

Here we demonstrate how to pass multi-modal input directly to models. 
We currently expect all input to be passed in the same format as [OpenAI expects](https://platform.openai.com/docs/guides/vision).
For other model providers that support multi-modal input, we have added logic inside the corresponding chat model class to convert this format to the one the provider expects.

In this example we will ask a model to describe an image.

In [1]:
image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"

In [2]:
from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o")

The most widely supported way to pass in an image is as a base64-encoded string embedded in a data URL.
This should work for most model integrations.

In [3]:
import base64

import httpx

# Download the image and base64-encode its bytes so they can be embedded in a data URL.
image_data = base64.b64encode(httpx.get(image_url).content).decode("utf-8")

In [4]:
message = HumanMessage(
    content=[
        {"type": "text", "text": "describe the weather in this image"},
        {
            "type": "image_url",
            "image_url": {"url": f"data:image/jpeg;base64,{image_data}"},
        },
    ],
)
response = model.invoke([message])
print(response.content)

The weather in the image appears to be clear and pleasant. The sky is mostly clear with some scattered clouds, indicating good visibility and probably a sunny day. The lighting suggests that it might be either morning or late afternoon. The lush green grass and foliage suggest that it is likely spring or summer. There are no signs of harsh weather conditions such as rain, fog, or strong winds. Overall, it looks like a calm and beautiful day.
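
The content block in the cell above hard-codes `image/jpeg` in the data URL. As a minimal sketch, the encoding step could be wrapped in a small helper that guesses the MIME type from a file name. The helper name `image_block_from_bytes` and its `filename` parameter are hypothetical and not part of LangChain; only the standard library is used.

```python
import base64
import mimetypes
from typing import Any, Dict


def image_block_from_bytes(data: bytes, filename: str = "image.jpg") -> Dict[str, Any]:
    """Build an OpenAI-style "image_url" content block from raw image bytes.

    Hypothetical helper for illustration; not a LangChain API.
    """
    # Guess the MIME type from the file extension; fall back to JPEG
    # when the extension is missing or is not an image type.
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None or not mime_type.startswith("image/"):
        mime_type = "image/jpeg"

    # Base64-encode the bytes and embed them in a data URL.
    encoded = base64.b64encode(data).decode("utf-8")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }
```

The returned dict can be placed in the `content` list of a `HumanMessage` next to a text block, in place of the hand-written block used above.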


We can also pass the image URL directly in a content block of type "image_url". Note that only some model providers support this.

In [5]:
message = HumanMessage(
    content=[
        {"type": "text", "text": "describe the weather in this image"},
        {"type": "image_url", "image_url": {"url": image_url}},
    ],
)
response = model.invoke([message])
print(response.content)

The weather in the image appears to be clear and sunny. The sky is mostly blue with some scattered white clouds, suggesting a pleasant day with no signs of rain or storm. The sunlight is bright, illuminating the green grass and the wooden path, which indicates a likely warm or mild temperature. Overall, the scene depicts a calm and pleasant day.


We can also pass in multiple images.

In [6]:
message = HumanMessage(
    content=[
        {"type": "text", "text": "are these two images the same?"},
        {"type": "image_url", "image_url": {"url": image_url}},
        {"type": "image_url", "image_url": {"url": image_url}},
    ],
)
response = model.invoke([message])
print(response.content)

Yes, the two images are the same. They both depict a wooden pathway through a grassy field under a blue sky with wispy clouds. The composition, colors, and elements in both images are identical.
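
When the number of images varies, the content list can be built programmatically rather than written out by hand. The sketch below assumes the same OpenAI-style blocks used throughout this guide; the function name `build_image_content` is hypothetical and not part of LangChain.

```python
from typing import Any, Dict, List


def build_image_content(prompt: str, image_urls: List[str]) -> List[Dict[str, Any]]:
    """Return a content list with one text block followed by one image block per URL.

    Hypothetical helper for illustration; not a LangChain API.
    """
    # The text prompt comes first, matching the examples above.
    content: List[Dict[str, Any]] = [{"type": "text", "text": prompt}]
    # Each URL (remote or data URL) becomes its own "image_url" block.
    for url in image_urls:
        content.append({"type": "image_url", "image_url": {"url": url}})
    return content
```

The result can be passed as `HumanMessage(content=build_image_content("are these two images the same?", [image_url, image_url]))`, which should be equivalent to the message constructed in the cell above.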
