# MultiModality

- Table of contents
    - [Image](#image)
      1. [From Data](#1-from-data)
      2. [From URL](#2-from-url)
      3. [Using Multiple Images](#using-multiple-images)

    - [Documents (PDFs)](#documents-pdfs)
      1. [From Data](#from-data)
      2. [From URI](#from-uri)

    - [Using Multi-modal prompts](#using-multi-model-prompts)

In [6]:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")

## Image
#### 1. From Data

In [None]:
import base64
import httpx
from langchain_core.messages import HumanMessage
from langchain.chat_models import init_chat_model

model = init_chat_model("anthropic:claude-3-5-sonnet-latest")

# Fetch image data
image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
image_data = base64.b64encode(httpx.get(image_url).content).decode("utf-8")


message = HumanMessage(
    content=[
        {
            "type": "text",
            "text": "Describe the weather in this image:"
        },
        {
            "type": "image",
            "source_type": "base64",
            "data": image_data,
            "mime_type": "image/jpeg"
        }
    ]
)

# Use the Anthropic model created in this cell
response = model.invoke([message])
print(response.text())

The image shows a beautiful clear day with bright blue skies and wispy cirrus clouds stretching across the horizon. The clouds are thin and streaky, creating elegant patterns against the blue backdrop. The lighting suggests it's during the day, possibly late afternoon given the warm, golden quality of the light on the grass. The weather appears calm with no signs of wind (the grass looks relatively still) and no indication of rain. It's the kind of perfect, mild weather that's ideal for walking along the wooden boardwalk through the marsh grass.
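
The base64 block used above can be wrapped in a small helper so the encoding step and the block layout stay in one place. The following is a minimal sketch using only the standard library; the function name `image_block_from_bytes` is hypothetical, and the block uses the `mime_type` key, as the PDF example in this notebook does.

```python
import base64


def image_block_from_bytes(raw: bytes, mime_type: str = "image/jpeg") -> dict:
    """Build a base64 image content block from raw image bytes.

    Hypothetical helper: the dict layout mirrors the block used in this notebook.
    """
    return {
        "type": "image",
        "source_type": "base64",
        # Chat models receive text, so the bytes are base64-encoded and decoded to str.
        "data": base64.b64encode(raw).decode("utf-8"),
        "mime_type": mime_type,
    }


# Stand-in payload (the first four bytes of a JPEG file), used only to show the round trip.
block = image_block_from_bytes(b"\xff\xd8\xff\xe0")
print(block["data"])  # /9j/4A==
```

In practice, `raw` would be the downloaded bytes, for example `httpx.get(image_url).content`.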


#### 2. From URL

In [None]:
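message = HumanMessage(
    content=[
        {
            "type": "text",
            "text": "Describe the weather in this image:"
        },
        {
            "type": "image",
            "source_type": "url",
            # URL blocks reference the image with a "url" key instead of inline data
            "url": image_url
        }
    ]
)

response = llm.invoke([message])
print(response.text())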

#### Using Multiple Images

In [None]:
message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Are these two images the same?"},
        {"type": "image", "source_type": "url", "url": image_url},
        {"type": "image", "source_type": "url", "url": image_url},
    ],
}
response = llm.invoke([message])
print(response.text())
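
When the number of images varies, the message can be assembled programmatically. This is a minimal sketch that builds the same dictionary shape as the cell above; the helper name `build_image_message` and the `example.com` URLs are placeholders, not part of any library.

```python
def build_image_message(question: str, image_urls: list[str]) -> dict:
    """Build a user message with one text block followed by one image block per URL."""
    content = [{"type": "text", "text": question}]
    # Append one URL-based image block for each image, preserving order.
    content.extend(
        {"type": "image", "source_type": "url", "url": url} for url in image_urls
    )
    return {"role": "user", "content": content}


message = build_image_message(
    "Are these two images the same?",
    ["https://example.com/a.jpg", "https://example.com/b.jpg"],  # placeholder URLs
)
print(len(message["content"]))  # 3
```

The resulting `message` can be passed to the model in the same way as the hand-written one, for example `llm.invoke([message])`.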

## Documents (PDFs)

- Some providers, such as OpenAI, Anthropic, and Google Gemini, accept PDF documents as input.

#### From Data

In [None]:
import base64

import httpx
from langchain.chat_models import init_chat_model

# Fetch PDF data
pdf_url = "https://pdfobject.com/pdf/sample.pdf"
pdf_data = base64.b64encode(httpx.get(pdf_url).content).decode("utf-8")


# Pass to LLM
llm = init_chat_model("anthropic:claude-3-5-sonnet-latest")

message = {
    "role": "user",
    "content": [
        {
            "type": "text",
            "text": "Describe the document:",
        },
        {
            "type": "file",
            "source_type": "base64",
            "data": pdf_data,
            "mime_type": "application/pdf",
        },
    ],
}
response = llm.invoke([message])
print(response.text())
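
A download can silently return an HTML error page instead of a PDF, so it may help to check the payload before encoding it. The sketch below is an assumption-laden heuristic rather than a full validator: it only checks that the bytes begin with the `%PDF-` header, and the helper name `pdf_block_from_bytes` is hypothetical.

```python
import base64


def pdf_block_from_bytes(raw: bytes) -> dict:
    """Build a base64 file content block, rejecting payloads without a PDF header."""
    # PDF files begin with the marker "%PDF-" followed by a version number.
    if not raw.startswith(b"%PDF-"):
        raise ValueError("payload does not start with the %PDF- header")
    return {
        "type": "file",
        "source_type": "base64",
        "data": base64.b64encode(raw).decode("utf-8"),
        "mime_type": "application/pdf",
    }


# Stand-in payload: just a PDF header line, enough to pass the check.
block = pdf_block_from_bytes(b"%PDF-1.4\n")
print(block["mime_type"])  # application/pdf
```

With real data, `raw` would be `httpx.get(pdf_url).content`, and the returned block would replace the hand-written `file` block in the message.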

#### From URI

In [None]:
message = {
    "role": "user",
    "content": [
        {
            "type": "text",
            "text": "Describe the document:",
        },
        {
            "type": "file",
            "source_type": "url",
            "url": pdf_url,
        },
    ],
}
response = llm.invoke([message])
print(response.text())

## Using Multi-model prompts

- This section shows how to use **`ChatPromptTemplate`** to format multimodal input data.

- Prompt that takes a URL

In [None]:
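from langchain_core.prompts import ChatPromptTemplate

# Define a prompt.
# Messages are given as role/content dicts so that "{image_url}" is treated as a
# template variable; message objects such as HumanMessage are passed through as-is
# and their placeholders are not filled in.
prompt = ChatPromptTemplate(
    [
        {
            "role": "system",
            "content": "Describe the image provided.",
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source_type": "url",
                    "url": "{image_url}",
                }
            ],
        },
    ]
)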

- Pass the image URL into the prompt

In [None]:
from langchain.chat_models import init_chat_model

llm = init_chat_model("anthropic:claude-3-5-sonnet-latest")

url = "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"

chain = prompt | llm
response = chain.invoke({"image_url": url})
print(response.text())
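
Conceptually, formatting the prompt replaces the `{image_url}` placeholder inside the nested content block with the value supplied at invoke time. The sketch below illustrates that idea with plain Python; it is not LangChain's implementation, and `fill_placeholders` is a hypothetical helper written for this example.

```python
def fill_placeholders(value, variables: dict):
    """Recursively substitute "{name}" placeholders in nested strings, lists, and dicts."""
    if isinstance(value, str):
        # str.format fills "{image_url}" style fields; literal braces would need escaping.
        return value.format(**variables)
    if isinstance(value, list):
        return [fill_placeholders(item, variables) for item in value]
    if isinstance(value, dict):
        return {key: fill_placeholders(item, variables) for key, item in value.items()}
    return value  # numbers, None, and other types are left unchanged


template_block = {"type": "image", "source_type": "url", "url": "{image_url}"}
filled = fill_placeholders(
    template_block, {"image_url": "https://example.com/boardwalk.jpg"}  # placeholder URL
)
print(filled["url"])  # https://example.com/boardwalk.jpg
```

The original `template_block` is left untouched, which is why the same prompt can be reused with different image URLs.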