# Chat with PDF page images

**If you're looking for the web application, check the src/ folder.**

This notebook demonstrates how to convert PDF pages to images and send them to a vision model for inference.

## Authenticate to OpenAI

The following code creates an OpenAI client using an Azure OpenAI account, GitHub Models, or a local Ollama model. See the README for instructions on configuring the `.env` file.

In [1]:
import os

import azure.identity
import openai
from dotenv import load_dotenv

load_dotenv(".env", override=True)

openai_host = os.getenv("OPENAI_HOST")
if openai_host == "local":
    print("Using local OpenAI-compatible API with no key")
    openai_client = openai.OpenAI(api_key="no-key-required", base_url=os.environ["LOCAL_OPENAI_ENDPOINT"])
elif openai_host == "github":
    print("Using GitHub-hosted model")
    openai_client = openai.OpenAI(
        api_key=os.environ["GITHUB_TOKEN"],
        base_url=os.environ["GITHUB_MODELS_ENDPOINT"],
    )
elif os.getenv("AZURE_OPENAI_KEY_FOR_CHATVISION"):
    # Authenticate using an Azure OpenAI API key
    # This is generally discouraged, but is provided as a convenience
    print("Using Azure OpenAI with key")
    openai_client = openai.AzureOpenAI(
        api_version=os.getenv("AZURE_OPENAI_API_VERSION") or "2024-02-15-preview",
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_KEY_FOR_CHATVISION"],
    )
elif os.getenv("AZURE_OPENAI_ENDPOINT"):
    tenant_id = os.environ["AZURE_TENANT_ID"]
    print("Using Azure OpenAI with Azure Developer CLI credential for tenant id", tenant_id)
    default_credential = azure.identity.AzureDeveloperCliCredential(tenant_id=tenant_id)
    token_provider = azure.identity.get_bearer_token_provider(
        default_credential, "https://cognitiveservices.azure.com/.default"
    )
    openai_client = openai.AzureOpenAI(
        api_version=os.getenv("AZURE_OPENAI_API_VERSION") or "2024-02-15-preview",
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        azure_ad_token_provider=token_provider,
    )

Using Azure OpenAI with Azure Developer CLI credential for tenant id 1bd0d125-6c64-49d1-af0d-88fa60e18074
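The branch order in the cell above matters: `OPENAI_HOST` is checked first, then the Azure API key, then the Azure endpoint. As a minimal sketch, the same precedence can be written as a pure function over a mapping of environment values, which may help when checking a `.env` configuration before creating a client. The function name `select_openai_host` and the returned labels are hypothetical and not part of the notebook.

```python
def select_openai_host(env):
    """Return a label for the branch the authentication cell would take.

    `env` is any mapping of environment variable names to values,
    for example `os.environ` or a plain dict.
    """
    host = env.get("OPENAI_HOST")
    if host == "local":
        return "local"
    if host == "github":
        return "github"
    # An API key takes precedence over keyless (token-based) Azure authentication
    if env.get("AZURE_OPENAI_KEY_FOR_CHATVISION"):
        return "azure-key"
    if env.get("AZURE_OPENAI_ENDPOINT"):
        return "azure-keyless"
    # No branch matches, so the cell would leave openai_client undefined
    return None
```

Calling `select_openai_host(os.environ)` after `load_dotenv` should report which branch applies. A result of `None` suggests that the `.env` file is missing the variables the cell expects.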


## Convert PDFs to images

In [2]:
!pip install Pillow PyMuPDF


In [3]:
import pymupdf
from PIL import Image

filename = "plants.pdf"
doc = pymupdf.open(filename)
# Render each page to a PNG file; `doc` stays open because a later cell reads doc.page_count
for i in range(doc.page_count):
    page = doc.load_page(i)
    pix = page.get_pixmap()
    original_img = Image.frombytes("RGB", (pix.width, pix.height), pix.samples)
    original_img.save(f"page_{i}.png")
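The conversion above renders at PyMuPDF's default resolution (72 dpi), which can make small print hard for a vision model to read. The following is a sketch of a variant that renders at a higher resolution and writes the PNG directly from the pixmap. The helper names `page_image_path` and `render_pdf_pages`, the `out_dir` parameter, and the `dpi=150` default are assumptions for illustration, not part of the notebook.

```python
from pathlib import Path


def page_image_path(out_dir, index):
    """Build the output path for a page, matching the notebook's page_{i}.png naming."""
    return Path(out_dir) / f"page_{index}.png"


def render_pdf_pages(pdf_path, out_dir=".", dpi=150):
    """Render every page of a PDF to a PNG file and return the written paths."""
    # Imported here so page_image_path can be used without PyMuPDF installed
    import pymupdf

    paths = []
    with pymupdf.open(pdf_path) as doc:
        for index, page in enumerate(doc):
            # A higher dpi gives a larger image: more legible, but a bigger request payload
            pix = page.get_pixmap(dpi=dpi)
            path = page_image_path(out_dir, index)
            pix.save(str(path))
            paths.append(path)
    return paths
```

For example, `render_pdf_pages("plants.pdf")` would write `page_0.png`, `page_1.png`, and so on into the current directory. Because larger images increase request size and may increase cost, the resolution is a trade-off to tune for the document at hand.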

## Send images to vision model

In [4]:
import base64


def open_image_as_base64(filename):
    with open(filename, "rb") as image_file:
        image_data = image_file.read()
    image_base64 = base64.b64encode(image_data).decode("utf-8")
    return f"data:image/png;base64,{image_base64}"
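The helper above hardcodes `image/png` in the data URL, which is correct for the files this notebook writes. If other image formats were used, the MIME type would need to match the file. The sketch below, with the hypothetical name `open_file_as_data_url`, guesses the type from the file extension using the standard library.

```python
import base64
import mimetypes


def open_file_as_data_url(filename):
    """Read a file and return it as a base64 data URL with a guessed MIME type."""
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None:
        # Fail early rather than send a data URL with an unknown type
        raise ValueError(f"Cannot determine MIME type for {filename}")
    with open(filename, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"
```

The guess is based only on the file name, not the file contents, so a misnamed file would still produce a mislabeled URL.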

In [5]:
user_content = [{"text": "What plants are listed on these pages?", "type": "text"}]
for i in range(doc.page_count):
    user_content.append({"image_url": {"url": open_image_as_base64(f"page_{i}.png")}, "type": "image_url"})

response = openai_client.chat.completions.create(
    model=os.environ["OPENAI_MODEL"], messages=[{"role": "user", "content": user_content}], temperature=0.5
)

print(response.choices[0].message.content)

The plants listed on these pages from The Watershed Nursery are categorized into Annuals, Bulbs, Grasses, Perennials, and other categories. Here is a detailed list of the plants:

### Annuals
1. Centromadia pungens (Common tarweed)
2. Epilobium densiflorum (Dense Spike-primrose)
3. Eschscholzia caespitosa (Tufted Poppy)
4. Eschscholzia californica (California poppy)
5. Eschscholzia californica 'Purple Gleam' (Purple Gleam Poppy)
6. Eschscholzia californica var. maritima (Coastal California Poppy)
7. Madia elegans (Tarweed)
8. Mentzelia lindleyi (Lindley's Blazing Star)
9. Symphyotrichum subulatum (Slim marsh aster)
10. Trichostema lanceolatum (Vinegar weed)
11. Trichostema lanceolatum (Vinegar weed)

### Bulbs
1. Brodiaea californica (California brodiaea)
2. Chlorogalum pomeridianum (Soap plant)
3. Epipactis gigantea (Stream orchid)
4. Wyethia angustifolia (Narrowleaf mule ears)
5. Wyethia angustifolia (Narrowleaf mule ears)
6. Wyethia angustifolia (Narrowleaf mule ears)
7. Wyethia mol