<a href="https://colab.research.google.com/github/graphlit/graphlit-samples/blob/main/python/Notebook%20Examples/Graphlit_2024_11_16_Describe_Image_with_Vision_LLM.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Description**

This example shows how to use Graphlit to describe a provided image using a vision LLM.

**Requirements**

Before running this notebook, [sign up](https://docs.graphlit.dev/getting-started/signup) for Graphlit and [create a project](https://docs.graphlit.dev/getting-started/create-project).

You will need the Graphlit organization ID, preview environment ID and JWT secret from your created project.

Assign these properties as Colab secrets: GRAPHLIT_ORGANIZATION_ID, GRAPHLIT_ENVIRONMENT_ID and GRAPHLIT_JWT_SECRET.


---

Install Graphlit Python client SDK

In [22]:
!pip install --upgrade graphlit-client



Initialize Graphlit

In [23]:
import os
from google.colab import userdata
from graphlit import Graphlit
from graphlit_api import input_types, enums, exceptions

os.environ['GRAPHLIT_ORGANIZATION_ID'] = userdata.get('GRAPHLIT_ORGANIZATION_ID')
os.environ['GRAPHLIT_ENVIRONMENT_ID'] = userdata.get('GRAPHLIT_ENVIRONMENT_ID')
os.environ['GRAPHLIT_JWT_SECRET'] = userdata.get('GRAPHLIT_JWT_SECRET')

graphlit = Graphlit()
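
A quick check of the three settings can make a missing or empty secret easier to diagnose. The helper below is a hypothetical sketch and not part of the Graphlit SDK; it only inspects a mapping such as `os.environ` for the names listed under Requirements.

```python
import os
from typing import List, Mapping

# The three settings named in the Requirements section.
REQUIRED_SETTINGS = (
    "GRAPHLIT_ORGANIZATION_ID",
    "GRAPHLIT_ENVIRONMENT_ID",
    "GRAPHLIT_JWT_SECRET",
)

def missing_settings(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required setting names that are absent or empty in env."""
    return [name for name in REQUIRED_SETTINGS if not env.get(name)]

# Hypothetical usage: report anything that still needs to be configured.
missing = missing_settings()
if missing:
    print(f"Missing Graphlit settings: {', '.join(missing)}")
```

Passing the mapping as a parameter keeps the check easy to exercise with a plain dictionary instead of the real environment.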

Define Graphlit helper functions

In [24]:
from typing import List, Optional

async def create_openai_specification(model: enums.OpenAIModels):
    if graphlit.client is None:
        return None

    spec_input = input_types.SpecificationInput(
        name=f"OpenAI [{str(model)}]",
        type=enums.SpecificationTypes.EXTRACTION,
        serviceType=enums.ModelServiceTypes.OPEN_AI,
        openAI=input_types.OpenAIModelPropertiesInput(
            model=model,
        )
    )

    try:
        response = await graphlit.client.create_specification(spec_input)

        return response.create_specification.id if response.create_specification is not None else None
    except exceptions.GraphQLClientError as e:
        print(str(e))
        return None

async def describe_encoded_image(prompt: str, mime_type: str, data: str, specification_id: str):
    if graphlit.client is None:
        return None

    try:
        response = await graphlit.client.describe_encoded_image(
            prompt=prompt,
            mime_type=mime_type,
            data=data,
            specification=input_types.EntityReferenceInput(id=specification_id),
        )

        message = response.describe_encoded_image.message if response.describe_encoded_image is not None else None

        return message
    except exceptions.GraphQLClientError as e:
        print(str(e))
        return None

async def describe_image(prompt: str, uri: str, specification_id: str):
    if graphlit.client is None:
        return None

    try:
        response = await graphlit.client.describe_image(prompt=prompt, uri=uri, specification=input_types.EntityReferenceInput(id=specification_id))

        message = response.describe_image.message if response.describe_image is not None else None

        return message
    except exceptions.GraphQLClientError as e:
        print(str(e))
        return None

async def delete_all_specifications():
    if graphlit.client is None:
        return

    _ = await graphlit.client.delete_all_specifications(is_synchronous=True)


In [25]:
import base64
import requests
from io import BytesIO
from PIL import Image

def image_to_base64(uri: str) -> Optional[str]:
    """
    Fetch an image from a URI and return its base64 representation.

    Args:
        uri (str): The URI of the image.

    Returns:
        Optional[str]: The base64-encoded image, or None if loading fails.
    """
    try:
        # Fetch the image content from the URI (time out rather than hang)
        response = requests.get(uri, timeout=30)
        response.raise_for_status()

        # Open the image using PIL
        image = Image.open(BytesIO(response.content))

        # Convert the image to bytes
        buffer = BytesIO()
        image.save(buffer, format=image.format)
        image_bytes = buffer.getvalue()

        # Encode the bytes to a base64 string
        base64_string = base64.b64encode(image_bytes).decode('utf-8')

        return base64_string
    except Exception as e:
        print(f"Error loading image from URI: {e}")
        return None
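
The example cell further down hard-codes `image/jpeg` as the MIME type passed with the encoded data. For other image URIs, the type could instead be guessed from the file extension with the standard-library `mimetypes` module. This is a minimal sketch; `guess_image_mime_type` and its `default` parameter are hypothetical names, not part of Graphlit.

```python
import mimetypes

def guess_image_mime_type(uri: str, default: str = "image/jpeg") -> str:
    """Guess an image MIME type from the file extension in the URI."""
    mime_type, _ = mimetypes.guess_type(uri)
    # Fall back when the extension is unknown or does not map to an image type.
    if mime_type is None or not mime_type.startswith("image/"):
        return default
    return mime_type
```

The guess relies only on the extension, so it does not confirm what the bytes actually contain; inspecting `image.format` from PIL would be an alternative when the image has already been opened.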


Execute Graphlit example

In [26]:
from IPython.display import display, Markdown

# Remove any existing specifications; only needed for notebook example
await delete_all_specifications()

print('Deleted all specifications.')

uri = "https://graphlitplatform.blob.core.windows.net/test/images/medical_chart.jpeg"

Deleted all specifications.


Create a specification, then describe the image first by URI and then as Base64-encoded data.

In [27]:
specification_id = await create_openai_specification(enums.OpenAIModels.GPT4O_128K)

if specification_id is not None:
    print(f'Created specification [{specification_id}].')

    prompt = "Thoroughly describe this image."

    message = await describe_image(prompt, uri, specification_id)

    if message is not None:
        display(Markdown(f'**URI Description:**\n{message}'))
        print()

    data = image_to_base64(uri)
    mime_type = 'image/jpeg'

    prompt = "Extract all the text from this image into Markdown format."

    if data is not None:
        message = await describe_encoded_image(prompt, mime_type, data, specification_id)

        if message is not None:
            display(Markdown(f'**Base64 Description:**\n{message}'))
            print()


Created specification [cb3a1576-de18-48a3-8170-01ee2de0771c].


**URI Description:**
The image is a graph titled "Statistical Analysis of SE-HPLC (% Main Peak) Purity Data for Drug Product Stored at <<StorageRecommended>>°C." It displays the purity data of a drug product over time, measured in months, using SE-HPLC (Size-Exclusion High-Performance Liquid Chromatography).

### Key Elements:

- **Y-Axis:** Represents the percentage of the main peak by SE-HPLC, ranging from 98.5% to 100.0%.
- **X-Axis:** Represents time in months, from 0 to 48 months.
- **Data Points:** 
  - Black circles represent raw data.
  - Red hash marks indicate the worst-case lot (049D108163).
- **Lines:**
  - A solid red horizontal line at 98.5% marks the specification limit.
  - A green dashed line represents the 1-sided 95% confidence bound on the worst-case lot.
  - A blue dashed line shows the predicted mean.
  - A vertical dotted line indicates the current shelf life.

### Observations:

- Most data points are above the specification limit of 98.5%.
- The predicted mean line (blue) shows a slight downward trend over time.
- The worst-case lot data points (red) are generally lower than the other data points but still mostly above the specification limit.
- The confidence bound (green dashed line) also trends downward but remains above the specification limit.

This graph is used to analyze the stability and purity of the drug product over time, ensuring it remains within acceptable limits.




**Base64 Description:**
```markdown
Figure 1. Statistical Analysis of SE-HPLC (% Main Peak) Purity Data for Drug Product Stored at
<<StorageRecommended>>°C

% Main Peak by SE-HPLC

Specification=98.5%

O O O Raw Data
# # # Worst Case Lot 049D108163
-- -- -- 1-sided 95% Confidence Bound on Worst Case Lot
.. __ .. Current Shelf Life
------- Predicted Mean

Time (months)
0 6 12 18 24 30 36 42 48
```
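
The text-extraction response above came back wrapped in a Markdown code fence. If only the extracted text is needed, the wrapper can be removed before further processing. The helper below is a hypothetical sketch that assumes at most one outer fence; it is not part of the Graphlit SDK.

```python
FENCE = "`" * 3  # three backticks, built this way to keep the example readable

def strip_code_fence(text: str) -> str:
    """Remove a single wrapping Markdown code fence, if one is present."""
    lines = text.strip().splitlines()
    # The opening line may carry a language tag, e.g. the fence followed by "markdown".
    if len(lines) >= 2 and lines[0].startswith(FENCE) and lines[-1].strip() == FENCE:
        return "\n".join(lines[1:-1])
    return text
```

Text without a wrapping fence is returned unchanged, so the helper can be applied to any response regardless of how the model formatted it.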



