# Accessing multimodal capabilities in GPT-4o

This notebook provides a basic demonstration of how to use the GPT-4o model through the OpenAI API to interpret both text and image prompts.

Relevant links:
- [Introduction to GPT-4o cookbook from OpenAI](https://cookbook.openai.com/examples/gpt4o/introduction_to_gpt4o)
- [API reference for chat completions](https://platform.openai.com/docs/api-reference/chat/create)
- [Documentation for function calling](https://platform.openai.com/docs/guides/function-calling)
- [JSON Schema documentation](https://json-schema.org/understanding-json-schema/reference/non_json_data#light-scheme-icon)

In [1]:
# Import necessary libraries
import os
import pandas as pd
from openai import OpenAI
from dotenv import load_dotenv
import base64
import json
from datetime import date

# Load the .env file
load_dotenv()

# Set up an OpenAI object using the OpenAI API key
client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))

## Create a standard chat completion using GPT-4o.

In [2]:
completion = client.chat.completions.create(
  model="gpt-4o",
  messages=[
    {
      "role": "system", 
      "content": "You are a helpful assistant."
    },
    {
      "role": "user", 
      "content": "Write a haiku about a duck."
    } 
  ]
)

print(completion.choices[0].message.content)

Beneath golden leaves,
A duck glides on tranquil ponds—
Whispers of autumn.


### Encode images
To pass an image to the model, first turn it into a base64-encoded string.

In [3]:
# Image path
IMAGE_PATH = "data-eob/claim_sunil_01162024.png"

# Encode the image file as a base64 string
def encode_image(image_path):
  with open(image_path, "rb") as image_file:
    return base64.b64encode(image_file.read()).decode("utf-8")

base64_image = encode_image(IMAGE_PATH)
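
The data URL in the next step hard-codes `image/png`. As a minimal sketch, assuming you may also want to send JPEG or other formats, the MIME type can be guessed from the file extension with the standard library `mimetypes` module. `to_data_url` is a hypothetical helper name introduced here for illustration; it is not part of the OpenAI SDK.

```python
import base64
import mimetypes


def to_data_url(image_path, default_mime="image/png"):
    """Return a base64 data URL for the file at image_path (hypothetical helper)."""
    # Guess the MIME type from the file extension; fall back to the default.
    mime_type, _ = mimetypes.guess_type(image_path)
    if mime_type is None:
        mime_type = default_mime
    # Read the raw bytes and encode them as a base64 string.
    with open(image_path, "rb") as image_file:
        encoded = base64.b64encode(image_file.read()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"
```

The returned string could be passed as the `url` value of an `image_url` content part in place of a hand-built f-string.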

### Prompt GPT-4o using an image
Next, pass the image in the messages object by setting the `type` to `image_url`.

In [4]:

# Pass the image to GPT-4o along with a prompt.
response = client.chat.completions.create(
  model="gpt-4o",
  messages=[
    {
      "role": "system",
      "content": "Answer the question based on the provided image."
    },
    {
      "role": "user",
      "content": [
        {
            "type": "text",
            "text": "Who are the providers listed in this image?"
        },
        {
          "type": "image_url", 
          "image_url": {
            "url": f"data:image/png;base64,{base64_image}"
          }
        }
      ]
    }
  ],
  temperature=0.0,
)

# Print the response
print(response.choices[0].message.content)

The providers listed in the image are:

1. JOSHUA A HALL
2. SNAP DIAGNOSTICS LLC


### Get structured data from GPT-4o
To get the output as JSON, specify JSON output in the system message and set the `response_format` parameter to `{"type": "json_object"}`.

In [5]:
response = client.chat.completions.create(
  model="gpt-4o",
  messages=[
    {
      "role": "system",
      "content": "If the image is an explanation of benefits (EOB), output the provider, date of service, service description, provider billed, member discount, net charged, copay, and total for each provider as JSON. If it is not an EOB, ask for an EOB."
    },
    {
      "role": "user",
      "content": [
        {
          "type": "image_url",
          "image_url": {
            "url": f"data:image/png;base64,{base64_image}"
          }
        }
      ]
    }
  ],
  temperature=0.0,
  response_format={ "type": "json_object" }
)

print(response.choices[0].message.content)

{
  "providers": [
    {
      "provider": "JOSHUA A HALL",
      "date_of_service": "01/16/2024",
      "service_description": "OFFICE/OTHER OUTPATIENT VISITS",
      "provider_billed": "$135.00",
      "member_discount": "$100.48",
      "net_charged": "$34.52",
      "provider_adjustment": "$0.00",
      "other_health_plan_coverage": "$0.00",
      "your_plan_paid": "$0.00",
      "copay": "$34.52",
      "deductible": "$0.00",
      "coinsurance": "$0.00",
      "excluded": "$0.00",
      "total": "$34.52"
    },
    {
      "provider": "SNAP DIAGNOSTICS LLC",
      "date_of_service": "12/16/2023",
      "service_description": "DIAGNOSTIC TESTING",
      "provider_billed": "$1,100.00",
      "member_discount": "$1,100.00",
      "net_charged": "$0.00",
      "provider_adjustment": "$0.00",
      "other_health_plan_coverage": "$0.00",
      "your_plan_paid": "$0.00",
      "copay": "$0.00",
      "deductible": "$0.00",
      "coinsurance": "$0.00",
      "excluded": "$0.00",
      "
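
In JSON mode the model returns monetary amounts as strings such as `"$1,100.00"`, and the key names are chosen by the model, so they may vary between runs. The following is a minimal sketch, under those assumptions, of parsing the response text and converting dollar strings to numbers. `parse_currency` is a hypothetical helper, and `sample_content` is an abbreviated stand-in for `response.choices[0].message.content`.

```python
import json


def parse_currency(value):
    """Convert a string such as "$1,100.00" to a float (hypothetical helper)."""
    return float(value.replace("$", "").replace(",", ""))


# Abbreviated stand-in for response.choices[0].message.content
sample_content = (
    '{"providers": [{"provider": "JOSHUA A HALL", '
    '"provider_billed": "$135.00", "member_discount": "$100.48", '
    '"net_charged": "$34.52"}]}'
)

parsed = json.loads(sample_content)
for entry in parsed["providers"]:
    for key, value in entry.items():
        # Only convert values that look like dollar amounts.
        if isinstance(value, str) and value.startswith("$"):
            entry[key] = parse_currency(value)

print(parsed["providers"][0]["provider_billed"])  # prints 135.0
```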

### Set up a function call
To further control the JSON output, use function calling. See the [API reference](https://platform.openai.com/docs/api-reference/chat/create#chat-create-tools) for more on function calling, and [JSON schema reference](https://json-schema.org/understanding-json-schema/reference) for info on how to format the function call schema.

Below, the function call schema is broken out into a variable to make the API call easier to read.

In [18]:
function_call = [
    {
        "type": "function",
        "function": {
            "name": "itemize_eob",
            "description": "Itemize an EOB from an image",
            "parameters": {
                "type": "object",
                "properties": {
                    "patient": {
                        "type": "string",
                        "description": "Name of patient",
                    },
                    "provider": {
                        "type": "string",
                        "description": "Name of provider",
                    },
                    "DocumentID": {
                        "type": "string",
                        "description": "DocumentID",
                    },
                    "servicedate": {
                        "type": "string",
                        "format": "date",
                        "description": "Service Date",
                    },
                    "claimdate": {
                        "type": "string",
                        "format": "date",
                        "description": "Claim Date",
                    },
                    "items": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "name": {
                                    "type": "string",
                                    "description": "Service Name",
                                },
                                "billed": {
                                    "type": "number",
                                    "description": "Provider billed",
                                },
                                "discount": {
                                    "type": "number",
                                    "description": "Member discount",
                                },
                                "charged": {
                                    "type": "number",
                                    "description": "Net charged",
                                },
                                "copay": {
                                    "type": "number",
                                    "description": "Copay",
                                },
                                "total": {
                                    "type": "number",
                                    "description": "Total",
                                },
                                "category": {
                                    "type": "string",
                                    "description": "Category of item",
                                    "enum": ["OfficeVisit", "Diagnostic", "Telehealth", "other"],
                                },
                            },
                        },
                        "description": "List of services in eob",
                    },
                },
                "required": ["patient", "claimdate", "provider", "servicedate", "DocumentID", "items"],
            },
        }
    }
]
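
The arguments the model returns may not always conform to the schema, so it can be worth checking them before use. The sketch below is a deliberately small check, not a full JSON Schema validator: it only looks at the top-level `required` list and the `enum` on each item's `category`. `find_schema_problems` is a hypothetical helper, and `parameters` is a trimmed copy of the `parameters` object from the schema above.

```python
def find_schema_problems(arguments, parameters):
    """Return a list of problems found in arguments (hypothetical helper)."""
    problems = []
    # Check that every required top-level key is present.
    for key in parameters.get("required", []):
        if key not in arguments:
            problems.append(f"missing required field: {key}")
    # Check each item's category against the allowed enum values.
    item_props = parameters["properties"]["items"]["items"]["properties"]
    allowed = item_props["category"].get("enum", [])
    for index, item in enumerate(arguments.get("items", [])):
        category = item.get("category")
        if category is not None and category not in allowed:
            problems.append(f"items[{index}]: unexpected category {category!r}")
    return problems


# Trimmed copy of the "parameters" object from the schema above
parameters = {
    "type": "object",
    "properties": {
        "items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "category": {
                        "type": "string",
                        "enum": ["OfficeVisit", "Diagnostic", "Telehealth", "other"],
                    },
                },
            },
        },
    },
    "required": ["patient", "claimdate", "provider", "servicedate", "DocumentID", "items"],
}

# Illustrative arguments with missing fields and an out-of-enum category
example = {"patient": "SUNIL K PATI", "items": [{"category": "Lab"}]}
print(find_schema_problems(example, parameters))
```

In the notebook, the full object is available as `function_call[0]["function"]["parameters"]`.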

### Multimodal prompting with function calling
Combine the multimodal image prompt with a function call to capture relevant data from an explanation of benefits (EOB).

Note: The system message is set up to catch any image that is not an EOB and return a regular completion instead of the function call.

In [19]:
# Use IPython.display.JSON for easier-to-read JSON output.
from IPython.display import JSON

response = client.chat.completions.create(
  model="gpt-4o",
  messages=[
    {"role": "system", "content": "If the image is an explanation of benefits (EOB), process the data. If it is not an EOB, ask for an EOB."},
    {"role": "user", "content": [
      {"type": "image_url", "image_url": {
        "url": f"data:image/png;base64,{base64_image}"}
      }
    ]},
  ],
  tools=function_call, # <-- Add the function_call schema from above
  tool_choice="auto",
  temperature=0.0,
)

print(response)
# Parse the arguments of the first tool call only.
# The response may contain several tool calls, one per provider.
receipt_data = json.loads(response.choices[0].message.tool_calls[0].function.arguments)

# Display the JSON data
JSON(receipt_data, expanded=True)



ChatCompletion(id='chatcmpl-9gn4LZG8fAtWSwEONAOOQgf6Tqvpp', choices=[Choice(finish_reason='tool_calls', index=0, logprobs=None, message=ChatCompletionMessage(content=None, role='assistant', function_call=None, tool_calls=[ChatCompletionMessageToolCall(id='call_59p016z2o8Cu3eTMCevJtCMI', function=Function(arguments='{"patient": "SUNIL K PATI", "provider": "JOSHUA A HALL", "DocumentID": "WM2401261128042605", "servicedate": "2024-01-16", "claimdate": "2024-01-17", "items": [{"name": "OFFICE/OTHER OUTPATIENT VISITS", "billed": 135, "discount": 100.48, "charged": 34.52, "copay": 34.52, "total": 34.52, "category": "OfficeVisit"}]}', name='itemize_eob'), type='function'), ChatCompletionMessageToolCall(id='call_oK2BruIs8plv12RdMCDa1oY0', function=Function(arguments='{"patient": "SUNIL K PATI", "provider": "SNAP DIAGNOSTICS LLC", "DocumentID": "WM2401261128042605", "servicedate": "2023-12-16", "claimdate": "2024-01-09", "items": [{"name": "DIAGNOSTIC TESTING", "billed": 1100, "discount": 1100, 

<IPython.core.display.JSON object>
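
The response above contains one tool call per provider, while the cell reads only `tool_calls[0]`. When the image is not an EOB, the model is expected to reply with plain text, in which case `tool_calls` is `None`. The sketch below handles both cases. `collect_tool_arguments` and `fake_call` are hypothetical helpers, and the `SimpleNamespace` objects merely mimic the shape of `response.choices[0].message` so the example runs without an API call.

```python
import json
from types import SimpleNamespace


def collect_tool_arguments(message, function_name="itemize_eob"):
    """Return parsed arguments for every matching tool call (hypothetical helper)."""
    results = []
    # tool_calls is None when the model answers with plain text instead.
    for tool_call in message.tool_calls or []:
        if tool_call.function.name == function_name:
            results.append(json.loads(tool_call.function.arguments))
    return results


def fake_call(arguments):
    """Build a stand-in for one tool call entry (illustration only)."""
    function = SimpleNamespace(name="itemize_eob", arguments=json.dumps(arguments))
    return SimpleNamespace(type="function", function=function)


# Stand-in for response.choices[0].message with two tool calls
message = SimpleNamespace(
    content=None,
    tool_calls=[
        fake_call({"provider": "JOSHUA A HALL", "items": []}),
        fake_call({"provider": "SNAP DIAGNOSTICS LLC", "items": []}),
    ],
)

all_eobs = collect_tool_arguments(message)
print([eob["provider"] for eob in all_eobs])
```

With a real response, `collect_tool_arguments(response.choices[0].message)` would return one dictionary per provider; an empty list signals that `message.content` holds a regular text reply.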

### Create a DataFrame from a CSV file

In [25]:
eob_df = pd.read_csv("eob.csv")
eob_df

Unnamed: 0,Service Date,Patient Name,Provider Name,Claim Date,Document ID,Service Name,Billed,Discount,charged,copay,Total,category


### Add new rows to the DataFrame
Iterate through the items in `receipt_data`, create a new row for each item, and add the rows to the `eob_df` DataFrame.

In [26]:
new_rows = []
for item in receipt_data['items']:

  print(f"Adding item: {item['name']}")
  new_row = {
    "Service Date": receipt_data.get("servicedate", date.today().isoformat()),
    "Patient Name": receipt_data.get("patient", ""),
    "Provider Name": receipt_data.get("provider", ""),
    "Claim Date": receipt_data.get("claimdate", date.today().isoformat()),
    "Document ID": receipt_data.get("DocumentID", ""),
    "Service Name": item.get("name", ""),
    "Billed": item.get("billed", 0),
    "Discount": item.get("discount", 0),
    "charged": item.get("charged", 0),
    "Copay": item.get("copay", 0),
    "Total": item.get("total", 0),
    "category": item.get("category", "Uncategorized"),
  }
  new_rows.append(new_row)

# Convert the list of new rows to a DataFrame
new_rows_df = pd.DataFrame(new_rows)

# Concatenate the new rows DataFrame to the existing EOB DataFrame
if eob_df.empty:
  eob_df = new_rows_df
else:
  eob_df = pd.concat([eob_df, new_rows_df], ignore_index=True)

eob_df

Adding item: OFFICE/OTHER OUTPATIENT VISITS


Unnamed: 0,Service Date,Patient Name,Provider Name,Claim Date,Document ID,Service Name,Billed,Discount,charged,Copay,Total,category
0,2024-01-16,SUNIL K PATI,JOSHUA A HALL,2024-01-17,WM2401261128042605,OFFICE/OTHER OUTPATIENT VISITS,135,100.48,34.52,34.52,34.52,OfficeVisit


### Write new rows to CSV
Save the new data in the existing CSV by overwriting it with the `eob_df` data.

In [10]:
eob_df.to_csv('eob.csv', index=False)