# Multi-Modal Structured Output: Credit Card Statement Analysis

This notebook demonstrates how to combine multi-modal input (images) with structured output to extract transaction data from credit card statements.

In [1]:
import os
from typing import List

from pydantic import BaseModel, Field
from agent_framework.azure import AzureOpenAIResponsesClient
from agent_framework import Message, Content
from azure.identity import AzureCliCredential

## Define Structured Output Models

We'll create Pydantic models to represent the transaction data structure we want to extract from the credit card statement.

In [2]:
class Transaction(BaseModel):
    """A single transaction from a credit card statement."""
    
    post_date: str = Field(description="The date when the transaction was posted to the account")
    transaction_date: str = Field(description="The date when the transaction occurred")
    description: str = Field(description="Description of the transaction or merchant name")
    amount: float = Field(description="Transaction amount (positive for charges, negative for credits)")


class CreditCardStatement(BaseModel):
    """Structured output for credit card statement analysis."""
    
    transactions: List[Transaction] = Field(description="List of all transactions from the statement")
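
To see what these models accept and reject, the sketch below validates a hand-written JSON payload against them. It assumes Pydantic v2 (which provides `model_validate_json`), repeats the models so the snippet runs on its own, and uses made-up values that are not taken from the statement.

```python
from typing import List

from pydantic import BaseModel, Field, ValidationError


class Transaction(BaseModel):
    post_date: str = Field(description="Date the transaction was posted")
    transaction_date: str = Field(description="Date the transaction occurred")
    description: str = Field(description="Merchant name or description")
    amount: float = Field(description="Positive for charges, negative for credits")


class CreditCardStatement(BaseModel):
    transactions: List[Transaction]


# Hypothetical payload, shaped like the JSON the model is asked to return.
sample_json = """
{"transactions": [
  {"post_date": "01/02/24", "transaction_date": "01/01/24",
   "description": "EXAMPLE STORE", "amount": 12.5}
]}
"""

statement = CreditCardStatement.model_validate_json(sample_json)
print(statement.transactions[0].description)

# A transaction with missing fields is rejected rather than silently accepted.
try:
    CreditCardStatement.model_validate_json('{"transactions": [{"amount": 1}]}')
    rejected = False
except ValidationError:
    rejected = True
print(rejected)
```

Because every field is required, an incomplete extraction shows up as a validation error instead of as a partially filled record.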

## Initialize the Azure OpenAI Client

We'll use `AzureOpenAIResponsesClient`, which supports structured output through the `response_format` option.

In [3]:
credential = AzureCliCredential()
client = AzureOpenAIResponsesClient(
    project_endpoint=os.environ["AZURE_AI_PROJECT_ENDPOINT"],
    deployment_name=os.environ["AZURE_OPENAI_RESPONSES_DEPLOYMENT_NAME"],
    credential=credential,
)

agent = client.as_agent(
    name="StatementAnalyzer",
    instructions="""You are a financial document analysis assistant specialized in extracting transaction data from credit card statements.
    Analyze the provided credit card statement image and extract all transactions with their post date, transaction date, description, and amount.
    Be precise with dates and amounts. Format dates as they appear in the statement.""",
)

## Load Credit Card Statement Image

Load the credit card statement image from a local file. Place your image in the `../data/` directory and update `image_path` if its file name differs from the sample.

In [4]:
# Update this path to point to your credit card statement image
image_path = "../data/sample-exercise-statement.png"

# Load image from local file
with open(image_path, "rb") as image_file:
    image_bytes = image_file.read()

print(f"Loaded credit card statement image from: {image_path}")

Loaded credit card statement image from: ../data/sample-exercise-statement.png


## Create Multi-Modal Message

Combine text instructions with the image data to create a multi-modal message.

In [5]:
message = Message(
    role="user",
    contents=[
        Content.from_text(
            text="""Please analyze this credit card statement image and extract all transactions.
            For each transaction, provide the post date, transaction date, description, and amount."""
        ),
        Content.from_data(
            data=image_bytes,
            media_type="image/png"  # Adjust if using jpg: "image/jpeg"
        )
    ]
)
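
The `media_type` above is hardcoded and has to be changed by hand for JPEG files. As one possible alternative, the sketch below derives it from the file extension with the standard library's `mimetypes` module. The helper name `guess_image_media_type` is hypothetical and not part of the agent framework.

```python
import mimetypes


def guess_image_media_type(path: str) -> str:
    """Return an image media type for `path`, based on its file extension."""
    media_type, _ = mimetypes.guess_type(path)
    if media_type is None or not media_type.startswith("image/"):
        raise ValueError(f"Not a recognized image file: {path}")
    return media_type


print(guess_image_media_type("../data/sample-exercise-statement.png"))
print(guess_image_media_type("statement.jpg"))
```

The result could then be passed as `media_type=guess_image_media_type(image_path)`. Note that the guess relies only on the extension, so a misnamed file would still be mislabeled.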

## Extract Structured Transaction Data

Run the agent with structured output to extract transaction data from the credit card statement.

In [6]:
print("Analyzing credit card statement...\n")

result = await agent.run(message, options={"response_format": CreditCardStatement})

if structured_data := result.value:
    print(f"✓ Successfully extracted {len(structured_data.transactions)} transactions\n")
    print("="*80)
    print("EXTRACTED TRANSACTIONS")
    print("="*80)
    
    for idx, transaction in enumerate(structured_data.transactions, 1):
        print(f"\nTransaction #{idx}:")
        print(f"  Post Date:        {transaction.post_date}")
        print(f"  Transaction Date: {transaction.transaction_date}")
        print(f"  Description:      {transaction.description}")
        print(f"  Amount:           ${transaction.amount:.2f}")
    
    print("\n" + "="*80)
else:
    print(f"Failed to parse response: {result.text}")

Analyzing credit card statement...

✓ Successfully extracted 11 transactions

EXTRACTED TRANSACTIONS

Transaction #1:
  Post Date:        01/23/18
  Transaction Date: 01/23/18
  Description:      FINANCE CHARGES
  Amount:           $2500.00

Transaction #2:
  Post Date:        01/22/18
  Transaction Date: 12/21/17
  Description:      INSTL 1/12 iPHONE X POWER MAC CENTER SM NORTH QUEZON CITY
  Amount:           $5800.00

Transaction #3:
  Post Date:        01/22/18
  Transaction Date: 12/21/17
  Description:      SM NORTH DEPT STORE QUEZON CITY
  Amount:           $12000.00

Transaction #4:
  Post Date:        01/22/18
  Transaction Date: 12/21/17
  Description:      INSTL 1/3 SM NORTH AUTOMATIC CENTER QUEZON CITY
  Amount:           $3200.00

Transaction #5:
  Post Date:        01/22/18
  Transaction Date: 12/21/17
  Description:      SM NORTH TRAVEL CLUB QUEZON CITY
  Amount:           $6500.00

Transaction #6:
  Post Date:        01/22/18
  Transaction Date: 12/21/17
  Description:  

## Display Summary Statistics (Optional)

Calculate and display summary statistics from the extracted transactions.

In [7]:
if structured_data := result.value:
    total_charges = sum(t.amount for t in structured_data.transactions if t.amount > 0)
    total_credits = sum(abs(t.amount) for t in structured_data.transactions if t.amount < 0)
    net_amount = total_charges - total_credits
    
    print("\nSUMMARY STATISTICS")
    print("="*50)
    print(f"Total Transactions:  {len(structured_data.transactions)}")
    print(f"Total Charges:       ${total_charges:.2f}")
    print(f"Total Credits:       ${total_credits:.2f}")
    print(f"Net Amount:          ${net_amount:.2f}")
    print("="*50)


SUMMARY STATISTICS
Total Transactions:  11
Total Charges:       $45500.00
Total Credits:       $20000.00
Net Amount:          $25500.00
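
The totals above are computed with `float`, which can pick up small binary rounding errors when many amounts are added. A minimal sketch of the same calculation with `decimal.Decimal` follows. The `summarize` helper and the sample amounts are hypothetical and not taken from the statement.

```python
from decimal import Decimal


def summarize(amounts):
    """Return (total_charges, total_credits, net_amount) as Decimal values."""
    # Converting through str() keeps the short decimal form (e.g. 0.1)
    # instead of the float's full binary expansion.
    values = [Decimal(str(a)) for a in amounts]
    total_charges = sum((v for v in values if v > 0), Decimal("0"))
    total_credits = sum((-v for v in values if v < 0), Decimal("0"))
    return total_charges, total_credits, total_charges - total_credits


# Hypothetical amounts: two charges and one credit.
total_charges, total_credits, net_amount = summarize([0.1, 0.2, -0.05])
print(total_charges, total_credits, net_amount)
```

With the notebook's data this would be called as `summarize(t.amount for t in structured_data.transactions)`.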


## Export to DataFrame (Optional)

Convert the structured data to a pandas DataFrame for further analysis.

In [8]:
import pandas as pd

if structured_data := result.value:
    # Convert transactions to dictionaries
    transactions_dict = [t.model_dump() for t in structured_data.transactions]
    
    # Create DataFrame
    df = pd.DataFrame(transactions_dict)
    
    print("\nTRANSACTIONS DATAFRAME")
    print("="*80)
    display(df)
    
    # Optionally save to CSV
    # df.to_csv("../data/extracted_transactions.csv", index=False)
    # print("\n✓ Transactions saved to extracted_transactions.csv")


TRANSACTIONS DATAFRAME


| | post_date | transaction_date | description | amount |
|---|---|---|---|---|
| 0 | 01/23/18 | 01/23/18 | FINANCE CHARGES | 2500.0 |
| 1 | 01/22/18 | 12/21/17 | INSTL 1/12 iPHONE X POWER MAC CENTER SM NORTH ... | 5800.0 |
| 2 | 01/22/18 | 12/21/17 | SM NORTH DEPT STORE QUEZON CITY | 12000.0 |
| 3 | 01/22/18 | 12/21/17 | INSTL 1/3 SM NORTH AUTOMATIC CENTER QUEZON CITY | 3200.0 |
| 4 | 01/22/18 | 12/21/17 | SM NORTH TRAVEL CLUB QUEZON CITY | 6500.0 |
| 5 | 01/22/18 | 12/21/17 | SM NORTH HYPERMARKET QUEZON CITY | 5000.0 |
| 6 | 01/14/18 | 01/18/18 | NIHONBASHITEI MAKATI CITY | 3000.0 |
| 7 | 01/10/18 | 11/11/17 | SM MARIKINA HYPERMARKET MARIKINA CITY | 2500.0 |
| 8 | 01/10/18 | 11/11/17 | INSTL 2/6 ELECTROWORLD TRINOMA QUEZON CITY | 2500.0 |
| 9 | 06/08/17 | 06/08/17 | INSTL 7/12 ELECTROWORLD TRINOMA QUEZON CITY | 2500.0 |
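
The date columns are extracted as strings, so they cannot be sorted or compared as dates until they are parsed. The sketch below assumes the `MM/DD/YY` layout seen in the output above and flags rows whose post date is earlier than the transaction date, which may point to an extraction error worth checking against the image. Both helper names are hypothetical.

```python
from datetime import date, datetime


def parse_statement_date(text: str) -> date:
    """Parse an MM/DD/YY string such as '01/23/18' into a date."""
    return datetime.strptime(text, "%m/%d/%y").date()


def posted_before_transaction(post_date: str, transaction_date: str) -> bool:
    """Return True when the post date is earlier than the transaction date."""
    return parse_statement_date(post_date) < parse_statement_date(transaction_date)


# Date pairs copied from rows 0 and 6 of the table above.
print(posted_before_transaction("01/23/18", "01/23/18"))
print(posted_before_transaction("01/14/18", "01/18/18"))
```

On the DataFrame, a column could be converted with `df["post_date"].map(parse_statement_date)`.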
