Experimenting with multimodal LLMs to interpret graphs

In [9]:
import os
import sys
from dotenv import load_dotenv

# Add the parent directory to the import path so local modules can be imported
sys.path.append(os.path.abspath(os.path.join(os.getcwd(), "../")))

load_dotenv()
# Use .get() so a missing variable triggers the assertion message instead of a KeyError
assert os.environ.get("LANGCHAIN_API_KEY"), "Please set the LANGCHAIN_API_KEY environment variable"
assert os.environ.get("OPENAI_API_KEY"), "Please set the OPENAI_API_KEY environment variable"

In [10]:
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_openai import ChatOpenAI

openai_llm = ChatOpenAI(model="gpt-4o-mini", api_key=os.environ["OPENAI_API_KEY"])
DATA_DIR = "../../../data"
image_path = DATA_DIR + "/processed/anomaly_detection_sample.png"

In [11]:
import base64

with open(image_path, "rb") as image_file:
    image_data = base64.b64encode(image_file.read()).decode("utf-8")
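
The data URL sent to the model has to declare a MIME type that matches the file, which here is a PNG. A minimal sketch of one way to keep the two in sync, assuming the type can be guessed from the file extension; `image_to_data_url` is a hypothetical helper for illustration, not part of this notebook or of LangChain:

```python
import base64
import mimetypes
from pathlib import Path


def image_to_data_url(path: str) -> str:
    """Encode an image file as a base64 data URL (hypothetical helper).

    The MIME type is guessed from the file extension, so a .png file
    yields a "data:image/png;base64,..." URL.
    """
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Cannot determine an image MIME type for {path!r}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"
```

The returned string could then be passed as the `url` value of an `image_url` content block in place of a hand-built f-string.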

In [12]:
message = HumanMessage(
    content=[
        {"type": "text", "text": "Describe the charts in this dashboard."},
        {
            "type": "image_url",
            # The source file is a PNG, so declare the matching MIME type
            "image_url": {"url": f"data:image/png;base64,{image_data}"},
        },
    ],
)
response = openai_llm.invoke([message])
print(response.content)

The chart displays a scatter plot titled "Anomaly Detection in Financial Data Across Securities." Here's a breakdown of its components:

- **Axes**: 
  - The x-axis represents the "Price" of the securities, ranging from 0 to 500.
  - The y-axis represents the "Quantity," ranging from 0 to 20,000.

- **Data Points**: 
  - The plot features numerous colored dots, which indicate the quantities of different securities at various price levels. The colors suggest that there are multiple categories or types of securities represented.

- **Anomalies**: 
  - Red 'X' markers highlight the anomalies in the data. These points are likely outliers that deviate significantly from the expected pattern of quantity based on price.

- **Legend**: 
  - There is a legend indicating that the red markers represent "Anomalies."

Overall, the chart visually emphasizes the distribution of financial data across different securities while specifically identifying anomalous data points that may warrant further inv

Add more detail to the prompt

In [14]:
import prompts.dashboard_prompts as prompts

messages = [
    SystemMessage(content=prompts.system_prompt)
]

# TODO: Integrate anomalies and reasons
anomalies_and_reasons = {}
anomaly_prompt = prompts.anomaly_dash_prompt.format(anomalies_and_reasons=anomalies_and_reasons)

messages.append(
    HumanMessage(
        content=[
            {"type": "text", "text": anomaly_prompt},
            {
                "type": "image_url",
                # The source file is a PNG, so declare the matching MIME type
                "image_url": {"url": f"data:image/png;base64,{image_data}"},
            },
        ],
    )
)

response = openai_llm.invoke(messages)
print(response.content)

### Chart Description: Anomaly Detection in Financial Data Across Securities

This scatter plot visualizes the results of an anomaly detection model applied to the institution's trading order data. Each point represents a unique trading order, plotted by its price on the x-axis and quantity on the y-axis. The red crosses highlight the flagged anomalies, indicating trades that deviate significantly from typical trading patterns.

Compliance officers can utilize this chart to quickly identify suspicious activities that may warrant further investigation. The distribution of orders across various price ranges, with clusters of flagged anomalies, provides insight into potentially irregular trading behavior. By focusing on these highlighted points, compliance teams can prioritize their review of specific trades that may pose risks to market integrity or compliance with regulatory standards.
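
The cell above passes an empty `anomalies_and_reasons` dict into the prompt, with a TODO to integrate real values. A minimal sketch of how that integration might look, assuming the dict maps an order identifier to a plain-text reason. The template `ANOMALY_DASH_PROMPT`, the helper `format_anomalies`, and the sample entry are all hypothetical stand-ins; the actual contents of `prompts.dashboard_prompts` are not shown in this notebook:

```python
def format_anomalies(anomalies_and_reasons: dict[str, str]) -> str:
    """Render an {order_id: reason} mapping as a bullet list for the prompt."""
    if not anomalies_and_reasons:
        return "No anomalies were provided."
    return "\n".join(
        f"- {order_id}: {reason}"
        for order_id, reason in anomalies_and_reasons.items()
    )


# Hypothetical template; the real anomaly_dash_prompt may be worded differently.
ANOMALY_DASH_PROMPT = (
    "Describe the anomaly detection chart for a compliance officer.\n"
    "Flagged anomalies and reasons:\n{anomalies_and_reasons}"
)

# Illustrative entry only, not taken from the dataset.
example_anomalies = {"ORD-001": "quantity far above typical for this price range"}

anomaly_prompt = ANOMALY_DASH_PROMPT.format(
    anomalies_and_reasons=format_anomalies(example_anomalies)
)
print(anomaly_prompt)
```

Rendering the dict as a bullet list rather than relying on its default string representation keeps the prompt readable and makes the empty case explicit.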
