**Notebook for experimenting with prompts**

In [15]:
## Imports
## System imports
import os
import sys

ROOT_DIR = os.path.join(os.path.dirname(os.getcwd()), os.pardir, os.pardir)
sys.path.append(ROOT_DIR)
sys.path.append(os.path.join(ROOT_DIR, "src"))
sys.path.append(os.path.join(os.path.dirname(os.getcwd()), os.pardir)) ## LLM directory

## Data manipulation
import pandas as pd

## LangChain
from langchain_groq import ChatGroq
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_core.messages import HumanMessage, SystemMessage

## Misc
from dotenv import load_dotenv
from datetime import datetime
from pprint import pprint

## Self-defined modules
from src.llm.utils import calculations
from src.llm.prompts import dashboard_prompts, variable_descriptions
import anomaly

API keys (retrieved from .env file)

In [2]:
## Load environment variables
load_dotenv()
## Use .get() so a missing key triggers the assertion message instead of a bare KeyError
assert os.environ.get('LANGCHAIN_API_KEY'), "Please set the LANGCHAIN_API_KEY environment variable"
assert os.environ.get('OPENAI_API_KEY'), "Please set the OPENAI_API_KEY environment variable"
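
The two assertions above can also be folded into a single check that reports every missing key at once. This is a minimal sketch, not part of the notebook's modules: the helper name `missing_env_vars` is hypothetical, and it takes the environment mapping as a parameter so it can be tried without touching real credentials.

```python
import os

def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]

# Hypothetical usage with the two keys this notebook expects.
required = ["LANGCHAIN_API_KEY", "OPENAI_API_KEY"]
example_env = {"OPENAI_API_KEY": "sk-placeholder"}  # stand-in mapping, not real credentials
print(missing_env_vars(required, example_env))  # prints ['LANGCHAIN_API_KEY']
```

Calling `missing_env_vars(required)` with no second argument checks the real process environment after `load_dotenv()` has run.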

In [3]:
## Configuring LLM
openai_llm = ChatOpenAI(model="gpt-4o-mini", api_key=os.environ['OPENAI_API_KEY'])

In [None]:
## Run anomaly detection model
continuous_outlier_df, discrete_outlier_df, overall_anomaly_results = anomaly.main()


In [7]:
print(overall_anomaly_results)

      Unnamed: 0    Instance    OrderNo  ParentOrderNo  RootParentOrderNo  \
485          485  BURGERKING  172374515       25869076          226102108   
708          708  BURGERKING   72375304       16501749          570372666   
719          719  BURGERKING   72374535       60528598          585916882   
2053        2053  BURGERKING  172370932       53710462          562953313   
2372        2372  BURGERKING  172374752       14801087          572610220   
2860        2860  BURGERKING  172373762       26663556          659348535   
4671        4671  BURGERKING   72372809       11804334           76882466   

              CreateDate           DeleteDate  AccID  AccCode BuySell  ...  \
485  2024-10-09 10:27:10  2024-02-01 15:03:52    173     NOEL       S  ...   
708  2024-10-09 10:39:03  2024-02-01 17:41:14    172       ZM       B  ...   
719  2024-10-09 10:39:57  2024-02-01 17:52:28      5     NOEL       B  ...   
2053 2024-10-09 11:55:07  2024-02-01 11:37:44     80  UNKNOWN       S  

**Prompts**

Use PromptTemplate to create a prompt from a template string, specifying the input variables (placeholders) that are replaced with data when the prompt is formatted
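
To make the substitution step concrete, here is a minimal sketch that uses plain Python string formatting as a stand-in for PromptTemplate. The template text and the helpers `input_variables` and `fill_template` are hypothetical (the notebook's real template lives in `dashboard_prompts.anomaly_dash_prompt`); only the placeholder name `anomalies_and_reasons` and the sample values come from the notebook.

```python
from string import Formatter

# Hypothetical template; the notebook's real one is dashboard_prompts.anomaly_dash_prompt.
template = "Describe this chart with reference to these flagged orders:\n{anomalies_and_reasons}"

def input_variables(template_text):
    """List the placeholder names that appear in a format-style template."""
    return sorted({field for _, field, _, _ in Formatter().parse(template_text) if field})

def fill_template(template_text, **values):
    """Substitute values into the template, failing loudly on a missing variable."""
    missing = [name for name in input_variables(template_text) if name not in values]
    if missing:
        raise KeyError(f"missing input variables: {missing}")
    return template_text.format(**values)

records = [{"OrderNo": 172374515, "top_feature_1": "AccID_173"}]
print(input_variables(template))
print(fill_template(template, anomalies_and_reasons=records))
```

A list of dicts is interpolated through its `str()` form, which is why the formatted prompt carries Python-style quoting rather than JSON.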

In [17]:
filtered_anomaly_results = overall_anomaly_results[['OrderNo', 'top_feature_1', 'top_feature_2', 'top_feature_3']]
anomalies_and_reasons = filtered_anomaly_results.to_dict(orient='records')
pprint(anomalies_and_reasons)

full_anomalies_and_reasons = overall_anomaly_results.to_dict(orient='records')
pprint(full_anomalies_and_reasons)

[{'OrderNo': 172374515,
  'top_feature_1': 'AccID_173',
  'top_feature_2': 'AccCode_NOEL',
  'top_feature_3': 'Value'},
 {'OrderNo': 72375304,
  'top_feature_1': 'AccID_172',
  'top_feature_2': 'SecCode_SEK',
  'top_feature_3': 'OriginOfOrder_NOEL'},
 {'OrderNo': 72374535,
  'top_feature_1': 'AccID_5',
  'top_feature_2': 'AccCode_NOEL',
  'top_feature_3': 'Value'},
 {'OrderNo': 172370932,
  'top_feature_1': 'SecCode_S32',
  'top_feature_2': 'DoneValue',
  'top_feature_3': 'DoneVolume'},
 {'OrderNo': 172374752,
  'top_feature_1': 'AccID_156',
  'top_feature_2': 'DoneValue',
  'top_feature_3': 'Value'},
 {'OrderNo': 172373762,
  'top_feature_1': 'AccID_13',
  'top_feature_2': 'OriginOfOrder_NOEL',
  'top_feature_3': 'SecCode_BSL'},
 {'OrderNo': 72372809,
  'top_feature_1': 'DoneValue',
  'top_feature_2': 'AccCode_NOEL',
  'top_feature_3': 'Value'}]
[{'AccCode': 'NOEL',
  'AccID': 173,
  'BuySell': 'S',
  'ClientOrderID': 0.395855319,
  'CreateDate': Timestamp('2024-10-09 10:27:10'),
  'C
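
The full records contain pandas Timestamp values, which show up as `Timestamp('...')` when the list is interpolated into a prompt. One option, sketched here as a suggestion and not something the notebook does, is to serialize the records to JSON first. The `records_to_json` helper is hypothetical, and a standard-library `datetime` stands in for the pandas Timestamp (which subclasses it) so the example runs without pandas.

```python
import json
from datetime import datetime

def records_to_json(records):
    """Serialize a list of record dicts to JSON, converting non-JSON types via str()."""
    return json.dumps(records, default=str)

# One record using values shown in the output above; datetime stands in for pandas Timestamp.
records = [{"OrderNo": 172374515, "AccID": 173, "CreateDate": datetime(2024, 10, 9, 10, 27, 10)}]
print(records_to_json(records))
# prints [{"OrderNo": 172374515, "AccID": 173, "CreateDate": "2024-10-09 10:27:10"}]
```

The resulting string could be passed as the `anomalies_and_reasons` value in place of the raw list; whether that improves the model's descriptions would need to be checked by experiment.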

In [14]:
anomaly_prompt = PromptTemplate.from_template(dashboard_prompts.anomaly_dash_prompt).format(
    anomalies_and_reasons=anomalies_and_reasons
)
pprint(anomaly_prompt)

('\n'
 'This particular chart is part of an Anomaly Detection dashboard.\n'
 'It is used by compliance officers to identify suspicious trading '
 'activities.\n'
 'The chart is generated after running the order data through an anomaly '
 'detection model.\n'
 'Write a description for this chart, with reference to the following list of '
 'flagged orders and the respective reasons they were flagged:\n'
 "[{'OrderNo': 172374515, 'top_feature_1': 'AccID_173', 'top_feature_2': "
 "'AccCode_NOEL', 'top_feature_3': 'Value'}, {'OrderNo': 72375304, "
 "'top_feature_1': 'AccID_172', 'top_feature_2': 'SecCode_SEK', "
 "'top_feature_3': 'OriginOfOrder_NOEL'}, {'OrderNo': 72374535, "
 "'top_feature_1': 'AccID_5', 'top_feature_2': 'AccCode_NOEL', "
 "'top_feature_3': 'Value'}, {'OrderNo': 172370932, 'top_feature_1': "
 "'SecCode_S32', 'top_feature_2': 'DoneValue', 'top_feature_3': 'DoneVolume'}, "
 "{'OrderNo': 172374752, 'top_feature_1': 'AccID_156', 'top_feature_2': "
 "'DoneValue', 'top_feature_

In [16]:
messages = [
    SystemMessage(content=dashboard_prompts.system_prompt),
    HumanMessage(content=anomaly_prompt)
]
response = openai_llm.invoke(messages)
print(response.content)

**Anomaly Detection Chart Description**

This chart visualizes flagged trading orders identified by our anomaly detection model, facilitating compliance officers in monitoring and investigating suspicious activities. Each entry on the chart represents an order that has been flagged based on specific features that deviate from expected patterns.

- **Order Number 172374515**: Flagged due to unusual activity associated with **Account ID 173** and **Account Code NOEL**, indicating potential discrepancies in the order's value.
  
- **Order Number 72375304**: This order was highlighted for its connection to **Account ID 172** and the security code **SEK**, with the origin of the order marked as suspicious (**OriginOfOrder_NOEL**).

- **Order Number 72374535**: Similar to the previous orders, this entry is linked to **Account ID 5** and **Account Code NOEL**, raising concerns regarding its reported value.

- **Order Number 172370932**: This transaction has been flagged for its association wi

In [18]:
## Try again with all columns
full_anomaly_prompt = PromptTemplate.from_template(dashboard_prompts.anomaly_dash_prompt).format(
    anomalies_and_reasons=full_anomalies_and_reasons
)

messages = [
    SystemMessage(content=dashboard_prompts.system_prompt),
    HumanMessage(content=full_anomaly_prompt)
]
response = openai_llm.invoke(messages)
print(response.content)

### Anomaly Detection Chart Overview

The Anomaly Detection Chart provides compliance officers with a critical tool for identifying suspicious trading activities within our institution. This visualization utilizes advanced anomaly detection algorithms applied to the institution's order data, highlighting transactions that deviate significantly from established patterns. 

Each flagged order is represented in the chart, accompanied by key attributes that shed light on the nature of the anomaly. For instance, orders may be flagged based on unusual transaction values, atypical account codes, or striking buy/sell patterns. In the data set, we can see several flagged orders with specific reasons for their identification:

1. **Order No. 172374515**: Flagged due to an unusually high transaction value of approximately AUD 7.8 million, associated with Account ID 173 and a sell action for security CBA.

2. **Order No. 72375304**: This order was highlighted for its significant volume and value, 