# Credit Card Recommender

This notebook demonstrates a credit card recommendation workflow built on GPT models.

## Preprocessing
1. Read in the transaction history (CSV) and convert it to a list of JSON-style records
2. Read in the credit card data (JSON)
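
If an actual JSON string is needed (for example, to embed in a prompt), `DataFrame.to_json` is one option. The following is a minimal sketch, not the notebook's own code: `SAMPLE_CSV` and `transactions_to_json` are hypothetical names, and the inline CSV is a trimmed stand-in for `tx-data.csv` with only a few of its columns.

```python
import io
import json

import pandas as pd

# Hypothetical, trimmed stand-in for tx-data.csv
SAMPLE_CSV = """User,Card,Year,Month,Day,Time,Amount,MCC,Errors?
0,0,2002,9,1,06:21,$134.09,5300,
0,0,2002,9,1,06:42,$38.48,5411,
"""


def transactions_to_json(csv_source) -> str:
    """Read transactions from a CSV source and return a JSON string of records."""
    df = pd.read_csv(csv_source)
    # Turn "$134.09" into the float 134.09 so amounts are numeric
    df["Amount"] = df["Amount"].str.replace("$", "", regex=False).astype(float)
    # orient="records" gives one JSON object per row; NaN is written as null
    return df.to_json(orient="records")


tx_json_str = transactions_to_json(io.StringIO(SAMPLE_CSV))
records = json.loads(tx_json_str)
print(records[0])
```

Stripping the dollar sign up front is a design choice for this sketch; it keeps the amounts usable for arithmetic later on.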

In [32]:
import numpy as np
import pandas as pd
from pprint import pprint
import json

from openai import OpenAI
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

In [4]:
# Import transaction data
tx_df = pd.read_csv('tx-data.csv')
tx_df.head()

| | User | Card | Year | Month | Day | Time | Amount | Use Chip | Merchant Name | Merchant City | Merchant State | Zip | MCC | Errors? | Is Fraud? |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 2002 | 9 | 1 | 06:21 | $134.09 | Swipe Transaction | 3527213246127876953 | La Verne | CA | 91750.0 | 5300 | NaN | No |
| 1 | 0 | 0 | 2002 | 9 | 1 | 06:42 | $38.48 | Swipe Transaction | -727612092139916043 | Monterey Park | CA | 91754.0 | 5411 | NaN | No |
| 2 | 0 | 0 | 2002 | 9 | 2 | 06:22 | $120.34 | Swipe Transaction | -727612092139916043 | Monterey Park | CA | 91754.0 | 5411 | NaN | No |
| 3 | 0 | 0 | 2002 | 9 | 2 | 17:45 | $128.95 | Swipe Transaction | 3414527459579106770 | Monterey Park | CA | 91754.0 | 5651 | NaN | No |
| 4 | 0 | 0 | 2002 | 9 | 3 | 06:23 | $104.71 | Swipe Transaction | 5817218446178736267 | La Verne | CA | 91750.0 | 5912 | NaN | No |


In [5]:
tx_json = tx_df.to_dict(orient="records")

In [6]:
# Print only the first few records; printing the full list exceeds Jupyter's IOPub data rate limit
pprint(tx_json[:3])

[{'Amount': '$134.09',
  'Card': 0,
  'Day': 1,
  'Errors?': nan,
  'Is Fraud?': 'No',
  'MCC': 5300,
  'Merchant City': 'La Verne',
  'Merchant Name': 3527213246127876953,
  'Merchant State': 'CA',
  'Month': 9,
  'Time': '06:21',
  'Use Chip': 'Swipe Transaction',
  'User': 0,
  'Year': 2002,
  'Zip': 91750.0},
 {'Amount': '$38.48',
  'Card': 0,
  'Day': 1,
  'Errors?': nan,
  'Is Fraud?': 'No',
  'MCC': 5411,
  'Merchant City': 'Monterey Park',
  'Merchant Name': -727612092139916043,
  'Merchant State': 'CA',
  'Month': 9,
  'Time': '06:42',
  'Use Chip': 'Swipe Transaction',
  'User': 0,
  'Year': 2002,
  'Zip': 91754.0},
 {'Amount': '$120.34',
  'Card': 0,
  'Day': 2,
  'Errors?': nan,
  'Is Fraud?': 'No',
  'MCC': 5411,
  'Merchant City': 'Monterey Park',
  'Merchant Name': -727612092139916043,
  'Merchant State': 'CA',
  'Month': 9,
  'Time': '06:22',
  'Use Chip': 'Swipe Transaction',
  'User': 0,
  'Year': 2002,
  'Zip': 91754.0}]
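
With many transactions, the full record list is unwieldy to print and to place in a prompt. One possible way to shrink it is to aggregate spend per merchant category code (MCC) before going further. This is a sketch under assumptions, not a step the notebook performs: `spend_by_mcc` is a hypothetical helper, and the sample records reuse only the `Amount` and `MCC` fields shown above.

```python
from collections import defaultdict


def spend_by_mcc(records):
    """Sum transaction amounts per MCC, largest total first."""
    totals = defaultdict(float)
    for rec in records:
        # Amounts arrive as strings such as "$134.09"
        amount = float(str(rec["Amount"]).replace("$", "").replace(",", ""))
        totals[rec["MCC"]] += amount
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return {mcc: round(total, 2) for mcc, total in ranked}


sample_records = [
    {"Amount": "$134.09", "MCC": 5300},
    {"Amount": "$38.48", "MCC": 5411},
    {"Amount": "$120.34", "MCC": 5411},
]
summary = spend_by_mcc(sample_records)
print(summary)
```

A per-category summary loses merchant and date detail, so whether it is an acceptable substitute for raw transactions depends on what the recommendation needs.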



In [29]:
# Import credit card data
with open("credit-card-data.json", "rb") as file:
    credit_card_json = json.load(file)
pprint(credit_card_json)

{'creditCards': [{'APR': '20.99% - 27.99%',
                  'annualFee': 95,
                  'benefits': ['$300 in travel credits in the first year',
                               'Primary rental car insurance',
                               'No foreign transaction fees',
                               '$150 in additional partnership benefit value',
                               '1:1 point transfer with partners',
                               'Travel and purchase coverage'],
                  'cardName': 'Chase Sapphire Preferred Card',
                  'cardType': 'Credit Card',
                  'countryOfOrigin': 'USA',
                  'creditCardScoreMax': 850,
                  'creditCardScoreMin': 650,
                  'issuer': 'Chase',
                  'linkToApply': 'https://creditcards.chase.com/rewards-credit-cards/sapphire/preferred',
                  'rewards': {'pointsPerDollar': {'dining': 3,
                                                  'onlineGrocer
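
The card records can also be filtered in plain Python before any model call. The sketch below assumes only the fields visible in the output above (`cardName`, `annualFee`, `creditCardScoreMin`, `creditCardScoreMax`). The `eligible_cards` helper, the user's credit score as an input, and the second sample card are all hypothetical and not part of the notebook's data.

```python
def eligible_cards(card_db, credit_score, max_annual_fee=None):
    """Return names of cards whose score range covers credit_score."""
    matches = []
    for card in card_db.get("creditCards", []):
        in_range = card["creditCardScoreMin"] <= credit_score <= card["creditCardScoreMax"]
        if not in_range:
            continue
        # Optional cap on the annual fee
        if max_annual_fee is not None and card["annualFee"] > max_annual_fee:
            continue
        matches.append(card["cardName"])
    return matches


sample_db = {
    "creditCards": [
        {
            "cardName": "Chase Sapphire Preferred Card",
            "annualFee": 95,
            "creditCardScoreMin": 650,
            "creditCardScoreMax": 850,
        },
        {
            # Hypothetical entry, for illustration only
            "cardName": "Hypothetical No-Fee Card",
            "annualFee": 0,
            "creditCardScoreMin": 580,
            "creditCardScoreMax": 850,
        },
    ]
}

print(eligible_cards(sample_db, credit_score=700))
print(eligible_cards(sample_db, credit_score=700, max_annual_fee=0))
```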

## Connect to OpenAI API

In [8]:
client = OpenAI()

pprint(client.models.list().data)

[Model(id='dall-e-2', created=1698798177, object='model', owned_by='system'),
 Model(id='whisper-1', created=1677532384, object='model', owned_by='openai-internal'),
 Model(id='gpt-3.5-turbo-instruct', created=1692901427, object='model', owned_by='system'),
 Model(id='tts-1-hd-1106', created=1699053533, object='model', owned_by='system'),
 Model(id='gpt-3.5-turbo', created=1677610602, object='model', owned_by='openai'),
 Model(id='gpt-3.5-turbo-0125', created=1706048358, object='model', owned_by='system'),
 Model(id='babbage-002', created=1692634615, object='model', owned_by='system'),
 Model(id='davinci-002', created=1692634301, object='model', owned_by='system'),
 Model(id='dall-e-3', created=1698785189, object='model', owned_by='system'),
 Model(id='gpt-4o-mini', created=1721172741, object='model', owned_by='system'),
 Model(id='tts-1', created=1681940951, object='model', owned_by='openai-internal'),
 Model(id='gpt-3.5-turbo-16k', created=1683758102, object='model', owned_by='openai
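
The model list is long, so it can help to narrow it to the IDs of interest. The sketch below filters on the `id` attribute that each entry in the output above exposes. `model_ids_containing` is a hypothetical helper, and the `SimpleNamespace` objects stand in for the entries of `client.models.list().data` so the example runs without an API key.

```python
from types import SimpleNamespace


def model_ids_containing(models, needle="gpt"):
    """Return the sorted IDs of models whose ID contains the given substring."""
    return sorted(m.id for m in models if needle in m.id)


# Stand-ins for the Model objects returned by the API
sample_models = [
    SimpleNamespace(id="dall-e-2"),
    SimpleNamespace(id="gpt-4o-mini"),
    SimpleNamespace(id="gpt-3.5-turbo"),
    SimpleNamespace(id="whisper-1"),
]
gpt_ids = model_ids_containing(sample_models)
print(gpt_ids)
```

In the notebook itself, the call would presumably look like `model_ids_containing(client.models.list().data)`.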

### Let's try feeding both JSONs directly to the model

### Quick demo

In [23]:
llm = ChatOpenAI(model="chatgpt-4o-latest")
output_parser = StrOutputParser()

In [57]:
rec_prompt = ChatPromptTemplate.from_messages([
    (
        "system",
        "You are a world-class expert in credit cards and you specialize in making credit card recommendations."),
    (
        "user", """
        Here is the user's transaction history:
        ```json
        {transactions}
        ```
        
        Here is the user's annual income: ${income}
        
        Here is our database of credit card information:
        ```json
        {credit_cards}
        ```
        
        Please make a recommendation on the credit cards that provide the most value based on the user's transaction history.
        
        Format your answer as follows:
        1. Think through the problem step-by-step. Discuss the factors a customer should consider when choosing a credit card.
        2. Taking all these factors into account, explain your reasoning step-by-step, leading up to your recommendations. Be sure to explain why you ranked some cards higher or lower than others.
        3. Rank your recommendations from most recommended to least recommended. Output them as a list, each with its value proposition, like this:
        
        <answer>
        1. <card name>: <reasoning>
        2. <card name>: <reasoning>
        ...
        </answer>
        """)
])
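
Because the prompt asks for the final ranking inside an `<answer>` block, the reply can be parsed into structured data. The following is a minimal sketch assuming the model follows the requested `N. <card name>: <reasoning>` layout exactly. `parse_ranked_answer` is a hypothetical helper, and `sample_output` is a made-up reply, not real model output.

```python
import re


def parse_ranked_answer(text):
    """Extract (rank, card name, reasoning) tuples from an <answer> block."""
    match = re.search(r"<answer>(.*?)</answer>", text, flags=re.DOTALL)
    if match is None:
        return []
    items = []
    for line in match.group(1).splitlines():
        # Expect lines such as "1. Card A: Strong travel rewards"
        m = re.match(r"\s*(\d+)\.\s*(.+?):\s*(.+)", line)
        if m:
            items.append((int(m.group(1)), m.group(2).strip(), m.group(3).strip()))
    return items


# Made-up reply used only to exercise the parser
sample_output = """Step-by-step reasoning goes here.
<answer>
1. Card A: Strong travel rewards
2. Card B: No annual fee
</answer>"""

for rank, name, reason in parse_ranked_answer(sample_output):
    print(rank, name, reason)
```

Real replies may add markdown emphasis or wrap lines, so the pattern would likely need loosening in practice.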

In [58]:
rec_chain = rec_prompt | llm | output_parser

In [59]:
# Select travel-agency transactions (MCC 4722); .copy() avoids pandas' SettingWithCopyWarning
travel_info = tx_df.query("MCC == 4722").copy()
# Synthetic amounts for the demo, drawn uniformly from [5000, 100000)
travel_info["Transaction Amount"] = np.random.uniform(5000, 100000, travel_info.shape[0])

income = 50000000

output = rec_chain.invoke({
    "transactions": travel_info.to_dict(orient="records"),
    "income": income,
    "credit_cards": credit_card_json,
})
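
The prompt labels both payloads as JSON, but a list of dicts substituted into a string template is rendered with Python's `repr` (single quotes, `nan`), which is not valid JSON. One option is to serialize the records first. This is a sketch under assumptions: `to_prompt_json` is a hypothetical helper, and it assumes the record values are plain Python types, with `default=str` as a fallback for anything `json` cannot encode.

```python
import json
import math


def to_prompt_json(records):
    """Serialize records to a JSON string, mapping NaN to null."""

    def clean(value):
        # NaN has no JSON representation; use null instead
        if isinstance(value, float) and math.isnan(value):
            return None
        return value

    cleaned = [{key: clean(val) for key, val in rec.items()} for rec in records]
    # default=str is a fallback for values json cannot encode directly
    return json.dumps(cleaned, indent=2, default=str)


sample_records = [
    {"Amount": "$134.09", "Errors?": float("nan"), "Zip": 91750.0},
]
payload = to_prompt_json(sample_records)
print(payload)
```

The resulting string could then be passed as the `"transactions"` value in place of the raw list.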


In [60]:
import re
from rich.console import Console
from rich.markdown import Markdown

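# Split the model's markdown output at each "###" heading
sections = re.split(r"###", output)

console = Console()

# Re-attach the heading marker and render each non-empty section as markdown
for section in sections:
    if section.strip():
        markdown_content = Markdown(f"### {section.strip()}")
        console.print(markdown_content)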