## This notebook illustrates how to develop and deploy a conversational AI that answers queries about banking products, specifically credit cards

## Create Vector database

### Implement Document Splitting

### Install libraries

In [1]:
%pip install openai

Note: you may need to restart the kernel to use updated packages.


In [2]:
pip install python-dotenv

Note: you may need to restart the kernel to use updated packages.


In [None]:
pip install langchain

In [25]:
pip install langchain-openai

Note: you may need to restart the kernel to use updated packages.


In [None]:
pip install pypdf

In [None]:
pip install faiss-cpu

In [None]:
pip install langchainhub

In [None]:
pip install langchain-community

## Load & Split PDF documents into chunks

In [4]:
# Collect every PDF in ../credit_card_products and create one loader per file
from langchain_community.document_loaders import PyPDFLoader
import os

pdf_files = []
for file in os.listdir("../credit_card_products"):
    if file.endswith(".pdf"):
        pdf_files.append(file)
pdf_loaders = [PyPDFLoader(f"../credit_card_products/{file}") for file in pdf_files]

pages = []

for loader in pdf_loaders:
    pages.extend(loader.load())

Ignoring wrong pointing object 6 0 (offset 0)
Ignoring wrong pointing object 8 0 (offset 0)
Ignoring wrong pointing object 10 0 (offset 0)
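
The file-listing step above depends on the order returned by `os.listdir`, which is not guaranteed, and it skips files whose extension is upper case. A minimal sketch of a stricter alternative follows; the helper name `list_pdf_paths` is hypothetical and not part of the original notebook.

```python
from pathlib import Path


def list_pdf_paths(directory):
    """Return the PDF files in `directory`, sorted by name for a stable load order."""
    folder = Path(directory)
    # Compare the lowercased suffix so files ending in ".PDF" are also included,
    # and skip directories that happen to have a ".pdf" suffix.
    return sorted(
        path
        for path in folder.iterdir()
        if path.is_file() and path.suffix.lower() == ".pdf"
    )
```

Each returned path could then be passed to `PyPDFLoader(str(path))` in place of the f-string used in the cell above.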


In [5]:
from langchain.text_splitter import CharacterTextSplitter
text_splitter = CharacterTextSplitter(
    separator="\n",
    chunk_size=1500,
    chunk_overlap=100,
    length_function=len
)
docs = text_splitter.split_documents(pages)

In [12]:
len(docs)

195
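
To build intuition for `chunk_size` and `chunk_overlap`, here is a rough pure-Python sketch of separator-based chunking. It illustrates the general idea only and is not LangChain's actual implementation; the function name `split_with_overlap` is an assumption made for this example.

```python
def split_with_overlap(text, separator="\n", chunk_size=1500, chunk_overlap=100):
    """Greedily pack separator-delimited pieces into chunks of at most chunk_size characters.

    Each new chunk starts with trailing pieces of the previous chunk whose
    joined length is at most chunk_overlap characters.
    """
    pieces = [p for p in text.split(separator) if p]
    chunks, current = [], []

    def joined_len(parts):
        return len(separator.join(parts))

    for piece in pieces:
        if current and joined_len(current + [piece]) > chunk_size:
            # The current chunk is full: emit it.
            chunks.append(separator.join(current))
            # Drop pieces from the front until the carried-over tail fits the
            # overlap budget and still leaves room for the incoming piece.
            while current and (
                joined_len(current) > chunk_overlap
                or joined_len(current + [piece]) > chunk_size
            ):
                current.pop(0)
        current.append(piece)
    if current:
        chunks.append(separator.join(current))
    return chunks
```

Note that a single piece longer than `chunk_size` is emitted as its own oversized chunk in this sketch, since there is no separator inside it to split on.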

In [13]:
docs[3]

Document(metadata={'source': '../credit_card_products/Citi® _ AAdvantage® Executive - Airline Miles Credit Card _ Citi.com.pdf', 'page': 1}, page_content="purchas es.\nLoyalty Points Bonus es\nEarn a 10,000 L oyalty Points bonus aft er reaching\n50,000 L oyalty Points in a s tatus qualiﬁc ation y ear.\nEarn ano ther 10,000 L oyalty Points bonus aft er\nreaching 90,000 L oyalty Points in the s ame s tatus\nqualiﬁc ation y ear.Admir als Club  Member ship\nInclude s access to nearly 100 A dmir als Club  and p artner\nlounge s worldwide . Immediat e family ( spous e, dome stic\npartner and/ or childr en under 18) or up t o 2 gue sts ma y join\nyou. Up t o $850 v alue.\nRedeeming AA dvantage  Mile s\nUse your AA dvantage  mile s earned fr om y our C iti /\nAAdvantage  Executive card for award travel to over 1,000\ndestinations w orldwide , with ﬂe xible r edemp tion op tions f or\none- way or r ound trip a wards on Americ an Airline s. Your\nAAdvantage  mile s can als o be r edeemed f or Bu

In [15]:
docs[41]

Document(metadata={'source': '../credit_card_products/Citi® Double Cash Card - Cash Back Credit Card _ Citi.com.pdf', 'page': 1}, page_content="This o ffer is a vailable if y ou apply thr ough the me thod( s) provided in this ad t oday. Offers ma y vary and this o ffer ma y not be\navailable in o ther plac es wher e the c ard is o ffered.\nAdditional In formation +\nFrequen tly Ask ed Que stions\nCASH BACK CREDIT CARD REWARDS & PROGRAM DETAILS\nApply no w for one o f Citi's best cash b ack cr edit c ards, with no c aps and no c ategory r estrictions . Earn c ash b ack r ewards in e very\npurchas e with the C iti Double C ash  Card. You earn unlimit ed 1% c ash b ack on pur chas es made with y our c ash b ack cr edit c ard, plus\nanother 1% c ash b ack as y ou pay for tho se pur chas es, whe ther y ou pay in full or o ver time .\nCash b ack is e arned in the f orm o f ThankY ou Points. This me ans e ach billing c ycle, you will e arn 1 T hankY ou poin t per $1 spen t on\npurchas es and 

In [14]:
docs[20]

Document(metadata={'source': '../credit_card_products/AT&T Points Plus® - Rewards Credit Card _ Citi.com.pdf', 'page': 3}, page_content='Additional In formation\nApply No w (https://online .citi.com/US /ag/cards/applic ation ?app=UNSOL &HKOP=62b5f e3cf6f ac441d6d2ab156b7016c7280e552672\nFIND THE RIGHT CREDIT CARD FOR YOU\nAll Cr edit C ards\n(https://www .citi.com/credit-\ncards/comp are/view-all-cr edit-\ncards?\nintc=citic ard_vac_202405_AB &afc=1C2)Rewards Cards\n(https://www .citi.com/credit-\ncards/comp are/rewards-credit-\ncards?\nintc=citic ard_vac_202405_AB &afc=1C2)Travel Cards\n(https://www .citi.com/credit-\ncards/comp are/travel-reward-\ncredit-cards?\nintc=citic ard_vac_202405_AB &afc=1C2)0 %\n0% In tro AP\n(https://www .cit\ncards/comp are/0-\napr-credit-\nintc=citic ard_vac_2\n®\n1\n2\nWhy Citi\nWealth Managemen t\nBusine ss Banking\nRates\nApply No w (https://online .citi.com10/3/24, 3:34 PM AT&T Points Plus® - Rewards Credit Card | Citi.com\nhttps://www.citi.com/credit

## Generate embeddings and store in vector database
### FAISS (Facebook AI Similarity Search) vector database

In [2]:
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) 

OPENAI_API_KEY=os.environ['OPENAI_API_KEY']
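
Reading the key with `os.environ[...]` raises a bare `KeyError` when the variable is missing. A small sketch of a friendlier check is shown below; the helper name `require_env` is hypothetical.

```python
import os


def require_env(name):
    """Return the value of environment variable `name`, or raise a descriptive error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file or export it in the shell."
        )
    return value
```

With this helper, the assignment above could be written as `OPENAI_API_KEY = require_env("OPENAI_API_KEY")` after `load_dotenv` has run.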

In [7]:
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
embeddings_model = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY, model="text-embedding-3-small")
# Load it into the vector store and embed
vectordb = FAISS.from_documents(docs, embeddings_model)

In [8]:
print(vectordb.index.ntotal)

195


### Persist Data in Vector Store

In [9]:
vectordb.save_local("faiss2_credit_card_index")

### Load Vector Store

In [10]:
new_db = FAISS.load_local("faiss2_credit_card_index", embeddings_model, allow_dangerous_deserialization=True)
new_db.index.ntotal

195
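
Because embedding all chunks costs API calls, it can be convenient to reuse a saved index when one exists. The following is a sketch of that pattern under stated assumptions: `load_or_build` and its callable parameters are hypothetical names, and the commented usage relies only on the FAISS calls already shown in this notebook. The `allow_dangerous_deserialization=True` flag is needed because part of the saved index is pickled, so only load index folders you created yourself.

```python
from pathlib import Path


def load_or_build(index_dir, load_fn, build_fn, save_fn):
    """Load a persisted index from index_dir if it exists, otherwise build and save it."""
    if Path(index_dir).is_dir():
        return load_fn(index_dir)
    index = build_fn()
    save_fn(index, index_dir)
    return index


# Possible usage with the objects defined in this notebook:
# vectordb = load_or_build(
#     "faiss2_credit_card_index",
#     load_fn=lambda d: FAISS.load_local(
#         d, embeddings_model, allow_dangerous_deserialization=True
#     ),
#     build_fn=lambda: FAISS.from_documents(docs, embeddings_model),
#     save_fn=lambda index, d: index.save_local(d),
# )
```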

## Perform semantic search

In [15]:
def print_output(docs):
    for doc in docs:
        print('The output is: {}. \n\nThe metadata is {} \n\n'.format(doc.page_content, doc.metadata))

# Use a separate name so the chunk list `docs` from the splitting step is not overwritten
results = vectordb.similarity_search("what is Annual Percentage Rate (APR) for Purchases?")
print_output(results)

The output is: CITI DISCLOSURES
Interest Rates and Interest Charges
Annual Percentage Rate (APR)
for Purchases 20.74% to 28.74%, based on your creditworthiness.
These APRs will vary with the market based on the Prime Rate.a
APR for Balance Transfers20.74% to 28.74%, based on your creditworthiness, for transfers
completed within 2 months from date of account opening.
These APRs will vary with the market based on the Prime Rate.a
APR for Cash Advances29.99%
This APR will vary with the market based on the Prime Rate.b
APR for Citi Flex Plan20.74% to 28.74%, based on your creditworthiness.
These APRs will vary with the market based on the Prime Rate.a
Penalty APR and When it AppliesUp to 29.99%, based on your creditworthiness.
This APR will vary with the market based on the Prime Rate.
This APR may be applied to your account if you:
(1) Make a late payment or
(2) Make a payment that is returned.
How Long Will the Penalty APR Apply? If your APRs are
increased for either of these reasons, th
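
Conceptually, a similarity search embeds the query and returns the stored chunks whose vectors lie closest to it. The toy sketch below shows that idea with plain Python lists and Euclidean distance. It is an illustration rather than FAISS itself: the function name `top_k_nearest` is hypothetical, and both the distance metric and the default of `k=4` are assumptions about the defaults used here.

```python
import math


def top_k_nearest(query_vec, doc_vecs, k=4):
    """Return the indices of the k vectors closest to query_vec by Euclidean distance."""
    distances = [
        (math.dist(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)
    ]
    # Sort by distance (ties broken by index) and keep the k nearest.
    distances.sort()
    return [idx for _, idx in distances[:k]]


# Toy two-dimensional "embeddings"; real embeddings have many more dimensions.
toy_vectors = [[0.9, 0.1], [0.0, 1.0], [1.0, 0.0], [-1.0, 0.0]]
nearest = top_k_nearest([1.0, 0.0], toy_vectors, k=2)
```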

## Use vector database (vectordb) as retriever

### Configure retriever
#### Use the similarity search capabilities of a vector store to facilitate retrieval

In [16]:
retriever = vectordb.as_retriever(search_type="similarity", search_kwargs={"k": 6})

## Configure LLM

In [17]:
from langchain_openai import ChatOpenAI

# Initialize the LLM we'll use: OpenAI GPT-3.5 Turbo
llm = ChatOpenAI(openai_api_key=OPENAI_API_KEY, model="gpt-3.5-turbo-0125")

### Define prompt with conversation history

In [18]:
from langchain.chains import create_history_aware_retriever
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

system_prompt = """Given the chat history and the latest user question, \
generate a standalone question \
that can be understood without the chat history. Do NOT answer the question; \
just reformulate it if needed, or otherwise return it as is."""

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}"),
    ]
)

retriever_with_history = create_history_aware_retriever(
    llm, retriever, prompt
)
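
The control flow of a history-aware retriever can be sketched in plain Python: with an empty history the question goes straight to the retriever, and otherwise the LLM first rewrites it into a standalone question. The sketch below uses stand-in callables (`rewrite_fn`, `retrieve_fn`) that are assumptions for illustration, not LangChain objects.

```python
def history_aware_retrieve(question, chat_history, rewrite_fn, retrieve_fn):
    """Retrieve documents for question, rewriting it first when there is prior history."""
    if not chat_history:
        # Nothing to resolve against, so search with the question as asked.
        return retrieve_fn(question)
    # Resolve references such as "it" using the earlier turns, then search.
    standalone_question = rewrite_fn(question, chat_history)
    return retrieve_fn(standalone_question)
```

This is why a follow-up such as "is it visa?" can still retrieve the right chunks: the retriever sees the rewritten, self-contained question rather than the ambiguous one.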

## Perform question answering

In [19]:
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain

qa_system_prompt = """You are an assistant for question-answering tasks. \
Use the following pieces of retrieved context to answer the question. \
If you don't know the answer, just say that you don't know. \
Use three sentences maximum and keep the answer concise.\

{context}"""

qa_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", qa_system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}"),
    ]
)


question_answer_chain = create_stuff_documents_chain(llm, qa_prompt)

rag_chain = create_retrieval_chain(retriever_with_history, question_answer_chain)

In [21]:
from langchain_core.messages import AIMessage, HumanMessage

chat_history = []

question = "Can anyone apply for the Costco Anywhere Visa® Card by Citi?"

ai_msg_1 = rag_chain.invoke({"input": question, "chat_history": chat_history})

# Store the answer as an AIMessage so the model can tell its own turns from the user's
chat_history.extend([HumanMessage(content=question), AIMessage(content=ai_msg_1["answer"])])

print(ai_msg_1["answer"])

No, the Costco Anywhere Visa Card by Citi is exclusively for Costco members. If you do not already have a Costco membership, you can purchase one at Costco.com before applying for the Costco Anywhere credit card.


In [22]:
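from langchain_core.messages import AIMessage

second_question = "Is it a Visa or a Mastercard?"

ai_msg_2 = rag_chain.invoke({"input": second_question, "chat_history": chat_history})
# Record the second question (not the first) together with its answer
chat_history.extend([HumanMessage(content=second_question), AIMessage(content=ai_msg_2["answer"])])
print(ai_msg_2["answer"])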

The Costco Anywhere Visa Card by Citi is a Visa credit card.
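
The invoke-then-record pattern used in the two cells above can be wrapped in a small helper so that each turn is added to the history consistently. This is a sketch under assumptions: the name `ask` is hypothetical, and it stores turns as `(role, content)` tuples, which LangChain message placeholders generally accept; if your installed version does not, swap in `HumanMessage` and `AIMessage` objects instead.

```python
def ask(chain, question, chat_history):
    """Invoke chain with the running history, record the turn, and return the answer."""
    result = chain.invoke({"input": question, "chat_history": chat_history})
    answer = result["answer"]
    # Append both sides of the turn so the next call sees the full conversation.
    chat_history.extend([("human", question), ("ai", answer)])
    return answer
```

A call such as `ask(rag_chain, "Does it have an annual fee?", chat_history)` would then both return the answer and update `chat_history` in place.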
