# Bank AI Assistant - OpenAI RAG System

This notebook demonstrates a Retrieval-Augmented Generation (RAG) system using OpenAI's embeddings and GPT models for answering questions about Bank of America's Online Banking Service Agreement.

## Features:
- OpenAI text-embedding-3-small for high-quality embeddings
- GPT-3.5-turbo for natural language answer generation
- Persistent index storage for fast subsequent runs
- Batch processing for efficient API usage
- Top-k chunk retrieval ranked by cosine similarity


## 1. Dependencies and Imports


In [5]:
# Install required packages
!pip install openai scikit-learn langchain pypdf




In [6]:
import numpy as np
from pypdf import PdfReader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from sklearn.metrics.pairwise import cosine_similarity
import openai
import os
import pickle
import time
import warnings
warnings.filterwarnings('ignore')


## 2. Helper Functions


In [7]:
def load_pdf(path):
    """Load and extract text from PDF file"""
    reader = PdfReader(path)
    text = ""
    for page in reader.pages:
        text += page.extract_text() + "\n"
    return text

def setup_openai(api_key=None):
    """Setup OpenAI API key"""
    if api_key:
        openai.api_key = api_key
    elif os.getenv("OPENAI_API_KEY"):
        openai.api_key = os.getenv("OPENAI_API_KEY")
    else:
        print("Please set your OpenAI API key:")
        print("export OPENAI_API_KEY='your-api-key-here'")
        return False
    return True


## 3. Loading PDF


In [8]:
pdf_path = "../../dataset/Bank_of_America_Online Banking_Service Agreement.pdf"
document_text = load_pdf(pdf_path)
print("Document length (chars):", len(document_text))


Document length (chars): 124022


## 4. Chunking the Text
We'll split the document into manageable chunks for embedding.

In [9]:
splitter = RecursiveCharacterTextSplitter(
    chunk_size=800,
    chunk_overlap=100,
    separators=["\n\n", "\n", ".", " "]
)
chunks = splitter.split_text(document_text)
print(f"Total chunks created: {len(chunks)}")
print("First chunk preview:", chunks[0][:300])

Total chunks created: 173
First chunk preview: Online Banking
Online Banking Service Agreement
‹‹  Go to Online Banking Overview
Bank of America Online Banking Service Agreement
Effective Date: July 21, 2025
Table of Contents: Hide all Topics
1. General Description of Bank of America Online Banking Service Agreement (this "Agreement")
Introducti
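
To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of fixed-window chunking in plain Python. It is a simplification, not what `RecursiveCharacterTextSplitter` does internally: the real splitter prefers to break at the listed separators, so its chunk boundaries will differ. The function name `chunk_with_overlap` is hypothetical.

```python
def chunk_with_overlap(text, chunk_size=800, chunk_overlap=100):
    """Split text into fixed-size windows; neighbours share `chunk_overlap` characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    pieces = []
    for start in range(0, len(text), step):
        pieces.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return pieces

sample = "abcdefghij" * 5  # 50 characters
pieces = chunk_with_overlap(sample, chunk_size=20, chunk_overlap=5)
print([len(p) for p in pieces])            # [20, 20, 20]
print(pieces[0][-5:] == pieces[1][:5])     # True: the windows overlap by 5 characters
```

The overlap means a sentence cut at a window boundary still appears whole in at least one neighbouring chunk, at the cost of embedding some text twice.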


## 5. OpenAI Embedding Test
Getting an embedding for a single chunk to make sure our API key works.

In [10]:
setup_openai()
try:
    response = openai.embeddings.create(
        model="text-embedding-3-small",
        input=[chunks[0]]
    )
    print("Embedding length:", len(response.data[0].embedding))
except Exception as e:
    print("Error getting embedding:", e)

Embedding length: 1536


## 6. Batch Embedding Function
Now, let's write a function to get embeddings for all chunks, batching to avoid rate limits.

In [11]:
def get_embeddings(texts, model="text-embedding-3-small", batch_size=100):
    """Embed texts in batches; on failure, retry the remaining texts with a smaller batch."""
    all_embeddings = []
    for i in range(0, len(texts), batch_size):
        batch = texts[i:i + batch_size]
        print(f"Batch {i//batch_size + 1}")
        try:
            response = openai.embeddings.create(
                model=model,
                input=batch
            )
            batch_embeddings = [data.embedding for data in response.data]
            all_embeddings.extend(batch_embeddings)
            time.sleep(0.1)  # brief pause between requests
        except Exception as e:
            print(f"Error in batch {i//batch_size + 1}: {e}")
            if batch_size > 10:
                print("Retrying with smaller batch size...")
                # Resume from the failed batch instead of re-embedding everything
                all_embeddings.extend(get_embeddings(texts[i:], model, batch_size // 2))
                return all_embeddings
            else:
                raise
    return all_embeddings
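
An alternative to shrinking the batch is to retry the same batch after a growing delay, which suits transient rate-limit errors. The sketch below is only an illustration under assumptions: `embed_in_batches`, `embed_fn`, `max_retries`, and `base_delay` are hypothetical names, and `flaky_embed` is a stand-in for the real API call so the example runs offline.

```python
import time

def embed_in_batches(texts, embed_fn, batch_size=100, max_retries=3, base_delay=0.5):
    """Embed texts batch by batch, retrying a failed batch with exponential backoff."""
    all_embeddings = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        for attempt in range(max_retries):
            try:
                all_embeddings.extend(embed_fn(batch))
                break  # this batch succeeded, move on to the next one
            except Exception:
                if attempt == max_retries - 1:
                    raise  # out of retries, surface the error
                time.sleep(base_delay * 2 ** attempt)  # wait 1x, 2x, 4x, ... base_delay
    return all_embeddings

# Stand-in embedder (assumption): fails once, then returns one-dimensional vectors.
calls = {"n": 0}

def flaky_embed(batch):
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("simulated transient failure")
    return [[float(len(t))] for t in batch]

vectors = embed_in_batches(["a", "bb", "ccc"], flaky_embed, batch_size=2, base_delay=0.0)
print(vectors)  # [[1.0], [2.0], [3.0]]
```

Passing the embedder in as `embed_fn` keeps the batching logic testable without network access; in the notebook it would wrap the `openai.embeddings.create` call.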

## 7. Build Embedding Index
Build the embedding index for all chunks.

In [12]:
embeddings = get_embeddings(chunks)
embeddings = np.array(embeddings)
print("Embeddings shape:", embeddings.shape)

Batch 1
Batch 2
Embeddings shape: (173, 1536)


## 8. Cosine Similarity Search
A function that searches for the most relevant chunks using cosine similarity.

In [13]:
def search(query, chunks, embeddings, model="text-embedding-3-small", top_k=3):
    query_embedding = get_embeddings([query], model=model)[0]
    query_vector = np.array(query_embedding).reshape(1, -1)
    similarities = cosine_similarity(query_vector, embeddings).flatten()
    top_indices = similarities.argsort()[-top_k:][::-1]
    results = []
    for i, idx in enumerate(top_indices):
        results.append({
            'chunk': chunks[idx],
            'similarity': similarities[idx],
            'rank': i + 1
        })
    return results
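
For readers who want to see the arithmetic behind `cosine_similarity` and the top-k selection, here is a small sketch in plain Python on toy two-dimensional vectors. The helper names `cosine` and `top_k` are hypothetical, and the vectors stand in for real embeddings.

```python
import math

def cosine(a, b):
    """Cosine similarity: dot product divided by the product of the vector norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=3):
    """Return (index, score) pairs for the k most similar document vectors."""
    scored = [(cosine(query_vec, v), i) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda s: s[0], reverse=True)  # highest similarity first
    return [(i, score) for score, i in scored[:k]]

docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ranked = top_k([2.0, 0.0], docs, k=2)
print([i for i, _ in ranked])  # [0, 2]
```

The query `[2.0, 0.0]` points the same way as document 0, so it scores 1.0 regardless of length; cosine similarity compares direction only, which is why the magnitude of an embedding does not affect the ranking.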

## 9. Trying a Search

In [14]:
results = search("What is the Zelle transfer limit for new users?", chunks, embeddings)
for r in results:
    print(f"Rank {r['rank']} (score={r['similarity']:.3f}):\n{r['chunk'][:200]}\n")

Batch 1
Rank 1 (score=0.755):
your initial enrollment in the Service, or for certain recipients. The minimum transfer amount for any single Zelle  transfer is $1.00. There are no
receiving limits for Zelle  transfers.
Zelle  trans

Rank 2 (score=0.709):
transactions
$45,000 / 30
transactions$60,000 /120
transactionsZelle  enrollment tenure 16-30 days $4,000 / 20
transactions
Zelle  enrollment tenure 31-60 days $8,000 / 20
transactions
Zelle  enrollme

Rank 3 (score=0.648):
limits will be lower for the first 60 days of your initial Zelle enrollment with your customer profile (User ID). W e also limit the amount you can send
to certain recipients on a daily basis. W e lim



## 10. Generate Answer with GPT
We use GPT-3.5-turbo to generate an answer, passing the top-ranked chunks as context.

In [15]:
def answer_question(question, chunks, embeddings, top_k=3):
    results = search(question, chunks, embeddings, top_k=top_k)
    context = "\n\n".join([r['chunk'] for r in results])
    try:
        response = openai.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {"role": "system", "content": "You are a helpful assistant that answers questions about Bank of America's Online Banking Service Agreement. Use only the provided context to answer questions. If the answer is not in the context, say 'I don't have enough information to answer that question.'"},
                {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"}
            ],
            max_tokens=300,
            temperature=0.1
        )
        answer = response.choices[0].message.content
    except Exception as e:
        print(f"Error generating answer: {e}")
        answer = f"Based on the Bank of America Online Banking Service Agreement:\n\n{context[:1000]}..."
    return answer, results
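
The prompt assembly inside `answer_question` can be pulled out and inspected without calling the API. The sketch below is one possible variant, not the notebook's method: numbering the chunks and capping the context with `max_context_chars` are assumptions added for illustration, and `build_messages` and the shortened `SYSTEM_PROMPT` are hypothetical.

```python
SYSTEM_PROMPT = (
    "Use only the provided context to answer. If the answer is not in the "
    "context, say you don't have enough information."
)

def build_messages(question, retrieved, max_context_chars=4000):
    """Assemble chat messages from retrieved chunks, stopping at a character budget."""
    parts, used = [], 0
    for n, r in enumerate(retrieved, start=1):
        block = f"[{n}] {r['chunk']}"
        if used + len(block) > max_context_chars:
            break  # keep the highest-ranked chunks that fit the budget
        parts.append(block)
        used += len(block)
    context = "\n\n".join(parts)
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ]

retrieved = [{"chunk": "Alpha"}, {"chunk": "Beta"}]
messages = build_messages("Q?", retrieved)
print(messages[1]["content"])
```

Printing the user message before sending it is a cheap way to check what the model actually sees when an answer looks wrong.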

## 11. Full RAG QA

In [16]:
question = "How do I cancel a scheduled bill payment?"
answer, context_results = answer_question(question, chunks, embeddings)
print("Answer:", answer)

Batch 1
Answer: You can cancel a scheduled bill payment by following the instructions on Bank of America's website. The cancel feature can be found in the payment activity section. Additionally, you may also request to cancel a future scheduled or recurring transfer by calling Bank of America at 800.432.1000 for consumer accounts and 866.758.5972 for small business accounts.


## 12. Save/Load Index

In [17]:
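def save_index(filepath, chunks, embeddings, model):
    """Pickle the chunks, embeddings, and embedding model name to filepath."""
    index_data = {
        'chunks': chunks,
        'embeddings': embeddings,
        'model': model
    }
    # Create the target directory if it does not exist yet
    os.makedirs(os.path.dirname(filepath) or ".", exist_ok=True)
    with open(filepath, 'wb') as f:
        pickle.dump(index_data, f)
    print(f"Index saved to {filepath}")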

In [18]:
def load_index(filepath):
    if os.path.exists(filepath):
        with open(filepath, 'rb') as f:
            index_data = pickle.load(f)
        print(f"Index loaded from {filepath}")
        return index_data['chunks'], index_data['embeddings'], index_data['model']
    return None, None, None

In [19]:
save_index("../data/openai_rag_index.pkl", chunks, embeddings, "text-embedding-3-small")

Index saved to ../data/openai_rag_index.pkl


In [20]:
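# Load into temporary names so a missing file does not overwrite the in-memory index
loaded_chunks, loaded_embeddings, model = load_index("../data/openai_rag_index.pkl")
if loaded_chunks is not None:
    chunks, embeddings = loaded_chunks, loaded_embeddings
    print("Index loaded successfully!")
else:
    print("No index found, please build it first.")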

Index loaded from ../data/openai_rag_index.pkl
Index loaded successfully!
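
Saving and loading are usually combined into a single "load if cached, otherwise build and save" step, so that later runs skip the embedding calls. Here is a minimal sketch of that pattern, assuming a hypothetical `load_or_build` helper and a toy `toy_build` function in place of the real chunking and embedding steps. Note that `pickle.load` can execute arbitrary code, so only load index files you created yourself.

```python
import os
import pickle
import tempfile

def load_or_build(filepath, build_fn):
    """Return (index, cache_hit): load a pickled index if present, else build and save it."""
    if os.path.exists(filepath):
        with open(filepath, "rb") as f:
            return pickle.load(f), True
    index = build_fn()
    os.makedirs(os.path.dirname(filepath) or ".", exist_ok=True)
    with open(filepath, "wb") as f:
        pickle.dump(index, f)
    return index, False

def toy_build():
    # Stand-in for the expensive chunk + embed steps (assumption for illustration)
    return {"chunks": ["a", "b"], "embeddings": [[0.1], [0.2]], "model": "toy-model"}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "cache", "index.pkl")
    first, hit_first = load_or_build(path, toy_build)    # builds and saves
    second, hit_second = load_or_build(path, toy_build)  # loads from disk

print(hit_first, hit_second)  # False True
```

Storing the model name alongside the vectors, as `save_index` does, makes it possible to detect a cached index built with a different embedding model than the one used for queries.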
