# **Import Modules**

## *python*

In [1]:
# Import python modules
import os
import sys

## *custom*

In [2]:
# Import customised modules

# Add the parent directory (project root) to the import path
sys.path.append(os.path.abspath(".."))

# Import relevant modules
try:
    from scripts._03_embed_and_index import LangchainIndexer

except ImportError as exc:
    # Report the underlying error so a missing dependency is not mistaken for a bad path
    print(f"Could not import module from `scripts/`: {exc}")

# **Embedding and Indexing**

In [3]:
# Define the paths to the chunked complaints CSV and the vector store directory
df_chunks_path = os.path.join(
    os.path.dirname(os.getcwd()), "data/processed/chunked_complaints.csv"
)
vector_store_dir = os.path.join(os.path.dirname(os.getcwd()), "data", "vector store")
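The paths above assume the notebook runs from a directory directly under the project root. As a minimal sketch of the same layout with `pathlib`, the helper below derives both paths from an explicit notebook directory; `project_paths` and its argument are hypothetical names introduced for illustration and are not part of the project's scripts.

```python
from pathlib import Path
from typing import Dict


def project_paths(notebook_dir: Path) -> Dict[str, Path]:
    """Build the data paths relative to the parent of `notebook_dir`.

    Hypothetical helper: mirrors the os.path.join calls above, assuming
    the project root is the parent of the notebook directory.
    """
    root = notebook_dir.resolve().parent
    return {
        "df_chunks_path": root / "data" / "processed" / "chunked_complaints.csv",
        "vector_store_dir": root / "data" / "vector store",
    }


paths = project_paths(Path.cwd())
print(paths["df_chunks_path"])
print(paths["vector_store_dir"])
```

Passing the directory explicitly makes the assumption about the working directory visible instead of relying on `os.getcwd()` alone.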

In [4]:
indexer = LangchainIndexer(
    df_chunks_path=df_chunks_path, vector_store_dir=vector_store_dir
)

# Resume indexing if a previous run was interrupted
indexer.index_chunks(
    batch_size=5000, resume_from=180
)  # Resume after batch 180, so indexing continues with batch 181 (default is 0)

Using embedding model: all-MiniLM-L6-v2

📥 Loaded 1256204 chunks from ..\data\processed\chunked_complaints.csv

💎 Indexing 1256204 complaint chunks into ChromaDB in batches of 5000...

Preparing documents: 100%|██████████| 1256204/1256204 [01:05<00:00, 19314.15it/s]



✅ Indexed batch 181 (5000 documents)

✅ Indexed batch 182 (5000 documents)

✅ Indexed batch 183 (5000 documents)

✅ Indexed batch 184 (5000 documents)

✅ Indexed batch 185 (5000 documents)

✅ Indexed batch 186 (5000 documents)

✅ Indexed batch 187 (5000 documents)

✅ Indexed batch 188 (5000 documents)

✅ Indexed batch 189 (5000 documents)

✅ Indexed batch 190 (5000 documents)

✅ Indexed batch 191 (5000 documents)

✅ Indexed batch 192 (5000 documents)

✅ Indexed batch 193 (5000 documents)

✅ Indexed batch 194 (5000 documents)

✅ Indexed batch 195 (5000 documents)

✅ Indexed batch 196 (5000 documents)

✅ Indexed batch 197 (5000 documents)

✅ Indexed batch 198 (5000 documents)

✅ Indexed batch 199 (5000 documents)

✅ Indexed batch 200 (5000 documents)

✅ Indexed batch 201 (5000 documents)

✅ Indexed batch 202 (5000 documents)

✅ Indexed batch 203 (5000 documents)

✅ Indexed batch 204 (5000 documents)

✅ Indexed batch 205 (5000 documents)
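The log above starts at batch 181 after `resume_from=180`, which is consistent with skipping the first 180 batches and numbering batches from 1. The internals of `LangchainIndexer.index_chunks` are not shown in this notebook, so the following is only a sketch of that batching arithmetic under this assumption; `iter_batches` is a hypothetical helper, not the project's implementation.

```python
import math
from typing import Iterator, Sequence, Tuple, TypeVar

T = TypeVar("T")


def iter_batches(
    items: Sequence[T], batch_size: int, resume_from: int = 0
) -> Iterator[Tuple[int, Sequence[T]]]:
    """Yield (1-based batch number, batch) pairs.

    Assumption: `resume_from` counts batches that are already indexed,
    so the first yielded batch is number `resume_from + 1`.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    total_batches = math.ceil(len(items) / batch_size)
    for batch_idx in range(resume_from, total_batches):
        start = batch_idx * batch_size
        yield batch_idx + 1, items[start : start + batch_size]


# Same sizes as in the run above; a range stands in for the chunk list.
chunks = range(1_256_204)
remaining = list(iter_batches(chunks, batch_size=5000, resume_from=180))
first_number, first_batch = remaining[0]
print(first_number, first_batch[0], len(remaining), len(remaining[-1][1]))
```

Under these assumptions, 1,256,204 chunks in batches of 5,000 give 252 batches in total. Resuming after batch 180 skips the first 900,000 chunks and leaves 72 batches, the last of which holds 1,204 chunks.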


In [5]:
# Search for complaints related to BNPL issues
results = indexer.search(query="Why are customers unhappy with Buy Now Pay Later?", k=5)

# Print every result returned by the search
for i, doc in enumerate(results):
    print(f"--- Result {i+1} ---")
    print(f"Product: {doc.metadata.get('Product')}")
    print(f"Issue: {doc.metadata.get('Issue')}")
    print(f"Chunk:{doc.page_content}\n")

--- Result 1 ---
Product: Credit card
Issue: Other features, terms, or problems
Chunk:i am am shocked and appaled that this is how they treat valued customers, as i have an excellent credit rating and have never been late on a payment.

--- Result 2 ---
Product: Credit card or prepaid card
Issue: Fees or interest
Chunk:they are now charging customers late fees even if they received the payment on the same day.

--- Result 3 ---
Product: Credit card
Issue: APR or interest rate
Chunk:i thought that the new laws protected customers if they were a few days late on a payment?

--- Result 4 ---
Product: Credit card
Issue: Late fee
Chunk:i am sure i am not the only consumer this is happening to. we should not be penalized for making payments early. there should be a system in place to prevent this mis-applying of payments and charging unfair late fees to consumers- especially ones like me who have been faithful in paying every single month.

--- Result 5 ---
Product: Credit card
Issue: Fees o

In [7]:
# Search for complaints related to Payday Loans issues
results = indexer.search(query="Why are customers unhappy with Payday Loans?", k=5)

# Print every result returned by the search
for i, doc in enumerate(results):
    print(f"--- Result {i+1} ---")
    print(f"Product: {doc.metadata.get('Product')}")
    print(f"Issue: {doc.metadata.get('Issue')}")
    print(f"Chunk:{doc.page_content}\n")

--- Result 1 ---
Product: Payday loan
Issue: Charged fees or interest I didn't expect
Chunk:secondly, the rates are staggering and they never check to see if i could afford these loans as there are rules stating that a purchaser should not pay on more than xxxx outstanding payday loans at any one time. and, i have been paying on a total of ( xxxx ) payday loans for well over 9 months in succession with much difficulty. thirdly, i not only continue to pay these loans with exorbitant rates and fees.

--- Result 2 ---
Product: Payday loan
Issue: Charged fees or interest I didn't expect
Chunk:secondly, the rates are staggering and they never check to see if i could afford these loans as there are rules stating that a purchaser should not pay on more than xxxx outstanding payday loans at any one time. and, i have been paying on a total of ( xxxx ) payday loans for well over 9 months in succession with much difficulty. thirdly, i not only continue to pay these loans with exorbitant rates and
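Results 1 and 2 above carry the same text as far as the captured output shows. If repeated chunks are not wanted in the top-k list, one option is to filter the results by normalized text before printing them. The sketch below uses a small stand-in class in place of the LangChain document type so that it runs on its own; `Doc` and `dedupe_by_content` are hypothetical names, and the only assumption about the real results is that they expose `page_content` and `metadata`, as the loop above does.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List


@dataclass
class Doc:
    """Hypothetical stand-in for a search result with text and metadata."""

    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def dedupe_by_content(docs: Iterable[Doc]) -> List[Doc]:
    """Keep the first occurrence of each chunk, comparing whitespace- and
    case-normalized text, and preserve the original ranking order."""
    seen = set()
    unique = []
    for doc in docs:
        key = " ".join(doc.page_content.split()).lower()
        if key not in seen:
            seen.add(key)
            unique.append(doc)
    return unique


sample = [
    Doc("the rates are staggering", {"Product": "Payday loan"}),
    Doc("The rates  are staggering", {"Product": "Payday loan"}),
    Doc("fees were not disclosed", {"Product": "Payday loan"}),
]
for doc in dedupe_by_content(sample):
    print(doc.page_content)
```

Because filtering can shrink the list below `k`, a caller who needs exactly `k` distinct results would have to request more than `k` from the search first.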

In [6]:
indexer.preview_results(query="What are the common complaints about credit cards?", k=5)

 --- Result 1 --- 
🔹 Product: Credit card
🔹 Issue: Problem with a purchase shown on your statement
🔹 Chunk :this is terriable and this is not right and i am wanting to do a formal complaint with these credit card practices. can you please assist best, xxxx xxxx xxxx

 --- Result 2 --- 
🔹 Product: Credit card or prepaid card
🔹 Issue: Problem with a purchase shown on your statement
🔹 Chunk :dissatisfied with purchases, billing errors, late fees errors, phones calls, no valid responses from letters sent, accessing credit files and submitting errors to credit report agencies.

 --- Result 3 --- 
🔹 Product: Credit card or prepaid card
🔹 Issue: Getting a credit card
🔹 Chunk :in short, my complaints are : 1 ) receiving an unsolicited credit card 2 ) no direct means of contacting the issuer ( citibank )

 --- Result 4 --- 
🔹 Product: Credit card or prepaid card
🔹 Issue: Fees or interest
🔹 Chunk :i got upset on this and tried to write review on the internet a

In [8]:
indexer.preview_results(
    query="What are the common complaints about money transfers?", k=5
)

 --- Result 1 --- 
🔹 Product: Money transfer, virtual currency, or money service
🔹 Issue: Unauthorized transactions or other transaction problem

 --- Result 2 --- 
🔹 Product: Checking or savings account
🔹 Issue: Closing an account
🔹 Chunk :this has caused me serious damage and i have tried many times to stop the money from being transferred.

 --- Result 3 --- 
🔹 Product: Money transfer, virtual currency, or money service
🔹 Issue: Fraud or scam
🔹 Chunk :i have transferred thousands of dollars- however, you never intended to fully understand the nature of the transaction. let me assure you that the prudent bank employee would have been able to identify unusual patterns in my transactions meaning that in case xxxx had been fulfilled by the bank, the red flags would have been immediately raised. nevertheless, my genuine willingness is to resolve this dispute and continue our cooperation, as i have always been satisfied with your services.

 --- Result 4 --- 
🔹