# Advanced Retrieval with LangChain

In the following notebook, we'll explore various methods of advanced retrieval using LangChain!

We'll touch on:

- Naive Retrieval
- Best-Matching 25 (BM25)
- Multi-Query Retrieval
- Parent-Document Retrieval
- Contextual Compression (a.k.a. Rerank)
- Ensemble Retrieval
- Semantic chunking

We'll also discuss how these methods impact performance on our set of documents with a simple RAG chain.

There will be two breakout rooms:

- 🤝 Breakout Room Part #1
  - Task 1: Getting Dependencies!
  - Task 2: Data Collection and Preparation
  - Task 3: Setting Up Qdrant!
  - Task 4-10: Retrieval Strategies
- 🤝 Breakout Room Part #2
  - Activity: Evaluate with Ragas

# 🤝 Breakout Room Part #1

## Task 1: Getting Dependencies!

We're going to need a few specific LangChain community packages, like OpenAI (for our [LLM](https://platform.openai.com/docs/models) and [Embedding Model](https://platform.openai.com/docs/guides/embeddings)) and Cohere (for our [Reranker](https://cohere.com/rerank)).

We'll also provide our OpenAI key, as well as our Cohere API key.

In [1]:
import os
import getpass

os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API Key:")

In [2]:
os.environ["COHERE_API_KEY"] = getpass.getpass("Cohere API Key:")

## Task 2: Data Collection and Preparation

We'll be using our Loan Data once again - this time the structured data available through the CSV!

### Data Preparation

We want to make sure all our documents have the relevant metadata for the various retrieval strategies we're going to be applying today.

In [3]:
from langchain_community.document_loaders.csv_loader import CSVLoader
from datetime import datetime, timedelta

loader = CSVLoader(
    file_path="./data/complaints.csv",
    metadata_columns=[
      "Date received", 
      "Product", 
      "Sub-product", 
      "Issue", 
      "Sub-issue", 
      "Consumer complaint narrative", 
      "Company public response", 
      "Company", 
      "State", 
      "ZIP code", 
      "Tags", 
      "Consumer consent provided?", 
      "Submitted via", 
      "Date sent to company", 
      "Company response to consumer", 
      "Timely response?", 
      "Consumer disputed?", 
      "Complaint ID"
    ]
)

loan_complaint_data = loader.load()

for doc in loan_complaint_data:
    doc.page_content = doc.metadata["Consumer complaint narrative"]

Let's look at an example document to see if everything worked as expected!

In [4]:
loan_complaint_data[0]

Document(metadata={'source': './data/complaints.csv', 'row': 0, 'Date received': '03/27/25', 'Product': 'Student loan', 'Sub-product': 'Federal student loan servicing', 'Issue': 'Dealing with your lender or servicer', 'Sub-issue': 'Trouble with how payments are being handled', 'Consumer complaint narrative': "The federal student loan COVID-19 forbearance program ended in XX/XX/XXXX. However, payments were not re-amortized on my federal student loans currently serviced by Nelnet until very recently. The new payment amount that is effective starting with the XX/XX/XXXX payment will nearly double my payment from {$180.00} per month to {$360.00} per month. I'm fortunate that my current financial position allows me to be able to handle the increased payment amount, but I am sure there are likely many borrowers who are not in the same position. The re-amortization should have occurred once the forbearance ended to reduce the impact to borrowers.", 'Company public response': 'None', 'Company'

## Task 3: Setting up Qdrant!

Now that we have our documents, let's create a Qdrant VectorStore with the collection name "LoanComplaints".

We'll leverage OpenAI's [`text-embedding-3-small`](https://openai.com/blog/new-embedding-models-and-api-updates) because it's a very powerful (and low-cost) embedding model.

> NOTE: We'll be creating additional vectorstores where necessary, but this pattern is still extremely useful.

In [5]:
from langchain_community.vectorstores import Qdrant
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

vectorstore = Qdrant.from_documents(
    loan_complaint_data,
    embeddings,
    location=":memory:",
    collection_name="LoanComplaints"
)

## Task 4: Naive RAG Chain

Since we're focusing on the "R" in RAG today - we'll create our Retriever first.

### R - Retrieval

This naive retriever will simply treat each complaint as a document, and use cosine similarity to fetch the 10 most relevant documents.

> NOTE: We're choosing `10` as our `k` here to provide enough documents for our reranking process later.

In [6]:
naive_retriever = vectorstore.as_retriever(search_kwargs={"k" : 10})
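
To make the cosine-similarity step concrete, here is a minimal pure-Python sketch of "embed, compare, keep the top `k`". The three-dimensional vectors and the names `toy_vectors`, `toy_query`, and `top_k` are made up for illustration; the real vectorstore does this over 1536-dimensional embeddings with an optimized index.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vector, doc_vectors, k):
    """Return the ids of the k documents most similar to the query."""
    ranked = sorted(
        doc_vectors,
        key=lambda doc_id: cosine_similarity(query_vector, doc_vectors[doc_id]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical 3-dimensional "embeddings" (assumption: toy values, not real model output).
toy_vectors = {
    "complaint_a": [0.9, 0.1, 0.0],
    "complaint_b": [0.1, 0.9, 0.0],
    "complaint_c": [0.7, 0.3, 0.1],
}
toy_query = [1.0, 0.0, 0.0]

nearest = top_k(toy_query, toy_vectors, k=2)
print(nearest)
```

The documents pointing in roughly the same direction as the query come back first, regardless of vector length.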

### A - Augmented

We're going to go with a standard prompt for our simple RAG chain today! Nothing fancy here, we want this to mostly be about the Retrieval process.

In [7]:
from langchain_core.prompts import ChatPromptTemplate

RAG_TEMPLATE = """\
You are a helpful and kind assistant. Use the context provided below to answer the question.

If you do not know the answer, or are unsure, say you don't know.

Query:
{question}

Context:
{context}
"""

rag_prompt = ChatPromptTemplate.from_template(RAG_TEMPLATE)

### G - Generation

We're going to leverage `gpt-4.1-nano` as our LLM today, as - again - we want this to largely be about the Retrieval process.

In [8]:
from langchain_openai import ChatOpenAI

chat_model = ChatOpenAI(model="gpt-4.1-nano")

### LCEL RAG Chain

We're going to use LCEL to construct our chain.

> NOTE: This chain will be exactly the same across the various examples with the exception of our Retriever!

In [9]:
from langchain_core.runnables import RunnablePassthrough
from operator import itemgetter
from langchain_core.output_parsers import StrOutputParser

naive_retrieval_chain = (
    # INVOKE CHAIN WITH: {"question" : "<<SOME USER QUESTION>>"}
    # "question" : populated by getting the value of the "question" key
    # "context"  : populated by getting the value of the "question" key and chaining it into the naive_retriever
    {"context": itemgetter("question") | naive_retriever, "question": itemgetter("question")}
    # "context"  : is assigned to a RunnablePassthrough object (will not be called or considered in the next step)
    #              by getting the value of the "context" key from the previous step
    | RunnablePassthrough.assign(context=itemgetter("context"))
    # "response" : the "context" and "question" values are used to format our prompt object and then piped
    #              into the LLM and stored in a key called "response"
    # "context"  : populated by getting the value of the "context" key from the previous step
    | {"response": rag_prompt | chat_model, "context": itemgetter("context")}
)
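
If the LCEL syntax feels opaque, the data flow can be sketched with plain functions and dictionaries. Everything here (`toy_retriever`, `toy_llm`, `toy_chain`) is a hypothetical stand-in, not LangChain API, and the sketch omits the middle `RunnablePassthrough.assign` step because it appears to re-assign `context` to itself.

```python
def toy_retriever(question):
    """Hypothetical retriever: returns canned documents for any question."""
    return ["Complaint about misapplied payments.", "Complaint about a loan transfer."]

def toy_llm(prompt):
    """Hypothetical LLM: reports the prompt length instead of answering."""
    return f"(answer based on a prompt of {len(prompt)} characters)"

def toy_chain(inputs):
    # Step 1: build "context" from the retriever and carry "question" through.
    state = {
        "context": toy_retriever(inputs["question"]),
        "question": inputs["question"],
    }
    # Step 2: format the prompt, call the model, and keep the context alongside.
    prompt = f"Query:\n{state['question']}\n\nContext:\n{state['context']}"
    return {"response": toy_llm(prompt), "context": state["context"]}

toy_output = toy_chain({"question": "What is the most common issue with loans?"})
print(toy_output["response"])
```

Returning the context next to the response is what lets us inspect (and later evaluate) exactly which documents the answer was based on.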

Let's see how this simple chain does on a few different prompts.

> NOTE: You might think that we've cherry picked prompts that showcase the individual skill of each of the retrieval strategies - you'd be correct!

In [10]:
naive_retrieval_chain.invoke({"question" : "What is the most common issue with loans?"})["response"].content

'The most common issue with loans, based on the context provided, appears to be problems related to mismanagement and misinformation. Many complaints involve errors in loan balances, incorrect reporting on credit reports, difficulty in applying payments correctly, and issues arising from loan transfers and servicing changes without proper notification. Additionally, borrowers frequently face issues with inadequate communication, failure to provide clear documentation, and unfair or confusing practices that can result in increased balances or negative impacts on credit scores.'

In [11]:
naive_retrieval_chain.invoke({"question" : "Did any complaints not get handled in a timely manner?"})["response"].content

'Based on the provided data, at least one complaint was not handled in a timely manner. Specifically, the complaint with Complaint ID \'12709087\' submitted to MOHELA on 03/28/25 was marked as "Not timely response?" = "No," indicating it was not handled promptly. Additionally, the complaint with Complaint ID \'12973003\' submitted to EdFinancial Services on 04/14/25 was marked as "Yes" for timely response, showing some were handled on time.\n\nIn summary, yes, some complaints did not get handled in a timely manner, notably the complaint to MOHELA.'

In [12]:
naive_retrieval_chain.invoke({"question" : "Why did people fail to pay back their loans?"})["response"].content

"People failed to pay back their loans primarily due to a combination of factors highlighted in the complaints:\n\n1. **Lack of Clear Communication from Servicers:** Many borrowers were not adequately informed about when their loan repayments would resume, changes in loan servicers, or how payments would be applied. For example, some were unaware of the exact date repayment started or were not notified of transfer to different servicers, leading to missed or late payments.\n\n2. **Problems with Payment Handling and Application:** Borrowers reported that their payments were misapplied—most often directed toward interest rather than principal—and that they could not easily make extra payments to reduce the loan faster. This inefficient application process prolongs debt repayment and increases interest accumulation.\n\n3. **Difficulty in Accessing or Re-evaluating Payment Options:** Several complaints mention inability to modify payment plans, request adjustments based on income, or have 

Overall, this is not bad! Let's see if we can make it better!

## Task 5: Best-Matching 25 (BM25) Retriever

Taking a step back in time - [BM25](https://www.nowpublishers.com/article/Details/INR-019) is based on [Bag-of-Words](https://en.wikipedia.org/wiki/Bag-of-words_model), which is a sparse representation of text.

In essence, it's a way to compare how similar two pieces of text are based on the words they both contain.

This retriever is very straightforward to set up! Let's see it happen down below!


In [13]:
from langchain_community.retrievers import BM25Retriever

bm25_retriever = BM25Retriever.from_documents(loan_complaint_data)
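
For intuition about what happens under the hood, here is a minimal sketch of one common Okapi BM25 variant, written from scratch. The whitespace tokenizer, the `k1` and `b` defaults, and the IDF smoothing are assumptions chosen for illustration; the library the retriever relies on may differ in these details.

```python
import math
from collections import Counter

def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Score each document against the query with a common Okapi BM25 variant."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue
            # Rare terms get a larger weight than terms found in most documents.
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1)
            tf = term_freq[term]
            # Longer-than-average documents are penalized slightly.
            length_norm = 1 - b + b * len(tokens) / avg_len
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores.append(score)
    return scores

# Hypothetical mini-corpus (assumption: not taken from the complaints data).
toy_complaints = [
    "my servicer misapplied my payment",
    "the fafsa form was rejected twice",
    "interest kept growing during forbearance",
]
toy_scores = bm25_scores("fafsa rejected", toy_complaints)
print(toy_scores)
```

Only documents that literally contain a query term receive a non-zero score, which is exactly the property that makes BM25 strong on exact keywords and weak on paraphrases.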

We'll construct the same chain - only changing the retriever.

In [14]:
bm25_retrieval_chain = (
    {"context": itemgetter("question") | bm25_retriever, "question": itemgetter("question")}
    | RunnablePassthrough.assign(context=itemgetter("context"))
    | {"response": rag_prompt | chat_model, "context": itemgetter("context")}
)

Let's look at the responses!

In [15]:
bm25_retrieval_chain.invoke({"question" : "What is the most common issue with loans?"})["response"].content

'Based on the provided context, the most common issue with loans appears to be problems related to dealing with lenders or servicers, including issues such as disputes over fees, difficulty applying payments correctly, receiving inaccurate or bad information about the loan, and challenges in resolving account or loan term discrepancies. These issues are frequently mentioned across multiple complaints, suggesting that borrower frustration often stems from inadequate communication, inaccurate information, or poor service from loan providers or servicers.'

In [16]:
bm25_retrieval_chain.invoke({"question" : "Did any complaints not get handled in a timely manner?"})["response"].content

'Based on the provided context, all the complaints listed received timely responses from the companies. Specifically, the complaints from April 26, 2025; April 1, 2025; April 24, 2025; and May 8, 2025 all indicate that the responses were "Yes" in terms of timely response. Therefore, there is no evidence from this data to suggest that any complaints did not get handled in a timely manner.'

In [17]:
bm25_retrieval_chain.invoke({"question" : "Why did people fail to pay back their loans?"})["response"].content

'People failed to pay back their loans for several reasons, including:\n\n1. Problems with payment plans: Some borrowers experienced issues with their repayment plans, such as being steered into wrong types of forbearances or not receiving proper communication about their payment status.\n2. Administrative errors and miscommunication: Borrowers reported not being contacted when their loans were transferred or when payments were due, leading to unintentional missed payments.\n3. Automatic payment issues: There were cases where automatic payments were discontinued without proper notification, resulting in missed payments and negative credit impacts.\n4. Poor customer service and handling by loan servicers: Many borrowers encountered unhelpful or unresponsive service, which prevented resolving issues promptly.\n5. Lack of transparency: Borrowers were often unaware of changes to their loans or payment status, leading to confusion and missed payments.\n6. Disputes and complications related 

It's not clear whether this is better or worse. If only we had a way to test this! (SPOILERS: We do; the second half of the notebook will cover this.)

#### ❓ Question #1:

Give an example query where BM25 is better than embeddings and justify your answer.

##### ✅ Answer:

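BM25 tends to do better on exact keyword searches, where the query hinges on a rare, specific token that must appear verbatim.

Example query: "Does FAFSA provide student loans?"

BM25 matches an acronym like FAFSA literally, whereas an embedding model may return documents that are only topically similar. The same applies to searches for company names, stock tickers, and similar identifiers.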

## Task 6: Contextual Compression (Using Reranking)

Contextual Compression is a fairly straightforward idea: We want to "compress" our retrieved context into just the most useful bits.

There are a few ways we can achieve this - but we're going to look at a specific example called reranking.

The basic idea here is this:

- We retrieve lots of documents that are very likely related to our query vector
- We "compress" those documents into a smaller set of *more* related documents using a reranking algorithm.

We'll be leveraging Cohere's Rerank model for our reranker today!

All we need to do is the following:

- Create a basic retriever
- Create a compressor (reranker, in this case)

That's it!

Let's see it in the code below!

In [18]:
from langchain.retrievers.contextual_compression import ContextualCompressionRetriever
from langchain_cohere import CohereRerank

compressor = CohereRerank(model="rerank-v3.5")
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor, base_retriever=naive_retriever
)
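
The retrieve-then-rerank pattern can be sketched without any external service. Here `overlap_score` is a deliberately crude, hypothetical stand-in for a reranking model (it just counts shared words); a real reranker scores each query-document pair with a trained model, but the surrounding control flow has the same shape.

```python
def overlap_score(query, text):
    """Hypothetical stand-in for a reranker: fraction of query words found in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words)

def rerank(query, candidates, top_n):
    """Re-score every candidate against the query and keep the top_n best."""
    ranked = sorted(candidates, key=lambda text: overlap_score(query, text), reverse=True)
    return ranked[:top_n]

# Pretend these came back from a first-stage retriever with a generous k.
toy_candidates = [
    "loan transferred without notice",
    "late fee charged after payment was made on time",
    "payment was not applied to principal",
    "customer service wait time was long",
]
compressed = rerank("payment applied late", toy_candidates, top_n=2)
print(compressed)
```

The first stage casts a wide net cheaply; the second stage spends more effort per document but only on the short candidate list.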

Let's create our chain again, and see how this does!

In [19]:
contextual_compression_retrieval_chain = (
    {"context": itemgetter("question") | compression_retriever, "question": itemgetter("question")}
    | RunnablePassthrough.assign(context=itemgetter("context"))
    | {"response": rag_prompt | chat_model, "context": itemgetter("context")}
)

In [20]:
contextual_compression_retrieval_chain.invoke({"question" : "What is the most common issue with loans?"})["response"].content

'Based on the provided context, the most common issue with loans appears to be problems related to dealing with lenders or servicers, such as errors in loan balances, misapplied payments, wrongful denials of payment plans, and mishandling of loan information. Many complaints involve incorrect or mismatched balances, lack of clear communication or documentation, and issues with loan transfers or handling, especially in the context of federal student loans.'

In [21]:
contextual_compression_retrieval_chain.invoke({"question" : "Did any complaints not get handled in a timely manner?"})["response"].content

'Based on the provided context, at least one complaint was resolved in a timely manner. Specifically, the complaint about "Dealing with your lender or servicer" submitted to EdFinancial Services was responded to promptly, with the response indicated as "Closed with explanation" and marked as "Yes" under "Timely response." \n\nHowever, the first complaint related to delays and unresolved issues for over a year, indicates that some complaints did not get handled in a timely manner, as the issues remained unresolved for nearly 18 months, despite ongoing requests for resolution.\n\nOverall, while some complaints were handled promptly, others were not addressed in a timely manner.'

In [22]:
contextual_compression_retrieval_chain.invoke({"question" : "Why did people fail to pay back their loans?"})["response"].content

'People failed to pay back their loans primarily due to a lack of clear information and communication from loan servicers about the repayment process, interest accumulation, and their rights. Many borrowers were unaware that they had to repay their loans, or they were not properly notified when changes occurred in loan ownership or payment requirements. Additionally, options such as forbearance or deferment led to interest continuing to grow, making it more difficult to pay off the loans over time. Some borrowers experienced difficulties with incorrect or inconsistent account information, which further hindered their ability to make timely payments. Overall, inadequate information, poor communication, and the compounding effect of interest contributed to the failure to repay loans.'

We'll need to rely on something like Ragas to help us get a better sense of how this is performing overall - but it "feels" better!

## Task 7: Multi-Query Retriever

Typically in RAG we have a single query - the one provided by the user.

What if we had... more than one query?

In essence, a Multi-Query Retriever works by:

1. Taking the original user query and creating `n` number of new user queries using an LLM.
2. Retrieving documents for each query.
3. Using all unique retrieved documents as context

So, how hard is it to set up? Not bad! Let's see it down below!



In [23]:
from langchain.retrievers.multi_query import MultiQueryRetriever

multi_query_retriever = MultiQueryRetriever.from_llm(
    retriever=naive_retriever, llm=chat_model
)
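
The "use all unique retrieved documents" step is worth seeing in isolation. In this sketch the query rewrites and per-query results are hard-coded, hypothetical values standing in for the LLM and the retriever; only the merging logic is the point.

```python
def unique_union(result_lists):
    """Merge several result lists, dropping duplicates and keeping first-seen order."""
    seen = set()
    merged = []
    for results in result_lists:
        for doc in results:
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged

# Hypothetical rewrites an LLM might produce for one user question.
toy_queries = ["why loans unpaid", "reasons for missed payments", "causes of default"]

# Hypothetical retriever results per query (document ids only).
toy_results = {
    "why loans unpaid": ["doc_1", "doc_2"],
    "reasons for missed payments": ["doc_2", "doc_3"],
    "causes of default": ["doc_4", "doc_1"],
}

merged_context = unique_union(toy_results[q] for q in toy_queries)
print(merged_context)
```

No single query found all four documents, but the union does, which is the recall gain this strategy is after.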

In [24]:
multi_query_retrieval_chain = (
    {"context": itemgetter("question") | multi_query_retriever, "question": itemgetter("question")}
    | RunnablePassthrough.assign(context=itemgetter("context"))
    | {"response": rag_prompt | chat_model, "context": itemgetter("context")}
)

In [25]:
multi_query_retrieval_chain.invoke({"question" : "What is the most common issue with loans?"})["response"].content

'Based on the provided complaints data, the most common issues with student loans are related to:\n\n- Problems with how payments are being handled, such as misapplied payments, inability to apply extra funds to the principal, or restrictions that prolong repayment.\n- Dealing with lenders or servicers, including receiving bad information, mismanagement, or errors in loan balances and interest calculations.\n- Issues with loan discharges, forgiveness, or discharge applications not being processed properly.\n- Errors and discrepancies in loan balances, interest calculations, or reporting to credit bureaus.\n- Problems with how loans are transferred or reported, including unauthorized transfers, incorrect account statuses, and inaccurate credit reporting.\n\nOverall, the most prevalent issue appears to be mismanagement or errors by loan servicers regarding payment processing, interest, and account information, leading to confusion, credit damage, and difficulty managing repayment.'

In [26]:
multi_query_retrieval_chain.invoke({"question" : "Did any complaints not get handled in a timely manner?"})["response"].content

'Based on the provided complaints, it appears that several complaints indicate issues with handling in a timely manner. Some complaints explicitly mention delays exceeding 30 days or responses not received within regulatory timeframes, such as delays of over a year or multiple months without resolution. For example:\n\n- Complaints that state it has been over 1 year or nearly 18 months since initial requests, with no response or resolution.\n- Complaints about delays of several months or more with no contact or action taken.\n- Specific complaints mention failures to respond within the required timeframes, and some complaints highlight that the company did not respond or investigate for over 30 days.\n\nTherefore, it can be concluded that yes, some complaints did not get handled in a timely manner according to the standards expected for consumer complaints.'

In [27]:
multi_query_retrieval_chain.invoke({"question" : "Why did people fail to pay back their loans?"})["response"].content

"People failed to pay back their loans primarily due to multiple interconnected reasons highlighted in the complaints:\n\n1. **Lack of Clear or Accurate Information:** Borrowers were often misinformed or lacked proper information about repayment options, including income-driven plans, the impact of interest accrual, and available forgiveness programs. Many complaints mention being steered into forbearance or consolidation without understanding the long-term consequences, such as interest capitalization and loss of forgiveness benefits.\n\n2. **Unmanageable Interest and Debt Growth:** Several borrowers experienced their balances ballooning due to accrued interest, especially when interest was not explained properly or when payments were deferred. This made it difficult or impossible to pay off the original loan amount.\n\n3. **Servicer Mismanagement and Bad Practices:** Complaints include errors in loan balances, misapplied payments, wrongful denials of payment plans, and systemic issue

#### ❓ Question #2:

Explain how generating multiple reformulations of a user query can improve recall.

##### ✅ Answer:
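Reformulations help most with ambiguous or underspecified queries. Each rewrite phrases the question differently, so documents that would not match the original wording can still be retrieved, and the union of all results covers more of the relevant documents.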

## Task 8: Parent Document Retriever

A "small-to-big" strategy - the Parent Document Retriever works based on a simple strategy:

1. Each un-split "document" will be designated as a "parent document" (You could use larger chunks of document as well, but our data format allows us to consider the overall document as the parent chunk)
2. Store those "parent documents" in a memory store (not a VectorStore)
3. We will chunk each of those documents into smaller documents, and associate them with their respective parents, and store those in a VectorStore. We'll call those "child chunks".
4. When we query our Retriever, we will do a similarity search comparing our query vector to the "child chunks".
5. Instead of returning the "child chunks", we'll return their associated "parent chunks".

Okay, maybe that was a few steps - but the basic idea is this:

- Search for small documents
- Return big documents

The intuition is that we're likely to find the most relevant information by limiting the amount of semantic information that is encoded in each embedding vector - but we're likely to miss relevant surrounding context if we only use that information.
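
The "search small, return big" idea can be sketched in a few lines of plain Python. The sentence splitter and the substring match below are hypothetical stand-ins for the real text splitter and vector similarity search, and `toy_parents` is made-up data.

```python
def split_into_sentences(text):
    """Naive sentence splitter, a stand-in for a real text splitter."""
    return [part.strip() for part in text.split(".") if part.strip()]

# Parent store: full documents keyed by id (plays the role of the in-memory docstore).
toy_parents = {
    "p1": "Payments were misapplied. The servicer never explained why.",
    "p2": "My loan was transferred. Autopay stopped without notice.",
}

# Child index: each small chunk remembers which parent it came from.
child_index = [
    (chunk, parent_id)
    for parent_id, text in toy_parents.items()
    for chunk in split_into_sentences(text)
]

def retrieve_parents(query_word, index, parents):
    """Match on child chunks, but return the de-duplicated parent documents."""
    matched_ids = []
    for chunk, parent_id in index:
        # Substring match is an assumption standing in for similarity search.
        if query_word.lower() in chunk.lower() and parent_id not in matched_ids:
            matched_ids.append(parent_id)
    return [parents[parent_id] for parent_id in matched_ids]

toy_hits = retrieve_parents("autopay", child_index, toy_parents)
print(toy_hits)
```

The match happens on a single sentence, yet the caller receives the whole complaint, including the sentence about the transfer that gives the match its context.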

Let's start by creating our "parent documents" and defining a `RecursiveCharacterTextSplitter`.

In [28]:
from langchain.retrievers import ParentDocumentRetriever
from langchain.storage import InMemoryStore
from langchain_text_splitters import RecursiveCharacterTextSplitter
from qdrant_client import QdrantClient, models

parent_docs = loan_complaint_data
child_splitter = RecursiveCharacterTextSplitter(chunk_size=750)

We'll need to set up a new Qdrant vectorstore - and we'll use another useful pattern to do so!

> NOTE: We are manually defining our embedding dimension, you'll need to change this if you're using a different embedding model.

In [29]:
from langchain_qdrant import QdrantVectorStore

client = QdrantClient(location=":memory:")

client.create_collection(
    collection_name="full_documents",
    vectors_config=models.VectorParams(size=1536, distance=models.Distance.COSINE)
)

parent_document_vectorstore = QdrantVectorStore(
    collection_name="full_documents", embedding=OpenAIEmbeddings(model="text-embedding-3-small"), client=client
)

Now we can create our `InMemoryStore` that will hold our "parent documents" - and build our retriever!

In [30]:
store = InMemoryStore()

parent_document_retriever = ParentDocumentRetriever(
    vectorstore = parent_document_vectorstore,
    docstore=store,
    child_splitter=child_splitter,
)

By default, this is empty as we haven't added any documents - let's add some now!

In [31]:
parent_document_retriever.add_documents(parent_docs, ids=None)

We'll create the same chain we did before - but substitute our new `parent_document_retriever`.

In [32]:
parent_document_retrieval_chain = (
    {"context": itemgetter("question") | parent_document_retriever, "question": itemgetter("question")}
    | RunnablePassthrough.assign(context=itemgetter("context"))
    | {"response": rag_prompt | chat_model, "context": itemgetter("context")}
)

Let's give it a whirl!

In [33]:
parent_document_retrieval_chain.invoke({"question" : "What is the most common issue with loans?"})["response"].content

'The most common issue with loans, based on the provided complaints, appears to be related to errors and misconduct by loan servicers. Specific problems include inaccuracies in loan balances, misapplied payments, wrongful denials of payment plans, issues with loan reporting such as incorrect information on credit reports, and unexpected or unjustified changes to interest rates. Additionally, complaints highlight systemic issues like lack of verification of debt legitimacy and illegal credit reporting due to the dissolution of certain government agencies.\n\nIn summary, errors, misconduct, and systemic breakdowns by loan servicers and reporting agencies are the most common issues with loans in this context.'

In [34]:
parent_document_retrieval_chain.invoke({"question" : "Did any complaints not get handled in a timely manner?"})["response"].content

'Based on the provided context, all the complaints listed explicitly indicate that they were not handled in a timely manner. Specifically, the complaints with Complaint IDs 12709087 and 12935889 both state "Timely response?": "No," and the narratives mention delays exceeding the expected timeframes, with some responses taking longer than the specified periods and in one case, waiting times of four hours or more and a seven-hour wait. \n\nTherefore, yes, some complaints did not get handled in a timely manner.'

In [35]:
parent_document_retrieval_chain.invoke({"question" : "Why did people fail to pay back their loans?"})["response"].content

'People often fail to pay back their loans due to various challenges, such as being unaware of payment obligations, facing severe financial hardship, or experiencing mismanagement by loan servicers. For example, some individuals reported that their loan payments were unexpectedly resumed while they were still in school or did not receive proper communication about when to start payments. Others faced difficulties because they relied on deferment or forbearance due to unemployment or health issues, which increased the debt over time. Additionally, misinformation about the value and management of their educational programs or institutional failures contributed to their inability to repay. All of these factors can lead to missed payments and difficulty in fulfilling loan repayment obligations.'

Overall, the performance *seems* largely the same. We can leverage a tool like Ragas to more effectively answer the question about the performance.

## Task 9: Ensemble Retriever

In brief, an Ensemble Retriever simply takes two or more retrievers and combines their retrieved documents based on a rank-fusion algorithm.

In this case - we're using the [Reciprocal Rank Fusion](https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf) algorithm.

Setting it up is as easy as providing a list of our desired retrievers - and the weights for each retriever.

In [36]:
from langchain.retrievers import EnsembleRetriever

retriever_list = [bm25_retriever, naive_retriever, parent_document_retriever, compression_retriever, multi_query_retriever]
equal_weighting = [1/len(retriever_list)] * len(retriever_list)

ensemble_retriever = EnsembleRetriever(
    retrievers=retriever_list, weights=equal_weighting
)

We'll pack *all* of these retrievers together in an ensemble.
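
To see how rank fusion combines lists, here is a minimal sketch of weighted Reciprocal Rank Fusion. The constant `c=60` is the value suggested in the linked paper; the function name `weighted_rrf`, the toy rankings, and the exact way weights are applied are assumptions for illustration rather than a description of the library's internals.

```python
from collections import defaultdict

def weighted_rrf(ranked_lists, weights, c=60):
    """Fuse ranked lists: each document earns weight / (rank + c) per list it appears in."""
    scores = defaultdict(float)
    for ranking, weight in zip(ranked_lists, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (rank + c)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical rankings from three different retrievers.
toy_rankings = [
    ["doc_a", "doc_b", "doc_c"],
    ["doc_b", "doc_a", "doc_d"],
    ["doc_b", "doc_c", "doc_a"],
]
fused = weighted_rrf(toy_rankings, weights=[1 / 3, 1 / 3, 1 / 3])
print(fused)
```

Because only ranks are used, retrievers with incomparable score scales (BM25 scores versus cosine similarities, for example) can be combined without any normalization.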

In [37]:
ensemble_retrieval_chain = (
    {"context": itemgetter("question") | ensemble_retriever, "question": itemgetter("question")}
    | RunnablePassthrough.assign(context=itemgetter("context"))
    | {"response": rag_prompt | chat_model, "context": itemgetter("context")}
)

Let's look at our results!

In [38]:
ensemble_retrieval_chain.invoke({"question" : "What is the most common issue with loans?"})["response"].content

'The most common issue with loans, based on the provided complaints and narratives, appears to be problems related to "Dealing with your lender or servicer." This includes issues such as:\n\n- Receiving bad or conflicting information about loan balances, interest, or repayment terms\n- Errors in loan accounting or balance discrepancies\n- Poor communication, lack of transparency, or unresponsive customer service\n- Unauthorized transfers of loan management without borrower consent\n- Problems with repayment plans, forbearance, or misapplication of payments\n- Issues with incorrect reporting affecting credit scores\n- Mishandling of loan data, unauthorized disclosures, or privacy violations\n\nOverall, difficulties in managing interactions with loan servicers and inaccuracies or mishandling of loan information are most frequently cited as the most common issues.'

In [39]:
ensemble_retrieval_chain.invoke({"question" : "Did any complaints not get handled in a timely manner?"})["response"].content

"Based on the provided complaints, it appears that some complaints were not handled in a timely manner. Specifically, at least two instances indicate responses were delayed:\n\n1. Complaint with Complaint ID '12668396' (MOHELA, NJ, ZIP 074XX), received on 03/26/25, was marked as 'No' in the 'Timely response?' field.\n2. Complaint with Complaint ID '12935889' (MOHELA, CO, ZIP 80209), received on 04/11/25, was marked as 'No' in the 'Timely response?' field.\n\nAdditionally, several other complaints note very long wait times or lack of response, which suggests that not all complaints were handled promptly.\n\n**Therefore, the answer is yes — some complaints did not get handled in a timely manner.**"

In [40]:
ensemble_retrieval_chain.invoke({"question" : "Why did people fail to pay back their loans?"})["response"].content

"People failed to pay back their loans primarily due to factors such as:\n\n- Lack of clear information about repayment options, interest accrual, and forgiveness programs, which led to misunderstandings and unmanageable debt accumulation.\n- Economic hardships like unemployment, low wages, or financial instability, making monthly payments difficult.\n- Mismanagement and miscommunication by loan servicers, including improper transfer of loans, failure to notify borrowers of payment obligations, and incorrect reporting of delinquencies and balances.\n- Reliance on forbearance or deferment, which allowed interest to continue accruing, often leading to higher overall debt and extended repayment periods.\n- Unexpected personal circumstances such as homelessness, health issues, or job loss, impairing borrowers' ability to make payments.\n- Lack of awareness or access to income-driven repayment plans, rehabilitation programs, or proper guidance from servicers.\n- Errors and delays in process

## Task 10: Semantic Chunking

While this is not a retrieval method - it *is* an effective way of increasing retrieval performance on corpora that have clean semantic breaks in them.

Essentially, Semantic Chunking is implemented by:

1. Embedding all sentences in the corpus.
2. Combining or splitting sequences of sentences according to their semantic similarity, using one of several [possible thresholding methods](https://python.langchain.com/docs/how_to/semantic-chunker/):
   - `percentile`
   - `standard_deviation`
   - `interquartile`
   - `gradient`
3. Each sequence of related sentences is kept as a document!

Let's see how to implement this!

We'll use the `percentile` thresholding method for this example which will:

Calculate the distances between all consecutive sentences, and then break the sequence of sentences wherever a distance exceeds a given percentile of all distances.

In [41]:
from langchain_experimental.text_splitter import SemanticChunker

semantic_chunker = SemanticChunker(
    embeddings,
    breakpoint_threshold_type="percentile"
)

Now we can split our documents. Note that we only pass the first 20 complaints to the chunker here.

In [42]:
semantic_documents = semantic_chunker.split_documents(loan_complaint_data[:20])

Let's create a new vector store.

In [43]:
semantic_vectorstore = Qdrant.from_documents(
    semantic_documents,
    embeddings,
    location=":memory:",
    collection_name="Loan_Complaint_Data_Semantic_Chunks"
)

We'll use naive retrieval for this example.

In [44]:
semantic_retriever = semantic_vectorstore.as_retriever(search_kwargs={"k" : 10})
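As a reminder of what `k=10` means here, naive retrieval ranks every stored chunk by its similarity to the query embedding and keeps the `k` best. The sketch below assumes cosine similarity and uses hand-made vectors; `cosine_similarity` and `top_k` are hypothetical helpers, not part of Qdrant or LangChain.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vector, chunk_vectors, k=10):
    """Return the indices of the k chunks most similar to the query."""
    scored = [
        (cosine_similarity(query_vector, vector), index)
        for index, vector in enumerate(chunk_vectors)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [index for _, index in scored[:k]]


# Toy example: the query points along the first axis.
toy_query = [1.0, 0.0]
toy_chunk_vectors = [[0.0, 1.0], [1.0, 0.0], [0.7, 0.7]]
toy_hits = top_k(toy_query, toy_chunk_vectors, k=2)
print(toy_hits)
```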

Finally, we can create our classic chain!

In [45]:
semantic_retrieval_chain = (
    {"context": itemgetter("question") | semantic_retriever, "question": itemgetter("question")}
    | RunnablePassthrough.assign(context=itemgetter("context"))
    | {"response": rag_prompt | chat_model, "context": itemgetter("context")}
)

And view the results!

In [46]:
semantic_retrieval_chain.invoke({"question" : "What is the most common issue with loans?"})["response"].content

'The most common issue with loans, based on the provided complaints, appears to be problems related to loan servicing and management. This includes issues such as:\n- Struggling to repay or problems with understanding payment plans\n- Errors or discrepancies in loan account status, including mistaken defaults or delinquencies\n- Poor communication and lack of transparency from loan servicers\n- Mishandling of auto-debit enrollment and payments\n- Incorrect reporting or use of borrower reports\n- Issues arising from loan refinancing or re-amortization not occurring as expected\n- Unauthorized access or privacy violations involving borrower data\n\nOverall, many complaints highlight frustrations with loan servicers regarding mismanagement, incorrect information, delayed or confusing communications, and failures to properly process or verify borrower payments and statuses.'

In [47]:
semantic_retrieval_chain.invoke({"question" : "Did any complaints not get handled in a timely manner?"})["response"].content

'Based on the provided complaints, yes, some complaints did not get handled in a timely manner. Specifically, multiple complaints mention that despite written correspondence and acknowledgment of receipt, the companies did not respond to the complaints or questions raised, or the responses were simply "Closed with explanation." For example:\n\n- Complaint regarding Nelnet (Row 17): Despite signing for certified mail and acknowledging receipt, Nelnet never responded nor provided answers, although the response was marked as "Closed with explanation" and the response time was "Yes" (timely).\n- Similarly, complaints about AidVantage and EdFinancial Services were responded to with "Closed with explanation," indicating that the issues were not necessarily resolved satisfactorily or promptly from the complainants\' perspective.\n\nWhile many responses were marked as "timely" and "closed with explanation," the lack of substantive resolution or failure to respond to specific issues suggests th

In [48]:
semantic_retrieval_chain.invoke({"question" : "Why did people fail to pay back their loans?"})["response"].content

'Based on the provided context, people failed to pay back their loans for various reasons, including:\n\n1. **Lack of clear and accurate information:** Some individuals received bad or incomplete information about their loans, for example, incorrect forbearance status or payment obligations, which led to misunderstandings and default.\n\n2. **Delays and issues with payment processing:** Problems such as payments not clearing despite being processed, or payments being rejected due to bank or servicer errors, caused payments to be missed.\n\n3. **Administrative or procedural obstacles:** Issues like trouble logging into accounts, difficulty verifying documents, or servicers deliberately stalling or creating complications, led borrowers to give up or fall behind.\n\n4. **Disputes over loan legitimacy or reporting:** Some borrowers discovered their loans were erroneously reported as in default, or were being collected despite being legally void due to government actions, leading to confusi

#### ❓ Question #3:

If sentences are short and highly repetitive (e.g., FAQs), how might semantic chunking behave, and how would you adjust the algorithm?

##### ✅ Answer:

# 🤝 Breakout Room Part #2

#### 🏗️ Activity #1

Your task is to evaluate the various retriever methods against each other.

You are expected to:

1. Create a "golden dataset"
   - Use Synthetic Data Generation (powered by Ragas, or otherwise) to create this dataset
2. Evaluate each retriever with *retriever-specific* Ragas metrics
   - Semantic Chunking is not considered a retriever method and will not be required for marks, but you may find it useful to do a "semantic chunking on" vs. "semantic chunking off" comparison
3. Compile these in a list and write a small paragraph about which is best for this particular data and why.

Your analysis should factor in:
  - Cost
  - Latency
  - Performance

> NOTE: This is **NOT** required to be completed in class. Please spend time in your breakout rooms creating a plan before moving on to writing code.

##### HINTS:

- LangSmith provides detailed information about latency and cost.
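If you also want a rough local latency number alongside LangSmith's, one option is to time repeated calls yourself. This is only a sketch: `time_calls` is a hypothetical helper, and the stand-in function below takes the place of a real chain call such as `lambda q: semantic_retrieval_chain.invoke({"question": q})`.

```python
import statistics
import time


def time_calls(fn, inputs, repeats=3):
    """Call fn on every input `repeats` times and summarize wall-clock latency."""
    timings = []
    for item in inputs:
        for _ in range(repeats):
            start = time.perf_counter()
            fn(item)
            timings.append(time.perf_counter() - start)
    return {
        "calls": len(timings),
        "mean_s": statistics.mean(timings),
        "max_s": max(timings),
    }


# Stand-in for a retrieval chain; swap in a real chain call when evaluating.
def fake_chain(question):
    return question.upper()


latency_summary = time_calls(fake_chain, ["q1", "q2"], repeats=3)
print(latency_summary)
```

Wall-clock timings like these include network and API variance, so compare retrievers on the same questions and with several repeats.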

In [None]:
# Ragas wrappers that let LangChain models drive synthetic test-set generation
from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# LLM and embedding model used to generate the golden dataset
generator_llm = LangchainLLMWrapper(ChatOpenAI(model="gpt-4.1-nano"))
generator_embeddings = LangchainEmbeddingsWrapper(OpenAIEmbeddings())

from ragas.testset.graph import KnowledgeGraph

kg = KnowledgeGraph()
kg
# loan_complaint_data = loader.load()

# for doc in loan_complaint_data:
#     doc.page_content = doc.metadata["Consumer complaint narrative"]
