# Advanced Retrieval with LangChain

In the following notebook, we'll explore various methods of advanced retrieval using LangChain!

We'll touch on:

- Naive Retrieval
- Best-Matching 25 (BM25)
- Multi-Query Retrieval
- Parent-Document Retrieval
- Contextual Compression (a.k.a. Rerank)
- Ensemble Retrieval
- Semantic chunking

We'll also discuss how these methods impact performance on our set of documents with a simple RAG chain.

There will be two breakout rooms:

- 🤝 Breakout Room Part #1
  - Task 1: Getting Dependencies!
  - Task 2: Data Collection and Preparation
  - Task 3: Setting Up Qdrant!
  - Task 4-10: Retrieval Strategies
- 🤝 Breakout Room Part #2
  - Activity: Evaluate with Ragas

# 🤝 Breakout Room Part #1

## Task 1: Getting Dependencies!

We're going to need a few specific LangChain community packages, like OpenAI (for our [LLM](https://platform.openai.com/docs/models) and [Embedding Model](https://platform.openai.com/docs/guides/embeddings)) and Cohere (for our [Reranker](https://cohere.com/rerank)).

We'll also provide our OpenAI key, as well as our Cohere API key.

In [1]:
import os
import getpass

os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API Key:")

In [2]:
os.environ["COHERE_API_KEY"] = getpass.getpass("Cohere API Key:")

## Task 2: Data Collection and Preparation

We'll be using our Loan Data once again - this time the structured data available through the CSV!

### Data Preparation

We want to make sure all our documents have the relevant metadata for the various retrieval strategies we're going to be applying today.

In [3]:
from langchain_community.document_loaders.csv_loader import CSVLoader
from datetime import datetime, timedelta

loader = CSVLoader(
    file_path="./data/complaints.csv",
    metadata_columns=[
      "Date received", 
      "Product", 
      "Sub-product", 
      "Issue", 
      "Sub-issue", 
      "Consumer complaint narrative", 
      "Company public response", 
      "Company", 
      "State", 
      "ZIP code", 
      "Tags", 
      "Consumer consent provided?", 
      "Submitted via", 
      "Date sent to company", 
      "Company response to consumer", 
      "Timely response?", 
      "Consumer disputed?", 
      "Complaint ID"
    ]
)

loan_complaint_data = loader.load()

for doc in loan_complaint_data:
    doc.page_content = doc.metadata["Consumer complaint narrative"]

Let's look at an example document to see if everything worked as expected!

In [4]:
loan_complaint_data[0]

Document(metadata={'source': './data/complaints.csv', 'row': 0, 'Date received': '03/27/25', 'Product': 'Student loan', 'Sub-product': 'Federal student loan servicing', 'Issue': 'Dealing with your lender or servicer', 'Sub-issue': 'Trouble with how payments are being handled', 'Consumer complaint narrative': "The federal student loan COVID-19 forbearance program ended in XX/XX/XXXX. However, payments were not re-amortized on my federal student loans currently serviced by Nelnet until very recently. The new payment amount that is effective starting with the XX/XX/XXXX payment will nearly double my payment from {$180.00} per month to {$360.00} per month. I'm fortunate that my current financial position allows me to be able to handle the increased payment amount, but I am sure there are likely many borrowers who are not in the same position. The re-amortization should have occurred once the forbearance ended to reduce the impact to borrowers.", 'Company public response': 'None', 'Company'

## Task 3: Setting Up Qdrant!

Now that we have our documents, let's create a Qdrant VectorStore with the collection name "LoanComplaints".

We'll leverage OpenAI's [`text-embedding-3-small`](https://openai.com/blog/new-embedding-models-and-api-updates) because it's a very powerful (and low-cost) embedding model.

> NOTE: We'll be creating additional vectorstores where necessary, but this pattern is still extremely useful.

In [5]:
from langchain_community.vectorstores import Qdrant
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

vectorstore = Qdrant.from_documents(
    loan_complaint_data,
    embeddings,
    location=":memory:",
    collection_name="LoanComplaints"
)

## Task 4: Naive RAG Chain

Since we're focusing on the "R" in RAG today - we'll create our Retriever first.

### R - Retrieval

This naive retriever will simply treat each complaint as a document, and use cosine-similarity to fetch the 10 most relevant documents.

> NOTE: We're choosing `10` as our `k` here to provide enough documents for our reranking process later

In [6]:
naive_retriever = vectorstore.as_retriever(search_kwargs={"k" : 10})
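
To make the ranking step concrete, here is a minimal pure-Python sketch of cosine-similarity top-k retrieval. The 3-dimensional vectors and document labels are made up for illustration; the real retriever compares `text-embedding-3-small` embeddings inside the vector store rather than hand-written lists.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k):
    """Return the ids of the k documents most similar to the query, best first."""
    ranked = sorted(
        doc_vecs,
        key=lambda doc_id: cosine_similarity(query_vec, doc_vecs[doc_id]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical toy embeddings (not produced by a real model).
toy_doc_vecs = {
    "payment_misapplied": [0.9, 0.1, 0.0],
    "long_hold_times": [0.1, 0.9, 0.1],
    "credit_report_error": [0.2, 0.1, 0.9],
}
toy_query_vec = [0.8, 0.3, 0.1]

nearest = top_k(toy_query_vec, toy_doc_vecs, k=2)
print(nearest)
```

Raising `k` trades precision for recall: more documents come back, but the lower-ranked ones are less similar to the query.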

### A - Augmented

We're going to go with a standard prompt for our simple RAG chain today! Nothing fancy here, we want this to mostly be about the Retrieval process.

In [7]:
from langchain_core.prompts import ChatPromptTemplate

RAG_TEMPLATE = """\
You are a helpful and kind assistant. Use the context provided below to answer the question.

If you do not know the answer, or are unsure, say you don't know.

Query:
{question}

Context:
{context}
"""

rag_prompt = ChatPromptTemplate.from_template(RAG_TEMPLATE)

### G - Generation

We're going to leverage `gpt-4.1-nano` as our LLM today, as - again - we want this to largely be about the Retrieval process.

In [8]:
from langchain_openai import ChatOpenAI

chat_model = ChatOpenAI(model="gpt-4.1-nano")

### LCEL RAG Chain

We're going to use LCEL to construct our chain.

> NOTE: This chain will be exactly the same across the various examples with the exception of our Retriever!

In [9]:
from langchain_core.runnables import RunnablePassthrough
from operator import itemgetter
from langchain_core.output_parsers import StrOutputParser

naive_retrieval_chain = (
    # INVOKE CHAIN WITH: {"question" : "<<SOME USER QUESTION>>"}
    # "question" : populated by getting the value of the "question" key
    # "context"  : populated by getting the value of the "question" key and piping it into the naive_retriever
    {"context": itemgetter("question") | naive_retriever, "question": itemgetter("question")}
    # "context"  : re-assigned to itself via RunnablePassthrough.assign, which passes every other key
    #              (here, "question") through unchanged, so this step leaves the dict as-is
    | RunnablePassthrough.assign(context=itemgetter("context"))
    # "response" : the "context" and "question" values are used to format our prompt object and then piped
    #              into the LLM and stored in a key called "response"
    # "context"  : populated by getting the value of the "context" key from the previous step
    | {"response": rag_prompt | chat_model, "context": itemgetter("context")}
)
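
The shape of the data moving through the three steps can be mimicked with plain Python functions. This is only a sketch of the dictionaries being passed along: `fake_retriever`, `fake_prompt_and_model`, and the strings they return are hypothetical stand-ins, not LangChain objects.

```python
def fake_retriever(question):
    # Stand-in for naive_retriever: returns a list of "documents".
    return [f"doc about: {question}"]

def fake_prompt_and_model(inputs):
    # Stand-in for `rag_prompt | chat_model`: reads both prompt variables.
    return f"answer to {inputs['question']!r} using {len(inputs['context'])} document(s)"

def run_chain(inputs):
    # Step 1: build {"context", "question"} from the incoming {"question"} dict.
    step1 = {
        "context": fake_retriever(inputs["question"]),
        "question": inputs["question"],
    }
    # Step 2: mirror of RunnablePassthrough.assign(context=itemgetter("context")):
    # keep every key and re-assign "context" to itself.
    step2 = {**step1, "context": step1["context"]}
    # Step 3: produce the final {"response", "context"} dict.
    return {
        "response": fake_prompt_and_model(step2),
        "context": step2["context"],
    }

result = run_chain({"question": "What is the most common issue with loans?"})
print(sorted(result))
```

Returning `context` next to `response` keeps the retrieved documents available for inspection after the call.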

Let's see how this simple chain does on a few different prompts.

> NOTE: You might think that we've cherry picked prompts that showcase the individual skill of each of the retrieval strategies - you'd be correct!

In [10]:
naive_retrieval_chain.invoke({"question" : "What is the most common issue with loans?"})["response"].content

'The most common issues with loans, based on the complaints in the provided context, seem to involve mismanagement and misinformation. Specifically, these include errors in loan balances, misapplied payments, wrongful denials of payment plans, incorrect reporting of account status, and issues related to loan transfer or sale without proper notification. Additionally, problems with how payments are applied—such as only to interest rather than principal—and difficulties accessing accurate information are prevalent. Therefore, a key common issue is **mismanagement and incorrect handling of loan information**, which can lead to financial hardships and credit reporting problems.'

In [11]:
naive_retrieval_chain.invoke({"question" : "Did any complaints not get handled in a timely manner?"})["response"].content

'Based on the provided complaints, yes, some complaints indicate that issues were not handled in a timely manner. Specifically, at least two complaints show responses marked as "No" for timely response:\n\n1. Complaint from 03/28/25 (Complaint ID: 12709087) submitted to MOHELA; it was marked as "Timely response?": No.\n2. Complaint from 04/24/25 (Complaint ID: 13160766) submitted to Maximus Federal Services, Inc.; it was marked as "Timely response?": Yes, but the complaint details suggest ongoing unresolved issues.\n\nAdditionally, multiple complaints mention extended durations without resolution, such as:\n\n- Over 1 year of unresolved requests and no response.\n- Nearly 18 months with no resolution.\n- Multiple instances where complainants report that their issues remain unaddressed despite repeated follow-ups.\n\nTherefore, yes, some complaints were not handled in a timely manner.'

In [12]:
naive_retrieval_chain.invoke({"question" : "Why did people fail to pay back their loans?"})["response"].content

"People failed to pay back their loans for several reasons, including:\n\n1. **Accumulation of interest during forbearance or deferment:** Many borrowers could not afford increased payments once interest continued to accrue during forbearance or deferment periods, leading to larger balances that became harder to repay.\n\n2. **Lack of clear communication and notification:** Some borrowers were not properly notified about loan transfer, repayment start dates, or changes in payment status, resulting in unawareness of when to pay or if payments were required.\n\n3. **Economic hardships and stagnant wages:** Many borrowers faced financial hardships, low wages, unemployment, or were in industries with irregular income, making it difficult to keep up with payments.\n\n4. **Inability to qualify for income-driven repayment or loan forgiveness:** Borrowers who did not qualify for programs like Public Service Loan Forgiveness or TLF found themselves burdened with unmanageable debt.\n\n5. **Misma

Overall, this is not bad! Let's see if we can make it better!

## Task 5: Best-Matching 25 (BM25) Retriever

Taking a step back in time - [BM25](https://www.nowpublishers.com/article/Details/INR-019) is based on [Bag-Of-Words](https://en.wikipedia.org/wiki/Bag-of-words_model) which is a sparse representation of text.

In essence, it's a way to compare how similar two pieces of text are based on the words they both contain.
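
As a rough illustration of that word-matching idea, the sketch below scores documents with one common BM25 variant. The whitespace tokenizer, the `k1` and `b` values, and the toy documents are assumptions made for this example; the library's retriever may tokenize and weight terms differently.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with a common BM25 variant."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue
            # Rare terms get a larger inverse-document-frequency weight.
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1)
            # Term frequency saturates (k1) and is normalized by document length (b).
            tf = term_freq[term]
            norm = tf + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores

# Hypothetical toy corpus.
toy_docs = [
    "payment was late because autopay failed",
    "nelnet misapplied my payment to interest",
    "the servicer never responded to my letter",
]
scores = bm25_scores("nelnet payment", toy_docs)
ranking = sorted(range(len(toy_docs)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

The second document ranks first because it contains both query terms, and the rare term `nelnet` carries more weight than the more common `payment`. A document that shares no words with the query scores zero, no matter how close it is in meaning.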

This retriever is very straightforward to set up! Let's see it happen down below!


In [13]:
from langchain_community.retrievers import BM25Retriever

bm25_retriever = BM25Retriever.from_documents(loan_complaint_data)

We'll construct the same chain - only changing the retriever.

In [14]:
bm25_retrieval_chain = (
    {"context": itemgetter("question") | bm25_retriever, "question": itemgetter("question")}
    | RunnablePassthrough.assign(context=itemgetter("context"))
    | {"response": rag_prompt | chat_model, "context": itemgetter("context")}
)

Let's look at the responses!

In [15]:
bm25_retrieval_chain.invoke({"question" : "What is the most common issue with loans?"})["response"].content

"Based on the provided information, the most common issue with loans appears to be problems related to dealing with lenders or servicers, including issues such as incorrect or bad information about loans, difficulty in applying payments correctly, disputes over fees or interest calculations, and lack of transparent communication. Specifically, many complaints involve the servicers' handling of payments, interest, and loan information, which are recurring themes across multiple complaints."

In [16]:
bm25_retrieval_chain.invoke({"question" : "Did any complaints not get handled in a timely manner?"})["response"].content

'Based on the provided information, all the complaints included in the context were responded to in a timely manner, with the response status noted as "Yes" under "Timely response?" for each complaint. Therefore, there is no indication that any complaints did not get handled in a timely manner.'

In [17]:
bm25_retrieval_chain.invoke({"question" : "Why did people fail to pay back their loans?"})["response"].content

'People may fail to pay back their loans for various reasons, including issues with payment plans, miscommunication between borrowers and lenders, or complications related to loan servicing. For example, some complainants faced problems with their payment plans or forbearance options, which led to missed payments or increased balances. Others experienced procedural errors, such as being unenrolled from autopay without their knowledge, resulting in missed payments and negative impacts on credit scores. Additionally, delays or failures in communication from the loan servicers—like not receiving notices about payment obligations, forbearance status, or loan transfers—can cause borrowers to fall behind on their payments. Overall, such issues highlight the importance of clear communication, proper management of payment plans, and transparent processes to ensure borrowers can meet their repayment obligations.'

It's not clear whether this is better or worse. If only we had a way to test this! (SPOILERS: We do; the second half of the notebook will cover this.)

#### ❓ Question #1:

Give an example query where BM25 is better than embeddings and justify your answer.

## Task 6: Contextual Compression (Using Reranking)

Contextual Compression is a fairly straightforward idea: We want to "compress" our retrieved context into just the most useful bits.

There are a few ways we can achieve this - but we're going to look at a specific example called reranking.

The basic idea here is this:

- We retrieve lots of documents that are very likely related to our query vector
- We "compress" those documents into a smaller set of *more* related documents using a reranking algorithm.

We'll be leveraging Cohere's Rerank model for our reranker today!

All we need to do is the following:

- Create a basic retriever
- Create a compressor (reranker, in this case)

That's it!

Let's see it in the code below!

In [18]:
from langchain.retrievers.contextual_compression import ContextualCompressionRetriever
from langchain_cohere import CohereRerank

compressor = CohereRerank(model="rerank-v3.5")
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor, base_retriever=naive_retriever
)
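
The retrieve-then-rerank pattern itself is small enough to sketch in plain Python. Here `overlap_score` is a hypothetical stand-in for the reranking model (a simple shared-word count, nothing like a learned reranker), and the candidate list is made up.

```python
def overlap_score(query, doc):
    # Hypothetical stand-in for a reranking model: counts shared words.
    return len(set(query.lower().split()) & set(doc.lower().split()))

def rerank(query, candidates, score_fn, top_n):
    """Re-score first-stage candidates and keep only the top_n best."""
    ranked = sorted(candidates, key=lambda doc: score_fn(query, doc), reverse=True)
    return ranked[:top_n]

# Pretend these came back from the first-stage retriever, in its original order.
first_stage = [
    "my servicer changed without notice",
    "autopay was cancelled and my payment was late",
    "interest kept growing during forbearance",
    "my payment was applied to interest",
]
compressed = rerank("why was my payment late", first_stage, overlap_score, top_n=2)
print(compressed)
```

The first stage is tuned for recall (a generous `k`), while the second stage spends more effort per document to decide which few are worth sending to the LLM.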

Let's create our chain again, and see how this does!

In [19]:
contextual_compression_retrieval_chain = (
    {"context": itemgetter("question") | compression_retriever, "question": itemgetter("question")}
    | RunnablePassthrough.assign(context=itemgetter("context"))
    | {"response": rag_prompt | chat_model, "context": itemgetter("context")}
)

In [20]:
contextual_compression_retrieval_chain.invoke({"question" : "What is the most common issue with loans?"})["response"].content

'Based on the provided data, a common issue with loans, particularly student loans, is dealing with the lender or servicer, which often involves errors such as incorrect information, misapplied payments, wrongful denials of payment plans, or mishandling of the loan with inaccuracies and lack of communication. Issues like receiving bad information, discrepancies in loan balances, unauthorized transfers of loans, and privacy violations are also frequent complaints.'

In [21]:
contextual_compression_retrieval_chain.invoke({"question" : "Did any complaints not get handled in a timely manner?"})["response"].content

'Based on the provided information, at least one complaint was handled in a timely manner, as indicated by the responses labeled "Timely response? Yes" for the complaints on 04/14/25 from both Maximus Federal Services, Inc. and EdFinancial Services. However, the first complaint from Maximus has been open for nearly 18 months with no resolution, indicating that some issues have not been resolved promptly. Therefore, while some complaints received timely responses, others have not been addressed in a timely manner.'

In [22]:
contextual_compression_retrieval_chain.invoke({"question" : "Why did people fail to pay back their loans?"})["response"].content

'People failed to pay back their loans for several reasons, including:\n\n1. Lack of Awareness or Information: Some borrowers were not properly informed by their financial aid officers about the necessity to repay the loans, leading to confusion and unawareness of repayment obligations.\n\n2. Administrative and Communication Issues: Borrowers experienced problems such as not being notified when payments were due, being transferred between loan servicers without their knowledge or consent, and facing difficulties in accessing or updating their account information.\n\n3. Accumulation of Interest and Unmanageable Payments: Many borrowers faced ongoing interest accumulation, especially when loans were placed into forbearance or deferment, causing their outstanding balances to grow over time, even when they made payments.\n\n4. Limited or Unsuitable Options for Repayment: The only options offered, such as forbearance or deferment, often led to increased interest and extended repayment perio

We'll need to rely on something like Ragas to help us get a better sense of how this is performing overall - but it "feels" better!

## Task 7: Multi-Query Retriever

Typically in RAG we have a single query - the one provided by the user.

What if we had... more than one query?

In essence, a Multi-Query Retriever works by:

1. Taking the original user query and creating `n` number of new user queries using an LLM.
2. Retrieving documents for each query.
3. Using all unique retrieved documents as context

So, how hard is it to set up? Not bad! Let's see it down below!



In [23]:
from langchain.retrievers.multi_query import MultiQueryRetriever

multi_query_retriever = MultiQueryRetriever.from_llm(
    retriever=naive_retriever, llm=chat_model
)
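
The three steps listed above can be sketched without any LLM or vector store. In this example `toy_generate_queries` and `toy_index` are hypothetical stand-ins: a real setup would ask the LLM for the rewrites and query the vector store for each one.

```python
def multi_query_retrieve(question, generate_queries, retrieve):
    """Retrieve for every reformulated query and return the unique documents."""
    seen = set()
    unique_docs = []
    for query in generate_queries(question):
        for doc in retrieve(query):
            if doc not in seen:  # keep the first occurrence, drop duplicates
                seen.add(doc)
                unique_docs.append(doc)
    return unique_docs

def toy_generate_queries(question):
    # Hypothetical rewrites; an LLM would produce these in practice.
    return [
        "reasons borrowers missed payments",
        "causes of loan default",
        "why loans go unpaid",
    ]

# Hypothetical retrieval results per rewritten query.
toy_index = {
    "reasons borrowers missed payments": ["doc_a", "doc_b"],
    "causes of loan default": ["doc_b", "doc_c"],
    "why loans go unpaid": ["doc_c", "doc_d", "doc_a"],
}

docs = multi_query_retrieve(
    "Why did people fail to pay back their loans?",
    toy_generate_queries,
    lambda query: toy_index.get(query, []),
)
print(docs)
```

Each rewrite surfaces documents the others missed, so the union covers more of the corpus than any single query does. The cost is one extra LLM call plus one retrieval per rewrite.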

In [24]:
multi_query_retrieval_chain = (
    {"context": itemgetter("question") | multi_query_retriever, "question": itemgetter("question")}
    | RunnablePassthrough.assign(context=itemgetter("context"))
    | {"response": rag_prompt | chat_model, "context": itemgetter("context")}
)

In [25]:
multi_query_retrieval_chain.invoke({"question" : "What is the most common issue with loans?"})["response"].content

'Based on the information provided, the most common issues with loans appear to be:\n\n- Trouble with how payments are being handled, including restrictions on applying extra funds to principal, predatory repayment practices, or misapplication of payments.\n- Inaccuracies or difficulties with loan balances and interest calculations, often compounded by transfers between loan servicers.\n- Poor communication and lack of transparency from servicers regarding loan terms, interest rates, fees, or account status.\n- Problems related to loan discharge, forgiveness, or discharge programs, including mismanagement or misinformation about eligibility.\n- Unauthorized sharing of personal information and potential privacy violations.\n- Errors and discrepancies in credit reporting related to student loans.\n- Difficulties in navigating income-driven repayment and public service loan forgiveness programs.\n\nOverall, the most prevalent issue seems to be problems with how payments are handled and th

In [26]:
multi_query_retrieval_chain.invoke({"question" : "Did any complaints not get handled in a timely manner?"})["response"].content

'Based on the provided complaints data, yes, some complaints were not handled in a timely manner. For example:\n\n- Complaint ID 12709087 (Row 441) received on 03/28/25 had a response marked as "No" for timely response.\n- Complaint ID 12744910 (Row 400) received on 03/31/25 was handled with a "Yes" for timely response.\n- Complaint ID 12650717 (Row 575) received on 03/25/25 was marked "No" for timely response.\n- Complaint ID 12668396 (Row 760) received on 04/02/25 was handled timely ("Yes").\n- Complaint ID 13062402 (Row 66) received on 04/18/25 was marked "Yes".\n- Complaint IDs 13091395, 13515083, and others generally show responses as "Yes" or "Closed with explanation."\n\nTherefore, at least some complaints did not get handled in a timely manner, notably the complaint received on 03/28/25, which was marked as "No" for response timeliness.'

In [27]:
multi_query_retrieval_chain.invoke({"question" : "Why did people fail to pay back their loans?"})["response"].content

"People failed to pay back their loans for various reasons, including:\n\n- Errors and misconduct by loan servicers, such as misapplied payments, wrongful denials of payment plans, or errors in loan balances.\n- Lack of proper information or guidance about available repayment options, like income-driven repayment plans or forgiveness programs, leading to default or unmanageable debt.\n- Systemic failures in loan reporting and handling, causing incorrect account status updates and negative impacts on credit scores.\n- Being steered into long-term forbearances or incorrect repayment methods that increased interest and debt over time.\n- Problems with loan transfer, incorrect attribution of loans, or failure to properly notify borrowers of status changes.\n- Financial hardships, employment issues, or medical problems, compounded by systemic mismanagement and inadequate communication from servicers.\n\nOverall, a combination of systemic issues, mismanagement by servicers, and borrowers' la

#### ❓ Question #2:

Explain how generating multiple reformulations of a user query can improve recall.

## Task 8: Parent Document Retriever

The Parent Document Retriever follows a "small-to-big" strategy:

1. Each un-split "document" will be designated as a "parent document" (You could use larger chunks of document as well, but our data format allows us to consider the overall document as the parent chunk)
2. Store those "parent documents" in a memory store (not a VectorStore)
3. We will chunk each of those documents into smaller documents, and associate them with their respective parents, and store those in a VectorStore. We'll call those "child chunks".
4. When we query our Retriever, we will do a similarity search comparing our query vector to the "child chunks".
5. Instead of returning the "child chunks", we'll return their associated "parent chunks".

Okay, maybe that was a few steps - but the basic idea is this:

- Search for small documents
- Return big documents

The intuition is that we're likely to find the most relevant information by limiting the amount of semantic information that is encoded in each embedding vector - but we're likely to miss relevant surrounding context if we only use that information.
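
Before wiring up the real components, here is a minimal sketch of the small-to-big idea using only dictionaries and lists. The sentence-based splitter, the shared-word `score` function (standing in for embedding similarity), and the toy complaints are all assumptions made for illustration.

```python
def build_stores(parents):
    """Return (docstore, child_index): full parents by id, and child chunks tagged with their parent id."""
    docstore = dict(parents)
    child_index = [
        (chunk, parent_id)
        for parent_id, text in parents.items()
        for chunk in text.split(". ")  # naive splitter: one child chunk per sentence
    ]
    return docstore, child_index

def retrieve_parents(query, docstore, child_index, k):
    """Search the small chunks, then return their unique full parent documents."""
    def score(chunk):
        # Hypothetical stand-in for embedding similarity: shared-word count.
        return len(set(query.lower().split()) & set(chunk.lower().split()))

    best_children = sorted(child_index, key=lambda pair: score(pair[0]), reverse=True)[:k]
    parent_ids = []
    for _, parent_id in best_children:
        if parent_id not in parent_ids:  # several children may share a parent
            parent_ids.append(parent_id)
    return [docstore[parent_id] for parent_id in parent_ids]

# Hypothetical toy parent documents.
toy_parents = {
    "p1": "My autopay was cancelled. The servicer reported me late. My credit score dropped",
    "p2": "I waited on hold for hours. Nobody answered my letter",
}
docstore, child_index = build_stores(toy_parents)
results = retrieve_parents("autopay cancelled", docstore, child_index, k=1)
print(results)
```

The query matches only the first sentence of `p1`, yet the whole complaint is returned, including the credit score detail that the matching chunk never mentioned.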

Let's start by creating our "parent documents" and defining a `RecursiveCharacterTextSplitter`.

In [28]:
from langchain.retrievers import ParentDocumentRetriever
from langchain.storage import InMemoryStore
from langchain_text_splitters import RecursiveCharacterTextSplitter
from qdrant_client import QdrantClient, models

parent_docs = loan_complaint_data
child_splitter = RecursiveCharacterTextSplitter(chunk_size=750)

We'll need to set up a new Qdrant vectorstore - and we'll use another useful pattern to do so!

> NOTE: We are manually defining our embedding dimension, you'll need to change this if you're using a different embedding model.

In [29]:
from langchain_qdrant import QdrantVectorStore

client = QdrantClient(location=":memory:")

client.create_collection(
    collection_name="full_documents",
    vectors_config=models.VectorParams(size=1536, distance=models.Distance.COSINE)
)

parent_document_vectorstore = QdrantVectorStore(
    collection_name="full_documents", embedding=OpenAIEmbeddings(model="text-embedding-3-small"), client=client
)

Now we can create our `InMemoryStore` that will hold our "parent documents" - and build our retriever!

In [30]:
store = InMemoryStore()

parent_document_retriever = ParentDocumentRetriever(
    vectorstore = parent_document_vectorstore,
    docstore=store,
    child_splitter=child_splitter,
)

By default, this is empty as we haven't added any documents - let's add some now!

In [31]:
parent_document_retriever.add_documents(parent_docs, ids=None)

We'll create the same chain we did before - but substitute our new `parent_document_retriever`.

In [32]:
parent_document_retrieval_chain = (
    {"context": itemgetter("question") | parent_document_retriever, "question": itemgetter("question")}
    | RunnablePassthrough.assign(context=itemgetter("context"))
    | {"response": rag_prompt | chat_model, "context": itemgetter("context")}
)

Let's give it a whirl!

In [33]:
parent_document_retrieval_chain.invoke({"question" : "What is the most common issue with loans?"})["response"].content

'The most common issue with loans, based on the provided complaints, appears to be related to problems with federal student loan servicing. Specific sub-issues include errors in loan balances and misapplied payments, wrongful denials of payment plans, discrepancies in loan balances and interest rates, and issues with credit reporting and verification of debt. Many complaints also involve misconduct by loan servicers, systemic breakdowns, and lack of clear communication or transparency.\n\nIn summary, a prevalent issue is mismanagement and errors within loan servicing processes, which lead to inaccurate reporting, unfair fee charges, and difficulty in managing or verifying loan details.'

In [34]:
parent_document_retrieval_chain.invoke({"question" : "Did any complaints not get handled in a timely manner?"})["response"].content

'Based on the provided information, all the complaints listed were marked as not handled in a timely manner. Specifically, two complaints regarding federal student loan servicing by MOHELA indicate delays: one that states "no one has reached out to me" and another mentioning unacceptable wait times of four hours or more, as well as a 7-hour wait to speak with someone. Both are marked as "Timely response?": "No." \n\nTherefore, yes, there were complaints that did not get handled in a timely manner.'

In [35]:
parent_document_retrieval_chain.invoke({"question" : "Why did people fail to pay back their loans?"})["response"].content

"People often fail to pay back their loans due to various reasons, including financial hardship, lack of proper information or communication from loan servicers, and issues related to the stability or legitimacy of the loan or institution. In the provided context, some specific reasons include:\n\n- **Financial Hardship:** Borrowers have experienced severe financial difficulties after graduation, making it difficult to make loan payments. For example, one individual was unable to secure employment and struggled with health issues related to debt burdens.\n\n- **Lack of Proper Communication and Support:** Some borrowers reported not being notified about payment due dates or changes in their loan status, leading to missed payments. For instance, there were cases where borrowers were not informed about their loan repayment obligations or the buyout of their loan servicers.\n\n- **Problems with Loan Servicing:** Issues such as the failure of loan servicers to provide evidence of ownership 

Overall, the performance *seems* largely the same. We can leverage a tool like Ragas to more effectively answer the question about the performance.

## Task 9: Ensemble Retriever

In brief, an Ensemble Retriever simply takes two or more retrievers and combines their retrieved documents based on a rank-fusion algorithm.

In this case - we're using the [Reciprocal Rank Fusion](https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf) algorithm.

Setting it up is as easy as providing a list of our desired retrievers - and the weights for each retriever.

In [36]:
from langchain.retrievers import EnsembleRetriever

retriever_list = [bm25_retriever, naive_retriever, parent_document_retriever, compression_retriever, multi_query_retriever]
equal_weighting = [1/len(retriever_list)] * len(retriever_list)

ensemble_retriever = EnsembleRetriever(
    retrievers=retriever_list, weights=equal_weighting
)
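
For intuition, weighted Reciprocal Rank Fusion fits in a few lines. This sketch assumes each retriever contributes `weight / (k + rank)` per document, with `k = 60` as suggested in the linked paper; the document ids and rankings are made up, and the library's implementation may differ in details such as deduplication.

```python
def reciprocal_rank_fusion(ranked_lists, weights, k=60):
    """Fuse ranked lists: each list adds weight / (k + rank) to a document's score."""
    fused_scores = {}
    for ranking, weight in zip(ranked_lists, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            fused_scores[doc_id] = fused_scores.get(doc_id, 0.0) + weight / (k + rank)
    # Highest fused score first.
    return sorted(fused_scores, key=fused_scores.get, reverse=True)

# Hypothetical rankings from two retrievers (best result first).
keyword_ranking = ["d1", "d2", "d3"]
vector_ranking = ["d3", "d1", "d4"]

fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking], weights=[0.5, 0.5])
print(fused)
```

Only ranks are used, never raw scores, so retrievers with incomparable scoring scales (BM25 scores versus cosine similarities) can be combined directly. Documents that several retrievers rank highly rise to the top.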

We'll pack *all* of these retrievers together in an ensemble.

In [37]:
ensemble_retrieval_chain = (
    {"context": itemgetter("question") | ensemble_retriever, "question": itemgetter("question")}
    | RunnablePassthrough.assign(context=itemgetter("context"))
    | {"response": rag_prompt | chat_model, "context": itemgetter("context")}
)

Let's look at our results!

In [38]:
ensemble_retrieval_chain.invoke({"question" : "What is the most common issue with loans?"})["response"].content

'The most common issue with loans, based on the provided complaints, appears to be dealing with your lender or servicer, specifically problems like receiving bad information about your loan, improper handling of payments, unauthorized transfer of loans, inaccurate reporting to credit bureaus, and lack of proper documentation such as signed Master Promissory Notes. Many complaints also involve issues like errors in loan balances, misapplied payments, difficulty obtaining loan information, challenges with loan transfer or reassignment without proper notices, and problems with payment plans or forgiveness applications.\n\nIn summary, a predominant issue is **poor or inaccurate servicing and communication regarding loan information, transfer, or management**, which can lead to errors, negative credit impacts, and a lack of transparency and proper documentation.\n\nIf you need a concise answer:  \n**The most common issue with loans is poor management and miscommunication by lenders or servi

In [39]:
ensemble_retrieval_chain.invoke({"question" : "Did any complaints not get handled in a timely manner?"})["response"].content

'Based on the provided complaints, yes, some complaints indicate they were not handled in a timely manner. For example, the complaint from 04/11/25 involving Mohela was marked as "No" for timely response, suggesting it was not addressed promptly. Similarly, the complaint from 03/25/25 also notes that the issue was not resolved within a reasonable timeframe. \n\nTherefore, the answer is: Yes, some complaints did not get handled in a timely manner.'

In [40]:
ensemble_retrieval_chain.invoke({"question" : "Why did people fail to pay back their loans?"})["response"].content

'People failed to pay back their loans for several reasons, including:\n\n1. Lack of Clear Communication or Notices: Many borrowers reported not receiving timely, accurate, or sufficient notices about loan repayment status, due dates, or changes when loans were transferred or modified. This led to unawareness of when payments were due, resulting in missed payments and reported delinquencies.\n\n2. Difficulties with Payment Handling and Applying Funds: Some borrowers experienced issues with their loan servicers where extra payments were misapplied—most often to interest rather than principal—and payments were reversed or not properly processed, making repayment challenging.\n\n3. Multiple Transfers and Confusing Account Information: Frequent transfers between loan servicers (e.g., from Great Lakes to Nelnet to Aidvantage) often resulted in inconsistent account balances, missing or incorrect information, and inability to access or verify loan details, adding to repayment difficulties.\n\

## Task 10: Semantic Chunking

While this is not a retrieval method - it *is* an effective way of increasing retrieval performance on corpora that have clean semantic breaks in them.

Essentially, Semantic Chunking is implemented by:

1. Embedding all sentences in the corpus.
2. Combining or splitting sequences of sentences based on their semantic similarity, using one of several [possible thresholding methods](https://python.langchain.com/docs/how_to/semantic-chunker/):
  - `percentile`
  - `standard_deviation`
  - `interquartile`
  - `gradient`
3. Each sequence of related sentences is kept as a document!

Let's see how to implement this!

We'll use the `percentile` thresholding method for this example, which will:

Calculate the distances between neighbouring sentences, and then split the sequence of sentences wherever a distance exceeds a given percentile of all distances.
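
Before reaching for the library, it may help to see the idea on toy data. The sketch below is a simplified, assumption-laden version of percentile-based chunking: the function names, the hand-written 2-D "embeddings", and the default of the 95th percentile are all made up for illustration, and a real run would use an embedding model instead.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def percentile(values, pct):
    """Percentile with linear interpolation between the two nearest ranks."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100.0
    lower, upper = math.floor(rank), math.ceil(rank)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def semantic_chunks(sentences, vectors, pct=95.0):
    """Group sentences, starting a new chunk where the neighbour distance is unusually large."""
    if not sentences:
        return []
    # Distance between each sentence and the one that follows it.
    distances = [
        cosine_distance(vectors[i], vectors[i + 1]) for i in range(len(vectors) - 1)
    ]
    if not distances:
        return [sentences[0]]
    threshold = percentile(distances, pct)
    chunks, current = [], [sentences[0]]
    for sentence, distance in zip(sentences[1:], distances):
        if distance > threshold:
            # Large semantic jump: close the current chunk.
            chunks.append(" ".join(current))
            current = []
        current.append(sentence)
    chunks.append(" ".join(current))
    return chunks


# Toy sentences with hypothetical 2-D embeddings (two clear topics).
sentences = [
    "My payment was misapplied.",
    "The servicer applied it to interest.",
    "My credit report is wrong.",
    "The bureau lists a late payment.",
]
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]

for chunk in semantic_chunks(sentences, vectors):
    print(chunk)
```

With these toy vectors the only distance above the 95th-percentile threshold is the jump from the payment sentences to the credit-report sentences, so the four sentences end up in two chunks of two.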

In [41]:
from langchain_experimental.text_splitter import SemanticChunker

semantic_chunker = SemanticChunker(
    embeddings,
    breakpoint_threshold_type="percentile"
)

Now we can split our documents.

In [42]:
semantic_documents = semantic_chunker.split_documents(loan_complaint_data[:20])

Let's create a new vector store.

In [43]:
semantic_vectorstore = Qdrant.from_documents(
    semantic_documents,
    embeddings,
    location=":memory:",
    collection_name="Loan_Complaint_Data_Semantic_Chunks"
)

We'll use naive retrieval for this example.

In [44]:
semantic_retriever = semantic_vectorstore.as_retriever(search_kwargs={"k" : 10})

Finally we can create our classic chain!

In [45]:
semantic_retrieval_chain = (
    {"context": itemgetter("question") | semantic_retriever, "question": itemgetter("question")}
    | RunnablePassthrough.assign(context=itemgetter("context"))
    | {"response": rag_prompt | chat_model, "context": itemgetter("context")}
)

And view the results!

In [46]:
semantic_retrieval_chain.invoke({"question" : "What is the most common issue with loans?"})["response"].content

'The most common issue with loans, based on the provided complaints, appears to be problems related to borrower communication, account management, and accuracy of loan information. Many complaints involve difficulties in understanding or obtaining information about loan balances, payment plans, or servicing details, as well as errors in reporting, unauthorized account activity, or delays in processing applications. These issues often cause stress, financial uncertainty, and damage to credit reports.\n\nIn summary, the most common issues include:\n- Lack of clear or accurate information about loan status, balances, or payment terms\n- Problems with loan servicing and account management\n- Errors or discrepancies in credit reporting\n- Delays or failures in processing requests or payments\n- Difficulties in communication and customer service responses\n\nIf you need specific details, I can help further!'

In [47]:
semantic_retrieval_chain.invoke({"question" : "Did any complaints not get handled in a timely manner?"})["response"].content

'Yes, based on the provided complaints, several complaints indicate that they were not handled in a timely manner. For example:\n\n- The complaint regarding the transfer of an account to Nelnet (Complaint ID: 13331376) was responded to with "Closed with explanation" and a note that responses were timely ("Yes").\n- The complaint about trouble with payment handling (Complaint ID: 13207537) also received a response marked as "Closed with explanation" and a timely response.\n- The complaint about bad information and lack of response (Complaint ID: 13347464) similarly received a "Closed with explanation" response within the expected time frame.\n- Multiple complaints involving disputes, incorrect billing, or account issues (e.g., Complaint IDs: 13425612, 12962044, 13020950, 13281034, 13179688) were responded to with "Closed with explanation" and noted as "Timely response: Yes."\n\nWhile responses were generally marked as timely in the data, some complaints describe prolonged issues or fail

In [48]:
semantic_retrieval_chain.invoke({"question" : "Why did people fail to pay back their loans?"})["response"].content

'People failed to pay back their loans for various reasons, including issues with loan servicing, miscommunication, and legal disputes. Some specific reasons include:\n\n- Receiving bad or unclear information from lenders or servicers regarding loan status or repayment terms.\n- Difficulty with the loan repayment process due to technical issues or lack of transparency.\n- Disputes over the legitimacy or legality of the debt, such as claims that the debt is legally void or improperly reported.\n- Challenges with documentation, like inconsistent or rejected paperwork needed for loan forgiveness or discharge.\n- Problems with payment processing, such as payments not being properly credited or rejected bank transactions.\n- Confusion or misinformation about repayment plans, for example, due to failed re-amortization after forbearance.\n- Legal or privacy issues involving data breaches or unauthorized access that impacted their ability to manage or confirm their loans.\n\nOverall, these iss

#### ❓ Question #3:

If sentences are short and highly repetitive (e.g., FAQs), how might semantic chunking behave, and how would you adjust the algorithm?

# 🤝 Breakout Room Part #2

#### 🏗️ Activity #1

Your task is to evaluate the various Retriever methods against each other.

You are expected to:

1. Create a "golden dataset"
 - Use Synthetic Data Generation (powered by Ragas, or otherwise) to create this dataset
2. Evaluate each retriever with *retriever specific* Ragas metrics
 - Semantic Chunking is not considered a retriever method and will not be required for marks, but you may find it useful to do a "semantic chunking on" vs. "semantic chunking off" comparison between them
3. Compile these in a list and write a small paragraph about which is best for this particular data and why.

Your analysis should factor in:
  - Cost
  - Latency
  - Performance

> NOTE: This is **NOT** required to be completed in class. Please spend time in your breakout rooms creating a plan before moving on to writing code.

##### HINTS:

- LangSmith provides detailed information about latency and cost.
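
If you want a quick local number alongside LangSmith, one possible approach is a small wall-clock timing helper like the sketch below. The names `measure_latency` and `fake_retriever` are hypothetical, and the stand-in retriever just sleeps; it measures elapsed time only, not token cost.

```python
import statistics
import time


def measure_latency(invoke, questions, repeats=3):
    """Time invoke(question) for every question and summarise the wall-clock seconds."""
    timings = []
    for question in questions:
        for _ in range(repeats):
            start = time.perf_counter()
            invoke(question)
            timings.append(time.perf_counter() - start)
    return {
        "calls": len(timings),
        "median_s": statistics.median(timings),
        "max_s": max(timings),
    }


def fake_retriever(question):
    """Hypothetical stand-in for a real chain; sleeps to simulate work."""
    time.sleep(0.01)
    return [f"document about {question}"]


report = measure_latency(fake_retriever, ["question one", "question two"], repeats=2)
print(report)
```

In the notebook, the same helper could wrap a real chain, for example `lambda q: semantic_retrieval_chain.invoke({"question": q})`, so that each retriever is timed on the same list of questions.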

In [None]:
### YOUR CODE HERE