## Setup

In [1]:
# install required packages
!pip install -q openai langchain chromadb faiss-cpu pypdf tiktoken docarray langchain-openai langchain-community sentence-transformers
!pip freeze > requirements.txt


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m25.0.1[0m[39;49m -> [0m[32;49m25.1.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip3 install --upgrade pip[0m


In [2]:
import warnings
warnings.filterwarnings('ignore')

## Set the OpenAI Key

In [3]:
# import the necessary libraries
import os
import openai

folder_path = './'
os.chdir(folder_path)

# Read the text file containing the API key
with open(os.path.join(folder_path, "OpenAI_API_Key.txt"), "r") as f:
    # strip() drops the trailing newline that would otherwise end up inside the key
    openai.api_key = f.read().strip()

# Update the OpenAI API key by updating the environment variable
os.environ["OPENAI_API_KEY"] = openai.api_key
# os.environ["OPENAI_API_KEY"]

# 🧠 LLM Application Architecture Overview

This architecture outlines the key building blocks used to develop applications powered by large language models (LLMs). Each component plays a specific role in creating robust, modular, and intelligent systems.

---

## 🔌 Model I/O  
**Interface with language models and handle input/output logic**

- **Prompts**  
  Define static or dynamic templates to craft structured inputs for LLMs.

- **Language Models**  
  Interact with foundational models (e.g., OpenAI, Cohere, Anthropic) to generate outputs based on prompts.

- **Output Parsers**  
  Transform raw LLM outputs into structured formats like JSON, strings, or numbers.

---

## 🔍 Retrieval  
**Connect your application to external or domain-specific knowledge**

- **Document Loaders**  
  Load documents from various sources like PDFs, web pages, databases, or APIs.

- **Document Transformers**  
  Preprocess and split documents into smaller chunks for efficient indexing and retrieval.

- **Embedding Models**  
  Convert text chunks into vector representations using models like OpenAI, HuggingFace, etc.

- **Vector Stores**  
  Store and index embeddings to enable similarity-based search (e.g., FAISS, Chroma, Pinecone).

- **Retrievers**  
  Retrieve the most relevant chunks or documents based on a user query's vector similarity.

---

## 🔗 Chains  
**Combine multiple components into an execution flow**

- Link prompts, models, retrievers, tools, and logic into a single pipeline.
- Useful for multi-step reasoning, conditional logic, or tool-assisted tasks.

---

## 🧠 Memory  
**Maintain context across multiple runs**

- Track and persist conversation history or task state.
- Useful for chatbots and context-aware applications.

---

## 🤖 Agents  
**Empower LLMs to choose actions based on intent**

- Agents decide which tools or chains to invoke to fulfill a user's goal.
- Excellent for complex or dynamic tasks requiring multiple capabilities.

---

## 🛠 Tools  
**External functions or utilities accessible to agents or chains**

- Examples: Web search, database query, code execution, calculators, etc.

---

## 📡 Callbacks  
**Monitor and debug system behavior**

- Track intermediate steps, visualize flows, and log outputs for observability and debugging.

---

## ✅ Summary

This modular architecture enables the development of intelligent systems that can reason, retrieve, generate, and interact dynamically with users or data. Whether building a chatbot, semantic search tool, or agent-based system, these components can be composed to suit various real-world applications.


![LangChain-Components.png](./images/LangChain-Components.png)

## 🔹 1. Model I/O

The **Model I/O** module acts as the interface between your application and a large language model (LLM). It is responsible for constructing prompts, invoking the model, and processing the output.

### Components:

- **Language Models**  
  Provides a standard interface to communicate with LLMs such as OpenAI, Cohere, Anthropic, etc.

- **Prompts**  
  Templates or dynamic generators that structure the input provided to the model.

- **Output Parsers**  
  Extract and format the raw responses from the model into structured data (e.g., JSON, strings, numbers).

---

### 🔁 General Flow

1. Input data is formatted using a **Prompt**
2. The **Language Model** is queried with the prompt
3. The response is processed using an **Output Parser**

This structured approach ensures consistency, modularity, and easier debugging across model-based workflows.
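
The three steps above can be sketched in plain Python. This is only an illustration of the flow, not LangChain code: `format_prompt`, `fake_model`, and `parse_output` are hypothetical names, and the "model" is a stub that returns a canned JSON string.

```python
import json

def format_prompt(template: str, **values: str) -> str:
    """Step 1: fill a prompt template with the input values."""
    return template.format(**values)

def fake_model(prompt: str) -> str:
    """Step 2: stand-in for an LLM call (assumption: returns a JSON string)."""
    return json.dumps({"prompt_chars": len(prompt), "answer": "Bonjour"})

def parse_output(raw: str) -> dict:
    """Step 3: turn the raw model text into structured data."""
    return json.loads(raw)

prompt_text = format_prompt("Translate to {language}: {text}", language="French", text="Hello")
result = parse_output(fake_model(prompt_text))
print(result["answer"])
```

Keeping each stage as its own function is what makes it possible to swap the prompt, the model, or the parser independently.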

---

### 🖼️ Model I/O Flow Diagram

> 📷 *Diagram inspired by LangChain docs and adapted from [lakeFS blog](https://lakefs.io/blog/what-is-langchain-ml-architecture/)*




![modelio.png](./images/modelio.png)

## 🧠 Model

LangChain simplifies integration with large language models (LLMs) by providing built-in support and unified interfaces. It distinguishes between two main types of models:

- **LLMs (Text Completion Models)**  
  These models accept a plain text string as input and generate a text string as output. They are ideal for tasks like summarization, translation, or question answering in a single-shot format.

- **Chat Models**  
  These models are optimized for multi-turn conversations. Instead of receiving just a string, they accept a sequence of **chat messages**, each tagged with roles such as:
  - `"System"` – Defines behavior or context for the assistant  
  - `"Human"` – User input  
  - `"AI"` – Assistant’s response (returned by the model)

### ⚠️ Key Differences:

- **LLMs** deal with flat, text-only prompts and outputs (e.g., `"Translate this to French: Hello"`).
- **Chat Models** are structured for conversational flows and include role-based inputs for more controlled dialogue (e.g., context setup, follow-ups).

These distinctions are essential when designing applications like chatbots, assistants, or retrieval-augmented systems that require nuanced input/output handling.
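
To make the difference concrete, here is a minimal sketch of the two input shapes. `Message` and `to_completion_prompt` are hypothetical helpers written for this illustration and are not part of LangChain; the role names follow the list above.

```python
from dataclasses import dataclass

@dataclass
class Message:
    role: str      # "system", "human", or "ai"
    content: str

def to_completion_prompt(messages: list[Message]) -> str:
    """Flatten role-tagged messages into the single string a text-completion model expects."""
    return "\n".join(f"{m.role}: {m.content}" for m in messages)

# Chat-style input: a list of messages, each with a role.
conversation = [
    Message("system", "You are a concise translator."),
    Message("human", "Translate this to French: Hello"),
]

# Completion-style input: one flat string with no built-in role structure.
flat_prompt = to_completion_prompt(conversation)
print(flat_prompt)
```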

---


### 🔤 [LLMs](https://python.langchain.com/docs/modules/model_io/models/llms/)

The `LLM` class in LangChain offers a unified way to connect with large language model providers such as OpenAI, Cohere, and Hugging Face for traditional **text completion tasks**: you send a plain text prompt and receive a plain text response.

Because OpenAI has marked its traditional completion models as `legacy`, this text-in, text-out interface is now rarely the right starting point, and the focus has shifted towards **Chat Models**, which provide enhanced flexibility and conversational capabilities.

> 📌 **Note**: For the remainder of this course or project, we will use OpenAI’s **Chat Models** instead of legacy LLMs.


### 💬 [Chat Model](https://python.langchain.com/docs/modules/model_io/models/chat/)

Chat models are a specialized form of language models that offer a different interface. Instead of the traditional “text in, text out” approach, chat models work with a sequence of **chat messages**—each labeled with roles such as `System`, `Human`, or `AI`.

Though powered by the same core LLMs, chat models are optimized for **conversational interactions** and allow better context handling in multi-turn dialogue.

---

### 🧪 Using OpenAI's Chat Model

To use OpenAI’s chat model via LangChain, first install the appropriate package:

```bash
pip install -qU langchain-openai
```


In [4]:
# import required libraries
from langchain_openai import ChatOpenAI, OpenAI

# instantiate OpenAI's Chat Model
llm_chat = ChatOpenAI()

### ⚙️ Configuring `ChatOpenAI()`

The `ChatOpenAI()` constructor accepts several optional parameters that allow you to control the behavior of the model. While the full list of arguments can be found in the [official API reference](https://python.langchain.com/docs/modules/model_io/models/chat/), here are some of the most commonly used ones:

- **`model_name`**  
  Specifies the model to use. If not set, it defaults to `gpt-3.5-turbo`.

- **`max_tokens`**  
  Limits the maximum number of tokens in the output generated by the model.

- **`temperature`**  
  Controls randomness in the output. A lower value makes the output more focused and deterministic, while a higher value makes it more diverse.  
  **Default:** `temperature = 0.7`

- **`max_retries`**  
  Defines the number of retry attempts if the request fails.  
  **Default:** `max_retries = 6`

Default values can differ between `langchain-openai` releases, so confirm them against the API reference for the version you have installed.

---

> ✅ You can customize these parameters based on the use case—for example, use a low temperature for factual Q&A, or a higher value for creative writing tasks.
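
To see why a lower temperature gives more deterministic output, the sketch below applies the standard temperature-scaled softmax to a made-up set of token scores. It illustrates the general idea only; the exact sampling procedure used by the OpenAI API is not described in this notebook, and the `logits` values are invented.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores into probabilities, sharpening or flattening them by temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)                      # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]                    # hypothetical scores for three candidate tokens
low = softmax_with_temperature(logits, 0.2)   # nearly all probability on the top token
high = softmax_with_temperature(logits, 1.5)  # probability spread across candidates
print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```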




## 🔄 2. Data Connections and Retrieval

Beyond simplifying API access, LangChain also offers powerful tools for integrating and retrieving external data—critical for many real-world LLM applications.

Large language models are limited by the data they were trained on. To overcome this, developers often use **Retrieval-Augmented Generation (RAG)**, a process in which relevant data is first retrieved from an external source and then fed into the model to enhance its responses.

LangChain provides all the essential components needed to build a robust RAG pipeline, from basic implementations to advanced, production-ready solutions.

---

### 🧱 Key Components for Retrieval

LangChain breaks down the retrieval step into several modular tools that make document handling efficient and flexible:

- **Document Loaders**  
  Load files from various sources like PDFs, URLs, Notion, APIs, etc.

- **Text Splitters**  
  Divide long documents into smaller, manageable chunks optimized for retrieval and embedding.

- **Vector Stores**  
  Store and index embedded document chunks for similarity search (e.g., FAISS, Chroma, Pinecone).

- **Retrievers**  
  Pull the most relevant content from vector stores based on a user’s query or need.

---

### 🖼️ RAG Architecture Overview

The diagram below shows a typical Retrieval-Augmented Generation (RAG) pipeline:

![RAG Architecture](./images/rag-architecture.png)

> 📷 *Image source: [NVIDIA Developer Blog](https://developer.nvidia.com/blog/tips-for-building-a-rag-pipeline-with-nvidia-ai-langchain-ai-endpoints/)*

---

> ✅ By combining these tools, you can efficiently retrieve and inject external knowledge into your LLM applications—making them smarter and more context-aware.


### 📄 [Document Loaders](https://python.langchain.com/docs/modules/data_connection/document_loaders/)

Document loaders make it easy to import and preprocess content from a wide variety of sources and formats. Each loader converts the input into a standardized `Document` object that includes both:

- **Text content**
- **Associated metadata** (e.g., file name, source URL, timestamps)
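
As a rough mental model, a loaded document is just text plus a metadata dictionary. The sketch below mimics that shape with a hypothetical `SimpleDocument` dataclass and a hypothetical `load_pages` helper; the field names `page_content` and `metadata` match how the notebook accesses loaded documents, but everything else is an assumption made for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class SimpleDocument:
    page_content: str                       # the extracted text
    metadata: dict = field(default_factory=dict)  # e.g. source file and page number

def load_pages(source: str, pages: list[str]) -> list[SimpleDocument]:
    """Wrap each page of text in a document that remembers where it came from."""
    return [
        SimpleDocument(text, {"source": source, "page": index})
        for index, text in enumerate(pages)
    ]

docs = load_pages("example.pdf", ["First page text", "Second page text"])
print(docs[1].metadata)
```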

---

### 🌐 Supported Sources and Formats

LangChain supports **more than 100 document loaders**, allowing you to load data from:

- File types: PDFs, HTML, Markdown, CSV, TXT, JSON, etc.
- Sources: Cloud storage (e.g., S3, GCS), public websites, APIs, databases, and more
- Integrations: Third-party tools like **Airbyte**, **Unstructured**, and others

You can view the full list in the:

- [API Reference for Document Loaders](https://api.python.langchain.com/en/latest/api_reference.html#module-langchain.document_loaders)  
- [Official Documentation](https://python.langchain.com/docs/integrations/document_loaders)

---

### ⚠️ Note

Some loaders depend on external libraries (e.g., `PyMuPDF`, `BeautifulSoup`, `Unstructured`). Be sure to install the necessary packages when working with specific formats or integrations.

---

> ✅ Document loaders are typically the **first step** in any Retrieval-Augmented Generation (RAG) workflow, enabling you to bring custom or private data into your LLM pipeline.


### 📘 Loading PDF Files with `PyPDFLoader`

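LangChain offers several connectors for parsing PDF documents. One of the most commonly used loaders is `PyPDFLoader`, which extracts content from a PDF file and converts it into a **list of Document objects**, one per page. The notebook cell that loads the policy documents uses the directory variant, `PyPDFDirectoryLoader`, which applies the same page-level parsing to every PDF in a folder.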

Each `Document` produced includes:
- The **text content** of a specific page
- **Metadata**, including the **page number**

---

### 📦 Requirements

To use `PyPDFLoader`, make sure the `pypdf` Python package is installed:

```bash
pip install pypdf
```

In [5]:
# !pip install -qU langchain-openai
# !pip install -U langchain-community

In [6]:
# import the PyPDFDirectoryLoader class from LangChain
from langchain_community.document_loaders import PyPDFDirectoryLoader

# Read the insurance documents from directory
pdf_directory_loader = PyPDFDirectoryLoader(os.path.join(folder_path, 'DocumentsPolicy'))

documents = pdf_directory_loader.load()

In [7]:
# print the source, page number, and first 150 characters of each document
for doc in documents:
    print(f"Source: {doc.metadata['source']}")
    print(f"Page Number: {doc.metadata['page']}")
    print(f"Content: {doc.page_content[:150]}...")  # Displaying the first 150 characters

Source: DocumentsPolicy/HDFC-Life-Group-Term-Life-Policy.pdf
Page Number: 0
Content: F&U dated 15th October 2022                  UIN-101N169V02  P a g e  | 0                        
 
 
 
 
 
   HDFC Life Group Term Life 
 
OF 
 
 
«O...
Source: DocumentsPolicy/HDFC-Life-Group-Term-Life-Policy.pdf
Page Number: 1
Content: F&U dated 15th October 2022                  UIN-101N169V02  P a g e  | 1                        
 
 
PART A: Covering Letter with Policy Schedule 
  ...
Source: DocumentsPolicy/HDFC-Life-Group-Term-Life-Policy.pdf
Page Number: 2
Content: F&U dated 15th October 2022                  UIN-101N169V02  P a g e  | 2                        
 
Address    : 
Mobile/Landline Number :  
 
 
 
 
 ...
Source: DocumentsPolicy/HDFC-Life-Group-Term-Life-Policy.pdf
Page Number: 3
Content: F&U dated 15th October 2022                  UIN-101N169V02  P a g e  | 3                        
 
 
Benefits under this Policy will be payable on th...
Source: DocumentsPolicy/HDFC-Life-Group-Term

### ✂️ **Document Transformers / Text Splitters**

When working with long documents such as books, reports, or transcripts, the content is often **too large** to send to a language model in one go due to token limits. To address this, LangChain offers **text splitters** that break documents into smaller, manageable chunks—while attempting to preserve semantic coherence.

These chunks can then be passed into LLMs or used in vector search systems as part of Retrieval-Augmented Generation (RAG) pipelines.

---

### ⚙️ Why Use Text Splitters?

- LLMs have **token limits** on input size.
- Smaller chunks enable more **accurate retrieval** and **context-aware generation**.
- Splitting improves **embedding granularity** and model performance.

---

### 🧩 Common Text Splitting Strategies

LangChain provides multiple text splitters that you can choose from depending on your needs:

#### 1. **Split by Character**
- **Method**: Splits text based on a character pattern (default is `"\n\n"`).
- **Chunk size measured by**: Number of characters.
- **Use case**: Simple and fast; suitable for uniformly structured content.

#### 2. **Recursive Text Splitter** (Recommended)
- **Method**: Tries to split using a list of separators in descending order of granularity (default: `["\n\n", "\n", " ", ""]`).
- **Chunk size measured by**: Number of characters.
- **Advantage**: Preserves semantic relationships like paragraphs → sentences → words.
- **Use case**: Generic text like blogs, news articles, or emails.

#### 3. **Token Splitter**
- **Method**: Splits text based on token count, matching the tokenizer used by your LLM.
- **Chunk size measured by**: Number of tokens.
- **Advantage**: Ensures chunks remain under the model’s token limit.
- **Use case**: When fine control over token usage is critical (e.g., OpenAI API requests).

---

> 🔗 [Explore text splitting strategies in LangChain](https://python.langchain.com/docs/concepts/text_splitters/)

> 💡 **Tip**: The right splitter depends on your data type and model. Experiment with multiple approaches to find the optimal balance between chunk size, context, and semantic meaning.
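
The recursive strategy can be sketched in a few lines of plain Python. This is a simplified illustration, not LangChain's implementation: `recursive_split` is a hypothetical function that ignores chunk overlap and drops separators at chunk boundaries. It tries the coarsest separator first and only falls back to finer ones for pieces that are still too long.

```python
def recursive_split(text, chunk_size, separators=("\n\n", "\n", " ", "")):
    """Split text into pieces of at most chunk_size characters,
    trying coarse separators first and finer ones only when needed."""
    if len(text) <= chunk_size:
        return [text] if text else []
    sep, rest = separators[0], separators[1:]
    if sep == "":
        # Last resort: hard cut every chunk_size characters.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    parts = text.split(sep)
    if len(parts) == 1:
        # Separator not present; try the next, finer one.
        return recursive_split(text, chunk_size, rest)
    chunks, current = [], ""
    for part in parts:
        candidate = part if not current else current + sep + part
        if len(candidate) <= chunk_size:
            current = candidate            # keep merging small pieces
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(part) <= chunk_size:
            current = part
        else:
            # This piece alone is too long; split it with finer separators.
            chunks.extend(recursive_split(part, chunk_size, rest))
    if current:
        chunks.append(current)
    return chunks

sample = "Para one is short.\n\nPara two is a bit longer than the limit allows."
chunks = recursive_split(sample, chunk_size=30)
for chunk in chunks:
    print(len(chunk), repr(chunk))
```

The short first paragraph survives intact, while the long second paragraph is broken at word boundaries rather than mid-word.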


In [8]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Initialize the RecursiveCharacterTextSplitter (customize chunk size and overlap as needed)
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)

splits = text_splitter.split_documents(documents)
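
To build intuition for what `chunk_size` and `chunk_overlap` mean, the sketch below uses a plain sliding window over characters. `sliding_chunks` is a hypothetical helper; the real splitter breaks at separators, so its overlaps are approximate rather than exact like these.

```python
def sliding_chunks(text, chunk_size, chunk_overlap):
    """Cut text into fixed-size windows where each window repeats the
    last chunk_overlap characters of the previous one."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - chunk_overlap
    chunks, start = [], 0
    while True:
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break                          # the window already reached the end
        start += step
    return chunks

windows = sliding_chunks("abcdefghijklmnopqrstuvwxyz", chunk_size=10, chunk_overlap=4)
print(windows)
```

The repeated characters give each chunk some of its neighbor's context, which helps when a relevant sentence straddles a chunk boundary.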

In [9]:
# print a sample chunk
print(splits[2])

page_content='purchased HDFC Life Insurance Policy:  
 
 Policy Schedule   :  Summary of key features of your HDFC Life Insurance Policy 
 Premium Receipt   :  Acknowledgement of the first Premium paid by you 
 Terms & Conditions  :  Detailed terms of your Policy contract with HDFC Life                                                               
                                                    Insurance 
 Service Options  :  Wide range of Policy servicing options that you can Benefit 
from 
 
We request you to carefully go through the information given in this document. You are also advised to 
keep the Policy Bond with utmost care and safety because this document will be required at the time 
of availing Policy Benefits.  In case of employer -employee relationship and the policy has been issued 
in the benefit of employee you are advised to communicate to the employee on the details of insurance' metadata={'producer': 'Microsoft® Office Word 2007', 'creator': 'Microsoft® Off

In [10]:
print("Text Preview:")  # Preview the first six chunks with their character counts
for split in splits[:6]:
    print(split.page_content, "-", len(split.page_content), "\n")

Text Preview:
F&U dated 15th October 2022                  UIN-101N169V02  P a g e  | 0                        
 
 
 
 
 
   HDFC Life Group Term Life 
 
OF 
 
 
«OWNERNAME» 
 
 
 
 
 
  
Based on the Proposal and the declarations and 
any 
statement made or referred to therein, 
We will pay the Benefits mentioned in this Policy 
subject to the terms and conditions contained 
herein 
 
 
 
 
 
 
<< Designation of the Authorised Signatory >> - 430 

F&U dated 15th October 2022                  UIN-101N169V02  P a g e  | 1                        
 
 
PART A: Covering Letter with Policy Schedule 
                                                                                                                                                <dd-mm-yyyy> 
__________________ 
__________________ 
__________________ 
__________________ 
__________________ 
 
 
Your HDFC Life <Policy Name> with Policy No. <Policy no.> 
 
Dear Mr./Ms.___________________________, 
 
We thank you for choosing HDFC L

### 🧠 [Text Embedding Models](https://python.langchain.com/docs/modules/data_connection/text_embedding/)

Text embedding models are essential for converting raw text into **numerical vector representations** that can be used for semantic operations like **similarity search**, **clustering**, or **classification**.

LangChain offers a unified interface via the `Embeddings` class to work with a variety of embedding providers, including:

- **OpenAI**
- **Cohere**
- **Hugging Face (SentenceTransformers)**

---

### 🔧 How It Works

The `Embeddings` class in LangChain exposes two core methods:

- `embed_documents(texts: List[str])`  
  Converts a list of text strings into corresponding vector embeddings.  
  ➤ Use this for indexing a corpus of documents.

- `embed_query(text: str)`  
  Converts a single query string into a vector embedding.  
  ➤ Use this to compare the query against stored document embeddings (e.g., in vector stores).

These vectors allow downstream tasks like:

- Semantic search
- Text similarity comparison
- Sentiment analysis
- Clustering and classification

---

> 🔗 [Learn more about embedding integrations](https://python.langchain.com/docs/how_to/embed_text/)

> 💡 Tip: Choose an embedding model that best matches your language, domain, and application size.
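
The two-method interface described above can be mimicked with a toy class. `ToyEmbeddings` is a hypothetical stand-in that hashes words into a small fixed-size vector; it carries none of the semantic quality of a real embedding model and only shows the shape of the inputs and outputs.

```python
import hashlib
import math

class ToyEmbeddings:
    """Hypothetical embedder exposing embed_documents and embed_query."""

    def __init__(self, dimensions: int = 16):
        self.dimensions = dimensions

    def _embed(self, text: str) -> list[float]:
        vector = [0.0] * self.dimensions
        for word in text.lower().split():
            # Hash each word to a bucket so the same word always lands in the same slot.
            digest = hashlib.sha256(word.encode("utf-8")).digest()
            vector[digest[0] % self.dimensions] += 1.0
        norm = math.sqrt(sum(v * v for v in vector)) or 1.0
        return [v / norm for v in vector]   # unit length, so dot product equals cosine

    def embed_documents(self, texts: list[str]) -> list[list[float]]:
        return [self._embed(t) for t in texts]

    def embed_query(self, text: str) -> list[float]:
        return self._embed(text)

model = ToyEmbeddings()
vectors = model.embed_documents(["death benefit", "premium payment"])
query_vector = model.embed_query("death benefit")
print(len(vectors), len(vectors[0]))
```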


In [11]:
# Import the OpenAI Embeddings class from LangChain
from langchain_openai import OpenAIEmbeddings
embeddings_model = OpenAIEmbeddings()


In [12]:
# doing a sample embedding of a random chunk, to check size
embeddings = embeddings_model.embed_documents([splits[0].page_content])
len(embeddings), len(embeddings[0])

(1, 1536)

In [13]:
type(embeddings)

list

### 📦 Vector Stores

Vector stores are essential components in Retrieval-Augmented Generation (RAG) systems. They allow you to **store**, **index**, and **search** over embedded (vectorized) representations of your unstructured data.

---

### 🧠 How It Works

The typical workflow is:

1. **Embed the documents** using a text embedding model.
2. **Store the resulting vectors** in a vector store.
3. At query time, **embed the user query** in the same vector space.
4. Use the vector store to **retrieve the most similar document vectors** based on similarity metrics (like cosine similarity or dot product).
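
Steps 3 and 4 boil down to ranking stored vectors by similarity to the query vector. The sketch below does this by brute force with cosine similarity; the three-dimensional vectors and the `top_k` helper are invented for illustration, and real vector stores use indexes to avoid scanning every entry.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_k(query_vector, stored, k=2):
    """stored is a list of (text, vector) pairs; return the k most similar texts."""
    ranked = sorted(
        stored,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]

# Hypothetical pre-computed embeddings for three chunks.
store = [
    ("death benefit", [0.9, 0.1, 0.0]),
    ("premium payment", [0.1, 0.9, 0.0]),
    ("critical illness", [0.7, 0.0, 0.7]),
]
hits = top_k([1.0, 0.0, 0.0], store, k=2)
print(hits)
```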

---

### 🔍 What Vector Stores Do

- **Storage** of vectorized document chunks
- **Indexing** for efficient similarity search
- **Retrieval** of top-matching vectors based on a query embedding

LangChain supports a variety of vector stores including:

- FAISS
- Chroma
- Pinecone
- Weaviate
- Qdrant
- Elasticsearch (with vector plugins)

---

> 💡 Tip: Choose your vector store based on the scale of data, latency needs, and hosting preferences (local vs. cloud).


In [14]:
from langchain_community.vectorstores import Chroma
# Initialize OpenAIEmbeddings
openai_embeddings = OpenAIEmbeddings()

In [15]:
# Wrap the embeddings model in a cache so identical texts are embedded only once
from langchain.storage import InMemoryStore
from langchain.embeddings import CacheBackedEmbeddings

cache_store = InMemoryStore()
cached_embeddings = CacheBackedEmbeddings.from_bytes_store(
    openai_embeddings,
    cache_store,
    namespace="embeddings_namespace"
)
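
The idea behind cache-backed embeddings can be shown with a small wrapper. `CountingEmbedder` and `CachedEmbedder` are hypothetical classes written for this sketch; the real `CacheBackedEmbeddings` may key and serialize its entries differently. The point is only that texts already in the store are never sent to the underlying model again.

```python
class CountingEmbedder:
    """Hypothetical embedder that counts how many texts it actually embeds."""

    def __init__(self):
        self.calls = 0

    def embed_documents(self, texts):
        self.calls += len(texts)
        return [[float(len(t))] for t in texts]   # toy one-dimensional "embedding"

class CachedEmbedder:
    """Look up each text in a key-value store before calling the underlying embedder."""

    def __init__(self, underlying, store=None, namespace=""):
        self.underlying = underlying
        self.store = {} if store is None else store
        self.namespace = namespace                # prefix that keeps models from sharing keys

    def embed_documents(self, texts):
        # Collect unseen texts once each, preserving order.
        missing = list(dict.fromkeys(t for t in texts if self.namespace + t not in self.store))
        if missing:
            for text, vector in zip(missing, self.underlying.embed_documents(missing)):
                self.store[self.namespace + text] = vector
        return [self.store[self.namespace + t] for t in texts]

inner = CountingEmbedder()
cached = CachedEmbedder(inner, namespace="demo:")
first = cached.embed_documents(["a", "bb"])
second = cached.embed_documents(["a", "bb", "ccc"])   # only "ccc" reaches the inner embedder
print(inner.calls)
```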

In [16]:
# Create a persistent ChromaDB instance with OpenAI embeddings
db = Chroma.from_documents(
    documents = splits,
    embedding = cached_embeddings,
    persist_directory="./chroma_persistence"  # Set a directory for persistent storage
)

### Perform Similarity Search

In [17]:
def similarity_search(query):
    return db.similarity_search(query)

docs =  similarity_search("what is the life insurance coverage for disability?")
print(docs[0])

page_content='Page 7 of 31 
 
Part C 
1. Benefits: 
 
(1) Benefits on Death or diagnosis of contingency covered –  
 
Plan Option Events Benefit 
Life Death In the event of the death of the Scheme Member, the 
benefit payable shall be the Sum Assured.  
Extra Life Option Death In the event of the death of the Scheme Member, the 
benefit payable shall be the Sum Assured. 
Accidental Death In event of the Scheme Member’s death due to 
Accident, an additional death benefit equal to the Sum 
Assured will be payable. 
This is in addition to the death benefit mentioned 
above  
Accelerated Critical Illness 
Option 
 
Death In the event of the death of the Scheme Member, the 
benefit payable shall be the Sum Assured. 
Diagnosis of a 
Critical Illness 
In the event of Scheme Member being diagnosed with 
any of the covered Critical Illnesses during the Policy 
Term, the benefit payable shall be the Sum Assured 
and the policy will terminate.' metadata={'creator': 'PyPDF', 'title': 'HDFC Life Gr

LangChain also supports many other vector stores and databases, such as FAISS, Elasticsearch, LanceDB, Milvus, and Pinecone. Refer to the [API documentation](https://api.python.langchain.com/en/latest/api_reference.html#module-langchain.vectorstores) for the complete list.

### 🔎 **Retrievers**

Retrievers act as the bridge between your data and the language model. They allow the LLM to access relevant external information at query time—without requiring the model to store or remember all data internally.

---

### 📚 What Is a Retriever?

A **retriever** is a component that takes an **unstructured query** (typically a natural language question) and returns a set of **relevant documents**. Unlike vector stores, retrievers don't need to store data themselves—they just need to know **how to fetch it**.

> Think of a retriever as a smart filter: it surfaces the most relevant information to pass along to the LLM for answering a query.

---

### 🔁 Relationship with Vector Stores

- Vector stores can **power** a retriever (e.g., `VectorStoreRetriever`).
- However, retrievers can also be built from other mechanisms such as:
  - Keyword search
  - BM25
  - Web scraping
  - SQL queries
  - Hybrid methods
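
As an example of a retriever that needs no vector store at all, here is a minimal keyword-overlap sketch. `keyword_retriever` and the three-sentence corpus are made up for illustration; a production keyword retriever would typically use a scoring scheme such as BM25 instead of raw word overlap.

```python
def keyword_retriever(query, documents, k=2):
    """Rank documents by how many distinct query words they contain."""
    query_words = set(query.lower().split())
    scored = []
    for doc in documents:
        overlap = len(query_words & set(doc.lower().split()))
        if overlap:                         # skip documents with no shared words
            scored.append((overlap, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]

corpus = [
    "accidental death benefit is payable to the nominee",
    "premium payment terms and grace period",
    "critical illness benefit and waiting period",
]
results = keyword_retriever("critical illness benefit", corpus)
print(results)
```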

---

### ⚙️ Common Retriever: `VectorStoreRetriever`

This is the most widely used retriever and is backed by a vector store. It performs similarity search on embeddings and returns the top-k most relevant chunks.

---

### 📘 Further Reading

- 📄 [Retriever Integrations – Official Docs](https://python.langchain.com/docs/integrations/retrievers/)

---

> 💡 Tip: Use `retrievers` to decouple the document fetching logic from the storage layer. This gives you flexibility in how and where your data lives.


In [18]:
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import CrossEncoderReranker
from langchain_community.cross_encoders import HuggingFaceCrossEncoder

# Build a retriever on top of the existing vector store (db).
# Step 1: MMR search returns up to `topk` candidate chunks. `score_threshold` only
#         takes effect with search_type="similarity_score_threshold", so it is not passed here.
# Step 2: a cross-encoder reranks those candidates and keeps the best 20.

def get_retriever(topk):
    search_kwargs = {"k": topk}
    retriever = db.as_retriever(search_type="mmr", search_kwargs=search_kwargs)

    # Initialize cross-encoder model
    cross_encoder = HuggingFaceCrossEncoder(model_name="BAAI/bge-reranker-base")

    # Set up reranker
    reranker = CrossEncoderReranker(model=cross_encoder, top_n=20)
    return ContextualCompressionRetriever(base_compressor=reranker, base_retriever=retriever)


# Run the combined retriever and reranker for a single query
def get_topk_relevant_documents(query, topk):
    retriever = get_retriever(topk)
    relevant_docs = retriever.invoke(query)
    return relevant_docs
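
The `mmr` search type stands for maximal marginal relevance: it picks results that are relevant to the query but not redundant with results already chosen. The sketch below is a simplified greedy version on hand-made unit vectors; `mmr_select` is a hypothetical function, and LangChain's own implementation may differ in details such as how candidates are pre-fetched.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mmr_select(query_vec, candidates, k, lambda_mult=0.5):
    """candidates is a list of (label, unit_vector) pairs. Greedily pick k items,
    trading relevance to the query against similarity to items already picked."""
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def score(item):
            relevance = dot(query_vec, item[1])
            redundancy = max((dot(item[1], s[1]) for s in selected), default=0.0)
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return [label for label, _ in selected]

# Hypothetical candidates: two identical chunks and one different but still relevant chunk.
candidates = [
    ("A", [0.96, 0.28]),
    ("A-duplicate", [0.96, 0.28]),
    ("B", [0.8, -0.6]),
]
query = [1.0, 0.0]
diverse = mmr_select(query, candidates, k=2)                      # balances relevance and diversity
relevance_only = mmr_select(query, candidates, k=2, lambda_mult=1.0)  # ignores redundancy
print(diverse, relevance_only)
```

With the default trade-off the duplicate is skipped in favor of the different chunk; with `lambda_mult=1.0` the selection degenerates to plain similarity ranking and returns both copies.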

## 🧪 Sample Queries: Disability Coverage in Life Insurance

The following natural language questions can be used to query the insurance documents for details related to **disability coverage under life insurance**. These are designed for use with a Retrieval-Augmented Generation (RAG) pipeline or semantic search system.

### 🔍 Sample Questions

- *“What is the life insurance coverage in the event of a disability?”*
- *“Does the life insurance policy include disability benefits?”*
- *“Is there a disability clause in the life insurance plan?”*
- *“What happens to my life insurance if I become disabled?”*
- *“Does the policy waive premiums during disability?”*
- *“Are there any disability riders included in this life insurance policy?”*
- *“What type of disability coverage is provided under the life insurance plan?”*
- *“Is total permanent disability covered in the life insurance terms?”*
- *“Does the life insurance plan provide payouts for critical or long-term disability?”*
- *“Are accelerated benefits available in case of terminal illness or disability?”*

---

### 💡 Tip for Retrieval:

To improve retrieval accuracy:
- Ensure document chunks include headers like “Disability Benefits,” “Waiver of Premium,” or “Riders & Add-ons.”
- Use chunk sizes large enough to capture context but small enough to stay within token limits.

---

> These queries can help validate if a policy includes **premium waivers**, **payouts for disability**, or **add-on riders** for total or partial disability. Customize based on your specific dataset schema or search interface.


In [19]:
# !pip install sentence-transformers

In [20]:
retriever_docs = get_topk_relevant_documents("What is the life insurance coverage in the event of a disability?", 50)

In [21]:
#len(retriever_docs)
!pip show langchain



Name: langchain
Version: 0.3.27
Summary: Building applications with LLMs through composability
Home-page: 
Author: 
Author-email: 
License: MIT
Location: /Users/jason/.jjDataDir/~~asdf_SARAVA/installs/python/3.13.3/lib/python3.13/site-packages
Requires: langchain-core, langchain-text-splitters, langsmith, pydantic, PyYAML, requests, SQLAlchemy
Required-by: langchain-community


In [22]:
# !pip freeze > requirements.txt   

In [23]:
# print one page content
retriever_docs[0]

Document(metadata={'title': 'HDFC Life Group Poorna Suraksha (101N137V02) - Policy Document', 'creator': 'PyPDF', 'source': 'DocumentsPolicy/HDFC-Life-Group-Poorna-Suraksha-101N137V02-Policy-Document.pdf', 'total_pages': 31, 'page_label': '14', 'moddate': '2022-01-20T07:02:14+00:00', 'page': 13, 'producer': 'Microsoft: Print To PDF', 'creationdate': '2022-01-10T13:40:09+00:00'}, page_content='Page 14 of 31 \n \nPart F \n \n1. Waiting Period and Exclusions: \ni. 90 Days Waiting Period for Accelerated Critical Illness Benefit \nNo benefit shall be paid in case the Scheme Member is diagnosed with any of the applicable listed Critical \nIllnesses or surgeries within 90 days from the date of commencement of the Coverage term except in cases \nwhere the Critical Illness occurs as a result of an Accident (such as Major Head Trauma). \n \nii. Suicide exclusion (Single & Joint Life) \n\uf0b7 For employer-employee groups, sum Assured will be payable to the nominee in case of death due to \nSuici

In [24]:
# method for combining all relevant page content
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

In [25]:
from langchain import hub
prompt = hub.pull("rlm/rag-prompt")

## 🔗 4. Chains

While it's possible to use an LLM directly for basic tasks, most real-world applications require **multiple steps**—often involving several components working together. This is where **Chains** come in.

LangChain enables you to **combine various components**—like prompts, LLMs, retrievers, tools, and memory—into a single unified pipeline called a **Chain**.

---

### 🧱 What Are Chains?

Chains allow you to:

- Take user input
- Format it using a `PromptTemplate`
- Send the formatted prompt to an LLM
- Process and return the model's response

This modular structure supports everything from simple input-output flows to complex, multi-step logic.

---

### 🔁 LLMChain: The Basic Unit

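Historically, the most fundamental type of chain in LangChain was the `LLMChain`. Recent releases favor composing the same steps with the pipe (`|`) operator, but the flow is identical: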

- It takes **an input string**
- Applies a **prompt template**
- Sends it to an **LLM**
- Returns the **output string**

```mermaid
graph LR
A[User Input] --> B[Prompt Template] --> C[LLMChain] --> D[LLM Response]
```


In [26]:
# In LangChain, the rag-prompt is a prompt template designed for Retrieval-Augmented Generation (RAG) tasks, 
# such as chat and question-answering applications. It facilitates the integration of external context into 
# the language model's responses, enhancing the relevance and accuracy of the generated content.

# pulling rag prompt from LangChain hub
from langchain import hub
prompt = hub.pull("rlm/rag-prompt")

### 🧪 Create a Basic Chain

We will build a simple chain that retrieves relevant context, formats it together with the user's question using a prompt template, and passes the result to a language model for response generation. The cells that follow compose these steps with the pipe (`|`) operator rather than instantiating the `LLMChain` class. The key components involved are:

- `PromptTemplate` – to structure the input
- User input – the query to be answered
- An LLM – to generate the response
- *(Optional)* Output parsers – to refine or format the result


In [27]:
from langchain_openai import ChatOpenAI
llm = ChatOpenAI()

In [28]:
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

retriever = get_retriever(50)
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
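
The `|` syntax in the chain above relies on Python operator overloading. The sketch below shows the core idea with a hypothetical `Step` class; LangChain's runnables do considerably more (for example, they accept plain dicts and functions as steps), which this illustration does not attempt. The retrieved context and the upper-casing "model" are stand-ins.

```python
class Step:
    """Wrap a function so steps can be chained with the | operator."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # a | b builds a new step that runs a, then feeds its output to b.
        return Step(lambda value: other.invoke(self.invoke(value)))

# Hypothetical stages mirroring retriever -> prompt -> model -> parser.
retrieve = Step(lambda question: {"context": "Sum Assured is payable on death.", "question": question})
build_prompt = Step(lambda d: f"Context: {d['context']}\nQuestion: {d['question']}")
fake_llm = Step(lambda prompt: prompt.upper())      # stand-in for a model call
parse = Step(lambda text: text.strip())

chain = retrieve | build_prompt | fake_llm | parse
answer = chain.invoke("What is paid on death?")
print(answer)
```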

In [29]:
# test a query
query = "What happens to my life insurance if I become disabled?"
rag_chain.invoke(query)

'If you become disabled, your life insurance will not be terminated unless specified events occur such as surrendering the certificate of insurance or termination of the master policy. The benefits will be payable to the nominee or policyholder based on the terms of the policy. In case of death, the death benefit will be paid to the nominee or policyholder as defined in the policy.'

In [30]:
# test a query
query = "What type of disability coverage is provided under the life insurance plan?"
rag_chain.invoke(query)

'The disability coverage provided under the life insurance plan includes benefits for Accelerated Critical Illness and coverage in case of death due to Suicide. There is no maturity benefit payable under the policy, and the benefits on Surrender depend on the type of premium payment chosen.'

In [31]:
# test another query
query = "Does the life insurance plan provide payouts for critical or long-term disability?"
rag_chain.invoke(query)

'The life insurance plan does not provide payouts for critical illness, as the Critical Illness Benefit terminates upon diagnosis within 90 days of cover commencement. Instead, the policy offers benefits for death or accidental death, with the coverage amount payable upon those events. The Critical Illness Benefit option results in policy termination upon diagnosis of a covered critical illness.'

In [32]:
# test another query
query = "Are accelerated benefits available in case of terminal illness or disability?"
rag_chain.invoke(query)

'Yes, accelerated benefits are available in the case of a terminal illness or disability under the Accelerated Critical Illness Option of the Master Policy. This benefit will be payable if the Scheme Member is diagnosed with one of the covered Critical Illnesses or undergoes specific surgeries listed in the policy during the coverage term. However, after payment of the Accelerated Critical Illness Benefit, the coverage for the Scheme Member will cease, and all benefits will expire.'