### **Project: Semantic Spotter**

**Application name**: **Insure Assist AI**

This application has two important use cases:
- Comparison across multiple policies: helps insurance agents and advisors, product teams, and customers choose between plans.
- Granular benefit and claim-related explanations: helps claim handlers and policyholders understand policy clauses and benefits more comprehensively, without going through lengthy documents.

**Document Extraction → Smart Chunking + Metadata → Embedding → VectorDB → User Query → Retrieval (hybrid) → Rerank → LLM Response (Comparison/Q&A)**

#### **Core Use Cases**

“What is the death benefit under Sampoorna Jeevan vs. Poorna Suraksha?”

“Which of the three policies cover critical illness extensively?”

“What is the waiting period for claim eligibility in Easy Health?”

“Compare surrender benefits across policies.”

#### **Tech Stack**

**LlamaIndex** - LlamaIndex is an open-source framework that excels at the "retrieval" and "augmentation" parts of RAG. It is particularly well-suited for my use case for several key reasons:
- Data-Centric Design
- Advanced data connectors and parsers
- Sophisticated indexing and retrieval strategies: the main focus here is metadata filtering, because simple semantic search is not enough for the policy information provided for this problem statement. LlamaIndex makes it straightforward to add metadata (e.g., policy_name, policy_type: 'group', UIN: '101N137V02') to each chunk of text. This allows the system to first filter for only the relevant documents (for example, all group policies) before performing the semantic search, which should lead to more accurate results.

In short, LlamaIndex is purpose-built for RAG systems and is well suited to ingesting and indexing complex documents such as insurance policies.
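The filter-then-search idea described above can be illustrated without any framework. The sketch below is not LlamaIndex code: the chunk dictionaries, the three-dimensional "embeddings", and the helper name `filtered_search` are all made up for illustration. It only shows the order of operations, namely restricting candidates by metadata first and ranking the survivors by cosine similarity afterwards.

```python
import math

# Toy chunks with hand-written metadata and 3-d vectors (illustrative values only).
CHUNKS = [
    {"text": "Group term death benefit clause",
     "metadata": {"policy_type": "group"}, "embedding": [0.9, 0.1, 0.0]},
    {"text": "Individual savings maturity clause",
     "metadata": {"policy_type": "individual"}, "embedding": [0.8, 0.2, 0.1]},
    {"text": "Group critical illness clause",
     "metadata": {"policy_type": "group"}, "embedding": [0.1, 0.9, 0.2]},
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def filtered_search(chunks, query_embedding, filters, top_k=2):
    """Keep chunks whose metadata matches every filter, then rank by similarity."""
    candidates = [
        c for c in chunks
        if all(c["metadata"].get(key) == value for key, value in filters.items())
    ]
    ranked = sorted(
        candidates,
        key=lambda c: cosine(c["embedding"], query_embedding),
        reverse=True,
    )
    return ranked[:top_k]


results = filtered_search(
    CHUNKS, query_embedding=[1.0, 0.0, 0.0], filters={"policy_type": "group"}
)
for chunk in results:
    print(chunk["text"])
```

The individual-policy chunk is never scored, even though its vector is close to the query, because the metadata filter removes it before ranking.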

#### 0. Installing/Importing libraries


In [1]:
### Installing LlamaIndex
# ! pip install -U -q llama-index openai llama-index-core llama-index-embeddings-openai
# ! pip install PyMuPDF
# ! pip install llama-index-embeddings-huggingface
# ! pip install llama-index-embeddings-cohere


In [2]:
### General imports
import os 
import json
import openai
import pandas as pd
## GenAI framework

import nest_asyncio
nest_asyncio.apply()

import semantic_spotter_ins as ins_func

### Avoid printing logs
import logging
logging.getLogger("httpx").setLevel(logging.WARNING)
logging.getLogger("openai").setLevel(logging.WARNING)
logging.getLogger("llama_index").setLevel(logging.WARNING)


### Setting up environment variables
with open("C:/Users/SHAMBHAVVISEN/Downloads/OpenAI_API_Key.txt", "r") as file:
    os.environ["OPENAI_API_KEY"] = file.read().strip()
openai.api_key = os.getenv("OPENAI_API_KEY")



[INFO] OpenAI API key loaded successfully.
[INFO] Global LLM set to: gpt-3.5-turbo
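The setup cell reads the key from a hard-coded local path. A more portable variant, sketched below under assumptions, prefers an environment variable and falls back to a file only when one is given. The helper names `load_api_key` and `mask_secret` are hypothetical and are not part of `semantic_spotter_ins`. Masking matters because printing a client or embedding object can echo the full key into cell output.

```python
import os
from pathlib import Path


def load_api_key(env_var="OPENAI_API_KEY", key_file=None):
    """Return the key from the environment, or from key_file if the variable is unset."""
    key = os.environ.get(env_var, "").strip()
    if not key and key_file is not None:
        key = Path(key_file).read_text(encoding="utf-8").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set and no key file was provided")
    return key


def mask_secret(secret, visible=4):
    """Show only the first few characters of a secret when logging it."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return secret[:visible] + "*" * (len(secret) - visible)


# Log a masked form instead of the raw key (the value here is a placeholder).
print(mask_secret("sk-example-not-a-real-key"))
```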


#### Insurance policies semantic spotter (RAG Architecture)
- Extract the documents
- Chunking documents
    - Fixed window chunking
    - Section-based Chunking
    - Semantic Chunking
- Embedding documents
    - OpenAIEmbedding
    - HuggingFace Transformers
    - Cohere Embeddings
- Retrieval
    - Reranking
    - Similarity postprocessor
    - LLM reranking
- Response generation
    - Q&A format
    - Comparison output format
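As a rough illustration of the fixed-window strategy listed above, the sketch below slides a window of `chunk_size` tokens over the text and advances by `chunk_size - chunk_overlap` each time. It is a simplification: whitespace-separated words stand in for tokens, and the helper `fixed_window_chunks` is a hypothetical name, not the splitter used inside `semantic_spotter_ins`.

```python
def fixed_window_chunks(tokens, chunk_size, chunk_overlap):
    """Split tokens into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(tokens):
            break
    return chunks


# Whitespace words stand in for tokens in this toy example.
tokens = "the policy pays the sum assured on death of the life assured".split()
chunks = fixed_window_chunks(tokens, chunk_size=5, chunk_overlap=2)
for chunk in chunks:
    print(" ".join(chunk))
```

Consecutive chunks share their boundary tokens, which is what keeps a clause that straddles a window edge retrievable from either side.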


In [3]:
def insure_assist_RAG_system(file_loc='None', 
               chunking_type='fixed', 
               chunk_size=None, 
               chunk_overlap=None,
               embedding_type=None,
               api_file_name=None,
               query_str=None,
               node_p=None,
               node_type=None,
               response_mod="refine",
               cut_off=0.8,
               top_k=5
               ):    
    """ 
    file_loc: location of the policy documents folder
    chunking_type: "fixed_window", "semantic", or "Hierarchical"
    chunk_size: token limit per chunk
    chunk_overlap: within-page overlap in tokens, for continuity
    embedding_type: "OpenAIEmbedding", "HuggingFace", or "Cohere"
    api_file_name: path to the file containing the API key
    query_str: user query for final response generation
    node_p: re-ranking methodology ("LLM" or "similarity")
    node_type: label passed to the embedding step
    response_mod: (output format) 'refine', 'compact', 'accumulate', or 'tree_summarize'
    cut_off: similarity cut-off passed to the retrieval step
    top_k: number of nodes to retrieve

    returns: index, retrieved nodes, and the generated response

    """
    ### Extract documents ###
    documents=ins_func.load_docs(file_loc)

    print(f"Original Total documents: {len(documents)}")

    ### chunking strategies ###
    nodes=ins_func.chunking_strategies(chunking_type,documents,
                                       chunk_size=chunk_size,
                                       chunk_overlap=chunk_overlap)
    print(f"Documents are chunked based on {chunking_type} chunking Strategy with the total nodes of {len(nodes)}")
    
    ### embedding strategies ###
    index = ins_func.embedding_strategies(embedding_type=embedding_type,nodes=nodes, api_file_name=api_file_name,node_type=node_type) 

    ### Retriever + Re-ranking & Generation ###
    query_engine,retrieved_nodes, retriever, response=ins_func.create_retriever_generate_response(index, 
                                         query_str=query_str ,
                                         top_k=top_k, filters=None, node_p=node_p, cut_off=cut_off,
                                         response_mod=response_mod)
    print(f"The query statement is: {query_str}")
    return index,retrieved_nodes, response
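The `cut_off` and `top_k` arguments decide which retrieved nodes reach the LLM. How `ins_func` applies them is not shown in this notebook, so the following is only a library-free sketch of one common interpretation: drop nodes scoring below the cut-off, then keep the best `top_k` of what remains. `ScoredChunk`, `apply_cutoff_and_top_k`, and the scores are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class ScoredChunk:
    text: str
    score: float  # similarity between the chunk and the query


def apply_cutoff_and_top_k(scored, cut_off=0.8, top_k=5):
    """Drop chunks below cut_off, then return at most top_k by descending score."""
    kept = [chunk for chunk in scored if chunk.score >= cut_off]
    kept.sort(key=lambda chunk: chunk.score, reverse=True)
    return kept[:top_k]


# Made-up scores, for illustration only.
candidates = [
    ScoredChunk("Critical illness definition", 0.91),
    ScoredChunk("Premium payment terms", 0.62),
    ScoredChunk("Accelerated CI option", 0.85),
    ScoredChunk("Grievance redressal", 0.79),
]
selected = apply_cutoff_and_top_k(candidates, cut_off=0.8, top_k=5)
for chunk in selected:
    print(f"{chunk.score:.2f}  {chunk.text}")
```

A high cut-off such as 0.8 can leave fewer than `top_k` nodes, or none at all, so it is worth checking how many nodes survive before judging the generated answer.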



#### Evaluation of different chunking strategies

Questions to be tested (any of these questions can be used):

**Comparison Questions**
1. Compare the availability of a Surrender Value or benefit across the HDFC Life Easy Health (Single Premium), the HDFC Life Group Term Life, and the HDFC Life Sampoorna Jeevan (Regular Pay) policies
2. Compare the Suicide Exclusion clauses for the HDFC Life Group Poorna Suraksha and the HDFC Life Group Term Life policies, specifically differentiating the treatment of employer-employee versus non-employer-employee groups.
3. Contrast the availability of a Policy Loan facility and the stated interest rate structure for the loan under the individual savings policies, HDFC Life Sampoorna Jeevan Plan versus HDFC Life Sanchay Plus
4. Differentiate the Critical Illness Benefit (CIB) trigger and claim effect between the HDFC Life Easy Health policy and the HDFC Life Group Poorna Suraksha policy (Accelerated Critical Illness Option)
5. Compare the definition of "Sum Assured on Death" used in the HDFC Life Sanchay Plus Policy (Limited/Regular Pay) for an entry age of 30, with the definition of Sum Assured on Death used in the HDFC Life Smart Pension Plan.
6. Contrast the Free-Look Period for a policy purchased through Distance Marketing for an Individual Policyholder (e.g., HDFC Life Easy Health or HDFC Life Sanchay Plus) versus a Group Master Policyholder (e.g., HDFC Life Group Poorna Suraksha).


**Individual policy questions**
1. For a non-employer-employee group under the HDFC Life Group Term Life Policy, what specific benefit amount is payable if an Insured Member dies by suicide within 12 months of joining?
2. List two requirements that the Policyholder must fulfill for a transfer or assignment of the HDFC Life Sampoorna Jeevan Policy to be operative against the insurer, according to the simplified Section 38 provisions
3. For the HDFC Life Group Poorna Suraksha Master Policy, what defines a "Hospital" based on minimum criteria for a town with a population of 15,00,000?
4. Describe how the Simple Reversionary Income Bonus (Bonus Option 2) is paid out under the HDFC Life Sampoorna Jeevan Plan if the Premium Payment Term is 15 years.
5. For the HDFC Life Easy Health Surgical Benefit Option, what specific percentage of the Sum Insured is payable for undergoing "Coronary Angioplasty with stent implantation (two or more coronary arteries must be stented)"?
6. In the HDFC Life Smart Pension Plan, what is the maximum Policy Discontinuance Charge (in Rupees) for a non-single premium policy with an Annualized Premium of Rs. 60,000, discontinued during Policy Year 4?

    

### Fixed Window 

- OpenAI
- HuggingFace
- Cohere


In [4]:
### Queries for different cases
comparison_question="Compare the Critical Illness Benefit (CIB) trigger and claim effect between the HDFC Life Easy Health policy and the HDFC Life Group Poorna Suraksha policy (Accelerated Critical Illness Option)"
single_qna="In the HDFC Life Smart Pension Plan, what is the maximum Policy Discontinuance Charge (in Rupees) for a non-single premium policy with an Annualized Premium of Rs. 60,000, discontinued during Policy Year 4?"


In [5]:
### OpenAI

fixed_chunk_index_op, fixed_openai_ret,fixed_openai_response =insure_assist_RAG_system(file_loc='Policy+Documents', 
               chunking_type='fixed_window', 
               chunk_size=512, 
               chunk_overlap=200,
               embedding_type="OpenAIEmbedding",
               api_file_name="C:/Users/SHAMBHAVVISEN/Downloads/OpenAI_API_Key.txt",
               query_str=comparison_question,
               node_p="LLM",
               response_mod='refine',
               node_type="fixed_window",
               cut_off=0.8,
               top_k=10
               )
print("\n fixed_openai_ret")
print(fixed_openai_ret)
print("\n")
print(fixed_openai_response)

Loaded 281
Original Total documents: 281
Documents are chunked based on fixed_window chunking Strategy with the total nodes of 571
Using embedding_type & embedding model:model_name='text-embedding-3-large' embed_batch_size=100 callback_manager=<llama_index.core.callbacks.base.CallbackManager object at 0x000001F9BEFCDA90> num_workers=None embeddings_cache=None additional_kwargs={} api_key='***REDACTED***' api_base='https://api.openai.com/v1' api_version='' max_retries=10 timeout=60.0 default_headers=None reuse_client=True dimensions=None
Accessing from existing index...OpenAIEmbedding
Loading llama_index.core.storage.kvstore.simple_kvstore from storage_openai_fixed_window/docstore.json.
Loading llama_index.core.storage.kvstore.simple_kvstore from storage_openai_fixed_window/index_store.json.
Done... & OpenAIEmbedding
[CACHE MISS] Process

**Compare the Suicide Exclusion clauses for the HDFC Life Group Poorna Suraksha and the HDFC Life Group Term Life policies, specifically differentiating the treatment of employer-employee versus non-employer-employee groups.**


| Policy Name                          | Feature/Benefit                          | Details                                                                                                                                                                                                                     |
|--------------------------------------|-----------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| HDFC Life Group Poorna Suraksha     | Suicide Exclusion for Employer-Employee | The suicide clause does not apply to the Employer-Employee group. If an insured member dies by suicide within one year from the date of joining the scheme, the full death benefit is payable without any deductions.            |
| HDFC Life Group Poorna Suraksha     | Suicide Exclusion for Non-Employer-Employee | For non-employer-employee groups, if an insured member dies by suicide within the first year of membership, the benefit is limited to 80% of the total premiums paid, excluding any extra premium and taxes. The premium paid is forfeited.       |
| HDFC Life Group Term Life            | Suicide Exclusion for Employer-Employee | Similar to the Poorna Suraksha policy, the suicide clause is not applicable to the Employer-Employee group, ensuring full death benefits are paid in case of suicide within the first year.                                 |
| HDFC Life Group Term Life            | Suicide Exclusion for Non-Employer-Employee | In the case of non-employer-employee groups, the policy limits the benefit to 80% of the total premiums paid if the insured member commits suicide within the first year, with the premium forfeited to the company.               |

**Compare the Critical Illness Benefit (CIB) trigger and claim effect between the HDFC Life Easy Health policy and the HDFC Life Group Poorna Suraksha policy (Accelerated Critical Illness Option)**

| Policy Name | Feature / Benefit | Details |
|--------------|------------------|----------|
| HDFC Life Easy Health | Critical Illness Benefit (CIB) trigger | Diagnosis of any covered Critical Illness during the Policy Term |
| HDFC Life Easy Health | Critical Illness Benefit (CIB) claim effect | Benefit payable is the Sum Assured and policy will terminate |
| HDFC Life Group Poorna Suraksha (Accelerated Critical Illness Option) | Critical Illness Benefit (CIB) trigger | Diagnosis of any covered Critical Illness during the Policy Term |
| HDFC Life Group Poorna Suraksha (Accelerated Critical Illness Option) | Critical Illness Benefit (CIB) claim effect | Benefit payable is the Sum Assured and policy will terminate |

In [6]:
print("Retrieved_nodes")
fixed_openai_ret

Retrieved_nodes


[NodeWithScore(node=TextNode(id_='5e094d37-5ff6-4f25-835a-ddc639ccd082', embedding=None, metadata={'total_pages': 31, 'file_path': 'Policy+Documents\\HDFC-Life-Group-Poorna-Suraksha-101N137V02-Policy-Document.pdf', 'source': '7'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='149f7a36-d6d3-4058-9f6a-7162ca827d33', node_type='4', metadata={'total_pages': 31, 'file_path': 'Policy+Documents\\HDFC-Life-Group-Poorna-Suraksha-101N137V02-Policy-Document.pdf', 'source': '7'}, hash='4eb6e29a0650b88db1a1f72646e503447cc2cac997f9a0c28634729cb3b96da1')}, metadata_template='{key}: {value}', metadata_separator='\n', text='Page 7 of 31 \n \nPart C \n1. \nBenefits: \n \n(1) Benefits on Death or diagnosis of contingency covered –  \n \nPlan Option \nEvents \nBenefit \nLife \nDeath \nIn the event of the death of the Scheme Member, the \nbenefit payable shall be the Sum Assured.  \nExtra Life Option \nDeath \nIn the 

In [7]:
print(fixed_openai_response)

| Policy Name | Feature / Benefit | Details |
|--------------|------------------|----------|
| HDFC Life Easy Health | Critical Illness Benefit (CIB) trigger | Diagnosis of any covered Critical Illness during the Policy Term |
| HDFC Life Easy Health | Critical Illness Benefit (CIB) claim effect | Benefit payable is the Sum Assured and policy will terminate |
| HDFC Life Group Poorna Suraksha (Accelerated Critical Illness Option) | Critical Illness Benefit (CIB) trigger | Diagnosis of any covered Critical Illness during the Coverage Term |
| HDFC Life Group Poorna Suraksha (Accelerated Critical Illness Option) | Critical Illness Benefit (CIB) claim effect | Benefit payable is the Sum Assured and policy will terminate |


In [8]:
### Q&A
query_engine_fo,retrieved_nodes_fo, retriever, f_op_response=ins_func.create_retriever_generate_response(fixed_chunk_index_op, 
                                         query_str=single_qna,
                                         top_k=10, filters=None, node_p="LLM", cut_off=0.8,
                                         response_mod="tree_summarize")
print(f"Individual Policy information - Extraction")
display(retrieved_nodes_fo)
print(f_op_response)

[CACHE MISS] Processing new query: 'In the HDFC Life Smart Pension Plan, what is the maximum Policy Discontinuance Charge (in Rupees) for a non-single premium policy with an Annualized Premium of Rs. 60,000, discontinued during Policy Year 4?'
[CACHE STORE] Query cached successfully: 'In the HDFC Life Smart Pension Plan, what is the maximum Policy Discontinuance Charge (in Rupees) for a non-single premium policy with an Annualized Premium of Rs. 60,000, discontinued during Policy Year 4?'
Individual Policy information - Extraction


[NodeWithScore(node=TextNode(id_='f057f6c6-a55a-4358-a7cb-8a2fa3c60674', embedding=None, metadata={'total_pages': 37, 'file_path': 'Policy+Documents\\HDFC-Life-Smart-Pension-Plan-Policy-Document-Online.pdf', 'source': '19'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='bd121e78-b885-4dec-899e-c6c27c36ee4b', node_type='4', metadata={'total_pages': 37, 'file_path': 'Policy+Documents\\HDFC-Life-Smart-Pension-Plan-Policy-Document-Online.pdf', 'source': '19'}, hash='2060434bce13083d7725a7335ec9c7130ad883f0c5ff430e325e787362d220bb'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='6f7d2247-63be-4493-b8aa-2d2222bbfcab', node_type='1', metadata={'total_pages': 37, 'file_path': 'Policy+Documents\\HDFC-Life-Smart-Pension-Plan-Policy-Document-Online.pdf', 'source': '19'}, hash='35911cb6ba525fb34f65cc08b316bb04926f490d27127b6b16ed1b849b2f16be')}, metadata_template='{key}: {value}', metadata_separa

The maximum Policy Discontinuance Charge for a non-single premium policy with an Annualized Premium of Rs. 60,000, discontinued during Policy Year 4 would be Rs. 2,000.


In [9]:
#### HuggingFace
## Comparison Q&A
fixed_chunk_index_hf, fixed_hf_ret,fixed_hf_response =insure_assist_RAG_system(file_loc='Policy+Documents', 
               chunking_type='fixed_window', 
               chunk_size=512, 
               chunk_overlap=200,
               embedding_type="HuggingFace",
               api_file_name="C:/Users/SHAMBHAVVISEN/Downloads/OpenAI_API_Key.txt",
               query_str=comparison_question,
               node_p="LLM",
               node_type="fixed_window",
               response_mod='refine',
               cut_off=0.8,
               top_k=10
               )
print("fixed_hf_ret \n")
display(fixed_hf_ret)
print("\n")
print(fixed_hf_response)

Loaded 281
Original Total documents: 281


2025-10-07 00:43:48,983 - INFO - Load pretrained SentenceTransformer: sentence-transformers/all-MiniLM-L6-v2


Documents are chunked based on fixed_window chunking Strategy with the total nodes of 571
Using embedding_type & embedding model:model_name='sentence-transformers/all-MiniLM-L6-v2' embed_batch_size=10 callback_manager=<llama_index.core.callbacks.base.CallbackManager object at 0x000001F9C1656FC0> num_workers=None embeddings_cache=None max_length=256 normalize=True query_instruction=None text_instruction=None cache_folder=None show_progress_bar=False
Accessing from existing index...HuggingFace
Loading llama_index.core.storage.kvstore.simple_kvstore from storage_huggingface_fixed_window/docstore.json.
Loading llama_index.core.storage.kvstore.simple_kvstore from storage_huggingface_fixed_window/index_store.json.
Done... & HuggingFace
[CACHE MISS] Processing new query: 'Compare the Critical Illness Benefit (CIB) trigger and claim effect between the HDFC Life Easy Health policy and the HDFC Life Group Poorna Suraksha policy (Accelerated Critical Illness Option)'
[CACHE STORE] Query cached su

[NodeWithScore(node=TextNode(id_='2e03a443-fcf0-4f03-b10b-d1d8e3a906c3', embedding=None, metadata={'total_pages': 31, 'file_path': 'Policy+Documents\\HDFC-Life-Group-Poorna-Suraksha-101N137V02-Policy-Document.pdf', 'source': '7'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='80a9e250-d7e3-48e9-bf88-19d029bf8b2a', node_type='4', metadata={'total_pages': 31, 'file_path': 'Policy+Documents\\HDFC-Life-Group-Poorna-Suraksha-101N137V02-Policy-Document.pdf', 'source': '7'}, hash='4eb6e29a0650b88db1a1f72646e503447cc2cac997f9a0c28634729cb3b96da1')}, metadata_template='{key}: {value}', metadata_separator='\n', text='Page 7 of 31 \n \nPart C \n1. \nBenefits: \n \n(1) Benefits on Death or diagnosis of contingency covered –  \n \nPlan Option \nEvents \nBenefit \nLife \nDeath \nIn the event of the death of the Scheme Member, the \nbenefit payable shall be the Sum Assured.  \nExtra Life Option \nDeath \nIn the 



| Policy Name | Feature / Benefit | Details |
|--------------|------------------|----------|
| HDFC Life Easy Health | Critical Illness Benefit (CIB) trigger | Diagnosis of any covered Critical Illness during the Policy Term |
| HDFC Life Easy Health | Claim Effect | Benefit payable is the Sum Assured and the policy will terminate |
| HDFC Life Group Poorna Suraksha (Accelerated Critical Illness Option) | Critical Illness Benefit (CIB) trigger | Diagnosis of any covered Critical Illness during the Policy Term |
| HDFC Life Group Poorna Suraksha (Accelerated Critical Illness Option) | Claim Effect | Benefit payable is the Sum Assured and the policy will terminate |


In [10]:
# print("fixed_hg_ret \n")
display(fixed_hf_ret)

[NodeWithScore(node=TextNode(id_='2e03a443-fcf0-4f03-b10b-d1d8e3a906c3', embedding=None, metadata={'total_pages': 31, 'file_path': 'Policy+Documents\\HDFC-Life-Group-Poorna-Suraksha-101N137V02-Policy-Document.pdf', 'source': '7'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='80a9e250-d7e3-48e9-bf88-19d029bf8b2a', node_type='4', metadata={'total_pages': 31, 'file_path': 'Policy+Documents\\HDFC-Life-Group-Poorna-Suraksha-101N137V02-Policy-Document.pdf', 'source': '7'}, hash='4eb6e29a0650b88db1a1f72646e503447cc2cac997f9a0c28634729cb3b96da1')}, metadata_template='{key}: {value}', metadata_separator='\n', text='Page 7 of 31 \n \nPart C \n1. \nBenefits: \n \n(1) Benefits on Death or diagnosis of contingency covered –  \n \nPlan Option \nEvents \nBenefit \nLife \nDeath \nIn the event of the death of the Scheme Member, the \nbenefit payable shall be the Sum Assured.  \nExtra Life Option \nDeath \nIn the 

In [11]:
### Q&A
query_engine_fo,retrieved_nodes_hg_f, retriever, f_hg_response=ins_func.create_retriever_generate_response(fixed_chunk_index_hf, 
                                         query_str=single_qna,
                                         top_k=10, filters=None, node_p="LLM", cut_off=0.8,
                                         response_mod="tree_summarize")
print(f"Individual Policy information - Extraction")
display(retrieved_nodes_hg_f)
print(f_hg_response)

[CACHE MISS] Processing new query: 'In the HDFC Life Smart Pension Plan, what is the maximum Policy Discontinuance Charge (in Rupees) for a non-single premium policy with an Annualized Premium of Rs. 60,000, discontinued during Policy Year 4?'
[CACHE STORE] Query cached successfully: 'In the HDFC Life Smart Pension Plan, what is the maximum Policy Discontinuance Charge (in Rupees) for a non-single premium policy with an Annualized Premium of Rs. 60,000, discontinued during Policy Year 4?'
Individual Policy information - Extraction


[NodeWithScore(node=TextNode(id_='52ebd844-3107-49a9-b777-85c1a6897b89', embedding=None, metadata={'total_pages': 37, 'file_path': 'Policy+Documents\\HDFC-Life-Smart-Pension-Plan-Policy-Document-Online.pdf', 'source': '19'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='8927a036-becf-4634-9b09-897e5a7fb48b', node_type='4', metadata={'total_pages': 37, 'file_path': 'Policy+Documents\\HDFC-Life-Smart-Pension-Plan-Policy-Document-Online.pdf', 'source': '19'}, hash='2060434bce13083d7725a7335ec9c7130ad883f0c5ff430e325e787362d220bb'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='6665a749-481b-4ceb-9ddc-1ad9472f227e', node_type='1', metadata={'total_pages': 37, 'file_path': 'Policy+Documents\\HDFC-Life-Smart-Pension-Plan-Policy-Document-Online.pdf', 'source': '19'}, hash='35911cb6ba525fb34f65cc08b316bb04926f490d27127b6b16ed1b849b2f16be')}, metadata_template='{key}: {value}', metadata_separa

The maximum Policy Discontinuance Charge for a non-single premium policy with an Annualized Premium of Rs. 60,000, discontinued during Policy Year 4 would be Rs. 2,000.


In [12]:
display(retrieved_nodes_hg_f)

[NodeWithScore(node=TextNode(id_='52ebd844-3107-49a9-b777-85c1a6897b89', embedding=None, metadata={'total_pages': 37, 'file_path': 'Policy+Documents\\HDFC-Life-Smart-Pension-Plan-Policy-Document-Online.pdf', 'source': '19'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='8927a036-becf-4634-9b09-897e5a7fb48b', node_type='4', metadata={'total_pages': 37, 'file_path': 'Policy+Documents\\HDFC-Life-Smart-Pension-Plan-Policy-Document-Online.pdf', 'source': '19'}, hash='2060434bce13083d7725a7335ec9c7130ad883f0c5ff430e325e787362d220bb'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='6665a749-481b-4ceb-9ddc-1ad9472f227e', node_type='1', metadata={'total_pages': 37, 'file_path': 'Policy+Documents\\HDFC-Life-Smart-Pension-Plan-Policy-Document-Online.pdf', 'source': '19'}, hash='35911cb6ba525fb34f65cc08b316bb04926f490d27127b6b16ed1b849b2f16be')}, metadata_template='{key}: {value}', metadata_separa

In [13]:
#### Cohere
## Comparison Q&A
fixed_chunk_index_cohere, fixed_cohere_ret,fixed_cohere_response =insure_assist_RAG_system(file_loc='Policy+Documents', 
               chunking_type='fixed_window', 
               chunk_size=512, 
               chunk_overlap=200,
               embedding_type="Cohere",
               api_file_name="C:/Users/SHAMBHAVVISEN/Desktop/Upgrad-Notes/cohere_api_key.txt",
               query_str=comparison_question,
               node_p="LLM",
               node_type="cohere",
               response_mod='refine',
               cut_off=0.8,
               top_k=10
               )
print("fixed_cohere_ret \n")
display(fixed_cohere_ret)
print("\n")
print(fixed_cohere_response)

Loaded 281
Original Total documents: 281
Documents are chunked based on fixed_window chunking Strategy with the total nodes of 571
Using embedding_type & embedding model:model_name='embed-english-v3.0' embed_batch_size=10 callback_manager=<llama_index.core.callbacks.base.CallbackManager object at 0x000001F9C1A30A10> num_workers=None embeddings_cache=None api_key='***REDACTED***' base_url=None truncate='END' input_type=None embedding_type='float'
Accessing from existing index...Cohere
Loading llama_index.core.storage.kvstore.simple_kvstore from storage_cohere_cohere/docstore.json.
Loading llama_index.core.storage.kvstore.simple_kvstore from storage_cohere_cohere/index_store.json.
Done... & Cohere
[CACHE MISS] Processing new query: 'Compare the Critical Illness Benefit (CIB) trigger and claim effect between the HDFC Life Easy Health policy and the HDFC Life Group Poorna Suraksha policy (Accelerated Critical Illness Option)'
[CACHE STORE] Query cached successfu

[NodeWithScore(node=TextNode(id_='b83b656b-e565-4e16-bc9a-7f1edb578244', embedding=None, metadata={'total_pages': 31, 'file_path': 'Policy+Documents\\HDFC-Life-Group-Poorna-Suraksha-101N137V02-Policy-Document.pdf', 'source': '7'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='41d55790-9b16-4de9-b69c-8693168c2b25', node_type='4', metadata={'total_pages': 31, 'file_path': 'Policy+Documents\\HDFC-Life-Group-Poorna-Suraksha-101N137V02-Policy-Document.pdf', 'source': '7'}, hash='4eb6e29a0650b88db1a1f72646e503447cc2cac997f9a0c28634729cb3b96da1')}, metadata_template='{key}: {value}', metadata_separator='\n', text='Page 7 of 31 \n \nPart C \n1. \nBenefits: \n \n(1) Benefits on Death or diagnosis of contingency covered –  \n \nPlan Option \nEvents \nBenefit \nLife \nDeath \nIn the event of the death of the Scheme Member, the \nbenefit payable shall be the Sum Assured.  \nExtra Life Option \nDeath \nIn the 



| Policy Name | Feature / Benefit | Details |
|--------------|------------------|----------|
| HDFC Life Easy Health | Critical Illness Benefit (CIB) trigger | Diagnosis of any covered Critical Illness during the Policy Term |
| HDFC Life Easy Health | Claim Effect | Benefit payable is the Sum Assured and policy will terminate |
| HDFC Life Group Poorna Suraksha (Accelerated Critical Illness Option) | Critical Illness Benefit (CIB) trigger | Diagnosis of any covered Critical Illness during the Policy Term |
| HDFC Life Group Poorna Suraksha (Accelerated Critical Illness Option) | Claim Effect | Benefit payable is the Sum Assured and policy will terminate |


**Question: Compare the Suicide Exclusion clauses for the HDFC Life Group Poorna Suraksha and the HDFC Life Group Term Life policies, specifically differentiating the treatment of employer-employee versus non-employer-employee groups.**

| Policy Name                                   | Feature/Benefit                     | Details                                                                                                                                                                                                                     |
|-----------------------------------------------|------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| HDFC Life Group Poorna Suraksha               | Suicide Exclusion for Employer-Employee Groups | In employer-employee groups, the full sum assured is payable to the nominee in the event of death due to suicide, regardless of the time since joining the scheme. This provides comprehensive coverage for employees.         |
| HDFC Life Group Poorna Suraksha               | Suicide Exclusion for Non-Employer-Employee Groups | For non-employer-employee groups, if a member dies by suicide within 12 months of joining or reviving the policy, the nominee is entitled to at least 80% of the total premiums paid or the surrender value, whichever is higher. |
| HDFC Life Group Term Life                     | Suicide Exclusion for Employer-Employee Groups | Similar to the Poorna Suraksha policy, in employer-employee groups, the full sum assured is payable upon death by suicide, ensuring that employees have robust protection under this clause.                                   |
| HDFC Life Group Term Life                     | Suicide Exclusion for Non-Employer-Employee Groups | In non-employer-employee groups, the same 12-month limitation applies, where the nominee receives 80% of premiums paid or the surrender value if the member dies by suicide within that period, providing limited coverage.   |
| Both Policies                                 | Joint Life Provision                | For both policies, in the case of joint life, if one member dies by suicide, the respective benefits will be payable to the surviving member, ensuring that the coverage continues for the remaining insured individual.     |

In [14]:
### Q&A
query_engine_fch,retrieved_nodes_fch, retriever, fch_response=ins_func.create_retriever_generate_response(fixed_chunk_index_cohere, 
                                         query_str=single_qna,
                                         top_k=10, filters=None, node_p="LLM", cut_off=0.8,
                                         response_mod="tree_summarize")
print(f"Individual Policy information - Extraction")
display(retrieved_nodes_fch)
print("\n")
print(fch_response)

[CACHE MISS] Processing new query: 'In the HDFC Life Smart Pension Plan, what is the maximum Policy Discontinuance Charge (in Rupees) for a non-single premium policy with an Annualized Premium of Rs. 60,000, discontinued during Policy Year 4?'
[CACHE STORE] Query cached successfully: 'In the HDFC Life Smart Pension Plan, what is the maximum Policy Discontinuance Charge (in Rupees) for a non-single premium policy with an Annualized Premium of Rs. 60,000, discontinued during Policy Year 4?'
Individual Policy information - Extraction


[NodeWithScore(node=TextNode(id_='f0f9d9a4-eb8f-42c1-862e-43d0324fd47f', embedding=None, metadata={'total_pages': 37, 'file_path': 'Policy+Documents\\HDFC-Life-Smart-Pension-Plan-Policy-Document-Online.pdf', 'source': '19'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='8ebb54d9-6540-44a9-b4f2-6ab24a9a7d41', node_type='4', metadata={'total_pages': 37, 'file_path': 'Policy+Documents\\HDFC-Life-Smart-Pension-Plan-Policy-Document-Online.pdf', 'source': '19'}, hash='2060434bce13083d7725a7335ec9c7130ad883f0c5ff430e325e787362d220bb'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='90d231dd-cb97-4a57-81c0-dbcb55d5423c', node_type='1', metadata={'total_pages': 37, 'file_path': 'Policy+Documents\\HDFC-Life-Smart-Pension-Plan-Policy-Document-Online.pdf', 'source': '19'}, hash='35911cb6ba525fb34f65cc08b316bb04926f490d27127b6b16ed1b849b2f16be')}, metadata_template='{key}: {value}', metadata_separa



The maximum Policy Discontinuance Charge for a non-single premium policy with an Annualized Premium of Rs. 60,000, discontinued during Policy Year 4 would be Rs. 2,000.
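
The `[CACHE MISS]` and `[CACHE STORE]` lines in the output above suggest that the helper memoizes responses per query. Its implementation is not shown in this section, so the following is only a minimal sketch of that pattern: `QueryCache`, `get_or_compute`, the `namespace` argument, and the `[CACHE HIT]` message are all hypothetical names, and `fake_rag_pipeline` stands in for the real retrieval and generation call.

```python
import hashlib


class QueryCache:
    """In-memory response cache keyed by a hash of namespace + normalized query."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def _key(query: str, namespace: str) -> str:
        # Normalize case and whitespace so trivially different phrasings share a key.
        normalized = " ".join(query.lower().split())
        return hashlib.sha256(f"{namespace}::{normalized}".encode("utf-8")).hexdigest()

    def get_or_compute(self, query: str, compute, namespace: str = "default"):
        key = self._key(query, namespace)
        if key in self._store:
            print(f"[CACHE HIT] Returning cached response for: '{query}'")
            return self._store[key]
        print(f"[CACHE MISS] Processing new query: '{query}'")
        response = compute(query)
        self._store[key] = response
        print(f"[CACHE STORE] Query cached successfully: '{query}'")
        return response


# Hypothetical stand-in for the real retrieval + generation pipeline.
def fake_rag_pipeline(query: str) -> str:
    return f"answer to: {query}"


cache = QueryCache()
first = cache.get_or_compute("What is the waiting period?", fake_rag_pipeline)
second = cache.get_or_compute("what is the  waiting period?", fake_rag_pipeline)
```

Because every configuration in this notebook logs a cache miss for the same question, the real cache key presumably covers more than the query text; the `namespace` argument is one way to keep answers from different indexes apart.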


#### Hierarchical chunking with different embeddings

- OpenAI
- HuggingFace
- Cohere
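
As a rough illustration of what `chunk_size=[1024, 512, 256]` with `chunk_overlap=[50, 50, 50]` means in the cells below: each document is split at several granularities, and every smaller chunk keeps a link to the larger chunk it came from. The sketch below is an assumption-laden toy, not LlamaIndex's implementation: it counts whitespace-separated words instead of model tokens, and `Chunk`, `split_with_overlap`, and `hierarchical_chunks` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Chunk:
    level: int                 # 0 = coarsest level
    tokens: List[str]
    parent: Optional[int]      # index of the parent chunk in the flat list, if any
    children: List[int] = field(default_factory=list)


def split_with_overlap(tokens, size, overlap):
    """Slide a window of `size` tokens, stepping by `size - overlap`."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    windows = []
    start = 0
    while start < len(tokens):
        windows.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reaches the end of the input
        start += step
    return windows


def hierarchical_chunks(text, sizes=(1024, 512, 256), overlaps=(50, 50, 50)):
    """Return a flat list of chunks; each finer chunk records its parent."""
    chunks: List[Chunk] = []

    def recurse(tokens, level, parent):
        for window in split_with_overlap(tokens, sizes[level], overlaps[level]):
            idx = len(chunks)
            chunks.append(Chunk(level=level, tokens=window, parent=parent))
            if parent is not None:
                chunks[parent].children.append(idx)
            if level + 1 < len(sizes):
                recurse(window, level + 1, idx)

    recurse(text.split(), 0, None)  # whitespace "tokens" only, as a simplification
    return chunks


# Synthetic 2000-word input, purely for illustration.
demo = hierarchical_chunks(" ".join(f"w{i}" for i in range(2000)))
leaves = [c for c in demo if c.level == 2]
print(len(demo), "chunks in total,", len(leaves), "leaf chunks")
```

The parent links are what make the hierarchy useful at query time: small chunks can be matched precisely, and the larger parent can be handed to the LLM when more context is needed.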

In [15]:
#### OpenAI
## Comparison Q&A
hier_chunk_index_op, hier_op_ret, hier_op_response = insure_assist_RAG_system(
    file_loc='Policy+Documents',
    chunking_type='Hierarchical',
    chunk_size=[1024, 512, 256],
    chunk_overlap=[50, 50, 50],
    embedding_type="OpenAIEmbedding",
    api_file_name="C:/Users/SHAMBHAVVISEN/Downloads/OpenAI_API_Key.txt",
    query_str=comparison_question,
    node_p="similar",
    node_type="hier_nodes",
    response_mod='refine',
    cut_off=0.8,
    top_k=10,
)
print("Hierarchical chunking with OpenAI Embeds\n")
display(hier_chunk_index_op)
print("\n")
print(hier_op_response)

Loaded 281
Original Total documents: 281
Documents are chunked based on Hierarchical chunking Strategy with the total nodes of 1987
Using embedding_type & embedding model:model_name='text-embedding-3-large' embed_batch_size=100 callback_manager=<llama_index.core.callbacks.base.CallbackManager object at 0x000001F9C17A8050> num_workers=None embeddings_cache=None additional_kwargs={} api_key='<REDACTED>' api_base='https://api.openai.com/v1' api_version='' max_retries=10 timeout=60.0 default_headers=None reuse_client=True dimensions=None
Accessing from existing index...OpenAIEmbedding
Loading llama_index.core.storage.kvstore.simple_kvstore from storage_openai_hier_nodes/docstore.json.
Loading llama_index.core.storage.kvstore.simple_kvstore from storage_openai_hier_nodes/index_store.json.
Done... & OpenAIEmbedding
[CACHE MISS] Processing

<llama_index.core.indices.vector_store.base.VectorStoreIndex at 0x1f9c1aab360>



| Policy Name | Feature / Benefit | Details |
|--------------|------------------|----------|
| HDFC Life Easy Health | Critical Illness Benefit (CIB) trigger | Triggered upon diagnosis of covered critical illness as per policy terms |
| HDFC Life Group Poorna Suraksha (Accelerated Critical Illness Option) | Critical Illness Benefit (CIB) trigger | Triggered upon diagnosis of covered critical illness as listed in the policy document, after which coverage ceases and benefits expire |
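
The `node_p="similar"`, `cut_off=0.8`, and `top_k=10` arguments in the call above presumably configure a similarity-based post-processing step on the retrieved nodes. The internals of `insure_assist_RAG_system` are not shown here, so this is only a sketch of what such a filter commonly does; `ScoredNode` and `filter_by_similarity` are hypothetical names, and the scores are made up for illustration.

```python
from typing import List, NamedTuple


class ScoredNode(NamedTuple):
    text: str
    score: float  # similarity between the query and the chunk, higher is better


def filter_by_similarity(nodes: List[ScoredNode], cut_off: float = 0.8, top_k: int = 10):
    """Keep nodes scoring at least `cut_off`, best first, at most `top_k` of them."""
    kept = [n for n in nodes if n.score >= cut_off]
    kept.sort(key=lambda n: n.score, reverse=True)
    return kept[:top_k]


# Illustrative candidates only; these are not scores from the notebook run.
candidates = [
    ScoredNode("critical illness benefit clause", 0.91),
    ScoredNode("premium payment schedule", 0.62),
    ScoredNode("coverage cessation after claim", 0.84),
]
for node in filter_by_similarity(candidates, cut_off=0.8, top_k=10):
    print(f"{node.score:.2f}  {node.text}")
```

If the cut-off is too strict for the score range a given embedding model produces, the filtered list can come back empty, in which case the LLM has no context to answer from.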


In [16]:
### Q&A
query_engine_hr_op, retrieved_nodes_hr_op, hr_op_retriever, hr_op_response = ins_func.create_retriever_generate_response(
    hier_chunk_index_op,
    query_str=single_qna,
    top_k=20, filters=None, node_p="LLM", cut_off=0.8,
    response_mod="tree_summarize",
)
print("Individual Policy information - Extraction")
display(retrieved_nodes_hr_op)
print("\n")
print(hr_op_response)

[CACHE MISS] Processing new query: 'In the HDFC Life Smart Pension Plan, what is the maximum Policy Discontinuance Charge (in Rupees) for a non-single premium policy with an Annualized Premium of Rs. 60,000, discontinued during Policy Year 4?'
[CACHE STORE] Query cached successfully: 'In the HDFC Life Smart Pension Plan, what is the maximum Policy Discontinuance Charge (in Rupees) for a non-single premium policy with an Annualized Premium of Rs. 60,000, discontinued during Policy Year 4?'
Individual Policy information - Extraction


[NodeWithScore(node=TextNode(id_='ceb114ae-4c9e-40c9-b26f-55386dd3db2c', embedding=None, metadata={'total_pages': 37, 'file_path': 'Policy+Documents\\HDFC-Life-Smart-Pension-Plan-Policy-Document-Online.pdf', 'source': '19'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='47a11d12-9e1f-471e-a507-ce275e14e61f', node_type='4', metadata={'total_pages': 37, 'file_path': 'Policy+Documents\\HDFC-Life-Smart-Pension-Plan-Policy-Document-Online.pdf', 'source': '19'}, hash='2060434bce13083d7725a7335ec9c7130ad883f0c5ff430e325e787362d220bb'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='77853ac2-8d2b-42ae-9100-3cc1b3f382c6', node_type='1', metadata={'total_pages': 37, 'file_path': 'Policy+Documents\\HDFC-Life-Smart-Pension-Plan-Policy-Document-Online.pdf', 'source': '19'}, hash='5419472e14aab8d361f087cdf5c07674b716f4419e6cb55183138cd9fc354eb3')}, metadata_template='{key}: {value}', metadata_separa



The maximum Policy Discontinuance Charge for a non-single premium policy with an Annualized Premium of Rs. 60,000, discontinued during Policy Year 4 would be Rs. 3,600.


**Question: What is the claim procedure for HDFC Sanchay Plus Life Policy?**

The claim procedure for the HDFC Sanchay Plus Life Policy involves the following steps:

1. **Maturity Benefit**: This will be paid if:
   - The policy has matured and the life assured is alive on the maturity date.
   - No claim has been made on the policy, except for any survival benefit.
   - The policy has not been discontinued, surrendered, cancelled, or terminated.
   - All relevant documents, including the original policy document, have been provided.

2. **Death Benefit**: This will be paid if:
   - The death of the life assured occurs before the maturity date.
   - The standard policy provisions regarding exclusions and incorrect information are not applicable.
   - The policy has not been discontinued, surrendered, cancelled, or terminated.
   - All relevant documents in support of the claim have been provided.

**Basic Documentation Required**:
- For death due to natural causes:
  - Completed claim form (including NEFT details and bank account proof).
  - Original policy document.
  - Original or copy of the death certificate issued by the relevant authority.
  - Claimant’s identity and residence proof.

- For death due to unnatural causes:
  - Completed claim form (including NEFT details and bank account proof).
  - Original policy document.
  - Original or copy of the death certificate.
  - Original or copy of the First Information Report (FIR) and police panchnama report.
  - Original or copy of the postmortem report.

Claims must be intimated within 90 days from the date of death, although delays may be condoned if justified.

In [17]:
#### HuggingFace
## Comparison Q&A
hier_chunk_index_hf, hier_hf_ret, hier_hf_response = insure_assist_RAG_system(
    file_loc='Policy+Documents',
    chunking_type='Hierarchical',
    chunk_size=[1024, 512, 256],
    chunk_overlap=[50, 50, 50],
    embedding_type="HuggingFace",
    api_file_name="C:/Users/SHAMBHAVVISEN/Downloads/OpenAI_API_Key.txt",
    query_str=comparison_question,
    node_p="similar",
    node_type="hier_nodes",
    response_mod='refine',
    cut_off=0.8,
    top_k=10,
)
print("Hierarchical chunking with HuggingFace Embeds\n")
display(hier_chunk_index_hf)
print("\n")
print(hier_hf_response)

Loaded 281
Original Total documents: 281


2025-10-07 00:45:48,437 - INFO - Load pretrained SentenceTransformer: sentence-transformers/all-MiniLM-L6-v2


Documents are chunked based on Hierarchical chunking Strategy with the total nodes of 1987
Using embedding_type & embedding model:model_name='sentence-transformers/all-MiniLM-L6-v2' embed_batch_size=10 callback_manager=<llama_index.core.callbacks.base.CallbackManager object at 0x000001F9D0C16AB0> num_workers=None embeddings_cache=None max_length=256 normalize=True query_instruction=None text_instruction=None cache_folder=None show_progress_bar=False
Accessing from existing index...HuggingFace
Loading llama_index.core.storage.kvstore.simple_kvstore from storage_huggingface_hier_nodes/docstore.json.
Loading llama_index.core.storage.kvstore.simple_kvstore from storage_huggingface_hier_nodes/index_store.json.
Done... & HuggingFace
[CACHE MISS] Processing new query: 'Compare the Critical Illness Benefit (CIB) trigger and claim effect between the HDFC Life Easy Health policy and the HDFC Life Group Poorna Suraksha policy (Accelerated Critical Illness Option)'
[CACHE STORE] Query cached succe

<llama_index.core.indices.vector_store.base.VectorStoreIndex at 0x1f9d774c9d0>



| Policy Name | Feature / Benefit | Details |
|--------------|------------------|----------|
| HDFC Life Easy Health | Critical Illness Benefit (CIB) trigger | Upon diagnosis of a covered critical illness as per policy terms |
| HDFC Life Group Poorna Suraksha (Accelerated Critical Illness Option) | Critical Illness Benefit (CIB) trigger | Payment of benefit results in coverage cessation and expiration of all benefits |



| Policy Name                              | Feature/Benefit                     | Details                                                                                                                                                                                                                     |
|------------------------------------------|-------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| HDFC Life Group Poorna Suraksha         | Suicide Exclusion for Employer-Employee Groups | In employer-employee groups, the full sum assured is payable to the nominee in the event of death due to suicide, regardless of the time frame since joining the scheme.                                                      |
| HDFC Life Group Poorna Suraksha         | Suicide Exclusion for Non-Employer-Employee Groups | For non-employer-employee schemes, if a member dies by suicide within 12 months of joining or reviving the policy, the nominee is entitled to at least 80% of the total premiums paid or the surrender value, whichever is higher. |
| HDFC Life Group Term Life                | Suicide Exclusion for Employer-Employee Groups | Similar to the Poorna Suraksha policy, the full sum assured is payable to the nominee in case of suicide for employer-employee groups, ensuring comprehensive coverage for employees.                                         |
| HDFC Life Group Term Life                | Suicide Exclusion for Non-Employer-Employee Groups | In non-employer-employee groups, the same 12-month limitation applies, where the nominee receives 80% of the premiums paid or the surrender value, emphasizing a reduced benefit compared to employer-employee groups.          |

In [18]:
### Q&A
query_engine_hr_hf, retrieved_nodes_hr_hf, hr_hf_op_retriever, hr_hf_response = ins_func.create_retriever_generate_response(
    hier_chunk_index_hf,
    query_str=single_qna,
    top_k=20, filters=None, node_p="LLM", cut_off=0.8,
    response_mod="tree_summarize",
)
print("Individual Policy information - Extraction")
display(retrieved_nodes_hr_hf)
print("\n")
print(hr_hf_response)

[CACHE MISS] Processing new query: 'In the HDFC Life Smart Pension Plan, what is the maximum Policy Discontinuance Charge (in Rupees) for a non-single premium policy with an Annualized Premium of Rs. 60,000, discontinued during Policy Year 4?'
[CACHE STORE] Query cached successfully: 'In the HDFC Life Smart Pension Plan, what is the maximum Policy Discontinuance Charge (in Rupees) for a non-single premium policy with an Annualized Premium of Rs. 60,000, discontinued during Policy Year 4?'
Individual Policy information - Extraction


[NodeWithScore(node=TextNode(id_='b2a9ae26-cd6a-4932-a47d-0b32b8311834', embedding=None, metadata={'total_pages': 37, 'file_path': 'Policy+Documents\\HDFC-Life-Smart-Pension-Plan-Policy-Document-Online.pdf', 'source': '20'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='a5584680-6c1d-4208-ada3-6b8a752442b1', node_type='4', metadata={'total_pages': 37, 'file_path': 'Policy+Documents\\HDFC-Life-Smart-Pension-Plan-Policy-Document-Online.pdf', 'source': '20'}, hash='11ead82f39e60797cecab80245d512c696f37638cbdfa5e1fb0bf4b0c8a99fd8')}, metadata_template='{key}: {value}', metadata_separator='\n', text='HDFC Life Smart Pension Plan 101L164V02 – Terms and Conditions (Direct & \nOnline Sales) \n(A Unit Linked Non-Participating Individual Pension Plan)  \n \n \nPage 20 of 37  \n \n \n*AP – Annualized Premium \n FV – Fund Value \n \n2) For Single Premium Policy: \n \nWhere the Policy is \ndiscontinued during 



Rs. 3,600


In [19]:
#### Cohere
## Comparison Q&A
hier_chunk_index_ch, hier_ch_ret, hier_ch_response = insure_assist_RAG_system(
    file_loc='Policy+Documents',
    chunking_type='Hierarchical',
    chunk_size=[1024, 512, 256],
    chunk_overlap=[50, 50, 50],
    embedding_type="Cohere",
    api_file_name="C:/Users/SHAMBHAVVISEN/Desktop/Upgrad-Notes/cohere_api_key.txt",
    query_str=comparison_question,
    node_p="similar",
    node_type="hier_nodes",
    response_mod='refine',
    cut_off=0.8,
    top_k=10,
)
print("Hierarchical chunking with Cohere Embeds\n")
display(hier_ch_ret)
print("\n")
print(hier_ch_response)

Loaded 281
Original Total documents: 281
Documents are chunked based on Hierarchical chunking Strategy with the total nodes of 1987
Using embedding_type & embedding model:model_name='embed-english-v3.0' embed_batch_size=10 callback_manager=<llama_index.core.callbacks.base.CallbackManager object at 0x000001F9DFF62AB0> num_workers=None embeddings_cache=None api_key='<REDACTED>' base_url=None truncate='END' input_type=None embedding_type='float'
Accessing from existing index...Cohere
Loading llama_index.core.storage.kvstore.simple_kvstore from storage_cohere_hier_nodes/docstore.json.
Loading llama_index.core.storage.kvstore.simple_kvstore from storage_cohere_hier_nodes/index_store.json.
Done... & Cohere
[CACHE MISS] Processing new query: 'Compare the Critical Illness Benefit (CIB) trigger and claim effect between the HDFC Life Easy Health policy and the HDFC Life Group Poorna Suraksha policy (Accelerated Critical Illness Option)'
[CACHE STORE] Query cached 

[NodeWithScore(node=TextNode(id_='9d6b8642-0a4c-4455-8b47-5fa7f5e4b2b3', embedding=None, metadata={'total_pages': 31, 'file_path': 'Policy+Documents\\HDFC-Life-Group-Poorna-Suraksha-101N137V02-Policy-Document.pdf', 'source': '7'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='8a6ee441-49dc-4e3c-a78b-b905e97ef84a', node_type='4', metadata={'total_pages': 31, 'file_path': 'Policy+Documents\\HDFC-Life-Group-Poorna-Suraksha-101N137V02-Policy-Document.pdf', 'source': '7'}, hash='4eb6e29a0650b88db1a1f72646e503447cc2cac997f9a0c28634729cb3b96da1'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='8cc16594-00b6-4c9a-8cfe-a6abec7e0e9e', node_type='1', metadata={'total_pages': 31, 'file_path': 'Policy+Documents\\HDFC-Life-Group-Poorna-Suraksha-101N137V02-Policy-Document.pdf', 'source': '7'}, hash='9f15833a63617e38974c610659b78affdcc92fe6772d603a4be6a02e6ef1c0cf')}, metadata_template='{key}: {value}



| Policy Name | Feature / Benefit | Details |
|--------------|------------------|----------|
| HDFC Life Easy Health | Critical Illness Benefit (CIB) trigger | Upon diagnosis of a covered critical illness as per policy terms |
| HDFC Life Group Poorna Suraksha (Accelerated Critical Illness Option) | Critical Illness Benefit (CIB) trigger | Payment of benefit results in coverage cessation and expiration of all benefits |


| Policy Name | Feature/Benefit | Details |
| --- | --- | --- |
| HDFC Life Easy Health | Critical Illness Benefit (CIB) trigger | The CIB trigger in HDFC Life Easy Health policy is typically based on the diagnosis of specific critical illnesses as defined in the policy document. |
| HDFC Life Easy Health | Claim Effect | Upon successful claim of the Critical Illness Benefit, the policyholder will receive a lump sum amount to cover medical expenses and other financial needs during the illness period. |
| HDFC Life Group Poorna Suraksha (Accelerated Critical Illness Option) | Critical Illness Benefit (CIB) trigger | The CIB trigger in HDFC Life Group Poorna Suraksha policy (Accelerated Critical Illness Option) is activated upon the diagnosis of specific critical illnesses listed in Annexure IV of the Master Policy. |
| HDFC Life Group Poorna Suraksha (Accelerated Critical Illness Option) | Claim Effect | After the payment of the Accelerated Critical Illness Benefit, the coverage for the Scheme Member ceases, and all benefits under the policy expire. The covered critical illnesses include Myocardial Infarction, Cancer of Specified Severity, Stroke resulting in permanent symptoms, and other critical conditions as specified in the policy document. |

In [20]:
### Q&A
query_engine_hr_ch, retrieved_nodes_hr_ch, hr_ch_op_retriever, hr_ch_response = ins_func.create_retriever_generate_response(
    hier_chunk_index_ch,
    query_str=single_qna,
    top_k=20, filters=None, node_p="LLM", cut_off=0.8,
    response_mod="tree_summarize",
)
print("Individual Policy information - Extraction")
display(retrieved_nodes_hr_ch)
print("\n")
print(hr_ch_response)

[CACHE MISS] Processing new query: 'In the HDFC Life Smart Pension Plan, what is the maximum Policy Discontinuance Charge (in Rupees) for a non-single premium policy with an Annualized Premium of Rs. 60,000, discontinued during Policy Year 4?'
[CACHE STORE] Query cached successfully: 'In the HDFC Life Smart Pension Plan, what is the maximum Policy Discontinuance Charge (in Rupees) for a non-single premium policy with an Annualized Premium of Rs. 60,000, discontinued during Policy Year 4?'
Individual Policy information - Extraction


[NodeWithScore(node=TextNode(id_='2f8e0faa-08ec-45af-b172-6a8c3038990d', embedding=None, metadata={'total_pages': 37, 'file_path': 'Policy+Documents\\HDFC-Life-Smart-Pension-Plan-Policy-Document-Online.pdf', 'source': '20'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='9b1c05ec-af35-4c43-be88-9b91817def07', node_type='4', metadata={'total_pages': 37, 'file_path': 'Policy+Documents\\HDFC-Life-Smart-Pension-Plan-Policy-Document-Online.pdf', 'source': '20'}, hash='11ead82f39e60797cecab80245d512c696f37638cbdfa5e1fb0bf4b0c8a99fd8')}, metadata_template='{key}: {value}', metadata_separator='\n', text='HDFC Life Smart Pension Plan 101L164V02 – Terms and Conditions (Direct & \nOnline Sales) \n(A Unit Linked Non-Participating Individual Pension Plan)  \n \n \nPage 20 of 37  \n \n \n*AP – Annualized Premium \n FV – Fund Value \n \n2) For Single Premium Policy: \n \nWhere the Policy is \ndiscontinued during 



Rs. 3,600


#### Semantic chunking with various embeddings

- OpenAI
- Cohere
- HuggingFace
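
Semantic chunking groups adjacent sentences by meaning instead of by a fixed size: a new chunk starts wherever the similarity between neighbouring sentences drops. The sketch below shows that idea under loud assumptions: `embed` is a toy bag-of-words stand-in for a real embedding model, the fixed `threshold` is an arbitrary illustrative value, the sentences are invented, and none of this reproduces LlamaIndex's semantic splitter.

```python
import math
import re
from collections import Counter


def embed(sentence: str) -> Counter:
    """Toy embedding: bag-of-words counts (a real system would call an embedding model)."""
    return Counter(re.findall(r"[a-z]+", sentence.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def semantic_chunks(sentences, threshold=0.2):
    """Start a new chunk whenever adjacent sentences fall below `threshold` similarity."""
    if not sentences:
        return []
    chunks = [[sentences[0]]]
    for prev, cur in zip(sentences, sentences[1:]):
        if cosine(embed(prev), embed(cur)) < threshold:
            chunks.append([cur])      # topic shift: open a new chunk
        else:
            chunks[-1].append(cur)    # same topic: extend the current chunk
    return chunks


# Invented sentences for illustration; not quoted from the policy documents.
sentences = [
    "The death benefit is the sum assured.",
    "The death benefit is paid to the nominee.",
    "Premiums can be paid monthly or yearly.",
    "Yearly premiums can be paid online.",
]
for chunk in semantic_chunks(sentences):
    print(chunk)
```

Every sentence has to be embedded before any chunk boundary can be decided, which is one reason this strategy tends to be slower to build than fixed-window splitting.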

In [21]:
#### OpenAI
## Comparison Q&A
semantic_chunk_index_op, semantic_op_ret, semantic_op_response = insure_assist_RAG_system(
    file_loc='Policy+Documents',
    chunking_type='semantic',
    chunk_size=512,
    chunk_overlap=50,
    embedding_type="OpenAIEmbedding",
    api_file_name="C:/Users/SHAMBHAVVISEN/Downloads/OpenAI_API_Key.txt",
    query_str=comparison_question,
    node_p="similar",
    node_type="semantic_nodes",
    response_mod='refine',
    cut_off=0.8,
    top_k=10,
)
print("Semantic chunking with OpenAI Embeds\n")
display(semantic_chunk_index_op)
print("\n")
print(semantic_op_response)

Loaded 281
Original Total documents: 281
Documents are chunked based on semantic chunking Strategy with the total nodes of 572
Using embedding_type & embedding model:model_name='text-embedding-3-large' embed_batch_size=100 callback_manager=<llama_index.core.callbacks.base.CallbackManager object at 0x000001F9F209F4D0> num_workers=None embeddings_cache=None additional_kwargs={} api_key='<REDACTED>' api_base='https://api.openai.com/v1' api_version='' max_retries=10 timeout=60.0 default_headers=None reuse_client=True dimensions=None
Accessing from existing index...OpenAIEmbedding
Loading llama_index.core.storage.kvstore.simple_kvstore from storage_openai_semantic_nodes/docstore.json.
Loading llama_index.core.storage.kvstore.simple_kvstore from storage_openai_semantic_nodes/index_store.json.
Done... & OpenAIEmbedding
[CACHE MISS] Process

<llama_index.core.indices.vector_store.base.VectorStoreIndex at 0x1f9efa9c050>



I'm sorry, but I can't provide a comparison between the Critical Illness Benefit (CIB) trigger and claim effect of the HDFC Life Easy Health policy and the HDFC Life Group Poorna Suraksha policy (Accelerated Critical Illness Option).


In [24]:
### Q&A
query_engine_sem_op, retrieved_nodes_sem_op, sem_op_op_retriever, sem_op_response = ins_func.create_retriever_generate_response(
    semantic_chunk_index_op,
    query_str=comparison_question,
    top_k=5, filters=None, node_p="LLM", cut_off=0.8,
    response_mod="refine",
)
print("Individual Policy information - Extraction")
display(retrieved_nodes_sem_op)
print("\n")
print(sem_op_response)

[CACHE MISS] Processing new query: 'Compare the Critical Illness Benefit (CIB) trigger and claim effect between the HDFC Life Easy Health policy and the HDFC Life Group Poorna Suraksha policy (Accelerated Critical Illness Option)'
[CACHE STORE] Query cached successfully: 'Compare the Critical Illness Benefit (CIB) trigger and claim effect between the HDFC Life Easy Health policy and the HDFC Life Group Poorna Suraksha policy (Accelerated Critical Illness Option)'
Individual Policy information - Extraction


[NodeWithScore(node=TextNode(id_='5f2d0e06-7b03-4654-bc15-3f3d0dd55c49', embedding=None, metadata={'total_pages': 31, 'file_path': 'Policy+Documents\\HDFC-Life-Group-Poorna-Suraksha-101N137V02-Policy-Document.pdf', 'source': '7'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='bf086721-129d-4e91-8498-bfc956e86ba5', node_type='4', metadata={'total_pages': 31, 'file_path': 'Policy+Documents\\HDFC-Life-Group-Poorna-Suraksha-101N137V02-Policy-Document.pdf', 'source': '7'}, hash='4eb6e29a0650b88db1a1f72646e503447cc2cac997f9a0c28634729cb3b96da1')}, metadata_template='{key}: {value}', metadata_separator='\n', text='Page 7 of 31 \n \nPart C \n1. \nBenefits: \n \n(1) Benefits on Death or diagnosis of contingency covered –  \n \nPlan Option \nEvents \nBenefit \nLife \nDeath \nIn the event of the death of the Scheme Member, the \nbenefit payable shall be the Sum Assured.  \nExtra Life Option \nDeath \nIn the 



| Policy Name | Feature / Benefit | Details |
|--------------|------------------|----------|
| HDFC Life Easy Health | Critical Illness Benefit (CIB) trigger | Diagnosis of any covered Critical Illness during the Policy Term |
| HDFC Life Easy Health | Critical Illness Benefit (CIB) claim effect | Benefit payable is the Sum Assured and policy will terminate |
| HDFC Life Group Poorna Suraksha (Accelerated Critical Illness Option) | Critical Illness Benefit (CIB) trigger | Diagnosis of any covered Critical Illness during the Coverage Term |
| HDFC Life Group Poorna Suraksha (Accelerated Critical Illness Option) | Critical Illness Benefit (CIB) claim effect | Benefit payable is the Sum Assured and policy will terminate |


| Policy Name | Feature/Benefit | Details |
| --- | --- | --- |
| HDFC Life Easy Health | Critical Illness Benefit (CIB) trigger | - The CIB trigger in the HDFC Life Easy Health policy is the diagnosis of a covered critical illness during the policy term. |
|  | Claim Effect | - Upon diagnosis of a covered critical illness, the benefit payable is the sum assured, and the policy will terminate. |
| HDFC Life Group Poorna Suraksha (Accelerated Critical Illness Option) | Critical Illness Benefit (CIB) trigger | - The CIB trigger in the HDFC Life Group Poorna Suraksha policy with Accelerated Critical Illness Option is the diagnosis of a covered critical illness or undergoing a covered surgery during the coverage term. |
|  | Claim Effect | - Upon diagnosis of a covered critical illness or undergoing a covered surgery, the Accelerated Critical Illness Benefit is payable, and the coverage of the scheme member ceases with all benefits expiring. |

In [22]:
### Q&A
query_engine_sem_op, retrieved_nodes_sem_op, sem_op_op_retriever, sem_op_response = ins_func.create_retriever_generate_response(
    semantic_chunk_index_op,
    query_str=single_qna,
    top_k=5, filters=None, node_p="LLM", cut_off=0.8,
    response_mod="tree_summarize",
)
print("Individual Policy information - Extraction")
display(retrieved_nodes_sem_op)
print("\n")
print(sem_op_response)

[CACHE MISS] Processing new query: 'In the HDFC Life Smart Pension Plan, what is the maximum Policy Discontinuance Charge (in Rupees) for a non-single premium policy with an Annualized Premium of Rs. 60,000, discontinued during Policy Year 4?'
[CACHE STORE] Query cached successfully: 'In the HDFC Life Smart Pension Plan, what is the maximum Policy Discontinuance Charge (in Rupees) for a non-single premium policy with an Annualized Premium of Rs. 60,000, discontinued during Policy Year 4?'
Individual Policy information - Extraction


[NodeWithScore(node=TextNode(id_='bc1e9c53-ac6d-41ea-8e82-8fae6dbd1cc3', embedding=None, metadata={'total_pages': 37, 'file_path': 'Policy+Documents\\HDFC-Life-Smart-Pension-Plan-Policy-Document-Online.pdf', 'source': '20'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='242a37fd-b5a9-42eb-9856-c8a36b5f3b1a', node_type='4', metadata={'total_pages': 37, 'file_path': 'Policy+Documents\\HDFC-Life-Smart-Pension-Plan-Policy-Document-Online.pdf', 'source': '20'}, hash='11ead82f39e60797cecab80245d512c696f37638cbdfa5e1fb0bf4b0c8a99fd8')}, metadata_template='{key}: {value}', metadata_separator='\n', text='HDFC Life Smart Pension Plan 101L164V02 – Terms and Conditions (Direct & \nOnline Sales) \n(A Unit Linked Non-Participating Individual Pension Plan)  \n \n \nPage 20 of 37  \n \n \n*AP – Annualized Premium \n FV – Fund Value \n \n2) For Single Premium Policy: \n \nWhere the Policy is \ndiscontinued during 



For a non-single premium policy with an Annualized Premium of Rs. 60,000 discontinued during Policy Year 4 in the HDFC Life Smart Pension Plan, the maximum Policy Discontinuance Charge would be Rs. 3,600.


### Conclusion


- To summarise, of the three chunking strategies tested (fixed-window, hierarchical, and semantic), hierarchical chunking with OpenAI embeddings and GPT-4o-mini proved the best at extracting relevant information and generating well-structured responses.
- The major challenges were choosing the right chunking strategy and the right LLM for response generation. GPT-3.5-turbo is the cheapest in terms of cost and credit usage, but it was neither a good semantic extractor nor a good response generator. GPT-4o-mini did a better job but exhausted the credit limit quickly, so responses were generated for only some of the questions to stay within budget.
- LlamaIndex keeps changing between versions and decommissioning certain modules, which makes it hard to keep up with.
- Semantic chunking takes much longer than fixed-window chunking and performs worse, so it was not worth testing across different embedding models. I limited it to the OpenAI embedding only, as the embedding strategy does make a significant difference to RAG outputs.