# <font color='blue'>Index</font>
- <font color='blue'>1. Install required libraries</font>
- <font color='blue'>2. Import required libraries</font>
- <font color='blue'>3. Initialize OpenAI with Key</font>
- <font color='blue'>4. Loading PDF as document using SimpleDirectoryReader</font>
- <font color='blue'>5. Parsing document into nodes</font>
- <font color='blue'>6. Generating vector index for the nodes</font>
- <font color='blue'>7. Initialize Query engine</font>
- <font color='blue'>8. Method to retrieve and generate response with context</font>
- <font color='blue'>9. Testing</font>
    - <font color='blue'>9.1 Case #1</font>
    - <font color='blue'>9.2 Case #2</font>
    - <font color='blue'>9.3 Case #3</font>
    - <font color='blue'>9.4 Case #4</font>

## <font color='blue'>1. Install required libraries</font>

In [1]:
# !pip install pandas
# !pip install llama-index
# !pip install openai
# !pip install tqdm

## <font color='blue'>2. Import required libraries</font>

In [2]:
import openai
import pandas as pd
from llama_index.core import Document, SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.node_parser import SimpleNodeParser
from tqdm import tqdm

## <font color='blue'>3. Initialize OpenAI with Key</font>

In [3]:
# Set your OpenAI API key (read from a local file; strip() drops the trailing newline)
with open("openai_api_key.txt", "r") as f:
    openai.api_key = f.read().strip()
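A common alternative to a key file is an environment variable. The sketch below is only an illustration: `load_api_key` is a hypothetical helper, not part of the notebook or of the `openai` package, and it assumes the key is stored either in `OPENAI_API_KEY` or in a one-line text file.

```python
import os
from pathlib import Path


def load_api_key(env_var="OPENAI_API_KEY", key_file="openai_api_key.txt"):
    """Return the key from the environment if set, else from a one-line text file."""
    key = os.environ.get(env_var, "").strip()
    if key:
        return key
    path = Path(key_file)
    if path.is_file():
        # strip() removes the trailing newline most editors append
        return path.read_text(encoding="utf-8").strip()
    raise RuntimeError(f"No API key found in ${env_var} or {key_file}")
```

The result could then be assigned with `openai.api_key = load_api_key()`.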

## <font color='blue'>4. Loading PDF as document using SimpleDirectoryReader</font>

In [4]:
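# Read every PDF in the current directory
reader = SimpleDirectoryReader(input_dir="./", required_exts=[".pdf"])

# load_data() typically yields one Document per PDF page, so the count can exceed the number of files
documents = reader.load_data()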
print(f"Loaded {len(documents)} docs")

Loaded 30 docs


## <font color='blue'>5. Parsing document into nodes</font>

In [5]:
# Initialize the parser and parse the documents into nodes
parser = SimpleNodeParser.from_defaults()

# Using tqdm to show progress for document parsing
nodes = []
for doc in tqdm(documents, desc="Parsing documents"):
    nodes.extend(parser.get_nodes_from_documents([doc]))

Parsing documents: 100%|██████████| 30/30 [00:00<00:00, 251.23it/s]
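To give a feel for what "parsing into nodes" means, here is a minimal chunking sketch. It is not the algorithm `SimpleNodeParser` uses (that parser works on sentences and tokens); `chunk_words` is a hypothetical helper that splits on whitespace and only illustrates the idea of fixed-size chunks with overlap.

```python
def chunk_words(text, chunk_size=8, overlap=2):
    """Split text into windows of `chunk_size` words; consecutive windows share `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

The overlap keeps a sentence that straddles a chunk boundary at least partly visible in both neighbouring chunks.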


## <font color='blue'>6. Generating vector index for the nodes</font>

In [6]:
# Build the vector index: each node is embedded and stored for similarity search
index = VectorStoreIndex(nodes)
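Conceptually, a vector index answers a query by comparing the query's embedding with each node's embedding and returning the closest ones. The sketch below shows that idea with cosine similarity on hand-written toy vectors. The vectors, node names, and the `top_k` helper are all made up for illustration; a real index gets its vectors from an embedding model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, node_vecs, k=2):
    """Return (node_id, score) pairs for the k vectors most similar to the query."""
    scored = [(node_id, cosine_similarity(query_vec, vec)) for node_id, vec in node_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy 3-dimensional "embeddings" (hypothetical values)
toy_vectors = {
    "surrender_value": [0.9, 0.1, 0.0],
    "death_benefit": [0.1, 0.9, 0.2],
    "definitions": [0.0, 0.2, 0.9],
}

print(top_k([0.2, 0.8, 0.1], toy_vectors, k=2))
```

With these toy values the query vector points mostly along the second axis, so `death_benefit` ranks first.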

## <font color='blue'>7. Initialize Query engine</font>

In [7]:
# Create the query engine from the index
query_engine = index.as_query_engine()

## <font color='blue'>8. Method to retrieve and generate response with context</font>

In [8]:
# Initialize conversation context
conversation_context = ""

def retrieve_and_generate(query, initial=True):
    global conversation_context

    # Query the index. Note that query() also synthesizes its own answer,
    # but only the retrieved source nodes are used below
    retrieved_docs = query_engine.query(query)

    # Combine the text of the retrieved source nodes
    context = "\n".join(node.text for node in retrieved_docs.source_nodes if hasattr(node, 'text'))

    # Update conversation context
    if initial:
        conversation_context = ""
    conversation_context += f"\nUser: {query}\nDocuments: {context}"
    
    # Use OpenAI's gpt-3.5-turbo to generate a response based on the context
    messages = [
        {"role": "system", "content": "Only answer from the context provided. If the context is not relevant, say 'I don't know'."},
        {"role": "user", "content": f"Context: {conversation_context}\n\nAnswer:"},
    ]

    response = openai.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=messages,
        max_tokens=150,  # caps the answer length; longer answers are cut off
    )
    answer = response.choices[0].message.content.strip()

    # Update conversation context with the LLM's response
    conversation_context += f"\nAI: {answer}"

    return answer
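Because every follow-up appends the query, the retrieved text, and the answer to one global string, the prompt grows with each turn. One possible way to bound it, sketched here under assumptions, is to keep only the most recent turns. `ConversationContext` and `max_turns` are hypothetical names, not part of the notebook; the rendered format mirrors the `User:` / `Documents:` / `AI:` layout used above.

```python
from collections import deque


class ConversationContext:
    """Keeps only the most recent turns so the prompt does not grow without bound."""

    def __init__(self, max_turns=3):
        # A deque with maxlen silently drops the oldest turn when full
        self.turns = deque(maxlen=max_turns)

    def reset(self):
        """Forget all stored turns (equivalent to starting a new conversation)."""
        self.turns.clear()

    def add_turn(self, query, documents, answer):
        self.turns.append({"query": query, "documents": documents, "answer": answer})

    def render(self):
        """Format the stored turns in the same layout as the notebook's context string."""
        return "".join(
            f"\nUser: {t['query']}\nDocuments: {t['documents']}\nAI: {t['answer']}"
            for t in self.turns
        )
```

An object like this could replace the `global` string: call `reset()` for an initial query, `render()` when building the prompt, and `add_turn()` once the answer arrives.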

## <font color='blue'>9. Testing</font>

### <font color='blue'>9.1 Case #1</font>

In [9]:
initial_query = "What is this document talking about?"
initial_answer = retrieve_and_generate(initial_query)
print("\nQuestion:", initial_query)
print("Answer:", initial_answer)

follow_up_query = "Can you list the policy types for this document?"
follow_up_answer = retrieve_and_generate(follow_up_query, False)
print("\nQuestion:", follow_up_query)
print("Answer:", follow_up_answer)


Question: What is this document talking about?
Answer: The document is a policy document for HDFC Life Group Term Life mentioning benefits and terms and conditions. It states that there is no Surrender Value payable under this Policy.

Question: Can you list the policy types for this document?
Answer: The document is a policy document for HDFC Life Group Term Life and there is no Surrender Value payable under this Policy.


### <font color='blue'>9.2 Case #2</font>

In [10]:
initial_query = "What are the benefits payable under this policy?"
initial_answer = retrieve_and_generate(initial_query)
print("\nQuestion:", initial_query)
print("Answer:", initial_answer)

follow_up_query = "Who are the beneficiaries?"
follow_up_answer = retrieve_and_generate(follow_up_query, False)
print("\nQuestion:", follow_up_query)
print("Answer:", follow_up_answer)

follow_up_query = "What is the mode of payment?"
follow_up_answer = retrieve_and_generate(follow_up_query, False)
print("\nQuestion:", follow_up_query)
print("Answer:", follow_up_answer)


Question: What are the benefits payable under this policy?
Answer: The benefits payable under this policy include the Sum Assured in case of the death of the insured member, payment to the beneficiary in case of accidental death, and payment to the nominee designated by the insured member or the legal heir. Additionally, the mode of payment of benefits is specified in the policy.

Question: Who are the beneficiaries?
Answer: The beneficiaries under this policy can be the nominee or legal heir of the insured member.

Question: What is the mode of payment?
Answer: All benefits and other sums under this policy shall be payable in the manner and currency allowed/permitted under the Regulations and shall be payable by permissible modes.


### <font color='blue'>9.3 Case #3</font>

In [11]:
initial_query = "List all 'active' related definitions"
initial_answer = retrieve_and_generate(initial_query)
print("\nQuestion:", initial_query)
print("Answer:", initial_answer)

follow_up_query = "Is Active and Death related"
follow_up_answer = retrieve_and_generate(follow_up_query, False)
print("\nQuestion:", follow_up_query)
print("Answer:", follow_up_answer)


Question: List all 'active' related definitions
Answer: The 'Active Service,' 'Active Member,' 'Active Service Certificate,' 'Active Member Declaration,' 'Actively at Work,' and 'Member shall be Actively at Work' are related definitions in the provided context.

Question: Is Active and Death related
Answer: No, 'Active Service,' 'Active Member,' 'Active Service Certificate,' 'Active Member Declaration,' 'Actively at Work,' and 'Member shall be Actively at Work' are related definitions in the provided context.


### <font color='blue'>9.4 Case #4</font>

In [19]:
initial_query = "Frame 5 questions basing on the document"
initial_answer = retrieve_and_generate(initial_query)
print("\nQuestion:", initial_query)
print("Answer:", initial_answer)

follow_up_query = "Sort those questions basing on expectancy and get me top 3"
follow_up_answer = retrieve_and_generate(follow_up_query, False)
print("\nQuestion:", follow_up_query)
print("Answer:", follow_up_answer)


Question: Frame 5 questions basing on the document
Answer: 1. What is the time limit specified in Section 45 of the Insurance Act, 1938, within which a policy of Life Insurance cannot be called into question?
2. Under what circumstances can a Policy of Life Insurance be called into question on the ground of fraud within 3 years?
3. What constitutes fraud according to Section 45 of the Insurance Act, 1938, regarding Life Insurance Policy?
4. When can a Life insurance Policy be called into question within 3 years due to a misstatement or suppression of a material fact?
5. What action should the insurer take if a Policy of Life Insurance is repudiated on the ground of misstatement and not fraud?

Question: Sort those questions basing on expectancy and get me top 3
Answer: 1. What is the time limit specified in Section 45 of the Insurance Act, 1938, within which a policy of Life Insurance cannot be called into question?
2. When can a Life insurance Policy be called into question within 3 