<a href="https://colab.research.google.com/github/Aditi0712/PDFchat_langchain/blob/main/geekrabittask_(2).ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Insurance Question Answering System Using LangChain and Open Source LLMs

Created By: Aditi Sharma






Step 1: Installing the required Libraries

In [64]:
!pip install -q accelerate einops bitsandbytes kaleido openai python-multipart tiktoken cohere langchain pypdf faiss-cpu torch chromadb transformers sentence-transformers

In [65]:
!pip install -q --upgrade tensorflow

Step 2: Importing the required modules

In [165]:
from langchain_community.document_loaders import PyPDFLoader
import transformers
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain_community.embeddings.sentence_transformer import (
    SentenceTransformerEmbeddings,
)
from langchain_community.vectorstores import Chroma
import torch
from langchain import HuggingFaceHub
from langchain.chains.question_answering import load_qa_chain

Step 3: PDF Ingestion using langchain's PyPDFLoader

In [68]:
loader = PyPDFLoader("/content/insurance-industry-in-canada.pdf")
documents=loader.load()

Step 4: Document Preparation by splitting into manageable chunks

In [69]:
text_splitter=CharacterTextSplitter(
    separator='\n',
    chunk_size=1000,
    chunk_overlap=200,
)
texts=text_splitter.split_documents(documents)

Step 4: Creating embeddings

In [70]:
embeddings = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")

Step 5: Storing the embedding in the vectorstore

In [83]:
vectorstore= Chroma.from_documents(texts,embeddings)

Step 6: Integrating the open source llm

In [85]:
from google.colab import userdata
hg_token=userdata.get('Huggingface')

In [180]:
model='tiiuae/falcon-7b-instruct'
llm=HuggingFaceHub(huggingfacehub_api_token=hg_token, repo_id=model, model_kwargs={"temperature":0.3, "max_new_token":2000})



Step 7: Perform similarity search in vectorstore for contextual resemblance

In [181]:
#docs=vectorstore.similarity_search(query, k=7)

 Step 8: Creating a q&a chain using langchain

In [182]:
chain=load_qa_chain(llm=llm, chain_type="stuff")

Testing:

In [186]:
samples = [
    "What are the key regulations affecting insurance companies in Canada?",
    "How has technology impacted the insurance industry?",
    "How many life insurance companies were in 1985 in Canada?",
    "What policy issues are addressed in the section on industrial development, and how do they impact the insurance industry?"
]

for question in samples:
    # Perform similarity search and run the question-answering chain
    docs = vectorstore.similarity_search(question, k=7)
    result = chain.run(input_documents=docs, question=question)
    # Evaluate and document results
    print(f"Question: {question}")
    print(f"Answer: {result}")
    print("\n---\n")


Question: What are the key regulations affecting insurance companies in Canada?
Answer:  The key regulations affecting insurance companies in Canada are:
- Financial Services Act
- Insurance Companies Act
- Life Insurance Act
- Health and Disability Insurance Act
- Motor Vehicle Act
- Financial Services Act
- Competition Act
- Consumer Protection Act
- Privacy Act
- Canadian Human Rights Act
- Canadian Environmental Protection Act
- Canada Labour Code
- Canadian Pension Plan Act
- Canada Health Act
- Canada Education Savings Plan Act
- Canada Retirement Savings Plan Act
- Canada

---

Question: How has technology impacted the insurance industry?
Answer:  Technology has had a significant impact on the insurance industry. It has enabled insurers to streamline their operations and reduce their costs, as well as to offer new products and services to their customers. Technology has also enabled insurers to better manage their risk and to better identify and target customers who are more lik