# POC - EPFO Question and Answer System
## Account (UAN) (creation, documents required, claims), KYC (procedure, update)
This is an end-to-end LLM project built with Google's Gemini model and LangChain. It implements a question-and-answer system for the EPFO (Employees' Provident Fund Organisation), one of the world's largest social security organisations in terms of clientele and the volume of financial transactions undertaken. The system answers questions about the account (UAN) (creation, documents required, claims), KYC (procedure, update), and related topics using a Google large language model.

## Project Architecture:
1. **CSV loading :** The `CSVLoader` from the LangChain document loaders loads the CSV file of questions and answers.
2. **Database questions embedding :** Each question-and-answer entry from the CSV file is embedded using <u>Hugging Face embeddings</u>.
3. **Vector Database :** The embeddings and their corresponding entries are stored using <u>FAISS</u>.
4. **Creating a retrieval chain :** A retrieval chain is built from a <u>prompt template</u> and the <u>Google Gemini API</u>.

## Output:
The output is an answer to the input question. The following happens in the background:
1. The retrieval chain searches the vector database for entries whose questions are similar to the question asked.
2. The answers belonging to the entries found in step 1 are passed to the Google LLM, which composes the final response from them.
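
The retrieval step can be illustrated with a toy sketch that needs no external libraries. It is not the method used in this notebook: the bag-of-words `embed` function is a deliberately crude stand-in for the Hugging Face sentence embeddings, and the sorted list is a stand-in for the FAISS index. The names `embed`, `cosine`, and `retrieve` are hypothetical; the three FAQ entries are taken from the outputs shown later in this notebook.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lower-cased word counts (stand-in for a sentence embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, faq: dict, k: int = 2) -> list:
    """Return the k (question, answer) pairs whose questions are most similar to the query."""
    q_vec = embed(query)
    ranked = sorted(faq.items(), key=lambda qa: cosine(q_vec, embed(qa[0])), reverse=True)
    return ranked[:k]


faq = {
    "What is to be done in case I change the job and join somewhere else":
        "You need to simply declare your UAN with your subsequent employer.",
    "I have changed my job. Should I activate my UAN again":
        "UAN has to be activated only once. You do not have to re-activate it every time you switch jobs.",
    "What are the minimum details which are required to be linked with UAN for availing online services":
        "Mobile, Aadhar and Bank account number.",
}

for question, answer in retrieve("Should I activate my UAN again after a job change", faq):
    print(question, "->", answer)
```

Word overlap misses paraphrases that a sentence-embedding model would catch, which is the reason the notebook uses learned embeddings rather than a scheme like this one.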

## Setting up the API keys

In [1]:
import os
from dotenv import load_dotenv
load_dotenv()

True

In [2]:
os.environ["GOOGLE_API_KEY"]=os.getenv("google_api_key")
os.environ["LANGCHAIN_API_KEY"]=os.getenv("langchain_api_key")
os.environ["LANGCHAIN_TRACING_V2"]="true"
os.environ["LANGCHAIN_PROJECT"]=os.getenv("langchain_project")
os.environ["GROQ_API_KEY"]=os.getenv("groq_api_key")
os.environ["HF_TOKEN"] = os.getenv('huggingface_access_token')
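
Assigning `os.getenv(...)` directly into `os.environ` raises a `TypeError` when the variable is absent, because the value is `None`. The sketch below is one possible way to fail with a clearer message instead. The helper name `export_keys` and the `KEY_MAP` dictionary are hypothetical; the key names themselves are copied from the cell above.

```python
import os

# Lower-case names as stored in the .env file, mapped to the
# upper-case names that the libraries read.
KEY_MAP = {
    "google_api_key": "GOOGLE_API_KEY",
    "langchain_api_key": "LANGCHAIN_API_KEY",
    "langchain_project": "LANGCHAIN_PROJECT",
    "groq_api_key": "GROQ_API_KEY",
    "huggingface_access_token": "HF_TOKEN",
}


def export_keys(key_map, env=None):
    """Copy each source variable to its target name, or report every missing one at once."""
    env = os.environ if env is None else env
    missing = [src for src in key_map if not env.get(src)]
    if missing:
        raise KeyError("Missing entries in .env: " + ", ".join(missing))
    for src, dst in key_map.items():
        env[dst] = env[src]
    return env
```

In the notebook this would be called as `export_keys(KEY_MAP)` after `load_dotenv()`; passing a plain dictionary as `env` makes the helper easy to try without touching the real environment.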

## CSV Loading

In [3]:
from langchain_community.document_loaders import CSVLoader

In [4]:
# Load the data from the EPFO FAQs
# loader = CSVLoader(file_path='C:\Swapnil\GenerativeAI\Practice_1\EPFO_Chatbot_Project\EPFO_FAQs.csv', encoding='unicode_escape', source_column="Question ")
loader = CSVLoader(file_path='EPFO_FAQs.csv', encoding='unicode_escape', source_column="Question ")

# Store the loaded data in the 'data' variable
data = loader.load()

# Keep only the first 41 rows, which hold the actual questions
data = data[:41]

# Inspect the loaded documents
data

[Document(metadata={'source': 'What is Universal Account Number or UAN', 'row': 0}, page_content='Question: What is Universal Account Number or UAN\nAnswer: UAN is 12-digit number provided to each member of EPFO. The UAN acts as an umbrella for the multiple Member IDs allotted to an individual. This number acts as a pivot to link multiple Member Identification Numbers (Member Id) allotted to a single member under single Universal Account Number. UAN duly seeded with KYC detail. This enables the member to avail various online services directly without the need for any intermediation by the employer.'),
 Document(metadata={'source': 'What is KYC', 'row': 1}, page_content='Question: What is KYC\nAnswer: Know Your Customer or KYC is a one-time process which helps in identity verification of subscribers by linking UAN with KYC details. The Employees / Employers need to provide KYC details viz., Aadhaar, PAN, Bank etc., for unique identification of the employees enabling seamless online serv
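
The shape of the documents above (a `page_content` of `Column: value` lines plus `source` and `row` metadata) can be approximated with the standard library alone. The sketch below is an approximation for illustration, not the `CSVLoader` implementation. It assumes the CSV header is `Question ,Answer` (the trailing space is inferred from `source_column="Question "`), and it assumes the surplus rows removed by `data[:41]` are empty, so they can be skipped instead of sliced off. The function name `rows_to_documents` is hypothetical.

```python
import csv
import io


def rows_to_documents(csv_text, source_column):
    """Turn CSV rows into dicts shaped like the documents shown above, skipping empty rows."""
    docs = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_index, row in enumerate(reader):
        # Ignore overflow cells (key None) and treat missing cells as empty strings.
        cells = {k: (v or "").strip() for k, v in row.items() if k is not None}
        if not any(cells.values()):
            continue  # assumed: trailing rows without any text carry no question
        content = "\n".join(f"{k.strip()}: {v}" for k, v in cells.items())
        docs.append({
            "page_content": content,
            "metadata": {"source": cells[source_column], "row": row_index},
        })
    return docs


sample = (
    "Question ,Answer\n"
    "What is to be done in case I change the job and join somewhere else,"
    "You need to simply declare your UAN with your subsequent employer.\n"
    ",\n"
)
docs = rows_to_documents(sample, source_column="Question ")
print(docs[0]["page_content"])
```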

## Creating a vector database and question embedding

In [5]:
from langchain_huggingface import HuggingFaceEmbeddings

In [6]:
embeddings = HuggingFaceEmbeddings(model_name="all-mpnet-base-v2")


In [7]:
from langchain_community.vectorstores import FAISS

In [8]:
# Create a FAISS instance for vector database from 'data'
vectordb = FAISS.from_documents(documents=data,embedding=embeddings)

# Create a retriever for querying the vector database
# retriever = vectordb.as_retriever(search_type="similarity_score_threshold", search_kwargs={"score_threshold": 0.7})
retriever = vectordb.as_retriever()

In [9]:
# Sample question and the matching entries retrieved from the vector database
rdocs = retriever.invoke("What should I do if I change my job")
rdocs

[Document(id='7cc36465-97d9-4841-a7c0-a8df7c2f798f', metadata={'source': 'What is to be done in case I change the job and join somewhere else', 'row': 33}, page_content='Question: What is to be done in case I change the job and join somewhere else\nAnswer: You need to simply declare your UAN with your subsequent employer.'),
 Document(id='2056cfd9-edf7-4adc-8fbc-44b8a8f672c7', metadata={'source': 'I have changed my job. Should I activate my UAN again', 'row': 16}, page_content='Question: I have changed my job. Should I activate my UAN again\nAnswer: UAN has to be activated only once. You do not have to re-activate it every time you switch jobs.'),
 Document(id='9ad77fa6-1701-4f2f-9867-1817950243a5', metadata={'source': 'My employer erroneously entered wrong PAN and Bank account details', 'row': 26}, page_content='Question: My employer erroneously entered wrong PAN and Bank account details\nAnswer: You can login to your member portal and update the PAN and Bank account details yourself.

In [10]:
rdocs = retriever.invoke("What is the procedure to change the password and can i link two mobile phones to a single account")
rdocs

[Document(id='5c99d05b-728d-4c38-883f-966974b10e6b', metadata={'source': 'What to do if I forgot my password and my registered mobile with UAN has also changed', 'row': 32}, page_content='Question: What to do if I forgot my password and my registered mobile with UAN has also changed\nAnswer: Please click on \x93Forgot Password\x94 at Member Interface of Unified Portal. Provide your UAN with CAPTCHA. System will ask whether OTP is to be sent on registered mobile or some other mobile. System will ask to enter your basic details (Name, DOB and Gender). After successful matching of basic details system will ask to provide your Aadhar or PAN. If KYC details are matched system will ask new mobile number and OTP will be sent to the new mobile. After successful verification of OTP, you can reset your password.'),
 Document(id='30796077-8b1d-4b4d-95b8-5c500af58a93', metadata={'source': 'How to change my UAN linked mobile number', 'row': 28}, page_content='Question: How to change my UAN linked m
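
The `\x93` and `\x94` characters in the output above are consistent with Windows-1252 curly quotes that were read through the `unicode_escape` codec. This is an assumption about the CSV file, which is not shown here. The sketch below demonstrates the difference on a short byte string and shows one way to repair text that has already been loaded; the helper name `repair` is hypothetical.

```python
# Bytes as they would appear in a Windows-1252 encoded file (assumed encoding).
raw = b"Please click on \x93Forgot Password\x94 at Member Interface"

# unicode_escape maps byte 0x93 to the control character U+0093.
as_escape = raw.decode("unicode_escape")

# cp1252 maps the same byte to the left curly quote U+201C.
as_cp1252 = raw.decode("cp1252")


def repair(text: str) -> str:
    """Re-interpret Latin-1 code points as Windows-1252 bytes.

    Raises UnicodeError if the text contains characters outside Latin-1
    or bytes that Windows-1252 leaves undefined.
    """
    return text.encode("latin-1").decode("cp1252")


print(repr(as_escape))
print(as_cp1252)
print(repair(as_escape) == as_cp1252)
```

If the assumption holds, passing `encoding='cp1252'` to `CSVLoader` would be worth trying in place of `'unicode_escape'`.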

In [11]:
rdocs = retriever.invoke("How to link an AADHAR with UAN")
rdocs

[Document(id='483461e5-b868-4a30-851d-19a04f1db0dd', metadata={'source': 'What can I do if my UAN is not seeded with Aadhaar', 'row': 10}, page_content='Question: What can I do if my UAN is not seeded with Aadhaar\nAnswer: Member can himself seed UAN with Aadhaar by visiting member portal. Thereafter the employer must approve the same to complete the linkage. Alternatively, member can ask his employer to link Aadhaar with UAN. The member can use \x93e-KYC Portal\x94 under Online Service available on home page of EPFO website or e-KYC service under EPFO in UMANG APP to link his/her UAN with Aadhaar without employer\x92s intervention.'),
 Document(id='6e8a9261-02d6-4ed0-9a66-5a82d82b5732', metadata={'source': 'What are the minimum details which are required to be linked with UAN for availing online services', 'row': 21}, page_content='Question: What are the minimum details which are required to be linked with UAN for availing online services\nAnswer: Mobile, Aadhar and Bank account numbe

## Create RetrievalQA chain along with prompt template

In [12]:
from langchain.prompts import PromptTemplate
from langchain.chains import RetrievalQA
from langchain_google_genai import ChatGoogleGenerativeAI

In [13]:
llm = ChatGoogleGenerativeAI(
    model="gemini-2.0-flash-001",
    temperature=0.3,
    max_retries=2
)

In [19]:
prompt_template = """Given the following context and a question, generate an answer based on this context only.
In the answer try to provide as much text as possible from "Answer" section in the source document context without making much changes.
If the answer is not found in the context, kindly state "I don't know." Don't try to make up an answer.

CONTEXT: {context}

QUESTION: {question}"""


PROMPT = PromptTemplate(
    template=prompt_template, input_variables=["context", "question"]
)
chain_type_kwargs = {"prompt": PROMPT}


chain = RetrievalQA.from_chain_type(llm=llm,
                            chain_type="stuff",
                            retriever=retriever,
                            input_key="query",
                            return_source_documents=True,
                            chain_type_kwargs=chain_type_kwargs)
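
With `chain_type="stuff"`, the retrieved documents are placed together into the `{context}` slot of the prompt before the LLM is called. The sketch below imitates that step with plain string formatting so the final prompt can be inspected. It is a simplified illustration rather than LangChain's code: the function name `stuff_prompt` is hypothetical, and the blank-line separator between documents is an assumption about the default.

```python
PROMPT_TEMPLATE = """Given the following context and a question, generate an answer based on this context only.
In the answer try to provide as much text as possible from "Answer" section in the source document context without making much changes.
If the answer is not found in the context, kindly state "I don't know." Don't try to make up an answer.

CONTEXT: {context}

QUESTION: {question}"""


def stuff_prompt(template, page_contents, question, separator="\n\n"):
    """Join all retrieved page contents into one context block and fill the template."""
    context = separator.join(page_contents)
    return template.format(context=context, question=question)


retrieved = [
    "Question: What is to be done in case I change the job and join somewhere else\n"
    "Answer: You need to simply declare your UAN with your subsequent employer.",
    "Question: I have changed my job. Should I activate my UAN again\n"
    "Answer: UAN has to be activated only once. You do not have to re-activate it every time you switch jobs.",
]

print(stuff_prompt(PROMPT_TEMPLATE, retrieved, "What should I do if I change my job"))
```

Because every retrieved entry lands in a single prompt, this approach suits short FAQ entries like these; long documents could exceed the model's context window.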

In [20]:
chain.invoke({"query": "What should I do if I change my job"})

{'query': 'What should I do if I change my job',
 'result': 'UAN has to be activated only once. You do not have to re-activate it every time you switch jobs.\nYou need to simply declare your UAN with your subsequent employer.',
 'source_documents': [Document(id='7cc36465-97d9-4841-a7c0-a8df7c2f798f', metadata={'source': 'What is to be done in case I change the job and join somewhere else', 'row': 33}, page_content='Question: What is to be done in case I change the job and join somewhere else\nAnswer: You need to simply declare your UAN with your subsequent employer.'),
  Document(id='2056cfd9-edf7-4adc-8fbc-44b8a8f672c7', metadata={'source': 'I have changed my job. Should I activate my UAN again', 'row': 16}, page_content='Question: I have changed my job. Should I activate my UAN again\nAnswer: UAN has to be activated only once. You do not have to re-activate it every time you switch jobs.'),
  Document(id='9ad77fa6-1701-4f2f-9867-1817950243a5', metadata={'source': 'My employer erroneo

In [16]:
chain.invoke({"query": "What is the procedure to change the password"})

{'query': 'What is the procedure to change the password',
 'result': 'Please click on \x93Forgot Password\x94 at Member Interface of Unified Portal. Provide your UAN with CAPTCHA. System will send the OTP on your mobile which is seeded with UAN and you can reset the password.',
 'source_documents': [Document(id='0d57ed5e-bab8-468b-9c34-b6d345f0205e', metadata={'source': 'What to do if I forgot my password', 'row': 31}, page_content='Question: What to do if I forgot my password\nAnswer: Please click on \x93Forgot Password\x94 at Member Interface of Unified Portal. Provide your UAN with CAPTCHA. System will send the OTP on your mobile which is seeded with UAN and you can reset the password.'),
  Document(id='5c99d05b-728d-4c38-883f-966974b10e6b', metadata={'source': 'What to do if I forgot my password and my registered mobile with UAN has also changed', 'row': 32}, page_content='Question: What to do if I forgot my password and my registered mobile with UAN has also changed\nAnswer: Pleas

In [17]:
chain.invoke({"query": "How to link an AADHAR with UAN"})

{'query': 'How to link an AADHAR with UAN',
 'result': 'Member can himself seed UAN with Aadhaar by visiting member portal. Thereafter the employer must approve the same to complete the linkage. Alternatively, member can ask his employer to link Aadhaar with UAN. The member can use \x93e-KYC Portal\x94 under Online Service available on home page of EPFO website or e-KYC service under EPFO in UMANG APP to link his/her UAN with Aadhaar without employer\x92s intervention.',
 'source_documents': [Document(id='483461e5-b868-4a30-851d-19a04f1db0dd', metadata={'source': 'What can I do if my UAN is not seeded with Aadhaar', 'row': 10}, page_content='Question: What can I do if my UAN is not seeded with Aadhaar\nAnswer: Member can himself seed UAN with Aadhaar by visiting member portal. Thereafter the employer must approve the same to complete the linkage. Alternatively, member can ask his employer to link Aadhaar with UAN. The member can use \x93e-KYC Portal\x94 under Online Service available o

In [21]:
chain.invoke({"query": "What if I purchase a mobile phone, do I need to create an account"})

{'query': 'What if I purchase a mobile phone, do I need to create an account',
 'result': "I don't know.",
 'source_documents': [Document(id='c2fc07d2-5e0a-470e-8839-16679d8a9fe2', metadata={'source': 'Can I apply online claim if my mobile is not linked with Aadhaar', 'row': 29}, page_content='Question: Can I apply online claim if my mobile is not linked with Aadhaar\nAnswer: No, you cannot submit online claim if your mobile is not linked with Aadhaar. At the time of claim submission, OTP is sent to Aadhaar linked mobile only.'),
  Document(id='6e8a9261-02d6-4ed0-9a66-5a82d82b5732', metadata={'source': 'What are the minimum details which are required to be linked with UAN for availing online services', 'row': 21}, page_content='Question: What are the minimum details which are required to be linked with UAN for availing online services\nAnswer: Mobile, Aadhar and Bank account number.'),
  Document(id='055f5d83-c9f0-4370-98dc-521ce29bacb6', metadata={'source': 'Can one mobile number be l

## Observations:
1. For each question asked, similar questions were found in the vector database.
2. Several similar questions were retrieved for each question asked, and the LLM composed the final response from their answers.
3. For a question unrelated to the underlying question-and-answer document, the retrieval chain answers "I don't know."
