<a href="https://colab.research.google.com/github/ash-rulz/RAG/blob/main/RAG_Example1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# RAG based QA using Langchain

The exercise below is from a [post](https://www.linkedin.com/pulse/get-insight-from-your-business-data-build-llm-application-jain/) by Ashish Jain. Full credit goes to him.

We use the FLAN-T5 model in this exercise. Its predecessor is the T5 model, which originated from the paper Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (see this [video](https://www.youtube.com/watch?v=91iLu6OOrwk) for more details on the paper, or a more detailed video [here](https://www.youtube.com/watch?v=Axo0EtMUK90)). The FLAN-T5 model is based on the paper Scaling Instruction-Finetuned Language Models (see [video](https://www.youtube.com/watch?v=SHMsdAPo2Ls)). FLAN-T5 is T5 with instruction fine-tuning applied; FLAN stands for Fine-tuned LAnguage Net.

Next, we use [all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2), which maps sentences to a 384-dimensional vector space. These sentence embeddings are stored in a FAISS vector store.

The relevant documents are retrieved using *RetrievalQA* from *langchain*.

# Pre-requisites
Create a folder called example_data and place the PDF to be stored in the vector database in that folder. The code below expects the file at example_data/LetterToIndustry.pdf.

# Step1: Split the PDF

In [1]:
!pip install langchain
!pip install pypdf

Collecting langchain
  Downloading langchain-0.0.352-py3-none-any.whl (794 kB)
Collecting dataclasses-json<0.7,>=0.5.7 (from langchain)
  Downloading dataclasses_json-0.6.3-py3-none-any.whl (28 kB)
Collecting jsonpatch<2.0,>=1.33 (from langchain)
  Downloading jsonpatch-1.33-py2.py3-none-any.whl (12 kB)
Collecting langchain-community<0.1,>=0.0.2 (from langchain)
  Downloading langchain_community-0.0.6-py3-none-any.whl (1.5 MB)
Collecting langchain-core<0.2,>=0.1 (from langchain)
  Downloading langchain_core-0.1.3-py3-none-any.whl (192 kB)
Collecting langsmith<0.1.0,>=0.0.70 (from langchain)
  Downloading langsmith-0.

In [3]:
#Load the pdf to memory
from langchain.document_loaders import PyPDFLoader
pdfLoader = PyPDFLoader("example_data/LetterToIndustry.pdf")
documents = pdfLoader.load()

In [4]:
len(documents)

2

In [5]:
#Split the file to chunks
from langchain.text_splitter import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100)
docs = text_splitter.split_documents(documents)

In [6]:
len(docs)

9
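
To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a minimal pure-Python sketch of overlapping character windows. It is a simplification and not how `RecursiveCharacterTextSplitter` works internally (the real splitter also tries to break on separators such as paragraphs and newlines); the helper name `chunk_text` and the sample text are assumptions made for illustration.

```python
def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 100) -> list[str]:
    """Split text into fixed-size character windows that overlap (simplified sketch)."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghij" * 120  # 1200 characters of stand-in text
chunks = chunk_text(sample)
print(len(chunks), [len(c) for c in chunks])
```

Each chunk repeats the last 100 characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk.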

# Step2: Create vector store
Here we use all-MiniLM-L6-v2 to create the sentence embeddings, which are then stored in the [FAISS](https://python.langchain.com/docs/integrations/vectorstores/faiss) vector store.

In [7]:
!pip install -U sentence-transformers

Collecting sentence-transformers
  Downloading sentence-transformers-2.2.2.tar.gz (85 kB)
  Preparing metadata (setup.py) ... done
Collecting sentencepiece (from sentence-transformers)
  Downloading sentencepiece-0.1.99-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
Building wheels for collected packages: sentence-transformers
  Building wheel for sentence-transformers (setup.py) ... done
  Created wheel for sentence-transformers: filename=sentence_transformers-2.2.2-py3-none-any.whl size=125923 sha256=b26ff28f45988d101a63a3671a191ac8aa4007e7ad9f13bf57e2d6f61af3b51f
  Stored in directory: /root/.cache/pip/wheels/62/f2/10/1e606fd5f02395388f74e7462910fe851042f97238cbbd902f
Successfully built sentence-tr

In [8]:
from langchain.embeddings import HuggingFaceEmbeddings

model_path = "sentence-transformers/all-MiniLM-L6-v2"
model_kwargs = {'device': 'cpu'}
encode_kwargs = {'normalize_embeddings': False}

embeddings = HuggingFaceEmbeddings(
    model_name=model_path,
    model_kwargs=model_kwargs,
    encode_kwargs=encode_kwargs
)
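
The `normalize_embeddings` flag controls whether each embedding is scaled to unit length. The sketch below uses plain Python lists in place of real 384-dimensional embeddings (the vectors and helper names are assumptions for illustration) to show what normalization does, and that for unit-length vectors the squared L2 distance equals `2 - 2 * cosine similarity`.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def dot(a, b):
    """Dot product; for unit vectors this is the cosine similarity."""
    return sum(x * y for x, y in zip(a, b))


u = l2_normalize([3.0, 4.0])
v = l2_normalize([4.0, 3.0])
print(u, v)
print(squared_l2(u, v), 2 - 2 * dot(u, v))
```

With normalization switched off, as in this notebook, distances also depend on vector length, not only on direction.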

In [9]:
!pip install faiss-gpu

Collecting faiss-gpu
  Downloading faiss_gpu-1.7.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (85.5 MB)
Installing collected packages: faiss-gpu
Successfully installed faiss-gpu-1.7.2


In [10]:
#Create vector store
from langchain.vectorstores import FAISS
db = FAISS.from_documents(docs, embeddings)

In [12]:
#Example of documents retrieved from the vector DB for a question
question = "How many credits does the thesis comprise of?"
searchDocs = db.similarity_search_with_score(question)
print(searchDocs[0])

(Document(page_content='• The thesis  comprises  30 credit  points,  i.e. 100%  studies  for a semester  \n(the academic year in Sweden is divided in two semesters).  \n• The thesis  is organized  as a course  and runs  between  fixed  dates  \n(Jan - June  for the spring  semester  or Sept -Jan for the fall semester).  \n• The thesis  is individual  work  and the student  has to write  a \nsingle -authored master thesis report.  \n• The proposed project needs to be accepted  by the course \nexaminer.', metadata={'source': 'example_data/LetterToIndustry.pdf', 'page': 0}), 0.6723114)
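
The second element of the tuple above is the score. With the default FAISS setup in langchain this should be an L2 distance, so a lower score means a closer match. As a rough illustration of the ranking idea only, here is a brute-force search over a tiny hand-made store; the 3-dimensional vectors, the labels and the name `toy_search` are all made up for this sketch and are not part of FAISS or langchain.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def toy_search(query_vec, store, k=4):
    """Return the k (text, distance) pairs closest to query_vec, nearest first."""
    scored = [(text, squared_l2(query_vec, vec)) for text, vec in store]
    scored.sort(key=lambda pair: pair[1])  # smaller distance = better match
    return scored[:k]


# Hypothetical stand-in store: (chunk text, embedding) pairs
store = [
    ("thesis credits", [0.9, 0.1, 0.0]),
    ("thesis dates", [0.1, 0.9, 0.0]),
    ("course examiner", [0.0, 0.2, 0.9]),
]
query_vec = [1.0, 0.0, 0.0]  # stand-in for the embedded question

results = toy_search(query_vec, store, k=2)
for text, distance in results:
    print(text, round(distance, 2))
```

FAISS performs the same kind of nearest-neighbour lookup, but with index structures designed to scale to far larger collections.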


# Step3: Create the generator
Here we use the FLAN-T5 model, which is fine-tuned on many tasks, including QA.


In [24]:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline
from langchain.llms import HuggingFacePipeline

# Load the FLAN-T5 tokenizer and model from the Hugging Face Hub
model_name_flan = "google/flan-t5-large"
tokenizer = AutoTokenizer.from_pretrained(model_name_flan)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name_flan)

# Build a text2text-generation pipeline; max_new_tokens caps the answer length
pipe = pipeline("text2text-generation", model=model, tokenizer=tokenizer, max_new_tokens=200)

# Wrap the pipeline so that langchain chains can call it as an LLM
llm = HuggingFacePipeline(
    pipeline=pipe,
    model_kwargs={"temperature": 0, "max_length": 1000000},
)
# In the original post, the author loads the model from a local download instead.

# Step4: Create the prompt template
At inference time, the query is sent to the vector store, which returns the chunks most similar to it. These chunks become the context in the prompt template.

In [14]:
from langchain.prompts import PromptTemplate

template = """Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer. Keep the answer as concise as possible.
{context}
Question: {question}
Helpful Answer:"""
QA_CHAIN_PROMPT = PromptTemplate.from_template(template)
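
To see what the model actually receives, the placeholders can be filled by hand. The sketch below uses plain `str.format` on a shortened copy of the template, which should behave the same as the default f-string style of `PromptTemplate` for simple placeholders like these; the context and question strings are stand-ins taken from the example PDF.

```python
# Shortened copy of the template above, for illustration only
template = """Use the following pieces of context to answer the question at the end.
{context}
Question: {question}
Helpful Answer:"""

# Stand-in values; in the real chain the context comes from the vector store
context = "The thesis comprises 30 credit points."
question = "How many credits does the thesis comprise of?"

prompt_text = template.format(context=context, question=question)
print(prompt_text)
```

The prompt ends with "Helpful Answer:" so that the model continues the text with its answer.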

# Step5: Create a retriever
We can use a RetrievalQA chain from langchain for this. See [this](https://docs.smith.langchain.com/cookbook/hub-examples/retrieval-qa-chain).

In [25]:
from langchain.chains import RetrievalQA

def getAnswer(question):
    # Build a RetrievalQA chain that "stuffs" the retrieved chunks into the prompt
    qa_chain = RetrievalQA.from_chain_type(
        llm=llm,
        chain_type="stuff",
        retriever=db.as_retriever(),
        chain_type_kwargs={"prompt": QA_CHAIN_PROMPT},
    )
    result = qa_chain({"query": question})
    return result["result"]
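
Conceptually, a "stuff" chain retrieves the relevant chunks, concatenates them into one context string, fills the prompt and calls the model once. The sketch below mirrors that flow with hypothetical stand-ins: `fake_retrieve` and `fake_generate` replace the vector store and FLAN-T5, and joining chunks with a blank line is an assumption of this sketch rather than a statement about langchain internals.

```python
from typing import Callable, List


def answer_with_stuffing(
    question: str,
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str], str],
    template: str,
) -> str:
    """Retrieve chunks, stuff them all into one prompt, and call the model once."""
    chunks = retrieve(question)
    context = "\n\n".join(chunks)  # every retrieved chunk goes into the prompt
    prompt = template.format(context=context, question=question)
    return generate(prompt)


seen_prompts = []  # records what the stand-in model was asked


def fake_retrieve(question: str) -> List[str]:
    # Stand-in for the vector store lookup
    return [
        "The thesis comprises 30 credit points.",
        "The proposed project needs to be accepted by the course examiner.",
    ]


def fake_generate(prompt: str) -> str:
    # Stand-in for the language model: remembers the prompt, returns a fixed answer
    seen_prompts.append(prompt)
    return "30"


template = "{context}\nQuestion: {question}\nHelpful Answer:"
answer = answer_with_stuffing(
    "How many credits does the thesis comprise of?",
    fake_retrieve,
    fake_generate,
    template,
)
print(answer)
```

Because every retrieved chunk lands in a single prompt, this approach only works while the combined context fits within the model's input limit.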

In [26]:
print(getAnswer(question))

30


In [27]:
print(getAnswer("Who will aprrove the thesis?"))

The proposed project needs to be accepted by the course examiner.


In [28]:
print(getAnswer("When does the thesis run, specify the exact dates?"))

Jan - June for the spring semester or Sept - Jan for the fall semester


# Notes

1. While this works for simple PDF documents, complicated PDF documents gave very poor results.

