<a href="https://colab.research.google.com/github/gnanimail/GenerativeAI/blob/main/LangChain_RAG_HybridSearch.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [7]:
!pip install -qU langchain Faiss-gpu tiktoken sentence-transformers

In [8]:
!pip install git+https://github.com/huggingface/transformers

Collecting git+https://github.com/huggingface/transformers
  Cloning https://github.com/huggingface/transformers to /tmp/pip-req-build-67s_tjd9
  Running command git clone --filter=blob:none --quiet https://github.com/huggingface/transformers /tmp/pip-req-build-67s_tjd9
  Resolved https://github.com/huggingface/transformers to commit 29e7a1e1834f331a4916853ecd58549ed78235d6
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done


In [9]:
!pip install -qU trl Py7zr auto-gptq optimum

In [10]:
# Base ctransformers with CUDA GPU acceleration
! pip install ctransformers[cuda]



In [11]:
!pip install PyPdf



In [12]:
!pip install rank_bm25



### Load Dataset
### Data parsing using LangChain


In [13]:
from langchain.document_loaders import PyPDFLoader

ros_loader = PyPDFLoader("/content/RSO/ESOP.pdf").load()
print(f"len of documents in :{len(ros_loader)}")

len of documents in :26


Split the document text into overlapping chunks with RecursiveCharacterTextSplitter.

In [14]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1500,
                                      chunk_overlap=100,
                                      length_function=len,)

ros_documents = text_splitter.transform_documents(ros_loader)
print(f"number of chunks in documents : {len(ros_documents)}")

number of chunks in documents : 53
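To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of a fixed-window splitter. It is a simplification: `RecursiveCharacterTextSplitter` also tries to break on separators such as paragraphs and sentences, which this toy version does not attempt. The function name `split_with_overlap` and the sample string are hypothetical.

```python
from typing import List


def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Split text into fixed-size windows that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
```

Each chunk repeats the last three characters of the previous one, which is the same idea as the 100-character overlap used above: text that straddles a chunk boundary still appears intact in at least one chunk.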



### Create Vectorstore

Here we use CacheBackedEmbeddings so that the same text is not re-embedded over and over again.

The chunked documents are embedded and stored in a format that is useful for querying, retrieval, and use in an LLM application. We will use FAISS (Facebook AI Similarity Search) as the vector store.



In [15]:
from langchain.embeddings import CacheBackedEmbeddings,HuggingFaceEmbeddings
from langchain.vectorstores import FAISS
from langchain.storage import LocalFileStore
from langchain.retrievers import BM25Retriever,EnsembleRetriever

In [19]:
store = LocalFileStore("./cache/")

embed_model_id = 'BAAI/bge-small-en-v1.5'
core_embeddings_model = HuggingFaceEmbeddings(model_name=embed_model_id)
embedder = CacheBackedEmbeddings.from_bytes_store(core_embeddings_model,
                                                  store,
                                                  namespace=embed_model_id)
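The idea behind a cache-backed embedder can be sketched in a few lines. This is not LangChain's implementation: `CachedEmbedder`, `toy_embed`, and the in-memory dict standing in for `LocalFileStore` are all hypothetical. The sketch only assumes that vectors are stored under a key derived from the namespace and a hash of the text.

```python
import hashlib


class CachedEmbedder:
    """Hypothetical stand-in that illustrates the cache-backed pattern."""

    def __init__(self, embed_fn, namespace):
        self.embed_fn = embed_fn
        self.namespace = namespace
        self.store = {}   # plays the role of the on-disk byte store
        self.misses = 0   # counts calls to the underlying model

    def _key(self, text):
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        return f"{self.namespace}:{digest}"

    def embed_documents(self, texts):
        vectors = []
        for text in texts:
            key = self._key(text)
            if key not in self.store:
                self.misses += 1
                self.store[key] = self.embed_fn(text)
            vectors.append(self.store[key])
        return vectors


def toy_embed(text):
    # Toy "embedding": [character count, vowel count]
    return [float(len(text)), float(sum(ch in "aeiou" for ch in text))]


cached = CachedEmbedder(toy_embed, namespace="toy-model")
first = cached.embed_documents(["stock options", "vesting schedule"])
second = cached.embed_documents(["stock options", "vesting schedule", "new text"])
print(cached.misses)
```

Only three texts reach the underlying model even though five were requested. Using the model id as the namespace, as the cell above does, keeps vectors from different embedding models from colliding in the same store.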

In [20]:
# Create Vector Store
vectorstore = FAISS.from_documents(ros_documents,embedder)

### Create Sparse Embedding

In [21]:
bm25_retriever = BM25Retriever.from_documents(ros_documents)
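For intuition about what the sparse retriever scores, here is a small pure-Python sketch of Okapi BM25. It is an illustration under assumptions, not the `rank_bm25` code that `BM25Retriever` relies on: the IDF variant, the parameter defaults `k1=1.5` and `b=0.75`, the whitespace tokenisation, and the three toy documents are all choices made for this example.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, corpus_tokens, k1=1.5, b=0.75):
    """Score every tokenised document in corpus_tokens against query_tokens."""
    n_docs = len(corpus_tokens)
    avg_len = sum(len(doc) for doc in corpus_tokens) / n_docs
    doc_freq = Counter()
    for doc in corpus_tokens:
        doc_freq.update(set(doc))  # number of documents containing each term
    scores = []
    for doc in corpus_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            # Smoothed IDF that stays positive (one common variant)
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # Term frequency saturates and is normalised by document length
            denom = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


corpus = [
    "the cto receives an early stage grant".split(),
    "employees vest options over four years".split(),
    "the board approves each option grant".split(),
]
scores = bm25_scores("cto grant".split(), corpus)
best = max(range(len(corpus)), key=scores.__getitem__)
print(best, scores)
```

Because the score depends on exact token overlap, BM25 rewards rare keywords such as a job title, which is what makes it a useful complement to dense embeddings.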

In [22]:
query = "What is the early stage grant for CTO in ESOP allocation?"
embedding_vector = core_embeddings_model.embed_query(query)
len(embedding_vector)

384

In [23]:
# Retrieve the chunks in the vector store that are most similar to the query

documents = vectorstore.similarity_search_by_vector(embedding_vector, k=5)
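Conceptually, the search compares the query vector with every stored chunk vector and keeps the `k` closest. The sketch below does this by brute force with Euclidean (L2) distance. It assumes an exhaustive flat index and uses made-up two-dimensional vectors; the function name `top_k_by_l2` is hypothetical, and a real FAISS index performs the same comparison far more efficiently.

```python
import math


def top_k_by_l2(query_vec, doc_vecs, k):
    """Return (index, distance) pairs for the k vectors closest to query_vec."""
    distances = [
        (i, math.sqrt(sum((q - d) ** 2 for q, d in zip(query_vec, vec))))
        for i, vec in enumerate(doc_vecs)
    ]
    return sorted(distances, key=lambda pair: pair[1])[:k]


doc_vecs = [[0.0, 1.0], [1.0, 0.0], [0.9, 0.1], [-1.0, 0.0]]
hits = top_k_by_l2([1.0, 0.0], doc_vecs, k=2)
print(hits)
```

The returned indices play the role of the retrieved documents: the smaller the distance, the more similar the chunk is to the query.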

In [24]:
for page in documents:
  print(page.page_content)
  print("\n")

10.20 • Pension and 401(k)Plan Overview and Update
BOT-00046.000\77202.doc4. Companies whose futures depend solely upon the efforts of one or two
essentialindividuals;
5. Capital intensive companies typically do not generate substantial
payroll to take full advantage of the ESOP contributions twenty-five
percentrule; and
6. Uncontrolled high growth companies typically have difficulties with
cash flowand debt service.
C. TheBenefits Availableto a Company Contemplating an ESOP Include:
1. Aretirement plan foremployees;
2. A vehicle to boost employee moral and motivation by offering
employees an equity position whereby they participate and share in the
rewards oftheirown hard work;
3. The company can improve cash flow with pre-tax loan principal and
interestrepayments;
4. In a closely held corporation, the current owner has the option to
rollover the proceeds from the sale of his stock to the ESOP into the
stock marketand deferpaying tax on any gains;
5. The current owner may still mainta

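Time a query embedding plus similarity search, as a reference point for how much the CacheBackedEmbeddings pattern can save us. Note that this cell calls core_embeddings_model directly, so it measures the uncached path.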

In [25]:
%%timeit -n 1 -r 1
query = "What is the early stage grant for CTO in ESOP allocation?"

embedding_vector = core_embeddings_model.embed_query(query)
documents_cache = vectorstore.similarity_search_by_vector(embedding_vector,k=5)

59.4 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)


### Setup Ensemble Retriever (Hybrid Search)

In [26]:
faiss_retriever = vectorstore.as_retriever(search_kwargs={"k":5})

hybrid_retriever = EnsembleRetriever(retrievers=[bm25_retriever, faiss_retriever],
                                       weights=[0.5,0.5])
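One way to picture how two ranked lists are merged is weighted Reciprocal Rank Fusion, which to my understanding is the approach EnsembleRetriever takes. The sketch below is an illustration rather than LangChain's code: the function name `weighted_rrf`, the constant `c=60`, and the chunk ids are assumptions made for this example.

```python
from collections import defaultdict


def weighted_rrf(ranked_lists, weights, c=60):
    """Fuse ranked lists of document ids; each hit adds weight / (rank + c)."""
    scores = defaultdict(float)
    for ranking, weight in zip(ranked_lists, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (rank + c)
    return sorted(scores, key=scores.get, reverse=True)


bm25_ranking = ["chunk_a", "chunk_b", "chunk_c"]   # hypothetical sparse results
faiss_ranking = ["chunk_b", "chunk_d", "chunk_a"]  # hypothetical dense results
fused = weighted_rrf([bm25_ranking, faiss_ranking], weights=[0.5, 0.5])
print(fused)
```

A chunk that both retrievers rank highly ends up ahead of a chunk that only one of them found. Raising one weight above 0.5 would shift the balance towards that retriever.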

### Build a retrieval chain

Initialize LLM using a quantized GPTQ Model

In [24]:
!pip install tensorflow==2.14.1

Collecting tensorflow==2.14.1
  Downloading tensorflow-2.14.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (489.9 MB)
Collecting tensorboard<2.15,>=2.14 (from tensorflow==2.14.1)
  Downloading tensorboard-2.14.1-py3-none-any.whl (5.5 MB)
Collecting tensorflow-estimator<2.15,>=2.14.0 (from tensorflow==2.14.1)
  Downloading tensorflow_estimator-2.14.0-py2.py3-none-any.whl (440 kB)
Collecting keras<2.15,>=2.14.0 (from tensorflow==2.14.1)
  Downloading keras-2.14.0-py3-none-any.whl (1.7 MB)

In [1]:
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_name_or_path = "TheBloke/Mistral-7B-Instruct-v0.1-GPTQ"
# To use a different branch, change revision
# For example: revision="gptq-4bit-32g-actorder_True"
model = AutoModelForCausalLM.from_pretrained(model_name_or_path,
                                             device_map="auto",
                                             trust_remote_code=False,
                                             revision="gptq-8bit-32g-actorder_True")

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)

config.json:   0%|          | 0.00/962 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/8.17G [00:00<?, ?B/s]

generation_config.json:   0%|          | 0.00/116 [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/1.46k [00:00<?, ?B/s]

tokenizer.model:   0%|          | 0.00/493k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/1.80M [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/72.0 [00:00<?, ?B/s]

In [2]:
!pip install torch
import torch



In [32]:
pipe = pipeline("text-generation",
                model=model,
                tokenizer=tokenizer,
                max_new_tokens=512,
                do_sample=True,
                temperature=0.1,
                top_p=0.95,
                top_k=40,
                repetition_penalty=1.1,
                torch_dtype=torch.float32
            )
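The sampling settings passed to the pipeline (`temperature`, `top_k`, `top_p`) can be illustrated with a simplified pure-Python sketch. It is not the Transformers implementation: the function name `filter_distribution` and the toy logits are hypothetical, and the repetition penalty is left out. The sketch only shows how a low temperature such as 0.1 sharpens the distribution before the top-k and top-p cut-offs are applied.

```python
import math


def filter_distribution(logits, temperature, top_k, top_p):
    """Return {token_index: probability} after temperature, top-k, then top-p."""
    scaled = [x / temperature for x in logits]
    # top-k: keep the k highest-scoring tokens, best first
    order = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]
    peak = scaled[order[0]]
    exps = {i: math.exp(scaled[i] - peak) for i in order}  # stable softmax
    total = sum(exps.values())
    probs = {i: e / total for i, e in exps.items()}
    # top-p: keep the smallest prefix whose cumulative probability reaches top_p
    kept, cumulative = {}, 0.0
    for i in order:
        kept[i] = probs[i]
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    norm = sum(kept.values())
    return {i: p / norm for i, p in kept.items()}


logits = [2.0, 1.0, 0.5, -1.0]  # toy scores for a four-token vocabulary
sharp = filter_distribution(logits, temperature=0.1, top_k=40, top_p=0.95)
flat = filter_distribution(logits, temperature=1.0, top_k=40, top_p=0.95)
print(sharp, flat)
```

With temperature 0.1 only the top token survives the top-p filter, so generation is close to greedy decoding even though `do_sample=True`. With temperature 1.0 three candidates remain in this toy example.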

In [33]:
from langchain.llms import HuggingFacePipeline

llm = HuggingFacePipeline(pipeline=pipe)

### Caching

In [34]:
import langchain
from langchain.cache import InMemoryCache

langchain.llm_cache = InMemoryCache()
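An in-memory LLM cache is essentially a dictionary keyed on the exact prompt and the LLM configuration. The sketch below illustrates that behaviour with hypothetical names (`ExactMatchCache`, `slow_llm`, `cached_generate`). It is a simplified model of the pattern, not LangChain's code.

```python
class ExactMatchCache:
    """Toy cache keyed on (prompt, llm_string)."""

    def __init__(self):
        self._cache = {}

    def lookup(self, prompt, llm_string):
        return self._cache.get((prompt, llm_string))

    def update(self, prompt, llm_string, response):
        self._cache[(prompt, llm_string)] = response


call_log = []  # records every prompt that actually reaches the model


def slow_llm(prompt):
    call_log.append(prompt)
    return f"answer to: {prompt}"


def cached_generate(cache, prompt, llm_string="toy-llm"):
    hit = cache.lookup(prompt, llm_string)
    if hit is not None:
        return hit  # served from the cache, no model call
    response = slow_llm(prompt)
    cache.update(prompt, llm_string, response)
    return response


cache = ExactMatchCache()
cached_generate(cache, "What is the document about?")
cached_generate(cache, "What is the document about?")
cached_generate(cache, "Please summarize what is the document about?")
print(len(call_log))
```

The repeated prompt is answered from the cache, while the rephrased one still reaches the model. In a retrieval chain the prompt also contains the retrieved context, so a cache hit requires both the context and the question to be identical.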

### Setup Retrieval chain - without Hybrid Search

In [35]:
from langchain.chains import RetrievalQA
from langchain.callbacks import StdOutCallbackHandler

handler = StdOutCallbackHandler()

qa_with_sources_chain = RetrievalQA.from_chain_type(llm=llm,
                                                    chain_type="stuff",
                                                    retriever = vectorstore.as_retriever(search_kwargs={"k":5}),
                                                    callbacks=[handler],
                                                    return_source_documents=True
                                                   )
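The "stuff" chain type simply places all retrieved chunks into a single prompt. A minimal sketch of that assembly step follows. The template text and the function name `build_stuff_prompt` are hypothetical and do not reproduce LangChain's actual prompt.

```python
def build_stuff_prompt(question, chunks):
    """Concatenate the retrieved chunks and the question into one prompt string."""
    context = "\n\n".join(chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


prompt = build_stuff_prompt(
    "What is an ESOP?",
    ["chunk one text", "chunk two text"],  # stand-ins for retrieved chunks
)
print(prompt)
```

With `k=5` and chunks of up to 1500 characters, the stuffed context can reach roughly 7500 characters, so the chunk size and `k` need to fit within the model's context window.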


In [36]:
%%time
query = "What is the early stage grant for CTO in ESOP allocation?"
response = qa_with_sources_chain({"query":query})
print(f"Response generated : \n {response['result']}")
print(f"Source Documents : \n {response['source_documents']}")



> Entering new RetrievalQA chain...


RuntimeError: ignored

In [36]:
%%time
query = "What is the early stage grant for Key Developer or Engineer in ESOP allocation?"
response = qa_with_sources_chain({"query":query})
print(f"Response generated : \n {response['result']}")
print(f"Source Documents : \n {response['source_documents']}")



> Entering new RetrievalQA chain...


RuntimeError: ignored

In [None]:
%%time
query = "What are the segments in human resources?"
response = qa_with_sources_chain({"query":query})
print(f"Response generated : \n {response['result']}")
print(f"Source Documents : \n {response['source_documents']}")

In [None]:
%%time
query = "What is the early stage grant for Developer or Engineer in ESOP allocation?"
response = qa_with_sources_chain({"query":query})
print(f"Response generated : \n {response['result']}")
print(f"Source Documents : \n {response['source_documents']}")

In [None]:
%%time
query = "What is the early stage grant for non founding member of senior team in ESOP allocation?"
response = qa_with_sources_chain({"query":query})
print(f"Response generated : \n {response['result']}")
print(f"Source Documents : \n {response['source_documents']}")

In [None]:
%%time
query = "What is the early stage grant for Functional Team member of senior team in ESOP allocation?"
response = qa_with_sources_chain({"query":query})
print(f"Response generated : \n {response['result']}")
print(f"Source Documents : \n {response['source_documents']}")

In [None]:
%%time
query = "What is the document about?"
response = qa_with_sources_chain({"query":query})
print(f"Response generated : \n {response['result']}")
print(f"Source Documents : \n {response['source_documents']}")
print(f"Number of Documents returned : {len(response['source_documents'])}")

### LLM caching is applied when the full prompt (retrieved context plus question) is the same.


In [None]:
%%time
query = "Please summarize what is the document about?"
response = qa_with_sources_chain({"query":query})
print(f"Response generated : \n {response['result']}")
print(f"Source Documents : \n {response['source_documents']}")
print(f"Number of Documents returned : {len(response['source_documents'])}")

In [None]:
%%time
query ="How to Determine the Dollar Value of the Options Grant?"
response = qa_with_sources_chain({"query":query})
print(f"Response generated : \n {response['result']}")
print(f"Source Documents : \n {response['source_documents']}")
print(f"Number of Documents returned : {len(response['source_documents'])}")

In [None]:
%%time
query ="How to Determine the Dollar Value of the Options Grant?Please give the formulae."
response = qa_with_sources_chain({"query":query})
print(f"Response generated : \n {response['result']}")
print(f"Source Documents : \n {response['source_documents']}")
print(f"Number of Documents returned : {len(response['source_documents'])}")

### Setup Retrieval chain - with Hybrid Search

In [None]:
# Ensemble retriever: same chain setup as before, but retrieval goes
# through hybrid_retriever (BM25 + FAISS) built in the hybrid search section
from langchain.chains import RetrievalQA
from langchain.callbacks import StdOutCallbackHandler

handler = StdOutCallbackHandler()

qa_with_sources_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=hybrid_retriever,
    callbacks=[handler],
    return_source_documents=True
)

In [None]:
%%time
query = "Who are the different stakeholders for granting ESOPs and their respective share percentage according to the context in the document?"
response = qa_with_sources_chain({"query":query})
print(f"Response generated : \n {response['result']}")
print(f"Source Documents : \n {response['source_documents']}")

In [None]:
%%time
query = "What is the early stage grant for Key Developer or Engineer in ESOP allocation?"
response = qa_with_sources_chain({"query":query})
print(f"Response generated : \n {response['result']}")
print(f"Source Documents : \n {response['source_documents']}")

In [None]:
%%time
query = "What is the early stage grant for Developer or Engineer in ESOP allocation?"
response = qa_with_sources_chain({"query":query})
print(f"Response generated : \n {response['result']}")
print(f"Source Documents : \n {response['source_documents']}")

In [None]:
%%time
query = "What is the early stage grant for CTO in ESOP allocation?"
response = qa_with_sources_chain({"query":query})
print(f"Response generated : \n {response['result']}")
print(f"Source Documents : \n {response['source_documents']}")

In [None]:
%%time
query = "What is the early stage grant for non founding member of senior team in ESOP allocation?"
response = qa_with_sources_chain({"query":query})
print(f"Response generated : \n {response['result']}")
print(f"Source Documents : \n {response['source_documents']}")

In [None]:
%%time
query = "What is the early stage grant that no non founding member of senior team in ESOP should be allocated?"
response = qa_with_sources_chain({"query":query})
print(f"Response generated : \n {response['result']}")
print(f"Source Documents : \n {response['source_documents']}")

In [None]:
%%time
query = "How to Determine the Dollar Value of the Options Grant?Please give the formulae."
response = qa_with_sources_chain({"query":query})
print(f"Response generated : \n {response['result']}")
print(f"Source Documents : \n {response['source_documents']}")

In [None]:
%%time
query = "How to Determine the Dollar Value of the Options Grant?Please give the formulae."
response = qa_with_sources_chain({"query":query})
print(f"Response generated : \n {response['result']}")
print(f"Source Documents : \n {response['source_documents']}")

In [None]:
%%time
query ="What is the document about?"
response = qa_with_sources_chain({"query":query})
print(f"Response generated : \n {response['result']}")
print(f"Source Documents : \n {response['source_documents']}")
print(f"Number of Documents returned : {len(response['source_documents'])}")

In [None]:
%%time
query ="What is the difference between Retention grants and Discretionary Grants?"
response = qa_with_sources_chain({"query":query})
print(f"Response generated : \n {response['result']}")
print(f"Source Documents : \n {response['source_documents']}")
print(f"Number of Documents returned : {len(response['source_documents'])}")

In [None]:
%%time
query = "What are the Social Impact Considerations for Esops?"
response = qa_with_sources_chain({"query":query})
print(f"Response generated : \n {response['result']}")
print(f"Source Documents : \n {response['source_documents']}")
print(f"Number of Documents returned : {len(response['source_documents'])}")

In [None]:
query = "according to Fred Wilson what is the necessary part of Capital Structure?"
response = qa_with_sources_chain({"query":query})
print(f"Response generated : \n {response['result']}")
print(f"Source Documents : \n {response['source_documents']}")
print(f"Number of Documents returned : {len(response['source_documents'])}")

In [None]:
query = "According to Steven Johnson what is the difference between Silicon Valley companies and other companies?"
response = qa_with_sources_chain({"query":query})
print(f"Response generated : \n {response['result']}")
print(f"Source Documents : \n {response['source_documents']}")
print(f"Number of Documents returned : {len(response['source_documents'])}")

In [None]:
query = "What is Fred Wilson's opinion about ESOPs?"
response = qa_with_sources_chain({"query":query})
print(f"Response generated : \n {response['result']}")
print(f"Source Documents : \n {response['source_documents']}")
print(f"Number of Documents returned : {len(response['source_documents'])}")

In [None]:
query = "Describe Top Down Process?"
response = qa_with_sources_chain({"query":query})
print(f"Response generated : \n {response['result']}")
print(f"Source Documents : \n {response['source_documents']}")
print(f"Number of Documents returned : {len(response['source_documents'])}")

In [None]:
%%time
query = "What is the early stage grant for Functional Team member of senior team in ESOP allocation?"
response = qa_with_sources_chain({"query":query})
print(f"Response generated : \n {response['result']}")
print(f"Source Documents : \n {response['source_documents']}")