In [1]:
# Step 0: Load the OpenAI API key from a local .env file
import os
import sys
import openai
sys.path.append('../..')

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

openai.api_key  = os.environ['OPENAI_API_KEY']

In [2]:
# Step 1: Load the document with PyPDFLoader (other document loaders are available).

from langchain.document_loaders import PyPDFLoader
loader = PyPDFLoader("gita_english_full.pdf")
pages = loader.load()
print(len(pages))

1051


In [4]:
# Step 2: Split the text into chunks that can be embedded and stored

from langchain.text_splitter import CharacterTextSplitter
text_splitter = CharacterTextSplitter(
    chunk_size = 1500,
    chunk_overlap = 0
)

splits = text_splitter.split_documents(pages)
len(splits)
print(splits[0])

page_content='"Bhagavad-gita As It Is " \nby His Divine Grace A.C. Bh aktivedanta Swami Prabhupada. \n \nCOPYRIGHT NOTICE:  \n \nThis is an evaluation copy  of the printed version of this book, and is NOT FOR \nRESALE . This evaluation copy is intended  for personal non-commercial use only, \nunder the "fair use" guidelines establishe d by international copyright laws. You \nmay use this electronic file to evaluate the printed version of this book, for your \nown private use, or for short excerpts us ed in academic works,  research, student \npapers, presentations, and the like. You ca n distribute this evaluation copy to \nothers over the Internet, so long as you keep  this copyright information intact and \ndo not add or subtract anything to this f ile and its contents. You may not reproduce \nmore than ten percent (10%) of this book in any medium without the express \nwritten permission from the copyright holders. \n \nReference any excerpts in the follo wing way: “Excerpted from “S
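
The splitting step above can be pictured with a small pure-Python sketch. This is a simplified illustration of separator-based chunking, not LangChain's actual implementation: the helper name `split_text` and the toy input are assumptions made for the example, and the blank-line separator is only assumed here as a typical default.

```python
def split_text(text, chunk_size=1500, separator="\n\n"):
    """Greedily pack separator-delimited pieces into chunks of at most chunk_size characters.

    Hypothetical helper for illustration; a single piece longer than
    chunk_size is kept whole rather than cut mid-piece.
    """
    chunks, current = [], ""
    for piece in text.split(separator):
        if not piece:
            continue  # skip empty pieces produced by repeated separators
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size or not current:
            current = candidate      # the piece still fits (or is the first one)
        else:
            chunks.append(current)   # close the current chunk, start a new one
            current = piece
    if current:
        chunks.append(current)
    return chunks


sample = "aaaa\n\nbbbb\n\ncccc"
chunks = split_text(sample, chunk_size=10)
print(chunks)
```

With `chunk_overlap = 0`, as in the cell above, neighbouring chunks share no text, so a sentence that straddles a chunk boundary is only ever seen in two separate halves.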

In [5]:
# Step 3: Embed the chunks and persist them to disk with Chroma (other vector databases would also work).

from langchain.embeddings import SentenceTransformerEmbeddings
from langchain.vectorstores import Chroma
embedding = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")

vectordb = Chroma.from_documents(
    documents = splits,
    embedding = embedding,
    persist_directory="./chroma_db"
)

In [9]:
# Reload the persisted vector store on later runs, so the chunks do not have to be embedded and written to the DB every time.
from langchain.embeddings import SentenceTransformerEmbeddings
from langchain.vectorstores import Chroma
vectordb = Chroma(
    embedding_function = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2"),
    persist_directory="./chroma_db"
)

ImportError: Could not import sentence_transformers python package. Please install it with `pip install sentence_transformers`.

In [6]:
!pip install sentence_transformers



In [7]:
question = "what are things that human should not involve in"

In [8]:
# Run a maximal marginal relevance (MMR) search against the vector store
docs = vectordb.max_marginal_relevance_search(question,k=3)
print(docs)

NameError: name 'vectordb' is not defined
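
For readers unfamiliar with maximal marginal relevance, the sketch below shows the selection rule on toy 2-D vectors. It is an illustration under stated assumptions, not the code that Chroma or LangChain runs: the helpers `cosine` and `mmr`, the toy vectors, and the trade-off value `lambda_mult=0.5` are all chosen for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mmr(query_vec, doc_vecs, k=3, lambda_mult=0.5):
    """Return the indices of k documents chosen by maximal marginal relevance.

    Each step picks the candidate that maximises
    lambda_mult * relevance_to_query - (1 - lambda_mult) * similarity_to_already_selected.
    """
    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        best_idx, best_score = None, -math.inf
        for i in candidates:
            relevance = cosine(query_vec, doc_vecs[i])
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        candidates.remove(best_idx)
    return selected


# Toy embeddings: documents 0 and 1 are near-duplicates, document 2 is different.
doc_vecs = [[1.0, 0.0], [0.99, 0.05], [0.6, 0.8]]
picked = mmr([0.9, 0.3], doc_vecs, k=2)
print(picked)
```

A plain similarity search would return the two near-duplicates (indices 1 and 0); the redundancy penalty makes MMR return index 1 followed by the more distinct index 2.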

In [7]:
# Step 4: Apply contextual compression to improve the quality of the retrieved text
from langchain.llms import OpenAI
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainExtractor

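# Create the LLM and the extractor that compresses each retrieved document
llm = OpenAI(temperature=0)
compressor = LLMChainExtractor.from_llm(llm)

# Wrap the vector store retriever with the compressor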
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor,
    base_retriever=vectordb.as_retriever()
)
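
The idea behind a compression retriever can be sketched without any LLM: each retrieved chunk is passed through an extractor that keeps only the parts relevant to the question, and chunks with nothing relevant are dropped. The helpers `compress_documents` and `keyword_extract` below are hypothetical stand-ins; in the cell above the extraction is done by an LLM call per document, not by keyword matching.

```python
def compress_documents(extract, docs, question):
    """Apply `extract` to every document and drop documents with nothing left."""
    compressed = []
    for doc in docs:
        extracted = extract(question, doc).strip()
        if extracted:
            compressed.append(extracted)
    return compressed


def keyword_extract(question, doc):
    """Toy extractor (assumption): keep sentences sharing a word with the question."""
    keywords = {word.lower() for word in question.split()}
    kept = [
        sentence.strip()
        for sentence in doc.split(".")
        if keywords & {word.lower() for word in sentence.split()}
    ]
    return ". ".join(kept)


retrieved = [
    "The self does nothing. Rain falls in June.",
    "Prices rose last year.",
]
compressed = compress_documents(keyword_extract, retrieved, "self")
print(compressed)
```

Because the real extractor is called once per retrieved document, a single question can trigger several LLM requests.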

In [8]:
# Run the compression retriever so that only passages relevant to the question are kept
compressed_docs = compression_retriever.get_relevant_documents(question)

Retrying langchain.llms.openai.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised RateLimitError: Rate limit reached for default-text-davinci-003 in organization org-rDa5o8yXsO2QiirJG0nLT5ZQ on requests per min. Limit: 3 / min. Please try again in 20s. Contact us through our help center at help.openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method..
Retrying langchain.llms.openai.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised RateLimitError: Rate limit reached for default-text-davinci-003 in organization org-rDa5o8yXsO2QiirJG0nLT5ZQ on requests per min. Limit: 3 / min. Please try again in 20s. Contact us through our help center at help.openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/acco
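
The log above reports a limit of 3 requests per minute. One way to stay under such a limit in your own calling code is to space requests out; the sketch below is a minimal client-side throttle. The class name `MinIntervalThrottle` and its injectable `clock` and `sleep` parameters are assumptions made for this example (they allow the demo to run without real waiting), and this is not part of LangChain or the OpenAI client.

```python
import time


class MinIntervalThrottle:
    """Block so that successive calls to wait() are at least min_interval seconds apart."""

    def __init__(self, calls_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / calls_per_minute
        self._clock = clock
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now


# Demo with a fake clock so nothing actually sleeps.
fake_now = [0.0]
slept = []


def fake_sleep(seconds):
    slept.append(seconds)
    fake_now[0] += seconds


throttle = MinIntervalThrottle(3, clock=lambda: fake_now[0], sleep=fake_sleep)
for _ in range(3):
    throttle.wait()  # in real use, call this right before each LLM request
print(slept)
```

Note that a throttle placed around a whole chain call only spaces out the chain invocations; requests the chain makes internally are still retried by the library as shown in the log.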

In [9]:
print(compressed_docs)

[Document(page_content='"One who can see that all activities are performed by the body, which is created of material nature, and sees that the self does nothing, actually sees." "The self, however, is outside all these bodily activities." "One who has such a vision is an actual seer."', metadata={'page': 799, 'source': 'gita_english_full.pdf'}), Document(page_content='"Not only Arjuna, but every one of us is full of anxieties because of this material existence. Our very existence is in the atmosphere of nonexistence. Actually we are not meant to be threatened by nonexistence. Our existence is eternal. But somehow or other we are put into asat. Asat refers to that which does not exist. Out of so many human beings who are suffering, there are a few who are actually inquiring about their position, as to what they are, why they are put into this awkward position and so on. Unless one is awakened to this position of questioning his suffering, unless he realizes that he doesn’t want sufferin

In [None]:
# Use a RetrievalQA chain to pass the retrieved context to the LLM and receive an answer

from langchain.chains import RetrievalQA
qa_chain_refine = RetrievalQA.from_chain_type(
    llm,
    retriever=vectordb.as_retriever(),
    chain_type="refine"
)
result = qa_chain_refine({"query": question})
result["result"]
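
The "refine" strategy can be summarised as: answer from the first retrieved chunk, then revisit that answer once for every further chunk. The sketch below shows that loop with a stub in place of the LLM. The function `refine_answer`, the prompt wording, and `fake_llm` are assumptions for illustration and do not reproduce LangChain's prompts.

```python
def refine_answer(llm, question, docs):
    """Build an answer from the first document, then refine it with each later one."""
    answer = llm(f"Context: {docs[0]}\nQuestion: {question}\nAnswer:")
    for doc in docs[1:]:
        answer = llm(
            f"Question: {question}\n"
            f"Existing answer: {answer}\n"
            f"New context: {doc}\n"
            "Refine the existing answer if the new context helps:"
        )
    return answer


calls = []


def fake_llm(prompt):
    """Stub LLM (assumption): records the prompt and returns a numbered draft."""
    calls.append(prompt)
    return f"draft-{len(calls)}"


final = refine_answer(fake_llm, "example question", ["doc A", "doc B", "doc C"])
print(final, len(calls))
```

The loop makes one LLM call per retrieved chunk, which is worth keeping in mind on an account with a low requests-per-minute limit.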


In [11]:
# Build prompt
from langchain.prompts import PromptTemplate
template = """Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer. Use three sentences maximum. Keep the answer as concise as possible. Always say "thanks for asking!" at the end of the answer. 
{context}
Question: {question}
Helpful Answer:"""
QA_CHAIN_PROMPT = PromptTemplate(input_variables=["context", "question"],template=template)

# Run chain
from langchain.chains import RetrievalQA
qa_chain = RetrievalQA.from_chain_type(llm,
                                       retriever=vectordb.as_retriever(),
                                       return_source_documents=True,
                                       chain_type_kwargs={"prompt": QA_CHAIN_PROMPT})


result = qa_chain({"query": question})
result["result"]

' Human should not involve in activities that are based on material nature, such as social work, nationalism, and altruism. They should instead focus on spiritual activities and understanding the nature of the Absolute. Thanks for asking!'
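
To make the role of `{context}` and `{question}` concrete, the sketch below fills a shortened template by hand with plain `str.format`. It assumes the retrieved chunks are simply joined with blank lines before being placed in the prompt; the helper `build_prompt`, the shortened template, and the sample chunks are made up for the example.

```python
short_template = """Use the following pieces of context to answer the question at the end.
{context}
Question: {question}
Helpful Answer:"""


def build_prompt(template, chunks, question):
    """Join the retrieved chunks and substitute them into the prompt template."""
    context = "\n\n".join(chunks)
    return template.format(context=context, question=question)


prompt = build_prompt(
    short_template,
    ["first retrieved chunk", "second retrieved chunk"],
    "what are things that human should not involve in",
)
print(prompt)
```

Everything the model knows about the document in this setup comes from the text substituted for `{context}`, so answer quality depends directly on which chunks the retriever returns.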

In [12]:
# Step 5: Add a layer of conversational memory

from langchain.memory import ConversationBufferMemory
memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=True
)

In [13]:
from langchain.chains import ConversationalRetrievalChain
# Create the retriever once and pass it to the chain
retriever = vectordb.as_retriever()
qa = ConversationalRetrievalChain.from_llm(
    llm,
    retriever=retriever,
    memory=memory
)
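
A follow-up such as "Explain that in detail." is not searchable on its own, so a conversational retrieval chain first rewrites it into a standalone question using the chat history, then retrieves and answers. The sketch below shows that flow with stubs. `conversational_turn`, `fake_llm`, `fake_retrieve`, and the prompt wording are assumptions for illustration, not LangChain's internals.

```python
def conversational_turn(llm, retrieve, history, question):
    """Answer one question, condensing it with the chat history when there is any."""
    if history:
        transcript = "\n".join(f"Human: {q}\nAssistant: {a}" for q, a in history)
        standalone = llm(
            f"Chat history:\n{transcript}\n"
            f"Follow-up: {question}\n"
            "Standalone question:"
        )
    else:
        standalone = question  # first turn: nothing to condense
    context = "\n\n".join(retrieve(standalone))
    answer = llm(f"Context:\n{context}\nQuestion: {standalone}\nAnswer:")
    history.append((question, answer))  # this list plays the role of the memory
    return answer


prompts = []


def fake_llm(prompt):
    """Stub LLM (assumption): records the prompt and returns a numbered reply."""
    prompts.append(prompt)
    return f"reply-{len(prompts)}"


def fake_retrieve(query):
    """Stub retriever (assumption): returns one chunk naming the query."""
    return [f"chunk for: {query}"]


history = []
a1 = conversational_turn(fake_llm, fake_retrieve, history, "What is the purpose of life?")
a2 = conversational_turn(fake_llm, fake_retrieve, history, "Explain that in detail.")
print(a1, a2, len(prompts))
```

In this sketch the first turn costs one LLM call and every later turn costs two (condense, then answer).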

In [14]:
q1 = "What is the purpose of life?"
result = qa({"question": q1})
print(result['answer'])

 The purpose of life is to reach the eternal kingdom of God, either merging into the impersonal Brahman or associating with the Supreme Personality of Godhead, Kṛṣṇa.


In [15]:
q2 = "Explain that in detail."
result = qa({"question": q2})
print(result['answer'])

 The purpose of life is to advance toward the supreme eternal atmosphere. This is done by controlling the senses, becoming cleansed of the sinful reactions of material existence, and entering into the eternal kingdom of God, either merging into the impersonal Brahman or associating with the Supreme Personality of Godhead, Kṛṣṇa.


In [16]:
q3 = "How to advance towards it."
result = qa({"question": q3})
print(result['answer'])

 You can advance towards the supreme eternal atmosphere by attaining to the Supreme Personality of Godhead. This can be done by following the instructions of the Bhagavad-gītā, such as remembering the Supreme at the time of death and attaining to His abode.


In [28]:
q4 = "Explain that in detail."
result = qa({"question": q4})
print(result['answer'])

 To advance towards the supreme eternal atmosphere, one must attain to the Supreme Personality of Godhead. This can be done by cultivating spiritual knowledge, in eternal relationship with the Supreme Personality of Godhead, and by devoting oneself to the Lord. At the time of death, whoever thinks of the Lord as Brahman, Paramatma, or the Personality of Godhead, will enter the spiritual sky.
