# Project: Question-Answering on Private Documents

## Project Pipeline:

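* Prepare the document (load, chunk, embed, and store it in a vector database).
* Embed the query and retrieve the most similar chunks.
* Ask questions and get answers from the LLM.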


## Installing necessary Libraries

In [1]:
!pip install -r requirements.txt -q

In [2]:
!pip show LangChain

Name: langchain
Version: 0.0.254
Summary: Building applications with LLMs through composability
Home-page: https://www.github.com/hwchase17/langchain
Author: 
Author-email: 
License: MIT
Location: c:\users\think\anaconda3\lib\site-packages
Requires: requests, PyYAML, dataclasses-json, openapi-schema-pydantic, SQLAlchemy, pydantic, numexpr, async-timeout, tenacity, numpy, aiohttp, langsmith
Required-by: 


In [3]:
!pip install langchain --upgrade -q

## Python-dotenv

In [4]:
import os
from dotenv import load_dotenv, find_dotenv

# Load API keys from the nearest .env file, overriding values that are already set
load_dotenv(find_dotenv(), override=True)

True
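
`load_dotenv` returning `True` only says that a `.env` file was found, not that every key is present. A minimal sketch of a check for the keys used later in this notebook follows. The helper `missing_env_vars` is hypothetical (it is not part of the notebook or of python-dotenv), and the list of required names is an assumption based on the variables read below plus the `OPENAI_API_KEY` variable that the OpenAI client reads.

```python
import os


# Hypothetical helper, not part of the original notebook.
def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in the given mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


# Assumed list of required variables for this project.
required = ["OPENAI_API_KEY", "PINECONE_API_KEY", "PINECONE_ENV"]

missing = missing_env_vars(required)
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```

Running this right after `load_dotenv` surfaces a missing key before any API call is made.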

## Prepare the document
* Load the document.
* Split the document into chunks.
* Create an embedding for each chunk.
* Store the embeddings in a vector database such as Pinecone.

In [6]:
!pip install pypdf -q
!pip install wikipedia -q

## Loading the document

In [7]:
# Load a PDF file and return a list of LangChain documents, one per page
def load_doc(file):
    from langchain.document_loaders import PyPDFLoader
    print(f'Loading {file}....')
    loader = PyPDFLoader(file)
    data = loader.load()
    return data

## Chunking

In [8]:
# Split the loaded documents into chunks of at most chunk_size characters
def chunk_data(data, chunk_size=256):
    from langchain.text_splitter import RecursiveCharacterTextSplitter
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=chunk_size, chunk_overlap=0)
    chunks = text_splitter.split_documents(data)
    return chunks
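
To see what `chunk_size` and `chunk_overlap` control, here is a simplified sketch of fixed-size chunking on a plain string. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on separators such as paragraphs, lines, and spaces); the function `split_fixed` is a hypothetical stand-in that only illustrates the two parameters.

```python
# Hypothetical, simplified splitter: cuts every `chunk_size` characters.
def split_fixed(text, chunk_size=256, chunk_overlap=0):
    """Split text into windows of chunk_size characters.

    Consecutive windows share chunk_overlap characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]


print(split_fixed("abcdefghij", chunk_size=4))
# ['abcd', 'efgh', 'ij']
print(split_fixed("abcdefghij", chunk_size=4, chunk_overlap=2))
# ['abcd', 'cdef', 'efgh', 'ghij', 'ij']
```

With `chunk_overlap=0`, as in `chunk_data`, no text is repeated between chunks, so a sentence that straddles a boundary is split across two chunks.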

## Calculating cost

In [9]:
# Count the tokens in all chunks and print the estimated embedding cost
def print_embedding_cost(texts):
    import tiktoken
    enc = tiktoken.encoding_for_model('text-embedding-ada-002')
    total_tokens = sum(len(enc.encode(page.page_content)) for page in texts)
    print(f'Total tokens:{total_tokens}')
    # 0.0004 is the USD price per 1,000 tokens used in this notebook
    print(f'Embedding Cost in USD: {total_tokens/1000 *0.0004:.6f} ')

## Create and Delete an Index in the Vector Database (Pinecone)

In [10]:
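# Create a Pinecone index for 1536-dimensional embeddings if it does not already exist
def creating_vector_store(index_name):
    import pinecone

    # Initialize the client so the function also works when called on its own
    pinecone.init(
        api_key=os.environ.get("PINECONE_API_KEY"),
        environment=os.environ.get("PINECONE_ENV")
    )

    if index_name not in pinecone.list_indexes():
        print(f'Creating index {index_name}.')
        pinecone.create_index(index_name, dimension=1536, metric='cosine')
        print(f'Index {index_name} created.')
    else:
        print(f'Index {index_name} already exists...')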

In [11]:
# Delete one Pinecone index by name, or every index when index_name is 'all'
def delete_pinecone_index(index_name='all'):
    import pinecone

    pinecone.init(
        api_key=os.environ.get("PINECONE_API_KEY"),
        environment=os.environ.get("PINECONE_ENV")
    )

    if index_name == 'all':
        indexes = pinecone.list_indexes()
        print('Deleting all indexes...', end='')
        for index in indexes:
            pinecone.delete_index(index)
        print('Ok')
    else:
        print(f'Deleting {index_name}....', end='')
        pinecone.delete_index(index_name)
        print('Ok')

## Asking and Getting Answers

In [24]:
def ask_and_get_answer(vector_store, query):
    
    from langchain.chains import RetrievalQA
    from langchain.chat_models import ChatOpenAI

    llm = ChatOpenAI(temperature = 0)

    retriever = vector_store.as_retriever(
        search_type='similarity',
        search_kwargs={'k':3}
    )

    chain = RetrievalQA.from_chain_type(
        llm=llm,
        chain_type='stuff',
        retriever = retriever 
    )
    
    answer = chain.run(query)
    return answer


def ask_with_memory(vector_store, query, chat_history=None):

    from langchain.chains import ConversationalRetrievalChain
    from langchain.chat_models import ChatOpenAI

    # Avoid a mutable default argument: start a fresh history when none is passed
    if chat_history is None:
        chat_history = []

    llm = ChatOpenAI(temperature=0)

    retriever = vector_store.as_retriever(
        search_type='similarity',
        search_kwargs={'k': 3}
    )

    crc = ConversationalRetrievalChain.from_llm(llm, retriever)
    result = crc({'question': query, 'chat_history': chat_history})
    # Store the (question, answer) pair so follow-up questions have context
    chat_history.append((query, result['answer']))

    return result, chat_history
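
Both functions configure the retriever with `search_type='similarity'` and `k=3`, and the index uses `metric='cosine'`. The sketch below shows what that amounts to conceptually: score every stored vector against the query vector by cosine similarity and keep the three best. It is a brute-force illustration in plain Python with hypothetical names (`cosine_similarity`, `top_k`) and toy 2-dimensional vectors; it is not how Pinecone performs the search internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=3):
    """Return the indices of the k vectors most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]


# Toy 2-D "embeddings"; real ada-002 vectors have 1536 dimensions.
docs = [[1, 0], [0, 1], [1, 1], [-1, 0]]
print(top_k([1, 0], docs, k=3))
# [0, 2, 1]
```

The chunks behind the returned indices are what the chain passes to the LLM as context.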

## Running Code

In [13]:
data= load_doc('./files/The Constitution of Pakistan.pdf')

Loading ./files/The Constitution of Pakistan.pdf....


In [14]:
print(f'There are {len(data)} pages in the document.')

There are 209 pages in the document.


In [15]:
chunks = chunk_data(data)

In [16]:
chunks[0].page_content

'CONSTITUTION OF PAKISTAN   \n 3 PART I \n \nIntroductory \n \n1. The Republic and its territories \n \n11. (1) Pakistan shall be Federal Republic to be known as the \nIslamic Republic of Pakistan, hereinafter referred to as Pakistan.'

In [17]:
len(chunks)

1931

In [18]:
print_embedding_cost(chunks)

Total tokens:99065
Embedding Cost in USD: 0.039626 
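
The cost figure above is simple arithmetic: tokens divided by 1,000, times the per-1,000-token price. A small sketch that reproduces it without calling `tiktoken` follows. The function name `embedding_cost_usd` is hypothetical, and the default rate of 0.0004 is the value hard-coded in `print_embedding_cost`; it is an assumption of this notebook, so check current pricing before relying on it.

```python
# Hypothetical helper mirroring the arithmetic in print_embedding_cost.
def embedding_cost_usd(total_tokens, usd_per_1k_tokens=0.0004):
    """Estimated cost in USD for embedding total_tokens tokens."""
    return total_tokens / 1000 * usd_per_1k_tokens


print(f"{embedding_cost_usd(99065):.6f}")
# 0.039626
```

Passing a different `usd_per_1k_tokens` gives the estimate for another price without changing the token count.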


In [19]:
delete_pinecone_index()

Deleting all indexes...Ok


In [20]:
index_name = "ask-chat-gpt"
creating_vector_store(index_name)

Creating index ask-chat-gpt.
Index ask-chat-gpt created.


In [21]:
import pinecone
from langchain.vectorstores import Pinecone
from langchain.embeddings.openai import OpenAIEmbeddings

embeddings=OpenAIEmbeddings()

pinecone.init(api_key=os.environ.get("PINECONE_API_KEY"),environment=os.environ.get("PINECONE_ENV"))

vector_store= Pinecone.from_documents(chunks,embeddings,index_name=index_name)

Retrying langchain.embeddings.openai.embed_with_retry.<locals>._embed_with_retry in 4.0 seconds as it raised RateLimitError: Rate limit reached for default-text-embedding-ada-002 in organization org-Zgvqg40aCEUeeAAUlnbvIafN on requests per min. Limit: 3 / min. Please try again in 20s. Contact us through our help center at help.openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method..
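
The log reports a limit of 3 requests per minute, so uploading all chunks in one go keeps hitting the retry path. One way to stay under such a limit is to send the chunks in batches and pause between batches. The sketch below is a generic illustration, not part of LangChain or Pinecone: `batched` and `send_throttled` are hypothetical helpers, `send` is whatever function uploads one batch, and it assumes each call to `send` results in a single embedding request.

```python
import time


def batched(items, batch_size):
    """Yield consecutive slices of items with at most batch_size elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def send_throttled(items, send, batch_size=100, requests_per_minute=3, sleep=time.sleep):
    """Call send(batch) for each batch, waiting between calls to respect the limit.

    Returns the number of items sent.
    """
    delay = 60.0 / requests_per_minute  # seconds between requests
    sent = 0
    for n, batch in enumerate(batched(items, batch_size)):
        if n > 0:
            sleep(delay)  # no wait is needed before the first request
        send(batch)
        sent += len(batch)
    return sent
```

With the objects from this notebook, `send` could be a function that adds one batch of chunks to an existing vector store, for example by wrapping `vector_store.add_documents`. The `sleep` parameter exists only so the pause can be replaced in tests.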

In [None]:
index_name= pinecone.list_indexes()[0]

In [None]:
vector_store=Pinecone.from_existing_index(index_name,embeddings)

In [None]:
# Loop that asks questions until the user types "exit" or "quit"
i = 1
chat_history = []

print("Type Exit or Quit to quit.....")

while True:
    query = input(f"Question #{i}:")
    i += 1

    if query.strip().lower() in ["quit", "exit"]:
        print("Good Bye. Exiting.........")
        break

    # answer = ask_and_get_answer(vector_store, query)
    result, chat_history = ask_with_memory(vector_store, query, chat_history)

    print(f"Answer: {result['answer']}")
    print(f'\n {"-"*50}\n')