This notebook walks through the following steps:

1. Load the PDF files in the working directory using LangChain document loaders.
2. Split the text into chunks.
3. Create embeddings for the text chunks.
4. Save the embeddings in a vector store using Chroma.
5. Perform semantic search without using an LLM.
6. Perform question answering with Retrieval-Augmented Generation (RAG) on the documents using an LLM. The code below loads Mistral-7B-OpenOrca-GPTQ; the Llama-2 checkpoint is left commented out.

In [1]:
import torch
import time
import transformers  # Hugging Face Transformers
from langchain import HuggingFacePipeline  # Wraps a HF pipeline as a LangChain LLM
from langchain import PromptTemplate, LLMChain  # To create PromptTemplate and LLMChain
from transformers import pipeline, AutoTokenizer, AutoModelForCausalLM, AutoModel  # For creating the model and tokenizer


In [2]:
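from transformers import GPTQConfig

# Pre-quantized 4-bit GPTQ checkpoint; the Llama-2 alternative is left commented out
#mname = 'TheBloke/Llama-2-7B-Chat-GGUF'
mname = "TheBloke/Mistral-7B-OpenOrca-GPTQ"

tokenizer = AutoTokenizer.from_pretrained(mname)
tokenizer.pad_token = tokenizer.eos_token  # reuse EOS as the padding token

# `disable_exllama` is deprecated (see the warning in the output);
# newer transformers versions expect `use_exllama=False` instead
quantization_config_loading = GPTQConfig(bits=4,
                                         disable_exllama=True,
                                         use_cuda_fp16=True,
                                         tokenizer=tokenizer)

model = AutoModelForCausalLM.from_pretrained(mname,
                                             quantization_config=quantization_config_loading,
                                             device_map="auto")

model.eval()  # inference only

# Text-generation pipeline; top_k=1 keeps only the most likely token at each step
pipe = pipeline("text-generation",
                model=model,
                tokenizer=tokenizer,
                torch_dtype=torch.bfloat16,
                device_map="auto",
                max_new_tokens=128,
                do_sample=True,
                top_k=1,
                num_return_sequences=1,
                eos_token_id=tokenizer.eos_token_id,
                repetition_penalty=1.2
                )

# Expose the pipeline to LangChain as an LLM
llm = HuggingFacePipeline(pipeline=pipe, model_kwargs={'temperature': 0})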


Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Using `disable_exllama` is deprecated and will be removed in version 4.37. Use `use_exllama` instead and specify the version with `exllama_config`.The value of `use_exllama` will be overwritten by `disable_exllama` passed in `GPTQConfig` or stored in your config file.


You passed `quantization_config` to `from_pretrained` but the model you're loading already has a `quantization_config` attribute and has already quantized weights. However, loading attributes (e.g. ['use_cuda_fp16', 'use_exllama', 'max_input_length', 'exllama_config', 'disable_exllama']) will be overwritten with the one you passed to `from_pretrained`. The rest will be ignored.
CUDA extension not installed.
CUDA extension not installed.
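
The `pipeline` call above sets `do_sample=True` together with `top_k=1`. A minimal sketch of why that combination behaves like greedy decoding, using a toy top-k sampler in plain Python. The `sample_top_k` helper and the logits are illustrative assumptions, not the Transformers implementation:

```python
import math
import random

def sample_top_k(logits, k, rng):
    """Sample a token index from the k highest-scoring logits."""
    # Keep only the indices of the k largest logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the surviving logits (shifted by the max for stability).
    m = max(logits[i] for i in top)
    weights = [math.exp(logits[i] - m) for i in top]
    return rng.choices(top, weights=weights, k=1)[0]

logits = [0.1, 2.5, -1.0, 2.4]  # made-up scores for a 4-token vocabulary
rng = random.Random(0)

# With k=1 only one candidate survives, so every draw returns the same index.
draws = {sample_top_k(logits, k=1, rng=rng) for _ in range(100)}
print(draws)
```

With `k=1` the set contains a single index (the arg-max), so the sampling flag has no practical effect on the generated text.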



#### Document Preparation

In [3]:
# Load a PDF and split it into overlapping text chunks

from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

def load_pdf(path_pdf):
    """Load one PDF and return its pages split into chunks of at most 350 characters."""
    loader = PyPDFLoader(path_pdf)
    pages = loader.load()  # one Document per page

    splitter = RecursiveCharacterTextSplitter(chunk_size=350,
                                              chunk_overlap=20,
                                              length_function=len)

    return splitter.split_documents(pages)


In [4]:

#Just to test the function
agent_pg = load_pdf("./Agents.pdf")


In [5]:
agent_pg[0]


Document(page_content='5/31/23, 6:16 AM Agents — \x00\x00 LangChain 0.0.186\nhttps://python.langchain.com/en/stable/modules/agents.html 1/3Agents\nContents\nAction Agents\nPlan-and-Execute Agents\nConceptual Guide\nSome applications will require not just a predetermined chain of calls to LLMs/other tools, but', metadata={'source': './Agents.pdf', 'page': 0})
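
The chunk above was produced with `chunk_size=350` and `chunk_overlap=20`. To make those two parameters concrete, here is a simplified sketch of fixed-size splitting with overlap. The helper `split_with_overlap` is hypothetical: the real `RecursiveCharacterTextSplitter` also prefers to break on separators such as paragraph and line breaks, which this sketch ignores.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Split text into fixed-size character windows that overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks

# Synthetic 1000-character text so that overlaps are easy to check.
text = "".join(str(i % 10) for i in range(1000))
chunks = split_with_overlap(text, chunk_size=350, chunk_overlap=20)
print([len(c) for c in chunks])
print(chunks[0][-20:] == chunks[1][:20])
```

The overlap means the last 20 characters of one chunk reappear at the start of the next, which reduces the chance that a sentence relevant to a query is cut in half at a chunk boundary.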

In [6]:
import glob
file_list = glob.glob("./*.pdf")
all_docs = []
for file in file_list:
  temp_docs = load_pdf(file)
  all_docs.extend(temp_docs)


In [7]:
len(all_docs)


2822

#### Creation of Embeddings

We will use the open-source sentence-transformers model `all-MiniLM-L6-v2` to embed each text chunk.

In [8]:
from langchain.embeddings import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(model_name='sentence-transformers/all-MiniLM-L6-v2')
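
Each chunk is mapped to a fixed-length vector (384 dimensions for `all-MiniLM-L6-v2`), and semantic search ranks chunks by how close their vectors are to the query vector. The sketch below illustrates that ranking with cosine similarity on made-up 3-dimensional vectors. The helper names and the vectors are assumptions for illustration, and the metric a vector store actually uses depends on its configuration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, store, k):
    """Return the k (text, score) pairs most similar to query_vec."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in store]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Hypothetical embeddings; real ones come from the embedding model.
store = [
    ("Agents decide which tool to call", [0.9, 0.1, 0.0]),
    ("Chains run steps in a fixed order", [0.1, 0.9, 0.0]),
    ("Chroma stores vectors", [0.0, 0.1, 0.9]),
]
query = [0.8, 0.2, 0.0]  # stands in for an embedded user question

results = top_k(query, store, k=2)
for text, score in results:
    print(f"{score:.3f}  {text}")
```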


#### Vector Store

In [9]:
from langchain.vectorstores import Chroma

# Load the chunks into Chroma: pass the documents and the embedding function.
# No persist_directory is given here, so the index is not written to disk.

db = Chroma.from_documents(all_docs,
                           embedding=embeddings)


In [10]:
db_retriever = db.as_retriever()
db_retriever.get_relevant_documents("langchain concepts")


[Document(page_content='It creates a vibrant and thriving ecosystem.\nIntegrations : Guides for how other products can be used with LangChain.\nDependents : List of repositories that use LangChain.\nSkip to main content\x00\x00\nCTRL  + K', metadata={'page': 2, 'source': './WelcometoLangChain.pdf'}),
 Document(page_content='LangChainHub : The LangChainHub is a place to share and explore other prompts, chains,\nand agents.\nGallery : A collection of great projects that use Langchain, compiled by the folks at\nKyrolabs . Useful for finding inspiration and example implementations.\nTracing : A guide on using tracing in LangChain to visualize the execution of chains and', metadata={'page': 3, 'source': './WelcometoLangChain.pdf'}),
 Document(page_content='5/31/23, 6:10 AM Welcome to LangChain — \x00\x00 LangChain 0.0.186\nhttps://python.langchain.com/en/stable/ 1/4Welcome to LangChain\nContents\nGetting Started\nModules\nUse Cases\nReference Docs\nEcosystem\nAdditional Resources\nLangChain

#### Metadata Field Info

In [11]:
from langchain.chains.query_constructor.base import AttributeInfo

metadata_field_info=[
    AttributeInfo(
        name="source",
        description="Filename and location of the source file", 
        type="string", 
    ),
    AttributeInfo(
        name="page",
        description="Page number on which the document is found", 
        type="integer", 
    )
]
document_content_description = "Text documents from Langchain help and concept documentation"
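
The two `AttributeInfo` entries tell the self-query retriever which metadata fields the LLM may filter on. The sketch below shows, in plain Python, what applying such a structured filter to chunk metadata amounts to. The comparator names `eq`, `gt`, and `lt` mirror the self-query filter language, while `matches`, `filter_chunks`, and the sample chunks are hypothetical stand-ins.

```python
import operator

# Comparator names follow the self-query filter language; the evaluator is illustrative.
COMPARATORS = {"eq": operator.eq, "gt": operator.gt, "lt": operator.lt}

def matches(metadata, comparator, attribute, value):
    """True if metadata[attribute] exists and satisfies the comparison."""
    return attribute in metadata and COMPARATORS[comparator](metadata[attribute], value)

def filter_chunks(chunks, comparator, attribute, value):
    """Keep only the chunks whose metadata passes the filter."""
    return [c for c in chunks if matches(c["metadata"], comparator, attribute, value)]

# Placeholder chunks that reuse the `source` and `page` fields described above.
chunks = [
    {"text": "Agents ...", "metadata": {"source": "./Agents.pdf", "page": 0}},
    {"text": "Chains ...", "metadata": {"source": "./QuickstartGuide.pdf", "page": 4}},
    {"text": "Survey ...", "metadata": {"source": "./survey_of_large_lang_models.pdf", "page": 115}},
]

late_pages = filter_chunks(chunks, "gt", "page", 3)
print([c["metadata"]["source"] for c in late_pages])
```

In the retriever, the filter is generated by the LLM from the user's question and then combined with the similarity search, so a malformed filter makes the whole retrieval step fail.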


#### Creating a Self-Query Retriever and a Retrieval QA Chain using the LLM

In [12]:
# Import the self-query retriever that will feed the Q/A chain

from langchain.retrievers.self_query.base import SelfQueryRetriever


In [13]:
retriever = SelfQueryRetriever.from_llm(
    llm,
    db,
    document_content_description,
    metadata_field_info,
    verbose=True
)

retriever.get_relevant_documents("What are some concepts of Agents")




[Document(page_content='Agents no longer do: they use an LLM to determine which actions to take and in what order . An\naction can either be using a tool and observing its output, or returning to the user .\nWhen used correctly agents can be extremely powerful. In this tutorial, we show you how to\neasily use agents through the simplest, highest level API.', metadata={'page': 4, 'source': './QuickstartGuide.pdf'}),
 Document(page_content='ing llm-augmented autonomous agents,” CoRR , vol.\nabs/2308.05960, 2023.\n[723] X. Liu, H. Yu, H. Zhang, Y. Xu, X. Lei, H. Lai, Y. Gu,\nH. Ding, K. Men, K. Yang, S. Zhang, X. Deng, A. Zeng,\nZ. Du, C. Zhang, S. Shen, T. Zhang, Y. Su, H. Sun,\nM. Huang, Y. Dong, and J. Tang, “Agentbench: Evalu-\nating llms as agents,” CoRR , vol. abs/2308.03688, 2023.', metadata={'page': 115, 'source': './survey_of_large_lang_models.pdf'}),
 Document(page_content='Agent: this is where the logic of the application lives. Agents expose an interface\nthat takes in user in

In [14]:
retriever = SelfQueryRetriever.from_llm(llm, 
                                        db, 
                                        document_content_description, 
                                        metadata_field_info, 
                                        verbose=True,
                                        enable_limit=True)

retriever.get_relevant_documents("Explain 3 concepts of Chains")


[Document(page_content='Chains : Chains are structured sequences of calls (to an LLM or to a dif ferent utility).\nAgents : An agent is a Chain in which an LLM, given a high-level directive and a set of\ntools, repeatedly decides an action, executes the action and observes the outcome until\nthe high-level directive is complete.', metadata={'page': 1, 'source': './WelcometoLangChain.pdf'}),
 Document(page_content='but understanding how it works will set you up well for working with more complex chains.\nFor more details, check out the getting started guide for chains.\nAgents: Dynamically Call Chains Based on\nUser Input\nSo far the chains we’ve looked at run in a predetermined order .', metadata={'page': 4, 'source': './QuickstartGuide.pdf'}),
 Document(page_content='agents.\nModel Laboratory : Experimenting with dif ferent prompts, models, and chains is a big part\nof developing the best possible application. The ModelLaboratory makes it easy to do so.\nDiscord : Join us on our Disco

In [15]:
retriever.get_relevant_documents("Give 2 example of autonomous agent")


[Document(page_content='Applications. Recently, LLM-based agents have shown\ngreat potential in autonomously solving complex tasks,\nmaking it feasible to rapidly develop capable applications\nfor specific domains or tasks. In this section, we will discuss\nthe applications in single-agent and multi-agent scenarios.\n•Single-agent based applications. Applications based on', metadata={'page': 78, 'source': './survey_of_large_lang_models.pdf'}),
 Document(page_content='ing llm-augmented autonomous agents,” CoRR , vol.\nabs/2308.05960, 2023.\n[723] X. Liu, H. Yu, H. Zhang, Y. Xu, X. Lei, H. Lai, Y. Gu,\nH. Ding, K. Men, K. Yang, S. Zhang, X. Deng, A. Zeng,\nZ. Du, C. Zhang, S. Shen, T. Zhang, Y. Su, H. Sun,\nM. Huang, Y. Dong, and J. Tang, “Agentbench: Evalu-\nating llms as agents,” CoRR , vol. abs/2308.03688, 2023.', metadata={'page': 115, 'source': './survey_of_large_lang_models.pdf'})]

In [16]:
from langchain.chains import RetrievalQAWithSourcesChain


In [17]:
chain = RetrievalQAWithSourcesChain.from_chain_type(llm, 
                                                    chain_type="stuff", 
                                                    retriever=retriever)

chain({"question":"Give 2 types of agents"})



{'question': 'Give 2 types of agents',
 'answer': ' One type of agent is an AI agent, also called a software agent or web robot. Another type of agent is an autonomous agent, which can act independently without human intervention.\n',
 'sources': './QuickstartGuide.pdf,./survey_of_large_lang_models.pdf'}
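
With `chain_type="stuff"`, all retrieved chunks are placed into a single prompt and the LLM is called once. A rough sketch of that idea follows. The prompt wording, the `build_stuff_prompt` helper, and the sample chunk texts are assumptions for illustration and do not reproduce LangChain's actual template.

```python
def build_stuff_prompt(question, docs):
    """Concatenate every retrieved chunk into one prompt string."""
    parts = []
    for doc in docs:
        parts.append(f"Content: {doc['page_content']}\nSource: {doc['source']}")
    context = "\n\n".join(parts)
    return (
        "Answer the question using only the extracts below and list the sources used.\n\n"
        f"{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# Placeholder chunks; in the notebook these would come from the retriever.
docs = [
    {"page_content": "Agents use an LLM to decide which actions to take.",
     "source": "./QuickstartGuide.pdf"},
    {"page_content": "LLM-based agents can solve complex tasks autonomously.",
     "source": "./survey_of_large_lang_models.pdf"},
]

prompt = build_stuff_prompt("Give 2 types of agents", docs)
print(prompt)
```

Because everything goes into one call, the combined chunks must fit within the model's context window, which is one reason to keep chunks small.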

In [18]:
chain({"question":"How to combine LLMs"},
      return_only_outputs=False)


{'question': 'How to combine LLMs',
 'answer': ' To combine LLMs, one should consider various techniques and methods like red teaming, guiding the whole generation process through external tools or models, re-checking the reasoning process, and fine-tuning with process-based feedback.\n',
 'sources': './survey_of_large_lang_models.pdf'}

In [19]:
chain({"question":"Provide 2 examples of combining LLMs"},
      return_only_outputs=False)


{'question': 'Provide 2 examples of combining LLMs',
 'answer': ' Two examples of combining LLMs include discussing the development and usage techniques of large language models and providing a summary of the latest literature on LLMs.\n',
 'sources': './survey_of_large_lang_models.pdf'}

In [21]:
chain({"question":"4 Concepts in langchain"},
      return_only_outputs=False)




OutputParserException: Parsing text
```json
{
    "query": "concepts",
    "filter": "(gt(\"page\", 3)",
    "limit": 4
}
```
 raised following error:
Unexpected token Token('LPAR', '(') at line 1, column 1.
Expected one of: 
	* CNAME
Previous tokens: [None]


In [22]:
chain({"question":"Explain consistency in langchain."},
      return_only_outputs=False)




OutputParserException: Parsing text
```json
{
    "query": "consistency",
    "filter": "(eq(\"source\", \"Langchain Help and Concept Documentation\") and eq(\"page\", 5))",
    "limit": 1
}
```
 raised following error:
Unexpected token Token('LPAR', '(') at line 1, column 1.
Expected one of: 
	* CNAME
Previous tokens: [None]
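
In both failures the LLM produced a filter string that starts with `(`, while the parser expects a function name such as `eq` or `gt` as the first token. In the first case the parentheses are also unbalanced, and in the second the `source` value is copied from the content description rather than being a real file path. One possible mitigation, sketched here with stand-in functions, is to fall back to plain similarity search when the structured query cannot be parsed. `retrieve_with_fallback` and both stub retrievers are hypothetical.

```python
def retrieve_with_fallback(primary, fallback, query):
    """Try the structured retriever; fall back to plain search on failure."""
    try:
        return primary(query)
    except Exception as exc:  # in the notebook this would be OutputParserException
        print(f"Self-query failed ({type(exc).__name__}); using plain similarity search")
        return fallback(query)

# Stand-ins for retriever.get_relevant_documents and
# db_retriever.get_relevant_documents from the cells above.
def broken_self_query(query):
    raise ValueError("Unexpected token Token('LPAR', '(')")

def plain_search(query):
    return [f"chunk about {query}"]

docs = retrieve_with_fallback(broken_self_query, plain_search, "consistency in langchain")
print(docs)
```

In the notebook, the two callables would be `retriever.get_relevant_documents` and `db_retriever.get_relevant_documents`. Since the chain invokes its retriever internally, the fallback would have to be applied at the retriever level before building the chain.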
