# **<center> 1. Introduction  </center>**

ChatGPT has taken the world by storm, and millions of people use it. But while it is great for general-purpose knowledge, it only knows what it was trained on, which is generally available internet data from **before 2021**. It **doesn't know about your private data**, and it doesn't know about recent sources of data.

Many LLM applications require user-specific data that is not part of the model's training set. The primary way of accomplishing this is through **Retrieval Augmented Generation (RAG)**. In this process, external data is retrieved and then passed to the LLM when doing the generation step.

In this session we will cover how to build our own question-answering system over our own data using LangChain and Retrieval Augmented Generation (RAG) concepts.

# **<center> 2. Installing Required Packages </center>**

In [None]:
!pip install langchain
!pip install python-dotenv
!pip install openai
!pip install pdfplumber

In [None]:
!pip install tiktoken
!pip install sentence_transformers

Collecting tiktoken
Installing collected packages: tiktoken
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
llmx 0.0.15a0 requires cohere, which is not installed.
Successfully installed tiktoken-0.5.2
Collecting sentence_transformers

In [None]:
!pip install lark

Collecting lark
  Downloading lark-1.1.8-py3-none-any.whl (111 kB)
Installing collected packages: lark
Successfully installed lark-1.1.8


In [None]:
# Load the API key and relevant Python libraries.

import openai  # Importing openai python package
import os # Importing Operating system library to do file operations
from IPython.display import display, HTML # Just to visualize the whole text on one screen

# Loading the OpenAI API key

# ##### 1st method (Suggested)
# from dotenv import load_dotenv, find_dotenv
# _ = load_dotenv(find_dotenv())

# openai.api_key  = os.getenv('OPENAI_API_KEY')

###### 2nd Method

openai.api_key = "<YOUR API KEY>"

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
# Check whether the key is available as an environment variable (never hard-code a real key in a notebook)
os.getenv('OPENAI_API_KEY') is not None

False

# **<center> 3. Loading The Documents </center>**

Document loaders provide a "load" method for **loading data as documents from a configured source**. They optionally implement a "lazy load" as well for lazily loading data into memory.

Use document loaders to load data from a source as Documents. A Document is a piece of text with associated metadata.

For example, there are document loaders for loading a simple .txt file, for loading the text contents of any **web page,** or **even for loading a transcript of a YouTube video**.
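To make the difference between loading eagerly and loading lazily concrete, here is a minimal sketch in plain Python. The `Document` dataclass and the `ToyTextLoader` class are hypothetical stand-ins written for illustration, not LangChain's own classes; they only mirror the idea of "text plus metadata" and of a lazy variant that yields documents one at a time.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Document:
    """Hypothetical stand-in: a piece of text plus its metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


class ToyTextLoader:
    """Illustrative loader: each non-empty line of a string becomes one document."""

    def __init__(self, text: str, source: str = "in-memory"):
        self.text = text
        self.source = source

    def lazy_load(self) -> Iterator[Document]:
        # Yield documents one at a time instead of building the full list up front.
        for line_no, line in enumerate(self.text.splitlines()):
            if line.strip():
                yield Document(line.strip(), {"source": self.source, "line": line_no})

    def load(self) -> List[Document]:
        # Eager variant: materialize every document into a list.
        return list(self.lazy_load())


loader_demo = ToyTextLoader("Jawan (2023)\n\nPathaan (2023)")
docs_demo = loader_demo.load()
print(len(docs_demo), docs_demo[1].metadata)
```

The lazy variant matters mostly for large sources, where holding every document in memory at once is wasteful.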


<center><img src="https://miro.medium.com/v2/resize:fit:1400/1*ZwDfKQkV6RUhp35aJjW0Iw.png"  width="600" height="500"></center>

[Above Image Taken from](https://medium.com/@onkarmishra/using-langchain-for-question-answering-on-own-data-3af0a82789ed)

[List of Document Loaders Langchain supports](https://python.langchain.com/docs/integrations/document_loaders)

In [None]:
from langchain.document_loaders import PDFPlumberLoader

loader = PDFPlumberLoader("/content/drive/MyDrive/IMBD_Movies_List_2023[1].pdf")

In [None]:
data = loader.load()

In [None]:
len(data)

10

LangChain offers many different ways to load a PDF. You can use whichever one suits your requirements; the list of PDF loaders is linked below.

[PDF loaders in Langchain](https://python.langchain.com/docs/modules/data_connection/document_loaders/pdf#using-pdfplumber)

In [None]:
try:
  data[1].metadata
except IndexError:
  print("The list only has one item.")

But these documents are too large, which can be a problem during the retrieval phase: the retriever will not be able to find the most similar documents accurately. To tackle this, we **split these large documents into small chunks**.

# **<center> 4. Document transformers </center>**

Once you've loaded documents, you'll often want to transform them to better suit your application. The simplest example is you may want to **split a long document into smaller chunks that can fit into your model's context window.** LangChain has a number of built-in document transformers that make it easy to split, **combine, filter,** and otherwise manipulate documents.

## **<center> 4.1 Text splitters (Chunkers) </center>**

When you want to deal with long pieces of text, it is necessary to split up that text into chunks. As simple as this sounds, there is a lot of potential complexity here. Ideally, you want to keep the semantically related pieces of text together. What **"semantically related"** means could depend on the type of text.

At a high level, text splitters work as follows:

1. **Split the text up into small**, semantically meaningful chunks (often sentences).
2. Start combining these small chunks into a larger chunk until you reach a certain size (as measured by some function).
3. Once you reach that size, make that chunk its own piece of text and then start creating a new chunk of text **with some overlap** (to keep context between chunks).

That means there are two different axes along which you can customize your text splitter:

1. How the text is split
2. How the chunk size is measured
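The merge-with-overlap steps and the two customization axes can be sketched in a few lines of plain Python. This is a simplified illustration, not LangChain's implementation: `merge_with_overlap` is a hypothetical helper, the caller decides how the text is split (here, on whitespace), and `length_function` decides how size is measured. A single piece longer than `chunk_size` is kept as an oversized chunk.

```python
from typing import Callable, List


def merge_with_overlap(
    pieces: List[str],
    chunk_size: int,
    chunk_overlap: int,
    length_function: Callable[[str], int] = len,
    joiner: str = " ",
) -> List[str]:
    """Greedily merge small pieces into chunks, re-using trailing pieces as overlap."""
    chunks: List[str] = []
    current: List[str] = []
    for piece in pieces:
        if current and length_function(joiner.join(current + [piece])) > chunk_size:
            # The current chunk is full: emit it, then keep only a short tail as overlap.
            chunks.append(joiner.join(current))
            while current and (
                length_function(joiner.join(current)) > chunk_overlap
                or length_function(joiner.join(current + [piece])) > chunk_size
            ):
                current.pop(0)
        current.append(piece)
    if current:
        chunks.append(joiner.join(current))
    return chunks


words = "aaaa bbbb cccc dddd eeee".split()

# Axis 2, variant A: size measured in characters.
by_chars = merge_with_overlap(words, chunk_size=14, chunk_overlap=4)

# Axis 2, variant B: size measured in words.
by_words = merge_with_overlap(
    words, chunk_size=2, chunk_overlap=1, length_function=lambda s: len(s.split())
)
print(by_chars)
print(by_words)
```

With character counting, the two chunks share the piece `cccc`; with word counting, each two-word chunk shares one word with its neighbour.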

The default **recommended text splitter is the RecursiveCharacterTextSplitter**. This text splitter takes a list of characters. It tries to create chunks by splitting on the first character, but if any chunks are too large it moves on to the next character, and so forth. **By default the characters it tries to split on are ["\n\n", "\n", " ", ""].**

In addition to controlling which characters you can split on, you can also control a few other things:

- **`length_function`**: how the length of chunks is calculated. Defaults to just counting number of characters, but it's pretty common to pass a token counter here.
- **`chunk_size`**: the maximum size of your chunks (as measured by the length function).
- **`chunk_overlap`**: the maximum overlap between chunks. It can be nice to have some overlap to maintain some continuity between chunks (e.g. do a sliding window).
- **`add_start_index`**: whether to include the starting position of each chunk within the original document in the metadata.
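The "try the first separator, then fall back to the next one" idea can be sketched as a short recursive function. This is a simplified illustration under stated assumptions, not the library's code: `recursive_split` is a hypothetical helper, it measures length in characters only, it does not merge small pieces back together or add overlap, and it treats the empty separator `""` as "cut into fixed-width slices".

```python
from typing import List


def recursive_split(text: str, separators: List[str], chunk_size: int) -> List[str]:
    """Split on the first separator; recurse with the remaining ones on oversized pieces."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators:
        return [text]  # nothing left to split on; keep the oversized piece as-is
    sep, rest = separators[0], separators[1:]
    if sep == "":
        # Last resort: cut the text into fixed-width slices.
        parts = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    else:
        parts = text.split(sep)
    pieces: List[str] = []
    for part in parts:
        pieces.extend(recursive_split(part, rest, chunk_size))
    return pieces


default_separators = ["\n\n", "\n", " ", ""]
sample_text = "para one line one\npara one line two\n\nshort"
print(recursive_split(sample_text, default_separators, chunk_size=20))
print(recursive_split("abcdefghij", default_separators, chunk_size=4))
```

In the first call the paragraph break is tried first, and only the paragraph that is still too long is split again on single newlines. In the second call no separator matches, so the text falls through to fixed-width slices.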

In [None]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

In [None]:
text_splitter = RecursiveCharacterTextSplitter(
    separators = ["\n\n", "\n", " ", ""],
    chunk_size = 400,
    chunk_overlap  = 50,
    length_function = len,
    is_separator_regex = False,
)

In [None]:
chunks = text_splitter.split_documents(data)

In [None]:
print(f"Number of Chunks after splitting the text : {len(chunks)}")

Number of Chunks after splitting the text : 55


In [None]:
print(f"Chunk 1 :- ")
print(chunks[0].page_content)
print(f"Chunk 2 :- ")
print(chunks[1].page_content)

Chunk 1 :- 
BEST BOLLYWOOD MOVIES OF 2023
by Blockbusters2020 | created - 23 Jun 2022 | updated - 1 month ago | Public
Refine See titles to watch instantly, titles you haven't rated, etc
1. Jawan (2023)
Not Rated | 169 min | Action, Thriller
7.5 Rate
A high-octane action thriller which outlines the emotional journey of a man who is set to rectify the
wrongs in the society.
Chunk 2 :- 
wrongs in the society.
Director: Atlee | Stars: Shah Rukh Khan, Nayanthara, Vijay Sethupathi, Deepika Padukone
Votes: 62,729
2. Pathaan (2023)
Not Rated | 146 min | Action, Adventure, Thriller
5.9 Rate 47 Metascore
An Indian agent races against a doomsday clock as a ruthless mercenary, with a bitter vendetta, mounts
an apocalyptic attack against the country.


A few more splitters are available in LangChain; they can be explored using the link below.

[Text Splitters in Langchain](https://python.langchain.com/docs/modules/data_connection/document_transformers/)

## **<center> 4.2 Other transformations </center>**

We can perform a number of transformations on documents beyond simply splitting the text. With the EmbeddingsRedundantFilter we can **identify similar documents and filter out redundancies.** With integrations like doctran we can do things like **translate documents from one language to another**, **extract desired properties** and add them to metadata, and convert conversational dialogue into a Q/A format set of documents.

# **<center> 5. Text embedding models </center>**

The Embeddings class is a class in langchain designed for interfacing with text embedding models. There are lots of **embedding model providers (OpenAI, Cohere, Hugging Face, etc)** - this class is designed to provide a standard interface for all of them.

**Embeddings create a vector representation** of a piece of text. This is useful because it means we can think about text in the vector space, and do things like **semantic search** where we look for pieces of text that are most similar in the vector space.

The base Embeddings class in LangChain provides **two methods**: one for embedding documents and one for embedding a query. The former takes as input multiple texts, while the latter takes a single text. The reason for having these as two separate methods is that some embedding providers have different embedding methods for documents (to be searched over) vs queries (the search query itself).
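The two-method interface can be illustrated with a toy model. The `ToyHashEmbeddings` class below is hypothetical and captures no real semantics (it simply hashes words into buckets); it is only meant to show the shape of the inputs and outputs: a list of texts in, a list of vectors out for documents, and a single text in, a single vector out for a query.

```python
import hashlib
import math
from typing import List


class ToyHashEmbeddings:
    """Hypothetical embedding model: hashes words into a small, L2-normalized vector."""

    def __init__(self, dim: int = 16):
        self.dim = dim

    def _embed(self, text: str) -> List[float]:
        vec = [0.0] * self.dim
        for word in text.lower().split():
            # A stable hash keeps the result reproducible across runs.
            bucket = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % self.dim
            vec[bucket] += 1.0
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        return [v / norm for v in vec]

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # Many texts in, one vector per text out.
        return [self._embed(t) for t in texts]

    def embed_query(self, text: str) -> List[float]:
        # One text in, one vector out.
        return self._embed(text)


toy_model = ToyHashEmbeddings(dim=16)
doc_vectors = toy_model.embed_documents(["hello world", "goodbye world"])
query_vector = toy_model.embed_query("hello world")
print(len(doc_vectors), len(doc_vectors[0]))  # 2 16
```

In this toy both methods share one code path; a real provider may process documents and queries differently, which is why the interface keeps them separate.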

[Text Embeddings models](https://python.langchain.com/docs/integrations/text_embedding/)

[Embedding Models Leaderboard](https://huggingface.co/spaces/mteb/leaderboard)

In [None]:
from langchain.embeddings import OpenAIEmbeddings, HuggingFaceBgeEmbeddings

In [None]:
# embedding_model = OpenAIEmbeddings()
embedding_model = HuggingFaceBgeEmbeddings(
                        model_name="BAAI/bge-base-en",  #can try out the other models here
                        model_kwargs={'device': 'cpu'},
                        encode_kwargs={'normalize_embeddings': True} # set True to compute cosine similarity
                        )

In [None]:
embeddings = embedding_model.embed_documents(
    [
        "Hi there!",
        "Oh, hello!",
        "What's your name?",
        "My friends call me World",
        "Hello World!"
    ]
)
len(embeddings), len(embeddings[0])

(5, 768)

Embed a single piece of text for the purpose of comparing to other embedded pieces of texts.

In [None]:
embedded_query = embedding_model.embed_query("What was the name mentioned in the conversation?")
embedded_query[:5]

[-0.01641130819916725,
 -0.0365167036652565,
 -0.028666306287050247,
 -0.004012116231024265,
 0.018506264314055443]

We don't want to re-embed the documents every time we ask a question. So we will use **vector stores** to store the document embeddings once and reuse them for every query.


# **<center> 6. Vector stores </center>**

One of the most common ways to store and search over unstructured data is to embed it and **store the resulting embedding vectors**, and then at query time to embed the unstructured query and **retrieve the embedding vectors that are "most similar"** to the embedded query.

A vector store takes care of:
- **Storing embedded data** in a vector database
- **Performing vector search** for a given query
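These two responsibilities can be sketched with a tiny in-memory store. `ToyVectorStore` is a hypothetical class written for illustration; it scans every stored vector linearly and scores with a dot product, which assumes the vectors are already L2-normalized. Real vector databases use dedicated indexes rather than a linear scan.

```python
from typing import List, Tuple


class ToyVectorStore:
    """Hypothetical in-memory store: keeps (vector, text) rows and scans them linearly."""

    def __init__(self) -> None:
        self._rows: List[Tuple[List[float], str]] = []

    def add(self, vector: List[float], text: str) -> None:
        # Responsibility 1: store the embedded data.
        self._rows.append((vector, text))

    def search(self, query_vector: List[float], k: int = 4) -> List[Tuple[str, float]]:
        # Responsibility 2: score every row against the query and return the top k.
        scored = [
            (text, sum(q * v for q, v in zip(query_vector, vector)))
            for vector, text in self._rows
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


store = ToyVectorStore()
store.add([1.0, 0.0], "action movie")
store.add([0.0, 1.0], "romance movie")
store.add([0.6, 0.8], "action romance movie")

results = store.search([1.0, 0.0], k=2)
print([text for text, _ in results])  # ['action movie', 'action romance movie']
```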

In [None]:
# @title
from IPython.display import HTML
HTML("""<video width="1000" height="540" controls>
  <source src="https://drive.google.com/uc?id=1Cb5xbpvxAKCZebO4CYoRSTHjIqoPuXZG" type="video/mp4">
</video>""")

Again, there are many vector stores to choose from: some are open source, some are offered as cloud services, and so on. The blog post linked below can help you choose a vector store for your use case.

[Vector Stores](https://thedataquarry.com/posts/vector-db-1/)

This session showcases basic functionality related to vector stores. A **key part** of working with vector stores is **creating the vectors to put** in them, which is usually done via embeddings.

This session uses the chroma vector database, which runs on your local machine as a library.

In [None]:
!pip install chromadb==0.4.18

In [None]:
from langchain.vectorstores import Chroma
# from langchain.vectorstores import FAISS


In [None]:
# Creating the embeddings of the chunks and storing them into the vector store
db = Chroma.from_documents(chunks,embedding_model)
# db = FAISS.from_documents(chunks, embedding_model)

# **<center> 7. Retrievers </center>**

A retriever is an interface that returns documents given an unstructured query. It is more general than a vector store. A retriever does not need to be able to store documents, only to return (or retrieve) them. Vector stores can be used as the backbone of a retriever, but there are other types of retrievers as well.

## **<center> 7.1 Similarity search </center>**

Using similarity search, we can find the top k chunks that are most similar to the query. In Chroma we can tweak the following parameters to get good results.

- **`query`** :- Query text to search for.

- **`k`** :- Number of results to return. Defaults to 4.

- **`filter`** :- Filter by metadata. Defaults to None.

[Cosine Similarity](https://www.learndatasci.com/glossary/cosine-similarity/#:~:text=Both%20vectors%20need%20to%20be,of%20the%20angle%20between%20them.)

![](https://python.langchain.com/assets/images/vector_stores-9dc1ecb68c4cb446df110764c9cc07e0.jpg)
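As a rough illustration of how `k` and `filter` interact with cosine similarity, here is a small sketch in plain Python. The `cosine_similarity` and `top_k` functions are hypothetical helpers, and the filter here supports exact matches only; it is not Chroma's filter syntax. Note also that some stores report a distance (lower is closer) rather than a similarity (higher is closer).

```python
import math
from typing import Dict, List, Optional, Tuple

Row = Tuple[List[float], str, Dict]  # (vector, text, metadata)


def cosine_similarity(a: List[float], b: List[float]) -> float:
    # Cosine of the angle between a and b: dot product divided by both lengths.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: List[float],
    rows: List[Row],
    k: int = 4,
    metadata_filter: Optional[Dict] = None,
) -> List[Tuple[str, float]]:
    """Rank rows by cosine similarity, keeping only rows whose metadata matches exactly."""
    if metadata_filter:
        rows = [
            r for r in rows
            if all(r[2].get(key) == val for key, val in metadata_filter.items())
        ]
    scored = [(text, cosine_similarity(query_vec, vec)) for vec, text, _ in rows]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


rows_demo: List[Row] = [
    ([1.0, 0.0], "chunk A", {"page": 0}),
    ([0.6, 0.8], "chunk B", {"page": 4}),
    ([0.0, 1.0], "chunk C", {"page": 4}),
]
unfiltered = top_k([1.0, 0.0], rows_demo, k=2)
filtered = top_k([1.0, 0.0], rows_demo, k=2, metadata_filter={"page": 4})
print([text for text, _ in unfiltered])  # ['chunk A', 'chunk B']
print([text for text, _ in filtered])    # ['chunk B', 'chunk C']
```

The filter is applied before ranking, so a highly similar chunk on the wrong page (chunk A here) never reaches the results.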

In [None]:
query = "Tell us about Jawan"
docs = db.similarity_search_with_score(query, k = 6)

In [None]:
print(docs[4][0].metadata)

{'CreationDate': "D:20231020075614+00'00'", 'Creator': 'Pdfcrowd.com v20200619.084', 'ModDate': "D:20231020075614+00'00'", 'Producer': 'Skia/PDF', 'Title': 'BEST BOLLYWOOD MOVIES OF 2023 - IMDb', 'file_path': '/content/drive/MyDrive/IMBD_Movies_List_2023[1].pdf', 'page': 2, 'source': '/content/drive/MyDrive/IMBD_Movies_List_2023[1].pdf', 'total_pages': 10}


We can also use the `filter` parameter to filter the documents on the basis of their metadata.

In [None]:
query = "highest rating movie"
docs = db.similarity_search_with_score(query, k = 6, filter = {"page" : {"$in" : [4,5,6]}})

In [None]:
if docs:
  # An expression inside an `if` block is not displayed automatically, so print it
  print(docs[0][0].page_content)

## **<center> 7.2 Maximum marginal relevance search (MMR) </center>**

MMR tries to **reduce the redundancy of results** while still keeping them relevant to the query: each new result is chosen for being similar to the query and dissimilar to the results already selected. In Chroma we can tweak the following parameters to get good results.


- **`query`** – Text to look up documents similar to.

- **`k`** – Number of Documents to return. Defaults to 4.

- **`fetch_k`** – Number of Documents to fetch to pass to MMR algorithm.

- **`lambda_mult`** – Number between 0 and 1 that determines the degree of diversity among the results with 0 corresponding to maximum diversity and 1 to minimum diversity. Defaults to 0.5.

- **`filter`** – Filter by metadata. Defaults to None.
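The selection loop behind MMR can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's implementation: `mmr` is a hypothetical helper, `doc_vecs` plays the role of the `fetch_k` candidates already fetched by plain similarity, and each step picks the candidate that maximizes `lambda_mult * relevance - (1 - lambda_mult) * redundancy`.

```python
import math
from typing import List


def _cos(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mmr(query_vec: List[float], doc_vecs: List[List[float]],
        k: int = 4, lambda_mult: float = 0.5) -> List[int]:
    """Return indices of up to k candidates chosen by maximal marginal relevance."""
    candidates = list(range(len(doc_vecs)))
    selected: List[int] = []
    while candidates and len(selected) < k:
        best_idx, best_score = candidates[0], -math.inf
        for idx in candidates:
            relevance = _cos(query_vec, doc_vecs[idx])
            # Redundancy: similarity to the closest already-selected document.
            redundancy = max((_cos(doc_vecs[idx], doc_vecs[j]) for j in selected), default=0.0)
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = idx, score
        selected.append(best_idx)
        candidates.remove(best_idx)
    return selected


query_demo = [0.8, 0.6]
candidates_demo = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # index 1 duplicates index 0

diverse = mmr(query_demo, candidates_demo, k=2, lambda_mult=0.5)
relevance_only = mmr(query_demo, candidates_demo, k=2, lambda_mult=1.0)
print(diverse)         # [0, 2]
print(relevance_only)  # [0, 1]
```

With `lambda_mult=0.5` the exact duplicate is skipped in favour of a less similar but novel candidate; with `lambda_mult=1.0` the redundancy term vanishes and the duplicate is returned. Because MMR can only choose among the candidates it is given, `fetch_k` needs to be at least as large as `k`.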

<center><img src="https://drive.google.com/uc?id=17clTCfqLttUUQAjUcxbDB5NPIJ1gr5CL"  width="600" height="500"></center>


In [None]:
query = "Tell us about Jawan movie"
docs = db.max_marginal_relevance_search(query, k = 6, fetch_k = 20)  # fetch_k should be >= k so MMR has enough candidates to choose from

## **<center> 7.3 Self-querying Retriever </center>**

A self-querying retriever is one that, as the name suggests, has the **ability to query itself.** Specifically, given any natural language query, the retriever uses a query-constructing LLM chain to write a **structured query** and then applies that structured query to its underlying VectorStore.

This allows the retriever to not only use the user-input query for semantic similarity comparison with the contents of stored documents but to also **extract filters from the user query on the metadata of stored documents and to execute those filters.**
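To show what "executing a filter extracted from the query" amounts to, here is a small sketch in plain Python. Everything in it is hypothetical: the `(comparator, attribute, value)` tuple format, the `COMPARATORS` table, and the hand-written `structured_query`, which stands in for what a query-constructing LLM might produce. No LLM is called.

```python
import operator
from typing import Any, Dict, List, Tuple

# Illustrative subset of comparators a structured query might use.
COMPARATORS = {
    "eq": operator.eq,
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}

Filter = Tuple[str, str, Any]  # (comparator, attribute, value)


def apply_filter(docs: List[Dict], flt: Filter) -> List[Dict]:
    comparator, attribute, value = flt
    compare = COMPARATORS[comparator]
    # Documents that lack the attribute cannot satisfy the filter.
    return [
        d for d in docs
        if attribute in d["metadata"] and compare(d["metadata"][attribute], value)
    ]


movies = [
    {"text": "Dinosaurs are brought back", "metadata": {"year": 1993, "rating": 7.7}},
    {"text": "Dreams within dreams", "metadata": {"year": 2006, "rating": 8.6}},
    {"text": "Toys come alive", "metadata": {"year": 1995}},
]

# Hand-written stand-in for the structured form of
# "Name the movies with ratings greater than 8.5".
structured_query = {"query": "", "filter": ("gt", "rating", 8.5)}

matches = apply_filter(movies, structured_query["filter"])
print([m["text"] for m in matches])  # ['Dreams within dreams']
```

A real self-querying retriever would then run the remaining free-text part of the query as a semantic search over the filtered documents.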

![](https://drive.google.com/uc?id=1OQUN-0MJcDUxmPXofgS7MqReEs720pqS)

In [None]:
# Creating the sample document just for understanding purposes
from langchain.schema import Document

sample_docs = [
    Document(page_content="A bunch of scientists bring back dinosaurs and mayhem breaks loose", metadata={"year": 1993, "rating": 7.7, "genre": "action"}),
    Document(page_content="Leo DiCaprio gets lost in a dream within a dream within a dream within a ...", metadata={"year": 2010, "director": "Christopher Nolan", "rating": 8.2}),
    Document(page_content="A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea", metadata={"year": 2006, "director": "Satoshi Kon", "rating": 8.6}),
    Document(page_content="A bunch of normal-sized women are supremely wholesome and some men pine after them", metadata={"year": 2019, "director": "Greta Gerwig", "rating": 8.3}),
    Document(page_content="Toys come alive and have a blast doing so", metadata={"year": 1995, "genre": "animated"}),
    Document(page_content="Three men walk into the Zone, three men walk out of the Zone", metadata={"year": 1979, "rating": 9.9, "director": "Andrei Tarkovsky", "genre": "science fiction"})
]

In [None]:
sample_db = Chroma.from_documents(sample_docs,embedding_model)

In [None]:
from langchain.llms import OpenAI
from langchain.retrievers.self_query.base import SelfQueryRetriever
from langchain.chains.query_constructor.base import AttributeInfo

metadata_field_info=[
    AttributeInfo(
        name="genre",
        description="The genre of the movie",
        type="string or list[string]",
    ),
    AttributeInfo(
        name="year",
        description="The year the movie was released",
        type="integer",
    ),
    AttributeInfo(
        name="director",
        description="The name of the movie director",
        type="string",
    ),
    AttributeInfo(
        name="rating",
        description="A 1-10 rating for the movie",
        type="float"
    ),
]
document_content_description = "Brief summary of a movie"
llm = OpenAI(openai_api_key="<YOUR API KEY>")  # use a placeholder or an environment variable; never commit a real key

In [None]:
retriever = SelfQueryRetriever.from_llm(llm,
                                        sample_db,
                                        document_content_description,
                                        metadata_field_info,
                                        verbose=True)

In [None]:
question = "Name the movies with ratings greater than 8.5"

In [None]:
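# Note: `docs` still holds the MMR results from section 7.2, which is what the output below shows.
# To run the self-querying retriever instead, first call: docs = retriever.get_relevant_documents(question)
for d in docs:
    print(d.page_content)
    print(d.metadata)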

BEST BOLLYWOOD MOVIES OF 2023
by Blockbusters2020 | created - 23 Jun 2022 | updated - 1 month ago | Public
Refine See titles to watch instantly, titles you haven't rated, etc
1. Jawan (2023)
Not Rated | 169 min | Action, Thriller
7.5 Rate
A high-octane action thriller which outlines the emotional journey of a man who is set to rectify the
wrongs in the society.
{'CreationDate': "D:20231020075614+00'00'", 'Creator': 'Pdfcrowd.com v20200619.084', 'ModDate': "D:20231020075614+00'00'", 'Producer': 'Skia/PDF', 'Title': 'BEST BOLLYWOOD MOVIES OF 2023 - IMDb', 'file_path': '/content/drive/MyDrive/IMBD_Movies_List_2023[1].pdf', 'page': 0, 'source': '/content/drive/MyDrive/IMBD_Movies_List_2023[1].pdf', 'total_pages': 10}
breakup with Tapan. Through the journey, they discover each other's life and complement in
accomplishing what was left halfway.
Director: Sameer Vidwans | Stars: Kartik Aaryan, Kiara Advani, Supriya Pathak, Gajraj Rao
Votes: 27,782
Watch on Prime Video
included with Prime
Conv

# **<center> 8. Question Answering </center>**

We have discussed how to retrieve the documents that are relevant to a question. Now we take these **documents** and the original **question**, **pass both of them to a language model**, and ask the language model to answer the question.

**RetrievalQA Chain**

We will first see how to do question answering after multiple relevant splits have been retrieved from the vector store. We may also need to compress the relevant splits to fit into the LLM context. Finally, we send these splits along with a system prompt and human question to the language model to get the answer.

The objective here is to ask the LLM a question using the format shown in the image below.

<center><img src="https://drive.google.com/uc?id=1puSUYQ1H9Eks2vrGz1-9OuoQ6RPnMmyK"  width="500" height="400"></center>

**Here, the context is the top chunks retrieved by the retriever on the basis of similarity (in this case), and the instructions tell the LLM how to treat that knowledge and generate the answer.**
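The instructions + context + question format can be sketched with ordinary string formatting. The shortened `TEMPLATE` and the `build_prompt` helper below are hypothetical and written only for illustration; the retrieved chunks are simply joined and placed into the `{context}` slot.

```python
from typing import List

# Shortened, hypothetical template: instructions, then context, then the question.
TEMPLATE = """Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know.
{context}
Question: {question}
Helpful Answer:"""


def build_prompt(retrieved_chunks: List[str], user_question: str) -> str:
    # Join every retrieved chunk and place the result into the single {context} slot.
    context = "\n\n".join(retrieved_chunks)
    return TEMPLATE.format(context=context, question=user_question)


prompt_text = build_prompt(
    ["1. Jawan (2023) ... Director: Atlee", "2. Pathaan (2023) ..."],
    "Who directed Jawan?",
)
print(prompt_text)
```

The string returned by `build_prompt` is what would be sent to the language model in a single call.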

Working of QA System

<center><img src="https://drive.google.com/uc?id=18uWOBhbOU4kMVkJE25_UMFW0bEl4RpFh"  width="700" height="400"></center>


In [None]:
from langchain.chains import RetrievalQA

In [None]:
# Hyperparameters
top_n_chunks = 8

In [None]:
from langchain.prompts import PromptTemplate

template = """Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
Use three sentences maximum and keep the answer as concise as possible.
Always say "thanks for asking!" at the end of the answer.
{context}
Question: {question}
Helpful Answer:"""

prompt = PromptTemplate.from_template(template)

In [None]:
from langchain.chat_models import ChatOpenAI

In [None]:
### This cell will only work if you have an OpenAI API key and the key is set as an environment variable
llm = ChatOpenAI(temperature=0)

ValidationError: ignored

If you don't have an OpenAI API key, you can use open-source models from Hugging Face by running the cells below.

In [None]:
# import os
# os.environ['HUGGINGFACEHUB_API_TOKEN']= "<YOUR HUGGINGFACE API TOKEN>"

In [None]:
# from langchain.llms import HuggingFaceHub

# repo_id = "google/flan-t5-xxl"

# llm = HuggingFaceHub(
#     repo_id=repo_id, model_kwargs={"temperature": 0.5, "max_length": 2048}
# )

In [None]:
chain = RetrievalQA.from_chain_type(
            llm,
            chain_type = 'stuff',
            chain_type_kwargs={"prompt": prompt},
            return_source_documents=True,
            retriever = db.as_retriever(
                search_kwargs={'k': top_n_chunks}),
        )

In [None]:
question = "what is the cast and director  Bloody Daddy movie"

In [None]:
answer = chain({"query": question})

RateLimitError: ignored

In [None]:
answer['result']

NameError: ignored

The **`chain_type`** argument is worth a closer look: it controls how the retrieved documents are added to the prompt as context. The same document chains are also useful for summarizing documents, extracting information from documents, and more. `chain_type` supports four methods:

1. Stuff
2. Refine
3. Map Reduce
4. Map Re-rank

[Chain Types](https://python.langchain.com/docs/modules/chains/document/)
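To give a feel for how two of these methods differ, here is a rough sketch of "stuff" versus "map reduce" in plain Python. The `stuff` and `map_reduce` functions and the `CountingLLM` stub are hypothetical simplifications, not LangChain's chains; the stub records prompts instead of calling a real model, so the only thing the example demonstrates is how many model calls each strategy makes.

```python
from typing import Callable, List


class CountingLLM:
    """Stub that records prompts instead of calling a real model."""

    def __init__(self) -> None:
        self.prompts: List[str] = []

    def __call__(self, prompt: str) -> str:
        self.prompts.append(prompt)
        return f"answer#{len(self.prompts)}"


def stuff(llm: Callable[[str], str], docs: List[str], question: str) -> str:
    # Stuff: put every document into one prompt and make a single call.
    return llm("\n\n".join(docs) + f"\n\nQuestion: {question}")


def map_reduce(llm: Callable[[str], str], docs: List[str], question: str) -> str:
    # Map: ask the question against each document separately.
    partial_answers = [llm(f"{doc}\n\nQuestion: {question}") for doc in docs]
    # Reduce: combine the partial answers with one final call.
    return llm("\n\n".join(partial_answers) + f"\n\nQuestion: {question}")


docs_for_demo = ["chunk one", "chunk two", "chunk three"]

stuff_llm = CountingLLM()
stuff(stuff_llm, docs_for_demo, "Who directed Jawan?")

map_reduce_llm = CountingLLM()
map_reduce(map_reduce_llm, docs_for_demo, "Who directed Jawan?")

print(len(stuff_llm.prompts), len(map_reduce_llm.prompts))  # 1 4
```

"Stuff" is the cheapest option but requires all chunks to fit into the context window at once; "map reduce" trades extra calls for the ability to handle more text than one prompt can hold.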

# **<center> References </center>**

1. [Retrieval Augmented Generation (RAG)](https://www.promptingguide.ai/techniques/rag)
2. [Chat With your data Using langchain](https://python.langchain.com/docs/modules/data_connection/)
3. [Langchain Document Loaders](https://python.langchain.com/docs/modules/data_connection/document_loaders/)
4. [PDF loaders in Langchain](https://python.langchain.com/docs/modules/data_connection/document_loaders/pdf#using-pdfplumber)
5. [Langchain Document Transformers](https://python.langchain.com/docs/modules/data_connection/document_transformers/)
6. [Text Embeddings models](https://python.langchain.com/docs/integrations/text_embedding/)
7. [Maximum marginal relevance search (MMR)](https://medium.com/tech-that-works/maximal-marginal-relevance-to-rerank-results-in-unsupervised-keyphrase-extraction-22d95015c7c5)
8. [Langchain Retrievers](https://python.langchain.com/docs/modules/data_connection/retrievers/)
9. [QA System](https://medium.com/@onkarmishra/using-langchain-for-question-answering-on-own-data-3af0a82789ed)