# This notebook showcases a RAG use case
- Uses LangChain
- Uses a FAISS vector store
- Uses Hugging Face embeddings
- Demonstrates how to split documents into multiple chunks
- Demonstrates how to query the embeddings in the vector store
- Demonstrates calling BAM models to answer the query

In [None]:
!pip install pypdf

In [None]:
from langchain.document_loaders import UnstructuredPDFLoader, OnlinePDFLoader, PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain.embeddings import HuggingFaceInstructEmbeddings
from langchain.llms import OpenAI
from langchain.chains.question_answering import load_qa_chain

#BAM
from genai.extensions.langchain import LangChainInterface
from genai.schemas import ModelType, GenerateParams
from genai.model import Credentials

from dotenv import load_dotenv

import os
import pickle

### Pass Credentials

Create a file named .env in the same directory and include the following:

```
GENAI_KEY=YOUR_GENAI_API_KEY
GENAI_API=https://workbench-api.res.ibm.com/v1/
```
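
A missing or misspelled entry in `.env` only surfaces later as an authentication error, so it can help to fail early. The sketch below is one way to do that; `require_env` is a hypothetical helper (not part of `dotenv` or `genai`), and the `demo_env` dictionary stands in for the real environment so the example runs without any credentials.

```python
import os
from typing import Mapping


def require_env(name: str, environ: Mapping[str, str] = os.environ) -> str:
    """Return the value of `name`, or raise if it is missing or empty.

    Hypothetical helper for illustration; it is not part of dotenv or genai.
    """
    value = environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; check your .env file")
    return value


# Stand-in for the real environment, using the placeholder values from above.
demo_env = {
    "GENAI_KEY": "YOUR_GENAI_API_KEY",
    "GENAI_API": "https://workbench-api.res.ibm.com/v1/",
}
print(require_env("GENAI_KEY", demo_env))
```

In the notebook itself, the same helper could be called as `require_env("GENAI_KEY")` after `load_dotenv(".env")` has run.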

In [None]:
load_dotenv(".env")
api_key = os.getenv("GENAI_KEY", None)
api_endpoint = os.getenv("GENAI_API", None)

# creds object
creds = Credentials(api_key=api_key, api_endpoint=api_endpoint)

## Global settings

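- chunk_size: maximum size of each chunk the documents are split into (RecursiveCharacterTextSplitter measures this in characters by default)
- chunk_overlap: number of characters shared between consecutive chunks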
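
To see how the two settings interact, here is a minimal sketch of a fixed-size sliding window over a string. It is a simplification for illustration only: `split_with_overlap` is a hypothetical function, and RecursiveCharacterTextSplitter additionally prefers to break on separators such as paragraphs and newlines, which this sketch does not do.

```python
from typing import List


def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Cut text into windows of at most chunk_size characters,
    where each window repeats the last chunk_overlap characters of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
sample_chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=3)
print(sample_chunks)
# ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

The overlap means that a sentence cut at a chunk boundary still appears in part in the next chunk, at the cost of storing some text twice.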


In [None]:
chunk_size = 2000
chunk_overlap = 100


## Loading the PDF file using PyPDFLoader

In [None]:
loader = PyPDFLoader("Over-the-Range Microwave with Sensor Cooking.pdf")
data = loader.load()

## Total number of documents (pages) in the pdf

In [None]:
# every page in the PDF is counted as a separate document
print(f'You have {len(data)} document(s) in your data')


## Printing the first page of the pdf file

In [None]:
print(f'There are {len(data[0].page_content)} characters in the first page')
print(f"Content of the first page:\n{data[0].page_content}")

## Splitting the documents into multiple chunks based on the chunk size set earlier

In [None]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=chunk_size, chunk_overlap=chunk_overlap)
docs = text_splitter.split_documents(data)

In [None]:
print(f'Total number of documents after the split: {len(docs)}')

## Loading Hugging Face Embeddings
- The first time you run the cell below, it takes a while because the model has to be downloaded

In [None]:
embeddings = HuggingFaceInstructEmbeddings(
    model_name="hkunlp/instructor-large",
    model_kwargs={"device": "cpu"},
)

## Vector Store: FAISS
- We have our documents and embeddings ready.
- Here we store the documents and their embeddings in the vector store.

In [None]:
#https://python.langchain.com/en/latest/modules/indexes/vectorstores/examples/faiss.html?highlight=faiss#faiss
# this will take a few minutes to run
db = FAISS.from_documents(docs, embeddings)

In [None]:
with open("db.pkl", "wb") as f:
    pickle.dump(db, f)

In [None]:
# Load the database from disk. If the database is saved, you can load it directly and don't have to regenerate it each time you run the notebook.
with open("db.pkl", "rb") as f:
    db = pickle.load(f)

## Let's test our embeddings
- We pass the query and look up the 3 chunks whose embeddings are closest to it.
- We then print those 3 chunks from the PDF file.
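
Conceptually, the lookup ranks every stored vector by its distance to the query vector and keeps the nearest k. The sketch below shows that idea with plain Python and made-up two-dimensional vectors. It assumes Euclidean (L2) distance as the metric; the names `l2_distance`, `top_k`, and `toy_index` are hypothetical, and real embeddings have hundreds of dimensions.

```python
import math
from typing import List, Sequence, Tuple


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(
    query_vec: Sequence[float],
    indexed: List[Tuple[str, Sequence[float]]],
    k: int,
) -> List[Tuple[float, str]]:
    """Return the k (distance, text) pairs closest to query_vec, nearest first."""
    scored = [(l2_distance(query_vec, vec), text) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[0])
    return scored[:k]


# Made-up vectors purely for illustration; they are not real embeddings.
toy_index = [
    ("Cooking eggs", [0.9, 0.1]),
    ("Cleaning the filter", [0.1, 0.9]),
    ("Defrosting meat", [0.7, 0.3]),
    ("Installation", [0.0, 1.0]),
]
toy_query = [1.0, 0.0]

nearest = top_k(toy_query, toy_index, k=3)
for distance, text in nearest:
    print(f"{distance:.3f}  {text}")
```

A library such as FAISS performs the same ranking with optimized index structures, so it stays fast when there are many vectors.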

In [None]:
query = "How to cook eggs"
# use a separate name so the split chunks in `docs` are not overwritten
matches = db.similarity_search(query, k=3)
print(len(matches))
# print the retrieved chunks, separated by a divider line
print("\n----\n".join(match.page_content for match in matches))

## Creating the LLM
- Here we use LangChainInterface to create our BAM model

In [None]:
model_llm = LangChainInterface(
    model=ModelType.FLAN_T5_11B,
    credentials=creds,
    params=GenerateParams(
        decoding_method="greedy",
        max_new_tokens=300,
        min_new_tokens=15,
        repetition_penalty=2,
    ).dict(),
)

## Loading the LangChain QA chain
- Creating a chain to get answers from our BAM models.
- Here we pass chain_type as "stuff", which means all the retrieved documents are inserted into a single prompt for the model
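
The idea behind "stuff" can be sketched in a few lines: join the retrieved chunks into one context block and place the question after it. The function `build_stuff_prompt` and its prompt wording are hypothetical and only illustrate the technique; the template LangChain actually uses may differ.

```python
from typing import List


def build_stuff_prompt(chunks: List[str], question: str) -> str:
    """Concatenate all chunks into one context block and append the question.

    The wording of this template is an assumption made for illustration.
    """
    context = "\n\n".join(chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


example_prompt = build_stuff_prompt(
    ["Chunk one text.", "Chunk two text."],
    "How to cook eggs",
)
print(example_prompt)
```

Because every chunk goes into one prompt, the combined length of the chunks and the question has to fit within the model's input limit, which is one reason to keep both `k` and the chunk size modest.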

In [None]:
chain = load_qa_chain(model_llm, chain_type="stuff")

## Let's retrieve the relevant chunks for the query

In [None]:
query = "How to cook eggs"
doc = db.similarity_search(query, k=3)
print(len(doc))

## Finally, it's time to call our BAM model
- Here we pass the retrieved chunks and the query to the BAM model

In [None]:
chain.run(input_documents=doc, question=query)

## End of the notebook