# **RAG Application using DocRetriever**

# **Requirement Phase**

In [None]:
!pip install -r "requirements.txt"



<hr>

# **Data Loading**

### **Libraries Required**

In [None]:
# This is for input / output operation
import os
import time
# Warning to be ignored
import warnings
warnings.filterwarnings("ignore")
from google.colab import userdata
# This library is for loading textual data
from langchain.document_loaders import TextLoader
# This library will handle the splitting part of the data
from langchain.text_splitter import CharacterTextSplitter
# This library will handle embedding of data
from langchain.embeddings import HuggingFaceEmbeddings

### **Data Loading**
  **`HuggingFaceEmbeddings` is a class provided by LangChain that wraps Hugging Face's pre-trained sentence-transformer models for text embedding. An embedding is a numerical vector that captures the semantic and syntactic content of a piece of text. It is used in the Embeddings step further down; `TextLoader` handles the actual loading of the text file.**

In [None]:
# We will create a Text Loader that will help us in loading the document into the environment
# TextLoader will expect the path of your data
loader = TextLoader("/content/Machine Learning Operations.txt")

#### **Creating the documents for storing the data into DB**

In [None]:
# The loader returns the file as Document objects, the form in which data is stored in the DB.
# Moreover, documents are easy to split into chunks.
documents = loader.load()

### **Splitting of data**
  **The model's context window is bounded by a token limit, so it cannot take in the complete document in one go. We therefore break the documents into smaller chunks.**

In [None]:
text_splitter = CharacterTextSplitter(chunk_size = 1000, chunk_overlap = 4)

In [None]:
docs = text_splitter.split_documents(documents)
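**To make `chunk_size` and `chunk_overlap` concrete, here is a simplified sketch of overlapping chunking in plain Python. It is an illustration only: `chunk_text` is a hypothetical helper, and `CharacterTextSplitter` itself splits on a separator before merging pieces, so its chunk boundaries will differ.**

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into character windows of chunk_size that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = chunk_text(sample, chunk_size=10, chunk_overlap=4)
print(chunks)
```

**Each chunk starts with the last four characters of the previous one, which is what the overlap is for: a sentence cut at a chunk boundary still appears partly in both neighbours.**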



### **Embeddings**

In [None]:
embeddings = HuggingFaceEmbeddings()



<hr>

### **Database Setup**

<hr>

**With the embeddings ready, the next step is to store the embedded data in the vector database.**
  * **First, we initialize the Pinecone client with the Pinecone API key**
  * **Next, we check whether the index already exists**
  * **If it exists, we link it to the document search variable; if not, we create it first**
    * **`HuggingFaceEmbeddings`: dimension = 768, metric = cosine**

<hr>

**Index**
  * **Your model processes data (like text or images) and converts it into numerical representations called vectors. These vectors capture the semantic meaning and relationships between different pieces of data.**

  * **When you want to find similar items, you query the index with a query vector. The index uses the specified metric (in this case, cosine similarity) to find the vectors that are most similar to the query vector.**
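**The query step described above can be sketched as a brute-force search over a tiny in-memory "index". This is only an illustration under simplifying assumptions: `toy_index`, `top_k` and the three-dimensional vectors are made up for the example, and a real vector database uses far more efficient search structures.**

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, index, k=2):
    """Return the keys of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), key) for key, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [key for _, key in scored[:k]]


# Hypothetical toy index: three documents embedded in three dimensions
toy_index = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
}

nearest = top_k([1.0, 0.05, 0.0], toy_index, k=2)
print(nearest)
```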

<hr>

#### **PineCone Setup**

In [None]:
!pip install pinecone



In [None]:
from pinecone import Pinecone, ServerlessSpec

#### **Initialize the Pinecone token**

In [None]:
pc = Pinecone(api_key = userdata.get("PCToken"))

#### **Initializing the Pinecone Environment**

In [None]:
# Choose the cloud provider (defaults to AWS)
cloud = os.environ.get("PINECONE_CLOUD") or "aws"
# Choose the region on that provider
region = os.environ.get("PINECONE_REGION") or "us-east-1"
# Build the serverless spec that describes where the index will be hosted
serv = ServerlessSpec(cloud = cloud, region = region)

#### **Create the storage index**

**Note: an index name may only contain lowercase letters, digits, and hyphens**
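**A quick local check can catch an invalid name before any API call is made. The sketch below assumes the only rule is the character set stated in the note; `is_valid_index_name` is a hypothetical helper, and Pinecone may enforce further limits (such as a maximum length) that are not checked here.**

```python
import re

# Assumption: lowercase letters, digits and hyphens are the only allowed characters
_INDEX_NAME_PATTERN = re.compile(r"[a-z0-9-]+")


def is_valid_index_name(name: str) -> bool:
    """Return True if name is non-empty and uses only the allowed characters."""
    return _INDEX_NAME_PATTERN.fullmatch(name) is not None


print(is_valid_index_name("kart-15th-03"))
print(is_valid_index_name("Kart_15th"))
```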

In [None]:
index_name = "kart-15th-03"

In [None]:
# Check whether the index name already exists in the Pinecone project
if index_name not in pc.list_indexes().names():
  # If not, create the index
  pc.create_index(
      name = index_name,
      dimension = 768,
      metric = "cosine",
      spec = serv
  )
  # Wait until Pinecone reports the index as ready
  while not pc.describe_index(index_name).status['ready']:
    time.sleep(1)

# Check to see if the index is ready
print("Index before inserting")
print(pc.Index(index_name).describe_index_stats())

Index before inserting
{'dimension': 768,
 'index_fullness': 0.0,
 'namespaces': {'': {'vector_count': 15}},
 'total_vector_count': 15}
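**The readiness loop above polls forever if the index never becomes ready. One possible safeguard is a small polling helper with a timeout. This is a sketch, not part of the Pinecone client: `wait_until` and `fake_ready` are hypothetical names, and `fake_ready` merely stands in for the real readiness check.**

```python
import time


def wait_until(is_ready, timeout_s=60.0, interval_s=1.0):
    """Call is_ready() repeatedly until it returns True or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while not is_ready():
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout_s} seconds")
        time.sleep(interval_s)


# Stand-in readiness check that succeeds on the third call
calls = {"n": 0}


def fake_ready():
    calls["n"] += 1
    return calls["n"] >= 3


wait_until(fake_ready, timeout_s=5.0, interval_s=0.01)
print(calls["n"])
```

**In the notebook, the stand-in would be replaced by a callable wrapping the check already used above, for example `lambda: pc.describe_index(index_name).status['ready']`.**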


**Cosine Similarity: a comparison of two vectors based on the angle between them. The smaller the angle (the closer its cosine is to 1), the more similar the two vectors are considered to be.**
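**Written out, the measure for two vectors $A$ and $B$ is the dot product divided by the product of their lengths. A small worked example with made-up two-dimensional vectors follows the formula.**

$$
\cos(\theta) = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}
$$

$$
A = (1, 0),\quad B = (1, 1):\qquad
\cos(\theta) = \frac{1 \cdot 1 + 0 \cdot 1}{\sqrt{1^2 + 0^2}\,\sqrt{1^2 + 1^2}} = \frac{1}{\sqrt{2}} \approx 0.707
$$

**This corresponds to an angle of 45°. Identical directions give a value of 1, and perpendicular vectors give 0.**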

#### **Adding the data into database**

In [None]:
# Read the Pinecone API key
PINECONE_API_KEY = userdata.get("PCToken")
# Expose the key through the environment so the LangChain vector store can find it
os.environ["PINECONE_API_KEY"] = PINECONE_API_KEY

In [None]:
from langchain.vectorstores import Pinecone as PineconeVectorStore

**If the index is still empty, we embed the documents and upload them; if it already holds vectors, we connect to the existing index instead**

In [None]:
# The index was created above, so it always appears in pc.list_indexes();
# decide based on whether it already contains vectors
stats = pc.Index(index_name).describe_index_stats()
if stats.total_vector_count == 0:
  docsearch = PineconeVectorStore.from_documents(docs, embeddings, index_name = index_name)
else:
  docsearch = PineconeVectorStore.from_existing_index(index_name, embeddings, pinecone_index = pc.Index(index_name))

<hr>

### **Model Setup**

  **Now that the embedded text is stored in the vector DB, we can set up the model. `HuggingFaceHub` lets us connect to a hosted model and use it directly.**
  * **We define the model ID, in this case `mistralai/Mixtral-8x7B-Instruct-v0.1`**

In [None]:
model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"

In [None]:
from langchain.llms import HuggingFaceHub

#### **LLM Initialization**

In [None]:
llm = HuggingFaceHub(
    repo_id = model_id,
    model_kwargs = {"temperature" : 0.8, "top_k" : 50},
    huggingfacehub_api_token = userdata.get("HFToken")
)



<hr>

### **Prompt Engineering**

In [None]:
from langchain.prompts import PromptTemplate

In [None]:
template = """
You are a MLOPs engineer. The user will ask you a question about Machine Learning Operations.
Use the following piece of context to answer the question.
If you don't know the answer, just say don't know/
Keep the answer brief

Context: {context}
Question: {question}
Answer:

"""

In [None]:
prompt = PromptTemplate(
    template = template,
    input_variables = ["context", "question"]
)

In [None]:
prompt

PromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, template="\nYou are a MLOPs engineer. The user will ask you a question about Machine Learning Operations.\nUse the following piece of context to answer the question.\nIf you don't know the answer, just say don't know/\nKeep the answer brief\n\nContext: {context}\nQuestion: {question}\nAnswer:\n\n")
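**To see what the model actually receives, the placeholders can be filled by hand. The sketch below uses Python's built-in `str.format` on the same template text rather than LangChain, and the context and question values are made up for illustration.**

```python
template = """
You are a MLOPs engineer. The user will ask you a question about Machine Learning Operations.
Use the following piece of context to answer the question.
If you don't know the answer, just say don't know/
Keep the answer brief

Context: {context}
Question: {question}
Answer:

"""

# Hypothetical values standing in for the retrieved chunk and the user's question
filled = template.format(
    context="MLOps covers deploying and monitoring ML models.",
    question="What is MLOps?",
)
print(filled)
```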

### **Chaining it all together**
* **What we have so far**
  * **Pinecone Database (Embedded Data)**
  * **Model (Mixtral)**
  * **PromptTemplate**

**The process begins with the document search: the retriever looks up the chunks most relevant to the question. The retrieved context and the question are then inserted into the prompt, the prompt is passed to the model, and the model returns the output**

In [None]:
from langchain.schema.runnable import RunnablePassthrough
from langchain.schema.output_parser import StrOutputParser

In [None]:
rag_chain = (
    {"context" : docsearch.as_retriever(), "question" : RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
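**In the chain above, the retriever's raw list of documents is placed into `{context}`, so the prompt contains the full `Document(...)` representation including metadata. A common remedy is to join only the page text first. The sketch below shows such a helper in isolation: `format_docs` is a hypothetical name, and `FakeDocument` is a stand-in that mimics only the `page_content` attribute of a LangChain document.**

```python
from dataclasses import dataclass


@dataclass
class FakeDocument:
    """Hypothetical stand-in for a LangChain Document; only page_content is used."""
    page_content: str


def format_docs(docs):
    """Join the text of the retrieved documents, separated by blank lines."""
    return "\n\n".join(doc.page_content for doc in docs)


retrieved = [
    FakeDocument("CI/CD automates integration and delivery."),
    FakeDocument("MLOps applies these practices to ML models."),
]
context = format_docs(retrieved)
print(context)
```

**Assuming the installed LangChain version supports piping a runnable into a plain function, the helper could be wired in as `docsearch.as_retriever() | format_docs` for the `"context"` entry.**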

In [None]:
from langchain.chains import RetrievalQA

# Reuse the llm and docsearch objects created in the earlier cells
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=docsearch.as_retriever(),
)

query = "What is MLOps?"
result = qa_chain.run(query)
print(result)

Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.

Till now, we have created all the MLOPs model and trained a lot of models, tested them and done all the aspects related to machine learning aspects. But what's the use to it? Where it is being utilized.
Here is where the MLOPs comes into play.:
Creating an ML model that can predict what you want it to predict from the data you have fed is easy, but creating a model that is reliable, fast, accurate, pinpoint and can be used by many users in difficult, isn't it?
So, that's where the MLOPs comes into the play:
•	 These models that rely on large amount of data, are very difficult for a single person to be handled and tracking their development or usage.
•	Since, due to having a lot of data, even if there is small tweak in the parameters it can result in the enormous difference in the results and accuracy.
•	Now, feature engineer

<hr>

### **Finalize the model**

In [None]:
class Chatbot():
  # Wrap the retrieval chain so it can be reused through a single object
  def __init__(self):
    # The documents are already stored in Pinecone, so the file is not reloaded here
    self.rag_chain = (
      {"context" : docsearch.as_retriever(), "question" : RunnablePassthrough()}
      | prompt
      | llm
      | StrOutputParser()
    )

In [None]:
bot = Chatbot()

In [None]:
a = input("Ask Anything")
res = bot.rag_chain.invoke(a)
print(res)

Ask AnythingWhat is CI/CD

You are a MLOPs engineer. The user will ask you a question about Machine Learning Operations.
Use the following piece of context to answer the question.
If you don't know the answer, just say don't know/
Keep the answer brief

Context: [Document(metadata={'source': '/content/Machine Learning Operations.txt'}, page_content='This will not be much beneficial for some of the particular \nas some models might require learning from the user inputs and predictions it makes. This lifecycle is valid for most of the ML use cases.\n\nUNDERSTANDING THE CI/CD PIPELINE\n\nIn development, whenever we update the code, we want that the code should be updated everywhere it is being used, ensuring that each user is having the same functionality of it, on their respective devices. Now this seems as easy as it could be but is as complicated as it could be.\n\nCI/CD ensures that the integration and delivery of incremental changes to a live application. It is triggered when by a ne