## BUILDING A RAG SYSTEM
1. Data Ingestion
2. Indexing
3. Retriever
4. Response Synthesizer
5. Querying

In [1]:
%%capture
pip install -q -U langchain langchain_community transformers sentence-transformers datasets faiss-cpu

In [2]:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.llms import HuggingFacePipeline
from langchain.chains import RetrievalQA

# Local Kaggle path to the Gemma 2 9B checkpoint
model_name = "/kaggle/input/gemma-2/transformers/gemma-2-9b/2"

## 1. Data Ingestion

In [3]:
loader=PyPDFLoader(file_path='/kaggle/input/report/Generative AI Report.pdf')

In [4]:
documents=loader.load()

In [5]:
type(documents)

list

In [6]:
len(documents)

27

In [7]:
documents[0].metadata

{'source': '/kaggle/input/report/Generative AI Report.pdf', 'page': 0}

In [8]:
documents[0].page_content

'Genera tive AI \nreport\nOCT OBER  202 4\nMa chine Learning a s a Ser vice \n(ML aa S)'

In [9]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=150)
docs = text_splitter.split_documents(documents)
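
To make `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of fixed-size chunking with overlap. It is an illustration only: `chunk_text` is a hypothetical helper, and `RecursiveCharacterTextSplitter` itself is more careful because it tries to split on separators such as paragraph and line breaks before falling back to raw character counts.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size character windows that overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = chunk_text(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
# ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Each chunk repeats the last three characters of the previous one, which is the same idea as the 150-character overlap used above: a sentence cut at a chunk boundary still appears whole in at least one chunk.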

### LLM

In [10]:
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    cache_dir="./cache",
    device_map="auto",
    offload_folder="offload",
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_name, cache_dir="./cache")

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

llm = HuggingFacePipeline(
    pipeline=pipe,
    model_kwargs={"temperature": 0.7, "max_length": 512},
)

Device set to use cuda:0

### Embedding Model

In [11]:
embed_model = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2") 

## 2. Indexing

In [12]:
index = FAISS.from_documents(docs, embed_model)
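
Conceptually, the index stores one embedding vector per chunk and answers a query by finding the stored vectors closest to the query's embedding. The sketch below shows that idea with a brute-force scan in plain Python. `TinyFlatIndex` and the 2-dimensional toy vectors are made up for illustration; the real index holds 384-dimensional `all-MiniLM-L6-v2` embeddings, and the assumption here is that the LangChain FAISS wrapper uses a flat L2 (Euclidean) index by default.

```python
import math


class TinyFlatIndex:
    """Brute-force vector index: stores vectors and scans all of them per query."""

    def __init__(self) -> None:
        self.vectors: list[list[float]] = []
        self.payloads: list[str] = []

    def add(self, vector: list[float], payload: str) -> None:
        self.vectors.append(vector)
        self.payloads.append(payload)

    def search(self, query: list[float], k: int = 4) -> list[tuple[float, str]]:
        # Euclidean (L2) distance to every stored vector; smaller means closer.
        scored = [
            (math.dist(query, vec), text)
            for vec, text in zip(self.vectors, self.payloads)
        ]
        return sorted(scored, key=lambda pair: pair[0])[:k]


toy_index = TinyFlatIndex()
toy_index.add([1.0, 0.0], "chunk about MLaaS basics")
toy_index.add([0.0, 1.0], "chunk about vendor lock-in")
toy_index.add([0.9, 0.1], "chunk about cloud ML platforms")

hits = toy_index.search([1.0, 0.0], k=2)
print([text for _, text in hits])
# ['chunk about MLaaS basics', 'chunk about cloud ML platforms']
```

A flat index like this is exact but scans every vector, which is fine for a few dozen chunks from a 27-page report.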

## 3. Retriever

In [13]:
retriever=index.as_retriever()

In [14]:
query = "What is Machine Learning as a Service (MLaaS), and how does it work?"
retrieved_documents = retriever.invoke(query)  # replaces the deprecated get_relevant_documents
for i, doc in enumerate(retrieved_documents, 1):  # display each retrieved chunk
    print(f"Result {i}: {doc.page_content}")


Result 1: Gener ativ e AI R epor t,  Oct ober 202 4 | Section 1 02
Executive Summar y
Machine Learning as a Ser vice (MLaaS) is r e v olutionizing ho w businesses 
access and implement ar tificial int elligence (AI) and machine learning (ML) 
t echnologies.  By off ering cloud-based solutions,  MLaaS allo ws or ganizations 
t o le v er age po w er ful ML t ools wit hout needing t o de v elop in-house 
infr astructur e or e xper tise. 

This ser vice-based model significant ly lo w ers t he barrier t o entr y f or companies, 
making adv anced analytics accessible t o businesses of all siz es.
MLaaS changes t his b y off ering scalable, on-demand access t o machine learning 
platf orms, enabling companies t o run pr edictiv e models, pr ocess v ast amount s of 
data, and deriv e insight s quickly and cost -eff ectiv ely .
Result 2: Anot her significant challenge of using MLaaS is t he risk of v endor lock -in, wher e 
businesses become hea vily r eliant on a single ser vice pr o vider f 


In [15]:
retrieved_documents[1].metadata

{'source': '/kaggle/input/report/Generative AI Report.pdf', 'page': 19}
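
The metadata on each retrieved chunk is enough to cite where an answer came from. Below is a small sketch of turning such a dict into a readable citation. `format_citation` is a hypothetical helper, and it assumes the `page` value is zero-based, which is consistent with the first page being reported as `'page': 0` earlier.

```python
import os


def format_citation(metadata: dict) -> str:
    """Turn a loader metadata dict into a short human-readable citation."""
    filename = os.path.basename(metadata.get("source", "unknown source"))
    page = metadata.get("page")
    if page is None:
        return filename
    # Page numbers are assumed to be zero-based, so add 1 for display.
    return f"{filename}, p. {page + 1}"


citation = format_citation(
    {"source": "/kaggle/input/report/Generative AI Report.pdf", "page": 19}
)
print(citation)
# Generative AI Report.pdf, p. 20
```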

In [16]:
print(retrieved_documents[1].page_content)

Anot her significant challenge of using MLaaS is t he risk of v endor lock -in, wher e 
businesses become hea vily r eliant on a single ser vice pr o vider f or t heir machine 
learning needs. Each MLaaS platf orm has it s o wn ecosyst em, t ools, and APIs, 
making it difficult t o swit ch pr o viders once a compan y ’ s w orkflo ws and models 
ar e deeply int egrat ed int o a par ticular platf orm​ .


## 4. Response Synthesizer

In [17]:

response_synthesizer = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True
)
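
The `"stuff"` chain type simply stuffs every retrieved chunk into a single prompt and sends it to the LLM in one call. A rough sketch of that assembly step follows. `build_stuff_prompt` is a hypothetical helper, not the LangChain implementation; the opening instruction matches the prompt text echoed in the query results in this notebook, while the closing `Question:` / `Helpful Answer:` lines are assumed from LangChain's default QA prompt.

```python
STUFF_TEMPLATE = (
    "Use the following pieces of context to answer the question at the end. "
    "If you don't know the answer, just say that you don't know, "
    "don't try to make up an answer.\n\n"
    "{context}\n\n"
    "Question: {question}\n"  # assumed tail of the default prompt
    "Helpful Answer:"
)


def build_stuff_prompt(chunks: list[str], question: str) -> str:
    """Join all retrieved chunks into one context block and fill the template."""
    context = "\n\n".join(chunks)
    return STUFF_TEMPLATE.format(context=context, question=question)


prompt = build_stuff_prompt(
    ["MLaaS offers cloud-based ML tools.", "Vendor lock-in is a risk."],
    "What is MLaaS?",
)
print(prompt)
```

Because everything goes into one prompt, this approach only works while the combined chunks fit in the model's context window.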



## 5. Querying

In [18]:
response_synthesizer.invoke("What is Machine Learning as a Service (MLaaS), and how does it work?")


{'query': 'What is Machine Learning as a Service (MLaaS), and how does it work?',
 'result': "Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.\n\nGener ativ e AI R epor t,  Oct ober 202 4 | Section 1 02\nExecutive Summar y\nMachine Learning as a Ser vice (MLaaS) is r e v olutionizing ho w businesses \naccess and implement ar tificial int elligence (AI) and machine learning (ML) \nt echnologies.  By off ering cloud-based solutions,  MLaaS allo ws or ganizations \nt o le v er age po w er ful ML t ools wit hout needing t o de v elop in-house \ninfr astructur e or e xper tise.\xa0\n\nThis ser vice-based model significant ly lo w ers t he barrier t o entr y f or companies, \nmaking adv anced analytics accessible t o businesses of all siz es.\nMLaaS changes t his b y off ering scalable, on-demand access t o machine learning \nplatf orms, enabling companies t o run pr edictiv e mo

In [19]:
response_synthesizer.invoke("Can you explain the role of MLaaS in modern businesses?")

{'query': 'Can you explain the role of MLaaS in modern businesses?',
 'result': "Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.\n\nAnot her significant challenge of using MLaaS is t he risk of v endor lock -in, wher e \nbusinesses become hea vily r eliant on a single ser vice pr o vider f or t heir machine \nlearning needs. Each MLaaS platf orm has it s o wn ecosyst em, t ools, and APIs, \nmaking it difficult t o swit ch pr o viders once a compan y ’ s w orkflo ws and models \nar e deeply int egrat ed int o a par ticular platf orm\u200b .\n\nBy int egrating MLaaS wit h Io T , NLP , and ot her emer ging t echnologies, \nbusinesses will be able t o cr eat e mor e int elligent, aut omat ed syst ems t hat can \npr ocess v ast amount s of data and mak e decisions in r eal time. This will lead t o \ngr eat er efficiency , impr o v ed cust omer e xperiences, and new oppor tuniti
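
Both results above begin with the prompt itself, because a `text-generation` pipeline returns the prompt followed by the completion by default. One way to keep only the answer is to cut everything up to the prompt's final marker. The sketch below assumes that marker is `Helpful Answer:` (taken from LangChain's default QA prompt, not visible in the truncated output above); `extract_answer` and the `raw` string are made up for illustration.

```python
ANSWER_MARKER = "Helpful Answer:"  # assumed last line of the default prompt


def extract_answer(result: str, marker: str = ANSWER_MARKER) -> str:
    """Return only the text after the first occurrence of the marker."""
    _head, sep, tail = result.partition(marker)
    # If the marker is missing, fall back to the full string unchanged.
    return tail.strip() if sep else result.strip()


# Made-up example in the same shape as the 'result' strings above.
raw = (
    "Use the following pieces of context to answer the question at the end.\n\n"
    "Some retrieved context.\n\n"
    "Question: What is MLaaS?\n"
    "Helpful Answer: A cloud-based way to use ML tools."
)
answer = extract_answer(raw)
print(answer)
# A cloud-based way to use ML tools.
```

An alternative worth trying is passing `return_full_text=False` when constructing the `text-generation` pipeline, so the prompt is never included in the output in the first place.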

### **Give LlamaIndex a Try!**

In [20]:
%%capture
pip install --upgrade llama-index llama-index-embeddings-langchain llama-index-llms-langchain


In [21]:
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

# Reload the same PDF with LlamaIndex's own reader
documents = SimpleDirectoryReader(
    input_files=["/kaggle/input/report/Generative AI Report.pdf"]
).load_data()

In [22]:
index = VectorStoreIndex.from_documents(documents, embed_model=embed_model, llm=llm)
query_engine = index.as_query_engine(llm=llm)

In [23]:
print(query_engine.query("What is Machine Learning as a Service (MLaaS), and how does it work?").response)


Context information is below.
---------------------
page_label: 3
file_path: /kaggle/input/report/Generative AI Report.pdf

Gener ativ e AI R epor t,  Oct ober 202 4 | Section 1 02
Executive Summar y
Machine Learning as a Ser vice (MLaaS) is r e v olutionizing ho w businesses 
access and implement ar tificial int elligence (AI) and machine learning (ML) 
t echnologies.  By off ering cloud-based solutions,  MLaaS allo ws or ganizations 
t o le v er age po w er ful ML t ools wit hout needing t o de v elop in-house 
infr astructur e or e xper tise. 

This ser vice-based model significant ly lo w ers t he barrier t o entr y f or companies, 
making adv anced analytics accessible t o businesses of all siz es.
MLaaS changes t his b y off ering scalable, on-demand access t o machine learning 
platf orms, enabling companies t o run pr edictiv e models, pr ocess v ast amount s of 
data, and deriv e insight s quickly and cost -eff ectiv ely . 

Businesses can e xperiment wit h ML applications, su

In [24]:
print(query_engine.query("Can you explain the role of MLaaS in modern businesses?").response)

Context information is below.
---------------------
page_label: 25
file_path: /kaggle/input/report/Generative AI Report.pdf

Gener ativ e AI R epor t,  Oct ober 202 4 | Section 8 2 4
Summar y of MLaaS's R ole in Unlocking Ne w P ot entials f or Businesses

Machine Learning as a Ser vice (MLaaS) is transf orming t he wa y businesses of all 
siz es and industries access and implement machine learning t echnologies. By 
off ering cloud-based machine learning t ools t hat eliminat e t he need f or in-house 
infrastructur e and e xper tise, MLaaS democratiz es access t o ar tificial int elligence 
(AI) and adv anced analytics. It enables businesses t o le v erage t he po w er of 
data-driv en decision-making wit hout t he significant in v estment traditionally 
associat ed wit h AI de v elopment.

MLaaS platf orms pr o vide scalable, fle xible, and cost -eff ectiv e solutions f or a 
wide range of machine learning tasks, fr om data pr epr ocessing and model 
training t o r eal-time pr edict