# **Building a RAG Pipeline for LLMs: Getting Started**

**Installing Dependencies**

In [None]:
!pip install wikipedia sentence-transformers transformers faiss-cpu



**Importing Libraries**

In [None]:
import wikipedia
from transformers import AutoTokenizer, AutoModelForQuestionAnswering, pipeline
from sentence_transformers import SentenceTransformer
import faiss
import numpy as np

**1. Pull content live from Wikipedia**



In [None]:
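def get_wiki_data(topic):
  """Return the plain-text content of the Wikipedia page for `topic`, or None."""
  try:
    page=wikipedia.page(topic)
    return page.content
  except wikipedia.exceptions.PageError:
    # No page matches the requested title.
    return None
  except wikipedia.exceptions.DisambiguationError as e:
    # The title maps to several pages; show the candidates.
    print(f"Ambiguous topic, options: {e.options}")
    return None

topic=input("Enter a topic to get knowledge about:")
document=get_wiki_data(topic)

if not document:
  print("Could not retrieve the document.")
  # Stop this cell; the later steps need a document to work with.
  raise SystemExit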

Enter a topic to get knowledge about:Apple Computers


**2. Break it down into smaller chunks**

In [None]:
tokenizer=AutoTokenizer.from_pretrained("sentence-transformers/all-mpnet-base-v2")

def split_text(text,chunk_size=256,overlap=20):
  """Split text into chunks of up to chunk_size tokens; neighbours share `overlap` tokens."""
  tokens=tokenizer.tokenize(text)
  chunks=[]
  start=0

  while start < len(tokens):
    end=min(start+chunk_size,len(tokens))
    # Turn the token window back into a plain string.
    chunks.append(tokenizer.convert_tokens_to_string(tokens[start:end]))
    if end==len(tokens):
      break
    # Step back so the next chunk repeats the last `overlap` tokens.
    start=end-overlap
  return chunks

chunks=split_text(document)
print(f"number of chunks : {len(chunks)}")

Token indices sequence length is longer than the specified maximum sequence length for this model (17454 > 512). Running this sequence through the model will result in indexing errors


number of chunks : 74
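
The sliding-window arithmetic in `split_text` can be checked without loading a tokenizer. The sketch below is a minimal illustration, not part of the original notebook: `window_spans` is a hypothetical helper that returns only the `(start, end)` index pairs the loop would produce, and the check on `overlap` is an added safeguard that the cell above does not have. Assuming the 17454 in the tokenizer warning is the number of tokens `split_text` saw, it yields 74 windows, which agrees with the printed chunk count.

```python
def window_spans(n_tokens, chunk_size=256, overlap=20):
    """Return the (start, end) token spans of overlapping windows."""
    if not 0 <= overlap < chunk_size:
        # Added safeguard (assumption): otherwise the start would never advance.
        raise ValueError("overlap must be in [0, chunk_size)")
    spans = []
    start = 0
    while start < n_tokens:
        end = min(start + chunk_size, n_tokens)
        spans.append((start, end))
        if end == n_tokens:
            break
        # Step back by `overlap` so consecutive windows share tokens.
        start = end - overlap
    return spans


print(window_spans(10, chunk_size=4, overlap=1))  # [(0, 4), (3, 7), (6, 10)]
print(len(window_spans(17454)))                   # 74
```

Each window after the first advances by `chunk_size - overlap` tokens (236 with the defaults), so a larger overlap means more chunks and more embeddings to store.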


**3. Create vector embeddings**

In [None]:
embedding_model = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")
embeddings=embedding_model.encode(chunks)


**4. Index the data with FAISS**

In [None]:
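# Each row of `embeddings` is one chunk; the row length is the vector dimension.
dimension=embeddings.shape[1]
# IndexFlatL2 does an exact (brute-force) search using L2 distance.
index=faiss.IndexFlatL2(dimension)
# FAISS expects float32 vectors.
index.add(np.asarray(embeddings,dtype="float32"))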
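
`IndexFlatL2` performs an exact, brute-force search: it compares the query against every stored vector and ranks the results by squared Euclidean distance. As a rough illustration of that idea in plain Python (a hypothetical `flat_l2_search` helper with made-up 2-D vectors, not the FAISS implementation):

```python
def flat_l2_search(vectors, query, k):
    """Return (distances, indices) of the k stored vectors closest to query."""
    scored = []
    for i, vec in enumerate(vectors):
        # Squared L2 distance: sum of squared coordinate differences.
        dist = sum((a - b) ** 2 for a, b in zip(vec, query))
        scored.append((dist, i))
    scored.sort()  # smallest distance first
    top = scored[:k]
    return [d for d, _ in top], [i for _, i in top]


# Toy vectors (assumption), standing in for chunk embeddings.
stored = [[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]]
print(flat_l2_search(stored, [0.0, 0.0], k=2))  # ([0.0, 1.0], [0, 1])
```

The cost grows linearly with the number of stored vectors, which is fine for the few dozen chunks in this notebook.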

**5. Retrieve relevant info using semantic search**

In [None]:
query=input("Ask a question: ")
query_embedding=embedding_model.encode([query])

# Look up the k chunks whose embeddings are closest to the query embedding.
k=3
distances,indices=index.search(np.array(query_embedding),k)
retirved_chunks=[chunks[i] for i in indices[0]]
print("Retrieved chunks")
for chunk in retirved_chunks:
  print("-"+chunk)

Ask a question: who invented apple computers first?
Retrieved chunks
-apple inc. is an american multinational corporation and technology company headquartered in cupertino, california, in silicon valley. it is best known for its consumer electronics, software, and services. founded in 1976 as apple computer company by steve jobs, steve wozniak and ronald wayne, the company was incorporated by jobs and wozniak as apple computer, inc. the following year. it was renamed apple inc. in 2007 as the company had expanded its focus from computers to consumer electronics. apple is the largest technology company by revenue, with us $ 391. 04 billion in the 2024 fiscal year. the company was founded to produce and market wozniak ' s apple i personal computer. its second computer, the apple ii, became a best seller as one of the first mass - produced microcomputers. apple introduced the lisa in 1983 and the macintosh in 1984, as some of the first computers to use a graphical user interface and a mous
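
The search returns distances as well as indices, and FAISS reports -1 for any result slot it cannot fill (for example when the index holds fewer than `k` vectors). With 74 chunks and `k=3` that does not happen here, but in Python `chunks[-1]` would silently return the last chunk, so a defensive version of the lookup may be worth having. A minimal sketch, where `pair_hits` is a hypothetical helper and the sample values are made up:

```python
def pair_hits(chunks, distances_row, indices_row):
    """Pair each retrieved chunk with its distance, skipping unfilled slots."""
    hits = []
    for dist, idx in zip(distances_row, indices_row):
        if idx < 0:
            # -1 marks an empty slot; chunks[-1] would wrongly return the last chunk.
            continue
        hits.append((chunks[int(idx)], float(dist)))
    return hits


# Made-up values standing in for distances[0] and indices[0].
demo_chunks = ["chunk a", "chunk b"]
print(pair_hits(demo_chunks, [0.4, 0.9, 3.4e38], [1, 0, -1]))
# [('chunk b', 0.4), ('chunk a', 0.9)]
```

In the notebook this would be called as `pair_hits(chunks, distances[0], indices[0])`.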

**6. Use a QA model to generate the final answer!**

In [None]:
# Extractive QA model: it selects an answer span from the given context.
model_name="deepset/roberta-base-squad2"
qa_tokenizer=AutoTokenizer.from_pretrained(model_name)
qa_model=AutoModelForQuestionAnswering.from_pretrained(model_name)
qa_pipeline=pipeline("question-answering",model=qa_model,tokenizer=qa_tokenizer)

# Join the retrieved chunks into one context string for the QA model.
context=" ".join(retirved_chunks)
answer=qa_pipeline(question=query,context=context)
print(f"answer: {answer['answer']}")








answer: steve jobs
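
The dictionary returned by the question-answering pipeline carries a `score` alongside `answer`, `start` and `end`. One possible refinement, sketched here under assumptions, is to fall back to "no answer" when that score is low. `pick_answer` is a hypothetical helper, the 0.3 threshold is an arbitrary placeholder that would need tuning, and the sample dictionary is made up rather than taken from the run above.

```python
def pick_answer(result, min_score=0.3):
    """Return the answer text, or None when the model's confidence is low."""
    if result["score"] < min_score or not result["answer"].strip():
        return None
    return result["answer"]


# Made-up result in the shape the pipeline returns.
demo = {"score": 0.82, "start": 10, "end": 20, "answer": "steve jobs"}
print(pick_answer(demo))                     # steve jobs
print(pick_answer({**demo, "score": 0.05}))  # None
```

In the notebook this would be called as `pick_answer(answer)` before printing.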
