# Step 1: Document Ingestion
### What It Does and Why It Matters
This step loads your "knowledge base" into the system, like putting books on shelves. It matters because RAG can only retrieve what you have loaded: if the docs are missing or poor, the answers will be too. Analogy: Like stocking a fridge before cooking; empty fridge = no meal!

In [None]:
# Import the tool we need
# (on newer LangChain releases, import from langchain_community.document_loaders instead)
from langchain.document_loaders import TextLoader

# Load a text file (replace 'knowledge.txt' with your actual file)
loader = TextLoader("knowledge.txt")
documents = loader.load()

# Print to check
print(documents[0].page_content[:100])  # Shows first 100 characters

### Line-by-Line Walkthrough

1. from langchain.document_loaders import TextLoader: Brings in a helper from LangChain (a library for building apps on top of language models).
2. loader = TextLoader("knowledge.txt"): Points to your text file.
3. documents = loader.load(): Reads the file into a list of Document objects; the text itself is in each one's page_content.
4. print(...): Peeks at the first 100 characters to confirm the file loaded.

### Common Pitfalls and Tips

* Pitfall: File not found? Check the path. A relative path is resolved from the folder the notebook runs in, so try a full path like /Users/you/docs/knowledge.txt.
* Tip: Start with a small .txt file. If it's a PDF, use PyPDFLoader instead (install pypdf via pip).
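
The file-not-found pitfall can be caught before the loader ever runs. Here is a minimal sketch using only the standard library. The helper name resolve_doc_path is hypothetical (it is not part of LangChain), and the temporary file exists only so the example runs anywhere.

```python
from pathlib import Path
import tempfile


def resolve_doc_path(path_str):
    """Return an absolute Path, or raise a clear error if the file is missing."""
    path = Path(path_str).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"No file at {path}. Check the spelling or use a full path.")
    return path


# Demo with a throwaway file so the sketch is self-contained
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "knowledge.txt"
    sample.write_text("Cats are furry animals.", encoding="utf-8")

    checked = resolve_doc_path(str(sample))
    preview = checked.read_text(encoding="utf-8")[:100]  # same 100-character peek as above
    print(preview)
```

In the notebook, you would pass str(checked) to TextLoader instead of reading the file by hand.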


# Step 2: Embedding Creation
### What It Does and Why It Matters
Embeddings turn text into lists of numbers (vectors) so the AI can measure "similarity": for example, "cat" and "kitten" land close together. Why? Keyword search only matches exact words; comparing vectors matches by meaning, and it is fast. Analogy: Like assigning GPS coords to places; similar spots are nearby.

In [None]:
# Import the embedding model
from sentence_transformers import SentenceTransformer

# Pick a simple model (free from Hugging Face)
model = SentenceTransformer('all-MiniLM-L6-v2')

# Your text chunks (from ingestion)
texts = ["Cats are furry animals.", "Dogs love to play fetch."]

# Create embeddings
embeddings = model.encode(texts)

# Print shape to check
print(embeddings.shape)  # Should be (2, 384) for two texts

### Line-by-Line Walkthrough

1. from sentence_transformers import SentenceTransformer: Imports the class that loads embedding models.
2. model = SentenceTransformer('all-MiniLM-L6-v2'): Loads a free model; it downloads automatically on first use and is cached after that.
3. texts = [...]: Your document pieces (split big docs into chunks).
4. embeddings = model.encode(texts): Turns texts into vectors, one row per text.
5. print(embeddings.shape): Shows dimensions (number of texts x vector size; this model produces 384 numbers per text).
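
To make "close" concrete: one common way to compare two vectors is cosine similarity. The sketch below uses invented 3-number vectors (real embeddings from this model have 384 numbers), so the values are illustrative assumptions, not model output.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors, invented for illustration only
cat = [0.9, 0.8, 0.1]
kitten = [0.85, 0.75, 0.2]
car = [0.1, 0.2, 0.95]

print(cosine_similarity(cat, kitten))  # close to 1: similar direction
print(cosine_similarity(cat, car))     # much lower: different direction
```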

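### Common Pitfalls and Tips

* Pitfall: Model download fails? Check your internet connection. If the download is slow, try a smaller model like 'paraphrase-MiniLM-L3-v2'.
* Tip: Split long docs into smaller chunks such as sentences before embedding. text.split('.') is a quick start, but it also breaks on decimals ("0.1") and abbreviations.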
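
A slightly more careful splitter than text.split('.') is sketched below. It splits only where sentence-ending punctuation is followed by whitespace, so decimals survive. It is still a heuristic: an abbreviation such as "Dr. Smith" would be split. The function name split_sentences is hypothetical.

```python
import re


def split_sentences(text):
    """Split after '.', '!' or '?' when followed by whitespace; drop empty pieces."""
    pieces = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in pieces if p]


doc = "Cats are furry animals. A kitten weighs about 0.1 kg at birth! Do dogs play fetch?"
chunks = split_sentences(doc)
for chunk in chunks:
    print(chunk)
```

The resulting chunks list can be passed to model.encode in place of the hand-written texts.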


# Step 3: Retrieval

### What It Does and Why It Matters
This step embeds your question and searches the stored embeddings for the closest matches. Why? It's the "retrieval" in RAG: it finds the relevant info quickly. Analogy: A librarian scanning shelves for book titles matching your query.

In [None]:
# Import the vector store
import faiss
import numpy as np

# Your embeddings from before (as numpy array)
embeddings_np = np.array(embeddings).astype('float32')

# Create index (like a search database)
index = faiss.IndexFlatL2(embeddings_np.shape[1])  # L2 distance
index.add(embeddings_np)

# Query embedding
query = "What do cats look like?"
query_emb = model.encode([query])

# Search (k=1 means top 1 result)
D, I = index.search(query_emb.astype('float32'), k=1)

# Get the text
retrieved_text = texts[I[0][0]]
print(retrieved_text)

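### Line-by-Line Walkthrough

1. import faiss and numpy: Tools for fast search and math.
2. embeddings_np = ...: Converts the embeddings to float32, the format FAISS expects.
3. index = faiss.IndexFlatL2(...): Builds a simple search index for vectors of the given size.
4. index.add(...): Adds your embeddings.
5. query_emb = ...: Embeds the question with the same model used for the documents.
6. D, I = index.search(...): Finds the closest vectors. D holds the distances and I holds the matching row indices, one row per query.
7. retrieved_text = ...: Uses the top index to look up the matching text.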
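
IndexFlatL2 does an exact search: it compares the query against every stored vector. The pure-Python sketch below mimics that idea on toy 2-number vectors so you can see what happens inside. search_l2 is a hypothetical helper, not the FAISS API, and the vectors are invented stand-ins for real embeddings.

```python
def search_l2(vectors, query, k=1):
    """Return (distances, indices) of the k stored vectors closest to query."""
    scored = []
    for idx, vec in enumerate(vectors):
        dist = sum((v - q) ** 2 for v, q in zip(vec, query))  # squared L2 distance
        scored.append((dist, idx))
    scored.sort()  # smallest distance first
    top = scored[:k]
    return [d for d, _ in top], [i for _, i in top]


texts = ["Cats are furry animals.", "Dogs love to play fetch."]
toy_vectors = [[1.0, 0.0], [0.0, 1.0]]  # invented stand-ins for document embeddings
toy_query = [0.9, 0.2]                  # invented stand-in for the query embedding

distances, indices = search_l2(toy_vectors, toy_query, k=1)
print(texts[indices[0]])
```

FAISS does the same comparison in optimized native code, which is why it stays quick on much larger collections.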

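### Common Pitfalls and Tips

* Pitfall: Shape mismatch? FAISS expects 2D float32 arrays of shape (number of vectors, vector size). That is why the query is encoded as a list, model.encode([query]), even though it is a single question.
* Tip: For more results, set k=3. IndexFlatL2 compares the query against every vector, which is still fast for thousands of docs.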
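
One detail to guard against when raising k: if the index holds fewer than k vectors, FAISS fills the missing slots with index -1, and in Python texts[-1] silently returns the last item. A minimal sketch of a safe lookup follows. The helper name texts_for_indices is hypothetical, and I_row is hard-coded to imitate such a result rather than taken from a real search.

```python
def texts_for_indices(texts, index_row):
    """Map one row of search indices to texts, skipping the -1 padding entries."""
    return [texts[i] for i in index_row if 0 <= i < len(texts)]


texts = ["Cats are furry animals.", "Dogs love to play fetch."]
I_row = [0, 1, -1]  # imitates I[0] for k=3 on a 2-vector index (assumed values)

top_texts = texts_for_indices(texts, I_row)
print(top_texts)
```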


# Step 4: Augmentation

### What It Does and Why It Matters
This step combines the question with the retrieved docs into one prompt. Why? The model can then answer from your documents instead of guessing. Analogy: Giving the librarian your question plus book snippets to craft a full response.

In [None]:
# Your question
question = "What do cats look like?"

# Retrieved text from before
retrieved = "Cats are furry animals."

# Augment: Simple template (ends with "Answer:" so the model continues from there)
prompt = f"Based on this info: {retrieved}\nQuestion: {question}\nAnswer:"

print(prompt)

### Line-by-Line Walkthrough

1. question = ...: The user's question.
2. retrieved = ...: The text found in the retrieval step.
3. prompt = f"...": Builds one string containing both. An f-string inserts a variable's value wherever you write {name}.

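### Common Pitfalls and Tips

* Pitfall: Prompt too long? Models have a context limit measured in tokens, not words (gpt2, used in Step 5, handles 1024 tokens for prompt plus output). Include only the most relevant chunks.
* Tip: Make templates more specific, like adding "Be helpful and accurate."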
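
Once you retrieve more than one chunk, a small helper keeps the template tidy. The sketch below joins several chunks and stops at a rough word budget. build_prompt and max_words are hypothetical names, and counting words is only a crude stand-in for counting tokens.

```python
def build_prompt(question, chunks, max_words=200):
    """Join retrieved chunks into a context block, stopping at a rough word budget."""
    selected, used = [], 0
    for chunk in chunks:
        n_words = len(chunk.split())
        if used + n_words > max_words:
            break  # stop before the context grows past the budget
        selected.append(chunk)
        used += n_words
    context = "\n".join(f"- {c}" for c in selected)
    return (
        "Use only the context below to answer.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
        "Answer:"
    )


prompt = build_prompt(
    "What do cats look like?",
    ["Cats are furry animals.", "Dogs love to play fetch."],
    max_words=5,  # tiny budget so the second chunk is dropped in this demo
)
print(prompt)
```

Because retrieval returns the best matches first, cutting from the end drops the least relevant chunks.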


# Step 5: Generation

### What It Does and Why It Matters
This step uses an AI model to write the final answer from the augmented prompt. Why? This is where the retrieved data becomes human-like text. Analogy: The librarian writing a summary based on fetched books.

In [None]:
# Import generator (using free Hugging Face)
from transformers import pipeline

# Load a small model
generator = pipeline('text-generation', model='gpt2')

# Your augmented prompt (ends with "Answer:" so the model continues with an answer)
prompt = "Based on this info: Cats are furry animals.\nQuestion: What do cats look like?\nAnswer:"

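# Generate (max_new_tokens caps the newly generated text; the prompt is not counted)
response = generator(prompt, max_new_tokens=50, num_return_sequences=1)[0]['generated_text']

print(response)

### Line-by-Line Walkthrough

1. from transformers import pipeline: Imports the helper that wraps model loading and generation.
2. generator = pipeline(...): Loads 'gpt2', a free, small model (downloaded on first use).
3. prompt = ...: From augmentation.
4. response = generator(...): Runs the model. max_new_tokens limits how many new tokens are generated, and generated_text contains your prompt followed by the continuation.
5. print(response): Shows the answer.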

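### Common Pitfalls and Tips

* Pitfall: Output gibberish? gpt2 is small and was not trained to follow instructions, so it tends to ramble. Try a larger model like 'gpt2-medium' or a paid one (add an OpenAI key).
* Tip: Keep the output length setting low for speed. If you get an error about a missing backend, install PyTorch (pip install torch).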
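
The text-generation pipeline returns your prompt followed by the continuation, so the printed response repeats the prompt. Below is a minimal sketch for keeping only the new text. extract_answer is a hypothetical helper, and fake_output is hand-written to mimic the shape of the pipeline's result; it is not real model output.

```python
def extract_answer(prompt, generated_text):
    """Remove the echoed prompt from the front of the generated text."""
    if generated_text.startswith(prompt):
        generated_text = generated_text[len(prompt):]
    return generated_text.strip()


prompt = "Based on this info: Cats are furry animals.\nQuestion: What do cats look like?\nAnswer:"
# Hand-written stand-in shaped like the pipeline result (not real model output)
fake_output = [{"generated_text": prompt + " They are furry."}]

answer = extract_answer(prompt, fake_output[0]["generated_text"])
print(answer)
```

The pipeline call also accepts return_full_text=False, which should give the same result without a helper.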