# 🛠️ Week 5-6 · Notebook 08 · RAG Implementation Walkthrough

Build an end-to-end retrieval-augmented assistant for manufacturing maintenance knowledge using open-source tooling. We'll cover ingestion, chunking, embeddings, retrieval, grounded generation, and evaluation.

## 🎯 Learning Objectives
- Configure LangChain/Chroma primitives for chunking and vector search.
- Compose prompts that bind retrieved evidence with safety-conscious instructions.
- Evaluate retrieval + generation quality with manufacturing-specific metrics.
- Capture observability data (latency, citations, feedback) for continuous improvement.

## 🔄 Pipeline Overview
| Stage | Tooling | Key Decisions |
| --- | --- | --- |
| Ingest | File loaders / APIs | Metadata schema, access control |
| Chunk | `RecursiveCharacterTextSplitter` | Chunk size, overlap, separators |
| Embed | `sentence-transformers` | Model choice, normalization |
| Index | `Chroma`, `FAISS`, `PGVector` | Persistence, replication |
| Retrieve | Similarity search + re-rank | k value, filters |
| Generate | HF pipeline / Azure OpenAI | Prompt template, temperature |
| Evaluate | Custom harness | Accuracy, citation, latency |

In [None]:
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma
from transformers import pipeline
import pandas as pd

maintenance_docs = pd.DataFrame([
    {
        "doc_id": "press_maintenance.txt",
        "text": "Inspect ram alignment weekly. Replace hydraulic oil every 2000 hours. Verify lockout-tagout steps before servicing.",
    },
    {
        "doc_id": "conveyor_ops.txt",
        "text": "Monitor belt tension, adjust idlers quarterly, calibrate speed sensors after major maintenance.",
    },
    {
        "doc_id": "vision_sop.txt",
        "text": "Clean lenses daily, check lighting uniformity, rerun baseline calibration monthly for the vision system.",
    },
])

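# Split on paragraph breaks first, then sentences, then spaces, keeping chunks near 200 characters.
# chunk_overlap=40 lets neighbouring chunks share context when a document is long enough to be split.
splitter = RecursiveCharacterTextSplitter(chunk_size=200, chunk_overlap=40, separators=["\n\n", ". ", " "])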
records = []
for _, row in maintenance_docs.iterrows():
    for chunk in splitter.split_text(row.text):
        records.append({"source": row.doc_id, "text": chunk})

chunks = pd.DataFrame(records)
chunks.head()
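
To make the `chunk_size` and `chunk_overlap` settings above more concrete, here is a minimal sketch of a fixed-size sliding window over characters. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter also respects the separator list); the helper name `sliding_window_chunks` and the sample sentence are assumptions made for illustration only.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into character windows where neighbours share `chunk_overlap` characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Hypothetical sample text, short enough to inspect by hand.
sample = "Replace hydraulic oil every 2000 hours."
windows = sliding_window_chunks(sample, chunk_size=20, chunk_overlap=5)
for window in windows:
    print(repr(window))
```

Each window repeats the last five characters of the previous one, which is the behaviour the overlap parameter is meant to provide: a sentence cut at a chunk boundary still appears partly in both chunks.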

In [None]:
embedding_fn = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

vector_store = Chroma.from_texts(
    texts=chunks.text.tolist(),
    embedding=embedding_fn,
    metadatas=chunks.drop(columns="text").to_dict(orient="records"),
)

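# Sanity check: number of chunks stored. `_collection` is a private Chroma attribute and may change between versions.
vector_store._collection.count()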
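
The pipeline table lists normalization as an embedding-stage decision. A small sketch of why it matters: once vectors are scaled to unit length, their dot product equals their cosine similarity, so vector magnitude no longer affects ranking. The helper names and the toy two-dimensional vectors below are assumptions for illustration; real embeddings from `all-MiniLM-L6-v2` have many more dimensions.

```python
import math


def l2_normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        return list(vec)
    return [x / norm for x in vec]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity computed as the dot product of the normalized vectors."""
    a_unit, b_unit = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_unit, b_unit))


# Toy vectors (hypothetical): same direction, different magnitude.
same_direction = cosine_similarity([3.0, 4.0], [6.0, 8.0])
# Toy vectors (hypothetical): perpendicular directions.
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
print(round(same_direction, 6), round(orthogonal, 6))
```

The first pair differs only in length and scores close to 1, while the perpendicular pair scores 0.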

In [None]:
def retrieve(query: str, k: int = 3) -> pd.DataFrame:
    # similarity_search_with_score returns (Document, distance) pairs;
    # with Chroma the score is a distance, so lower values mean closer matches.
    results = vector_store.similarity_search_with_score(query, k=k)
    return pd.DataFrame(
        {
            "source": [doc.metadata["source"] for doc, _ in results],
            "snippet": [doc.page_content for doc, _ in results],
            "score": [round(float(score), 3) for _, score in results],
        }
    )

query = "How often should hydraulic oil be replaced?"
retrieve(query)

In [None]:
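generator = pipeline(
    "text-generation",
    model="tiiuae/falcon-7b-instruct",
    max_new_tokens=160,
    do_sample=True,          # temperature only takes effect when sampling is enabled
    temperature=0.2,
    return_full_text=False,  # return only the completion, not the prompt echoed back
)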

retrieved_df = retrieve(query)

context = "\n".join(
    f"Source: {row.source}\n{row.snippet}" for _, row in retrieved_df.iterrows()
)

prompt = f"""
SYSTEM: You are a manufacturing maintenance assistant. Use only the context provided. Cite sources at the end of each sentence.
CONTEXT:
{context}

QUESTION: {query}
RESPONSE:
""".strip()

rag_answer = generator(prompt)[0]["generated_text"]
print(rag_answer)
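
The prompt asks the model to cite sources, so it is worth checking that the answer actually does. The sketch below assumes citations appear as file names ending in `.txt`, matching the sample corpus; the function name `check_citations` and the example answers are hypothetical, and a real deployment would need a pattern that fits its own citation format.

```python
import re


def check_citations(answer: str, retrieved_sources: list[str]) -> dict:
    """Compare file names mentioned in the answer with the sources that were retrieved."""
    # Assumption: sources are cited by file name, e.g. "press_maintenance.txt".
    mentioned = set(re.findall(r"[\w-]+\.txt", answer))
    retrieved = set(retrieved_sources)
    return {
        "cited": sorted(mentioned & retrieved),        # citations backed by retrieved context
        "unsupported": sorted(mentioned - retrieved),  # citations to documents never retrieved
        "has_citation": bool(mentioned & retrieved),
    }


# Hypothetical answers, written by hand for illustration (not model output).
sources = ["press_maintenance.txt", "conveyor_ops.txt"]
grounded = check_citations(
    "Replace hydraulic oil every 2000 hours (press_maintenance.txt).", sources
)
ungrounded = check_citations(
    "Replace hydraulic oil every 500 hours (pump_manual.txt).", sources
)
print(grounded)
print(ungrounded)
```

An answer with an empty `cited` list or a non-empty `unsupported` list is a candidate for rejection or human review before it reaches a technician.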

## 📊 Retrieval Quality Checks
- Inspect top-k snippets for relevance and diversity.
- Track coverage of metadata (machine, shift, language) in retrieved results.
- Evaluate recall@k against a labelled question set.
- Compare similarity scores before/after domain-specific embedding fine-tuning.
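
The recall@k check in the list above can be sketched as follows. The labelled questions and the ranked results are made up for illustration; in practice the `retrieved` lists would come from `retrieve(question)["source"]` and the relevant sets from a hand-labelled question set.

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant sources that appear in the top-k retrieved sources."""
    if not relevant:
        raise ValueError("each labelled question needs at least one relevant source")
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)


# Hypothetical labelled set: ranked retrieval output plus the sources judged relevant.
labelled = [
    {
        "question": "How often should hydraulic oil be replaced?",
        "retrieved": ["press_maintenance.txt", "conveyor_ops.txt", "vision_sop.txt"],
        "relevant": {"press_maintenance.txt"},
    },
    {
        "question": "How often should vision lenses be cleaned?",
        "retrieved": ["conveyor_ops.txt", "vision_sop.txt", "press_maintenance.txt"],
        "relevant": {"vision_sop.txt"},
    },
]


def mean_recall(examples: list[dict], k: int) -> float:
    """Average recall@k over all labelled questions."""
    scores = [recall_at_k(ex["retrieved"], ex["relevant"], k) for ex in examples]
    return sum(scores) / len(scores)


recall_1 = mean_recall(labelled, k=1)
recall_2 = mean_recall(labelled, k=2)
print(recall_1, recall_2)
```

Comparing the metric at several values of k shows whether relevant chunks are missing entirely or merely ranked too low, which in turn suggests whether to change the embedding model or add a re-ranking step.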