<a href="https://colab.research.google.com/github/Decoding-Data-Science/aiguild/blob/main/RAG_Demo_LangChain_Chroma_Colab.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# RAG Demo with LangChain + Chroma (Google Colab)

This notebook is a clean, step-by-step **Retrieval-Augmented Generation (RAG)** demo using:

- **LangChain** (orchestration)
- **Chroma** (vector store)
- **OpenAI Embeddings + Chat Model** (embedding + generation)

You can run this as a live demo: **notes → code → notes → code**.

---

## What you will build

A tiny pipeline that:
1. Creates a sample document (in code)
2. Splits it into chunks
3. Embeds chunks and stores them in **Chroma**
4. Retrieves the most relevant chunks for a query
5. Generates an answer using the retrieved context

> **Assumption:** Your OpenAI API key is stored in **Google Colab Secrets** under the name: `OpenAI`.


## Step 0 — Install minimal required packages

We install only what this demo uses:
- `langchain`
- `langchain-openai`
- `langchain-community`
- `langchain-text-splitters`
- `chromadb`


In [None]:
!pip -q install langchain langchain-openai langchain-community langchain-text-splitters chromadb

## Step 1 — Load OpenAI API key from Colab Secrets

In Google Colab:
1. Click the **key icon** (Secrets) in the left sidebar
2. Add a secret named **`OpenAI`**
3. Paste your OpenAI API key as its value

This cell reads the secret and sets `OPENAI_API_KEY` in the environment.


You can create or manage API keys at https://platform.openai.com/api-keys
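Outside Colab, `google.colab` is not importable, so the key has to come from somewhere else. The following is a minimal sketch of one possible fallback, not part of the original notebook: `resolve_api_key` and its parameters are hypothetical names, and the function simply checks a mapping (such as `os.environ`) before optionally prompting.

```python
from typing import Callable, Mapping, Optional


def resolve_api_key(
    env: Mapping[str, str],
    prompt_fn: Optional[Callable[[str], str]] = None,
    var_name: str = "OPENAI_API_KEY",
) -> str:
    """Return the key from `env`; otherwise ask `prompt_fn`; raise if still empty."""
    # Hypothetical helper: not provided by Colab, LangChain, or OpenAI.
    key = env.get(var_name, "").strip()
    if not key and prompt_fn is not None:
        # prompt_fn receives a prompt string and returns whatever the user typed.
        key = prompt_fn(f"Enter {var_name}: ").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set.")
    return key
```

On a local machine this could be wired up as `os.environ["OPENAI_API_KEY"] = resolve_api_key(os.environ, getpass.getpass)`, which keeps the key out of the notebook source.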

In [7]:
import os

# Colab-only: read secret named "OpenAI"
try:
    from google.colab import userdata
    os.environ["OPENAI_API_KEY"] = userdata.get("OpenAI")  # name must match the secret exactly
    assert os.environ.get("OPENAI_API_KEY"), "OPENAI_API_KEY is empty. Check Colab Secret named 'OpenAI'."
    print("✅ OPENAI_API_KEY loaded from Colab Secrets.")
except Exception as e:
    print("⚠️ Could not load from Colab Secrets. If you're not in Colab, set OPENAI_API_KEY another way.")
    print("Error:", e)

✅ OPENAI_API_KEY loaded from Colab Secrets.


## Step 2 — Import the libraries

We use:
- `OpenAIEmbeddings` for embeddings
- `ChatOpenAI` as the LLM
- `Chroma` as the vector store
- `RecursiveCharacterTextSplitter` to chunk text
- A simple prompt to enforce “answer from context”


In [3]:
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

from langchain_core.documents import Document
from langchain_text_splitters import RecursiveCharacterTextSplitter

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

print("✅ Imports successful.")



✅ Imports successful.


## Step 3 — Create a sample document (in code)

For a demo, it’s convenient to keep content in the notebook.
You can replace `document_text` later with real plant documentation.


In [4]:
document_text = """
ENERGY PLANT OPERATIONS NOTE
Document ID: OPS-PMP-017
Title: Troubleshooting Overheated Cooling Water Pump (P-201A)
Site: Al Noor Combined Cycle Power Plant
Unit: Block 2 | Area: Cooling Water System (CWS)
Last Updated: 2025-12-10
Owner: Maintenance Reliability Team

1) PURPOSE
This document provides a step-by-step procedure to diagnose and stabilize an overheated cooling water pump (P-201A). It is intended for control room operators, rotating equipment technicians, and reliability engineers.

2) EQUIPMENT OVERVIEW
- Asset Tag: P-201A (Cooling Water Pump – Duty)
- Type: Horizontal centrifugal pump, electric motor driven
- Normal Operating Range:
  - Discharge Pressure: 6.0–7.5 bar
  - Flow: 1,500–2,200 m3/h (depending on condenser demand)
  - Bearing Temperature (DE/NDE): 55–80°C typical (alarm at 90°C, trip at 100°C)
  - Motor Current: 120–165 A typical
- Instruments:
  - TT-201A-DE / TT-201A-NDE (bearing temp)
  - PT-201A-DIS (discharge pressure)
  - FT-201A (flow)
  - VT-201A (vibration monitoring, if installed)

3) INCIDENT TRIGGER (SYMPTOMS)
Typical event conditions reported in DCS:
- Alarm: “P-201A Bearing Temp High” (TT-201A-DE > 90°C)
- Operator notes:
  - Discharge pressure dropping or unstable
  - Abnormal vibration trend or noise near coupling
  - Seal flush line temperature elevated (if applicable)
  - Motor current rising above normal range

4) SAFETY FIRST (MANDATORY)
Before any field action:
- Follow plant LOTO procedure if opening guards, removing coupling cover, or working on energized equipment.
- Wear PPE: gloves, face shield, safety shoes, hearing protection.
- Hot surfaces hazard: bearing housings and casing can exceed safe touch temperature.
- If leak observed near mechanical seal, treat as pressurized water hazard.

5) QUICK STABILIZATION (FIRST 5 MINUTES)
Goal: prevent equipment damage and maintain plant cooling demand.
A. Confirm the alarm source:
   - Verify TT values for DE and NDE.
   - Cross-check with handheld IR thermometer if safe.
B. Reduce stress on the pump (operator action):
   - Verify pump is operating near its best efficiency point (BEP).
   - If suction pressure is low, check upstream strainers and basin level.
C. If temperature continues rising rapidly:
   - Prepare to start standby pump P-201B (if available).
   - Coordinate with Control Room Supervisor for controlled switchover.

6) ROOT CAUSE CHECKLIST (MOST COMMON)
A. Low suction / cavitation
- Evidence:
  - “Gravel” sound, fluctuating discharge pressure, vibration spikes
- Checks:
  - Cooling water basin level low
  - Suction strainer DP high (clogging)
  - Air ingress at suction flange / gasket
- Immediate action:
  - Clean strainer (per permit), restore basin level, verify suction valves fully open

B. Bearing lubrication issue
- Evidence:
  - DE bearing heats faster than NDE
  - Grease purge blocked or grease contaminated
- Checks:
  - Grease type correct per OEM?
  - Over-greasing can cause heat (too much grease churn)
- Action:
  - Follow OEM lubrication interval and quantity
  - If contamination suspected, plan bearing inspection

C. Misalignment / coupling issue
- Evidence:
  - Vibration increasing across 1X running speed
  - Coupling hot spot, noise, abnormal wear
- Action:
  - Schedule laser alignment check after switching to standby pump

D. Mechanical seal / seal flush failure
- Evidence:
  - Seal area hot, leakage rate changes, flush line blocked
- Checks:
  - Seal flush valve open? Flush flow present?
  - Flush strainer clogged?
- Action:
  - Restore flush flow, inspect flush strainer, log in CMMS

E. Blocked discharge / operation at low flow
- Evidence:
  - Discharge pressure high but flow low
  - Pump recirculation/min-flow line closed
- Action:
  - Verify min-flow / recirculation line per procedure
  - Avoid prolonged low-flow operation (overheating risk)

7) CONTROL ROOM DECISION: WHEN TO SWITCH TO STANDBY PUMP
Switch from P-201A to P-201B if any of the following:
- Bearing temperature > 95°C and rising for 3 minutes
- Vibration exceeds site limit (e.g., > 7.1 mm/s RMS)
- Motor current > 180 A with unstable discharge pressure
- Visible seal leak worsening or safety risk present

8) DOCUMENTATION & SYSTEMS (SAP / CMMS / DCS)
Log the event with:
- DCS Alarm Screenshot / Trend (TT, PT, FT, current)
- Operator logbook entry (time, alarm, actions taken)
- SAP PM Notification template:
  - Equipment: P-201A
  - Symptom: Overheating DE bearing temp high
  - Suspected cause: (choose from checklist)
  - Immediate action: (e.g., switched to P-201B, cleaned strainer)
  - Recommended follow-up: alignment check, bearing inspection, seal flush check

9) RECOMMENDED FOLLOW-UP WORK (NEXT 24–72 HOURS)
- Inspect suction strainer DP transmitter calibration
- Review vibration spectrum (if condition monitoring exists)
- Check lubrication records (grease type/quantity/interval)
- Verify pump curve vs operating point (ensure near BEP)
- Conduct coupling alignment and soft foot check
- If repeated events: perform RCA (5-Why) and update PM plan

10) KEY TERMS (FOR RAG SEARCH)
P-201A, pump overheating, bearing temperature high, cavitation, suction strainer DP, seal flush, misalignment, coupling heat, cooling water system, standby pump switchover, SAP PM notification, DCS alarm trend.
"""

docs = [Document(page_content=document_text, metadata={"source": "demo_in_memory"})]
print(f"✅ Created {len(docs)} document(s).")

✅ Created 1 document(s).


## Step 4 — Split the document into chunks

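RAG retrieves individual chunks rather than whole documents, so the knowledge base is first split into pieces small enough to embed and match against a query.

- `chunk_size`: maximum length of each chunk, measured in characters by default
- `chunk_overlap`: number of characters repeated between consecutive chunks to preserve context across chunk boundaries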
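
To make the two parameters concrete, here is a simplified sketch of fixed-window chunking with overlap. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter also tries to break on separators such as paragraphs and lines); `sliding_chunks` is a hypothetical helper that only illustrates how size and overlap interact.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut `text` into windows of `chunk_size` characters that share `chunk_overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


print(sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=2))
# → ['abcd', 'cdef', 'efgh', 'ghij']
```

Each chunk repeats the last two characters of the previous one, which is the role `chunk_overlap` plays at a larger scale in the cell below.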


In [16]:
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100)
splits = splitter.split_documents(docs)

print(f"✅ Split into {len(splits)} chunk(s).")
print("\n--- Sample chunk ---\n")
print(splits[0].page_content[:400])

✅ Split into 15 chunk(s).

--- Sample chunk ---

ENERGY PLANT OPERATIONS NOTE
Document ID: OPS-PMP-017
Title: Troubleshooting Overheated Cooling Water Pump (P-201A)
Site: Al Noor Combined Cycle Power Plant
Unit: Block 2 | Area: Cooling Water System (CWS)
Last Updated: 2025-12-10
Owner: Maintenance Reliability Team

1) PURPOSE
This document provides a step-by-step procedure to diagnose and stabilize an overheated cooling water pump (P-201A). It i


## Step 5 — Create embeddings and store them in Chroma

This step converts each chunk into a vector embedding and stores it in a Chroma collection.

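Notes:
- `collection_name` identifies the dataset inside Chroma.
- `persist_directory` lets you reuse the index across runs (optional but useful in demos).
- Re-running this cell with the same `collection_name` and `persist_directory` adds the chunks to the existing collection again, so retrieval can return duplicate chunks. Delete the directory first if you want a clean rebuild.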
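
Because the index is persisted, indexing the same chunks twice can leave duplicate entries in the collection. One common way to make indexing repeatable is to derive a deterministic ID from each chunk's content. The sketch below shows only that ID derivation; `stable_chunk_id` is a hypothetical helper, and whether passing such IDs makes the store overwrite rather than duplicate entries is an assumption to verify against your installed Chroma and LangChain versions.

```python
import hashlib


def stable_chunk_id(source: str, text: str) -> str:
    """Derive a deterministic ID from a chunk's source label and its text."""
    # Hypothetical helper: identical (source, text) pairs always map to the same ID.
    digest = hashlib.sha256(f"{source}\n{text}".encode("utf-8")).hexdigest()
    return digest[:32]  # shortened for readability; still long enough for a demo
```

In this notebook the IDs could be built as `[stable_chunk_id(d.metadata["source"], d.page_content) for d in splits]`. Note that two chunks with identical text from the same source would share an ID.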


In [17]:
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

vectorstore = Chroma.from_documents(
    documents=splits,
    embedding=embeddings,
    collection_name="energy-plant-demo",
    persist_directory="./chroma_energy_plant_demo"  # optional persistence
)

print("✅ Chroma vector store created (and persisted to ./chroma_energy_plant_demo).")

✅ Chroma vector store created (and persisted to ./chroma_energy_plant_demo).


## Step 6 — Create a retriever

The retriever fetches the **top-k** most relevant chunks for a question.

Important:
- In newer LangChain versions, retrievers are **Runnables**.
- We therefore call `retriever.invoke(query)` instead of the older, deprecated `get_relevant_documents`.
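
Conceptually, top-k retrieval scores every stored vector against the query vector and keeps the best `k`. The sketch below illustrates that idea with cosine similarity on plain Python lists. It is an illustration only: the metric Chroma actually uses depends on how the collection is configured, and `cosine_similarity` and `top_k` are hypothetical helpers, not LangChain or Chroma functions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, chunk_vecs, k=4):
    """Return the indices of the k chunk vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, v), i) for i, v in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # highest similarity first
    return [i for _, i in scored[:k]]


vectors = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
print(top_k([1.0, 0.0], vectors, k=2))
# → [0, 2]
```

A real vector store avoids scoring every vector by using an approximate nearest-neighbor index, but the ranking idea is the same.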


In [9]:
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})
print("✅ Retriever ready.")

✅ Retriever ready.


## Step 7 — Retrieve context for a question (and inspect results)

For demos, it’s helpful to show what was retrieved before generating an answer.
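
When inspecting results, the same chunk text can show up more than once, for example if the index contains duplicate entries. The sketch below is one way to drop repeats before building the context. `Chunk` is a hypothetical stand-in for LangChain's `Document` (only the `page_content` attribute is assumed), and `dedupe_chunks` is not a library function.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical stand-in for a retrieved document; only page_content is used."""
    page_content: str


def dedupe_chunks(chunks):
    """Keep the first occurrence of each distinct chunk text, preserving order."""
    seen = set()
    unique = []
    for chunk in chunks:
        key = chunk.page_content.strip()  # ignore leading/trailing whitespace
        if key not in seen:
            seen.add(key)
            unique.append(chunk)
    return unique
```

Since the function only reads `page_content`, it should also work on the list returned by `retriever.invoke(question)`.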


In [18]:
question = "What is the safety procedure before maintenance?"

retrieved_docs = retriever.invoke(question)

print(f"✅ Retrieved {len(retrieved_docs)} chunk(s).\n")
for i, d in enumerate(retrieved_docs, 1):
    print(f"--- Retrieved chunk {i} ---")
    print(d.page_content.strip())
    print()

✅ Retrieved 4 chunk(s).

--- Retrieved chunk 1 ---
4) SAFETY FIRST (MANDATORY)
Before any field action:
- Follow plant LOTO procedure if opening guards, removing coupling cover, or working on energized equipment.
- Wear PPE: gloves, face shield, safety shoes, hearing protection.
- Hot surfaces hazard: bearing housings and casing can exceed safe touch temperature.
- If leak observed near mechanical seal, treat as pressurized water hazard.

--- Retrieved chunk 2 ---
4) SAFETY FIRST (MANDATORY)
Before any field action:
- Follow plant LOTO procedure if opening guards, removing coupling cover, or working on energized equipment.
- Wear PPE: gloves, face shield, safety shoes, hearing protection.
- Hot surfaces hazard: bearing housings and casing can exceed safe touch temperature.
- If leak observed near mechanical seal, treat as pressurized water hazard.

--- Retrieved chunk 3 ---
7) CONTROL ROOM DECISION: WHEN TO SWITCH TO STANDBY PUMP
Switch from P-201A to P-201B if any of the following:
- 

## Step 8 — Generate an answer from retrieved context (RAG)

We now feed the retrieved chunks into a prompt and ask the LLM to answer **only from context**.
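
To show what "feeding chunks into a prompt" amounts to, here is a plain-string sketch of roughly what the model receives. It mirrors the system and human messages used in this step but does not use LangChain; `build_messages` is a hypothetical helper for illustration.

```python
SYSTEM = (
    "You answer using ONLY the provided context. "
    "If the answer is not in the context, say you don't know."
)
HUMAN_TEMPLATE = "Context:\n{context}\n\nQuestion: {question}"


def build_messages(chunk_texts, question):
    """Join chunk texts with blank lines and fill the human message template."""
    context = "\n\n".join(chunk_texts)
    human = HUMAN_TEMPLATE.format(context=context, question=question)
    return [("system", SYSTEM), ("human", human)]


for role, text in build_messages(["Chunk A", "Chunk B"], "What is covered?"):
    print(f"[{role}]\n{text}\n")
```

The model never sees the vector store; it only sees this text, which is why retrieval quality directly bounds answer quality.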


In [19]:
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You answer using ONLY the provided context. If the answer is not in the context, say you don't know."),
    ("human", "Context:\n{context}\n\nQuestion: {question}")
])

def format_docs(docs):
    return "\n\n".join(d.page_content for d in docs)

chain = prompt | llm | StrOutputParser()

answer = chain.invoke({
    "context": format_docs(retrieved_docs),
    "question": question
})

print("Answer:")
print(answer)

Answer:
Before any field action, the safety procedure includes the following steps:
- Follow plant LOTO procedure if opening guards, removing coupling cover, or working on energized equipment.
- Wear PPE: gloves, face shield, safety shoes, hearing protection.
- Be aware of hot surfaces hazard: bearing housings and casing can exceed safe touch temperature.
- If a leak is observed near the mechanical seal, treat it as a pressurized water hazard.


## Step 9 — Try more questions

Change the `question` below and re-run the retrieval + generation cells.

Suggested demo questions:
- “What is the alarm escalation rule?”
- “How often should coolant pumps be inspected?” (not covered by the sample document, so the model should say it doesn't know)
- “What fields should be captured in incident logging?”
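
Rather than re-running two cells per question, the retrieve-then-generate steps can be bundled into one helper. The sketch below takes the retrieval and generation steps as plain callables so it stays independent of any library; `ask_all`, `retrieve`, and `generate` are hypothetical names.

```python
from typing import Callable, Dict, Iterable, List


def ask_all(
    questions: Iterable[str],
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str, str], str],
) -> Dict[str, str]:
    """Run retrieve-then-generate for each question and collect the answers."""
    answers: Dict[str, str] = {}
    for q in questions:
        context = "\n\n".join(retrieve(q))  # same joining rule as format_docs
        answers[q] = generate(context, q)
    return answers
```

In this notebook, `retrieve` could be `lambda q: [d.page_content for d in retriever.invoke(q)]` and `generate` could be `lambda c, q: chain.invoke({"context": c, "question": q})`.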


In [24]:
question = "Who is the owner of the document?"

retrieved_docs = retriever.invoke(question)
answer = chain.invoke({
    "context": format_docs(retrieved_docs),
    "question": question
})

print("Answer:")
print(answer)

Answer:
The owner of the document is the Maintenance Reliability Team.


## Step 10 — (Optional) Reload the same Chroma index later

If you used `persist_directory`, you can reload the same collection without re-embedding.

Use this in a fresh runtime if you want to show persistence behavior.
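
Before reloading, it can help to check that the persisted directory actually exists and is non-empty, so a fresh runtime does not silently open an empty collection. This is a minimal sketch; `has_persisted_index` is a hypothetical helper, and it only checks for files on disk, not for a valid Chroma index.

```python
from pathlib import Path


def has_persisted_index(persist_directory: str) -> bool:
    """Return True if the directory exists and contains at least one entry."""
    path = Path(persist_directory)
    return path.is_dir() and any(path.iterdir())


if has_persisted_index("./chroma_energy_plant_demo"):
    print("Found a persisted index; reloading is possible.")
else:
    print("No persisted index found; run the embedding step first.")
```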


In [None]:
# OPTIONAL: Reload example (uncomment to use in a fresh session)
# embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
# vectorstore = Chroma(
#     collection_name="energy-plant-demo",
#     embedding_function=embeddings,
#     persist_directory="./chroma_energy_plant_demo"
# )
# retriever = vectorstore.as_retriever(search_kwargs={"k": 4})
# print("✅ Reloaded persisted Chroma collection.")