# Vertex AI RAG — Beginner Quickstart (Project: `xubisid-demo3`)

This notebook shows a **simple Retrieval-Augmented Generation (RAG)** pipeline using:

- **Vertex AI Gemini** for generation
- **Vertex AI `text-embedding-004`** for embeddings
- **FAISS (in-memory)** for vector search (beginner-friendly)

You can later swap FAISS for **Vertex AI Vector Search** once you're comfortable.

## 0) Prerequisites (run in your terminal / Cloud Shell **once**)

```bash
export PROJECT_ID=xubisid-demo3
export REGION=us-central1
gcloud config set project $PROJECT_ID
gcloud config set ai/region $REGION

# Enable APIs
gcloud services enable aiplatform.googleapis.com storage.googleapis.com run.googleapis.com

# (Optional) Create a service account for notebooks / CI
gcloud iam service-accounts create rag-sa --display-name="RAG Service Account"
gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member="serviceAccount:rag-sa@${PROJECT_ID}.iam.gserviceaccount.com" \
  --role="roles/aiplatform.user"
gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member="serviceAccount:rag-sa@${PROJECT_ID}.iam.gserviceaccount.com" \
  --role="roles/storage.objectAdmin"

# (Optional) Create a bucket for your datasets/artifacts
export BUCKET=${PROJECT_ID}-rag-demo
gsutil mb -l $REGION gs://$BUCKET || echo "Bucket may already exist"
```

Then open this notebook in **Vertex AI Workbench** or **Colab**. If you use Workbench, make sure the runtime has internet access and that you are authenticated to Google Cloud.

## 1) Install dependencies (once per runtime)

In [ ]:
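# Quote each version specifier: an unquoted ">=" is treated by the shell as output redirection.
!pip -q install "google-cloud-aiplatform>=1.60.0" faiss-cpu "langchain>=0.2.0" "langchain-community>=0.2.0" "langchain-google-vertexai>=2.0.0" langchain-text-splitters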

## 2) Set your project and region

In [ ]:
PROJECT_ID = "xubisid-demo3"  # <- change if needed
REGION = "us-central1"

from google.cloud import aiplatform
aiplatform.init(project=PROJECT_ID, location=REGION)
print("Vertex AI initialized:", PROJECT_ID, REGION)

## 3) Prepare some sample documents
We'll create a few small knowledge base documents directly in the notebook. Replace these with your PDFs/CSVs later.

In [ ]:
sample_docs = [
    ("doc1.txt", "GoBazzar is an e-commerce price comparison platform. It ingests product feeds and normalizes data."),
    ("doc2.txt", "EJADH Inspector app assigns field inspectors by region and city with shift-based scheduling."),
    ("doc3.txt", "Vertex AI supports Gemini for generation, and text-embedding-004 for high-quality embeddings."),
]

import os
os.makedirs("data", exist_ok=True)
for fname, text in sample_docs:
    with open(os.path.join("data", fname), "w", encoding="utf-8") as f:
        f.write(text)
print("Wrote", os.listdir("data"))

## 4) Chunk + Embed with Vertex AI `text-embedding-004`

In [ ]:
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter  # current home of the splitters
from langchain_google_vertexai import VertexAIEmbeddings

# Load docs
docs = []
for fname, _ in sample_docs:
    loader = TextLoader(os.path.join("data", fname), encoding="utf-8")
    docs.extend(loader.load())

# Chunk
splitter = RecursiveCharacterTextSplitter(chunk_size=400, chunk_overlap=80)
chunks = splitter.split_documents(docs)
print("Total chunks:", len(chunks))

# Embeddings via Vertex AI
emb_model = VertexAIEmbeddings(model_name="text-embedding-004")
embeddings = emb_model.embed_documents([c.page_content for c in chunks])
print("Got", len(embeddings), "embeddings, dim:", len(embeddings[0]))
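
To make `chunk_size` and `chunk_overlap` concrete, here is a simplified sliding-window sketch in plain Python. It is only an illustration: `sliding_chunks` is a hypothetical helper, and `RecursiveCharacterTextSplitter` behaves differently because it prefers to split on separators such as paragraphs, newlines, and spaces, so its chunks will not match these exactly.

```python
def sliding_chunks(text: str, chunk_size: int = 400, chunk_overlap: int = 80) -> list[str]:
    """Split text into fixed-size character windows that overlap (illustrative only)."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks

# Tiny values make the overlap easy to see: each chunk repeats the last 2 characters.
demo = sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=2)
print(demo)
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of embedding some text twice.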

## 5) Build a FAISS vector index (local, simple)

In [ ]:
import faiss
import numpy as np

dim = len(embeddings[0])
# Inner product over L2-normalized vectors equals cosine similarity,
# so normalize before adding (the query is normalized the same way in step 6).
index = faiss.IndexFlatIP(dim)
vecs = np.array(embeddings).astype("float32")
faiss.normalize_L2(vecs)  # normalizes in place
index.add(vecs)
print("FAISS index size:", index.ntotal)
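
The index above relies on the fact that the inner product of two L2-normalized vectors equals their cosine similarity. A minimal plain-Python check of that identity follows; the helper names (`l2_normalize`, `dot`, `cosine`) and the toy vectors are illustrative, not part of FAISS.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity computed directly from the definition."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

a = [3.0, 4.0]
b = [4.0, 3.0]
ip_of_normalized = dot(l2_normalize(a), l2_normalize(b))
print(ip_of_normalized, cosine(a, b))  # the two values agree up to rounding
```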

## 6) Define a retriever helper

In [ ]:
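def retrieve(query: str, top_k: int = 3):
    """Return the text of up to top_k chunks most similar to the query."""
    q_emb = emb_model.embed_query(query)
    q = np.array([q_emb]).astype("float32")
    faiss.normalize_L2(q)  # match the normalization applied to the indexed vectors
    _scores, idxs = index.search(q, top_k)
    ctx = []
    for i in idxs[0]:
        if i == -1:  # FAISS pads with -1 when fewer than top_k vectors exist
            continue
        ctx.append(chunks[i].page_content)
    return ctx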

print(retrieve("What is GoBazzar?"))
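
Conceptually, a flat inner-product index scores the query against every stored vector and keeps the best `top_k`. The sketch below reproduces that idea in plain Python on toy data; `top_k_inner_product` is a hypothetical helper for illustration and is far slower than FAISS.

```python
import heapq

def top_k_inner_product(query, vectors, k=3):
    """Return (index, score) pairs for the k vectors with the largest inner product."""
    scores = [sum(q * x for q, x in zip(query, vec)) for vec in vectors]
    best = heapq.nlargest(k, range(len(scores)), key=scores.__getitem__)
    return [(i, scores[i]) for i in best]

# Unit-length toy vectors, so the scores are cosine similarities.
toy_vectors = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
hits = top_k_inner_product([1.0, 0.0], toy_vectors, k=2)
print(hits)
```

Unlike `index.search`, this sketch simply returns fewer results when `k` exceeds the number of vectors, rather than padding with `-1`.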

## 7) Call Gemini to generate grounded answers

In [ ]:
from vertexai.generative_models import GenerativeModel  # stable module; the preview path is not needed here

gemini = GenerativeModel("gemini-1.5-flash")  # or "gemini-1.5-pro"

def rag_answer(question: str, top_k: int = 3):
    context = retrieve(question, top_k=top_k)
    prompt = (
        "You are a helpful assistant. Use ONLY the context below to answer.\n\n"
        f"Context:\n{chr(10).join([f'- {c}' for c in context])}\n\n"
        f"Question: {question}\n"
        "If the answer is not in the context, say you don't know."
    )
    resp = gemini.generate_content(prompt)
    return resp.text

print(rag_answer("How does EJADH Inspector app schedule inspectors?"))
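
It can help to separate prompt assembly from the model call, so you can inspect exactly what is sent without spending a Gemini request. A minimal sketch, assuming the same prompt wording as `rag_answer`; `build_prompt` is a hypothetical helper name.

```python
def build_prompt(question: str, context: list[str]) -> str:
    """Assemble the grounded prompt from retrieved chunks and the user question."""
    context_block = "\n".join(f"- {c}" for c in context)  # one bullet per chunk
    return (
        "You are a helpful assistant. Use ONLY the context below to answer.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {question}\n"
        "If the answer is not in the context, say you don't know."
    )

example_prompt = build_prompt(
    "What is GoBazzar?",
    ["GoBazzar is an e-commerce price comparison platform."],
)
print(example_prompt)
```

Printing the prompt for a question that returns poor answers is often the quickest way to tell whether retrieval or generation is at fault.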

## 8) Try your own questions

In [ ]:
questions = [
    "What models does Vertex AI provide for embeddings?",
    "Explain GoBazzar in one sentence.",
]
for q in questions:
    print("\nQ:", q)
    print("A:", rag_answer(q))

## 9) (Optional) Save/Load FAISS index
You can persist the FAISS index to reuse without recomputing.

In [ ]:
faiss.write_index(index, "faiss.index")
with open("chunks.txt", "w", encoding="utf-8") as f:
    for c in chunks:
        f.write(c.page_content.replace("\n", " ") + "\n")
print("Saved faiss.index and chunks.txt")
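
The cell above only saves. A minimal sketch of the matching load step follows, assuming the two file names written above; `load_chunk_texts` and `load_faiss_index` are hypothetical helper names, while `faiss.read_index` is the FAISS counterpart of `faiss.write_index`.

```python
def load_chunk_texts(path: str = "chunks.txt") -> list[str]:
    """Read one chunk per line, in the same order the vectors were added to the index."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]

def load_faiss_index(path: str = "faiss.index"):
    """Load a previously saved FAISS index from disk."""
    import faiss  # imported lazily so the text loader works without FAISS installed
    return faiss.read_index(path)

# Typical use in a fresh runtime (run after the save cell has produced both files):
# index = load_faiss_index()
# chunk_texts = load_chunk_texts()
```

Line `i` of `chunks.txt` corresponds to vector `i` in the index. The reloaded chunks are plain strings, so a retriever built on them would return `chunk_texts[i]` rather than `chunks[i].page_content`.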

## 10) Next steps: Move to **Vertex AI Vector Search**
Once you're ready, you can replace FAISS with **Vertex AI Vector Search**. The high-level steps are:

1. Create a Vector Search index (formerly Matching Engine) in Vertex AI
2. Upload your embeddings to Cloud Storage as JSON (one record per line), CSV, or Avro files
3. Deploy the index to an index endpoint
4. Query the endpoint for the top-K vectors
5. Feed retrieved chunks to Gemini as in `rag_answer`

This gives you a managed, scalable vector store.
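
As a starting point for step 2, the sketch below writes embeddings to a local file with one JSON object per line. The field names `id` and `embedding` reflect the Vector Search input format as I understand it, so treat them as an assumption and check the current documentation before uploading; `write_embeddings_jsonl` and the output file name are hypothetical.

```python
import json

def write_embeddings_jsonl(path, ids, vectors):
    """Write one JSON object per line: {"id": ..., "embedding": [...]}."""
    if len(ids) != len(vectors):
        raise ValueError("ids and vectors must have the same length")
    with open(path, "w", encoding="utf-8") as f:
        for item_id, vec in zip(ids, vectors):
            # Assumed record shape; plain floats keep the output JSON-serializable.
            record = {"id": str(item_id), "embedding": [float(x) for x in vec]}
            f.write(json.dumps(record) + "\n")
    return len(ids)

# In this notebook you might call (file name is illustrative):
# write_embeddings_jsonl("embeddings.json", list(range(len(chunks))), embeddings)
```

Using the chunk's position as its `id` keeps the mapping back to `chunks` trivial; a production pipeline would more likely use stable document identifiers.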