<a href="https://colab.research.google.com/github/rahiakela/genai-research-and-practice/blob/main/vector-databases/01_simple_rag_pipeline.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Setup

In [None]:
!pip install chromadb sentence-transformers

In [1]:
# SentenceTransformer produces the embeddings; Client and Settings configure ChromaDB
from sentence_transformers import SentenceTransformer
from chromadb import Client, Settings

## 1. Model Initialization

In [None]:
# Initialize the embedding model
# all-MiniLM-L6-v2 is a small sentence-embedding model that maps text to
# 384-dimensional vectors; it trades some accuracy for fast inference
model = SentenceTransformer('all-MiniLM-L6-v2')

## 2. Vector Store Setup

In [3]:
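# Initialize ChromaDB as our vector store
# Using in-memory storage for this example (nothing is written to disk)
chroma_client = Client(Settings(is_persistent=False))
# get_or_create_collection avoids an "already exists" error if this cell is re-run
collection = chroma_client.get_or_create_collection(name="climate_docs")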

## 3. Document Processing

In [4]:
# Example documents to process
documents = [
    "Climate change is affecting global weather patterns causing more extreme events.",
    "Rising sea levels threaten coastal communities worldwide.",
    "Greenhouse gas emissions continue to rise despite international agreements."
]

In [5]:
# Create embeddings for our documents
# model.encode() converts each text into a dense vector (embedding);
# for a list of strings it returns a 2D numpy array, one row per document
embeddings = model.encode(documents)

## 4. Storing Documents

In [6]:
# Store documents and their embeddings
# Convert the numpy array to plain Python lists before handing it to ChromaDB
# Each document needs a unique string ID; here we use doc_0, doc_1, ...
collection.add(
    embeddings=embeddings.tolist(),
    documents=documents,
    ids=[f"doc_{i}" for i in range(len(documents))]
)
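
The three lists passed to `collection.add()` have to line up: one ID and one embedding per document, with no repeated IDs. A minimal sketch of a pre-flight check is shown below. The helper name `check_batch` is hypothetical and is not part of ChromaDB, which performs its own validation; this only illustrates the constraints.

```python
def check_batch(ids, documents, embeddings):
    """Hypothetical helper: sanity-check a batch before calling collection.add()."""
    # Every document needs exactly one ID and one embedding
    if not (len(ids) == len(documents) == len(embeddings)):
        raise ValueError("ids, documents and embeddings must have the same length")
    # IDs identify records in the collection, so duplicates are rejected here
    if len(set(ids)) != len(ids):
        raise ValueError("ids must be unique")
    # All vectors in one collection should share the same dimensionality
    if len({len(vec) for vec in embeddings}) > 1:
        raise ValueError("all embeddings must have the same dimension")
    return True


# Toy 2-dimensional vectors standing in for real embeddings
ok = check_batch(
    ids=["doc_0", "doc_1"],
    documents=["first text", "second text"],
    embeddings=[[0.1, 0.2], [0.3, 0.4]],
)
print(ok)
```

Running a check like this before `collection.add()` can surface a mismatched batch earlier, with a message that points at the cause.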

## 5. Query Processing

In [7]:
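# Embed the query with the same model used for the documents,
# so that query and documents live in the same vector space
query = "How does climate change affect weather?"
query_embedding = model.encode(query)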

## 6. Similarity Search

In [8]:
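# Search for the documents whose embeddings are closest to the query
# query_embeddings takes a list of vectors (one per query); n_results is the top-k
results = collection.query(
    query_embeddings=[query_embedding.tolist()],
    n_results=2
)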
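
To make the search step less of a black box, here is a rough pure-Python sketch of what a nearest-neighbour query does conceptually: compute a distance from the query vector to every stored vector and keep the closest `k`. It assumes squared L2 distance, which to my understanding is ChromaDB's default metric, and it uses made-up 3-dimensional vectors in place of real embeddings. The function names are illustrative only; ChromaDB itself uses an approximate index rather than this brute-force scan.

```python
import math


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def top_k(query_vec, doc_vecs, k=2):
    """Return (index, distance) pairs for the k closest vectors, nearest first."""
    scored = [(i, squared_l2(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1])[:k]


# Toy vectors (assumed values) standing in for document and query embeddings
toy_docs = [normalize(v) for v in ([1.0, 0.2, 0.0], [0.1, 1.0, 0.3], [0.0, 0.1, 1.0])]
toy_query = normalize([0.9, 0.4, 0.1])

nearest = top_k(toy_query, toy_docs, k=2)
print([index for index, _ in nearest])  # prints [0, 1]
```

For unit-length vectors, squared L2 distance equals `2 - 2 * cosine_similarity`, so ranking by distance and ranking by cosine similarity give the same order.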

In [9]:
# Print results
for doc in results['documents'][0]:
    print(f"Retrieved document: {doc}")

Retrieved document: Climate change is affecting global weather patterns causing more extreme events.
Retrieved document: Rising sea levels threaten coastal communities worldwide.
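
The `[0]` index in the loop above is there because `collection.query()` accepts a batch of queries and returns one inner list per query. The sketch below uses a hand-written dictionary that mimics that shape, so it runs without ChromaDB. The distance values are placeholders invented for illustration, not real output, and `flatten_hits` is a hypothetical helper name.

```python
# Hand-written stand-in for the dict returned by collection.query();
# the distance values are made up for illustration, not real output.
sample_results = {
    "ids": [["doc_0", "doc_1"]],
    "documents": [[
        "Climate change is affecting global weather patterns causing more extreme events.",
        "Rising sea levels threaten coastal communities worldwide.",
    ]],
    "distances": [[0.5, 1.2]],
}


def flatten_hits(results, query_index=0):
    """Pair up id, document, and distance for one query in the batch."""
    return [
        {"id": doc_id, "document": doc, "distance": dist}
        for doc_id, doc, dist in zip(
            results["ids"][query_index],
            results["documents"][query_index],
            results["distances"][query_index],
        )
    ]


hits = flatten_hits(sample_results)
for hit in hits:
    print(f"{hit['id']} (distance {hit['distance']:.2f}): {hit['document']}")
```

Keeping the ID and distance next to each document makes it easier to filter weak matches or cite sources when the retrieved text is later passed to a language model.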
