# Basic RAG with ChromaDB
This notebook demonstrates a minimal Retrieval-Augmented Generation (RAG) pipeline using **ChromaDB** as a persistent vector store.

RAG works by retrieving the documents most relevant to a user's question and passing them to a language model as context for its answer.
The steps in this notebook are:
1. Embed the question and documents using a sentence transformer.
2. Perform a similarity search in ChromaDB.
3. Compose a prompt with the retrieved docs.
4. Ask an LLM to answer.

```User question -> embedding -> retrieval -> prompt -> answer```
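
To make the flow above concrete without any external services, here is a minimal sketch of steps 1 to 3 using only the standard library. It assumes a toy word-count "embedding" and cosine similarity as stand-ins for the sentence transformer and ChromaDB's search; the helper names `embed`, `cosine`, `retrieve`, and `build_prompt` are hypothetical and not part of any library, and the prompt wording is illustrative only.

```python
import math
import re
from collections import Counter

# Toy corpus, copied from the notebook's `texts` list.
texts = [
    'Small dogs are friendly and great for apartments.',
    'Large dogs require more space and daily exercise.',
    'Cats are independent pets that enjoy quiet environments.',
]

def embed(text):
    """Toy embedding: lowercase word counts (NOT a sentence transformer)."""
    return Counter(re.findall(r'[a-z]+', text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, docs, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query, context_docs):
    """Place the retrieved documents and the question in one prompt string."""
    context = '\n'.join(f'- {d}' for d in context_docs)
    return (
        f'Answer using only this context:\n{context}\n\n'
        f'Question: {query}\nAnswer:'
    )

query = 'Which pets are good for apartments?'
top_docs = retrieve(query, texts, k=2)
prompt = build_prompt(query, top_docs)
print(prompt)
```

Step 4 would send `prompt` to an LLM. A real embedding model captures meaning rather than exact word overlap, so its ranking can differ from this toy version.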

In [None]:
# Install dependencies if needed
# !pip install chromadb langchain sentence-transformers openai

In [None]:
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma
from langchain.llms import OpenAI
from langchain.chains import RetrievalQA

DB_DIR = 'rag_db'
embeddings = HuggingFaceEmbeddings(model_name='all-MiniLM-L6-v2')

texts = [
    'Small dogs are friendly and great for apartments.',
    'Large dogs require more space and daily exercise.',
    'Cats are independent pets that enjoy quiet environments.'
]
metadatas = [{'source': f'doc{i}'} for i in range(len(texts))]

vectordb = Chroma.from_texts(texts, embeddings, metadatas=metadatas, persist_directory=DB_DIR)
vectordb.persist()

In [None]:
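# Build a QA chain that "stuffs" all retrieved documents into a single prompt
qa = RetrievalQA.from_chain_type(
    llm=OpenAI(temperature=0),  # expects the OPENAI_API_KEY environment variable
    chain_type='stuff',
    retriever=vectordb.as_retriever(),
)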
query = 'Which pets are good for apartments?'
result = qa({'query': query})
print('Answer:', result['result'])

docs = vectordb.similarity_search(query, k=2)
for doc in docs:
    print(doc.page_content, '->', doc.metadata)

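The loop prints the two documents most similar to the query, together with their metadata. This is a separate `similarity_search` call with `k=2`; the retriever inside the `RetrievalQA` chain runs its own search with its default settings, so the printed documents are not guaranteed to be exactly the ones passed to the LLM. `vectordb.persist()` writes the collection to the `rag_db` directory, so the data stays on disk between notebook runs.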
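
One consequence of persistence is that re-running the cell that calls `Chroma.from_texts` can add the same three texts to the stored collection again. A possible mitigation, sketched below, is to derive a deterministic ID from each text so that repeated runs refer to the same entries. The helper `stable_ids` is hypothetical. Passing its result as `ids=` to `Chroma.from_texts` is an assumption about the installed LangChain version, and whether an existing ID is overwritten or rejected depends on the library versions in use.

```python
import hashlib

def stable_ids(texts):
    """Derive one deterministic ID per text from its SHA-256 digest."""
    return [hashlib.sha256(t.encode('utf-8')).hexdigest()[:16] for t in texts]

# Same corpus as the notebook's `texts` list.
texts = [
    'Small dogs are friendly and great for apartments.',
    'Large dogs require more space and daily exercise.',
    'Cats are independent pets that enjoy quiet environments.',
]
ids = stable_ids(texts)

# Assumed usage with the notebook's objects (not executed here):
# vectordb = Chroma.from_texts(texts, embeddings, metadatas=metadatas,
#                              ids=ids, persist_directory=DB_DIR)
```

Because the IDs depend only on the text, identical documents map to the same ID on every run, while any edit to a document produces a new ID.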