🧠 Simple RAG Practice with ChromaDB + LlamaIndex + Ollama


This is a **minimal** practice version: clean, easy, and focused on understanding how RAG works.


---

In [17]:
!pip install llama-index-core llama-index-llms-ollama llama-index-embeddings-huggingface \
llama-index-vector-stores-chroma chromadb

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)


Collecting llama-index-core
  Using cached llama_index_core-0.14.4-py3-none-any.whl.metadata (2.5 kB)
Collecting llama-index-llms-ollama
  Using cached llama_index_llms_ollama-0.8.0-py3-none-any.whl.metadata (3.6 kB)
Collecting llama-index-embeddings-huggingface
  Using cached llama_index_embeddings_huggingface-0.6.1-py3-none-any.whl.metadata (458 bytes)
Collecting llama-index-vector-stores-chroma
  Downloading llama_index_vector_stores_chroma-0.5.3-py3-none-any.whl.metadata (413 bytes)
Collecting chromadb
  Downloading chromadb-1.1.1-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (7.2 kB)
Collecting aiohttp<4,>=3.8.6 (from llama-index-core)
  Using cached aiohttp-3.13.0-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.metadata (8.1 kB)
Collecting aiosqlite (from llama-index-core)
  Using cached aiosqlite-0.21.0-py3-none-any.whl.metadata (4.3 kB)
Collecting banks<3,>=2.2.0 (from llama-index-core)
  Using cached banks-2.2.0-py3-none-any
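
The `tokenizers` warning printed at the top of the install output names its own fix: set `TOKENIZERS_PARALLELISM` explicitly. A minimal sketch, assuming it runs in a cell before the embedding model is created; the value `false` is a choice made here to silence the warning, not something the original notebook sets.

```python
import os

# Set this before anything loads the `tokenizers` library (for example the
# HuggingFace embedding model), so the library does not have to disable
# parallelism on its own after the process forks.
os.environ["TOKENIZERS_PARALLELISM"] = "false"
```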

Imports

In [18]:
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex, StorageContext
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.ollama import Ollama
from llama_index.vector_stores.chroma import ChromaVectorStore
import chromadb

Load and index documents

In [19]:
data_path = "./data"  # folder with .txt or .pdf files

# Load documents
documents = SimpleDirectoryReader(data_path).load_data()
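
Before loading, it can help to confirm that the data folder exists and contains the file types you expect. The helper below is a hypothetical, standard-library-only sketch (`list_input_files` is not part of LlamaIndex); it looks only at the top level of the folder and assumes `.txt` and `.pdf` are the suffixes of interest.

```python
from pathlib import Path


def list_input_files(folder, suffixes=(".txt", ".pdf")):
    """Return the sorted files directly inside `folder` with a matching suffix.

    Hypothetical helper for a quick sanity check; it does not read the files.
    """
    root = Path(folder)
    if not root.is_dir():
        raise FileNotFoundError(f"Data folder not found: {root}")
    return sorted(
        path
        for path in root.iterdir()
        if path.is_file() and path.suffix.lower() in suffixes
    )
```

Calling `list_input_files(data_path)` and getting an empty list is a quick hint that the reader will have nothing to index.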

Create embedding model

In [20]:
embed_model = HuggingFaceEmbedding(model_name="sentence-transformers/all-MiniLM-L6-v2")
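
The embedding model turns each piece of text into a vector of numbers, and texts with similar meaning end up with vectors that point in similar directions. The sketch below illustrates that idea with cosine similarity on made-up 3-dimensional vectors; real embeddings have many more dimensions, and the distance metric a vector store actually uses depends on its configuration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors invented for illustration; they are not real model output.
query_vec = [1.0, 0.0, 1.0]
similar_vec = [0.9, 0.1, 0.8]
unrelated_vec = [0.0, 1.0, 0.0]

score_similar = cosine_similarity(query_vec, similar_vec)
score_unrelated = cosine_similarity(query_vec, unrelated_vec)
print(score_similar > score_unrelated)
```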

Create ChromaDB client and collection

In [21]:
chroma_client = chromadb.PersistentClient(path="./chroma_db")
collection = chroma_client.get_or_create_collection(name="simple_rag")

Setup vector store and context

In [22]:
vector_store = ChromaVectorStore(chroma_collection=collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

Create index

In [23]:
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context, embed_model=embed_model)
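
Building the index involves splitting each document into smaller chunks, embedding every chunk, and writing the vectors to the Chroma collection. The sketch below shows only the splitting idea, using a hypothetical word-based chunker with overlap; LlamaIndex's own splitter works differently, and the sizes here are arbitrary values for illustration.

```python
def chunk_words(text, chunk_size=8, overlap=2):
    """Split `text` into chunks of `chunk_size` words, sharing `overlap` words.

    Hypothetical helper: a simplified stand-in for a real text splitter.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once this chunk already reaches the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks
```

The overlap means a sentence that straddles a chunk boundary still appears intact in at least one chunk, at the cost of storing some words twice.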

Setup the LLM and Query Engine

In [25]:
llm = Ollama(model="llama3")  # must match a model already pulled into your local Ollama (e.g. `ollama pull llama3`)

In [26]:
query_engine = index.as_query_engine(llm=llm, similarity_top_k=3)
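
With `similarity_top_k=3`, the query engine retrieves the three chunks that score highest against the question and passes them to the LLM as context. The sketch below imitates those two steps on toy data. The scores, the chunk texts, and the prompt wording are all made up for illustration, and the template is not the one LlamaIndex uses.

```python
import heapq


def top_k_chunks(scored_chunks, k=3):
    """Return the k (score, text) pairs with the highest scores, best first."""
    return heapq.nlargest(k, scored_chunks, key=lambda pair: pair[0])


def build_prompt(question, chunks):
    """Join the retrieved chunk texts into one context block for the LLM."""
    context = "\n\n".join(text for _, text in chunks)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


# Toy similarity scores and chunk texts, invented for this example.
scored = [
    (0.12, "Hobbies: hiking"),
    (0.91, "Skills: Python, PySpark"),
    (0.78, "Tools: Docker, Airflow"),
    (0.55, "Education: BSc"),
]

best = top_k_chunks(scored, k=3)
prompt = build_prompt("What are the skills given in the resume?", best)
print(prompt)
```

A larger `k` gives the model more context to draw on but also makes the prompt longer, so it is worth trying a few values on your own documents.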

Ask a question

In [27]:
query = "What are the skills given in the resume?"
response = query_engine.query(query)

In [28]:
print(response)

The individual has listed several skills throughout their resume. These include:

Technical Skills:
- Data Science: Machine Learning and Deep Learning algorithms for Computer Vision and LLM
- Data Engineering: Data Pipelines and Data Profiling
- Python

Tools and Technologies:
1. AWS: AWS Data pipeline, EMR, Sagemaker, SNS, Redshift
2. Azure: Databricks, Azure Data Factory
3. GCP: Vertex ai, Google cloud functions, Cloud Run, App Engine, Cloud Storage, Pub/Sub, BigQuery, Artificial Intelligence, AutoML for Vision
4. Big Data: PySpark
5. Web Frameworks: flask, streamlit, fastapi.
6. Database: MySQL, MongoDB, ElasticSearch, PostgreSQL
7. Orchestration: Airflow, rundeck
8. Containerization: Docker, kubernetes
9. Monitoring: Datadog
