# ChromaDB
ChromaDB (or just Chroma) is an open-source embedding database (vector store) used for retrieval-augmented generation (RAG), semantic search, and other LLM-based applications.

It is built to store and search embeddings (vectors) efficiently, which suits use cases such as:

- AI chatbots (based on document/query retrieval)

- Semantic search over text or logs

- Question-answering systems

- Custom LLM knowledge bases built on embeddings

# 📦 What is `chromadb.Client`?
`chromadb.Client` is part of the ChromaDB Python SDK. It is the main entry point for interacting with a Chroma vector database from your Python code.

You use it to:

- Initialize a client

- Create, read, and update collections

- Insert and query embeddings

# Example:


```python
import chromadb
from chromadb.config import Settings
```

```python
client = chromadb.Client(Settings(
    # chroma_db_impl="duckdb+parquet",  # default local setup
    persist_directory="./chroma_store",  # folder where data will be stored
    is_persistent=True,
))
```

```python
collection = client.get_or_create_collection(name="my_docs")
collection.add(
    documents=["This is a log line", "Another message"],
    ids=["doc1", "doc2"]
)
```

```python
results = collection.query(query_texts=["log line"], n_results=1)
print(results)
```

Output:

```
{'ids': [['doc1']], 'embeddings': None, 'documents': [['This is a log line']], 'uris': None, 'included': ['metadatas', 'documents', 'distances'], 'data': None, 'metadatas': [[None]], 'distances': [[0.265164315700531]]}
```
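
The printed result is a dict of nested lists: the outer list has one entry per query text, and each inner list has one entry per hit. A minimal sketch of unpacking it follows. The dict is hand-copied from the output above and trimmed to the keys used here, so the snippet runs without Chroma installed.

```python
# Hand-copied from the printed query output (only the keys used below).
results = {
    "ids": [["doc1"]],
    "documents": [["This is a log line"]],
    "metadatas": [[None]],
    "distances": [[0.265164315700531]],
}

# Index 0 selects the hits for the first (and here only) query text.
hits = list(
    zip(results["ids"][0], results["documents"][0], results["distances"][0])
)

for doc_id, text, distance in hits:
    print(f"{doc_id}: {text!r} (distance={distance:.3f})")
```

With several query texts, the same loop would run once per index of the outer lists.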


# ⚙️ Settings Options in `chromadb.Client`
You pass `Settings()` into `chromadb.Client()` to configure behavior.

Here are the most important options:

| Setting Key | Description | Example |
|-------------|-------------|---------|
| chroma_db_impl | Legacy option selecting the database engine: "duckdb+parquet" (local) or "clickhouse" (advanced). Chroma 0.4 and later reject this key, so omit it on current releases | "duckdb+parquet" |
| persist_directory | Directory where data is stored when persistence is enabled | "./chroma_store" |
| anonymized_telemetry | Whether to send anonymized usage data to the Chroma developers (set to False to opt out) | False |
| is_persistent | Whether data is kept on disk after the client shuts down | True |
| allow_reset | Permits `client.reset()`, which deletes all collections and data | True |
| require_hnsw | Use HNSW for ANN search (optional) | True |

# Example with all Options

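```python
import chromadb
from chromadb.config import Settings

settings = Settings(
    # chroma_db_impl="duckdb+parquet",  # legacy key; raises an error on Chroma 0.4+
    persist_directory="./chroma_data",  # folder where data will be stored
    anonymized_telemetry=False,         # opt out of usage reporting
    is_persistent=True,                 # keep data on disk between runs
    allow_reset=True                    # permit client.reset()
)

client = chromadb.Client(settings)
```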


# 🧠 Key Concepts in Chroma
| Concept | Description |
|---------|-------------|
| Collection | Like a table; stores documents and embeddings |
| Document | The actual text (or chunk of text) |
| Embedding | Vector representation of a document |
| Query | Find the most relevant documents using vector similarity |
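
To make the Embedding and Query rows concrete, here is a toy sketch of similarity ranking in plain Python. The three-dimensional vectors and document ids are made up, and cosine similarity is used for illustration only; it is not necessarily the metric or index Chroma uses internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings: each document is represented by one vector.
embeddings = {
    "doc1": [0.9, 0.1, 0.0],
    "doc2": [0.0, 0.2, 0.9],
    "doc3": [0.7, 0.6, 0.1],
}
query = [1.0, 0.0, 0.0]

# A query ranks documents by how similar their vectors are to the query vector.
ranked = sorted(
    embeddings,
    key=lambda doc_id: cosine_similarity(query, embeddings[doc_id]),
    reverse=True,
)
print(ranked)
```

A vector database does the same kind of ranking, but with an index that avoids comparing the query against every stored vector.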


# 🚀 Why Use Chroma?
- Very easy to set up (just install and run locally)

- Built-in persistence (no need for external DB)

- Works well with LangChain, LlamaIndex, and other RAG frameworks

- Supports per-document metadata and metadata filters in queries


# 🔧 How to Install

```bash
pip install chromadb
```