# Chroma QuickStart

Reference: https://docs.trychroma.com/deployment

## Run Chroma locally with Docker

```bash
sudo docker pull chromadb/chroma
sudo docker run -p 8000:8000 chromadb/chroma
```
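
Before moving on, it can help to confirm that something is actually listening on the published port. The following is a minimal sketch using only the Python standard library. The helper name `is_port_open` is hypothetical, and it only checks that a TCP connection succeeds, not that the listener is Chroma.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        # create_connection raises OSError if the port refuses or times out
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Port 8000 matches the `-p 8000:8000` mapping in the docker run command
    print("port 8000 reachable:", is_port_open("localhost", 8000))
```

If this prints `False`, the container is probably not running or the port mapping differs from the one shown.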

## Connect an application to Chroma

Make sure the `chromadb` package from PyPI is installed:

```bash
pip install chromadb
```

```python
import chromadb

# Connect to the Chroma server started with Docker above
chroma_client = chromadb.HttpClient(host='localhost', port=8000)
```

### Create a collection

Collections store embeddings, documents, and any additional metadata.

```python
collection = chroma_client.create_collection(name='my_collection')
```

### Add some text documents to the collection

```python
# No embeddings are passed, so Chroma embeds the documents itself
collection.add(
    documents=["This is a document", "This is another document", "yes no yes no makuna no"],
    metadatas=[{"source": "my_source"}, {"source": "my_source"}, {"source": "my_source"}],
    ids=["id1", "id2", "id3"]
)
```
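
`collection.add` takes parallel lists, so the i-th document, metadata entry, and id must describe the same record. A small sketch of building those lists from a list of records follows. The record layout and the helper name `to_add_kwargs` are assumptions for illustration, not part of the Chroma API.

```python
from typing import Any


def to_add_kwargs(records: list[dict[str, Any]], id_prefix: str = "id") -> dict[str, list]:
    """Turn a list of {"text": ..., "source": ...} records into parallel lists."""
    documents = [r["text"] for r in records]
    metadatas = [{"source": r["source"]} for r in records]
    # Generate ids id1, id2, ... in record order so each id is unique
    ids = [f"{id_prefix}{i}" for i in range(1, len(records) + 1)]
    return {"documents": documents, "metadatas": metadatas, "ids": ids}


records = [
    {"text": "This is a document", "source": "my_source"},
    {"text": "This is another document", "source": "my_source"},
]
kwargs = to_add_kwargs(records)
print(kwargs["ids"])
# With a live collection, the result could be passed as: collection.add(**kwargs)
```

Building all three lists in one place keeps them the same length and in the same order.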

#### If you have already generated embeddings yourself, you can store them in the collection as well

```python
# Pass precomputed embeddings alongside the documents
collection.add(
    embeddings=[[1.2, 2.3, 4.5], [6.7, 8.2, 9.2]],
    documents=["This is a document", "This is another document"],
    metadatas=[{"source": "my_source"}, {"source": "my_source"}],
    ids=["id1", "id2"]
)
```

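This call fails with the exception below. The collection already contains 384-dimensional embeddings, which Chroma's default embedding function generated for the documents added earlier, so the 3-dimensional vectors supplied here are rejected:

```text
Exception: {"detail":"Embedding dimension 3 does not match collection dimensionality 384"}
```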
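
A dimension mismatch like this can be caught on the client before calling `add`. The sketch below is plain Python. `check_dimensions` is a hypothetical helper, and the expected dimension of 384 is taken from the error message rather than queried from the server.

```python
def check_dimensions(embeddings: list[list[float]], expected_dim: int) -> None:
    """Raise ValueError if any embedding does not have expected_dim components."""
    for i, vector in enumerate(embeddings):
        if len(vector) != expected_dim:
            raise ValueError(
                f"embedding {i} has dimension {len(vector)}, expected {expected_dim}"
            )


embeddings = [[1.2, 2.3, 4.5], [6.7, 8.2, 9.2]]

try:
    check_dimensions(embeddings, expected_dim=384)
except ValueError as err:
    print(err)  # embedding 0 has dimension 3, expected 384
```

To store custom 3-dimensional vectors, one option is a separate collection that only ever receives embeddings of that size.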

### Query the collection

You can query the collection with a list of query texts; for each one, Chroma returns the `n_results` most similar documents.

```python
results = collection.query(
    query_texts=["This is a query document"],
    n_results=2
)
results
```

```text
{'ids': [['id1', 'id2']],
 'distances': [[0.711121446165086, 1.010977382542355]],
 'embeddings': None,
 'metadatas': [[{'source': 'my_source'}, {'source': 'my_source'}]],
 'documents': [['This is a document', 'This is another document']],
 'uris': None,
 'data': None}
```
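
Each populated field in the result is a list with one inner list per query text, so the hits for the first query sit at index 0. As a sketch, the snippet below flattens that structure into one row per hit. It works on a literal copy of the output shown here so that it runs without a server, and `hits_for_query` is a hypothetical helper name.

```python
results = {
    "ids": [["id1", "id2"]],
    "distances": [[0.711121446165086, 1.010977382542355]],
    "metadatas": [[{"source": "my_source"}, {"source": "my_source"}]],
    "documents": [["This is a document", "This is another document"]],
}


def hits_for_query(results: dict, query_index: int = 0) -> list[dict]:
    """Collect id, distance, document, and metadata for one query's hits."""
    return [
        {"id": id_, "distance": dist, "document": doc, "metadata": meta}
        for id_, dist, doc, meta in zip(
            results["ids"][query_index],
            results["distances"][query_index],
            results["documents"][query_index],
            results["metadatas"][query_index],
        )
    ]


for hit in hits_for_query(results):
    # A smaller distance means the document is closer to the query
    print(f'{hit["id"]}  {hit["distance"]:.3f}  {hit["document"]}')
```

With several query texts, calling the helper with `query_index=1`, `2`, and so on would return the hits for each subsequent query.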