## Chroma Client 

[Create-Get-Delete](https://docs.trychroma.com/docs/collections/create-get-delete)

In [88]:
pip install chromadb python-dotenv


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m24.0[0m[39;49m -> [0m[32;49m24.3.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m
Note: you may need to restart the kernel to use updated packages.


**Create a PersistentClient**

In [89]:
import chromadb

# create a client that will load our existing chromadb
client = chromadb.PersistentClient(path="chroma_db/")


**Check Our Connection**

In [90]:
client.heartbeat()

1735881139260715000
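
The value returned by `heartbeat()` is a timestamp in nanoseconds since the Unix epoch, so a recent value means the client is responding. As a minimal sketch, the number printed above can be converted to a readable UTC time; the helper name `heartbeat_to_datetime` is hypothetical and not part of Chroma.

```python
from datetime import datetime, timezone


def heartbeat_to_datetime(heartbeat_ns: int) -> datetime:
    """Convert a nanosecond epoch timestamp to a timezone-aware UTC datetime."""
    # Split into whole seconds and leftover nanoseconds to avoid float rounding.
    seconds, nanos = divmod(heartbeat_ns, 1_000_000_000)
    base = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return base.replace(microsecond=nanos // 1000)


# The heartbeat value shown in the cell output above
ts = heartbeat_to_datetime(1735881139260715000)
print(ts.isoformat())  # 2025-01-03T05:12:19.260715+00:00
```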

**Load the API Key**

In [91]:
from dotenv import load_dotenv
import os

# load the .env file that holds the LLM API keys
load_dotenv()

# read the OpenAI API key from the environment
OPEN_AI_API_KEY = os.getenv("OPEN_AI_API_KEY")

if not OPEN_AI_API_KEY:
    raise ValueError("OPEN_AI_API_KEY not found, please add your API key to a .env file")

**Create Embedding Function**

[Embedding Functions](https://docs.trychroma.com/integrations/embedding-models/openai) tell Chroma which model to use when embedding documents and query texts for a collection.

In [92]:
# import Chroma's embedding function helpers
import chromadb.utils.embedding_functions as embedding_functions

# create the OpenAI embedding function object
openai_ef = embedding_functions.OpenAIEmbeddingFunction(
    api_key=OPEN_AI_API_KEY,
    model_name="text-embedding-3-small",
)
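
Conceptually, an embedding function is a callable that takes a list of documents and returns one fixed-length vector per document. The toy class below is only a sketch of that contract: `ToyHashEmbedding` and its hashing scheme are hypothetical, are not a Chroma class, and produce no semantically meaningful vectors, unlike the OpenAI model used in this notebook.

```python
import hashlib
import math


class ToyHashEmbedding:
    """Illustrative stand-in for an embedding function (not part of Chroma)."""

    def __init__(self, dim: int = 8):
        self.dim = dim

    def __call__(self, input: list[str]) -> list[list[float]]:
        vectors = []
        for text in input:
            vec = [0.0] * self.dim
            for token in text.lower().split():
                # Hash each token to a stable bucket index in [0, dim).
                digest = hashlib.md5(token.encode("utf-8")).hexdigest()
                vec[int(digest, 16) % self.dim] += 1.0
            # Normalize to unit length; fall back to 1.0 for empty text.
            norm = math.sqrt(sum(v * v for v in vec)) or 1.0
            vectors.append([v / norm for v in vec])
        return vectors


toy_ef = ToyHashEmbedding(dim=8)
embeddings = toy_ef(["Khruangbin - A LA SALA", "Amen Dunes"])
print(len(embeddings), len(embeddings[0]))  # 2 8
```

The key property is that every document maps to a vector of the same length, which is what lets a collection compare a query against stored items.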


**Get Collection Details**

The standard `CRUD` operations run against a `collection` primitive, so the first step is to create or fetch one. See [Create, Get, and Delete](https://docs.trychroma.com/docs/collections/create-get-delete) for details.


```python
# create a new collection (raises an error if the name is already taken)
collection = client.create_collection(name="my_collection", embedding_function=openai_ef)

# get an existing collection (raises an error if it does not exist)
collection = client.get_collection(name="my_collection", embedding_function=openai_ef)
```



In [93]:
# get the existing KEXP collection
collection = client.get_collection(name="KEXP-24-Embeddings", embedding_function=openai_ef)

# peek at the first 10 items in the collection
collection.peek()



{'ids': ['4c90c3e2-4acd-4b5d-a37b-2dee221875e2',
  '0fad0c70-7340-407e-9f00-0e53170a8e48',
  'd7b2ee45-313c-4000-b19c-5119cca18025',
  'a3e634d8-ec70-4c7c-b141-3779a02b22c2',
  'c3f24a92-0de3-4c10-9245-709df95d75eb',
  'f7ac6af4-3bfc-4770-95c3-03b364ce2bd5',
  '89e5b1e7-2b15-452b-b503-80c973d1df40',
  '0ee7053c-ba36-493e-a586-740693b6d120',
  'be597368-5c4e-4dc4-96ee-1ff705893542',
  '7ac7c9c2-c120-40c5-bb1b-eb44fa18b5df'],
 'embeddings': array([[-1.60320718e-02, -5.72213493e-02,  3.75896972e-03, ...,
         -2.57682018e-02,  4.63562012e-02, -1.74533091e-02],
        [-1.52583746e-02,  2.94168349e-02,  7.39516318e-03, ...,
          1.51413623e-02,  1.63465869e-02, -5.04556075e-02],
        [ 3.74108693e-03,  3.49985845e-02, -3.90326865e-02, ...,
         -4.49202992e-02,  2.66577993e-02,  5.14825340e-03],
        ...,
        [ 8.28020554e-03, -2.33449154e-02, -2.96986569e-02, ...,
          6.77739881e-05,  1.68356206e-02,  9.78787336e-03],
        [ 5.62553070e-02, -4.07588333e-02

In [94]:
# count the items stored in the collection
print(f"objects within the collection: {collection.count()}")

objects within the collection: 63482


**Querying the Database**

Collections can be queried in several ways; see [Query and Get](https://docs.trychroma.com/docs/querying-collections/query-and-get). One option is to query by a set of `query_texts`: Chroma first embeds each query text with the collection's embedding function, then performs the query with the generated embeddings.



```python
collection.query(
    query_texts=["Nice Biscuit"],
    n_results=5,
    include=['documents', 'distances', 'metadatas']
)
```
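
The two steps described above (embed the query text, then search the stored embeddings for the nearest neighbors) can be sketched in plain Python. This is an illustrative brute-force version, not how Chroma implements its search; `toy_embed`, `toy_query`, the letter-count embedding, and the squared Euclidean distance are all assumptions made for the example.

```python
import math
from collections import Counter


def toy_embed(text: str) -> list[float]:
    """Hypothetical embedding: unit-length vector of letter counts (a-z)."""
    counts = Counter(ch for ch in text.lower() if "a" <= ch <= "z")
    vec = [float(counts.get(chr(ord("a") + i), 0)) for i in range(26)]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def toy_query(documents, query_texts, n_results=2):
    """Brute-force nearest-neighbor search, one result list per query text."""
    stored = [(doc, toy_embed(doc)) for doc in documents]
    out = {"documents": [], "distances": []}
    for query in query_texts:
        q = toy_embed(query)
        # Score every stored document by squared Euclidean distance to the query.
        scored = sorted(
            (sum((a - b) ** 2 for a, b in zip(q, emb)), doc) for doc, emb in stored
        )[:n_results]
        out["documents"].append([doc for _, doc in scored])
        out["distances"].append([dist for dist, _ in scored])
    return out


docs = ["Tyla - TYLA", "Amen Dunes", "Khruangbin - A LA SALA"]
results = toy_query(docs, query_texts=["Khruangbin"], n_results=2)
print(results["documents"][0])  # ['Khruangbin - A LA SALA', 'Amen Dunes']
```

The result is nested one level deeper than might be expected because there is one inner list per query text, which is the same shape as the `ids` and `documents` fields in the query output in this notebook.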

In [96]:
collection.query(
    query_texts=["Khurangbin"],
    n_results=15,
    include=['documents', 'distances', 'metadatas']
)

{'ids': [['804df970-5923-437a-a554-bff07f8b84d2',
   '584c45d7-6dfd-4f51-89a4-521fceabe017',
   '1f7262a8-1242-4489-9b5b-5df42e74f2ee',
   'a2f71648-bab7-4273-a8ae-13b8c32c364d',
   '5526decc-470b-41a1-a701-2e8a60f1cad8',
   'ccbd8b6a-ffc4-4a6d-845b-7376d11e5ead',
   '98c2cbe1-40c6-4ac2-80c5-214e43b7cdfe',
   '9c6da98f-c355-4e2f-bbaf-2d1a7b8e82a3',
   'a74b19c5-ff65-466a-a26b-a05b7d6ab5ce',
   'ffc9fbaf-9e5b-4df4-9650-56fe2f397369',
   '3758940e-0725-47c9-8db6-bdb6f63b2ef4',
   '686d7ef2-97d7-4074-937a-0a38722fd433',
   'f6e61926-8c71-4a7a-bd74-2953d3350395',
   'bf22799d-1507-4134-ac86-70425f504c91',
   'c9c36040-94b6-45db-938a-7540337b82ea']],
 'embeddings': None,
 'documents': [['Khruangbin - A LA SALA',
   'Markus Stockhausen; Emilie-Claire Barlow; Michael Kiwanuka; Louis Cole; Jonathan Jeremiah; Basement',
   'Tyla - TYLA\n\n Kali Uchis - Orquídeas',
   'Amen Dunes\nAmen Dunes at Sacred Bones Records',
   'including: The Belligerents, Moses Gunn Collective and The Jungle Giants, a