## Chroma Client 

[Create-Get-Delete](https://docs.trychroma.com/docs/collections/create-get-delete)

In [69]:
pip install chromadb python-dotenv

Note: you may need to restart the kernel to use updated packages.


**Create a PersistentClient**

In [70]:
import chromadb

# create a client that will load our existing chromadb
client = chromadb.PersistentClient(path="chroma_db/")


**Check Our Connection**

In [71]:
client.heartbeat()

1735528218037778000

**Load API Keys**

In [72]:
from dotenv import load_dotenv
import os

# load the .env file containing the LLM API keys
load_dotenv()

# read the OpenAI API key from the environment
OPEN_AI_API_KEY = os.getenv("OPEN_AI_API_KEY")

if not OPEN_AI_API_KEY:
    raise ValueError("OPEN_AI_API_KEY not found, please add your API key in a .env")
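
The same check can be wrapped in a small helper when more than one key is needed. This is a minimal sketch, not part of the original notebook: `require_env` and `EXAMPLE_API_KEY` are hypothetical names, and the value set below is a placeholder rather than a real key.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`; raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise ValueError(f"{name} not found, please add it to your .env file")
    return value


# Demonstration with a placeholder variable (hypothetical name, not a real key)
os.environ["EXAMPLE_API_KEY"] = "sk-placeholder"
print(require_env("EXAMPLE_API_KEY"))
```

Failing early with a clear message avoids a less obvious authentication error later, when the embedding function first calls the API.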
    

**Create Embedding Function**

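[Embedding Functions](https://docs.trychroma.com/integrations/embedding-models/openai) tell Chroma how to turn text into embedding vectors. The function passed to the client should match the model used to create the stored embeddings, so that query texts are embedded the same way as the data.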

In [73]:
# get openai embedding function
import chromadb.utils.embedding_functions as embedding_functions

# create embedding_function object
openai_ef = embedding_functions.OpenAIEmbeddingFunction(
    api_key=OPEN_AI_API_KEY,
    model_name="text-embedding-3-small",
)


**Get Collection Details**

The standard `CRUD` operations require a `collection` primitive, which the client can [create, get, and delete](https://docs.trychroma.com/docs/collections/create-get-delete).


```python
# create the collection (raises an error if it already exists)
collection = client.create_collection(name="my_collection", embedding_function=openai_ef)

# get the collection (raises an error if it doesn't exist)
collection = client.get_collection(name="my_collection", embedding_function=openai_ef)
```



In [74]:
# get the existing KEXP collection
collection = client.get_collection(name="KEXP-24-Embeddings", embedding_function=openai_ef)

# preview the first items in the collection (peek returns 10 by default)
collection.peek()



{'ids': ['4c90c3e2-4acd-4b5d-a37b-2dee221875e2',
  '0fad0c70-7340-407e-9f00-0e53170a8e48',
  'd7b2ee45-313c-4000-b19c-5119cca18025',
  'a3e634d8-ec70-4c7c-b141-3779a02b22c2',
  'c3f24a92-0de3-4c10-9245-709df95d75eb',
  'f7ac6af4-3bfc-4770-95c3-03b364ce2bd5',
  '89e5b1e7-2b15-452b-b503-80c973d1df40',
  '0ee7053c-ba36-493e-a586-740693b6d120',
  'be597368-5c4e-4dc4-96ee-1ff705893542',
  '7ac7c9c2-c120-40c5-bb1b-eb44fa18b5df'],
 'embeddings': array([[-1.60320718e-02, -5.72213493e-02,  3.75896972e-03, ...,
         -2.57682018e-02,  4.63562012e-02, -1.74533091e-02],
        [-1.52583746e-02,  2.94168349e-02,  7.39516318e-03, ...,
          1.51413623e-02,  1.63465869e-02, -5.04556075e-02],
        [ 3.74108693e-03,  3.49985845e-02, -3.90326865e-02, ...,
         -4.49202992e-02,  2.66577993e-02,  5.14825340e-03],
        ...,
        [ 8.28020554e-03, -2.33449154e-02, -2.96986569e-02, ...,
          6.77739881e-05,  1.68356206e-02,  9.78787336e-03],
        [ 5.62553070e-02, -4.07588333e-02

In [75]:
# count the items in the collection
print(f"objects within the collection: {collection.count()}")

objects within the collection: 43292


**Querying the Database**

Collections can be queried in several ways; see [Query and Get](https://docs.trychroma.com/docs/querying-collections/query-and-get). One option is to query by a set of `query_texts`. Chroma first embeds each query text with the collection's embedding function, then runs the query with the generated embeddings.
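
The embed-then-rank idea can be illustrated without Chroma or an API key. The sketch below is a toy under loud assumptions: `toy_embed` is a hypothetical letter-frequency embedding standing in for a real model, and `toy_query` ranks by cosine distance with a plain sort. It is not how Chroma or OpenAI compute anything internally; it only shows the order of the two steps.

```python
import math
from collections import Counter


def toy_embed(text: str) -> list[float]:
    """Hypothetical stand-in for an embedding function: a 26-dim letter-frequency vector."""
    counts = Counter(ch for ch in text.lower() if "a" <= ch <= "z")
    return [float(counts.get(chr(ord("a") + i), 0)) for i in range(26)]


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


def toy_query(documents: list[str], query_text: str, n_results: int = 2):
    """Embed the query text, then return the n_results closest documents."""
    query_vector = toy_embed(query_text)  # step 1: embed the query
    scored = [(cosine_distance(toy_embed(doc), query_vector), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0])  # step 2: rank by distance
    return scored[:n_results]


docs = ["Willie Nelson - The Border", "Orville Peck - Stampede", "Justin Vernon"]
for distance, doc in toy_query(docs, "Willie"):
    print(f"{distance:.3f}  {doc}")
```

A real embedding model captures meaning rather than spelling, which is why the actual collection can return related artists and not only literal matches.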



```python
collection.query(
    query_texts=["Nice Biscuit"],
    n_results=5,
    include=['documents', 'distances', 'metadatas']
)
```
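
A query returns a dict whose fields are nested lists, with one inner list per query text. The helper below is a minimal sketch for flattening that shape into one row per hit. `flatten_query_result` is a hypothetical name, the sample dict holds made-up placeholder values rather than real output, and the helper assumes `documents`, `distances`, and `metadatas` were all requested through `include`.

```python
def flatten_query_result(result: dict) -> list[dict]:
    """Turn a nested query result into a flat list of rows, one per returned item."""
    rows = []
    for query_index, ids in enumerate(result["ids"]):
        documents = result["documents"][query_index]
        distances = result["distances"][query_index]
        metadatas = result["metadatas"][query_index]
        hits = zip(ids, documents, distances, metadatas)
        for rank, (item_id, document, distance, metadata) in enumerate(hits, start=1):
            rows.append(
                {
                    "query_index": query_index,
                    "rank": rank,
                    "id": item_id,
                    "document": document,
                    "distance": distance,
                    "metadata": metadata,
                }
            )
    return rows


# Placeholder data shaped like a query result for a single query text
sample_result = {
    "ids": [["id-1", "id-2"]],
    "embeddings": None,
    "documents": [["first document", "second document"]],
    "distances": [[0.1, 0.2]],
    "metadatas": [[{"source": "a"}, {"source": "b"}]],
}

for row in flatten_query_result(sample_result):
    print(row["rank"], row["distance"], row["document"])
```

With a real collection, the call would look like `flatten_query_result(collection.query(...))`, using the same `include` list as above.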

In [None]:
collection.query(
    query_texts=["Willie"],
    n_results=15,
    include=['documents', 'distances', 'metadatas']
)

{'ids': [['570e533f-7199-4665-8220-ffdbcbdde6ec',
   'f1459697-a2b9-4bde-9ccd-cc64a97f0339',
   '4974ef75-1dd7-4191-a2dc-891fd84e39f0',
   '4c9f0ed4-1624-403e-bdf2-8375309b2d0f',
   '87eecccf-9d61-421e-bf31-99c10378bf95',
   '83979996-1885-4450-86ac-127ceaaf8623',
   '4f58700b-8bcf-4a94-b13a-f8647b2f91c7',
   '9ee75072-1925-439e-8aa8-f750e6f4c328',
   'b8aafe0e-be90-4fa3-a9b8-2f7a672f48a4',
   '8cbf2682-676c-43be-8119-a42f497df876',
   'b2780936-7084-4085-bae3-77b1c9e19348',
   '3e18a12d-700b-4cd3-ab01-0d9a661a043b',
   '0f6561ee-1813-44c1-97e8-018a4319e634',
   'c18a2851-8b0d-4506-836f-148bc9511a42',
   'b7ffad3e-3bc3-45d2-984c-df1c7ba4e74b']],
 'embeddings': None,
 'documents': [['Willie Nelson - The Border',
   'Willie Nelson - Last Leaf on the Tree',
   'Dylan, and Lucinda Williams',
   'Gary Lee Clark Jr',
   'A Country Western - Life On the Lawn',
   'Orville Peck - Stampede',
   'Carsick Cars - Aha\n\n Johnny Cash - Songwriter',
   'Justin Vernon',
   'Folk: featured Ellen McIlw