## Chroma Vector Database Basics

[LangChain Chroma DB](https://python.langchain.com/docs/integrations/vectorstores/chroma/#query-directly)

**Install Required Libraries**


In [42]:
%pip install -qU langchain-openai python-dotenv langchain-chroma


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m24.0[0m[39;49m -> [0m[32;49m24.3.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m
Note: you may need to restart the kernel to use updated packages.


**Load OpenAI Credentials**

In [43]:
from langchain_openai import OpenAIEmbeddings
from dotenv import load_dotenv
import os

# Load environment variables from the .env file
load_dotenv()

# Read the OpenAI API key from the environment
OPEN_AI_API_KEY = os.getenv("OPEN_AI_API_KEY")

if not OPEN_AI_API_KEY:
    raise ValueError("OPEN_AI_API_KEY not found; please add your API key to a .env file")
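
The cell above reads the key from `OPEN_AI_API_KEY`, whereas the OpenAI client libraries conventionally look for `OPENAI_API_KEY`. A minimal sketch, assuming you want to accept either name; `first_env` is a hypothetical helper, not part of `python-dotenv` or LangChain:

```python
import os


def first_env(*names: str) -> str:
    """Return the first non-empty environment variable among `names`."""
    for name in names:
        value = os.environ.get(name)
        if value:
            return value
    # Nothing usable was found under any of the candidate names
    raise ValueError(f"None of {', '.join(names)} is set; add one to your .env file")


# Usage (after load_dotenv() has run):
# OPEN_AI_API_KEY = first_env("OPEN_AI_API_KEY", "OPENAI_API_KEY")
```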
    



**Create OpenAIEmbeddings Model for Chroma**

In [44]:
from langchain_openai import OpenAIEmbeddings

# OpenAI embedding model
embedding_model = OpenAIEmbeddings(
    api_key=OPEN_AI_API_KEY,
    model="text-embedding-3-small",
)

**Create Chroma Client**

In [45]:

from langchain_chroma import Chroma

vector_store = Chroma(
    embedding_function=embedding_model,
    persist_directory="chroma_db",
    collection_name="KEXP-24-Embeddings",
)

**Query The Chroma Vector Database**

In [48]:
results = vector_store.similarity_search_with_score(
    "Willie Nelson",
    k=5
)

# The score is a distance, so lower values mean closer matches
for res, score in results:
    print(f"* [SIM={score:.6f}] {res.page_content} [{res.metadata}]")

* [SIM=0.557158] Willie Nelson - The Border [{'source': "./Vote for KEXP's Best of 2024.html", 'title': "\n        \n        \n        Vote for KEXP's Best of 2024\n        \n    "}]
* [SIM=0.598908] Willie Nelson - Last Leaf on the Tree [{'source': "./Vote for KEXP's Best of 2024.html", 'title': "\n        \n        \n        Vote for KEXP's Best of 2024\n        \n    "}]
* [SIM=1.176673] Gary Lee Clark Jr [{'source': 'https://en.wikipedia.org/wiki/Gary_Clark_Jr.', 'summary': 'Gary Lee Clark Jr. (born February 15, 1984) is an American guitarist and singer who fuses blues, rock and soul music with elements of hip hop. In 2011, Clark signed with Warner Bros Records and released The Bright Lights EP. It was followed by the albums Blak and Blu (2012) and The Story of Sonny Boy Slim (2015). Throughout his career, Clark has been a prolific live performer, documented by Gary Clark Jr. Live (2014) and Gary Clark Jr Live/North America (2017).\nIn 2014, Clark was awarded a Grammy for Best Trad
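
The scores printed above are distances, so the two Willie Nelson rows (about 0.56 and 0.60) are closer matches than the Gary Clark Jr. row (about 1.18). As a rough sketch, assuming the collection uses Chroma's default squared-L2 space and that the embeddings are unit length (OpenAI embeddings are normalized), a distance `d` maps to cosine similarity as `1 - d / 2`. `l2sq_to_cosine` is a hypothetical helper name:

```python
def l2sq_to_cosine(distance: float) -> float:
    """Convert a squared L2 distance between unit vectors to cosine similarity.

    Assumes both vectors have length 1, where ||a - b||^2 = 2 - 2 * cos(a, b).
    """
    return 1.0 - distance / 2.0


# Distances copied from the output above
for d in (0.557158, 0.598908, 1.176673):
    print(f"distance={d:.6f} -> cosine={l2sq_to_cosine(d):.3f}")
```

If the collection was created with a different distance space, this conversion does not apply.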

In [47]:
results = vector_store.similarity_search_by_vector(
    embedding=embedding_model.embed_query("Last Leaf"), k=5
)
for doc in results:
    print(f"* {doc.page_content} [{doc.metadata}]")

* Willie Nelson - Last Leaf on the Tree [{'source': "./Vote for KEXP's Best of 2024.html", 'title': "\n        \n        \n        Vote for KEXP's Best of 2024\n        \n    "}]
* Leaf directed music video for the song [{'source': 'https://en.wikipedia.org/wiki/Gary_Clark_Jr.', 'summary': 'Gary Lee Clark Jr. (born February 15, 1984) is an American guitarist and singer who fuses blues, rock and soul music with elements of hip hop. In 2011, Clark signed with Warner Bros Records and released The Bright Lights EP. It was followed by the albums Blak and Blu (2012) and The Story of Sonny Boy Slim (2015). Throughout his career, Clark has been a prolific live performer, documented by Gary Clark Jr. Live (2014) and Gary Clark Jr Live/North America (2017).\nIn 2014, Clark was awarded a Grammy for Best Traditional R&B performance for the song "Please Come Home". In 2020, he won the Grammy Award for "Best Rock Song" and "Best Rock Performance" for the song "This Land" from that album. His most re
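
The `title` metadata above is padded with newlines and spaces, presumably carried over from the source HTML. A minimal sketch for collapsing that whitespace before printing; `tidy_metadata` is a hypothetical helper and only touches string values:

```python
def tidy_metadata(metadata: dict) -> dict:
    """Return a copy of `metadata` with whitespace runs in string values collapsed."""
    return {
        key: " ".join(value.split()) if isinstance(value, str) else value
        for key, value in metadata.items()
    }


# Example with the title shape seen in the output above
raw = {
    "source": "./Vote for KEXP's Best of 2024.html",
    "title": "\n        \n        Vote for KEXP's Best of 2024\n        \n    ",
}
print(tidy_metadata(raw))
```

In the print loops above, this could be applied as `tidy_metadata(doc.metadata)` in place of `doc.metadata`.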