### Chroma Vector Store

This notebook covers how to get started with the Chroma vector store.

Chroma is an AI-native open-source vector database focused on developer productivity and happiness. Chroma is licensed under Apache 2.0.

In [5]:
from langchain_ollama import OllamaEmbeddings
from langchain_chroma import Chroma
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

In [6]:
# Load the speech text file as a list of Document objects
loader = TextLoader("../1.document_loader/speech.txt", encoding="UTF-8")
data = loader.load()
print(data)

[Document(metadata={'source': '../1.document_loader/speech.txt'}, page_content='The world must be made safe for democracy. Its peace must be planted upon the tested foundations of political liberty. We have no selfish ends to serve. We desire no conquest, no dominion. We seek no indemnities for ourselves, no material compensation for the sacrifices we shall freely make. We are but one of the champions of the rights of mankind. We shall be satisfied when those rights have been made as secure as the faith and the freedom of nations can make them.\n\nJust because we fight without rancor and without selfish object, seeking nothing for ourselves but what we shall wish to share with all free peoples, we shall, I feel confident, conduct our operations as belligerents without passion and ourselves observe with proud punctilio the principles of right and of fair play we profess to be fighting for.\n\nIt will be all the easier for us to conduct ourselves as belligerents in a high spirit of right

In [None]:
# Split the text with RecursiveCharacterTextSplitter:
# chunks of up to 1000 characters, with 30 characters of overlap between neighbors
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=30,
)

docs = text_splitter.split_documents(data)
docs

[Document(metadata={'source': '../1.document_loader/speech.txt'}, page_content='The world must be made safe for democracy. Its peace must be planted upon the tested foundations of political liberty. We have no selfish ends to serve. We desire no conquest, no dominion. We seek no indemnities for ourselves, no material compensation for the sacrifices we shall freely make. We are but one of the champions of the rights of mankind. We shall be satisfied when those rights have been made as secure as the faith and the freedom of nations can make them.\n\nJust because we fight without rancor and without selfish object, seeking nothing for ourselves but what we shall wish to share with all free peoples, we shall, I feel confident, conduct our operations as belligerents without passion and ourselves observe with proud punctilio the principles of right and of fair play we profess to be fighting for.'),
 Document(metadata={'source': '../1.document_loader/speech.txt'}, page_content='It will be all 
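
To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a minimal standard-library sketch of fixed-window splitting. It is a simplification, not how `RecursiveCharacterTextSplitter` works internally: the real splitter prefers to break on separators such as paragraph breaks, while this hypothetical `split_with_overlap` helper cuts at exact character offsets.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of chunk_size characters.

    Each window starts chunk_overlap characters before the previous one ended,
    so neighboring chunks share that many characters.
    """
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
        start += step
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
```

The last three characters of each chunk reappear at the start of the next one, which is the purpose of the overlap: a sentence that straddles a boundary is still partly visible in both chunks.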

In [None]:
# Embedding model served by Ollama (requires a running Ollama server with this model pulled)
embeddings = OllamaEmbeddings(
    model="llama3.2:1b",
)

In [20]:
# Create (or open) a persistent collection and index the chunks
vector_store = Chroma(
    collection_name="embedding_collection",
    embedding_function=embeddings,
    persist_directory="./chroma_db",  # Where to save data locally; remove if persistence is not needed
)
vector_store.add_documents(documents=docs)

['97e1d94e-1e35-4090-9e40-19d421c01a59',
 '0c074cc2-ecdf-49d7-bec2-7499cc5cb24b',
 '3d4ed3b4-32de-40d8-8909-c7ce0be17f3a',
 '64a7554f-e71d-4d26-b451-eced9f10bd4f',
 'f0f97cef-a1ab-44c0-aa65-ab56433bdebd']
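
The IDs returned above look like random UUIDs, so running the cell again appears to store the same chunks a second time under new IDs. One possible way to avoid that is to derive each ID from the chunk's content. The sketch below uses only the standard library; the `content_id` helper and the namespace are hypothetical names introduced for illustration.

```python
import uuid

# Hypothetical namespace for this collection; any fixed UUID would do.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "embedding_collection")


def content_id(source: str, text: str) -> str:
    """Return a UUID that depends only on the source name and the chunk text."""
    return str(uuid.uuid5(NAMESPACE, f"{source}\n{text}"))


toy_chunks = [
    "The world must be made safe for democracy.",
    "We have no selfish ends to serve.",
]
ids_first_run = [content_id("speech.txt", c) for c in toy_chunks]
ids_second_run = [content_id("speech.txt", c) for c in toy_chunks]

# Same content gives the same IDs on every run.
print(ids_first_run == ids_second_run)
```

Such IDs could then be passed through the `ids` argument of `add_documents`. Whether a repeated ID overwrites the stored entry or raises an error is worth checking against the installed `langchain_chroma` version.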

In [18]:
query = "How does the speaker justify America’s entry into the war as morally righteous and selfless?"
results = vector_store.similarity_search(query, k=1)

In [19]:
results

[Document(id='bd08b38f-a217-41a7-a7bc-921703139416', metadata={'source': '../1.document_loader/speech.txt'}, page_content='The world must be made safe for democracy. Its peace must be planted upon the tested foundations of political liberty. We have no selfish ends to serve. We desire no conquest, no dominion. We seek no indemnities for ourselves, no material compensation for the sacrifices we shall freely make. We are but one of the champions of the rights of mankind. We shall be satisfied when those rights have been made as secure as the faith and the freedom of nations can make them.\n\nJust because we fight without rancor and without selfish object, seeking nothing for ourselves but what we shall wish to share with all free peoples, we shall, I feel confident, conduct our operations as belligerents without passion and ourselves observe with proud punctilio the principles of right and of fair play we profess to be fighting for.')]
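
Conceptually, `similarity_search` embeds the query and returns the `k` stored chunks whose vectors are closest to it. The sketch below shows that idea with hand-written three-dimensional vectors and cosine similarity. Both are assumptions made for illustration: real embeddings have far more dimensions, and the distance metric Chroma actually uses depends on how the collection is configured.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], index: dict[str, list[float]], k: int = 1) -> list[str]:
    """Return the names of the k vectors most similar to query_vec."""
    ranked = sorted(
        index.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Toy "embeddings" standing in for stored chunks (hypothetical values).
toy_index = {
    "chunk_a": [0.9, 0.1, 0.0],
    "chunk_b": [0.1, 0.8, 0.3],
    "chunk_c": [0.0, 0.2, 0.9],
}
toy_query = [1.0, 0.2, 0.0]

best = top_k(toy_query, toy_index, k=1)
best_two = top_k(toy_query, toy_index, k=2)
print(best, best_two)
```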

In [21]:
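# Load the persisted collection from disk.
# collection_name must match the one used when the data was written;
# otherwise LangChain opens its default collection instead.
vector_store_load = Chroma(
    collection_name="embedding_collection",
    embedding_function=embeddings,
    persist_directory="./chroma_db",  # Directory the data was saved to earlier
)

In [22]:
# Query the reloaded store rather than the original in-memory object
query = "How does the speaker justify America’s entry into the war as morally righteous and selfless?"
results = vector_store_load.similarity_search(query, k=1)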

In [23]:
results

[Document(id='97e1d94e-1e35-4090-9e40-19d421c01a59', metadata={'source': '../1.document_loader/speech.txt'}, page_content='The world must be made safe for democracy. Its peace must be planted upon the tested foundations of political liberty. We have no selfish ends to serve. We desire no conquest, no dominion. We seek no indemnities for ourselves, no material compensation for the sacrifices we shall freely make. We are but one of the champions of the rights of mankind. We shall be satisfied when those rights have been made as secure as the faith and the freedom of nations can make them.\n\nJust because we fight without rancor and without selfish object, seeking nothing for ourselves but what we shall wish to share with all free peoples, we shall, I feel confident, conduct our operations as belligerents without passion and ourselves observe with proud punctilio the principles of right and of fair play we profess to be fighting for.')]

In [None]:
# Wrap the vector store as a retriever and run the same query through it
retriever = vector_store.as_retriever()
retriever.invoke(query)

[Document(id='97e1d94e-1e35-4090-9e40-19d421c01a59', metadata={'source': '../1.document_loader/speech.txt'}, page_content='The world must be made safe for democracy. Its peace must be planted upon the tested foundations of political liberty. We have no selfish ends to serve. We desire no conquest, no dominion. We seek no indemnities for ourselves, no material compensation for the sacrifices we shall freely make. We are but one of the champions of the rights of mankind. We shall be satisfied when those rights have been made as secure as the faith and the freedom of nations can make them.\n\nJust because we fight without rancor and without selfish object, seeking nothing for ourselves but what we shall wish to share with all free peoples, we shall, I feel confident, conduct our operations as belligerents without passion and ourselves observe with proud punctilio the principles of right and of fair play we profess to be fighting for.'),
 Document(id='3d4ed3b4-32de-40d8-8909-c7ce0be17f3a',
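
A retriever can be thought of as a search function with its settings fixed in advance, exposed through a single `invoke(query)` call. The sketch below imitates that shape with a hypothetical `ToyRetriever` class and a word-overlap search over three sentences. It is not LangChain's implementation, and the default `k=4` in `keyword_search` is an assumption chosen for the example.

```python
from typing import Callable


class ToyRetriever:
    """Hypothetical stand-in: a search function plus fixed keyword arguments."""

    def __init__(self, search: Callable[..., list[str]], **search_kwargs):
        self._search = search
        self._search_kwargs = search_kwargs

    def invoke(self, query: str) -> list[str]:
        # Forward the query together with the stored settings.
        return self._search(query, **self._search_kwargs)


CORPUS = [
    "we have no selfish ends to serve",
    "the world must be made safe for democracy",
    "we desire no conquest no dominion",
]


def keyword_search(query: str, k: int = 4) -> list[str]:
    """Rank sentences by how many query words they contain; return the top k."""
    words = set(query.lower().split())
    ranked = sorted(
        CORPUS,
        key=lambda sentence: len(words & set(sentence.split())),
        reverse=True,
    )
    return ranked[:k]


default_hits = ToyRetriever(keyword_search).invoke("selfish ends")
single_hit = ToyRetriever(keyword_search, k=1).invoke("selfish ends")
print(len(default_hits), single_hit)
```

With the real store, the number of results can be limited in a similar way, for example `vector_store.as_retriever(search_kwargs={"k": 1})`.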

