### Chroma DB
##### Chroma is an AI-native, open-source vector database focused on developer productivity and happiness. It is licensed under Apache 2.0.

In [2]:
!pip install langchain-chroma

Collecting langchain-chroma
  Downloading langchain_chroma-1.1.0-py3-none-any.whl.metadata (1.9 kB)
Downloading langchain_chroma-1.1.0-py3-none-any.whl (12 kB)
Installing collected packages: langchain-chroma
Successfully installed langchain-chroma-1.1.0


In [4]:
### Building a Vector DB

from langchain_chroma import Chroma
from langchain_community.document_loaders import TextLoader               # Data ingestion
from langchain_community.embeddings import OllamaEmbeddings               # Embeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter       # Text splitters



In [5]:
# TextLoader reads the whole file into a single Document
loader = TextLoader("speech.txt")
data = loader.load()
data


[Document(metadata={'source': 'speech.txt'}, page_content='Sisters and Brothers of America,It fills my heart with joy unspeakable to rise in response to the warm and cordial welcome.I thank you in the name of the most ancient order of monks in the world.I thank you in the name of the mother of religions.I thank you in the name of millions and millions of Hindu people of all classes and sects. My thanks, also, to some of the speakers on this platform.They have told you that these men from far-off nations may well claim the honor of bearing to different lands the idea of toleration.I am proud to belong to a religion which has taught the world both tolerance and universal acceptance.We believe not only in universal toleration, but we accept all religions as true.I fervently hope that the bell that tolled this morning in honor of this convention may be the death-knell of all fanaticism.')]

In [7]:
### Split
# chunk_size and chunk_overlap are measured in characters
text_splitter = RecursiveCharacterTextSplitter(chunk_size=300, chunk_overlap=10)
splits = text_splitter.split_documents(data)
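
To make `chunk_size` and `chunk_overlap` concrete, here is a minimal fixed-window sketch. It is not the algorithm `RecursiveCharacterTextSplitter` uses (that splitter tries separators such as paragraph breaks and spaces first); `sliding_chunks` is a hypothetical helper that only shows how consecutive chunks share characters.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_chunks(sample, chunk_size=10, chunk_overlap=3)
print(chunks)  # ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Each chunk begins with the last three characters of the previous one, which is the role `chunk_overlap=10` plays for the 300-character chunks in the cell above.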

In [10]:
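### Embeddings
# Requires a running Ollama server with the gemma:2b model already pulled
embeddings = OllamaEmbeddings(model="gemma:2b")
# No persist_directory is given here, so this store lives in memory only
vectordb = Chroma.from_documents(documents=splits, embedding=embeddings)
vectordb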

<langchain_chroma.vectorstores.Chroma at 0x27d185869f0>

In [12]:
### Querying
query="What does the speaker speaking about Hindu people in this speech?"
# similarity_search returns the k closest chunks (k=4 by default)
docs = vectordb.similarity_search(query)
docs[0].page_content

'both tolerance and universal acceptance.We believe not only in universal toleration, but we accept all religions as true.I fervently hope that the bell that tolled this morning in honor of this convention may be the death-knell of all fanaticism.'

In [25]:
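### Saving to disk
# Note: every run of this cell adds the chunks to ./chroma_db again under new IDs,
# so re-running it leaves duplicate entries in the persisted collection.
vectordb = Chroma.from_documents(documents=splits, embedding=embeddings, persist_directory="./chroma_db")
print(vectordb)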

<langchain_chroma.vectorstores.Chroma object at 0x0000027D189C41D0>
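
One way to keep repeated runs from piling up copies in a persisted collection is to give every chunk a deterministic ID. The sketch below derives IDs from the source name and the chunk text; `stable_id` is a hypothetical helper, and the commented-out call assumes that `Chroma.from_documents` accepts an `ids` argument and overwrites entries whose ID already exists.

```python
import hashlib


def stable_id(source: str, text: str) -> str:
    """Return the same ID every time for the same (source, text) pair."""
    return hashlib.sha256(f"{source}\n{text}".encode("utf-8")).hexdigest()


# Placeholder chunks; in the notebook these would come from `splits`
chunk_texts = ["first chunk of the speech", "second chunk of the speech"]
ids = [stable_id("speech.txt", text) for text in chunk_texts]
print(ids[0] != ids[1])

# Assumed usage in the notebook (not executed here):
# ids = [stable_id(d.metadata["source"], d.page_content) for d in splits]
# Chroma.from_documents(documents=splits, embedding=embeddings,
#                       ids=ids, persist_directory="./chroma_db")
```

Two chunks with identical text from the same source would get the same ID under this scheme, so the IDs may need an extra component such as the chunk index.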


In [None]:
### Loading from disk
# Pass the same embedding function that was used when the store was written
db2 = Chroma(persist_directory="./chroma_db", embedding_function=embeddings)
docs = db2.similarity_search(query)
docs

[Document(id='12d78c32-3e4f-415b-a1b3-773cd350e8f6', metadata={'source': 'speech.txt'}, page_content='both tolerance and universal acceptance.We believe not only in universal toleration, but we accept all religions as true.I fervently hope that the bell that tolled this morning in honor of this convention may be the death-knell of all fanaticism.'),
 Document(id='7b7dbc0e-c934-41bb-a1ed-d4bcc24f56d2', metadata={'source': 'speech.txt'}, page_content='both tolerance and universal acceptance.We believe not only in universal toleration, but we accept all religions as true.I fervently hope that the bell that tolled this morning in honor of this convention may be the death-knell of all fanaticism.'),
 Document(id='50ca5d95-77dc-4567-9048-e4be897216be', metadata={'source': 'speech.txt'}, page_content='both tolerance and universal acceptance.We believe not only in universal toleration, but we accept all religions as true.I fervently hope that the bell that tolled this morning in honor of this 
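
The hits above carry different `id` values but identical `page_content`, which is what a collection looks like after the same chunks have been written to it more than once. If cleaning the store is not an option, the results can be filtered after the search. This is a minimal sketch: `Hit` is a hypothetical stand-in for a LangChain `Document`, and `dedupe_by_content` is not part of any library.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical stand-in for a Document: only the fields used here."""
    id: str
    page_content: str


def dedupe_by_content(hits):
    """Keep the first hit for each distinct page_content, preserving rank order."""
    seen = set()
    unique = []
    for hit in hits:
        if hit.page_content not in seen:
            seen.add(hit.page_content)
            unique.append(hit)
    return unique


hits = [
    Hit("id-1", "same text"),
    Hit("id-2", "same text"),
    Hit("id-3", "other text"),
]
unique_hits = dedupe_by_content(hits)
print([h.id for h in unique_hits])  # ['id-1', 'id-3']
```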

In [32]:
### Similarity search with score
# The score is a distance, so a lower value means a closer match
docs = vectordb.similarity_search_with_score(query)
docs

[(Document(id='12d78c32-3e4f-415b-a1b3-773cd350e8f6', metadata={'source': 'speech.txt'}, page_content='both tolerance and universal acceptance.We believe not only in universal toleration, but we accept all religions as true.I fervently hope that the bell that tolled this morning in honor of this convention may be the death-knell of all fanaticism.'),
  3058.208251953125),
 (Document(id='7b7dbc0e-c934-41bb-a1ed-d4bcc24f56d2', metadata={'source': 'speech.txt'}, page_content='both tolerance and universal acceptance.We believe not only in universal toleration, but we accept all religions as true.I fervently hope that the bell that tolled this morning in honor of this convention may be the death-knell of all fanaticism.'),
  3058.208251953125),
 (Document(id='50ca5d95-77dc-4567-9048-e4be897216be', metadata={'source': 'speech.txt'}, page_content='both tolerance and universal acceptance.We believe not only in universal toleration, but we accept all religions as true.I fervently hope that the 
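
The score attached to each hit is a distance between the query embedding and the chunk embedding, so smaller numbers mean closer matches. The sketch below is a brute-force illustration of that ranking, assuming the collection uses squared L2 distance (Chroma's default as far as I know). The two-dimensional vectors and the names `squared_l2` and `search_with_score` are made up for the example.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search_with_score(query_vec, store, k=4):
    """Return the k (text, distance) pairs closest to query_vec, nearest first."""
    scored = [(text, squared_l2(query_vec, vec)) for text, vec in store]
    scored.sort(key=lambda pair: pair[1])  # ascending: smallest distance first
    return scored[:k]


# Toy "vector store": (chunk text, embedding) pairs
store = [
    ("chunk A", [1.0, 0.0]),
    ("chunk B", [0.0, 1.0]),
    ("chunk C", [0.9, 0.1]),
]
results = search_with_score([1.0, 0.0], store, k=2)
print([text for text, _ in results])  # ['chunk A', 'chunk C']
```

Identical scores on several hits, as in the output above, are what this ranking produces when the stored vectors are identical.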

In [34]:
### Retriever 

retriever=vectordb.as_retriever()
retriever.invoke(query)[0].page_content


'both tolerance and universal acceptance.We believe not only in universal toleration, but we accept all religions as true.I fervently hope that the bell that tolled this morning in honor of this convention may be the death-knell of all fanaticism.'