## Chroma

### Chroma is an AI-native, open-source vector database focused on developer productivity and happiness. Chroma is licensed under Apache 2.0.

In [1]:
## Building a sample vectordb

from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.embeddings import OllamaEmbeddings
from langchain_chroma import Chroma

In [2]:
loader = TextLoader("speech.txt")
data = loader.load()
data

[Document(metadata={'source': 'speech.txt'}, page_content="Bangkok,[a] officially known in Thai as Krung Thep Maha Nakhon[b] and colloquially as Krung Thep,[c] is the capital and most populous city of Thailand. The city occupies 1,568.7 square kilometres (605.7 sq mi) in the Chao Phraya River delta in central Thailand and has an estimated population of 9.0 million as of 2021, 13% of the country's population. Over 17.4 million people (25%) lived within the surrounding Bangkok Metropolitan Region at the 2021 estimate, making Bangkok an extreme primate city, dwarfing Thailand's other urban centres in both size and importance to the national economy.[5]\n\nBangkok traces its roots to a small trading post during the Ayutthaya Kingdom in the 15th century, which eventually grew and became the site of two capital cities, Thonburi in 1767 and Rattanakosin in 1782. Bangkok was at the heart of the modernization of Siam during the late-19th century, as the country faced pressures from the West. Th

In [3]:
## Split the document into chunks of at most 200 characters, with no overlap between chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=200, chunk_overlap=0)
splits = text_splitter.split_documents(data)
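To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a minimal sliding-window sketch in plain Python. It is an illustration only: `chunk_text` is a hypothetical helper, and it cuts at fixed character offsets, whereas `RecursiveCharacterTextSplitter` also tries to break on separators such as paragraphs and spaces.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int = 0) -> list[str]:
    """Slice text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters. This is a plain
    sliding window, not the separator-aware algorithm LangChain uses.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


sample = "abcdefghij"
no_overlap = chunk_text(sample, chunk_size=4)
with_overlap = chunk_text(sample, chunk_size=4, chunk_overlap=2)
print(no_overlap)
print(with_overlap)
```

With an overlap of 0, as in the cell above, every character lands in exactly one chunk. A non-zero overlap repeats the tail of each chunk at the head of the next, which can help when a relevant sentence straddles a chunk boundary.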

In [4]:
embeddings = OllamaEmbeddings(model="gemma2:2b")
vectordb = Chroma.from_documents(documents=splits, embedding=embeddings)
vectordb

<langchain_chroma.vectorstores.Chroma at 0x231dbe27d00>

In [5]:
query = "What Bangkok is famous for"
docs = vectordb.similarity_search(query)
docs[0].page_content

'as well as its red-light districts. The Grand Palace and Buddhist temples including Wat Arun and Wat Pho stand in contrast with other tourist attractions such as the nightlife scenes of Khaosan Road'

In [6]:
## Save to disk: passing persist_directory makes Chroma write the collection to ./chroma_db

vectordb = Chroma.from_documents(documents=splits, embedding=embeddings, persist_directory="./chroma_db")

In [7]:
## Load from disk, passing the same embedding function that was used to build the index

db2 = Chroma(persist_directory="./chroma_db", embedding_function=embeddings)
docs = db2.similarity_search(query)
print(docs[0].page_content)

as well as its red-light districts. The Grand Palace and Buddhist temples including Wat Arun and Wat Pho stand in contrast with other tourist attractions such as the nightlife scenes of Khaosan Road


In [8]:
## Similarity search with score (the score is a distance, so lower means more similar)

docs = vectordb.similarity_search_with_score(query)
docs

[(Document(metadata={'source': 'speech.txt'}, page_content='as well as its red-light districts. The Grand Palace and Buddhist temples including Wat Arun and Wat Pho stand in contrast with other tourist attractions such as the nightlife scenes of Khaosan Road'),
  4695.702172567742),
 (Document(metadata={'source': 'speech.txt'}, page_content='culture. It is an international hub for transport and health care, and has emerged as a centre for the arts, fashion, and entertainment. The city is known for its street life and cultural landmarks,'),
  5174.951623065762),
 (Document(metadata={'source': 'speech.txt'}, page_content="and Patpong. Bangkok is among the world's top tourist destinations, and has been named the world's most visited city consistently in several international rankings."),
  5598.7026105890345),
 (Document(metadata={'source': 'speech.txt'}, page_content="incorporated as a special administrative area under the Bangkok Metropolitan Administration in 1972, grew rapidly during 
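The scores in the output above grow as the results become less relevant, which is what a distance-based ranking produces. The sketch below shows that idea in plain Python. It assumes a squared Euclidean distance and tiny hand-written 2-D vectors; the metric a real collection uses depends on how it is configured, and `l2_squared`, `search_with_score`, and the toy `store` are hypothetical names, not part of Chroma or LangChain.

```python
def l2_squared(a: list[float], b: list[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search_with_score(query_vec, store, k=4):
    """Return the k (text, distance) pairs closest to query_vec.

    Results are sorted by ascending distance, so the best match comes first.
    """
    scored = [(text, l2_squared(query_vec, vec)) for text, vec in store]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


# Toy "embeddings": hand-picked 2-D vectors standing in for real model output.
store = [
    ("temples", [1.0, 0.0]),
    ("nightlife", [0.8, 0.6]),
    ("economy", [0.0, 1.0]),
]
query_vec = [1.0, 0.1]

results = search_with_score(query_vec, store, k=2)
for text, distance in results:
    print(text, distance)
```

Because the score is a distance rather than a similarity, a threshold on it would keep results below the cutoff, not above it.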

In [9]:
## Retriever option: wrap the vector store in a retriever and query it with invoke()

retriever = vectordb.as_retriever()
retriever.invoke(query)[0].page_content

'as well as its red-light districts. The Grand Palace and Buddhist temples including Wat Arun and Wat Pho stand in contrast with other tourist attractions such as the nightlife scenes of Khaosan Road'
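A retriever can be thought of as a thin wrapper that fixes the search settings once and then exposes a single `invoke(query)` call. The sketch below mimics that shape in plain Python. `ToyRetriever` and `word_overlap_search` are hypothetical, and the word-overlap ranking is only a stand-in for embedding similarity, not how Chroma ranks documents.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ToyRetriever:
    """Bundle a search function with a fixed k behind an invoke() method."""
    search: Callable[[str, int], list[str]]
    k: int = 4

    def invoke(self, query: str) -> list[str]:
        return self.search(query, self.k)


corpus = [
    "The Grand Palace and Buddhist temples",
    "Nightlife scenes of Khaosan Road",
    "A hub for transport and health care",
]


def word_overlap_search(query: str, k: int) -> list[str]:
    """Rank corpus entries by how many query words they share (toy scoring)."""
    query_words = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda text: len(query_words & set(text.lower().split())),
        reverse=True,
    )
    return ranked[:k]


toy_retriever = ToyRetriever(search=word_overlap_search, k=2)
top = toy_retriever.invoke("Grand Palace temples")
print(top[0])
```

The benefit of this shape is that downstream code only needs to know about `invoke`, so the underlying store or search settings can change without touching the callers.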