### ChromaDB 

Chroma is an AI-native open-source vector database focused on developer productivity and happiness.  
Chroma is licensed under Apache 2.0.

In [6]:
from langchain_chroma import Chroma 
from langchain_community.document_loaders import TextLoader

from langchain_community.embeddings import OllamaEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter


In [7]:
loader = TextLoader("speech.txt")

data = loader.load() 

data 

[Document(metadata={'source': 'speech.txt'}, page_content='World War II was a global conflict that lasted from 1939 to 1945, involving virtually every part of the world. The war began when Germany, under the leadership of Adolf Hitler, invaded Poland on September 1, 1939. This act of aggression prompted Great Britain and France to declare war on Germany.\n\nThe war escalated as Germany expanded its territorial control, invading Denmark, Norway, Belgium, the Netherlands, and France. Italy, led by Benito Mussolini, joined the war on Germany’s side in 1940. Japan, meanwhile, had been at war with China since 1937 and eventually entered the global conflict in December 1941, following the surprise attack on Pearl Harbor, Hawaii.\n\nThe Allies, comprising the United States, Great Britain, France, the Soviet Union, and China, among others, fought against the Axis powers (Germany, Italy, and Japan). The war saw brutal battles on multiple fronts, including Europe, North Africa, and Asia.')]

In [8]:
# Split into chunks of at most 500 characters with no overlap between chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)

splits = text_splitter.split_documents(data)
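
To make the splitting step more concrete, here is a simplified sketch of the idea using only the standard library. It is not the `RecursiveCharacterTextSplitter` implementation: it tries only the blank-line separator and merges neighbouring paragraphs while they still fit in `chunk_size`. The function name `split_paragraphs` and the toy paragraph lengths are made up for illustration.

```python
def split_paragraphs(text: str, chunk_size: int = 500, separator: str = "\n\n") -> list[str]:
    """Greedy sketch: split on blank lines, then merge neighbours while they fit."""
    chunks: list[str] = []
    current = ""
    for part in text.split(separator):
        if not part:
            continue  # skip empty pieces produced by repeated separators
        candidate = part if not current else current + separator + part
        if len(candidate) <= chunk_size:
            current = candidate  # still fits, keep accumulating
        else:
            if current:
                chunks.append(current)
            current = part  # simplification: an oversized part is kept whole here
    if current:
        chunks.append(current)
    return chunks


# Toy stand-in for a three-paragraph file (lengths are hypothetical)
paragraphs = ["a" * 300, "b" * 370, "c" * 280]
toy_text = "\n\n".join(paragraphs)
toy_chunks = split_paragraphs(toy_text, chunk_size=500)
print(len(toy_chunks))
```

With these toy lengths no two neighbouring paragraphs fit together in 500 characters, so each paragraph becomes its own chunk. That would be consistent with the three-element index reported in the query output of this notebook.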

In [10]:
# splits

In [12]:
embeddings = OllamaEmbeddings(model="llama3.1") 

vectorDb = Chroma.from_documents(splits, embedding=embeddings)
vectorDb

<langchain_chroma.vectorstores.Chroma at 0x7271cab87370>

In [13]:
## Query it 


query = "When world war 2 begins"

docs = vectorDb.similarity_search(query)


Number of requested results 4 is greater than number of elements in index 3, updating n_results = 3
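
The message above appears because the search asks for 4 results while the index holds only 3 chunks. As a rough sketch of what a top-k similarity search does, the snippet below ranks toy vectors by cosine similarity and clamps `k` to the index size. The vectors, the helper names, and the choice of cosine similarity are all assumptions for illustration; the metric Chroma actually uses depends on how the collection is configured.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, index, k=4):
    """Return the texts of the k most similar entries, most similar first."""
    k = min(k, len(index))  # cannot return more results than the index holds
    ranked = sorted(
        index,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Hypothetical 3-dimensional embeddings; real embeddings have many more dimensions
toy_index = [
    ("chunk about the start of the war", [0.9, 0.1, 0.0]),
    ("chunk about the escalation", [0.2, 0.8, 0.1]),
    ("chunk about the Allies and Axis", [0.1, 0.2, 0.9]),
]
toy_query = [1.0, 0.0, 0.0]
results = top_k(toy_query, toy_index, k=4)
print(len(results), results[0])
```

In the notebook itself, passing `k=3` to `similarity_search` should request only as many results as the index contains.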


In [15]:
docs[0].page_content

'World War II was a global conflict that lasted from 1939 to 1945, involving virtually every part of the world. The war began when Germany, under the leadership of Adolf Hitler, invaded Poland on September 1, 1939. This act of aggression prompted Great Britain and France to declare war on Germany.'

In [20]:
### Save to disk

vectorDb = Chroma.from_documents(splits, embedding=embeddings, persist_directory="./chroma_db")

#### Chroma DB internally creates a SQLite DB
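
One way to check this is to open the persisted file with Python's built-in `sqlite3` module and list its tables. The sketch below assumes the database file is named `chroma.sqlite3` inside the persist directory; that name is an assumption and may differ between Chroma versions. `list_sqlite_tables` is a helper written for this example, not part of Chroma or LangChain.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def list_sqlite_tables(db_path):
    """Return the table names stored in a SQLite file, sorted alphabetically."""
    with closing(sqlite3.connect(str(db_path))) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]


# Assumed file name inside the persist directory used in this notebook
candidate = Path("./chroma_db") / "chroma.sqlite3"
if candidate.exists():
    print(list_sqlite_tables(candidate))
else:
    print(f"{candidate} not found; run the persist cell first or check the file name")
```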

##### Loading from disk

In [21]:
## Load from disk

# Reopen the persisted collection; the same embedding function must be supplied
db2 = Chroma(persist_directory="./chroma_db", embedding_function=embeddings)

docs = db2.similarity_search(query)

Number of requested results 4 is greater than number of elements in index 3, updating n_results = 3


In [25]:
docs[2].page_content

'The war escalated as Germany expanded its territorial control, invading Denmark, Norway, Belgium, the Netherlands, and France. Italy, led by Benito Mussolini, joined the war on Germany’s side in 1940. Japan, meanwhile, had been at war with China since 1937 and eventually entered the global conflict in December 1941, following the surprise attack on Pearl Harbor, Hawaii.'

In [27]:
# Wrap the vector store in a retriever interface
retriever = vectorDb.as_retriever()
ret = retriever.invoke(query)

Number of requested results 4 is greater than number of elements in index 3, updating n_results = 3


In [30]:
ret[2].page_content

'The war escalated as Germany expanded its territorial control, invading Denmark, Norway, Belgium, the Netherlands, and France. Italy, led by Benito Mussolini, joined the war on Germany’s side in 1940. Japan, meanwhile, had been at war with China since 1937 and eventually entered the global conflict in December 1941, following the surprise attack on Pearl Harbor, Hawaii.'


More about vector stores

https://python.langchain.com/v0.2/docs/integrations/vectorstores/

