## Chroma

Chroma is an AI-native, open-source vector database focused on developer productivity and happiness. It is licensed under Apache 2.0.

https://python.langchain.com/v0.2/docs/integrations/vectorstores/

In [1]:
# Build a sample vector store
from langchain_community.vectorstores import Chroma
from langchain_community.document_loaders import TextLoader
from langchain_community.embeddings import OllamaEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

In [2]:
loader = TextLoader("data/loreal_shareholder_2022.txt")
data = loader.load()
data

[Document(metadata={'source': 'data/loreal_shareholder_2022.txt'}, page_content='“Dear Shareholders,\nL’Oréal continues on the path to success with an ever-stronger ambition, while acting with the sense of responsibility of a global leader. Dual financial and social excellence will always be at the heart of our business model.\nWe have set ourselves the ultimate goal of creating value that benefits everyone.\nWe create value for you, our shareholders.\xa0The resilience and outperformance of your Company are the perfect demonstration of its robust, virtuous and value creating business model. The quality of our results puts us in a position to offer a dividend of €6 per share, representing a significant increase of +25%. And the preferential dividend with a 10% loyalty bonus(1), at €6.60, is recognition of your long-term loyalty.\nI also know that you attach just as much importance to the quality of our relationship with you, our shareholders. I am delighted to welcome the more than 30,0

In [3]:
# Split the document into chunks of at most 500 characters, with no overlap
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
splits = text_splitter.split_documents(data)

In [4]:
len(splits)

8
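To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of a fixed-width splitter. It is a simplified stand-in and not the algorithm `RecursiveCharacterTextSplitter` uses, which also tries to break on separators such as paragraphs and sentences. The function name `split_fixed` and the sample string are hypothetical.

```python
def split_fixed(text, chunk_size, chunk_overlap=0):
    """Cut text into windows of at most chunk_size characters.

    Hypothetical helper for illustration only: consecutive windows share
    chunk_overlap characters. No separator-aware splitting is attempted.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
no_overlap = split_fixed(sample, chunk_size=10)
with_overlap = split_fixed(sample, chunk_size=10, chunk_overlap=4)
print(no_overlap)
print(with_overlap)
```

With an overlap of zero, as in the cell above, each character lands in exactly one chunk. A non-zero overlap repeats the tail of one chunk at the head of the next, which can help when a relevant sentence straddles a chunk boundary.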

In [5]:
embedding = OllamaEmbeddings(model="llama3.2:1b")
vectordb = Chroma.from_documents(documents=splits, embedding=embedding)
vectordb


<langchain_community.vectorstores.chroma.Chroma at 0x107f3c950>

In [6]:
# Query the vector store
query = "What is loyalty bonus for shareholder of Loreals?"
docs = vectordb.similarity_search(query)
docs[0].page_content

'It was only natural that we share L’Oréal’s performance with them. In 2022, we launched a third employee share ownership plan. More than a third of our employees around the world and two-thirds of employees in France are Group shareholders. That is outstanding proof of their attachment to L’Oréal. And it is yet another means of aligning their interests with yours.'

In [7]:
# Save to disk: persist_directory is where Chroma stores the collection
vectordb = Chroma.from_documents(documents=splits, embedding=embedding, persist_directory="./chroma_db")


In [8]:
# Load from disk
db2 = Chroma(persist_directory="./chroma_db", embedding_function=embedding)
docs = db2.similarity_search(query)
print(docs[0].page_content)

It was only natural that we share L’Oréal’s performance with them. In 2022, we launched a third employee share ownership plan. More than a third of our employees around the world and two-thirds of employees in France are Group shareholders. That is outstanding proof of their attachment to L’Oréal. And it is yet another means of aligning their interests with yours.



In [9]:
# Similarity search with score
docs = vectordb.similarity_search_with_score(query)
docs

[(Document(metadata={'source': 'data/loreal_shareholder_2022.txt'}, page_content='It was only natural that we share L’Oréal’s performance with them. In 2022, we launched a third employee share ownership plan. More than a third of our employees around the world and two-thirds of employees in France are Group shareholders. That is outstanding proof of their attachment to L’Oréal. And it is yet another means of aligning their interests with yours.'),
  6911.93150222343),
 (Document(metadata={'source': 'data/loreal_shareholder_2022.txt'}, page_content='We create value for you, our shareholders.\xa0The resilience and outperformance of your Company are the perfect demonstration of its robust, virtuous and value creating business model. The quality of our results puts us in a position to offer a dividend of €6 per share, representing a significant increase of +25%. And the preferential dividend with a 10% loyalty bonus(1), at €6.60, is recognition of your long-term loyalty.'),
  7056.68432236
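The numbers returned next to each document are best read as distances, so a lower score means a closer match. In the output above, the chunk that mentions the 10% loyalty bonus ranks second (about 7056.68) behind the employee share plan chunk (about 6911.93). The sketch below shows this ranking on toy data, assuming a squared Euclidean (L2) distance, which to my knowledge is Chroma's default space. The vectors, labels, and helper names are all hypothetical and unrelated to the real embeddings.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rank_by_distance(query_vec, doc_vecs, k=2):
    """Return the k (label, distance) pairs with the smallest distance."""
    scored = [(label, squared_l2(query_vec, vec)) for label, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1])[:k]


# Made-up 3-dimensional "embeddings"; real ones have far more dimensions.
toy_docs = {
    "employee_plan": [1.0, 0.0, 2.0],
    "loyalty_bonus": [0.0, 1.0, 2.0],
    "unrelated": [5.0, 5.0, 5.0],
}
toy_query = [1.0, 0.0, 1.0]

ranked = rank_by_distance(toy_query, toy_docs)
for label, distance in ranked:
    print(label, distance)
```

Because the ranking depends entirely on the embedding model, a small general-purpose model may place a loosely related chunk ahead of the one that actually answers the question, as happens in the output above.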

In [10]:
# Retriever option
retriever = vectordb.as_retriever()
retriever.invoke(query)[0].page_content

'It was only natural that we share L’Oréal’s performance with them. In 2022, we launched a third employee share ownership plan. More than a third of our employees around the world and two-thirds of employees in France are Group shareholders. That is outstanding proof of their attachment to L’Oréal. And it is yet another means of aligning their interests with yours.'