## Chroma

Chroma is an AI-native, open-source vector database focused on developer productivity and happiness. Chroma is licensed under Apache 2.0.

In [9]:
# Build a simple vector store with Chroma
from langchain_chroma import Chroma
from langchain_community.document_loaders import TextLoader
from langchain_community.embeddings import OllamaEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

In [12]:
loader = TextLoader('../FAISS-DB/speech.txt')
data = loader.load()


In [13]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
splits = text_splitter.split_documents(data)
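
To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a minimal pure-Python sketch of fixed-size chunking. It is a simplified illustration only: `chunk_text` is a hypothetical helper, not the algorithm `RecursiveCharacterTextSplitter` uses (that splitter also tries to break on separators such as paragraphs and newlines before falling back to smaller pieces).

```python
def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 0) -> list[str]:
    """Hypothetical helper: cut text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


sample = "abcdefghij" * 3  # 30 characters of toy text

chunks_no_overlap = chunk_text(sample, chunk_size=10, chunk_overlap=0)
chunks_overlap = chunk_text(sample, chunk_size=10, chunk_overlap=4)

print(len(chunks_no_overlap), len(chunks_overlap))
```

With `chunk_overlap=0`, as in the cell above, the chunks partition the text with no shared characters. A non-zero overlap repeats the tail of each chunk at the start of the next, which can help when a relevant sentence straddles a chunk boundary.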

In [14]:
embeddings = OllamaEmbeddings(model="gemma:2b")
vectordb = Chroma.from_documents(documents=splits, embedding=embeddings)
vectordb

<langchain_chroma.vectorstores.Chroma at 0x1be3bbde450>

In [15]:
query = "What is the task of YOLOv5 model?"
docs = vectordb.similarity_search(query)
docs[0].page_content

'Example: For an object detection task, you could have:\nTeacher 1 (YOLOv5): Focuses on fast bounding box detection.\nTeacher 2 (EfficientNet): Focuses on object classification accuracy.\nTeacher 3 (ResNet-50): Helps the student extract robust features from the image.'

In [17]:
# persist the vector store to disk
vectordb = Chroma.from_documents(documents=splits, embedding=embeddings, persist_directory="chroma_db")

In [18]:
db2 = Chroma(persist_directory='./chroma_db', embedding_function=embeddings)
docs = db2.similarity_search(query=query)
print(docs[0].page_content)

Example: For an object detection task, you could have:
Teacher 1 (YOLOv5): Focuses on fast bounding box detection.
Teacher 2 (EfficientNet): Focuses on object classification accuracy.
Teacher 3 (ResNet-50): Helps the student extract robust features from the image.


In [22]:
docs = vectordb.similarity_search_with_score(query=query)
docs

[(Document(metadata={'source': '../FAISS-DB/speech.txt'}, page_content='Example: For an object detection task, you could have:\nTeacher 1 (YOLOv5): Focuses on fast bounding box detection.\nTeacher 2 (EfficientNet): Focuses on object classification accuracy.\nTeacher 3 (ResNet-50): Helps the student extract robust features from the image.'),
  3453.8228699758106),
 (Document(metadata={'source': '../FAISS-DB/speech.txt'}, page_content='Overview: In this project, the student model learns from multiple teacher models, each specializing in different aspects of the task. For example, one teacher could specialize in object detection, another in object classification, and a third in feature extraction.\nNovelty: The key is designing a custom loss function that weights the contributions of each teacher differently based on the task at hand. This adaptive weighting would allow the student model to benefit from diverse perspectives.'),
  3559.3035726328403),
 (Document(metadata={'source': '../FAI
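
The scores returned by `similarity_search_with_score` are distances, so a lower score means a closer match, which is why the results above are listed in ascending order. The sketch below illustrates that ranking with toy vectors, assuming the collection uses squared L2 distance (Chroma's default, as far as I know). The vectors, document names, and the `squared_l2` helper are hypothetical and exist only for this example.

```python
def squared_l2(a: list[float], b: list[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Toy 2-D "embeddings"; real embeddings have hundreds or thousands of dimensions.
query_vec = [1.0, 0.0]
doc_vecs = {
    "doc_a": [1.0, 1.0],
    "doc_b": [4.0, 4.0],
    "doc_c": [0.0, 2.0],
}

# Rank documents by distance to the query, smallest (most similar) first.
ranked = sorted(
    ((name, squared_l2(query_vec, vec)) for name, vec in doc_vecs.items()),
    key=lambda pair: pair[1],
)

for name, distance in ranked:
    print(name, distance)
```

Large absolute values such as those in the output above are consistent with embeddings that are not normalized to unit length. Under that assumption, only the relative order of the scores is meaningful.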

In [24]:
## Retriever option
retriever = vectordb.as_retriever()
retriever.invoke(query)[0].page_content

'Example: For an object detection task, you could have:\nTeacher 1 (YOLOv5): Focuses on fast bounding box detection.\nTeacher 2 (EfficientNet): Focuses on object classification accuracy.\nTeacher 3 (ResNet-50): Helps the student extract robust features from the image.'

https://js.langchain.com/v0.2/docs/integrations/vectorstores/