# A simple Chroma example

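Preliminary conclusions:

- Performance is worse than faiss: every step took considerably longer than with faiss.
- The results are unsatisfactory, especially after reranking.

I therefore do not plan to use Chroma.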
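
A timing comparison like the one above is easier to trust when both backends are measured the same way. The following is a minimal sketch, not the measurement used in this notebook (the figures here come from `%%time`). `time_call` is a hypothetical helper that runs any callable a few times and reports the best wall-clock time, so a Chroma query and a faiss query could be wrapped identically.

```python
import time
from typing import Any, Callable, Tuple


def time_call(fn: Callable[[], Any], repeats: int = 3) -> Tuple[Any, float]:
    """Run fn `repeats` times; return its last result and the best wall time in seconds."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn()
        best = min(best, time.perf_counter() - start)
    return result, best


# Stand-in workload (an assumption for illustration); in practice this would be
# a callable that runs one query against the backend being measured.
result, seconds = time_call(lambda: sum(range(100_000)))
print(f"result={result}, best wall time={seconds:.6f} s")
```

With `streaming=True`, the callable would probably also need to consume the response stream, otherwise the generation time is likely left out of the measurement.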

## Setup

In [1]:
%%time
%%capture

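!pip install chromadb
!pip install llama-index-vector-stores-chroma
!pip install llama-index-embeddings-huggingface
!pip install llama-index
# Also needed by the imports below (OllamaEmbedding, OpenAILike)
!pip install llama-index-embeddings-ollama
!pip install llama-index-llms-openai-like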

CPU times: user 43.5 ms, sys: 15.4 ms, total: 58.8 ms
Wall time: 9.07 s


In [2]:
%%time

# Imports
import chromadb
from IPython.display import Markdown, display
from llama_index.core import (
    Settings,
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
)
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.vector_stores.chroma import ChromaVectorStore

CPU times: user 2.86 s, sys: 380 ms, total: 3.24 s
Wall time: 2.94 s


In [3]:
%%time

# create client and a new collection
chroma_client = chromadb.EphemeralClient()
chroma_collection = chroma_client.create_collection("quickstart")

CPU times: user 78.8 ms, sys: 3.82 ms, total: 82.6 ms
Wall time: 82 ms


In [4]:
%%time

from llama_index.llms.openai_like import OpenAILike

llm = OpenAILike(
    model="xiaoyu",
    api_base="http://192.168.0.72:3000/v1",
    api_key="sk-bJP6QSnUfjAYeYeE505d3eBf63A643BeB0B8E350Df9b7750",
    is_chat_model=True,
)
Settings.llm = llm

CPU times: user 127 ms, sys: 4.07 ms, total: 131 ms
Wall time: 130 ms


In [5]:
%%time

# Initialize the global embedding model
from llama_index.embeddings.ollama import OllamaEmbedding

ollama_embedding = OllamaEmbedding(
    model_name="dztech/bge-large-zh:v1.5",
    # model_name="bge-m3:latest",
    base_url="http://192.168.0.72:11435",
    ollama_additional_kwargs={"mirostat": 0},  # mirostat N selects Mirostat sampling; 0 disables it
)

Settings.embed_model = ollama_embedding

CPU times: user 543 ms, sys: 20.3 ms, total: 563 ms
Wall time: 563 ms


In [6]:
%%time

# load documents
documents = SimpleDirectoryReader("./books/").load_data()

CPU times: user 6.98 ms, sys: 4.01 ms, total: 11 ms
Wall time: 10.5 ms


In [7]:
%%time

# set up ChromaVectorStore and load in data
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context
)

CPU times: user 2.4 s, sys: 93.2 ms, total: 2.49 s
Wall time: 14.2 s


## Basic embedding query

In [16]:
%%time

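# Note: chunk settings only affect indexes built after this point; the index
# above was already built with the defaults, so they do not change these queries.
Settings.chunk_size = 128
Settings.chunk_overlap = 10

# Query Data
# Note: similarity_cutoff does not appear to be an option that as_query_engine
# itself applies; llama_index documents SimilarityPostprocessor for score filtering.
query_engine = index.as_query_engine(
    streaming=True,
    similarity_top_k=100,
    similarity_cutoff=0.2,
)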

# Query: "Who is Fang Hongjian's wife?"
streaming_response = query_engine.query("方鸿渐的妻子是谁")
streaming_response.print_response_stream()
print()

孙柔嘉是方鸿渐的妻子。
(English: Sun Roujia is Fang Hongjian's wife.)
CPU times: user 104 ms, sys: 3.17 ms, total: 107 ms
Wall time: 10 s
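
To make the two chunk settings concrete, here is a rough character-based sketch of fixed-size chunking with overlap. It is not LlamaIndex's splitter, which counts tokens and tries to respect sentence boundaries. `chunk_text` is a hypothetical helper, and small numbers are used so the result is easy to read.

```python
from typing import List


def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Split text into windows of chunk_size characters, each sharing
    chunk_overlap characters with the previous window."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


chunks = chunk_text("abcdefghijklmnopqrstuvwxyz", chunk_size=10, chunk_overlap=2)
print(chunks)
```

Each chunk repeats the last two characters of the previous one, which is the role `chunk_overlap` plays: context that straddles a boundary appears in both neighbouring chunks.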


In [17]:
%%time

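# Query: "What does the 'partial truth' mentioned in the text mean?"
streaming_response = query_engine.query("文中提到的局部真理是啥意思")
streaming_response.print_response_stream()
print()

局部真理在这个上下文中指的是，在特定情境或临时状态下被认为是对的或有效的观念或行为，尽管它可能不符合普遍的道德准则或绝对真理。方鸿渐通过比较柏拉图和孔子的例子来说明，在照顾病人或维护社会秩序时，可能需要做出暂时的、看似不诚实的决策，以达到更大的善。这种观点强调的是道德决策的复杂性和灵活性，认为在特殊情况下，真理是相对的，而非绝对的。因此，“局部真理”在这里指的是在特定情境下的道德权衡或策略。
(English: In this context, "partial truth" refers to an idea or action that is considered right or valid in a particular situation or temporary state, even though it may not conform to universal moral norms or absolute truth. Fang Hongjian illustrates this by comparing the examples of Plato and Confucius: when caring for the sick or maintaining social order, one may need to make temporary, seemingly dishonest decisions in order to achieve a greater good. This view emphasizes the complexity and flexibility of moral decisions, holding that in special circumstances truth is relative rather than absolute. "Partial truth" here therefore refers to a moral trade-off or strategy in a particular situation.)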
CPU times: user 389 ms, sys: 34.9 ms, total: 423 ms
Wall time: 24.3 s


## Query with rerank

In [9]:
%%time

from llama_index.core.postprocessor import SentenceTransformerRerank

reranker = SentenceTransformerRerank(model='/models/bge-reranker-v2-m3', top_n=5)

CPU times: user 1.17 s, sys: 630 ms, total: 1.8 s
Wall time: 1.03 s


In [10]:
%%time

query_engine = index.as_query_engine(
    streaming=True,
    similarity_top_k=100,
    node_postprocessors=[reranker],
    similarity_cutoff=0.5
)

CPU times: user 183 µs, sys: 0 ns, total: 183 µs
Wall time: 186 µs
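
The engine above is meant to work in stages: fetch up to `similarity_top_k` candidates by embedding similarity, drop weak matches (the intent of `similarity_cutoff`), re-score the rest with the reranker, and keep `top_n`. The sketch below imitates that flow in plain Python with made-up scores. It is not LlamaIndex's implementation; `Candidate`, `apply_cutoff`, and `rerank_top_n` are hypothetical names, and the toy score table stands in for the cross-encoder model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    text: str
    score: float  # similarity score attached to this candidate


def apply_cutoff(candidates: List[Candidate], cutoff: float) -> List[Candidate]:
    """Keep only candidates whose score is at least `cutoff`."""
    return [c for c in candidates if c.score >= cutoff]


def rerank_top_n(
    candidates: List[Candidate],
    rerank_score: Callable[[str], float],
    top_n: int,
) -> List[Candidate]:
    """Re-score every candidate with `rerank_score` and keep the best `top_n`."""
    rescored = [Candidate(c.text, rerank_score(c.text)) for c in candidates]
    rescored.sort(key=lambda c: c.score, reverse=True)
    return rescored[:top_n]


# Made-up retrieval results, purely for illustration.
retrieved = [
    Candidate("chunk A", 0.82),
    Candidate("chunk B", 0.55),
    Candidate("chunk C", 0.31),
    Candidate("chunk D", 0.64),
]
# Made-up reranker scores standing in for a cross-encoder model.
toy_scores = {"chunk A": 0.10, "chunk B": 0.90, "chunk D": 0.40}

kept = apply_cutoff(retrieved, cutoff=0.5)
final = rerank_top_n(kept, lambda text: toy_scores[text], top_n=2)
print([c.text for c in final])
```

In this toy run, chunk A has the best embedding score but is dropped after reranking. This shows how the reranked context handed to the LLM can differ sharply from the plain embedding ranking.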


In [11]:
%%time

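# Query: "Who is Fang Hongjian's wife?"
streaming_response = query_engine.query("方鸿渐的妻子是谁")
streaming_response.print_response_stream()
print()

方鸿渐的妻子是孙小姐，这是在小说《围城》中的情节。他们是在回国途中经由汪处厚做媒而订婚的。不过，值得注意的是，尽管订了婚，但故事中并未明确提及他们结婚，暗示着他们的关系可能有所变化或未果。
(English: Fang Hongjian's wife is Miss Sun; this is part of the plot of the novel Fortress Besieged. They became engaged on the way back to China, with Wang Chuhou acting as matchmaker. It is worth noting, however, that although they were engaged, the story does not explicitly mention their marriage, suggesting that their relationship may have changed or come to nothing.)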
CPU times: user 5.39 s, sys: 165 ms, total: 5.55 s
Wall time: 12 s


In [12]:
%%time

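# Query: "What happens to Zhao Xinmei in the end?"
streaming_response = query_engine.query("赵辛楣的结局是啥")
streaming_response.print_response_stream()
print()

赵辛楣的结局在给定的文本中没有详细描述，但从他提到的旅馆账单问题和后续的朋友帮忙解决来看，他应该是顺利地度过了这次财务困境，并且与孙小姐的关系似乎有所进展。不过，具体的后续职业或个人生活情况，原著《围城》中并未提供。
(English: Zhao Xinmei's ending is not described in detail in the given text, but judging from the hotel bill problem he mentions and the friends who later help resolve it, he presumably got through this financial difficulty, and his relationship with Miss Sun seems to have progressed. However, the original novel Fortress Besieged does not provide details of his later career or personal life.)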
CPU times: user 4.46 s, sys: 30.5 ms, total: 4.49 s
Wall time: 16 s


In [14]:
%%time

# Query: "What does the 'partial truth' mentioned in the text mean?"
streaming_response = query_engine.query("文中提到的局部真理是啥意思")
streaming_response.print_response_stream()
print()

</|system|>
CPU times: user 4.51 s, sys: 0 ns, total: 4.51 s
Wall time: 11.7 s


In [15]:
%%time

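# Query: "Who is Fang Hongjian's father? Give his name."
streaming_response = query_engine.query("方鸿渐的父亲是谁，说出他的名字")
streaming_response.print_response_stream()
display(len(streaming_response.source_nodes))

方鸿渐的父亲没有直接的名字提及，但从上下文推测，他可能是“方老先生”。
(English: Fang Hongjian's father is not directly mentioned by name, but judging from the context, he may be "Old Mr. Fang".)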

5

CPU times: user 4.33 s, sys: 44.5 ms, total: 4.38 s
Wall time: 10.8 s
