## Vector Stores

### FAISS

In [12]:
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import FAISS
from langchain_ollama import OllamaEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

In [24]:
loader = TextLoader("./data/speech.txt")
document = loader.load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=300, chunk_overlap=70)
docs = text_splitter.split_documents(document)

In [25]:
len(docs)

16
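
The two splitter parameters act like a sliding window: each chunk holds at most `chunk_size` characters and shares up to `chunk_overlap` characters with the chunk before it. The sketch below illustrates only that idea with a fixed-size window. It is not how `RecursiveCharacterTextSplitter` is implemented (the real splitter prefers to break on separators such as paragraph and sentence boundaries), and `sliding_window_chunks` is a hypothetical helper name.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size windows that overlap by chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_window_chunks(sample, chunk_size=10, chunk_overlap=3)
for chunk in chunks:
    print(chunk)
```

Each printed chunk starts with the last three characters of the previous one, which is the role `chunk_overlap=70` plays in the cell above: context that straddles a chunk boundary is present in both neighbours.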

In [26]:
embeddings = OllamaEmbeddings(
    model="gemma:2b",
)
db = FAISS.from_documents(docs, embeddings)
db

<langchain_community.vectorstores.faiss.FAISS at 0x7b276d582050>

In [27]:
# Query the vector store
query = "I was swimming in a river"
results = db.similarity_search(query)
for i, res in enumerate(results):
    print(f"Result {i+1}:\n{res.page_content}\n")

Result 1:
But here’s something I do regret:

Result 2:
Sumatra, a little buzzed, and looking up and seeing like 300 monkeys sitting on a pipeline, pooping down into the river, the river in which I was swimming, with my mouth open, naked? And getting deathly ill afterwards, and staying sick for the next seven months? Not so much. Do I regret the

Result 3:
So: What do I regret? Being poor from time to time? Not really. Working terrible jobs, like “knuckle-puller in a slaughterhouse?” (And don’t even ASK what that entails.) No. I don’t regret that. Skinny-dipping in a river in Sumatra, a little buzzed, and looking up and seeing like 300 monkeys

Result 4:
And I intend to respect that tradition.



##### Convert the Vector Store to a Retriever

In [28]:
retriever = db.as_retriever()
retriever.invoke(query)

[Document(id='8627ac5e-b078-43f7-82f8-a330217abab5', metadata={'source': './data/speech.txt'}, page_content='But here’s something I do regret:'),
 Document(id='872353e9-441c-43aa-a935-bf50d764730f', metadata={'source': './data/speech.txt'}, page_content='Sumatra, a little buzzed, and looking up and seeing like 300 monkeys sitting on a pipeline, pooping down into the river, the river in which I was swimming, with my mouth open, naked? And getting deathly ill afterwards, and staying sick for the next seven months? Not so much. Do I regret the'),
 Document(id='fe0cfffa-2c16-473d-af55-4c255de90645', metadata={'source': './data/speech.txt'}, page_content='So: What do I regret? Being poor from time to time? Not really. Working terrible jobs, like “knuckle-puller in a slaughterhouse?” (And don’t even ASK what that entails.) No. I don’t regret that. Skinny-dipping in a river in Sumatra, a little buzzed, and looking up and seeing like 300 monkeys'),
 Document(id='64327b5b-f919-4600-ba31-2dd6f78
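
A retriever is a thin interface over the store: it takes a query string and returns documents, with search settings such as the number of results fixed when the retriever is created. The toy sketch below mimics that shape without LangChain or an embedding model. `ToyRetriever`, `keyword_search`, and `CORPUS` are hypothetical names, and word overlap stands in for vector similarity.

```python
class ToyRetriever:
    """Wraps a search function so callers only pass a query string."""

    def __init__(self, search_fn, k=4):
        self.search_fn = search_fn
        self.k = k  # number of results, fixed at construction time

    def invoke(self, query):
        return self.search_fn(query, self.k)


# Made-up corpus for illustration only
CORPUS = [
    "swimming in a river",
    "working terrible jobs",
    "respect that tradition",
]


def keyword_search(query, k):
    """Rank texts by shared words with the query (a stand-in for vector search)."""
    query_words = set(query.lower().split())
    ranked = sorted(
        CORPUS,
        key=lambda text: len(query_words & set(text.lower().split())),
        reverse=True,
    )
    return ranked[:k]


toy_retriever = ToyRetriever(keyword_search, k=1)
top_hits = toy_retriever.invoke("I was swimming in a river")
print(top_hits)
```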

##### Similarity Search with Score

In [29]:
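# With LangChain's default FAISS setup the score is an L2 distance, so lower means more similar
docs_with_scores = db.similarity_search_with_score(query)
for i, (doc, score) in enumerate(docs_with_scores):
    print(f"Result {i+1} (Score: {score}):\n{doc.page_content}\n")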

Result 1 (Score: 0.8139021992683411):
But here’s something I do regret:

Result 2 (Score: 0.8903505802154541):
Sumatra, a little buzzed, and looking up and seeing like 300 monkeys sitting on a pipeline, pooping down into the river, the river in which I was swimming, with my mouth open, naked? And getting deathly ill afterwards, and staying sick for the next seven months? Not so much. Do I regret the

Result 3 (Score: 0.8950883150100708):
So: What do I regret? Being poor from time to time? Not really. Working terrible jobs, like “knuckle-puller in a slaughterhouse?” (And don’t even ASK what that entails.) No. I don’t regret that. Skinny-dipping in a river in Sumatra, a little buzzed, and looking up and seeing like 300 monkeys

Result 4 (Score: 0.9063223004341125):
And I intend to respect that tradition.
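
The scores printed above increase from Result 1 to Result 4 because, with LangChain's default FAISS configuration, the score is a distance rather than a similarity: the closest match has the lowest value. A minimal sketch of that ranking follows, using brute-force squared Euclidean distance over toy 3-dimensional vectors. The vectors, labels, and the `rank_by_distance` helper are made up for illustration, the choice of squared distance is an assumption about the metric, and a real FAISS index uses optimized routines rather than a Python loop.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rank_by_distance(query_vec, stored, k=4):
    """Return the k (label, distance) pairs closest to query_vec, nearest first."""
    scored = [(label, squared_l2(query_vec, vec)) for label, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1])  # ascending: smaller distance ranks higher
    return scored[:k]


# Toy vectors standing in for chunk embeddings (hypothetical values)
stored_vectors = {
    "river": [1.0, 0.0, 0.0],
    "monkeys": [0.0, 1.0, 0.0],
    "tradition": [0.0, 0.0, 1.0],
}
query_vector = [0.5, 0.25, 0.0]

ranking = rank_by_distance(query_vector, stored_vectors, k=2)
for label, distance in ranking:
    print(label, distance)
```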



##### Similarity Search by Embedding Vector

In [30]:
embedding_vector = embeddings.embed_query(query)
embedding_vector

[0.010204367,
 -0.009578988,
 -0.021301733,
 0.026438514,
 -0.011999188,
 0.05956895,
 0.023844054,
 0.02504292,
 0.017262975,
 0.01763909,
 0.008301522,
 0.0063337367,
 -0.0051265084,
 0.0012062313,
 -0.0036862742,
 0.0078035397,
 -0.052542202,
 -0.006850061,
 0.011125931,
 -0.016934162,
 -0.015243161,
 -0.002360845,
 0.013141816,
 -0.01913676,
 -0.0025862588,
 0.010031142,
 0.0062649506,
 0.009481578,
 0.023163866,
 0.01958326,
 0.009316168,
 -0.017674796,
 0.00029629495,
 -0.020154137,
 -0.027168052,
 0.0087882895,
 -0.0373169,
 -0.023649488,
 0.013834441,
 -0.0003006232,
 0.026261019,
 0.0007677487,
 0.0082058385,
 -0.0060306536,
 -0.032241546,
 -0.008258105,
 -0.011838068,
 -0.0036368743,
 0.018854756,
 -0.011143894,
 -0.14523937,
 -0.22225721,
 0.0074042273,
 -0.009107904,
 0.0029529806,
 0.0056574643,
 0.0025827377,
 0.008288562,
 -0.004877375,
 -0.00635736,
 -0.004179578,
 0.013843524,
 -0.03128049,
 -0.017566707,
 -0.03189847,
 -0.004602346,
 -0.019343335,
 -0.0034600056,
 -0.

In [31]:
db.similarity_search_by_vector(embedding_vector)

[Document(id='8627ac5e-b078-43f7-82f8-a330217abab5', metadata={'source': './data/speech.txt'}, page_content='But here’s something I do regret:'),
 Document(id='872353e9-441c-43aa-a935-bf50d764730f', metadata={'source': './data/speech.txt'}, page_content='Sumatra, a little buzzed, and looking up and seeing like 300 monkeys sitting on a pipeline, pooping down into the river, the river in which I was swimming, with my mouth open, naked? And getting deathly ill afterwards, and staying sick for the next seven months? Not so much. Do I regret the'),
 Document(id='fe0cfffa-2c16-473d-af55-4c255de90645', metadata={'source': './data/speech.txt'}, page_content='So: What do I regret? Being poor from time to time? Not really. Working terrible jobs, like “knuckle-puller in a slaughterhouse?” (And don’t even ASK what that entails.) No. I don’t regret that. Skinny-dipping in a river in Sumatra, a little buzzed, and looking up and seeing like 300 monkeys'),
 Document(id='64327b5b-f919-4600-ba31-2dd6f78

##### Save and Load the Vector Store

In [32]:
db.save_local("faiss_index")

In [37]:
new_vector_db = FAISS.load_local("faiss_index", embeddings, allow_dangerous_deserialization=True)

In [38]:
new_vector_db.similarity_search(query)

[Document(id='8627ac5e-b078-43f7-82f8-a330217abab5', metadata={'source': './data/speech.txt'}, page_content='But here’s something I do regret:'),
 Document(id='872353e9-441c-43aa-a935-bf50d764730f', metadata={'source': './data/speech.txt'}, page_content='Sumatra, a little buzzed, and looking up and seeing like 300 monkeys sitting on a pipeline, pooping down into the river, the river in which I was swimming, with my mouth open, naked? And getting deathly ill afterwards, and staying sick for the next seven months? Not so much. Do I regret the'),
 Document(id='fe0cfffa-2c16-473d-af55-4c255de90645', metadata={'source': './data/speech.txt'}, page_content='So: What do I regret? Being poor from time to time? Not really. Working terrible jobs, like “knuckle-puller in a slaughterhouse?” (And don’t even ASK what that entails.) No. I don’t regret that. Skinny-dipping in a river in Sumatra, a little buzzed, and looking up and seeing like 300 monkeys'),
 Document(id='64327b5b-f919-4600-ba31-2dd6f78
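
`save_local` stores part of the vector store as a pickle file, and `allow_dangerous_deserialization=True` is required on load because unpickling a file can execute arbitrary code. Only load an index you created yourself or otherwise trust. The round trip below is a toy sketch of pickle-based persistence with plain Python data. `toy_store` and the file name `toy_index.pkl` are hypothetical stand-ins, not what LangChain writes to disk.

```python
import pickle
import tempfile
from pathlib import Path

# Toy stand-in for a vector store: texts plus their (made-up) vectors
toy_store = {
    "texts": ["chunk one", "chunk two"],
    "vectors": [[0.1, 0.2], [0.3, 0.4]],
}

with tempfile.TemporaryDirectory() as tmp_dir:
    index_path = Path(tmp_dir) / "toy_index.pkl"

    # Save: serialize the object to disk
    with index_path.open("wb") as f:
        pickle.dump(toy_store, f)

    # Load: pickle.load runs whatever the file tells it to, so the file must be trusted
    with index_path.open("rb") as f:
        restored = pickle.load(f)

print(restored == toy_store)
```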

### ChromaDB

In [2]:
from langchain_chroma import Chroma
from langchain_community.document_loaders import TextLoader
from langchain_ollama import OllamaEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter


In [3]:
loader = TextLoader("./data/speech.txt")
document = loader.load()

In [4]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=300, chunk_overlap=70)
docs = text_splitter.split_documents(document)

In [5]:
embeddings = OllamaEmbeddings(
    model="gemma:2b",
)
chroma_db = Chroma.from_documents(docs, embeddings)
chroma_db

<langchain_chroma.vectorstores.Chroma at 0x7bf3f6bcd4d0>

In [6]:
# Query the Chroma vector store
query = "I was swimming in a river"
results = chroma_db.similarity_search(query)
for i, res in enumerate(results):
    print(f"Result {i+1}:\n{res.page_content}\n")

Result 1:
But here’s something I do regret:

Result 2:
Sumatra, a little buzzed, and looking up and seeing like 300 monkeys sitting on a pipeline, pooping down into the river, the river in which I was swimming, with my mouth open, naked? And getting deathly ill afterwards, and staying sick for the next seven months? Not so much. Do I regret the

Result 3:
So: What do I regret? Being poor from time to time? Not really. Working terrible jobs, like “knuckle-puller in a slaughterhouse?” (And don’t even ASK what that entails.) No. I don’t regret that. Skinny-dipping in a river in Sumatra, a little buzzed, and looking up and seeing like 300 monkeys

Result 4:
And I intend to respect that tradition.



##### Save and Load the Vector Store

In [None]:
# Save: passing persist_directory stores the collection on disk
vectordb = Chroma.from_documents(docs, embeddings, persist_directory="./chroma_db")

In [8]:
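# Load: reopen the persisted collection with the same embedding function
new_chroma_db = Chroma(persist_directory="./chroma_db", embedding_function=embeddings)
# Use a separate name so the split chunks in `docs` are not overwritten
results = new_chroma_db.similarity_search(query)
for i, res in enumerate(results):
    print(f"Result {i+1}:\n{res.page_content}\n")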

Result 1:
But here’s something I do regret:

Result 2:
Sumatra, a little buzzed, and looking up and seeing like 300 monkeys sitting on a pipeline, pooping down into the river, the river in which I was swimming, with my mouth open, naked? And getting deathly ill afterwards, and staying sick for the next seven months? Not so much. Do I regret the

Result 3:
So: What do I regret? Being poor from time to time? Not really. Working terrible jobs, like “knuckle-puller in a slaughterhouse?” (And don’t even ASK what that entails.) No. I don’t regret that. Skinny-dipping in a river in Sumatra, a little buzzed, and looking up and seeing like 300 monkeys

Result 4:
And I intend to respect that tradition.



In [9]:
# The score is a distance, so lower means more similar
scored_results = new_chroma_db.similarity_search_with_score(query)
for i, (doc, score) in enumerate(scored_results):
    print(f"Result {i+1} (Score: {score}):\n{doc.page_content}\n")

Result 1 (Score: 0.8139020800590515):
But here’s something I do regret:

Result 2 (Score: 0.8903502821922302):
Sumatra, a little buzzed, and looking up and seeing like 300 monkeys sitting on a pipeline, pooping down into the river, the river in which I was swimming, with my mouth open, naked? And getting deathly ill afterwards, and staying sick for the next seven months? Not so much. Do I regret the

Result 3 (Score: 0.8950883150100708):
So: What do I regret? Being poor from time to time? Not really. Working terrible jobs, like “knuckle-puller in a slaughterhouse?” (And don’t even ASK what that entails.) No. I don’t regret that. Skinny-dipping in a river in Sumatra, a little buzzed, and looking up and seeing like 300 monkeys

Result 4 (Score: 0.9063223600387573):
And I intend to respect that tradition.



##### Convert the Vector Store to a Retriever

In [10]:
retriever = vectordb.as_retriever()
retriever.invoke(query)[0].page_content

'But here’s something I do regret:'