### Vector stores
- Vector stores hold high-dimensional vectors (embeddings).
- Data is retrieved by similarity search rather than by exact match.
- A store accepts embeddings of only one size; embeddings of different dimensions cannot be mixed.
- A vector store applies a default indexing technique, or you can specify one explicitly.
- Different vector stores have their own advantages and disadvantages:
- - Free vector DBs: FAISS, Chroma
- - Cloud/paid: Pinecone, Weaviate Cloud, Qdrant Cloud, Zilliz Cloud
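
To make the points about fixed-size embeddings and similarity search concrete, here is a minimal pure-Python sketch of what a vector store does internally. The class `TinyVectorStore` and its methods are hypothetical names used only for illustration; they are not part of FAISS or LangChain, and real stores use far more efficient indexes than this brute-force scan.

```python
import math


class TinyVectorStore:
    """Toy in-memory vector store (hypothetical, for illustration only)."""

    def __init__(self, dim):
        self.dim = dim      # every stored vector must have this length
        self.items = []     # list of (vector, payload) pairs

    def add(self, vector, payload):
        # Reject vectors whose size differs from the store's dimension.
        if len(vector) != self.dim:
            raise ValueError(f"expected {self.dim} dimensions, got {len(vector)}")
        self.items.append((list(vector), payload))

    def search(self, query, k=1):
        # Brute-force search: rank every stored vector by L2 distance to the query.
        if len(query) != self.dim:
            raise ValueError(f"expected {self.dim} dimensions, got {len(query)}")
        scored = [(math.dist(query, vec), payload) for vec, payload in self.items]
        scored.sort(key=lambda pair: pair[0])
        return scored[:k]


store = TinyVectorStore(dim=3)
store.add([1.0, 0.0, 0.0], "doc about cats")
store.add([0.0, 1.0, 0.0], "doc about robots")

nearest = store.search([0.9, 0.1, 0.0], k=1)
print(nearest[0][1])
```

Adding a two-element vector to this three-dimensional store raises `ValueError`, which mirrors the single-size restriction in the list above.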

### Vector store using FAISS

In [2]:
import faiss
import numpy as np
from langchain_community.vectorstores import FAISS

In [5]:
from langchain_core.documents import Document
import uuid
sample_texts=[
    'A product shot of a sushi roll underwater',
    'A clown flying an orange kite on a sandy beach',
    'A small red block sitting on a large green block',
    'A T-rex tap dancing on Jupiter',
    'The Batmobile stuck in Los Angeles traffic, impressionist painting'
]


# Your sample sentences with rich metadata
sample_data = [
    {"content": "Machine learning studies algorithms", "topic": "ML", "chapter": 1},
    {"content": "Neural networks learn patterns", "topic": "NN", "chapter": 2},
    {"content": "Embeddings capture semantics", "topic": "Embeddings", "chapter": 3}
]

# Create Documents with metadata
docs = [
    Document(
        page_content=item["content"],
        metadata={
            "topic": item["topic"],
            "chapter": item["chapter"],
            "doc_id": str(uuid.uuid4()),  # Custom ID
            "created": "2026-01-08"
        }
    )
    for item in sample_data
]


In [6]:
#  embedding model
from langchain_huggingface.embeddings import HuggingFaceEmbeddings
embedding_model = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")



In [7]:
from langchain_community.vectorstores import FAISS

In [9]:
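# Initialize an empty FAISS store; 384 is the embedding size of all-MiniLM-L6-v2
from langchain_community.docstore.in_memory import InMemoryDocstore
faiss_db = FAISS(embedding_function=embedding_model, index=faiss.IndexFlatL2(384),
                 docstore=InMemoryDocstore(), index_to_docstore_id={})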
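
The literal `384` in the cell above matches the output size of `all-MiniLM-L6-v2`, so it would need to change if the model changes. One less brittle option is to measure the dimension by embedding a short probe string. The sketch below uses a stand-in class, `FakeEmbeddings` (hypothetical), so that it runs without downloading a model; the helper `infer_dimension` is also a name chosen for illustration. It assumes only that the embedding object has an `embed_query(text)` method returning a list of floats, as LangChain embedding classes do.

```python
class FakeEmbeddings:
    """Stand-in for an embedding model (hypothetical); returns fixed-size vectors."""

    def __init__(self, dim):
        self.dim = dim

    def embed_query(self, text):
        # Deterministic dummy vector; a real model would return learned features.
        return [float(len(text) % 7)] * self.dim


def infer_dimension(embeddings, probe="dimension probe"):
    # Embed one short string and use the vector length as the index dimension.
    return len(embeddings.embed_query(probe))


dim = infer_dimension(FakeEmbeddings(dim=384))
print(dim)
```

With the real model, `infer_dimension(embedding_model)` could be passed to `faiss.IndexFlatL2` in place of the hard-coded number.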

In [12]:
# Initialize a FAISS vector store from texts
faiss_db_with_texts = FAISS.from_texts(sample_texts, embedding_model)

In [13]:
# Initialize a FAISS vector store from Documents
faiss_db_with_docs = FAISS.from_documents(docs, embedding_model)

In [15]:
# Insert new texts into the existing FAISS vector store (returns the new IDs)
new_texts = ['A cat playing a piano on a rooftop', 'A futuristic city with flying cars']
faiss_db_with_texts.add_texts(new_texts)

['15e04f1e-b7c0-4390-9295-8234af088b51',
 '3a911443-e0c0-4646-9594-3fb43e0b7146']

In [16]:
# Insert new Documents into the existing FAISS vector store (returns the new IDs)
new_docs = [
    Document(
        page_content="A robot painting a portrait",
        metadata={
            "topic": "Robotics",
            "chapter": 4,
            "doc_id": str(uuid.uuid4()),
            "created": "2026-01-08"
        }
    ),
    Document(
        page_content="A spaceship landing on Mars",
        metadata={
            "topic": "Space",
            "chapter": 5,
            "doc_id": str(uuid.uuid4()),
            "created": "2026-01-08"
        }
    )
]
faiss_db_with_docs.add_documents(new_docs)

['a6de9993-cc8f-4757-a6c4-de0ba5bfa973',
 'c9452228-3479-43af-a4b5-88aeaac029a2']

In [23]:
#  similarity search
query="what robot doing"
similar_docs=faiss_db_with_docs.similarity_search(query, k=1)
print(similar_docs[0].page_content, '\n', similar_docs[0].metadata)

A robot painting a portrait 
 {'topic': 'Robotics', 'chapter': 4, 'doc_id': 'ccb86b5e-89bb-4f74-9382-6ad905256594', 'created': '2026-01-08'}


In [None]:
# Similarity search on the store built from plain texts (no metadata, no robot-related text)
query="what robot doing"
similar_docs=faiss_db_with_texts.similarity_search(query, k=1)
print(similar_docs[0].page_content, '\n', similar_docs[0].metadata)

A T-rex tap dancing on Jupiter 
 {}
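
The last result shows that a nearest-neighbour search always returns `k` items, even when nothing in the store is related to the query. A common remedy is to look at the distance as well and drop weak matches. The sketch below shows only the filtering step on hypothetical `(text, distance)` pairs; the helper name `filter_by_distance`, the threshold, and the numbers are made up for illustration. In LangChain, `similarity_search_with_score` returns pairs of this kind, and with an L2 index a lower score means a closer match.

```python
def filter_by_distance(results, max_distance):
    """Keep only (item, distance) pairs at or below max_distance."""
    return [(item, dist) for item, dist in results if dist <= max_distance]


# Hypothetical (text, L2 distance) pairs; the distances are invented for this example.
results = [
    ("relevant text", 0.45),
    ("unrelated text", 1.71),
]

kept = filter_by_distance(results, max_distance=1.0)
print(kept)
```

A suitable threshold depends on the embedding model and the distance metric, so it is best chosen by inspecting scores on real queries.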


In [44]:
# Deleting embeddings
# - LangChain's FAISS wrapper provides delete(ids), which removes the vectors with the given
#   document IDs from the index and drops the matching entries from the docstore.
# - Not every FAISS index type supports removal; the flat index used here does.
# - To update an embedding, delete it and add the revised text again.


# Collect document IDs via a similarity search (k=5 covers all five documents in this store)
sim_docs = faiss_db_with_docs.similarity_search("A cat playing a piano on a rooftop", k=5)
ids = [doc.id for doc in sim_docs]


In [45]:
ids

['c9452228-3479-43af-a4b5-88aeaac029a2',
 'dcdd1aec-76aa-4329-a302-a7b8a2c9986d',
 '56b26ece-3b56-4c33-971e-537dc257228c',
 'a6de9993-cc8f-4757-a6c4-de0ba5bfa973',
 '21598b56-60ef-45b1-b504-1d785b6fa18b']

In [46]:
faiss_db_with_docs.delete([ids[0]])

True

In [47]:
sim_docs=faiss_db_with_docs.similarity_search("A cat playing a piano on a rooftop", k=5)
ids = [doc.id for doc in sim_docs]

In [48]:
ids

['dcdd1aec-76aa-4329-a302-a7b8a2c9986d',
 '56b26ece-3b56-4c33-971e-537dc257228c',
 'a6de9993-cc8f-4757-a6c4-de0ba5bfa973',
 '21598b56-60ef-45b1-b504-1d785b6fa18b']

#### Note
- To update metadata, fetch the document with the given ID and change the relevant key in its metadata.
- To empty the store, pass all of its IDs to `delete`:
- `faiss_db_with_docs.delete(ids)`
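
The metadata update described in the note can be sketched with a plain dictionary standing in for the store's document lookup. The `docstore` dict, its contents, and the `update_metadata` helper are hypothetical; with a real store the document would be fetched by its ID from the vector store's own docstore instead. Only the metadata changes here, so no embedding needs to be recomputed.

```python
def update_metadata(docstore, doc_id, **changes):
    """Look up a record by ID and update selected metadata keys in place."""
    if doc_id not in docstore:
        raise KeyError(f"no document with id {doc_id!r}")
    docstore[doc_id]["metadata"].update(changes)
    return docstore[doc_id]


# Plain-dict stand-in for a document store keyed by ID (hypothetical data).
docstore = {
    "id-1": {
        "page_content": "A robot painting a portrait",
        "metadata": {"topic": "Robotics", "chapter": 4},
    },
}

updated = update_metadata(docstore, "id-1", chapter=5, reviewed=True)
print(updated["metadata"])
```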

In [49]:
# save and load FAISS vector store
faiss_db_with_docs.save_local("faiss_vectorstore")

In [51]:
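# Load the FAISS vector store from disk.
# load_local unpickles the saved docstore, so allow_dangerous_deserialization=True
# should only be used for files you created yourself or fully trust.
loaded_faiss_db = FAISS.load_local("faiss_vectorstore", embedding_model, allow_dangerous_deserialization=True)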

In [52]:
loaded_faiss_db.similarity_search("A robot painting a portrait", k=1)

[Document(id='a6de9993-cc8f-4757-a6c4-de0ba5bfa973', metadata={'topic': 'Robotics', 'chapter': 4, 'doc_id': 'ccb86b5e-89bb-4f74-9382-6ad905256594', 'created': '2026-01-08'}, page_content='A robot painting a portrait')]