## Vector stores

Vector stores are specialized data stores that enable indexing and retrieving information based on vector representations.

These vectors, called embeddings, capture the semantic meaning of the data they were computed from.

Vector stores are frequently used to search over unstructured data, such as text, images, and audio, to retrieve relevant information based on semantic similarity rather than exact keyword matches.
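As a rough illustration of "semantic similarity", the sketch below ranks a few words by the cosine similarity of their vectors to a query vector. The three-dimensional vectors are hypothetical values invented for this example; real embeddings come from a model and have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy "embeddings", made up for illustration only.
toy_embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "car": [0.1, 0.9, 0.2],
}

# Pretend this is the embedding of a query about felines.
query = [0.85, 0.15, 0.05]

# Most similar first.
ranked = sorted(
    toy_embeddings,
    key=lambda word: cosine_similarity(query, toy_embeddings[word]),
    reverse=True,
)
print(ranked)
```

The word whose vector points in a very different direction from the query ends up last, even though no keywords are compared at all.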

<img src="https://python.langchain.com/assets/images/vectorstores-2540b4bc355b966c99b0f02cfdddb273.png" width="700px" height="400px"/>


LangChain provides a standard interface for working with vector stores, allowing users to easily switch between different vector store implementations.

The interface consists of basic methods for writing, deleting and searching for documents in the vector store.

The key methods are:

+ `add_documents`: Add a list of documents to the vector store.
+ `delete`: Delete a list of documents from the vector store.
+ `similarity_search`: Search for documents similar to a given query.
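To make the three methods concrete, here is a toy in-memory store in plain Python. It is only a sketch under loud assumptions: `ToyVectorStore` and `toy_embed` are hypothetical names invented for illustration, the store takes plain strings rather than `Document` objects, and it is not how LangChain or any real backend is implemented.

```python
import math
import uuid


class ToyVectorStore:
    """Illustrative in-memory store; NOT LangChain's implementation."""

    def __init__(self, embed):
        self.embed = embed   # function: str -> list[float]
        self.vectors = {}    # id -> embedding vector
        self.texts = {}      # id -> original text

    def add_documents(self, texts):
        """Embed each text, store it, and return the generated IDs."""
        ids = []
        for text in texts:
            doc_id = str(uuid.uuid4())
            self.vectors[doc_id] = self.embed(text)
            self.texts[doc_id] = text
            ids.append(doc_id)
        return ids

    def delete(self, ids):
        """Remove the given IDs; unknown IDs are ignored."""
        for doc_id in ids:
            self.vectors.pop(doc_id, None)
            self.texts.pop(doc_id, None)
        return True

    def similarity_search(self, query, k=4):
        """Return the k stored texts closest to the query (Euclidean distance)."""
        q = self.embed(query)
        by_distance = sorted(
            self.vectors, key=lambda i: math.dist(q, self.vectors[i])
        )
        return [self.texts[i] for i in by_distance[:k]]


def toy_embed(text):
    """Hypothetical embedding: counts of the letters a, e and o."""
    lowered = text.lower()
    return [float(lowered.count(c)) for c in "aeo"]


store = ToyVectorStore(toy_embed)
ids = store.add_documents(["banana", "teepee", "solo"])
first = store.similarity_search("papaya", k=1)
store.delete(ids[:1])  # remove "banana"
second = store.similarity_search("papaya", k=1)
print(first, second)
```

Because every backend exposes the same three operations, code written against the interface does not need to change when the underlying store is swapped.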

### FAISS
Facebook AI Similarity Search (FAISS) is a library for efficient similarity search and clustering of dense vectors.

It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. 

It also includes supporting code for evaluation and parameter tuning.
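The cells in this section use `faiss.IndexFlatL2`, which performs an exact search by comparing the query with every stored vector. The plain-Python sketch below shows that idea only; `flat_l2_search` is a hypothetical helper written for this note, and the real library is implemented very differently (optimized native code). Like the flat L2 index, the sketch reports squared Euclidean distances.

```python
def flat_l2_search(stored, query, k):
    """Exhaustively compare `query` with every stored vector.

    Returns (squared_distance, position) pairs for the k closest vectors.
    """
    scored = []
    for position, vector in enumerate(stored):
        squared = sum((a - b) ** 2 for a, b in zip(vector, query))
        scored.append((squared, position))
    scored.sort()  # smallest distance first
    return scored[:k]


stored_vectors = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
nearest = flat_l2_search(stored_vectors, [0.9, 1.2], k=2)
print(nearest)
```

An exhaustive scan like this is simple and exact, but its cost grows linearly with the number of vectors, which is why libraries such as FAISS also offer approximate indexes.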

In [31]:
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import CharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_ollama import OllamaEmbeddings
import faiss

In [None]:
# Load the speech as a list of Document objects
loader = TextLoader("../1.document_loader/speech.txt", encoding="UTF-8")
data = loader.load()
print(data)

[Document(metadata={'source': '../1.document_loader/speech.txt'}, page_content='The world must be made safe for democracy. Its peace must be planted upon the tested foundations of political liberty. We have no selfish ends to serve. We desire no conquest, no dominion. We seek no indemnities for ourselves, no material compensation for the sacrifices we shall freely make. We are but one of the champions of the rights of mankind. We shall be satisfied when those rights have been made as secure as the faith and the freedom of nations can make them.\n\nJust because we fight without rancor and without selfish object, seeking nothing for ourselves but what we shall wish to share with all free peoples, we shall, I feel confident, conduct our operations as belligerents without passion and ourselves observe with proud punctilio the principles of right and of fair play we profess to be fighting for.\n\nIt will be all the easier for us to conduct ourselves as belligerents in a high spirit of right

In [15]:
# Split the text on blank lines into chunks of at most ~1000 characters
text_split = CharacterTextSplitter(separator="\n\n", chunk_size=1000, chunk_overlap=30)
docs = text_split.split_documents(data)

In [16]:
docs

[Document(metadata={'source': '../1.document_loader/speech.txt'}, page_content='The world must be made safe for democracy. Its peace must be planted upon the tested foundations of political liberty. We have no selfish ends to serve. We desire no conquest, no dominion. We seek no indemnities for ourselves, no material compensation for the sacrifices we shall freely make. We are but one of the champions of the rights of mankind. We shall be satisfied when those rights have been made as secure as the faith and the freedom of nations can make them.\n\nJust because we fight without rancor and without selfish object, seeking nothing for ourselves but what we shall wish to share with all free peoples, we shall, I feel confident, conduct our operations as belligerents without passion and ourselves observe with proud punctilio the principles of right and of fair play we profess to be fighting for.'),
 Document(metadata={'source': '../1.document_loader/speech.txt'}, page_content='It will be all 
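The first chunk above contains two paragraphs joined by `\n\n`, which suggests the splitter cuts on the separator and then packs neighbouring pieces together while they fit in `chunk_size`. The sketch below imitates that behaviour in a simplified form; `split_and_merge` is a hypothetical helper, it ignores `chunk_overlap`, and it is not the library's actual code.

```python
def split_and_merge(text, separator="\n\n", chunk_size=1000):
    """Split on `separator`, then greedily pack pieces into chunks of at
    most `chunk_size` characters (overlap is ignored in this sketch)."""
    chunks, current = [], ""
    for piece in text.split(separator):
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size or not current:
            # Still fits, or a single oversized piece that is kept whole.
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


sample = "aaaa\n\nbbbb\n\ncccc"
chunks = split_and_merge(sample, chunk_size=10)
print(chunks)
```

With `chunk_size=10`, the first two pieces fit together (10 characters including the separator) and the third starts a new chunk.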

In [None]:
# Embedding model served by Ollama
embeddings = OllamaEmbeddings(model="llama3.2:1b")
# Embedding dimension, needed to size the FAISS index
len(embeddings.embed_query("langchain query"))


2048

In [44]:
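from langchain_community.docstore.in_memory import InMemoryDocstore

# Exact (brute-force) L2 index sized to the embedding dimension
embedding_dim = len(embeddings.embed_query("length of embedding"))
index = faiss.IndexFlatL2(embedding_dim)

vector_store = FAISS(
    embedding_function=embeddings,
    index=index,
    docstore=InMemoryDocstore(),  # holds the Document objects
    index_to_docstore_id={},      # maps index positions to docstore IDs
)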

In [52]:
vector_store.add_documents(docs)

['6ab17275-611f-4116-85c9-117fa1982356',
 'b2e851d7-0c44-4f91-a013-12a59a086754',
 '99c18de6-85b6-497a-a07b-2217f3281ef7',
 '8edb241e-b19c-408e-a05f-ddce3e2d3852',
 'e1536b9e-53eb-4088-867e-2fa279d154c6']

In [61]:
vector_store.get_by_ids(["ab35de26-937b-4ae5-ad3e-d0c443e2416b"])

[Document(id='ab35de26-937b-4ae5-ad3e-d0c443e2416b', metadata={'source': '../1.document_loader/speech.txt'}, page_content='The world must be made safe for democracy. Its peace must be planted upon the tested foundations of political liberty. We have no selfish ends to serve. We desire no conquest, no dominion. We seek no indemnities for ourselves, no material compensation for the sacrifices we shall freely make. We are but one of the champions of the rights of mankind. We shall be satisfied when those rights have been made as secure as the faith and the freedom of nations can make them.\n\nJust because we fight without rancor and without selfish object, seeking nothing for ourselves but what we shall wish to share with all free peoples, we shall, I feel confident, conduct our operations as belligerents without passion and ourselves observe with proud punctilio the principles of right and of fair play we profess to be fighting for.')]

In [59]:
vector_store.get_by_ids(["6ab17275-611f-4116-85c9-117fa1982356"])

[]

In [None]:
# Delete documents by ID; returns True on success.
# Left commented out so the store stays populated.
# vector_store.delete(['6ab17275-611f-4116-85c9-117fa1982356',
#  'b2e851d7-0c44-4f91-a013-12a59a086754',
#  '99c18de6-85b6-497a-a07b-2217f3281ef7',
#  '8edb241e-b19c-408e-a05f-ddce3e2d3852',
#  'e1536b9e-53eb-4088-867e-2fa279d154c6'])

True

In [None]:
# Similarity search: return the chunks closest to the query
result = vector_store.similarity_search("what world  made safe for?")
result[0].page_content

'The world must be made safe for democracy. Its peace must be planted upon the tested foundations of political liberty. We have no selfish ends to serve. We desire no conquest, no dominion. We seek no indemnities for ourselves, no material compensation for the sacrifices we shall freely make. We are but one of the champions of the rights of mankind. We shall be satisfied when those rights have been made as secure as the faith and the freedom of nations can make them.\n\nJust because we fight without rancor and without selfish object, seeking nothing for ourselves but what we shall wish to share with all free peoples, we shall, I feel confident, conduct our operations as belligerents without passion and ourselves observe with proud punctilio the principles of right and of fair play we profess to be fighting for.'

### As Retriever

In [78]:
retriever = vector_store.as_retriever()

In [73]:
retriever.invoke("what world  made safe for?")

[Document(id='ab35de26-937b-4ae5-ad3e-d0c443e2416b', metadata={'source': '../1.document_loader/speech.txt'}, page_content='The world must be made safe for democracy. Its peace must be planted upon the tested foundations of political liberty. We have no selfish ends to serve. We desire no conquest, no dominion. We seek no indemnities for ourselves, no material compensation for the sacrifices we shall freely make. We are but one of the champions of the rights of mankind. We shall be satisfied when those rights have been made as secure as the faith and the freedom of nations can make them.\n\nJust because we fight without rancor and without selfish object, seeking nothing for ourselves but what we shall wish to share with all free peoples, we shall, I feel confident, conduct our operations as belligerents without passion and ourselves observe with proud punctilio the principles of right and of fair play we profess to be fighting for.'),
 Document(id='18b31252-1a2c-4ac0-a4fd-34d60562a897',
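`as_retriever()` wraps the store so that it can be queried through a single `invoke(query)` call with the search settings fixed up front. The sketch below shows that wrapping pattern in plain Python; `ToyRetriever` and `keyword_search` are hypothetical names, and the keyword-overlap backend is a stand-in for real vector search, not LangChain's retriever.

```python
class ToyRetriever:
    """Illustrative stand-in for a retriever; not LangChain's class."""

    def __init__(self, search_fn, k=4):
        self.search_fn = search_fn  # callable: (query, k) -> list of results
        self.k = k                  # number of results, fixed at creation

    def invoke(self, query):
        return self.search_fn(query, self.k)


def keyword_search(query, k):
    """Hypothetical backend: rank sentences by words shared with the query."""
    corpus = [
        "the world must be made safe for democracy",
        "we desire no conquest",
        "we have no selfish ends to serve",
    ]
    query_words = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda sentence: len(query_words & set(sentence.split())),
        reverse=True,
    )
    return ranked[:k]


retriever_sketch = ToyRetriever(keyword_search, k=1)
hits = retriever_sketch.invoke("world made safe")
print(hits)
```

Callers only need to know about `invoke`, so the search backend and its parameters can change without touching the calling code.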

### Saving and Loading

In [81]:
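# Persist the index and its documents to the "faiss_index" folder
vector_store.save_local("faiss_index")

# Reload from disk. Only enable the flag below for index files you
# created yourself or otherwise trust.
new_vector_store = FAISS.load_local(
    folder_path="faiss_index",
    embeddings=embeddings,
    allow_dangerous_deserialization=True
)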

docs = new_vector_store.similarity_search("what world  made safe for?")

In [82]:
docs

[Document(id='ab35de26-937b-4ae5-ad3e-d0c443e2416b', metadata={'source': '../1.document_loader/speech.txt'}, page_content='The world must be made safe for democracy. Its peace must be planted upon the tested foundations of political liberty. We have no selfish ends to serve. We desire no conquest, no dominion. We seek no indemnities for ourselves, no material compensation for the sacrifices we shall freely make. We are but one of the champions of the rights of mankind. We shall be satisfied when those rights have been made as secure as the faith and the freedom of nations can make them.\n\nJust because we fight without rancor and without selfish object, seeking nothing for ourselves but what we shall wish to share with all free peoples, we shall, I feel confident, conduct our operations as belligerents without passion and ourselves observe with proud punctilio the principles of right and of fair play we profess to be fighting for.'),
 Document(id='18b31252-1a2c-4ac0-a4fd-34d60562a897',
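The `allow_dangerous_deserialization=True` flag is needed because, as far as I understand, the saved folder includes a pickle file alongside the FAISS index, and unpickling can execute code chosen by whoever wrote the file. The sketch below demonstrates that property of `pickle` with a harmless expression; `Payload` is a hypothetical class made up for this demonstration and has nothing to do with FAISS itself.

```python
import pickle


class Payload:
    """Hypothetical object whose pickle runs code when it is loaded."""

    def __reduce__(self):
        # Tells pickle: "to rebuild this object, call eval('6 * 7')".
        # An attacker could put any callable and arguments here.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# Loading does not give back a Payload; it runs the embedded call instead.
restored = pickle.loads(blob)
print(restored)
```

This is why a saved index should only be loaded when it comes from a source you trust.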