### Faiss 
Facebook AI Similarity Search (Faiss) is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also contains supporting code for <b>evaluation</b> and <b>parameter tuning</b>.

In [1]:
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import CharacterTextSplitter
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import FAISS
from rich import print 
from dotenv import load_dotenv 


In [2]:
loader=TextLoader("../3.2-dataIngestion/speech.txt")

In [8]:
doc_loader=loader.load()
text_splitter=CharacterTextSplitter(chunk_size=1100,chunk_overlap=30)
docs=text_splitter.split_documents(doc_loader)
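
To make the roles of chunk_size and chunk_overlap concrete, here is a minimal pure-Python sketch. It uses a plain sliding window over characters, which is a simplification: CharacterTextSplitter splits on a separator first and then merges pieces, so its chunk boundaries will differ. The function name `chunk_text` and the sample string are made up for illustration.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters. This is a simplified
    illustration, not the separator-based algorithm used by the library.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"  # hypothetical input
chunks = chunk_text(sample, chunk_size=10, chunk_overlap=3)
for chunk in chunks:
    print(chunk)
```

Each chunk begins with the last three characters of the previous one, which is the purpose of the overlap: text that straddles a boundary still appears whole in at least one chunk.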

In [9]:
print(docs[0:2])

In [15]:
embedding=OllamaEmbeddings()
db=FAISS.from_documents(docs,embedding)

In [16]:
db

<langchain_community.vectorstores.faiss.FAISS at 0x119a6ecb0>

In [18]:
### querying 
query="What does the speaker believe is the main reason the United States should enter the war?"

results=db.similarity_search(query,k=1)
print(results)

In [21]:
query2="how does the speaker describe the desired outcome of the war?"
results2=db.similarity_search(query2,k=1)
print(results2)


#### As a Retriever 

We can also convert the vectorstore into a Retriever class. This allows us to easily use it in other LangChain methods, which largely work with retrievers. 

In [22]:
retriever=db.as_retriever()

In [26]:
print(retriever.invoke(query))
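
The idea behind a retriever can be sketched without LangChain: it is a thin wrapper that fixes the search settings and exposes a single invoke(query) method. The classes below (`ToyStore`, `ToyRetriever`) are hypothetical stand-ins, and ranking by shared words replaces real embeddings purely to keep the example self-contained. In LangChain itself the number of results is passed as `db.as_retriever(search_kwargs={"k": 2})` rather than as a direct `k` argument.

```python
class ToyStore:
    """Hypothetical stand-in for a vector store; ranks texts by shared words."""

    def __init__(self, texts):
        self.texts = list(texts)

    def similarity_search(self, query, k=4):
        query_words = set(query.lower().split())

        def overlap(text):
            # Number of distinct words the text shares with the query.
            return len(query_words & set(text.lower().split()))

        return sorted(self.texts, key=overlap, reverse=True)[:k]

    def as_retriever(self, k=4):
        return ToyRetriever(self, k)


class ToyRetriever:
    """Wraps a store so callers only need invoke(query)."""

    def __init__(self, store, k):
        self.store = store
        self.k = k

    def invoke(self, query):
        return self.store.similarity_search(query, k=self.k)


store = ToyStore([
    "the war must end in peace",
    "taxes rose last year",
    "peace without victory",
])
retriever = store.as_retriever(k=2)
hits = retriever.invoke("peace after the war")
print(hits)
```

Because every retriever exposes the same invoke interface, downstream code does not need to know which store, or which search settings, sit behind it.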

#### Similarity Search with Score
These are some Faiss-specific methods. One of them is similarity_search_with_score, which returns not only the documents but also the distance score between the query and each of them. The returned score is an L2 distance, so a lower score is better.

In [29]:
docs_score=db.similarity_search_with_score(query)
print(docs_score)  # L2 (Euclidean) distance; lower means more similar
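
As a rough illustration of what an L2 score means, the sketch below computes Euclidean distances by hand and sorts in ascending order. The helper names (`l2_distance`, `search_with_score`) and the two-dimensional vectors are invented for the example; real embeddings have hundreds of dimensions. Faiss's flat L2 index is, as far as I know, documented as returning squared distances, which would change the numbers but not the ranking.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search_with_score(query_vec, vectors, k=4):
    """Return the k nearest (name, distance) pairs, closest first."""
    scored = [(name, l2_distance(query_vec, vec)) for name, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1])  # ascending: lower score is better
    return scored[:k]


# Hypothetical toy "embeddings".
vectors = {
    "doc_a": [3.0, 4.0],
    "doc_b": [1.0, 0.0],
    "doc_c": [0.0, 2.0],
}
results = search_with_score([0.0, 0.0], vectors, k=2)
print(results)
```

Here doc_b comes first because it lies closest to the query vector, and doc_a is dropped because it is the farthest of the three.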

In [32]:
embedding_query=embedding.embed_query(query)

vector_similarity=db.similarity_search_by_vector(embedding_query)
print(vector_similarity[0])

#### Saving and Loading the Vector DB

In [33]:
db.save_local("faiss_index")

In [34]:
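# Loading unpickles the stored docstore, so enable this flag only for index files you created yourself
new_db=FAISS.load_local("faiss_index",embedding,allow_dangerous_deserialization=True)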

In [35]:
print(new_db.similarity_search(query)[0])