Facebook AI Similarity Search (FAISS) is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also includes supporting code for evaluation and parameter tuning.
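To make "similarity search" concrete, here is a minimal pure-Python sketch of an exhaustive nearest-neighbour search using squared L2 distance. It only illustrates the idea and is not how FAISS is implemented; the function names and the toy vectors are made up for this example.

```python
def l2_squared(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_search(vectors, query, k=2):
    """Compare the query to every stored vector and return the k closest
    as (index, distance) pairs, smallest distance first."""
    scored = [(i, l2_squared(v, query)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


# Toy 2-dimensional "embeddings" (illustrative values only)
vectors = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.9, 1.2]]
hits = brute_force_search(vectors, [1.0, 1.0], k=2)
print(hits)
```

Real embeddings have hundreds or thousands of dimensions, which is where an optimized library pays off compared with a loop like this one.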

In [8]:
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import OllamaEmbeddings
from langchain_text_splitters import CharacterTextSplitter
# Load the source text and split it into overlapping chunks
loader = TextLoader("speech.txt")
text_document = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=50)
text_splitter_result = text_splitter.split_documents(text_document)
# Embed each chunk with a local Ollama model and index the vectors in FAISS
embeddings = OllamaEmbeddings(model="gemma2:2b")
db = FAISS.from_documents(text_splitter_result, embeddings)
# Query the database
query = "what was the key feature of the langchain explain in 10 words."
result_of_query = db.similarity_search(query)
result_of_query[0].page_content

'Key Features of LangChain\nChains:\n\nCombines multiple LLM calls and other logic into a single sequence or workflow.\nExample: Querying a database, summarizing the data, and generating a user-friendly response.\nAgents:\n\nAllows applications to dynamically make decisions about which actions to take next based on user input or context.\nExample: Querying an API, fetching documents, or interacting with external tools like search engines or calculators.\nMemory:'
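The cell above splits the text with `chunk_size=500` and `chunk_overlap=50`. The sketch below shows what an overlap between consecutive chunks means, using a simplified fixed-window splitter. This is an assumption-laden illustration: `CharacterTextSplitter` actually splits on a separator and then merges pieces, so its chunk boundaries will differ. The function name `split_with_overlap` is hypothetical.

```python
def split_with_overlap(text, chunk_size=500, chunk_overlap=50):
    """Cut text into windows of chunk_size characters, where each window
    starts chunk_overlap characters before the previous one ended."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghij" * 120  # 1200 characters of filler text
chunks = split_with_overlap(sample)
print([len(c) for c in chunks])
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, at the cost of storing some text twice.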

# As a retriever
We can also convert the vector store into a retriever. This makes it easy to use with other LangChain components.

In [9]:
retriver = db.as_retriever()
retriver_result = retriver.invoke(query)
retriver_result[0].page_content

'Key Features of LangChain\nChains:\n\nCombines multiple LLM calls and other logic into a single sequence or workflow.\nExample: Querying a database, summarizing the data, and generating a user-friendly response.\nAgents:\n\nAllows applications to dynamically make decisions about which actions to take next based on user input or context.\nExample: Querying an API, fetching documents, or interacting with external tools like search engines or calculators.\nMemory:'
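The value of a retriever is the uniform interface: callers only need `invoke(query)` and do not care what sits behind it. The toy sketch below shows that adapter idea in plain Python. `ToyVectorStore` and `ToyRetriever` are hypothetical stand-ins, and the word-overlap ranking is a placeholder for real embedding similarity; none of this is LangChain's implementation.

```python
class ToyVectorStore:
    """Hypothetical stand-in for a vector store; ranks texts by shared words."""

    def __init__(self, texts):
        self.texts = texts

    def similarity_search(self, query, k=4):
        query_words = set(query.lower().split())
        ranked = sorted(
            self.texts,
            key=lambda text: len(query_words & set(text.lower().split())),
            reverse=True,
        )
        return ranked[:k]


class ToyRetriever:
    """Wraps a store so callers only need a single invoke(query) method."""

    def __init__(self, store, k=4):
        self.store = store
        self.k = k

    def invoke(self, query):
        return self.store.similarity_search(query, k=self.k)


store = ToyVectorStore([
    "chains combine llm calls",
    "agents choose actions",
    "memory keeps conversational state",
])
toy_retriever = ToyRetriever(store, k=1)
top = toy_retriever.invoke("how do agents choose what to do")
print(top)
```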

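# Similarity search with score
- `similarity_search_with_score` returns each document together with its distance score. With the default FAISS index this is an L2 distance, so a lower score means a closer match.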

In [10]:
similarity_score_result = db.similarity_search_with_score(query)
similarity_score_result

[(Document(id='9625229f-2d47-452f-a22f-1e2d75c48969', metadata={'source': 'speech.txt'}, page_content='Key Features of LangChain\nChains:\n\nCombines multiple LLM calls and other logic into a single sequence or workflow.\nExample: Querying a database, summarizing the data, and generating a user-friendly response.\nAgents:\n\nAllows applications to dynamically make decisions about which actions to take next based on user input or context.\nExample: Querying an API, fetching documents, or interacting with external tools like search engines or calculators.\nMemory:'),
  np.float32(4990.5654)),
 (Document(id='6ca218b8-1b05-42c4-a7a3-a170b42eab54', metadata={'source': 'speech.txt'}, page_content='Adds the ability to maintain conversational state over time.\nUseful for chatbots or applications requiring context persistence across interactions.\nData Integration:\n\nFacilitates connecting LLMs to external knowledge sources, including databases, APIs, and local files, to enrich responses.\nExa
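Because the scores above are distances, smaller values indicate closer matches. One way to use them is to drop results beyond a cutoff. The sketch below works on plain `(text, score)` pairs with made-up values; the helper name `filter_by_distance` and the threshold are assumptions, and a sensible cutoff depends on the embedding model.

```python
def filter_by_distance(scored_results, max_distance):
    """Keep (item, distance) pairs at or below max_distance, closest first."""
    kept = [(item, score) for item, score in scored_results if score <= max_distance]
    return sorted(kept, key=lambda pair: pair[1])


# Illustrative pairs only; real results would hold Document objects
scored = [
    ("chunk about memory", 5210.0),
    ("chunk about chains", 4990.5),
    ("unrelated chunk", 9000.0),
]
close = filter_by_distance(scored, max_distance=6000.0)
print(close)
```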

## Saving and loading locally

In [15]:
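# Persist the index and its document store to the "faiss_db" folder
db.save_local("faiss_db")
# Loading unpickles the stored documents, so only enable
# allow_dangerous_deserialization for files you created yourself
new_db = FAISS.load_local("faiss_db", embeddings, allow_dangerous_deserialization=True)
new_db.similarity_search(query)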

[Document(id='9625229f-2d47-452f-a22f-1e2d75c48969', metadata={'source': 'speech.txt'}, page_content='Key Features of LangChain\nChains:\n\nCombines multiple LLM calls and other logic into a single sequence or workflow.\nExample: Querying a database, summarizing the data, and generating a user-friendly response.\nAgents:\n\nAllows applications to dynamically make decisions about which actions to take next based on user input or context.\nExample: Querying an API, fetching documents, or interacting with external tools like search engines or calculators.\nMemory:'),
 Document(id='6ca218b8-1b05-42c4-a7a3-a170b42eab54', metadata={'source': 'speech.txt'}, page_content='Adds the ability to maintain conversational state over time.\nUseful for chatbots or applications requiring context persistence across interactions.\nData Integration:\n\nFacilitates connecting LLMs to external knowledge sources, including databases, APIs, and local files, to enrich responses.\nExample: Answering queries usin