##### Facebook AI Similarity Search (Faiss) is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also contains supporting code for evaluation and parameter tuning.

In [2]:
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import CharacterTextSplitter
from langchain_ollama import OllamaEmbeddings
from langchain_community.vectorstores import FAISS

# Step 1: Load the Document
loader = TextLoader('./data/speech.txt')
documents = loader.load()

# Step 2: Split the text into chunks
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=30)
docs = text_splitter.split_documents(documents)
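
To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of fixed-window chunking with overlap. It is a simplification, not how `CharacterTextSplitter` is implemented: the real splitter first splits on a separator (`"\n\n"` by default) and then merges the pieces, so its chunk boundaries will differ. The function name `split_with_overlap` is hypothetical.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Cut text into windows of chunk_size characters.

    Each window starts chunk_size - chunk_overlap characters after the
    previous one, so neighbouring chunks share chunk_overlap characters.
    Hypothetical helper for illustration only.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


chunks = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=1)
print(chunks)
```

The overlap means the last character of each chunk is repeated at the start of the next one, which helps keep a sentence that straddles a boundary retrievable from either chunk.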

In [3]:
# Step 3: Create Vector Store
embeddings = OllamaEmbeddings(model="gemma:2b")
db = FAISS.from_documents(docs, embeddings)
db

<langchain_community.vectorstores.faiss.FAISS at 0x149f46d2fe0>

In [4]:
# Step 4: Querying
query = "Kalam collapsed and died from an apparent cardiac arrest?"
docs = db.similarity_search(query)
docs[0].page_content

'Kalam was elected as the president of India in 2002 with the support of both the ruling Bharatiya Janata Party and the then-opposition Indian National Congress. He was widely referred to as the "People\'s President". He engaged in teaching, writing and public service after his presidency. He was a recipient of several awards, including the Bharat Ratna, India\'s highest civilian honour.\n\nWhile delivering a lecture at IIM Shillong, Kalam collapsed and died from an apparent cardiac arrest on 27 July 2015, aged 83. Thousands attended the funeral ceremony held in his hometown of Rameswaram, where he was buried with full state honours. A memorial was inaugurated near his home town in 2017.'

### As a Retriever

We can also convert the vector store into a Retriever. This lets us use it with other LangChain components, most of which work with retrievers rather than with vector stores directly.
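
The idea behind a retriever can be sketched without LangChain: it is a thin wrapper that exposes a single `invoke(query)` call and hides which store, and which search settings, sit behind it. The classes below (`KeywordStore`, `SimpleRetriever`) are hypothetical stand-ins, and the word-overlap scoring is a toy replacement for embedding similarity, not what FAISS does.

```python
class KeywordStore:
    """Toy store that ranks texts by the number of words shared with the query."""

    def __init__(self, texts):
        self.texts = texts

    def similarity_search(self, query, k=4):
        query_words = set(query.lower().split())
        # Higher overlap first; sorted() is stable, so ties keep input order.
        ranked = sorted(
            self.texts,
            key=lambda text: -len(query_words & set(text.lower().split())),
        )
        return ranked[:k]


class SimpleRetriever:
    """Wraps a store behind a uniform invoke(query) interface."""

    def __init__(self, store, k=4):
        self.store = store
        self.k = k

    def invoke(self, query):
        return self.store.similarity_search(query, k=self.k)


texts = [
    "Kalam was elected president in 2002",
    "Faiss indexes dense vectors",
    "Kalam died in 2015",
]
toy_retriever = SimpleRetriever(KeywordStore(texts), k=2)
results = toy_retriever.invoke("when was Kalam elected")
print(results)
```

Code that only calls `invoke` does not need to change if the store behind the retriever is swapped out.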

In [5]:
retriever = db.as_retriever()
docs = retriever.invoke(query)
docs[0].page_content

'Kalam was elected as the president of India in 2002 with the support of both the ruling Bharatiya Janata Party and the then-opposition Indian National Congress. He was widely referred to as the "People\'s President". He engaged in teaching, writing and public service after his presidency. He was a recipient of several awards, including the Bharat Ratna, India\'s highest civilian honour.\n\nWhile delivering a lecture at IIM Shillong, Kalam collapsed and died from an apparent cardiac arrest on 27 July 2015, aged 83. Thousands attended the funeral ceremony held in his hometown of Rameswaram, where he was buried with full state honours. A memorial was inaugurated near his home town in 2017.'

### Similarity Search with Score

There are some FAISS-specific methods. One of them is similarity_search_with_score, which returns not only the documents but also the distance between the query and each document. The returned score is an L2 distance, so a lower score means a closer match.
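
As a rough illustration of why a lower score is better, the sketch below ranks a few hand-made 2-D vectors by their L2 (Euclidean) distance to a query vector, using only the standard library. The vectors and the helper `rank_by_l2` are made up for the example. Depending on the index type, Faiss may report the squared distance rather than the distance itself; the ordering is the same either way.

```python
import math


def rank_by_l2(query_vec, doc_vecs):
    """Return (index, distance) pairs sorted from closest to farthest."""
    scored = [(i, math.dist(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1])


# Made-up vectors standing in for real embeddings.
query_vec = [1.0, 0.0]
doc_vecs = [
    [0.0, 1.0],   # points in a different direction
    [0.9, 0.1],   # almost the same as the query
    [-1.0, 0.0],  # opposite of the query
]

ranking = rank_by_l2(query_vec, doc_vecs)
for index, distance in ranking:
    print(index, round(distance, 4))
```

The vector closest to the query gets the smallest distance and is listed first, which matches the order in which similarity_search_with_score returns its results.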

In [6]:
doc_and_score = db.similarity_search_with_score(query)
doc_and_score

[(Document(id='cba730d6-d2e8-45b8-b3a0-c6099b6c7ee1', metadata={'source': './data/speech.txt'}, page_content='Kalam was elected as the president of India in 2002 with the support of both the ruling Bharatiya Janata Party and the then-opposition Indian National Congress. He was widely referred to as the "People\'s President". He engaged in teaching, writing and public service after his presidency. He was a recipient of several awards, including the Bharat Ratna, India\'s highest civilian honour.\n\nWhile delivering a lecture at IIM Shillong, Kalam collapsed and died from an apparent cardiac arrest on 27 July 2015, aged 83. Thousands attended the funeral ceremony held in his hometown of Rameswaram, where he was buried with full state honours. A memorial was inaugurated near his home town in 2017.'),
  np.float32(0.36573225)),
 (Document(id='7c6ec62e-b4fd-47f3-a158-5a0901cc1d83', metadata={'source': './data/speech.txt'}, page_content='Avul Pakir Jainulabdeen Abdul Kalam (/ˈʌbdʊl kəˈlɑːm/ 

In [7]:
embedding_vector = embeddings.embed_query(query)
embedding_vector

[0.006806112,
 -0.0056338967,
 -0.0021597461,
 0.023255391,
 0.016611401,
 -0.004846678,
 0.002894806,
 -0.0096783815,
 0.02136845,
 0.0132050505,
 0.00853634,
 -0.0035037019,
 -0.0017079404,
 0.019256108,
 -0.0061722347,
 -0.016424637,
 0.094363935,
 0.031179016,
 -0.012750671,
 -0.004946361,
 0.0048655914,
 -0.012644073,
 0.014425093,
 -0.005523108,
 -0.0055928915,
 -0.009780128,
 0.0038597605,
 -0.0033803901,
 0.003024939,
 0.004978002,
 -0.027426856,
 0.01163151,
 -0.0075990376,
 -0.00844644,
 -0.002672986,
 -0.0054438217,
 0.006479909,
 0.0032023785,
 0.0014943454,
 -0.015099433,
 -0.013831897,
 -0.010183958,
 0.01838612,
 -0.022309572,
 -0.00028813622,
 0.025679236,
 -0.0030657898,
 -0.012224759,
 0.0008124517,
 -0.007933902,
 -0.22710589,
 -0.33051252,
 -0.009988688,
 0.01134389,
 -0.010424019,
 0.013606397,
 -0.022808231,
 -0.0006667442,
 -0.0061033196,
 -0.020028396,
 -0.0035553991,
 -0.0075071207,
 -0.024807151,
 -6.518239e-05,
 0.01880724,
 0.00043054984,
 -0.0102721965,
 0.

In [8]:
doc_score = db.similarity_search_by_vector(embedding_vector)
doc_score

[Document(id='cba730d6-d2e8-45b8-b3a0-c6099b6c7ee1', metadata={'source': './data/speech.txt'}, page_content='Kalam was elected as the president of India in 2002 with the support of both the ruling Bharatiya Janata Party and the then-opposition Indian National Congress. He was widely referred to as the "People\'s President". He engaged in teaching, writing and public service after his presidency. He was a recipient of several awards, including the Bharat Ratna, India\'s highest civilian honour.\n\nWhile delivering a lecture at IIM Shillong, Kalam collapsed and died from an apparent cardiac arrest on 27 July 2015, aged 83. Thousands attended the funeral ceremony held in his hometown of Rameswaram, where he was buried with full state honours. A memorial was inaugurated near his home town in 2017.'),
 Document(id='7c6ec62e-b4fd-47f3-a158-5a0901cc1d83', metadata={'source': './data/speech.txt'}, page_content='Avul Pakir Jainulabdeen Abdul Kalam (/ˈʌbdʊl kəˈlɑːm/ ⓘ UB-duul kə-LAHM; 15 October

In [9]:
# Saving and loading the index
db.save_local('faiss_index')

In [13]:
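# load_local unpickles the stored docstore, so only set
# allow_dangerous_deserialization=True for index files you created or trust.
new_db = FAISS.load_local('./faiss_index', embeddings, allow_dangerous_deserialization=True)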
docs = new_db.similarity_search(query)
docs

[Document(id='cba730d6-d2e8-45b8-b3a0-c6099b6c7ee1', metadata={'source': './data/speech.txt'}, page_content='Kalam was elected as the president of India in 2002 with the support of both the ruling Bharatiya Janata Party and the then-opposition Indian National Congress. He was widely referred to as the "People\'s President". He engaged in teaching, writing and public service after his presidency. He was a recipient of several awards, including the Bharat Ratna, India\'s highest civilian honour.\n\nWhile delivering a lecture at IIM Shillong, Kalam collapsed and died from an apparent cardiac arrest on 27 July 2015, aged 83. Thousands attended the funeral ceremony held in his hometown of Rameswaram, where he was buried with full state honours. A memorial was inaugurated near his home town in 2017.'),
 Document(id='7c6ec62e-b4fd-47f3-a158-5a0901cc1d83', metadata={'source': './data/speech.txt'}, page_content='Avul Pakir Jainulabdeen Abdul Kalam (/ˈʌbdʊl kəˈlɑːm/ ⓘ UB-duul kə-LAHM; 15 October