FAISS (Facebook AI Similarity Search) is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also contains supporting code for evaluation and parameter tuning.
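
To make "similarity search over dense vectors" concrete, here is a minimal pure-Python sketch of exhaustive nearest-neighbor search under Euclidean distance. It is not FAISS code: the function names and the toy 2-D vectors are made up for illustration, and FAISS offers far faster exact and approximate indexes for the same task.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_search(vectors, query_vector, k=3):
    """Return (index, distance) pairs for the k vectors closest to query_vector."""
    scored = [(i, l2_distance(v, query_vector)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return scored[:k]


# Toy 2-D "embeddings" (hypothetical values, not produced by any model)
toy_vectors = [
    [0.0, 0.0],
    [1.0, 1.0],
    [5.0, 5.0],
    [0.9, 1.2],
]
toy_query = [1.0, 1.0]

nearest = brute_force_search(toy_vectors, toy_query, k=2)
print(nearest)
```

Every stored vector is compared with the query here, so the cost grows linearly with the collection size. Avoiding that full scan on large collections is the main reason to use a dedicated library.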

In [18]:
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import OllamaEmbeddings
from langchain_text_splitters import CharacterTextSplitter

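# Load the source text file as LangChain documents
loader = TextLoader("documents/sample.txt")
docs = loader.load()

# Split into chunks; CharacterTextSplitter splits on "\n\n" by default,
# so a piece without that separator can exceed chunk_size
text_splitter = CharacterTextSplitter(chunk_size=100, chunk_overlap=10)
docs_split = text_splitter.split_documents(docs)

# Embed each chunk with a local Ollama model and build the FAISS index
embeddings = OllamaEmbeddings(model="codellama")
db = FAISS.from_documents(docs_split, embeddings)
# Persist the index to the "faiss_vectorstore" directory
db.save_local("faiss_vectorstore")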

In [19]:
query = "Attention mechanisms have become an integral part of compelling sequence modeling"

retrieved_docs = db.similarity_search(query, k=3)
retrieved_docs[0].page_content


'1 Introduction\nRecurrent neural networks, long short-term memory [13] and gated recurrent [7] neural networks\nin particular, have been firmly established as state of the art approaches in sequence modeling and\ntransduction problems such as language modeling and machine translation [35, 2, 5]. Numerous\nefforts have since continued to push the boundaries of recurrent language models and encoder-decoder\narchitectures [38, 24, 15].\nRecurrent models typically factor computation along the symbol positions of the input and output\nsequences. Aligning the positions to steps in computation time, they generate a sequence of hidden\nstates ht, as a function of the previous hidden state ht−1 and the input for position t. This inherently\nsequential nature precludes parallelization within training examples, which becomes critical at longer\nsequence lengths, as memory constraints limit batching across examples. Recent work has achieved\nsignificant improvements in computational efficiency th

In [20]:
for i, doc in enumerate(retrieved_docs):
    print(f"Document {i+1}:\n{doc.page_content}\n") 

Document 1:
1 Introduction
Recurrent neural networks, long short-term memory [13] and gated recurrent [7] neural networks
in particular, have been firmly established as state of the art approaches in sequence modeling and
transduction problems such as language modeling and machine translation [35, 2, 5]. Numerous
efforts have since continued to push the boundaries of recurrent language models and encoder-decoder
architectures [38, 24, 15].
Recurrent models typically factor computation along the symbol positions of the input and output
sequences. Aligning the positions to steps in computation time, they generate a sequence of hidden
states ht, as a function of the previous hidden state ht−1 and the input for position t. This inherently
sequential nature precludes parallelization within training examples, which becomes critical at longer
sequence lengths, as memory constraints limit batching across examples. Recent work has achieved
significant improvements in computational efficiency th

### As a Retriever

We can also convert the vector store into a Retriever. This makes it easy to use in other LangChain methods, which largely work with retrievers.

In [None]:
retriever = db.as_retriever()
retriever.invoke(query)
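
The idea behind a retriever can be sketched without LangChain: it is a thin wrapper that hides the store's search method behind a single `invoke(query)` call, so downstream code does not need to know which store sits underneath. Everything below (`ToyVectorStore`, `ToyRetriever`, the bag-of-words `embed` function) is hypothetical and only illustrates the pattern; it is not how LangChain implements it.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lowercase bag-of-words counts (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical store that keeps each text next to its embedding."""

    def __init__(self, texts):
        self.items = [(t, embed(t)) for t in texts]

    def similarity_search(self, query_text, k=4):
        q = embed(query_text)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

    def as_retriever(self, k=4):
        return ToyRetriever(self, k)


class ToyRetriever:
    """Hypothetical wrapper exposing one method: invoke(query_text)."""

    def __init__(self, store, k):
        self.store = store
        self.k = k

    def invoke(self, query_text):
        return self.store.similarity_search(query_text, k=self.k)


toy_store = ToyVectorStore([
    "attention mechanisms in sequence modeling",
    "recurrent neural networks process tokens sequentially",
    "faiss builds an index over dense vectors",
])
toy_retriever = toy_store.as_retriever(k=1)
result = toy_retriever.invoke("attention in sequence modeling")
print(result)
```

Because callers depend only on `invoke`, the store behind the retriever can be swapped without touching the code that consumes the results.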