## FAISS

Facebook AI Similarity Search (FAISS) is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also contains supporting code for evaluation and parameter tuning.

In [7]:
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import OllamaEmbeddings
from langchain_text_splitters import CharacterTextSplitter

In [15]:
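# Load the raw text file as a list of Document objects
loader = TextLoader('speech.txt')
document = loader.load()
# Split into chunks of at most 1000 characters, with up to 20 characters of overlap
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=20)
docs = text_splitter.split_documents(document)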
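
To make `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of fixed-width chunking with overlap. It is a toy, not LangChain's algorithm: as far as I understand, `CharacterTextSplitter` first splits on a separator (a blank line by default) and then merges the pieces up to `chunk_size`, which is why the chunks in this notebook follow paragraph boundaries. The function name `chunk_text` is hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 20) -> list[str]:
    """Toy fixed-width chunker (hypothetical helper, not LangChain's implementation)."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window moves each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = chunk_text(sample, chunk_size=10, chunk_overlap=2)
print(chunks)
```

Each chunk starts with the last two characters of the previous one, so a sentence cut at a boundary still appears whole in at least part of a neighbouring chunk.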

In [16]:
docs

[Document(metadata={'source': 'speech.txt'}, page_content='We are living in one of the most important technological moments in human history. For the first time, machines are not only helping us calculate numbers or store information, but they are beginning to understand language, generate ideas, and create content that feels human-like. This transformation is being driven by Large Language Models and Generative Artificial Intelligence, commonly known as LLMs and GenAI. These technologies are reshaping education, healthcare, business, creativity, and the future of work itself.'),
 Document(metadata={'source': 'speech.txt'}, page_content='Artificial intelligence did not appear suddenly. It evolved slowly over decades. In the beginning, computers followed fixed rules written by programmers. If a condition was met, the machine performed a specific action. This was useful but limited. Later, machine learning allowed systems to learn from data instead of rules. Then deep learning introduced

In [20]:
# Embed every chunk with a local Ollama model and index the vectors in FAISS
embedding = OllamaEmbeddings(model='gemma:2b')
db = FAISS.from_documents(docs, embedding)

In [21]:
db

<langchain_community.vectorstores.faiss.FAISS at 0x1b0fd08a0b0>

In [22]:
# Querying
query = "What is Generative AI?"
doc = db.similarity_search(query)
doc

[Document(id='495f8d1c-2888-4dee-b2b2-8b5b41b5b0bc', metadata={'source': 'speech.txt'}, page_content='Artificial intelligence did not appear suddenly. It evolved slowly over decades. In the beginning, computers followed fixed rules written by programmers. If a condition was met, the machine performed a specific action. This was useful but limited. Later, machine learning allowed systems to learn from data instead of rules. Then deep learning introduced neural networks inspired by the human brain, enabling computers to recognize images, understand speech, and detect patterns. Now we have reached the generative era, where machines are not just analyzing existing information but creating new content. This is a fundamental shift in what technology can do.'),
 Document(id='94322591-42c6-41dc-801a-cacaa7ebada7', metadata={'source': 'speech.txt'}, page_content='Generative AI goes beyond language. It includes systems that can create images, videos, music, designs, and even virtual worlds. An a

In [23]:
doc[1].page_content

'Generative AI goes beyond language. It includes systems that can create images, videos, music, designs, and even virtual worlds. An artist can describe a scene, and AI can generate a realistic painting. A musician can hum a tune, and AI can produce a full song. A business can describe a logo idea, and AI can design multiple versions instantly. Creativity, once limited by time and skill, is now amplified by technology.'

In [24]:
doc[0].page_content

'Artificial intelligence did not appear suddenly. It evolved slowly over decades. In the beginning, computers followed fixed rules written by programmers. If a condition was met, the machine performed a specific action. This was useful but limited. Later, machine learning allowed systems to learn from data instead of rules. Then deep learning introduced neural networks inspired by the human brain, enabling computers to recognize images, understand speech, and detect patterns. Now we have reached the generative era, where machines are not just analyzing existing information but creating new content. This is a fundamental shift in what technology can do.'

### As a Retriever
We can also convert the vector DB into a retriever. This lets us plug it into other LangChain methods, most of which work with retrievers.
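
The idea can be sketched in plain Python, assuming a toy store that ranks texts by shared words instead of embeddings. `ToyStore` and `ToyRetriever` are hypothetical names for illustration only; this is not how LangChain implements its retriever class, it only shows the wrapping pattern: the retriever hides the store behind a single `invoke(query)` call.

```python
class ToyRetriever:
    """Hypothetical wrapper exposing a store through one invoke() method."""

    def __init__(self, store, k):
        self._store = store
        self._k = k

    def invoke(self, query):
        # Delegate to the store; callers never need to know what kind of store it is
        return self._store.similarity_search(query, k=self._k)


class ToyStore:
    """Hypothetical store that scores texts by word overlap with the query."""

    def __init__(self, texts):
        self.texts = list(texts)

    def similarity_search(self, query, k=2):
        query_words = set(query.lower().split())
        ranked = sorted(
            self.texts,
            key=lambda text: len(query_words & set(text.lower().split())),
            reverse=True,  # most shared words first
        )
        return ranked[:k]

    def as_retriever(self, k=2):
        return ToyRetriever(self, k)


store = ToyStore([
    "generative ai creates images",
    "computers followed fixed rules",
    "machine learning learns from data",
])
retriever = store.as_retriever(k=1)
result = retriever.invoke("what is generative ai")
print(result)
```

With the real vector store, the number of returned documents can be set in a similar way, e.g. `db.as_retriever(search_kwargs={"k": 1})`.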

In [26]:
retrieval = db.as_retriever()
retrieval.invoke(query)

[Document(id='495f8d1c-2888-4dee-b2b2-8b5b41b5b0bc', metadata={'source': 'speech.txt'}, page_content='Artificial intelligence did not appear suddenly. It evolved slowly over decades. In the beginning, computers followed fixed rules written by programmers. If a condition was met, the machine performed a specific action. This was useful but limited. Later, machine learning allowed systems to learn from data instead of rules. Then deep learning introduced neural networks inspired by the human brain, enabling computers to recognize images, understand speech, and detect patterns. Now we have reached the generative era, where machines are not just analyzing existing information but creating new content. This is a fundamental shift in what technology can do.'),
 Document(id='94322591-42c6-41dc-801a-cacaa7ebada7', metadata={'source': 'speech.txt'}, page_content='Generative AI goes beyond language. It includes systems that can create images, videos, music, designs, and even virtual worlds. An a

In [27]:
docs[0].page_content

'We are living in one of the most important technological moments in human history. For the first time, machines are not only helping us calculate numbers or store information, but they are beginning to understand language, generate ideas, and create content that feels human-like. This transformation is being driven by Large Language Models and Generative Artificial Intelligence, commonly known as LLMs and GenAI. These technologies are reshaping education, healthcare, business, creativity, and the future of work itself.'

### Similarity Search with Score

There are some FAISS-specific methods. One of them is similarity_search_with_score, which returns not only the documents but also the distance score of the query to each of them. The returned distance score is an L2 distance, so a lower score is better.
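
A small sketch of why a lower score is better, using hand-made 2-D vectors in place of real embeddings. The helper names (`l2_distance`, `search_with_score`) and the vectors are made up for illustration. Note that FAISS's flat L2 index, as far as I know, reports the squared distance without the square root; that changes the numbers but not the ranking.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search_with_score(query_vec, store, k=2):
    """Return the k closest (text, distance) pairs, smallest distance first."""
    scored = [(text, l2_distance(query_vec, vec)) for text, vec in store]
    scored.sort(key=lambda pair: pair[1])  # lower distance = more similar
    return scored[:k]


# Hypothetical "embeddings": each text is paired with a made-up 2-D vector
store = [
    ("chunk about history", [0.0, 0.0]),
    ("chunk about generative AI", [3.0, 4.0]),
    ("chunk about ethics", [6.0, 8.0]),
]
query_vec = [3.0, 3.0]

results = search_with_score(query_vec, store, k=2)
for text, score in results:
    print(f"{score:.3f}  {text}")
```

The same ranking logic is what makes similarity search by vector work: once the query is embedded, only the vector is needed to find the nearest chunks.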

In [28]:
docs_and_score = db.similarity_search_with_score(query)
docs_and_score

[(Document(id='495f8d1c-2888-4dee-b2b2-8b5b41b5b0bc', metadata={'source': 'speech.txt'}, page_content='Artificial intelligence did not appear suddenly. It evolved slowly over decades. In the beginning, computers followed fixed rules written by programmers. If a condition was met, the machine performed a specific action. This was useful but limited. Later, machine learning allowed systems to learn from data instead of rules. Then deep learning introduced neural networks inspired by the human brain, enabling computers to recognize images, understand speech, and detect patterns. Now we have reached the generative era, where machines are not just analyzing existing information but creating new content. This is a fundamental shift in what technology can do.'),
  np.float32(2056.219)),
 (Document(id='94322591-42c6-41dc-801a-cacaa7ebada7', metadata={'source': 'speech.txt'}, page_content='Generative AI goes beyond language. It includes systems that can create images, videos, music, designs, an

In [30]:
# Similarity search by vector: embed the query once, then search with the raw vector

embedding_vector = embedding.embed_query(query)
embedding_vector

[-1.1420726776123047,
 -1.4524930715560913,
 -0.9919050335884094,
 2.6927990913391113,
 1.1165357828140259,
 1.4074416160583496,
 1.314515471458435,
 -0.7074708938598633,
 0.1734955757856369,
 1.5800668001174927,
 0.7172357439994812,
 1.086014986038208,
 -1.0724129676818848,
 0.6055935025215149,
 -0.47468826174736023,
 0.12275147438049316,
 7.638752460479736,
 0.16079626977443695,
 0.4377903640270233,
 0.11443234235048294,
 1.3078854084014893,
 -0.36647722125053406,
 -0.10603343695402145,
 0.48358431458473206,
 -2.4697964191436768,
 -0.08499854803085327,
 0.5191541910171509,
 0.7549610733985901,
 -0.2543017864227295,
 -2.061140298843384,
 -0.29515111446380615,
 -0.38792622089385986,
 -0.8896896839141846,
 -0.7074491381645203,
 0.05398281663656235,
 1.1523545980453491,
 -0.07655157893896103,
 0.14772966504096985,
 0.34549546241760254,
 -1.6134473085403442,
 0.30127280950546265,
 0.3766326606273651,
 0.9744135141372681,
 0.1819518804550171,
 -0.36336731910705566,
 -1.2739553451538086,
 0

In [33]:
docs_score = db.similarity_search_by_vector(embedding_vector)

In [34]:
docs_score

[Document(id='495f8d1c-2888-4dee-b2b2-8b5b41b5b0bc', metadata={'source': 'speech.txt'}, page_content='Artificial intelligence did not appear suddenly. It evolved slowly over decades. In the beginning, computers followed fixed rules written by programmers. If a condition was met, the machine performed a specific action. This was useful but limited. Later, machine learning allowed systems to learn from data instead of rules. Then deep learning introduced neural networks inspired by the human brain, enabling computers to recognize images, understand speech, and detect patterns. Now we have reached the generative era, where machines are not just analyzing existing information but creating new content. This is a fundamental shift in what technology can do.'),
 Document(id='94322591-42c6-41dc-801a-cacaa7ebada7', metadata={'source': 'speech.txt'}, page_content='Generative AI goes beyond language. It includes systems that can create images, videos, music, designs, and even virtual worlds. An a

In [36]:
# Saving and loading

db.save_local("faiss_index")

In [40]:
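# Loading unpickles the saved docstore, so only enable this flag for index files you created yourself
new_db = FAISS.load_local("faiss_index", embedding, allow_dangerous_deserialization=True)
docs = new_db.similarity_search(query)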

In [41]:
docs

[Document(id='495f8d1c-2888-4dee-b2b2-8b5b41b5b0bc', metadata={'source': 'speech.txt'}, page_content='Artificial intelligence did not appear suddenly. It evolved slowly over decades. In the beginning, computers followed fixed rules written by programmers. If a condition was met, the machine performed a specific action. This was useful but limited. Later, machine learning allowed systems to learn from data instead of rules. Then deep learning introduced neural networks inspired by the human brain, enabling computers to recognize images, understand speech, and detect patterns. Now we have reached the generative era, where machines are not just analyzing existing information but creating new content. This is a fundamental shift in what technology can do.'),
 Document(id='94322591-42c6-41dc-801a-cacaa7ebada7', metadata={'source': 'speech.txt'}, page_content='Generative AI goes beyond language. It includes systems that can create images, videos, music, designs, and even virtual worlds. An a