`Facebook AI Similarity Search (Faiss) is a library for efficient similarity search and clustering of dense vectors. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also contains supporting code for evaluation and parameter tuning.`
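
To make "similarity search over dense vectors" concrete, here is a minimal pure-Python sketch of exhaustive nearest-neighbour search. It is not how Faiss is implemented; the function names `squared_l2` and `brute_force_search` and the sample vectors are illustrative assumptions. It only shows the operation that Faiss indexes are designed to speed up.

```python
from typing import List, Sequence, Tuple


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_search(
    query: Sequence[float], vectors: Sequence[Sequence[float]], k: int
) -> List[Tuple[int, float]]:
    """Return (index, distance) for the k stored vectors closest to the query."""
    scored = [(i, squared_l2(query, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1])  # smaller distance = more similar
    return scored[:k]


# Hypothetical 2-D "embeddings"; real embeddings have hundreds of dimensions.
vectors = [[0.0, 0.0], [1.0, 1.0], [3.0, 4.0]]
hits = brute_force_search([1.0, 2.0], vectors, k=2)
# hits == [(1, 1.0), (0, 5.0)]
```

Comparing the query against every stored vector costs time proportional to the collection size, which is why a dedicated index becomes useful as the number of vectors grows.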

In [7]:
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import OllamaEmbeddings
from langchain_text_splitters import CharacterTextSplitter

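# Load the raw text file as LangChain Document objects
loader = TextLoader('speech.txt')
documents = loader.load()
# Split into chunks of up to 1000 characters with a 30-character overlap
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=30)
docs = text_splitter.split_documents(documents)
docs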

[Document(metadata={'source': 'speech.txt'}, page_content='Overview: In this project, the student model learns from multiple teacher models, each specializing in different aspects of the task. For example, one teacher could specialize in object detection, another in object classification, and a third in feature extraction.\nNovelty: The key is designing a custom loss function that weights the contributions of each teacher differently based on the task at hand. This adaptive weighting would allow the student model to benefit from diverse perspectives.\nExample: For an object detection task, you could have:\nTeacher 1 (YOLOv5): Focuses on fast bounding box detection.\nTeacher 2 (EfficientNet): Focuses on object classification accuracy.\nTeacher 3 (ResNet-50): Helps the student extract robust features from the image.'),
 Document(metadata={'source': 'speech.txt'}, page_content='Attended no do thoughts me on dissuade scarcely. Own are pretty spring suffer old denote his. By proposal speedi

In [9]:
embeddings = OllamaEmbeddings(model="gemma:2b")
db = FAISS.from_documents(docs, embeddings)

In [10]:
db

<langchain_community.vectorstores.faiss.FAISS at 0x1d2cae0cb30>

In [11]:
query = "What is the task of YOLOv5 model?"
docs = db.similarity_search(query)
docs

[Document(metadata={'source': 'speech.txt'}, page_content='Overview: In this project, the student model learns from multiple teacher models, each specializing in different aspects of the task. For example, one teacher could specialize in object detection, another in object classification, and a third in feature extraction.\nNovelty: The key is designing a custom loss function that weights the contributions of each teacher differently based on the task at hand. This adaptive weighting would allow the student model to benefit from diverse perspectives.\nExample: For an object detection task, you could have:\nTeacher 1 (YOLOv5): Focuses on fast bounding box detection.\nTeacher 2 (EfficientNet): Focuses on object classification accuracy.\nTeacher 3 (ResNet-50): Helps the student extract robust features from the image.'),
 Document(metadata={'source': 'speech.txt'}, page_content='Am if number no up period regard sudden better. Decisively surrounded all admiration and not you. Out particular

In [12]:
docs[0].page_content

'Overview: In this project, the student model learns from multiple teacher models, each specializing in different aspects of the task. For example, one teacher could specialize in object detection, another in object classification, and a third in feature extraction.\nNovelty: The key is designing a custom loss function that weights the contributions of each teacher differently based on the task at hand. This adaptive weighting would allow the student model to benefit from diverse perspectives.\nExample: For an object detection task, you could have:\nTeacher 1 (YOLOv5): Focuses on fast bounding box detection.\nTeacher 2 (EfficientNet): Focuses on object classification accuracy.\nTeacher 3 (ResNet-50): Helps the student extract robust features from the image.'

## As a Retriever

`We can also convert the vector store into a retriever. This allows us to easily use it in other LangChain methods, which largely work with retrievers.`
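
The idea of wrapping a vector store in a retriever can be sketched without LangChain. Everything in the block below is a hypothetical toy (`toy_embed`, `ToyVectorStore`, `ToyRetriever`), not LangChain's implementation; it only illustrates that a retriever exposes a single `invoke(query)` entry point and delegates to the store's similarity search.

```python
from dataclasses import dataclass
from typing import List


def toy_embed(text: str) -> List[float]:
    """Hypothetical stand-in for an embedding model: keyword counts."""
    vocab = ["student", "teacher", "loss", "detection"]
    words = text.lower().split()
    return [float(words.count(term)) for term in vocab]


@dataclass
class ToyVectorStore:
    texts: List[str]

    def similarity_search(self, query: str, k: int = 4) -> List[str]:
        q = toy_embed(query)

        def sq_dist(text: str) -> float:
            return sum((a - b) ** 2 for a, b in zip(q, toy_embed(text)))

        # Closest texts first, keep the top k
        return sorted(self.texts, key=sq_dist)[:k]

    def as_retriever(self, k: int = 4) -> "ToyRetriever":
        return ToyRetriever(store=self, k=k)


@dataclass
class ToyRetriever:
    store: ToyVectorStore
    k: int

    def invoke(self, query: str) -> List[str]:
        # The retriever hides store-specific details behind one method
        return self.store.similarity_search(query, k=self.k)


store = ToyVectorStore(texts=[
    "the student model learns from each teacher",
    "a custom loss weights each teacher",
    "fast bounding box detection",
])
retriever = store.as_retriever(k=1)
top = retriever.invoke("which loss is used")
```

Because callers only depend on `invoke`, the same downstream code can work with any retriever, whatever store sits behind it.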

In [13]:
retriever = db.as_retriever()
retriever.invoke(query)

[Document(metadata={'source': 'speech.txt'}, page_content='Overview: In this project, the student model learns from multiple teacher models, each specializing in different aspects of the task. For example, one teacher could specialize in object detection, another in object classification, and a third in feature extraction.\nNovelty: The key is designing a custom loss function that weights the contributions of each teacher differently based on the task at hand. This adaptive weighting would allow the student model to benefit from diverse perspectives.\nExample: For an object detection task, you could have:\nTeacher 1 (YOLOv5): Focuses on fast bounding box detection.\nTeacher 2 (EfficientNet): Focuses on object classification accuracy.\nTeacher 3 (ResNet-50): Helps the student extract robust features from the image.'),
 Document(metadata={'source': 'speech.txt'}, page_content='Am if number no up period regard sudden better. Decisively surrounded all admiration and not you. Out particular

In [14]:
docs[1].page_content

'Am if number no up period regard sudden better. Decisively surrounded all admiration and not you. Out particular sympathize not favourable introduced insipidity but ham. Rather number can and set praise. Distrusts an it contented perceived attending oh. Thoroughly estimating introduced stimulated why but motionless.'

## Similarity Search with Score

`There are some FAISS-specific methods. One of them is similarity search with score, which returns not only the documents but also the distance between the query and each document. The returned score is an L2 distance, so a lower score means a closer match.`
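
Raw L2 scores depend on the magnitude of the embedding vectors, so they can be large when embeddings are not normalized. The sketch below, using made-up 2-D vectors and hypothetical helper names, shows how the squared L2 distance behaves before and after normalizing to unit length. To my understanding, Faiss's flat L2 index reports the squared distance rather than its square root, but treat that as an assumption to verify for your index type.

```python
import math
from typing import List, Sequence


def l2_normalize(v: Sequence[float]) -> List[float]:
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up vectors with large components, standing in for raw embeddings
a = [3.0, 40.0]
b = [30.0, 4.0]

raw = squared_l2(a, b)  # (3 - 30)^2 + (40 - 4)^2 = 2025.0
unit = squared_l2(l2_normalize(a), l2_normalize(b))

# For unit vectors, squared L2 distance equals 2 - 2 * cosine similarity,
# so it always falls between 0 and 4.
identity_gap = abs(unit - (2 - 2 * cosine(a, b)))
```

With normalized embeddings the score has a fixed range and ranks results the same way cosine similarity does; with raw embeddings, only the relative order of scores within one query is meaningful.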

In [15]:
query = "Explain about student model and loss function"
docs_and_score = db.similarity_search_with_score(query)
docs_and_score

[(Document(metadata={'source': 'speech.txt'}, page_content='Overview: In this project, the student model learns from multiple teacher models, each specializing in different aspects of the task. For example, one teacher could specialize in object detection, another in object classification, and a third in feature extraction.\nNovelty: The key is designing a custom loss function that weights the contributions of each teacher differently based on the task at hand. This adaptive weighting would allow the student model to benefit from diverse perspectives.\nExample: For an object detection task, you could have:\nTeacher 1 (YOLOv5): Focuses on fast bounding box detection.\nTeacher 2 (EfficientNet): Focuses on object classification accuracy.\nTeacher 3 (ResNet-50): Helps the student extract robust features from the image.'),
  4031.0898),
 (Document(metadata={'source': 'speech.txt'}, page_content='Am if number no up period regard sudden better. Decisively surrounded all admiration and not you

In [16]:
embedding_vector = embeddings.embed_query(query)
embedding_vector

[-0.37703776359558105,
 -0.7433834075927734,
 -1.15997314453125,
 3.1676275730133057,
 -0.7763984799385071,
 2.128129482269287,
 -1.3099576234817505,
 0.07591808587312698,
 0.8263306021690369,
 -0.8678130507469177,
 0.34514567255973816,
 0.6624451279640198,
 -1.4065450429916382,
 -0.14126887917518616,
 -1.197776198387146,
 -0.2404523640871048,
 3.6741721630096436,
 1.275036334991455,
 -0.6995092630386353,
 -1.0670925378799438,
 2.273432970046997,
 0.08906496316194534,
 0.6993593573570251,
 0.2796979546546936,
 -0.8534907102584839,
 0.4413204491138458,
 1.3451594114303589,
 0.4098145663738251,
 1.2362070083618164,
 -1.8584176301956177,
 -1.177116870880127,
 -0.11856210231781006,
 0.6960723400115967,
 -0.821532130241394,
 0.550538957118988,
 -0.38620564341545105,
 0.8539015054702759,
 0.04106445610523224,
 -0.08616988360881805,
 -1.4382717609405518,
 -0.4505300223827362,
 -1.381754755973816,
 0.9634058475494385,
 0.09524968266487122,
 -0.12069118767976761,
 -1.3017083406448364,
 0.084411

In [17]:
docs_score = db.similarity_search_by_vector(embedding_vector)
docs_score

[Document(metadata={'source': 'speech.txt'}, page_content='Overview: In this project, the student model learns from multiple teacher models, each specializing in different aspects of the task. For example, one teacher could specialize in object detection, another in object classification, and a third in feature extraction.\nNovelty: The key is designing a custom loss function that weights the contributions of each teacher differently based on the task at hand. This adaptive weighting would allow the student model to benefit from diverse perspectives.\nExample: For an object detection task, you could have:\nTeacher 1 (YOLOv5): Focuses on fast bounding box detection.\nTeacher 2 (EfficientNet): Focuses on object classification accuracy.\nTeacher 3 (ResNet-50): Helps the student extract robust features from the image.'),
 Document(metadata={'source': 'speech.txt'}, page_content='Am if number no up period regard sudden better. Decisively surrounded all admiration and not you. Out particular

In [18]:
## Saving and Loading
db.save_local("faiss_index")

In [19]:
# Loading unpickles stored data, so only enable this flag for index files you created or trust
new_db = FAISS.load_local("faiss_index", embeddings, allow_dangerous_deserialization=True)
docs = new_db.similarity_search(query)
docs

[Document(metadata={'source': 'speech.txt'}, page_content='Overview: In this project, the student model learns from multiple teacher models, each specializing in different aspects of the task. For example, one teacher could specialize in object detection, another in object classification, and a third in feature extraction.\nNovelty: The key is designing a custom loss function that weights the contributions of each teacher differently based on the task at hand. This adaptive weighting would allow the student model to benefit from diverse perspectives.\nExample: For an object detection task, you could have:\nTeacher 1 (YOLOv5): Focuses on fast bounding box detection.\nTeacher 2 (EfficientNet): Focuses on object classification accuracy.\nTeacher 3 (ResNet-50): Helps the student extract robust features from the image.'),
 Document(metadata={'source': 'speech.txt'}, page_content='Am if number no up period regard sudden better. Decisively surrounded all admiration and not you. Out particular