## Vector Store Techniques

## FAISS

In [None]:
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import OllamaEmbeddings
from langchain_text_splitters import CharacterTextSplitter

loader=TextLoader("../data/Ai_and_society.txt") #load a text file
documents=loader.load() #list of documents
text_splitter=CharacterTextSplitter(chunk_size=1000, chunk_overlap=30)
docs=text_splitter.split_documents(documents) 

In [None]:
embeddings=OllamaEmbeddings(model="mxbai-embed-large") #create embeddings with model mxbai-embed-large
db=FAISS.from_documents(docs, embeddings)
db


<langchain_community.vectorstores.faiss.FAISS at 0x1ebb38971d0>

#### Method 1: Vector Store DB

In [3]:
# Query the vector store
query="How does the speaker describe the AI's role in the society?"
docs=db.similarity_search(query)
# Response: content of the most similar chunk
docs[0].page_content

'Artificial Intelligence and Society\n\nArtificial Intelligence (AI) is rapidly transforming the way we live, work, and interact with technology. \nToday, AI systems are able to recognize speech, translate languages, diagnose diseases, and even create art. \nThese capabilities open new possibilities for innovation across industries, including healthcare, finance, \ntransportation, and education.\n\nIn healthcare, AI helps doctors analyze medical images and detect illnesses earlier than ever before. \nIn finance, machine learning models can identify fraudulent transactions within seconds and provide \npersonalized investment advice. In transportation, autonomous vehicles are being tested to reduce \naccidents and make travel more efficient. In education, intelligent tutoring systems adapt lessons \nto the pace and needs of each student.'

- "db.similarity_search" --> searches the FAISS database for the documents most similar to the query
- "docs[0].page_content" --> returns the content of the most similar document
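
To make the idea behind similarity search concrete, here is a minimal sketch in plain Python. It assumes tiny hand-written 3-dimensional vectors in place of real embeddings, and the names `toy_store` and `toy_similarity_search` are hypothetical (they are not part of LangChain or FAISS). A real index is far more optimized, but the principle is the same: rank the stored vectors by their distance to the query vector and return the closest `k`.

```python
import math

# Hypothetical toy "vector store": (text, embedding) pairs with made-up 3-d vectors.
toy_store = [
    ("AI in healthcare", [0.9, 0.1, 0.0]),
    ("AI in finance", [0.1, 0.9, 0.0]),
    ("AI in education", [0.0, 0.2, 0.9]),
]

def toy_similarity_search(query_vector, store, k=2):
    """Return the k texts whose vectors are closest to query_vector (Euclidean distance)."""
    ranked = sorted(store, key=lambda item: math.dist(query_vector, item[1]))
    return [text for text, _ in ranked[:k]]

# A made-up query vector that points mostly in the "healthcare" direction.
results = toy_similarity_search([0.8, 0.2, 0.1], toy_store, k=2)
print(results)
```

In LangChain, `similarity_search` accepts a `k` argument that plays the same role as in this sketch.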

#### Method 2: Retriever Class
We can also convert the vector store into a Retriever class. 
This makes it easier to use in other LangChain components, which rely heavily on retrievers.


In [None]:
#Retriever
retriever=db.as_retriever() #converting the vector store db into a retriever object

docs=retriever.invoke(query)
docs[0].page_content

'Artificial Intelligence and Society\n\nArtificial Intelligence (AI) is rapidly transforming the way we live, work, and interact with technology. \nToday, AI systems are able to recognize speech, translate languages, diagnose diseases, and even create art. \nThese capabilities open new possibilities for innovation across industries, including healthcare, finance, \ntransportation, and education.\n\nIn healthcare, AI helps doctors analyze medical images and detect illnesses earlier than ever before. \nIn finance, machine learning models can identify fraudulent transactions within seconds and provide \npersonalized investment advice. In transportation, autonomous vehicles are being tested to reduce \naccidents and make travel more efficient. In education, intelligent tutoring systems adapt lessons \nto the pace and needs of each student.'

- "db.as_retriever()" --> converts the FAISS vector store into a retriever, making search more modular for other LangChain applications
- "retriever.invoke(query)" --> runs the search through the retriever
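
The value of a retriever is that callers depend on a single `invoke(query)` method instead of on a specific vector store. The sketch below illustrates that idea with two hypothetical classes, `ToyVectorStore` and `ToyRetriever`; neither is LangChain code, and the toy store ranks texts by shared words rather than by embeddings.

```python
class ToyVectorStore:
    """Hypothetical stand-in for a vector store; matches by shared words, not embeddings."""

    def __init__(self, texts):
        self.texts = texts

    def similarity_search(self, query, k=1):
        query_words = set(query.lower().split())
        # Rank each text by the number of words it shares with the query.
        ranked = sorted(
            self.texts,
            key=lambda text: len(query_words & set(text.lower().split())),
            reverse=True,
        )
        return ranked[:k]


class ToyRetriever:
    """Hypothetical wrapper exposing a single invoke(query) method."""

    def __init__(self, store, k=1):
        self.store = store
        self.k = k

    def invoke(self, query):
        # Delegate to the store; callers do not need to know which store is behind it.
        return self.store.similarity_search(query, k=self.k)


store = ToyVectorStore([
    "AI helps doctors analyze medical images",
    "autonomous vehicles reduce accidents",
])
toy_retriever = ToyRetriever(store, k=1)
print(toy_retriever.invoke("how does AI help doctors"))
```

In LangChain itself, `as_retriever` also accepts `search_kwargs`, for example `db.as_retriever(search_kwargs={"k": 2})`, to control how many documents are returned.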

#### Method 3: Similarity Search with Score
**similarity_search_with_score** is a FAISS method that returns not only the documents but also a distance score between the query and each document. This is an L2 distance, so a lower score means a closer match.
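
To see why a lower score is better, here is a small worked sketch with made-up 2-dimensional vectors. The helper `squared_l2` and the toy data are assumptions for illustration only. The sketch uses the squared Euclidean distance because FAISS's flat L2 index reports squared distances (it skips the square root), which does not change the ranking.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Made-up 2-d vectors standing in for real embeddings.
query_vec = [1.0, 2.0]
toy_docs = {
    "close document": [1.0, 3.0],
    "far document": [4.0, 6.0],
}

# Build (document, score) tuples and sort ascending: the lowest score is the best match.
scored = sorted(
    ((name, squared_l2(query_vec, vec)) for name, vec in toy_docs.items()),
    key=lambda pair: pair[1],
)
print(scored)
```

The result has the same shape as the output of `similarity_search_with_score`: a list of (document, score) tuples ordered from the closest match to the farthest.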

In [None]:
docs_and_score = db.similarity_search_with_score(query)
docs_and_score #we got a list of tuples (document, score) 

[(Document(id='0df542f0-093c-4d01-a3a1-1138c3fd1758', metadata={'source': '../data/Ai_and_society.txt'}, page_content='Artificial Intelligence and Society\n\nArtificial Intelligence (AI) is rapidly transforming the way we live, work, and interact with technology. \nToday, AI systems are able to recognize speech, translate languages, diagnose diseases, and even create art. \nThese capabilities open new possibilities for innovation across industries, including healthcare, finance, \ntransportation, and education.\n\nIn healthcare, AI helps doctors analyze medical images and detect illnesses earlier than ever before. \nIn finance, machine learning models can identify fraudulent transactions within seconds and provide \npersonalized investment advice. In transportation, autonomous vehicles are being tested to reduce \naccidents and make travel more efficient. In education, intelligent tutoring systems adapt lessons \nto the pace and needs of each student.'),
  np.float32(128.60834)),
 (Doc

We can also search with an embedding vector instead of a query string.

In [7]:
embedding_vector=embeddings.embed_query(query) #convert the query into an embedding vector
embedding_vector

[0.5202500820159912,
 -0.00838441401720047,
 -0.33865606784820557,
 -0.24595975875854492,
 0.3362780809402466,
 -0.7365673780441284,
 -0.615524172782898,
 0.13088048994541168,
 0.17378191649913788,
 0.07084904611110687,
 -0.12041620910167694,
 0.15887565910816193,
 -0.1867741346359253,
 0.26649877429008484,
 -0.4558226764202118,
 -0.19813929498195648,
 -0.33517390489578247,
 -0.3104889392852783,
 0.23106595873832703,
 -0.514039158821106,
 0.30982980132102966,
 0.2622174918651581,
 -1.050834059715271,
 -0.5501865148544312,
 -1.33937406539917,
 0.37980589270591736,
 0.026841431856155396,
 0.26286762952804565,
 0.9006578326225281,
 -0.060406796634197235,
 0.1849794089794159,
 -0.15013396739959717,
 -0.5107271671295166,
 -0.9884954690933228,
 0.24002137780189514,
 -0.09940275549888611,
 0.7333295941352844,
 -1.6291735172271729,
 -0.530634343624115,
 -0.7067144513130188,
 0.38213932514190674,
 0.35081803798675537,
 0.36913713812828064,
 -1.6266767978668213,
 -1.2490603923797607,
 0.48051810

In [8]:
docs_by_vector=db.similarity_search_by_vector(embedding_vector) #search with the vector; returns documents only, without scores
docs_by_vector

[Document(id='0df542f0-093c-4d01-a3a1-1138c3fd1758', metadata={'source': '../data/Ai_and_society.txt'}, page_content='Artificial Intelligence and Society\n\nArtificial Intelligence (AI) is rapidly transforming the way we live, work, and interact with technology. \nToday, AI systems are able to recognize speech, translate languages, diagnose diseases, and even create art. \nThese capabilities open new possibilities for innovation across industries, including healthcare, finance, \ntransportation, and education.\n\nIn healthcare, AI helps doctors analyze medical images and detect illnesses earlier than ever before. \nIn finance, machine learning models can identify fraudulent transactions within seconds and provide \npersonalized investment advice. In transportation, autonomous vehicles are being tested to reduce \naccidents and make travel more efficient. In education, intelligent tutoring systems adapt lessons \nto the pace and needs of each student.'),
 Document(id='f0b31b73-a619-441e
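
Searching by text and searching by vector are the same operation once the query has been embedded. The sketch below shows this with a hypothetical keyword-counting `toy_embed` function and a tiny in-memory index; a real embedding model returns dense floats, so treat this only as an illustration of the control flow.

```python
import math

def toy_embed(text):
    """Hypothetical embedding: counts of a few keywords (a real model returns dense floats)."""
    words = text.lower().split()
    return [float(words.count(w)) for w in ("health", "finance", "school")]

corpus = ["health and more health", "finance news", "school and health"]
index = [(text, toy_embed(text)) for text in corpus]

def search_by_vector(vector, k=1):
    """Return the k texts whose stored vectors are closest to the given vector."""
    ranked = sorted(index, key=lambda item: math.dist(vector, item[1]))
    return [text for text, _ in ranked[:k]]

def search_by_text(query, k=1):
    # Text search is just: embed the query, then search by vector.
    return search_by_vector(toy_embed(query), k=k)

query_text = "health health"
by_text = search_by_text(query_text)
by_vector = search_by_vector(toy_embed(query_text))
print(by_text, by_vector)
```

Searching by vector is useful when the embedding was computed elsewhere, for example cached or produced by another process, provided it comes from the same embedding model as the stored documents.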

##### Saving and Loading the Vector Store DB

In [9]:
db.save_local("faiss_index") #save the db locally

In [10]:
new_db=FAISS.load_local("faiss_index", embeddings, allow_dangerous_deserialization=True) #load the db from disk; enable deserialization only for index files you created yourself

In [12]:
docs=new_db.similarity_search(query) #search with the newly loaded db
docs

[Document(id='0df542f0-093c-4d01-a3a1-1138c3fd1758', metadata={'source': '../data/Ai_and_society.txt'}, page_content='Artificial Intelligence and Society\n\nArtificial Intelligence (AI) is rapidly transforming the way we live, work, and interact with technology. \nToday, AI systems are able to recognize speech, translate languages, diagnose diseases, and even create art. \nThese capabilities open new possibilities for innovation across industries, including healthcare, finance, \ntransportation, and education.\n\nIn healthcare, AI helps doctors analyze medical images and detect illnesses earlier than ever before. \nIn finance, machine learning models can identify fraudulent transactions within seconds and provide \npersonalized investment advice. In transportation, autonomous vehicles are being tested to reduce \naccidents and make travel more efficient. In education, intelligent tutoring systems adapt lessons \nto the pace and needs of each student.'),
 Document(id='f0b31b73-a619-441e
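
The save/load round trip can be sketched without FAISS. The example below writes a made-up index to a JSON file in a temporary directory, reads it back, and checks that a search gives the same answer before and after. JSON is used here only for illustration and is not FAISS's on-disk format; `toy_index` and `nearest` are hypothetical names.

```python
import json
import math
import os
import tempfile

# Made-up (text, vector) pairs standing in for an embedded document collection.
toy_index = [["AI in healthcare", [0.9, 0.1]], ["AI in finance", [0.1, 0.9]]]

def nearest(index, vector):
    """Return the text whose vector is closest to the given vector."""
    return min(index, key=lambda item: math.dist(vector, item[1]))[0]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "toy_index.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(toy_index, f)   # save to disk
    with open(path, encoding="utf-8") as f:
        reloaded = json.load(f)   # load from disk

before = nearest(toy_index, [0.8, 0.2])
after = nearest(reloaded, [0.8, 0.2])
print(before, after)
```

The stored vectors survive the round trip, but a text query still has to be embedded at search time, which is why `FAISS.load_local` takes the `embeddings` object again: it should be the same model that produced the saved vectors.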

## Chroma DB

In [13]:
## Building a sample vector db
from langchain_chroma import Chroma
from langchain_community.document_loaders import TextLoader
from langchain_community.embeddings import OllamaEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

In [15]:
loader=TextLoader("../data/AI_and_society.txt") #load the text file
data=loader.load()
data

[Document(metadata={'source': '../data/AI_and_society.txt'}, page_content='Artificial Intelligence and Society\n\nArtificial Intelligence (AI) is rapidly transforming the way we live, work, and interact with technology. \nToday, AI systems are able to recognize speech, translate languages, diagnose diseases, and even create art. \nThese capabilities open new possibilities for innovation across industries, including healthcare, finance, \ntransportation, and education.\n\nIn healthcare, AI helps doctors analyze medical images and detect illnesses earlier than ever before. \nIn finance, machine learning models can identify fraudulent transactions within seconds and provide \npersonalized investment advice. In transportation, autonomous vehicles are being tested to reduce \naccidents and make travel more efficient. In education, intelligent tutoring systems adapt lessons \nto the pace and needs of each student.\n\nDespite these benefits, the rise of AI also raises important challenges. On

In [16]:
#Split
text_splitter=RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
splits=text_splitter.split_documents(data)

In [18]:
#Create the vector store
embedding=OllamaEmbeddings(model="mxbai-embed-large")
vectordb=Chroma.from_documents(documents=splits, embedding=embedding)
vectordb

<langchain_chroma.vectorstores.Chroma at 0x1ec24ea50d0>

In [19]:
#query it
query = "Can AI have an impact on heathcare?"
docs= vectordb.similarity_search(query)
docs[0].page_content

'In healthcare, AI helps doctors analyze medical images and detect illnesses earlier than ever before. \nIn finance, machine learning models can identify fraudulent transactions within seconds and provide \npersonalized investment advice. In transportation, autonomous vehicles are being tested to reduce \naccidents and make travel more efficient. In education, intelligent tutoring systems adapt lessons \nto the pace and needs of each student.'

In [20]:
#Saving to the disk
vectordb=Chroma.from_documents(documents=splits, embedding=embedding, persist_directory="./chroma_db")



In [21]:
#Load from disk
db2 = Chroma(persist_directory="./chroma_db", embedding_function=embedding)
docs=db2.similarity_search(query) #we make the same query as before
print(docs[0].page_content)

In healthcare, AI helps doctors analyze medical images and detect illnesses earlier than ever before. 
In finance, machine learning models can identify fraudulent transactions within seconds and provide 
personalized investment advice. In transportation, autonomous vehicles are being tested to reduce 
accidents and make travel more efficient. In education, intelligent tutoring systems adapt lessons 
to the pace and needs of each student.


In [23]:
#Similarity search with similarity score
docs=vectordb.similarity_search_with_score(query)
docs

[(Document(id='6dc23af7-a858-48df-aca3-5db8724b5fbc', metadata={'source': '../data/AI_and_society.txt'}, page_content='In healthcare, AI helps doctors analyze medical images and detect illnesses earlier than ever before. \nIn finance, machine learning models can identify fraudulent transactions within seconds and provide \npersonalized investment advice. In transportation, autonomous vehicles are being tested to reduce \naccidents and make travel more efficient. In education, intelligent tutoring systems adapt lessons \nto the pace and needs of each student.'),
  194.36978149414062),
 (Document(id='95170424-c4d8-4c82-9eda-938fd907c499', metadata={'source': '../data/AI_and_society.txt'}, page_content='Artificial Intelligence and Society\n\nArtificial Intelligence (AI) is rapidly transforming the way we live, work, and interact with technology. \nToday, AI systems are able to recognize speech, translate languages, diagnose diseases, and even create art. \nThese capabilities open new poss

Using Chroma as a retriever works the same way as with FAISS: as_retriever() converts the vector store into a retriever that plugs into other LangChain components.

In [24]:
#Retriever Option
retriever=vectordb.as_retriever()
print(retriever.invoke(query)[0].page_content)

In healthcare, AI helps doctors analyze medical images and detect illnesses earlier than ever before. 
In finance, machine learning models can identify fraudulent transactions within seconds and provide 
personalized investment advice. In transportation, autonomous vehicles are being tested to reduce 
accidents and make travel more efficient. In education, intelligent tutoring systems adapt lessons 
to the pace and needs of each student.
