## Vector stores
One of the most common ways to store and search unstructured data is to embed it and store the resulting embedding vectors. At query time, the query is embedded the same way, and the store retrieves the vectors that are 'most similar' to the embedded query. A vector store takes care of storing the embedded data and performing the vector search for you.
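
To make "most similar" concrete, here is a minimal sketch of a brute-force vector store in plain Python. The `ToyVectorStore` class and the three-dimensional vectors are made up for illustration; real stores such as FAISS use optimized indexes, and real embeddings have hundreds or thousands of dimensions. Cosine similarity is assumed as the similarity measure here, which is only one of several common choices.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Hypothetical in-memory store: keeps (vector, text) pairs and scans them all."""

    def __init__(self):
        self._items = []

    def add(self, vector, text):
        self._items.append((vector, text))

    def search(self, query_vector, k=1):
        # Rank every stored vector by similarity to the query, highest first.
        ranked = sorted(
            self._items,
            key=lambda item: cosine_similarity(query_vector, item[0]),
            reverse=True,
        )
        return [text for _, text in ranked[:k]]


# Hand-written 3-dimensional "embeddings" (invented for this example).
store = ToyVectorStore()
store.add([0.9, 0.1, 0.0], "tax policy")
store.add([0.1, 0.9, 0.0], "supreme court nomination")
store.add([0.0, 0.2, 0.9], "immigration")

# The query vector points mostly along the first axis, like "tax policy".
top = store.search([0.8, 0.2, 0.1], k=2)
print(top)
```

The scan compares the query against every stored vector, which is fine for a handful of items but is exactly the cost that a dedicated vector store is designed to avoid.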

**A key part of working with vector stores is creating the vectors to put in them, which is usually done with an embedding model.**

This walkthrough uses the FAISS vector store, which is built on the Facebook AI Similarity Search (FAISS) library.

In [1]:
%pip install faiss-cpu

Collecting faiss-cpu
  Downloading faiss_cpu-1.7.4-cp39-cp39-win_amd64.whl (10.8 MB)
Installing collected packages: faiss-cpu
Successfully installed faiss-cpu-1.7.4
Note: you may need to restart the kernel to use updated packages.


In [2]:
import os
import getpass

# Prompt for the key instead of hard-coding it, so the secret never lands in the notebook.
os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key:")

In [3]:
from langchain.document_loaders import TextLoader
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import FAISS

In [14]:
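# Load the speech; reading it explicitly as UTF-8 avoids garbled punctuation on Windows.
raw_documents = TextLoader('state_of_the_union.txt', encoding='utf-8').load()
# Split it into chunks of about 500 characters that overlap by up to 10 characters.
text_splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=10)
documents = text_splitter.split_documents(raw_documents)

# Embed every chunk and build the FAISS index from the resulting vectors.
embeddings = OpenAIEmbeddings()
db = FAISS.from_documents(documents, embeddings)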
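
The `chunk_size` and `chunk_overlap` arguments above control how the text is cut up before embedding. The sketch below shows the idea with a simplified fixed-window splitter. The `split_with_overlap` function is hypothetical and is not how `CharacterTextSplitter` works internally: that class first splits on a separator (by default a blank line) and then merges the pieces, so its chunk boundaries will differ.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Cut text into windows of at most chunk_size characters.

    Each window starts chunk_size - chunk_overlap characters after the
    previous one, so neighbouring windows share chunk_overlap characters.
    """
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=2)
print(chunks)
```

The overlap means a sentence that straddles a boundary still appears largely intact in at least one chunk, at the cost of embedding a few characters twice.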

In [15]:
query = "How much tax cut passed?"
docs = db.similarity_search(query)
print(docs[0].page_content)

And unlike the $2 Trillion tax cut passed in the previous administration that benefitted the top 1% of Americans, the American Rescue Plan helped working people—and left no one behind. 

And it worked. It created jobs. Lots of jobs. 

In fact—our economy created over 6.5 Million new jobs just last year, more jobs created in one year  
than ever before in the history of America.


In [16]:
embedding_vector = embeddings.embed_query(query)
docs = db.similarity_search_by_vector(embedding_vector)
print(docs[0].page_content)

And unlike the $2 Trillion tax cut passed in the previous administration that benefitted the top 1% of Americans, the American Rescue Plan helped working people—and left no one behind. 

And it worked. It created jobs. Lots of jobs. 

In fact—our economy created over 6.5 Million new jobs just last year, more jobs created in one year  
than ever before in the history of America.
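
The two cells above return the same passage, which is what you would expect if searching by text amounts to embedding the query and then searching by vector. The sketch below illustrates that relationship with a toy store. `ToyStore`, `toy_embed`, and the five-word vocabulary are invented for this example and only mirror the method names used above; they are not the LangChain or OpenAI implementations.

```python
import math
from collections import Counter

VOCAB = ["tax", "cut", "jobs", "court", "border"]


def toy_embed(text):
    """Hypothetical embedding: counts of a few fixed vocabulary words."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in VOCAB]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0  # all-zero vectors match nothing


class ToyStore:
    def __init__(self, texts):
        self._texts = texts
        self._vectors = [toy_embed(t) for t in texts]

    def similarity_search_by_vector(self, vector, k=1):
        # Order stored texts by similarity to the given vector, best first.
        order = sorted(
            range(len(self._texts)),
            key=lambda i: cosine(vector, self._vectors[i]),
            reverse=True,
        )
        return [self._texts[i] for i in order[:k]]

    def similarity_search(self, query, k=1):
        # Text search: embed the query, then reuse the vector search.
        return self.similarity_search_by_vector(toy_embed(query), k)


store = ToyStore(["the tax cut passed", "new jobs were created", "the court nomination"])
query = "how much tax cut passed"
by_text = store.similarity_search(query)
by_vector = store.similarity_search_by_vector(toy_embed(query))
print(by_text == by_vector, by_text)
```

Searching by vector directly is useful when the query embedding has already been computed, for example when it is cached or produced by a separate service.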


## Pinecone
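Pinecone is a managed, cloud-hosted vector database. Unlike the local FAISS index above, it stores the vectors in a remote index that you reach with an API key.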

In [1]:
# pip install pinecone-client openai tiktoken

In [3]:
import os
import getpass

In [4]:
PINECONE_API_KEY = getpass.getpass("Pinecone API Key:")

Pinecone API Key:········


In [19]:
PINECONE_ENV = getpass.getpass("Pinecone Environment:")

Pinecone Environment:········


In [6]:
os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key:")

OpenAI API Key:········


In [7]:
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import Pinecone
from langchain.document_loaders import TextLoader

In [43]:
# Load the speech (explicitly as UTF-8) and split it into chunks of about 800 characters with no overlap.
loader = TextLoader("state_of_the_union.txt", encoding="utf-8")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=800, chunk_overlap=0)
docs = text_splitter.split_documents(documents)

# Embedding model used to turn each chunk into a vector.
embeddings = OpenAIEmbeddings()

In [44]:
import pinecone

# initialize pinecone
pinecone.init(
    api_key=PINECONE_API_KEY,  # find at app.pinecone.io
    environment=PINECONE_ENV,  # next to api key in console
)

In [45]:
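# Name of the Pinecone index to write to; create it in the Pinecone console first if it does not exist.
index_name = 'langchain'
# Embed the chunks and upload them to the index.
docsearch = Pinecone.from_documents(docs, embeddings, index_name=index_name)

query = "What did the president say about Ketanji Brown Jackson?"
docquery = docsearch.similarity_search(query)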

In [46]:
print(docquery[0].page_content)

One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. 

And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence. 

A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since she’s been nominated, she’s received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. 

And if we are to advance liberty and justice, we need to secure the Border and fix the immigration system.
