## Text Embedding

In [29]:
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_core.prompts import PromptTemplate
from langchain_core.prompts import ChatPromptTemplate, SystemMessagePromptTemplate, HumanMessagePromptTemplate, AIMessagePromptTemplate
from langchain_core.messages import SystemMessage, HumanMessage, AIMessage
from langchain_text_splitters import CharacterTextSplitter
from langchain_chroma import Chroma

from dotenv import load_dotenv

In [5]:
load_dotenv()

True

In [6]:
embeddings = OpenAIEmbeddings()

In [7]:
text = 'this is some normal text string that I want to embed as a vector'

In [8]:
embedded_text = embeddings.embed_query(text)

In [10]:
len(embedded_text)

1536
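The vector returned above is only useful when compared against other vectors. As a minimal sketch of one common comparison, cosine similarity, the block below uses plain Python and toy 3-dimensional vectors instead of real 1536-dimensional embeddings. The helper name `cosine_similarity` and the sample vectors are illustrative assumptions, not part of the notebook.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real embeddings (hypothetical values).
v_query = [1.0, 0.0, 1.0]
v_close = [2.0, 0.0, 2.0]  # same direction as v_query
v_far = [0.0, 1.0, 0.0]    # orthogonal to v_query
```

Vectors pointing the same way score close to 1 regardless of length, while unrelated (orthogonal) vectors score 0.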

In [21]:
from langchain_community.document_loaders import TextLoader, CSVLoader

In [12]:
loader = CSVLoader('penguins.csv')

In [13]:
data = loader.load()

In [15]:
embedded_docs = embeddings.embed_documents([doc.page_content for doc in data])

In [16]:
len(embedded_docs)

344
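The count of 344 is consistent with one document per CSV row. The sketch below imitates that row-to-text step with the standard library, assuming each row becomes a block of `column: value` lines. The function `rows_to_texts` and the two-row sample are hypothetical; the real `penguins.csv` columns may differ.

```python
import csv
import io

def rows_to_texts(csv_file):
    """Turn each CSV row into a 'column: value' text block, one block per row."""
    reader = csv.DictReader(csv_file)
    return ["\n".join(f"{col}: {val}" for col, val in row.items()) for row in reader]

# Hypothetical sample data, not the actual penguins.csv contents.
sample = io.StringIO("species,island\nAdelie,Torgersen\nGentoo,Biscoe\n")
texts = rows_to_texts(sample)
```

Each resulting string is what would be passed to the embedding model, so the number of vectors equals the number of data rows.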

## Vector Stores

In [18]:
import chromadb

In [22]:
# Read the file as UTF-8 so punctuation such as dashes is not garbled.
loader = TextLoader('FDR_State_of_Union_1944.txt', encoding='utf-8')

In [23]:
documents = loader.load()

In [24]:
text_splitter = CharacterTextSplitter.from_tiktoken_encoder(chunk_size=500)
docs = text_splitter.split_documents(documents)

In [25]:
embedding_function = OpenAIEmbeddings()

In [26]:
db = Chroma.from_documents(documents=docs, 
                           embedding=embedding_function, 
                           persist_directory="./speech_new_db")

## Connecting to ChromaDB

In [32]:
db_new_connection = Chroma(persist_directory="./speech_new_db", 
                           embedding_function=embedding_function)

In [33]:
new_doc = "What did FDR say about the cost of food law?"

In [34]:
similar_docs = db_new_connection.similarity_search(new_doc)

In [37]:
print(similar_docs[0].page_content)

(2) A continuation of the law for the renegotiation of war contracts—which will prevent exorbitant profits and assure fair prices to the Government. For two long years I have pleaded with the Congress to take undue profits out of war.

(3) A cost of food law—which will enable the Government (a) to place a reasonable floor under the prices the farmer may expect for his production; and (b) to place a ceiling on the prices a consumer will have to pay for the food he buys. This should apply to necessities only; and will require public funds to carry out. It will cost in appropriations about one percent of the present annual cost of the war.

(4) Early reenactment of the stabilization statute of October, 1942. This expires June 30, 1944, and if it is not extended well in advance, the country might just as well expect price chaos by summer.

(5) A national service law- which, for the duration of the war, will prevent strikes, and, with certain appropriate exceptions, will make available
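The search above returns the stored chunks whose embeddings lie nearest to the embedded query. A minimal in-memory sketch of that idea follows, assuming squared Euclidean distance as the metric and toy 2-dimensional vectors. The names `squared_l2`, `similarity_search`, and `store` are illustrative and do not reproduce Chroma's internals.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def similarity_search(query_vec, store, k=4):
    """Return the k texts whose vectors are nearest to query_vec."""
    ranked = sorted(store, key=lambda item: squared_l2(query_vec, item[1]))
    return [text for text, _ in ranked[:k]]

# Hypothetical (text, vector) pairs standing in for embedded speech chunks.
store = [
    ("food prices", [0.9, 0.1]),
    ("war contracts", [0.5, 0.5]),
    ("national service", [0.1, 0.9]),
]
top = similarity_search([1.0, 0.0], store, k=2)
```

A real vector store typically replaces the full sort with an approximate nearest-neighbour index, but the ranking principle is the same.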

## Adding documents to ChromaDB

In [38]:
loader = TextLoader('Lincoln_State_of_Union_1862.txt', encoding='utf-8')
documents = loader.load()

In [39]:
docs = text_splitter.split_documents(documents)

Created a chunk of size 611, which is longer than the specified 500
Created a chunk of size 539, which is longer than the specified 500
Created a chunk of size 686, which is longer than the specified 500
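These warnings appear when a single piece of text between separators is already larger than `chunk_size`: a separator-based splitter can merge small pieces but cannot cut inside one. The sketch below illustrates that behaviour under simplifying assumptions: it counts words rather than tokens, splits on blank lines, and ignores chunk overlap. `split_text` is a hypothetical helper, not the library's implementation.

```python
def split_text(text, chunk_size, separator="\n\n"):
    """Split on separator, then greedily merge pieces up to chunk_size words."""
    pieces = [p for p in text.split(separator) if p.strip()]
    chunks, current, current_len = [], [], 0
    for piece in pieces:
        piece_len = len(piece.split())  # word count as a stand-in for token count
        # Flush the running chunk if adding this piece would exceed the limit.
        if current and current_len + piece_len > chunk_size:
            chunks.append(separator.join(current))
            current, current_len = [], 0
        current.append(piece)
        current_len += piece_len
    if current:
        chunks.append(separator.join(current))
    return chunks

text = "one two\n\nthree four five six seven\n\neight"
chunks = split_text(text, chunk_size=3)
```

Here the middle paragraph has five words, so it ends up as an oversized chunk even though the limit is three. A splitter that falls back to finer separators is one way to avoid this.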


In [40]:
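# Reusing the same persist_directory adds the Lincoln chunks to the
# collection that already holds the FDR speech.
db_new_connection = Chroma.from_documents(documents=docs,
                                          embedding=embedding_function,
                                          persist_directory="./speech_new_db")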
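Because the same directory is reused, running this cell a second time would likely store the same chunks again. One possible safeguard, sketched here in plain Python, is to derive a deterministic ID from each chunk's text. The helper `stable_id` is hypothetical, and passing such IDs to the store through an `ids` argument is an assumption to check against the installed library version.

```python
import hashlib

def stable_id(text):
    """Derive a deterministic ID from chunk text so re-runs map to the same key."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# Hypothetical chunk texts; the first and last are identical.
chunks = ["first chunk", "second chunk", "first chunk"]

# A dict keeps one entry per ID, so the repeated chunk collapses.
unique = {stable_id(c): c for c in chunks}
```

With content-derived IDs, inserting the same chunk twice targets the same key instead of creating a second copy.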

In [41]:
docs = db_new_connection.similarity_search("What did Lincoln say about slavery")

In [45]:
print(docs[0].page_content)

In this view I recommend the adoption of the following resolution and articles amendatory to the Constitution of the United States:

Resolved by the Senate and House of Representatives of the United States of America in Congress assembled (two-thirds of both Houses concurring), That the following articles be proposed to the legislatures (or conventions) of the several States as amendments to the Constitution of the United States, all or any of which articles, when ratified by three-fourths of the said legislatures (or conventions), to be valid as part or parts of the said Constitution, viz:
ART.--. Every State wherein slavery now exists which shall abolish the same therein at any time or times before the 1st day of January, A. D. 1900, shall receive compensation from the United States as follows, to wit:
The President of the United States shall deliver to every such State bonds of the United States bearing interest at the rate of per cent per annum to an amount equal to the aggregate

## Retrieving Documents

In [46]:
retriever = db_new_connection.as_retriever()

In [48]:
results = retriever.invoke('slavery')

In [58]:
print(results[0].page_content)

As to the second article, I think it would be impracticable to return to bondage the class of persons therein contemplated. Some of them, doubtless, in the property sense belong to loyal owners, and hence provision is made in this article for compensating such. The third article relates to the future of the freed people. It does not oblige, but merely authorizes Congress to aid in colonizing such as may consent. This ought not to be regarded as objectionable on the one hand or on the other, insomuch as it comes to nothing unless by the mutual consent of the people to be deported and the American voters, through their representatives in Congress.

I can not make it better known than it already is that I strongly favor colonization; and yet I wish to say there is an objection urged against free colored persons remaining in the country which is largely imaginary, if not sometimes malicious.

It is insisted that their presence would injure and displace white labor and white laborers. If th