### Documents

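A `Document` has two main attributes:
- `page_content`: the text of the document (a string)
- `metadata`: a dictionary of arbitrary information about the document, such as its source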

In [1]:
from langchain_core.documents import Document

documents = [
    Document(metadata={'source': 'speech.txt'}, page_content='Ladies and Gentlemen,'),
    Document(metadata={'source': 'speech.txt'}, page_content='It is a pleasure for me to be here tonight and address such a great audience. The issue I would like to bring up threatens the prosperity and welfare of the whole nation, however, the majority of the'),
    Document(metadata={'source': 'speech.txt'}, page_content='however, the majority of the population tends to ignore it and pretend as if it is not a problem at all. Namely, I would like to talk about the risks of obesity.'),
    Document(metadata={'source': 'speech.txt'}, page_content='First of all, it would be reasonable to present the statistics that some of you might find shocking. To be more precise, in accordance with the data provided by the Office of Disease Prevention and'),
    Document(metadata={'source': 'speech.txt'}, page_content='of Disease Prevention and Health Promotion, the number of people suffering from obesity has already reached the point of 35% of the whole US population. Just imagine, one-third of Americans is'),
    Document(metadata={'source': 'speech.txt'}, page_content='one-third of Americans is particularly limited in their opportunities to have a happy, comprehensive, and productive life. Let me remind you, that the right to health was considered by the United'),
    Document(metadata={'source': 'speech.txt'}, page_content='was considered by the United Nations to be an integral element of the human rights and the individual dignity. I believe, that it is a high time to ask ourselves whether the lifestyle patterns'),
    Document(metadata={'source': 'speech.txt'}, page_content='the lifestyle patterns cultivated in the US are actually contributing to the concept of human dignity or not. Unfortunately, the honest answer would rather disappoint us.'),
    Document(metadata={'source': 'speech.txt'}, page_content='Obviously, the complaints and disappointment may hardly improve the recent state of affairs. Subsequently, I believe, that the awareness and personal determination to change the trend are the first'),
    Document(metadata={'source': 'speech.txt'}, page_content='the trend are the first steps towards the overall success. Some of you would definitely argue that it is an entirely personal responsibility or, at most, the responsibility of the Department of'),
    Document(metadata={'source': 'speech.txt'}, page_content='of the Department of Health. Let me disagree and explain the essence the obesity spread. In particular, reducing the time and money being spent on meal has become a generally accepted trend which,'),
    Document(metadata={'source': 'speech.txt'}, page_content='accepted trend which, however, may hardly bring the benefits to anyone except the fast-food companies’ bosses. Therefore, we should agree that the recent economic model, which has made healthy eating'),
    Document(metadata={'source': 'speech.txt'}, page_content='which has made healthy eating a privilege of a few, is exceptionally harmful.'),
    Document(metadata={'source': 'speech.txt'}, page_content='So, what can be done to change the pattern? First of all, there are two aspects of the campaign against the obesity spread. Namely, the first one implies the efforts of the official institutions and'),
    Document(metadata={'source': 'speech.txt'}, page_content='the official institutions and includes informational campaigns conduction as well as providing the obese people with a proper treatment. However, I would like to emphasize the importance of the other'),
    Document(metadata={'source': 'speech.txt'}, page_content='the importance of the other aspect which, I believe, implies a key meaning in terms of changing the negative statistics. Namely, I would like to talk about the determination and responsibility of'),
    Document(metadata={'source': 'speech.txt'}, page_content='and responsibility of each of us who are neither the medical officials nor the obesity victims. In particular, each of us has a friend or colleague who is suffering from obesity. There are also'),
    Document(metadata={'source': 'speech.txt'}, page_content='from obesity. There are also people, like our children, to whom our behavior may serve as a model to follow. We are responsible for those people and should put our best efforts to raise their'),
    Document(metadata={'source': 'speech.txt'}, page_content='best efforts to raise their awareness and convince them that it is better to lose some extra half-an-hour but keep our health and welfare safe.'),
    Document(metadata={'source': 'speech.txt'}, page_content='Ladies and Gentlemen, let me once again thank you for your patience. Being aware and responsible, we are capable of overcoming the threats of obesity spread. Thank you!')
]

In [2]:
import os
from dotenv import load_dotenv
load_dotenv()

True

In [3]:
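# load_dotenv() has already exported these; note that re-assigning raises TypeError if a key is missing (os.getenv returns None)
os.environ["HF_TOKEN"] = os.getenv("HF_TOKEN")
os.environ["GROQ_API_KEY"] = os.getenv("GROQ_API_KEY")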

In [4]:
from langchain_huggingface import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

In [5]:
from langchain_groq import ChatGroq

groq_api_key = os.environ["GROQ_API_KEY"]

llm = ChatGroq(groq_api_key=groq_api_key, model="Llama3-8b-8192")
llm


ChatGroq(client=<groq.resources.chat.completions.Completions object at 0x118a21f10>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x14f736f10>, model_name='Llama3-8b-8192', model_kwargs={}, groq_api_key=SecretStr('**********'))

In [6]:
## Vector stores
from langchain_chroma import Chroma

vectorstore = Chroma.from_documents(documents, embedding=embeddings)

In [7]:
vectorstore.similarity_search("ladies")

[Document(id='3d7aa4a6-0e36-45da-b528-bd9d7e784f44', metadata={'source': 'speech.txt'}, page_content='Ladies and Gentlemen,'),
 Document(id='9f9ec896-2982-4264-8ae7-0ddb128e5f77', metadata={'source': 'speech.txt'}, page_content='Ladies and Gentlemen, let me once again thank you for your patience. Being aware and responsible, we are capable of overcoming the threats of obesity spread. Thank you!'),
 Document(id='5f54363f-f70c-46ba-be3b-2efd67531882', metadata={'source': 'speech.txt'}, page_content='the lifestyle patterns cultivated in the US are actually contributing to the concept of human dignity or not. Unfortunately, the honest answer would rather disappoint us.'),
 Document(id='77067db4-3907-4bd7-8478-0553e51dd23d', metadata={'source': 'speech.txt'}, page_content='Obviously, the complaints and disappointment may hardly improve the recent state of affairs. Subsequently, I believe, that the awareness and personal determination to change the trend are the first')]

In [10]:
# Async query
await vectorstore.asimilarity_search("ladies")

[Document(id='3d7aa4a6-0e36-45da-b528-bd9d7e784f44', metadata={'source': 'speech.txt'}, page_content='Ladies and Gentlemen,'),
 Document(id='9f9ec896-2982-4264-8ae7-0ddb128e5f77', metadata={'source': 'speech.txt'}, page_content='Ladies and Gentlemen, let me once again thank you for your patience. Being aware and responsible, we are capable of overcoming the threats of obesity spread. Thank you!'),
 Document(id='5f54363f-f70c-46ba-be3b-2efd67531882', metadata={'source': 'speech.txt'}, page_content='the lifestyle patterns cultivated in the US are actually contributing to the concept of human dignity or not. Unfortunately, the honest answer would rather disappoint us.'),
 Document(id='77067db4-3907-4bd7-8478-0553e51dd23d', metadata={'source': 'speech.txt'}, page_content='Obviously, the complaints and disappointment may hardly improve the recent state of affairs. Subsequently, I believe, that the awareness and personal determination to change the trend are the first')]
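
To make the idea behind `similarity_search` concrete, here is a minimal, standard-library-only sketch. It is not how Chroma or `all-MiniLM-L6-v2` work internally: the `embed` function is a toy word-count stand-in for a real embedding model, cosine similarity is chosen only for illustration (the metric a real store uses depends on its configuration), and the names `embed`, `cosine`, and `toy_similarity_search` are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def toy_similarity_search(query: str, texts: list, k: int = 4) -> list:
    """Return the k texts whose toy embeddings are most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(texts, key=lambda t: cosine(query_vec, embed(t)), reverse=True)
    return ranked[:k]


texts = [
    "ladies and gentlemen",
    "the risks of obesity",
    "healthy eating is a privilege of a few",
]
top = toy_similarity_search("obesity risks", texts, k=1)
print(top)
```

A real embedding model captures meaning rather than exact word overlap, which is why the query "ladies" above also returns chunks that do not contain that word.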

### Retrievers

###### A retriever is a `Runnable`; a vector store is not, so it must be wrapped or converted before it can be used in a chain
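
As a rough illustration of why the Runnable interface matters, the sketch below defines a tiny stand-in class with `invoke`, `batch`, and `|` composition. `TinyRunnable` and `search` are hypothetical names invented for this example; LangChain's actual `Runnable` is considerably richer (async, streaming, configuration), so treat this as a sketch of the idea only.

```python
class TinyRunnable:
    """Hypothetical, minimal imitation of a runnable: one callable with a uniform interface."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        # Run the wrapped function on a single input
        return self.func(value)

    def batch(self, values):
        # Run the wrapped function on each input in turn
        return [self.invoke(v) for v in values]

    def __or__(self, other):
        # Compose: feed this step's output into the next step
        return TinyRunnable(lambda value: other.invoke(self.invoke(value)))


def search(query: str) -> list:
    """Stand-in for a vector store search method (plain substring match)."""
    corpus = ["Ladies and Gentlemen,", "the risks of obesity"]
    return [text for text in corpus if query.lower() in text.lower()]


retriever = TinyRunnable(search)   # wrapping gives the bare function invoke/batch/|
count_results = TinyRunnable(len)
chain = retriever | count_results

print(retriever.batch(["ladies", "obesity"]))
print(chain.invoke("obesity"))
```

A bare search method has none of these entry points, which is the gap that wrapping it (or calling `as_retriever`) fills.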

In [11]:
from langchain_core.runnables import RunnableLambda

# Wrap the vector store's search method in a Runnable and fix k=1 (one result per query)
retriever = RunnableLambda(vectorstore.similarity_search).bind(k=1)
retriever.batch(["ladies", "obesity"])

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)


[[Document(id='3d7aa4a6-0e36-45da-b528-bd9d7e784f44', metadata={'source': 'speech.txt'}, page_content='Ladies and Gentlemen,')],
 [Document(id='748ddd2a-9394-49ba-8647-50b055ec40df', metadata={'source': 'speech.txt'}, page_content='of the Department of Health. Let me disagree and explain the essence the obesity spread. In particular, reducing the time and money being spent on meal has become a generally accepted trend which,')]]

In [12]:
retriever = vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 1}
)

retriever.batch(["ladies", "obesity"])

[[Document(id='3d7aa4a6-0e36-45da-b528-bd9d7e784f44', metadata={'source': 'speech.txt'}, page_content='Ladies and Gentlemen,')],
 [Document(id='748ddd2a-9394-49ba-8647-50b055ec40df', metadata={'source': 'speech.txt'}, page_content='of the Department of Health. Let me disagree and explain the essence the obesity spread. In particular, reducing the time and money being spent on meal has become a generally accepted trend which,')]]

In [15]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough

message = """
Answer this question using provided context only.

{question}

Context:
{context}
"""
prompt = ChatPromptTemplate.from_messages([("human", message)])

rag_chain = {"context": retriever, "question": RunnablePassthrough()} | prompt | llm

response = rag_chain.invoke("tell me about obesity")

print(response.content)

Based on the provided context, here's what I can infer about obesity:

The context suggests that reducing the time and money spent on meals has become a generally accepted trend, which is related to the spread of obesity. This implies that obesity is a growing concern, and the trend of reducing meal time and expenses could be a contributing factor to its spread.
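
In the chain above, the retriever's output (a list of `Document` objects) is placed into `{context}` as-is, so the prompt most likely receives the list's string representation, ids and metadata included. A common pattern is to join only the `page_content` of each document first. The sketch below shows that formatting step on its own; `Doc` is a hypothetical stand-in for `langchain_core.documents.Document`, and `format_docs` is an illustrative helper name, not part of the original notebook.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for langchain_core.documents.Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs(docs) -> str:
    """Join the text of each retrieved document, separated by blank lines."""
    return "\n\n".join(doc.page_content for doc in docs)


docs = [
    Doc("First chunk.", {"source": "speech.txt"}),
    Doc("Second chunk.", {"source": "speech.txt"}),
]
context = format_docs(docs)
print(context)
```

Assuming the usual LCEL coercion of plain functions, a helper like this could be piped after the retriever (for example `retriever | format_docs`) so the model sees clean text rather than object representations.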
