# Document Loaders
Load documents from local PDF files and use an LLM to answer questions about them.

In [1]:
# !git clone https://github.com/laxmimerit/rag-dataset.git

In [2]:
from dotenv import load_dotenv
load_dotenv()

True

In [3]:
from langchain_community.document_loaders import PyMuPDFLoader

loader = PyMuPDFLoader("./rag-dataset/Health/1. Dietary supplements.pdf")
docs = loader.load()

print("Total pages:", len(docs))

Total pages: 17


In [4]:
docs[0].metadata

{'source': './rag-dataset/Health/1. Dietary supplements.pdf',
 'file_path': './rag-dataset/Health/1. Dietary supplements.pdf',
 'page': 0,
 'total_pages': 17,
 'format': 'PDF 1.7',
 'title': '',
 'author': '',
 'subject': '',
 'keywords': '',
 'creator': '',
 'producer': 'iLovePDF',
 'creationDate': '',
 'modDate': 'D:20241021113754Z',
 'trapped': ''}

In [5]:
print(docs[0].page_content)

International  Journal  of
Environmental Research
and Public Health
Review
Dietary Supplements—For Whom? The Current State of
Knowledge about the Health Effects of Selected
Supplement Use
Regina Ewa Wierzejska


Citation: Wierzejska, R.E. Dietary
Supplements—For Whom? The
Current State of Knowledge about the
Health Effects of Selected Supplement
Use. Int. J. Environ. Res. Public Health
2021, 18, 8897. https://doi.org/
10.3390/ijerph18178897
Academic Editor: Paul B. Tchounwou
Received: 15 July 2021
Accepted: 21 August 2021
Published: 24 August 2021
Publisher’s Note: MDPI stays neutral
with regard to jurisdictional claims in
published maps and institutional afﬁl-
iations.
Copyright: © 2021 by the author.
Licensee MDPI, Basel, Switzerland.
This article is an open access article
distributed
under
the
terms
and
conditions of the Creative Commons
Attribution (CC BY) license (https://
creativecommons.org/licenses/by/
4.0/).
Department of Nutrition and Nutritional Value of Food,

In [6]:
import os

# Walk the dataset folder and collect the path of every PDF file.
pdfs = []
for root, _, files in os.walk("rag-dataset"):
    for file in files:
        if file.endswith(".pdf"):
            pdfs.append(os.path.join(root, file))

In [7]:
pdfs

['rag-dataset\\Gym\\1. Analysis of Actual Fitness Supplement.pdf',
 'rag-dataset\\Gym\\2. High Prevalence of Supplement Intake.pdf',
 'rag-dataset\\Health\\1. Dietary supplements.pdf',
 'rag-dataset\\Health\\2. Nutraceuticals research.pdf',
 'rag-dataset\\Health\\3. Health supplements side effect.pdf']
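The paths above use Windows separators because `os.path.join` follows the host platform. As a minimal sketch of an alternative, the helper below (`find_pdfs` is a hypothetical name, not part of the notebook) uses `pathlib` to return sorted, forward-slash paths and also matches upper-case `.PDF` extensions.

```python
from pathlib import Path


def find_pdfs(root):
    """Return every PDF under `root` as a sorted list of forward-slash paths."""
    root = Path(root)
    return sorted(
        path.as_posix()
        for path in root.rglob("*")
        # Compare the suffix case-insensitively so "report.PDF" is included.
        if path.is_file() and path.suffix.lower() == ".pdf"
    )
```

Calling `find_pdfs("rag-dataset")` should list the same five files as above, in a stable order on any platform.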

In [8]:
docs = []
for pdf in pdfs:
    # Each loader returns one Document per page; collect them all in one list.
    loader = PyMuPDFLoader(pdf)
    docs.extend(loader.load())

In [9]:
print("Total pages from all docs:", len(docs))

Total pages from all docs: 64
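To see how those pages are distributed across the files, the `source` key shown in the metadata earlier can be tallied. This is a small sketch; `pages_per_file` is a hypothetical helper and assumes every document carries a `source` entry in its metadata.

```python
from collections import Counter


def pages_per_file(docs):
    """Count how many loaded pages came from each source file."""
    # Assumes each document exposes a `metadata` dict with a "source" key,
    # as in the PyMuPDFLoader output shown earlier.
    return Counter(doc.metadata["source"] for doc in docs)
```

With the list built above, `pages_per_file(docs).most_common()` would list the files from the largest page count to the smallest.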


In [10]:
def join_docs(docs):
    """Concatenate the text of all pages, separated by blank lines."""
    return "\n\n".join(doc.page_content for doc in docs)

In [11]:
context = join_docs(docs)

In [12]:
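import tiktoken

# tiktoken ships OpenAI tokenizers only. The Llama model used below has its own
# tokenizer, so counts obtained here are an approximation for that model.
encoder = tiktoken.encoding_for_model("gpt-4o-mini")
encoder.encode("congratulations")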

[542, 111291, 14571]

In [13]:
print("Total tokens in context:", len(encoder.encode(context)))

Total tokens in context: 60271
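A context of this size may exceed the window that the local model is served with, depending on how Ollama is configured, in which case part of the prompt would not reach the model. One way to stay within a limit is to group pages into batches under a token budget. The sketch below is an illustration under stated assumptions: `batch_by_budget` is a hypothetical helper, and its default counter is a rough whitespace word count rather than a real tokenizer.

```python
def batch_by_budget(texts, budget, count_tokens=None):
    """Greedily group texts into batches whose total token count fits the budget.

    A single text larger than the budget is placed in a batch of its own;
    it is not split.
    """
    if count_tokens is None:
        # Assumption: a whitespace word count as a crude stand-in for tokens.
        count_tokens = lambda text: len(text.split())

    batches, current, used = [], [], 0
    for text in texts:
        n = count_tokens(text)
        # Start a new batch when adding this text would overflow the budget.
        if current and used + n > budget:
            batches.append(current)
            current, used = [], 0
        current.append(text)
        used += n
    if current:
        batches.append(current)
    return batches
```

For counts closer to the figure printed above, pass `count_tokens=lambda text: len(encoder.encode(text))` and choose a `budget` that matches the configured context window.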


## QA with LLM

In [14]:
from langchain_ollama import ChatOllama
from langchain_core.prompts import (SystemMessagePromptTemplate, 
                                    HumanMessagePromptTemplate, 
                                    ChatPromptTemplate,
                                    PromptTemplate,
                                    MessagesPlaceholder)
from langchain_core.output_parsers import StrOutputParser

In [15]:
base_url = "http://localhost:11434/"
model_name = "llama3.2:1b"

llm = ChatOllama(
    base_url = base_url,
    model = model_name
)

In [16]:
system_prompt = SystemMessagePromptTemplate.from_template("You are a helpful AI assistant that answers user question based on the provided context. Do not answer in more than {words} words.")

human_prompt = """
Answer user question based on the provided context ONLY! If you do not know the answer, just say "I don't know".
### Context:
{context}

### Question:
{question}

### Answer:
"""
human_prompt = HumanMessagePromptTemplate.from_template(human_prompt)

template = ChatPromptTemplate([system_prompt, human_prompt])
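To make the placeholder substitution concrete, here is a plain-Python sketch of roughly what the template produces when the chain is invoked. It is an illustration only: `render_messages` is a hypothetical helper built on `str.format`, whereas the real rendering is done by LangChain (for example via `template.format_messages(...)`).

```python
SYSTEM_TEXT = (
    "You are a helpful AI assistant that answers user question based on "
    "the provided context. Do not answer in more than {words} words."
)

HUMAN_TEXT = """
Answer user question based on the provided context ONLY! If you do not know the answer, just say "I don't know".
### Context:
{context}

### Question:
{question}

### Answer:
"""


def render_messages(words, context, question):
    """Fill the placeholders and return (role, text) pairs for the model."""
    return [
        ("system", SYSTEM_TEXT.format(words=words)),
        ("human", HUMAN_TEXT.format(context=context, question=question)),
    ]
```

The whole joined context is pasted into the human message, so the prompt is at least as long as the context itself.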

In [17]:
qna_chain = template | llm | StrOutputParser()

In [22]:
question = "Are there any side effects on taking dietary supplements?"
response = qna_chain.invoke({'words': 20, 'context': context, 'question': question})
print(response)

According to the text, yes, there are several potential side effects associated with taking dietary supplements. Here are some of them:

1. **Botanical Supplements:**
	* Liver injury (inflammation and damage) in cases of certain botanicals like kava kava, saw palmetto, ginseng, and yohimbine.
	* Acute adverse effects due to bioactive constituents that may require hospitalization.
2. **Black Cohosh:**
	* Jaundice and liver failure in menopausal women.
3. **Immunohistochemistry of Cimicifuga racemosa (Black cohosh):** Increased mitochondrial reactive oxygen species, decreased catalase activity, which can lead to oxidative stress and liver toxicity.
4. **Kava Kava:**
	* Liver toxicity, sometimes requiring transplants.
5. **Ginkgo Biloba:**
	* Excessive bleeding in some cases due to inhibition of platelet aggregating factor.
6. **Saw Palmetto:**
	* Cholestatic hepatitis and pancreatitis in some patients.
7. **Valerian (Yohimbine):** Seizure with tachycardia and hypertension, potentially du
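
The answer above runs well past the 20 words requested in the system prompt, so the limit in the prompt is not a guarantee. If a hard cap matters, it can be applied after generation. The sketch below uses a hypothetical helper, `limit_words`, that truncates on whitespace and therefore flattens line breaks and list formatting in a shortened answer.

```python
def limit_words(text, max_words):
    """Return `text` unchanged if short enough, else its first `max_words` words."""
    words = text.split()
    if len(words) <= max_words:
        return text
    # Joining on single spaces drops the original line breaks.
    return " ".join(words[:max_words]) + " ..."
```

For example, `print(limit_words(response, 20))` would print only the first 20 words of the response.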

In [23]:
question = "Who is the current president of US?"
response = qna_chain.invoke({'words': 20, 'context': context, 'question': question})
print(response)

There is no specific question about a person's presidency that I can address directly in your request. However, I can provide you with information on who is currently serving as President of the United States.

As of my last update in 2021, Joe Biden is the President of the United States. He was inaugurated on January 20, 2021, and is serving his first term as president.


## Summarizer

In [25]:
system_prompt = SystemMessagePromptTemplate.from_template("You are a helpful AI assistant who works as a document summarizer. You must not hallucinate or provide false information.")

human_prompt = """
Summarize the given context in {words} words.

### Context:
{context}

### Summary:
"""
human_prompt = HumanMessagePromptTemplate.from_template(human_prompt)

template = ChatPromptTemplate([system_prompt, human_prompt])

In [26]:
sum_chain = template | llm | StrOutputParser()
response = sum_chain.invoke({'words': 50, 'context': context})
print(response)

The article discusses the potential health risks associated with various types of dietary supplements, including:

1. **Botanical Supplements**: These are derived from plants and can cause adverse effects such as liver injury, kidney damage, and allergic reactions.
2. **Body-Building Supplements**: These contain anabolic steroids that can cause cardiovascular problems, kidney damage, and other health issues.
3. **Herbal Medicines**: Many herbal medicines have been found to be contaminated with unknown substances, which can lead to adverse effects.

Some specific examples of botanical supplements and their potential risks include:

* **Black Cohosh**: Can cause liver failure in menopausal women
* **Kava Kava**: Can cause liver toxicity, pancreatitis, and seizures
* **Yohimbe**: Can cause a seizure with tachycardia and hypertension
* **Milk Thistle**: Can exacerbate iron overload in genetically predisposed individuals

The article also highlights the potential for herbal medicines to int
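
When the full context is too long to summarize in one call, a common alternative is to summarize in stages: summarize each chunk, then summarize the combined partial summaries. The sketch below shows that control flow only. `summarize_in_stages` is a hypothetical helper, and `summarize` is any callable taking `(text, words)`, for instance a thin wrapper around `sum_chain.invoke`.

```python
def summarize_in_stages(chunks, summarize, words=50):
    """Summarize each chunk, then summarize the joined partial summaries."""
    if not chunks:
        return ""
    # Stage 1: one summary per chunk.
    partials = [summarize(chunk, words) for chunk in chunks]
    if len(partials) == 1:
        return partials[0]
    # Stage 2: condense the partial summaries into a single summary.
    return summarize("\n\n".join(partials), words)
```

With the chain defined above, `summarize` could be `lambda text, words: sum_chain.invoke({'words': words, 'context': text})`, and `chunks` could be groups of pages joined with `join_docs`.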

## Generate Report

In [27]:
question = "Provide a detailed report from the provided context. Write answer in Markdown."
response = qna_chain.invoke({'words': 500,
                             'context': context,
                             'question': question})
print(response)

**Adverse Effects and Herb-Drug Interactions of Dietary Supplements**

The review highlights several cases of adverse effects and herb-drug interactions associated with dietary supplements. The authors emphasize that many botanical supplements may interact with prescribed medications, leading to acute adverse effects or exacerbation of underlying conditions.

**Common Adverse Effects of Botanical Supplements**

1. **Liver Injury**: Several cases were reported in the review, including:
	* Black cohosh-induced jaundice and liver failure in menopausal women
	* Kava kava-induced liver toxicity with cholestatic hepatitis, pancreatitis, and acute liver failure
	* Saw palmetto use-induced cholestatic hepatitis
2. **Cardiovascular Outcomes**: Cases were reported for:
	* Ginseng use leading to transient ischemia attack
	* Garlic use causing excessive bleeding
3. **Non-Hepatic Symptoms**: Several cases were reported, including:
	* Yohimbine use resulting in seizures with tachycardia and hyperten