## Document Loaders
- Load various kinds of documents from the web and local files.
- Apply an LLM to the documents for summarization and question answering.

In [3]:
from dotenv import load_dotenv

load_dotenv('./../.env')

True

### Project 1: Question Answering from PDF Document
- We will load the documents from local files and use an LLM to answer questions about them.
- Let's use published research papers on the misuse of health supplements for workouts.

rag-dataset: git@github.com:laxmimerit/rag-dataset.git

```bash
git clone git@github.com:laxmimerit/rag-dataset.git
```

In [10]:
# !git clone git@github.com:laxmimerit/rag-dataset.git
# !pip install pymupdf tiktoken 


In [12]:
### Read PDF File
from langchain_community.document_loaders import PyMuPDFLoader

loader = PyMuPDFLoader("./rag-dataset/gym supplements/1. Analysis of Actual Fitness Supplement.pdf")

docs = loader.load()

doc = docs[0]
# print(doc.page_content)

In [13]:
### Get the list of all available PDF files
import os

pdfs = []
for root, dirs, files in os.walk('rag-dataset'):
    # print(root, dirs, files)
    for file in files:
        if file.endswith('.pdf'):
            pdfs.append(os.path.join(root, file))
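
The `os.walk` loop above only matches a lowercase `.pdf` extension. As a hedged alternative, here is a minimal sketch using `pathlib` that also picks up files such as `paper.PDF`. The helper name `find_pdfs` is hypothetical and not part of the original notebook.

```python
from pathlib import Path


def find_pdfs(root):
    """Return sorted paths of all PDF files under root (extension matched case-insensitively)."""
    root_path = Path(root)
    return sorted(
        str(path)
        for path in root_path.rglob("*")          # walk the tree recursively
        if path.is_file() and path.suffix.lower() == ".pdf"
    )
```

In the notebook this could be used as `pdfs = find_pdfs('rag-dataset')`. Sorting makes the page order in `docs` reproducible between runs.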

In [14]:
### Read all pages of pdf files
docs = []
for pdf in pdfs:
    loader = PyMuPDFLoader(pdf)
    pages = loader.load()

    docs.extend(pages)


len(docs)

64

In [19]:
def format_docs(docs):
    return "\n\n".join([doc.page_content for doc in docs])

context = format_docs(docs)
# print(context)
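
`format_docs` discards the metadata of each page, so the model cannot tell which paper a passage came from. The sketch below is one possible variant that prefixes every page with its origin. It assumes the loader stores `source` and `page` keys in `doc.metadata`; the `Page` class is a hypothetical stand-in for a loaded document so that the example runs on its own.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    """Hypothetical stand-in for a loaded page: text plus a metadata dict."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs_with_sources(docs):
    """Join pages like format_docs, but label each one with its source and page number."""
    blocks = []
    for doc in docs:
        # Assumed metadata keys; fall back to placeholders when they are missing
        source = doc.metadata.get("source", "unknown")
        page = doc.metadata.get("page", "?")
        blocks.append(f"[source: {source} | page: {page}]\n{doc.page_content}")
    return "\n\n".join(blocks)


pages = [
    Page("First page text.", {"source": "a.pdf", "page": 0}),
    Page("Second page text.", {"source": "a.pdf", "page": 1}),
]
print(format_docs_with_sources(pages))
```

The labels add a few tokens per page, which is worth keeping in mind when the context is already large.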

In [24]:
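### Count Total Tokens
import tiktoken

# Note: this is the gpt-4o-mini tokenizer. The llama3.2 model used below has its own
# tokenizer, so treat these counts as estimates.
encoding = tiktoken.encoding_for_model('gpt-4o-mini')
len(encoding.encode(docs[0].page_content)), len(encoding.encode(context))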

(969, 60271)
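
The full context is roughly 60,000 tokens. If the model is served with a smaller context window than that, part of the input may never reach it. A minimal sketch of one way to stay under a budget is shown below: it keeps pages in order until the next page would exceed the limit. The function names, the budget value, and the whitespace-based counter are assumptions for illustration only.

```python
def select_within_budget(texts, max_tokens, count_tokens):
    """Keep texts in order until adding the next one would exceed max_tokens."""
    selected, used = [], 0
    for text in texts:
        cost = count_tokens(text)
        if used + cost > max_tokens:
            break  # stop at the first text that does not fit
        selected.append(text)
        used += cost
    return selected, used


def rough_count(text):
    # Crude whitespace-based stand-in for a real tokenizer
    return len(text.split())


pages = ["one two three", "four five", "six seven eight nine"]
kept, used = select_within_budget(pages, max_tokens=6, count_tokens=rough_count)
print(kept, used)
```

In the notebook, `count_tokens` could be `lambda t: len(encoding.encode(t))` and `texts` could be `[d.page_content for d in docs]`. `ChatOllama` also has a `num_ctx` parameter for the context window size; its default depends on the installed version, so it is worth checking before relying on a 60k-token prompt.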

In [58]:
### Question Answering using LLM

from langchain_ollama import ChatOllama

from langchain_core.prompts import (
                                        SystemMessagePromptTemplate,
                                        HumanMessagePromptTemplate,
                                        ChatPromptTemplate
                                        )

from langchain_core.output_parsers import StrOutputParser

base_url = "http://localhost:11434"
model = 'llama3.2:3b'

llm = ChatOllama(base_url=base_url, model=model)

system = SystemMessagePromptTemplate.from_template("""You are a helpful AI assistant who answers user questions based on the provided context.
                                                    Do not answer in more than {words} words.""")

prompt = """Answer the user's question based on the provided context ONLY! If you do not know the answer, just say "I don't know".
            ### Context:
            {context}

            ### Question:
            {question}

            ### Answer:"""


prompt = HumanMessagePromptTemplate.from_template(prompt)

messages = [system, prompt]
template = ChatPromptTemplate(messages)

qna_chain = template | llm | StrOutputParser()

# template
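
The expression `template | llm | StrOutputParser()` builds a pipeline in which each step receives the output of the previous one. The following is a conceptual sketch of that idea in plain Python, not LangChain's implementation; `pipe`, `fill_template`, `fake_llm`, and `to_text` are hypothetical names, and the fake model only echoes the prompt length.

```python
from functools import reduce


def pipe(*steps):
    """Compose steps left to right: the output of one step is the input of the next."""
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)


def fill_template(inputs):
    # Stand-in for the prompt template: fills the placeholders with the input dict
    return (
        "Answer in at most {words} words.\n"
        "### Context:\n{context}\n\n### Question:\n{question}\n\n### Answer:"
    ).format(**inputs)


def fake_llm(prompt_text):
    # Hypothetical stand-in for the model: returns a message-like dict
    return {"content": f"(model reply to a {len(prompt_text)}-character prompt)"}


def to_text(message):
    # Stand-in for the output parser: extract the plain string from the message
    return message["content"]


chain = pipe(fill_template, fake_llm, to_text)
reply = chain({"context": "Creatine is widely studied.",
               "question": "What is creatine?",
               "words": 30})
print(reply)
```

This is why `qna_chain.invoke(...)` takes a single dict with `context`, `question`, and `words`: the dict feeds the first step, and the last step returns a plain string.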



In [43]:
# template.invoke({'context': context, 'question': 'What is the best gym supplement?'})

In [45]:
response = qna_chain.invoke({'context': context, 'question': 'What is the best gym supplement?', 'words': 30})
print(response)

The answer to this question cannot be determined as it is based on personal preference, fitness goals, and individual needs. Different supplements may have varying effects on different people.

However, I can provide some general information about commonly used gym supplements:

1. Protein powder: A popular choice among athletes, protein powder can help with muscle growth and recovery.
2. Creatine: This supplement is known to increase strength and endurance, particularly in high-intensity activities like weightlifting.
3. Beta-Alanine: An amino acid that can help increase muscle carnosine levels, delaying the onset of fatigue during exercise.
4. Branched-Chain Amino Acids (BCAAs): BCAAs, consisting of leucine, isoleucine, and valine, can aid in muscle recovery and growth.
5. Pre-workout supplements: These typically contain a combination of ingredients such as caffeine, nitric oxide boosters, and other stimulants to enhance energy and performance during workouts.

It's essential to cons
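
The reply above runs well past the 30 words requested in the prompt, so a length instruction alone is not a guarantee. A minimal sketch of a post-processing check follows; `word_count` and `truncate_to_words` are hypothetical helpers, and truncating on whitespace is an assumption that flattens line breaks in the shortened text.

```python
def word_count(text):
    """Count whitespace-separated words."""
    return len(text.split())


def truncate_to_words(text, limit):
    """Return text unchanged if it fits, otherwise its first `limit` words plus an ellipsis."""
    words = text.split()
    if len(words) <= limit:
        return text
    # Note: joining on spaces drops the original line breaks
    return " ".join(words[:limit]) + " ..."


sample = "Protein powder can help with muscle growth and recovery after training"
print(word_count(sample))
print(truncate_to_words(sample, 5))
```

In the notebook, `word_count(response)` can be compared with the `words` value passed to the chain to decide whether to retry or truncate.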

In [46]:
response = qna_chain.invoke({'context': context, 'question': 'What is the best planet to live on?', 'words': 30})
print(response)

I don't have a question or an answer related to the text. The provided text appears to be a section from a scientific article about botanical supplements and their potential toxicities, interactions with other compounds, and mechanisms of action.

If you'd like, I can try to help you summarize or understand a specific part of the text.


In [47]:
response = qna_chain.invoke({'context': context, 'question': 'How to gain muscle mass?', 'words': 30})
print(response)

I can help with that question, but I need more context or information about the specific botanical supplements and their effects on muscle mass. However, based on general knowledge, here is a brief answer:

To gain muscle mass, it's essential to combine a healthy diet with regular exercise and sufficient protein intake. While certain botanical supplements may not directly contribute to muscle growth, some may have indirect effects or interact with other substances that can impact muscle development.

For example, ginseng has been associated with increased endurance and improved athletic performance, which could potentially help individuals achieve their muscle-building goals. However, more research is needed to fully understand the potential benefits of specific botanical supplements on muscle mass.

In contrast, some botanicals may have adverse effects when combined with exercise or diet, such as garlic and ginkgo biloba, which can increase the risk of bleeding. Therefore, it's crucia

In [48]:
response = qna_chain.invoke({'context': context, 'question': 'side effects of gym supplements?', 'words': 30})
print(response)

Side effects of botanical supplements, including gym supplements, can be varied and sometimes severe. Some reported effects include:

*   Nausea and fatigue
*   Headache
*   Jaundice and liver failure (e.g., black cohosh)
*   Drug-induced liver injury (e.g., kava kava)
*   Hepatitis (e.g., saw palmetto)
*   Cholestatic symptoms (e.g., echinacea, milk thistle)
*   Seizures with tachycardia and hypertension (e.g., yohimbe)
*   Exacerbated hemochromatosis (iron overload) in genetically predisposed individuals
*   Transient ischemic attack (e.g., ginseng)
*   Slow heart rate (e.g., black cohosh)
*   Excessive bleeding (e.g., garlic, ginkgo biloba)

These adverse effects are often mild but can be serious. In some cases, the mechanisms behind these effects are not fully understood and require further investigation.

Potential Herb-Drug Interactions:

Pharmacologically active compounds in botanicals are substrates of metabolizing enzymes, similar to drugs. This means that induction or suppres

### Project 2: PDF Document Summarization

In [59]:
system = SystemMessagePromptTemplate.from_template("""You are a helpful AI assistant who works as a document summarizer.
                                                   You must not hallucinate or provide any false information.""")

prompt = """Summarize the given context in {words} words.
            ### Context:
            {context}

            ### Summary:"""


prompt = HumanMessagePromptTemplate.from_template(prompt)

messages = [system, prompt]
template = ChatPromptTemplate(messages)

summary_chain = template | llm | StrOutputParser()
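
Summarizing all pages in a single call depends on the whole context fitting into the model's window. A common alternative, which this notebook does not use, is map-reduce summarization: summarize chunks separately, then summarize the partial summaries. The sketch below assumes a `summarize` callable and a character-based chunk size; `fake_summarize` is a hypothetical stand-in so that the example runs without a model.

```python
def chunk_texts(texts, max_chars):
    """Group consecutive texts into chunks of at most max_chars characters.

    A single text longer than max_chars becomes its own chunk.
    """
    chunks, current = [], ""
    for text in texts:
        candidate = f"{current}\n\n{text}" if current else text
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = text
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def map_reduce_summary(texts, summarize, max_chars):
    """Summarize each chunk, then summarize the joined partial summaries."""
    partials = [summarize(chunk) for chunk in chunk_texts(texts, max_chars)]
    if len(partials) == 1:
        return partials[0]
    return summarize("\n\n".join(partials))


def fake_summarize(text):
    # Hypothetical stand-in for a model call: keep only the first five words
    return " ".join(text.split()[:5])


pages = ["one two three four five six", "seven eight"]
print(map_reduce_summary(pages, fake_summarize, max_chars=30))
```

With the chain defined above, `summarize` could be `lambda chunk: summary_chain.invoke({'context': chunk, 'words': 100})` and `texts` could be `[d.page_content for d in docs]`. The trade-off is more model calls in exchange for every page being seen.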

In [54]:
response = summary_chain.invoke({'context': context, 'words': 100})
print(response)

The article discusses the potential toxicities and interactions of various botanical supplements, including their active compounds, typical use, dosage, and reported adverse effects. The authors highlight that reports of adverse effects directly attributable to botanicals are rare, but more serious cases have appeared, often related to liver toxicity and drug-induced liver injury (DILI). Specific examples include:

* Black cohosh associated with jaundice and liver failure
* Kava kava linked to liver toxicity and depletion of glutathione
* Saw palmetto causing cholestatic hepatitis and pancreatitis
* Echinacea associated with acute liver failure without a specific mechanism
* Valerian use inducing jaundice that was reversed by steroid administration
* Yohimbine's sympathomimetic properties leading to seizures, tachycardia, and hypertension
* Ginseng implicated in a transient ischemic attack
* Black cohosh regulating heart rate via activation of serotonin receptors

The article also disc

In [60]:
### Use the Q&A chain as a summarizer

response = qna_chain.invoke({'context': context, 'question': 'Summarize the given context', 'words': 100})
print(response)

The given context discusses the mechanisms of action and adverse effects of various botanical supplements. The table provided lists commonly used botanicals, their primary active constituents, typical use and dosage, and reported adverse effects. Reports of adverse effects directly attributable to botanicals are rare, but more serious clinical cases have appeared, often related to drug-induced liver injury (DILI) and its associated mechanisms. Additionally, potential herb-drug interactions are discussed, highlighting the importance of induction or suppression of metabolizing enzymes by botanical compounds.


In [61]:
response = qna_chain.invoke({'context': context, 'question': 'Provide a detailed report from the provided context. Write answer in Markdown', 'words': 1000})
print(response)

**Botanical Supplements and Adverse Effects**

### Overview of Botanical Supplements

Botanical supplements, also known as herbal remedies or phytochemicals, are derived from plants and have been used for centuries to prevent and treat various health conditions. These supplements can be found in various forms, including capsules, tablets, teas, and extracts.

### Commonly Used Botanical Supplements

1. **Black Cohosh (Cimicifuga racemosa)**
	* Primary Active Constituent: Catechins
	* Typical Use: Menopausal symptoms relief
	* Dosage: 40-80 mg per day
	* Reported Adverse Effects:
		+ Jaundice and liver failure in menopausal women
		+ Mitochondrial dysfunction, oxidative stress, and alteration of bile acid homeostasis
2. **Kava Kava**
	* Primary Active Constituent: Kavalactones
	* Typical Use: Anxiety relief, sleep aid
	* Dosage: 250-500 mg per day
	* Reported Adverse Effects:
		+ Liver toxicity, including cases requiring liver transplants
		+ Depletion of glutathione and inhibition of c