<a href="https://colab.research.google.com/github/mdehghani86/AppliedGenAI/blob/main/Lab_7_HealthLLM.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 🩺 Introduction to Health-LLM  

## 📌 Overview  
**Health-LLM** is a specialized large language model designed to **predict and analyze health measures** using **wearable sensor data** and **user demographics**. Developed by **MIT Media Lab**, this model enhances traditional LLMs by integrating physiological signals such as **heart rate, sleep patterns, activity levels, and stress indicators** to provide **personalized health assessments and insights**.  

🔗 **GitHub Repository**: [Health-LLM on GitHub](https://github.com/mitmedialab/Health-LLM/tree/main)  
📄 **Research Paper**: [Health-LLM Paper (PDF)](https://github.com/mitmedialab/Health-LLM/blob/main/pdf/paper.pdf)  

---

## 🛠 How Was Health-LLM Trained?  

Health-LLM was trained using a **fine-tuning approach** on multiple **public health datasets** to specialize in **consumer health prediction tasks**. The training process involved adapting **general-purpose LLMs** to better interpret **physiological and behavioral data**, improving their ability to generate **health insights and risk assessments**.  

### **1️⃣ Base Models Used**
Health-LLM builds on and evaluates existing large language models, including:  
- **GPT-3.5 / GPT-4** (OpenAI)  
- **Gemini-Pro** (Google)  
- **MedAlpaca** (an open medical model built on Meta’s LLaMA)  

These models were **not originally trained for health prediction from wearable data**, so Health-LLM **adapts them with health-specific prompts and fine-tuning data** to improve their performance in **health-related tasks**.  

### **2️⃣ Training Data**  
Health-LLM was trained on **four major health datasets**, each containing **physiological and behavioral data** from real-world users:  

| **Dataset** | **Tasks Covered** |
|------------|------------------|
| **PMData** | Fatigue, Stress, Readiness, Sleep Quality |
| **LifeSnaps** | Stress Resilience, Sleep Disorder |
| **GLOBEM** | Anxiety, Depression |
| **AW_FB** | Calories, Activity |

These datasets allow the model to **learn human health patterns** and make **more accurate, personalized health predictions**.  
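
To make the idea concrete, the sketch below shows one way wearable readings and demographics could be flattened into a text prompt for a language model. The function name `build_health_prompt`, the field names, and the sample values are all hypothetical, chosen for illustration; they are not taken from the Health-LLM repository or its datasets.

```python
def build_health_prompt(demographics, readings, task):
    """Flatten demographics and wearable readings into one text prompt.

    Hypothetical helper for illustration; not part of Health-LLM.
    """
    demo_text = ", ".join(f"{key}: {value}" for key, value in demographics.items())
    reading_lines = "\n".join(f"- {name}: {value}" for name, value in readings.items())
    return (
        f"User profile: {demo_text}\n"
        f"Wearable readings:\n{reading_lines}\n"
        f"Task: {task}"
    )


# Made-up sample values, not drawn from any of the datasets above.
example_prompt = build_health_prompt(
    demographics={"age": 29, "sex": "female"},
    readings={"resting heart rate (bpm)": 62, "sleep (hours)": 6.5, "steps": 8400},
    task="Rate the user's stress level from 0 (low) to 5 (high).",
)
print(example_prompt)
```

Keeping the prompt construction in one function makes it easy to try different orderings or wordings of the same readings.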

---

## 🚀 How to Use Health-LLM  

Unlike models served through a hosted **API**, **Health-LLM must be run locally** (the base model weights are downloaded from the **Hugging Face Hub**, then executed on your own machine). To use it:  

- **Clone the GitHub repository** and install the required dependencies.  
- **Run the inference script locally** to input health-related queries.  
- **Ensure you have sufficient GPU memory** for efficient execution, or modify the settings for CPU compatibility.  

Health-LLM is designed for **customizable, offline health predictions**, making it ideal for **privacy-focused and research applications**.  



## 📝 **Note: Run This Notebook on a GPU!**  

Health-LLM **requires a GPU** for efficient execution. Running it on a CPU will be **slow** and may cause **memory issues**.  

### **🔧 Enable GPU in Google Colab (One-Step Guide)**  
🔹 **Runtime** → **Change runtime type** → **Set Hardware accelerator to GPU** → **Save** → **Restart runtime** ✅  

To check if GPU is active, run:  
```python
import torch
print("🚀 Device:", "CUDA" if torch.cuda.is_available() else "CPU")
```


In [None]:
# 🏥 Install Required Libraries

# ✅ Install the necessary libraries
# This installs Hugging Face's `transformers` for NLP models and `torch` for running deep learning models.
!pip install -q transformers torch

# ✅ Verify installation
import torch  # PyTorch - Required for handling tensors and running models on GPU
import transformers  # Hugging Face Transformers library for loading pre-trained models

print("✅ Transformers and Torch installed successfully!")


In [None]:
# 🏥 Clone Health-LLM Repository & Install Dependencies

# ✅ Clone the Health-LLM repository from GitHub
# This downloads the entire project folder, including model files and scripts.
!git clone https://github.com/mitmedialab/Health-LLM.git

# ✅ Navigate into the cloned repository
# Moves into the Health-LLM directory so that all commands run within the project.
%cd Health-LLM

# ✅ Install required dependencies
# Reads the `requirements.txt` file and installs all necessary Python libraries.
!pip install -r requirements.txt


### **⚡ Loading Health-LLM: Running an LLM Locally vs. via API**  

Health-LLM fine-tunes **MedAlpaca-7B**, a medical-focused language model built on Meta’s LLaMA. It is optimized for **clinical reasoning and health-related predictions**, making it well-suited for wearable health monitoring.  

The following setup loads the **MedAlpaca model and tokenizer** from the Hugging Face Hub. It uses **FP16 precision** and **automatic device placement** so the model fits and runs efficiently on a GPU.  
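
As a rough check on why FP16 matters here, the arithmetic below estimates the memory needed just to hold the model weights. It assumes a nominal 7 billion parameters (the "7B" in the model name) and ignores activations and the generation cache, so treat the numbers as lower bounds. `weight_memory_gib` is an illustrative helper, not a library function.

```python
def weight_memory_gib(num_params, bytes_per_param):
    """Return the memory needed to store the weights, in GiB."""
    return num_params * bytes_per_param / 2**30


NOMINAL_PARAMS = 7_000_000_000  # assumption: "7B" taken at face value

for label, nbytes in [("FP32", 4), ("FP16", 2)]:
    print(f"{label}: {weight_memory_gib(NOMINAL_PARAMS, nbytes):.1f} GiB")
```

Under these assumptions, FP16 halves the weight footprint from roughly 26 GiB to roughly 13 GiB.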

---
<p align="center">
  <img src="https://media.licdn.com/dms/image/D5612AQEERi6rjuXWwA/article-cover_image-shrink_720_1280/0/1716064697000?e=2147483647&v=beta&t=0mDj7CtX0ZfXJ2LTs48ly1F2_IsX4zaK6GuzExadWEI"
  alt="Local LLM vs API LLM" width="600">
</p>  
The image illustrates how a **local LLM** operates without relying on an external API. The system first retrieves data from a **local vector database**, so the relevant context is available for processing. The query is then passed to the **local LLM** and processed directly on the user's device. Unlike API-based models, the response is **generated entirely on the local machine**, with no external requests. This removes **network latency** (overall speed still depends on the local hardware) and preserves **data privacy**, as no sensitive information is transmitted to external servers.  

---

### **🔍 Local LLM vs. API-based LLM: Key Differences**  

| Feature          | **Local LLM (Health-LLM)** | **API-based LLM (e.g., GPT-4, Claude)** |
|-----------------|--------------------------|----------------------------------|
| **Execution**   | Runs on the local machine | Runs on a cloud server |
| **Speed**       | No network round-trip; depends on local hardware | Adds network latency; runs on the provider's hardware |
| **Privacy**     | Data stays on the device | Data is sent to external servers |
| **Cost**        | No per-call fee after download (you supply the GPU) | Pay per API call |
| **Flexibility** | Can be fine-tuned/customized | Limited customization |
| **Setup**       | Requires downloading & configuring the model | Simple API call with a key |

---

🔹 Unlike API-based LLMs (e.g., OpenAI’s GPT models), Health-LLM runs locally, so once the model weights are downloaded, no internet connection or API requests are needed for inference.

🔹 This setup gives full control over the model and avoids API costs but requires more hardware resources (GPU/CPU). So, instead of calling a remote API, we download and execute the model directly here.  
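
Whether local or remote execution is faster in practice depends on the hardware and the network, so it is worth measuring rather than assuming. The sketch below times any text-generation callable. `time_call` and `fake_backend` are hypothetical names, and the stand-in backend only sleeps briefly so the example runs without a model or an API key.

```python
import time


def time_call(generate, prompt):
    """Run generate(prompt) once and return (response, elapsed_seconds)."""
    start = time.perf_counter()
    response = generate(prompt)
    elapsed = time.perf_counter() - start
    return response, elapsed


def fake_backend(prompt):
    """Stand-in for a real model call; sleeps to simulate work."""
    time.sleep(0.05)
    return f"echo: {prompt}"


timed_response, elapsed = time_call(fake_backend, "How much sleep do adults need?")
print(f"{timed_response!r} took {elapsed:.3f} s")
```

Any function that takes a prompt string and returns text can be passed in place of `fake_backend`, which allows a like-for-like comparison on your own setup.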


In [None]:
# ⚡ Load Health-LLM (Based on MedAlpaca)

# ✅ Import PyTorch and the Hugging Face utilities for model & tokenizer
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# ✅ Define the model name (the MedAlpaca-7B checkpoint on the Hugging Face Hub)
model_name = "medalpaca/medalpaca-7b"

# ✅ Load the tokenizer (converts text into token IDs for model processing)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# ✅ Use FP16 on GPU to halve memory use compared with FP32; fall back to FP32 on CPU
dtype = torch.float16 if torch.cuda.is_available() else torch.float32

# ✅ Load the causal language model (used for generating text responses)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=dtype,
    device_map="auto"  # Places the model on the GPU if available (uses the `accelerate` package)
)

# 🔹 If a GPU is detected, the model runs there, improving speed and efficiency.
# 🔹 If no GPU is available, it falls back to CPU, where a 7B model will be very slow.


In [None]:
# 🏥 Define the Inference Function for Health-LLM

def generate_response(prompt, max_new_tokens=200):
    """
    Generates a response from the Health-LLM model based on a given prompt.

    Parameters:
    - prompt (str): The input question or statement for the model.
    - max_new_tokens (int): The maximum number of tokens the model should generate.

    Returns:
    - response (str): The generated text response (without the prompt).
    """

    # ✅ Tokenize the input and move it to the same device as the model
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # ✅ Generate the model's response with sampling enabled
    outputs = model.generate(
        input_ids=inputs["input_ids"],  # Pass tokenized input IDs to the model
        attention_mask=inputs["attention_mask"],  # Tell the model which tokens are real input
        max_new_tokens=max_new_tokens,  # Limit the number of generated tokens (prompt excluded)
        do_sample=True  # Enable randomness in text generation
    )

    # ✅ Keep only the newly generated tokens (the output starts with the prompt tokens)
    new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]

    # ✅ Decode the generated tokens back into a readable text response
    response = tokenizer.decode(new_tokens, skip_special_tokens=True)

    return response  # Return the final response
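

The `do_sample=True` flag makes generation random: the next token is drawn from the model's probability distribution instead of always taking the most likely one. A temperature setting, when used, rescales that distribution first. The pure-Python sketch below shows the effect on three made-up scores. It illustrates the general technique and is not the code path `transformers` runs internally.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw scores to probabilities, sharpened or flattened by temperature."""
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


toy_logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens

for temperature in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(toy_logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the top-scoring token, while higher temperatures spread it more evenly across the candidates.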


In [None]:
# Example Usage of Health-LLM Inference

# Define a personalized test prompt
prompt = "I am a 25-year-old female experiencing frequent dizziness, fatigue, and low blood pressure (90/60 mmHg). What could be the possible causes, and how can I manage this condition?"

# Generate response using the model
response = generate_response(prompt)

# Print formatted output
print("Prompt:", prompt)
print("Response:", response)


In [None]:
import pandas as pd  # Import pandas for table formatting
from IPython.display import display  # Import display for showing tables in Colab

# 🏥 Store refined prompts and responses in a list
data = []

# 🏃‍♂️ Fitness & Health Optimization
prompt = "I am a 35-year-old male with a resting heart rate of 78 bpm. I walk 5,000 steps daily and sleep for 6 hours. My diet consists mainly of processed foods. What are the best strategies to improve my fitness and reduce stress?"
response = generate_response(prompt)
data.append(["🏃‍♂️ Fitness & Stress", prompt, response])

# 🥗 Nutrition & Weight Management
prompt = "I am a 45-year-old female. My weight is 85 kg, and my height is 165 cm. I have a sedentary lifestyle and consume a high-carb diet. How can I improve my nutrition and achieve a healthier BMI?"
response = generate_response(prompt)
data.append(["🥗 Nutrition & Weight", prompt, response])

# 😴 Sleep & Recovery
prompt = "I sleep only 5 hours per night and wake up feeling exhausted. I often drink coffee late at night due to work stress. How can I improve my sleep quality and energy levels throughout the day?"
response = generate_response(prompt)
data.append(["😴 Sleep & Energy", prompt, response])

# 📊 Convert to DataFrame and Display
df = pd.DataFrame(data, columns=["Category", "Prompt", "Response"])

# ✅ Print the table in Colab
display(df)


In [None]:
# ⚙️ Install required Hugging Face, LangChain & Wikipedia libraries

!pip install -q \
    datasets tokenizers accelerate huggingface_hub \
    langchain langchain-openai langchain-community \
    wikipedia-api wikipedia

# 🔹 datasets: Provides access to large datasets for training and evaluation
# 🔹 tokenizers: Efficiently tokenizes text for LLM processing
# 🔹 accelerate: Optimizes model execution across CPU, GPU, and multi-GPU setups
# 🔹 huggingface_hub: Enables downloading and managing models from Hugging Face
# 🔹 langchain: Core framework for integrating LLMs with memory, tools, and agents
# 🔹 langchain-openai: Official LangChain module for OpenAI models (GPT-4, ChatGPT)
# 🔹 langchain-community: Community-driven utilities for LangChain
# 🔹 wikipedia-api: Allows Wikipedia search integration for external knowledge retrieval
# 🔹 wikipedia: Provides direct API access to Wikipedia for fetching factual information

# 🔹 Ensure latest version of langchain-openai for OpenAI API integration
!pip install -U langchain-openai


In [None]:
# 📥 Import necessary libraries for LangChain, Hugging Face & Wikipedia integration

import os
import pandas as pd  # 🔹 For structured table formatting
from IPython.display import display  # 🔹 Enables table display in Colab

# 🔹 Hugging Face & Model Execution
import datasets  # 🔹 Provides access to large datasets for LLM training & evaluation
import accelerate  # 🔹 Speeds up execution on CPU, GPU, or multi-GPU
import huggingface_hub  # 🔹 Manages model downloads & integration from Hugging Face

# 🔹 LangChain - LLM Framework
import langchain
from langchain_openai import ChatOpenAI  # 🔹 Correct import for OpenAI chat models
from langchain.agents import initialize_agent, AgentType  # 🔹 Enables agent-based interactions
from langchain.tools import Tool  # 🔹 Defines external tools for agent-based workflows
from langchain.memory import ConversationBufferMemory  # 🔹 Enables memory for multi-turn interactions
from langchain_community.utilities import WikipediaAPIWrapper  # 🔹 Allows Wikipedia search integration
from langchain.prompts import PromptTemplate  # 🔹 Helps in structuring prompts for better responses

# ✅ Verify that all libraries are installed and imported correctly
print("✅ All required libraries installed and imported successfully!")


In [None]:

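# ✅ Set OpenAI API Key from Colab Secrets (outside Colab, an existing environment variable is used)
try:
    from google.colab import userdata  # Available only inside Google Colab
    os.environ["OPENAI_API_KEY"] = userdata.get("OpenAI_Key")
except ImportError:
    pass  # Outside Colab, OPENAI_API_KEY must already be set in the environment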

# ✅ Load GPT-4 (as a chat model)
gpt4_llm = ChatOpenAI(
    model="gpt-4",
    temperature=0.7,
    openai_api_key=os.getenv("OPENAI_API_KEY")  # Ensures correct API key usage
)

# ✅ Define conversation memory to store chat history
memory = ConversationBufferMemory(memory_key="chat_history")

# ✅ Define a structured prompt template with chat history
prompt_template = PromptTemplate(
    input_variables=["chat_history", "query"],
    template="""
    You are an AI assistant providing accurate answers.
    Maintain continuity by considering past conversation history.

    Chat History:
    {chat_history}

    User Question:
    {query}

    Provide a clear and well-structured response.
    """
)

# ✅ Modify GPT-4 tool to use the prompt template with memory
def gpt4_with_memory(query):
    """Formats the query with chat history before sending to GPT-4."""
    formatted_query = prompt_template.format(
        chat_history=memory.load_memory_variables({})["chat_history"], query=query
    )
    return gpt4_llm.invoke(formatted_query).content  # Return the reply text, not the message object

# ✅ Define GPT-4 as a tool
gpt4_tool = Tool(
    name="ChatGPT",
    func=gpt4_with_memory,
    description="Use this for answering general questions, explanations, and advice."
)

# ✅ Define Wikipedia Search as a tool
wiki_tool = Tool(
    name="Wikipedia Search",
    func=WikipediaAPIWrapper().run,
    description="Use this to look up general knowledge, historical facts, or scientific topics."
)

# ✅ Initialize the agent with Wikipedia & GPT-4
agent = initialize_agent(
    tools=[gpt4_tool, wiki_tool],  # Only two tools: ChatGPT & Wikipedia
    llm=gpt4_llm,  # Model that drives the agent's reasoning and tool selection
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,  # ReAct-style agent that picks a tool from its description
    memory=memory,  # Stores each turn so gpt4_with_memory can read the chat history
    verbose=True
)
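

To see what the memory and prompt template contribute, the sketch below reproduces the idea in plain Python: earlier turns are stored as text and interpolated into the prompt before each new question. `SimpleBufferMemory` and `format_prompt` are hypothetical stand-ins written for this illustration. They are not LangChain classes and omit everything LangChain does beyond string formatting.

```python
class SimpleBufferMemory:
    """Keep every turn of the conversation as plain text."""

    def __init__(self):
        self.turns = []

    def save(self, user_text, ai_text):
        self.turns.append(f"Human: {user_text}")
        self.turns.append(f"AI: {ai_text}")

    def history(self):
        return "\n".join(self.turns)


TEMPLATE = "Chat History:\n{chat_history}\n\nUser Question:\n{query}"


def format_prompt(memory_store, query):
    """Fill the template with the stored history and the new question."""
    return TEMPLATE.format(chat_history=memory_store.history(), query=query)


memory_demo = SimpleBufferMemory()
# The AI reply here is a placeholder string, not real model output.
memory_demo.save("How can I improve my sleep quality?", "Keep a regular schedule.")
print(format_prompt(memory_demo, "Tell me more about that."))
```

Because the history is prepended to every prompt, a follow-up such as "Tell me more about that." arrives with the earlier question and answer as context.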


In [None]:
# 📌 Example Queries & Response Storage
data = []

# 🩺 Medical Query - Symptoms of High Blood Pressure
prompt = "What are the symptoms of high blood pressure?"
response = agent.run(prompt)  # ✅ Use LangChain agent to get response
data.append(["🩺 Medical - Blood Pressure", prompt, response])

# 🌍 Wikipedia Lookup - First Person on the Moon
prompt = "Who was the first person to walk on the moon?"
response = agent.run(prompt)  # ✅ Wikipedia should fetch this
data.append(["🌍 Wikipedia - Space", prompt, response])

# 😴 Sleep & Health Query
prompt = "How can I improve my sleep quality?"
response = agent.run(prompt)  # ✅ GPT-4 provides advice
data.append(["😴 Sleep & Energy", prompt, response])

# 🌏 Wikipedia Lookup - Capital of Japan
prompt = "What is the capital of Japan?"
response = agent.run(prompt)  # ✅ Wikipedia should fetch this
data.append(["🌏 Wikipedia - Geography", prompt, response])

# 💤 Follow-up Sleep Query
prompt = "Tell me more about sleep quality improvements."
response = agent.run(prompt)  # ✅ GPT-4 remembers chat history
data.append(["💤 Sleep Follow-Up", prompt, response])

# 📊 Convert to DataFrame and Display
df = pd.DataFrame(data, columns=["Category", "Prompt", "Response"])

# ✅ Print the table in Colab
display(df)

In [None]:
!pip install -q gradio


In [None]:
import gradio as gr  # 💻 Import Gradio to create an interactive web UI

# 🏥 Define sample health-related questions (e.g., predicting health risks)
sample_questions = [
    "Predict my health risks: Age 65, Male, BMI 28, Smoker, Blood Pressure 140/90",
    "Based on my data: Age 45, Female, BMI 22, Resting Heart Rate 80 bpm, how healthy am I?",
    "What are early signs of heart disease in a 55-year-old male?",
    "How does sleep deprivation affect long-term health?",
    "What are the biggest risk factors for diabetes?"
]

# 🏥 Define the Gradio UI for interacting with the LangChain Agent
with gr.Blocks() as demo:
    gr.Markdown("## 🏥 AI Health & Knowledge Assistant 🤖")  # 🔹 Title of the UI

    # 📥 User input field (where selected questions will be inserted)
    user_input = gr.Textbox(label="🔍 Ask a Health or General Question")

    # 📌 Add sample questions as buttons
    with gr.Row():
        for question in sample_questions:
            gr.Button(question).click(
                fn=lambda q=question: q,
                inputs=[],
                outputs=user_input
            )  # 📌 Clicking a button fills the textbox

    # 🔥 Temperature slider for GPT-4 (display only for now: it is not passed to the agent, which keeps temperature=0.7)
    temperature_slider = gr.Slider(
        minimum=0.1, maximum=1.0, value=0.7, step=0.1,
        label="🌡️ GPT-4 Temperature (Lower = Precise, Higher = Creative)"
    )

    # ✅ Submit button to process the query
    submit_button = gr.Button("💡 Get AI Response")

    # 📤 Output box to display the AI-generated response
    response_output = gr.Textbox(label="🤖 AI Response", interactive=False)

    # ✅ Define interaction: When the button is clicked, the LangChain agent is triggered
    submit_button.click(
        fn=lambda user_question: agent.run(user_question),  # 🔄 Uses the LangChain agent
        inputs=[user_input],  # 📥 Takes user query as input
        outputs=response_output  # 📤 Displays AI-generated response
    )

# 🌍 Launch the Gradio app
demo.launch()


## 1️⃣ Install Required Libraries 🚀
