# Random Forest Classifier – Dataset 2 (Heart Disease Prediction)

This notebook demonstrates how to **load a pre-trained Random Forest model** (previously trained on Dataset 2) and use it to make predictions for new patients. A simple **Gradio interface** is also provided for interactive input and real-time predictions.  

## Workflow Description

The workflow in this notebook follows these steps:

1. **Load pre-trained model and scaler**  
   - The Random Forest model and the StandardScaler (fitted during training) are loaded from the `../models/random_forest_ds2/` directory.  

2. **Define the prediction function** (`predict_patient`)  
   - Takes patient data as input.  
   - Converts the input into a DataFrame with the same feature structure used in training.  
   - Applies the scaler transformation to match the training distribution.  
   - Uses the Random Forest model to make a prediction.  
   - Returns the result as text: `Result: Heart Disease` or `Result: No Heart Disease`.  

3. **Build the Gradio interface**  
   - Gradio provides a graphical interface where users can enter patient data.  
   - When the **Predict** button is clicked, Gradio calls the `predict_patient` function.  
   - The prediction result is displayed in the **Prediction Result** text box of the interface.  

**In short:**  
**User Input → Prediction Function (scaling + model) → Gradio Output**  
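
The scaling step in the prediction function can be illustrated without scikit-learn. The following is a minimal sketch, assuming the saved scaler is a `StandardScaler` with default settings, which standardizes each feature as (x - mean) / std using the population standard deviation. The helper names `fit_standardizer` and `standardize` are hypothetical, and the five ages are only the first rows of the dataset preview shown further down, not the full training set.

```python
from statistics import fmean, pstdev


def fit_standardizer(values):
    """Return (mean, std) for one feature column, using the population std."""
    mean = fmean(values)
    std = pstdev(values)
    # A constant column has std 0; fall back to 1.0 to avoid division by zero.
    return mean, (std if std > 0 else 1.0)


def standardize(value, mean, std):
    """Map a raw value onto the standardized scale: (x - mean) / std."""
    return (value - mean) / std


# Illustrative only: ages from the first five rows of the dataset preview.
ages = [75, 48, 53, 69, 62]
age_mean, age_std = fit_standardizer(ages)
scaled_ages = [standardize(a, age_mean, age_std) for a in ages]
```

New patient values have to be transformed with the statistics learned during training, which is why the notebook loads the fitted scaler instead of fitting a new one on the input.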

In [1]:
import gradio as gr
import pandas as pd
import numpy as np 
import joblib

In [2]:
data = pd.read_csv('../data/preprocessed_rf/dataset_2_preprocessed.csv')
data

Unnamed: 0,age,gender,cholesterol,pressure_high,heart_rate,smoking,alcohol_intake,exercise_hours,family_history,diabetes,obesity,stress_level,blood_sugar,exercise_induced_angina,heart_disease,chest_pain_type_atypical_angina,chest_pain_type_non_anginal_pain,chest_pain_type_typical_angina
0,75,0,228,119,66,1,2,1,0,0,1,8,119,1,1,1,0,0
1,48,1,204,165,62,1,0,5,0,0,0,9,70,1,0,0,0,1
2,53,1,234,91,67,0,2,3,1,0,1,5,196,1,1,1,0,0
3,69,0,192,90,72,1,0,4,0,1,0,7,107,1,0,0,1,0
4,62,0,172,163,93,0,0,6,0,1,0,2,183,1,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
995,56,0,269,111,86,0,2,5,0,1,1,10,120,0,1,0,1,0
996,78,0,334,145,76,0,0,6,0,0,0,10,196,1,1,0,0,1
997,79,1,151,179,81,0,1,4,1,0,1,8,189,1,0,0,0,0
998,60,0,326,151,68,2,0,8,1,1,0,5,174,1,1,1,0,0


In [3]:
model = joblib.load("../models/random_forest_ds2/random_forest_ds2.pkl")
scaler = joblib.load("../models/random_forest_ds2/scaler_ds2.pkl")
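
If either file is missing, `joblib.load` fails with a file-not-found error. As a hedged sketch, a small check like the one below could run before loading to report which artifacts are absent. The helper `missing_artifacts` is hypothetical; the directory and file names are the ones used in the loading cell above.

```python
from pathlib import Path


def missing_artifacts(model_dir, filenames):
    """Return the expected files that are absent from model_dir."""
    base = Path(model_dir)
    return [name for name in filenames if not (base / name).is_file()]


# File names taken from the loading cell above.
EXPECTED_FILES = ["random_forest_ds2.pkl", "scaler_ds2.pkl"]

missing = missing_artifacts("../models/random_forest_ds2", EXPECTED_FILES)
if missing:
    print(f"Missing model artifacts: {missing}")
```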

In [4]:
def predict_patient(age, gender, cholesterol, 
                    pressure_high, heart_rate, 
                    smoking, alcohol_intake, exercise_hours,
                    family_history, diabetes, obesity,
                    stress_level, blood_sugar, exercise_induced_angina,
                    chest_pain_type_atypical_angina, 
                    chest_pain_type_non_anginal_pain,
                    chest_pain_type_typical_angina):
    
    # Encode the radio selections as integers; unselected inputs (None) default to 0.
    gender = 1 if gender == "Male" else 0
    smoking = int(smoking) if smoking is not None else 0
    alcohol_intake = int(alcohol_intake) if alcohol_intake is not None else 0
    family_history = int(family_history) if family_history is not None else 0
    diabetes = int(diabetes) if diabetes is not None else 0
    obesity = int(obesity) if obesity is not None else 0
    exercise_induced_angina = int(exercise_induced_angina) if exercise_induced_angina is not None else 0
    
    chest_pain_type_atypical_angina = 1 if chest_pain_type_atypical_angina else 0
    chest_pain_type_non_anginal_pain = 1 if chest_pain_type_non_anginal_pain else 0
    chest_pain_type_typical_angina = 1 if chest_pain_type_typical_angina else 0
    
    patient_dict = {
        "age": age,
        "gender": gender,
        "cholesterol": cholesterol,
        "pressure_high": pressure_high,
        "heart_rate": heart_rate,
        "smoking": smoking,
        "alcohol_intake": alcohol_intake,
        "exercise_hours": exercise_hours,
        "family_history": family_history,
        "diabetes": diabetes,
        "obesity": obesity,
        "stress_level": stress_level,
        "blood_sugar": blood_sugar,
        "exercise_induced_angina": exercise_induced_angina,
        "chest_pain_type_atypical_angina": chest_pain_type_atypical_angina,
        "chest_pain_type_non_anginal_pain": chest_pain_type_non_anginal_pain,
        "chest_pain_type_typical_angina": chest_pain_type_typical_angina
    }
    
    # Build a one-row DataFrame, scale it with the training-time scaler, then predict.
    new_patient = pd.DataFrame([patient_dict])
    new_patient_scaled = scaler.transform(new_patient)
    prediction = model.predict(new_patient_scaled)
    
    return "Result: Heart Disease" if prediction[0] == 1 else "Result: No Heart Disease"
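
`predict_patient` relies on the dictionary's insertion order matching the column order the scaler and model were fitted on. The sketch below makes that order explicit and fails loudly on a missing or unexpected key. It assumes training used the 17 columns of the dataset preview, in that order, without the index column and the `heart_disease` target; the training notebook is not shown here, so this is an assumption. `FEATURE_ORDER` and `ordered_row` are hypothetical names.

```python
# Column order copied from the dataset preview, minus the index and the target.
FEATURE_ORDER = [
    "age", "gender", "cholesterol", "pressure_high", "heart_rate",
    "smoking", "alcohol_intake", "exercise_hours", "family_history",
    "diabetes", "obesity", "stress_level", "blood_sugar",
    "exercise_induced_angina", "chest_pain_type_atypical_angina",
    "chest_pain_type_non_anginal_pain", "chest_pain_type_typical_angina",
]


def ordered_row(patient_dict, feature_order=FEATURE_ORDER):
    """Return the patient's values as a list in feature_order, or raise."""
    missing = [name for name in feature_order if name not in patient_dict]
    extra = [name for name in patient_dict if name not in feature_order]
    if missing or extra:
        raise ValueError(f"missing={missing}, unexpected={extra}")
    return [patient_dict[name] for name in feature_order]
```

Inside the prediction function, the one-row frame could then be built with `pd.DataFrame([ordered_row(patient_dict)], columns=FEATURE_ORDER)`.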


In [5]:
with gr.Blocks(theme=gr.themes.Soft()) as demo:
    # Dancho, I know you don't like emojis, but sometimes they liven up the interface nicely :D
    gr.Markdown(
        """
        # ❤️ Heart Disease Risk Predictor  
        Enter the patient's data below  
        """
    )

    with gr.Row():
        # Left column
        with gr.Column():
            with gr.Group():
                gr.Markdown("### Personal Info")
                age = gr.Slider(18, 100, step=1, label="Age")
                gender = gr.Radio(["Female", "Male"], label="Gender")

            with gr.Group():
                gr.Markdown("### Vital Signs")
                pressure_high = gr.Slider(80, 200, step=1, label="Blood Pressure (mmHg)")
                heart_rate = gr.Slider(40, 200, step=1, label="Heart Rate (bpm)")
                blood_sugar = gr.Slider(60, 300, step=1, label="Blood Sugar (mg/dL)")

            with gr.Group():
                gr.Markdown("### Cholesterol & Lifestyle")
                cholesterol = gr.Slider(100, 400, step=1, label="Cholesterol (mg/dL)")
                exercise_hours = gr.Slider(0, 20, step=1, label="Exercise Hours per Week")
                stress_level = gr.Slider(1, 10, step=1, label="Stress Level (1-10)")

        # Right column
        with gr.Column():
            with gr.Group():
                gr.Markdown("### Habits")
                smoking = gr.Radio([0, 1], label="Smoking (0=No, 1=Yes)")
                alcohol_intake = gr.Radio([0, 1, 2], label="Alcohol Intake (0=No, 1=Yes, 2=Heavy)")

            with gr.Group():
                gr.Markdown("### Medical History")
                family_history = gr.Radio([0, 1], label="Family History (0=No, 1=Yes)")
                diabetes = gr.Radio([0, 1], label="Diabetes (0=No, 1=Yes)")
                obesity = gr.Radio([0, 1], label="Obesity (0=No, 1=Yes)")
                exercise_induced_angina = gr.Radio([0, 1], label="Exercise Induced Angina (0=No, 1=Yes)")

            with gr.Group():
                gr.Markdown("### Chest Pain Type")
                chest_pain_type_atypical_angina = gr.Checkbox(label="Atypical Angina")
                chest_pain_type_non_anginal_pain = gr.Checkbox(label="Non-Anginal Pain")
                chest_pain_type_typical_angina = gr.Checkbox(label="Typical Angina")

    with gr.Row():
        predict_btn = gr.Button("Predict", variant="primary")
    
    output = gr.Textbox(label="Prediction Result", lines=3)

    predict_btn.click(
        fn=predict_patient,
        inputs=[age, gender, cholesterol, pressure_high, heart_rate, 
                smoking, alcohol_intake, exercise_hours, family_history, 
                diabetes, obesity, stress_level, blood_sugar, 
                exercise_induced_angina, chest_pain_type_atypical_angina,
                chest_pain_type_non_anginal_pain, chest_pain_type_typical_angina],
        outputs=output
    )

if __name__ == "__main__":
    demo.launch(inline=False, share=True)

* Running on local URL:  http://127.0.0.1:7860
* Running on public URL: https://a385afd7f5545e728c.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)
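
The three chest pain checkboxes can be ticked together, which produces a combination the one-hot columns never contain in the dataset preview. One possible alternative, sketched here under assumptions, is to collect a single choice and derive the three flags from it. The labels match the checkboxes above; `"None of these"` is an assumed label for the reference category (all three flags 0, as in row 4 of the preview), and `encode_chest_pain` is a hypothetical helper.

```python
# "None of these" is an assumed label for the all-zero reference category.
CHEST_PAIN_CHOICES = [
    "None of these",
    "Atypical Angina",
    "Non-Anginal Pain",
    "Typical Angina",
]


def encode_chest_pain(choice):
    """Return (atypical_angina, non_anginal_pain, typical_angina) as 0/1 flags."""
    if choice is None:
        # Nothing selected: treat as the reference category.
        choice = CHEST_PAIN_CHOICES[0]
    if choice not in CHEST_PAIN_CHOICES:
        raise ValueError(f"Unknown chest pain type: {choice!r}")
    return tuple(int(choice == label) for label in CHEST_PAIN_CHOICES[1:])
```

In the interface, a single `gr.Radio(CHEST_PAIN_CHOICES, label="Chest Pain Type")` could then replace the three checkboxes, with the prediction function calling `encode_chest_pain` on its value.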
