# 🧪 Parkinson's Disease Detection - Jupyter Notebook

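This notebook demonstrates how to load a pre-trained machine learning model and use it to predict Parkinson's disease for a single subject.  
It takes the 32 clinical, lifestyle, and symptom features printed by the feature-check cell below (for example `Age`, `BMI`, `UPDRS`, and `Tremor`) and predicts whether the subject has Parkinson's.

Model used: Keras model loaded from `../models/parkinsons_model.h5`, together with the saved scaler `../models/scaler.pkl`  
Source: NeuroDetect (`src/train_model.py`)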

In [7]:
# 📦 Import required libraries
import pandas as pd
import numpy as np
import joblib
from tensorflow.keras.models import load_model


In [9]:
# load_model and joblib are already imported in the first cell.
# Paths are relative to the notebook's working directory.
model = load_model("../models/parkinsons_model.h5")
scaler = joblib.load("../models/scaler.pkl")

print("Model and scaler loaded successfully!")




Model and scaler loaded successfully!


In [10]:
import os

print("Notebook is running from:", os.getcwd())
print("Does model exist?", os.path.exists("../models/parkinsons_model.h5"))
print("Does scaler exist?", os.path.exists("../models/scaler.pkl"))


Notebook is running from: C:\Users\aizag\Desktop\Coding\Hackclub\NeuroDetect\disease-predictor
Does model exist? True
Does scaler exist? True
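
The existence check above can also be written so that it fails early with a clear message. The following is a minimal sketch, not part of the NeuroDetect code: `require_files` is a hypothetical helper, and the two paths in the demo are the ones this notebook already uses.

```python
from pathlib import Path


def require_files(base_dir, relative_paths):
    """Resolve each relative path against base_dir and raise if any is missing."""
    resolved = [(Path(base_dir) / rel).resolve() for rel in relative_paths]
    missing = [str(path) for path in resolved if not path.exists()]
    if missing:
        raise FileNotFoundError("Missing artifacts: " + ", ".join(missing))
    return resolved


if __name__ == "__main__":
    # Demo with the notebook's own paths; prints a message instead of crashing
    # when the notebook is started from a different folder.
    try:
        require_files(Path.cwd(), ["../models/parkinsons_model.h5", "../models/scaler.pkl"])
        print("All artifacts found.")
    except FileNotFoundError as err:
        print(err)
```

Raising `FileNotFoundError` before calling `load_model` or `joblib.load` keeps the error message focused on the path rather than on the loader.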


In [11]:
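# Voice-feature example, kept for reference only.
# These are NOT the features this model expects (see the feature check below);
# input_data is redefined later with the clinical features.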
input_data = {
    'MDVP:Fo(Hz)': 119.992,
    'MDVP:Fhi(Hz)': 157.302,
    'MDVP:Flo(Hz)': 74.997,
    'MDVP:Jitter(%)': 0.00784,
    'MDVP:Jitter(Abs)': 0.00007,
    'MDVP:RAP': 0.00370,
    'MDVP:PPQ': 0.00554,
    'Jitter:DDP': 0.01109,
    'MDVP:Shimmer': 0.04374,
    'MDVP:Shimmer(dB)': 0.426,
    'Shimmer:APQ3': 0.02182,
    'Shimmer:APQ5': 0.03130,
    'MDVP:APQ': 0.02971,
    'Shimmer:DDA': 0.06545,
    'NHR': 0.02211,
    'HNR': 21.033,
    'RPDE': 0.414783,
    'DFA': 0.815285,
    'spread1': -4.813031,
    'spread2': 0.266482,
    'D2': 2.301442,
    'PPE': 0.284654
}


In [13]:
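# Read the processed training data to recover the feature names
df = pd.read_csv("../data/processed/parkinsons_clean.csv")

# Drop the target and identifier columns (only those actually present)
drop_cols = ['Diagnosis', 'PatientID', 'DoctorInCharge']
features = df.drop(columns=[col for col in drop_cols if col in df.columns]).columns.tolist()
print("Model expects features:\n", features)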


Model expects features:
 ['Age', 'Gender', 'Ethnicity', 'EducationLevel', 'BMI', 'Smoking', 'AlcoholConsumption', 'PhysicalActivity', 'DietQuality', 'SleepQuality', 'FamilyHistoryParkinsons', 'TraumaticBrainInjury', 'Hypertension', 'Diabetes', 'Depression', 'Stroke', 'SystolicBP', 'DiastolicBP', 'CholesterolTotal', 'CholesterolLDL', 'CholesterolHDL', 'CholesterolTriglycerides', 'UPDRS', 'MoCA', 'FunctionalAssessment', 'Tremor', 'Rigidity', 'Bradykinesia', 'PosturalInstability', 'SpeechProblems', 'SleepDisorders', 'Constipation']
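
The printed list makes it possible to check an input dictionary before it reaches the scaler. Below is a small sketch of such a check. `check_features` is a hypothetical helper, and the shortened `expected` list and sample dictionaries are for illustration only; in the notebook you would pass `features` and `input_data` instead.

```python
def check_features(input_data, expected):
    """Return (missing, unexpected) feature names relative to the expected list."""
    missing = [name for name in expected if name not in input_data]
    unexpected = [name for name in input_data if name not in expected]
    return missing, unexpected


# Hypothetical shortened inputs, for illustration only
expected = ["Age", "Gender", "BMI"]
voice_input = {"MDVP:Fo(Hz)": 119.992, "HNR": 21.033}
clinical_input = {"Age": 65, "Gender": 1, "BMI": 24.5}

print(check_features(voice_input, expected))
print(check_features(clinical_input, expected))
```

A voice-feature dictionary reports every expected name as missing, while a matching clinical dictionary returns two empty lists.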


In [14]:
input_data = {
    'Age': 65,
    'Gender': 1,  # 1 = Male, 0 = Female
    'Ethnicity': 2,  # e.g., encoded as 0, 1, 2
    'EducationLevel': 3,  # e.g., 0: None, 1: Primary, ..., 3: Graduate
    'BMI': 24.5,
    'Smoking': 0,
    'AlcoholConsumption': 1,
    'PhysicalActivity': 2,  # e.g., 0: None, 1: Light, 2: Moderate, 3: Intense
    'DietQuality': 3,  # e.g., 1 to 5
    'SleepQuality': 2,
    'FamilyHistoryParkinsons': 1,
    'TraumaticBrainInjury': 0,
    'Hypertension': 1,
    'Diabetes': 0,
    'Depression': 1,
    'Stroke': 0,
    'SystolicBP': 130,
    'DiastolicBP': 85,
    'CholesterolTotal': 190,
    'CholesterolLDL': 110,
    'CholesterolHDL': 50,
    'CholesterolTriglycerides': 130,
    'UPDRS': 20.5,
    'MoCA': 23,
    'FunctionalAssessment': 2,
    'Tremor': 1,
    'Rigidity': 1,
    'Bradykinesia': 1,
    'PosturalInstability': 0,
    'SpeechProblems': 1,
    'SleepDisorders': 0,
    'Constipation': 0
}


In [15]:
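# Build a one-row DataFrame with columns in the order of `features`
# (assumed to match the order used when the scaler was fitted)
input_df = pd.DataFrame([input_data])[features]
scaled_input = scaler.transform(input_df)

# prediction[0][0] is the single score for the single input row
prediction = model.predict(scaled_input)

# Scores above 0.5 are reported as a positive prediction
if prediction[0][0] > 0.5:
    print("Parkinson's Disease Detected.")
else:
    print("No Parkinson's Disease.")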


1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 370ms/step
Parkinson's Disease Detected.
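
The cell above prints only the label. As a sketch, the same 0.5 cut-off can be wrapped in a function that also returns the raw score, which shows how close a prediction is to the threshold. `interpret_score` is a hypothetical helper, and it assumes the model output is a nested `[[score]]` value, as the `prediction[0][0]` indexing suggests. The scores below are made up and are not outputs of this notebook's model.

```python
def interpret_score(prediction, threshold=0.5):
    """Turn a nested [[score]] prediction into a (score, label) pair."""
    score = float(prediction[0][0])
    if score > threshold:
        label = "Parkinson's Disease Detected."
    else:
        label = "No Parkinson's Disease."
    return score, label


# Made-up scores for illustration; not outputs of the notebook's model
print(interpret_score([[0.91]]))
print(interpret_score([[0.50]]))
print(interpret_score([[0.62]], threshold=0.7))
```

Because the comparison is strict, a score of exactly 0.5 is reported as negative, matching the `if` statement in the cell above.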
