
# Detecting Parkinson’s Disease Using XGBoost

Parkinson’s disease is a progressive neurological disorder that primarily affects movement. It occurs when dopamine-producing nerve cells (neurons) in a part of the brain called the substantia nigra begin to deteriorate or die. Dopamine is a chemical messenger that helps coordinate smooth, controlled muscle movements, so falling dopamine levels produce the characteristic symptoms of Parkinson’s.

### Key Features and Symptoms:
1. **Motor Symptoms:**
   - **Tremors:** Shaking, often starting in the hands or fingers, especially at rest.
   - **Bradykinesia:** Slowness of movement, making simple tasks difficult and time-consuming.
   - **Muscle Rigidity:** Stiffness in the limbs or torso, which can limit range of motion and cause discomfort.
   - **Postural Instability:** Impaired balance and coordination, often appearing in later stages, increasing the risk of falls.

2. **Non-Motor Symptoms:**
   - Sleep disturbances
   - Depression or anxiety
   - Cognitive decline (e.g., memory issues or dementia in advanced cases)
   - Loss of sense of smell
   - Digestive issues, such as constipation

### Causes:
The exact cause isn’t fully understood, but it’s believed to involve a combination of:
- **Genetic Factors:** Certain gene mutations (e.g., LRRK2, SNCA) increase risk, though these account for a small percentage of cases.
- **Environmental Factors:** Exposure to toxins (like pesticides) or head injuries may contribute.
- **Aging:** Risk increases with age, with most cases diagnosed after 60.

### Diagnosis:
No single test definitively diagnoses Parkinson’s. Doctors rely on medical history, symptom evaluation, and neurological exams. Imaging (such as MRI or DaTscan) may help rule out other conditions.

### Treatment:
While there’s no cure, treatments aim to manage symptoms:
- **Medications:** Levodopa (converted to dopamine in the brain), dopamine agonists, and MAO-B inhibitors.
- **Therapies:** Physical therapy, occupational therapy, or speech therapy.
- **Surgery:** Deep brain stimulation (DBS) in advanced cases.
- **Lifestyle:** Exercise and diet can improve quality of life.

### Progression:
Parkinson’s progresses differently for everyone. Early stages may involve mild symptoms, while later stages can lead to significant disability, though many people live with it for decades with proper management.

## Objective:
This notebook aims to detect Parkinson’s disease early by training an XGBoost classifier on voice-measurement features from the UCI Parkinsons dataset, emphasizing the potential of timely diagnosis to improve patient outcomes.

In [2]:
!pip install ucimlrepo

Collecting ucimlrepo
  Downloading ucimlrepo-0.0.7-py3-none-any.whl.metadata (5.5 kB)
Downloading ucimlrepo-0.0.7-py3-none-any.whl (8.0 kB)
Installing collected packages: ucimlrepo
Successfully installed ucimlrepo-0.0.7


In [3]:
from ucimlrepo import fetch_ucirepo
parkinsons = fetch_ucirepo(id=174)
X = parkinsons.data.features  # Features
y = parkinsons.data.targets   # Status (0 or 1)

In [4]:
import os
import sys

import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from xgboost import XGBClassifier
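
`MinMaxScaler` is imported above but is not used in the cells shown here. As a minimal sketch of how it could be applied, the example below fits the scaler on the training rows only and reuses the learned minimum and maximum on the test rows. The data is synthetic, and the names `X_demo`, `y_demo`, and the `(-1, 1)` range are assumptions chosen for illustration, not taken from this notebook. Tree-based models such as XGBoost are generally insensitive to this kind of monotonic rescaling, so the step is optional here.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for a feature matrix (hypothetical data, not the UCI set)
rng = np.random.default_rng(0)
X_demo = rng.normal(loc=100.0, scale=25.0, size=(50, 3))
y_demo = rng.integers(0, 2, size=50)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

# The (-1, 1) range is an illustrative choice; the default is (0, 1)
scaler = MinMaxScaler(feature_range=(-1, 1))
X_tr_scaled = scaler.fit_transform(X_tr)  # learn per-column min/max from training rows only
X_te_scaled = scaler.transform(X_te)      # reuse the training min/max on the test rows

print("Train min per column:", X_tr_scaled.min(axis=0))
print("Train max per column:", X_tr_scaled.max(axis=0))
```

Fitting on the training split alone keeps information about the test rows out of the preprocessing step; test values can therefore fall slightly outside the target range.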

In [6]:
print("First 5 rows of features (X):")
print(X.head())
print("\nFirst 5 rows of target (y):")
print(y.head())
print("\nNumber of rows and columns in X:", X.shape)
print("Number of rows in y:", y.shape)

First 5 rows of features (X):
   MDVP:Fo  MDVP:Fhi  MDVP:Flo  MDVP:Jitter  MDVP:Jitter  MDVP:RAP  MDVP:PPQ  \
0  119.992   157.302    74.997      0.00784      0.00784   0.00370   0.00554   
1  122.400   148.650   113.819      0.00968      0.00968   0.00465   0.00696   
2  116.682   131.111   111.555      0.01050      0.01050   0.00544   0.00781   
3  116.676   137.871   111.366      0.00997      0.00997   0.00502   0.00698   
4  116.014   141.781   110.655      0.01284      0.01284   0.00655   0.00908   

   Jitter:DDP  MDVP:Shimmer  MDVP:Shimmer  ...  MDVP:APQ  Shimmer:DDA  \
0     0.01109       0.04374       0.04374  ...   0.02971      0.06545   
1     0.01394       0.06134       0.06134  ...   0.04368      0.09403   
2     0.01633       0.05233       0.05233  ...   0.03590      0.08270   
3     0.01505       0.05492       0.05492  ...   0.03772      0.08771   
4     0.01966       0.06425       0.06425  ...   0.04465      0.10470   

       NHR     HNR      RPDE       DFA   spread1  
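
The printed frame shows the labels `MDVP:Jitter` and `MDVP:Shimmer` twice each, with identical values in the repeated columns. The sketch below shows one possible way to detect repeated column labels and keep only the first occurrence. It uses a small toy frame built from the first two printed rows rather than the real `X`, and the names `df`, `dup_mask`, and `df_unique` are hypothetical.

```python
import pandas as pd

# Toy frame mimicking the repeated label (values copied from the first two printed rows)
df = pd.DataFrame(
    [[119.992, 0.00784, 0.00784, 0.00370],
     [122.400, 0.00968, 0.00968, 0.00465]],
    columns=["MDVP:Fo", "MDVP:Jitter", "MDVP:Jitter", "MDVP:RAP"],
)

# duplicated() marks the second and later occurrences of each label
dup_mask = df.columns.duplicated()
print("Repeated labels:", df.columns[dup_mask].tolist())

# Keep the first occurrence of every label
df_unique = df.loc[:, ~dup_mask]
print("Remaining columns:", df_unique.columns.tolist())
```

Whether dropping the repeats is appropriate depends on whether they are true duplicates; comparing the values first, as the printed rows allow here, is a reasonable check.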

In [19]:
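# Explore the data
# Note: a notebook cell displays only its last expression, so the null counts
# on the next line are computed but not shown; wrap the call in print() to see them.
X.isnull().sum()
X.describe()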

|  | MDVP:Fo | MDVP:Fhi | MDVP:Flo | MDVP:Jitter | MDVP:Jitter.1 | MDVP:RAP | MDVP:PPQ | Jitter:DDP | MDVP:Shimmer | MDVP:Shimmer.1 | ... | MDVP:APQ | Shimmer:DDA | NHR | HNR | RPDE | DFA | spread1 | spread2 | D2 | PPE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 195.0 | 195.0 | 195.0 | 195.0 | 195.0 | 195.0 | 195.0 | 195.0 | 195.0 | 195.0 | ... | 195.0 | 195.0 | 195.0 | 195.0 | 195.0 | 195.0 | 195.0 | 195.0 | 195.0 | 195.0 |
| mean | 154.228641 | 197.104918 | 116.324631 | 0.00622 | 0.00622 | 0.003306 | 0.003446 | 0.00992 | 0.029709 | 0.029709 | ... | 0.024081 | 0.046993 | 0.024847 | 21.885974 | 0.498536 | 0.718099 | -5.684397 | 0.22651 | 2.381826 | 0.206552 |
| std | 41.390065 | 91.491548 | 43.521413 | 0.004848 | 0.004848 | 0.002968 | 0.002759 | 0.008903 | 0.018857 | 0.018857 | ... | 0.016947 | 0.030459 | 0.040418 | 4.425764 | 0.103942 | 0.055336 | 1.090208 | 0.083406 | 0.382799 | 0.090119 |
| min | 88.333 | 102.145 | 65.476 | 0.00168 | 0.00168 | 0.00068 | 0.00092 | 0.00204 | 0.00954 | 0.00954 | ... | 0.00719 | 0.01364 | 0.00065 | 8.441 | 0.25657 | 0.574282 | -7.964984 | 0.006274 | 1.423287 | 0.044539 |
| 25% | 117.572 | 134.8625 | 84.291 | 0.00346 | 0.00346 | 0.00166 | 0.00186 | 0.004985 | 0.016505 | 0.016505 | ... | 0.01308 | 0.024735 | 0.005925 | 19.198 | 0.421306 | 0.674758 | -6.450096 | 0.174351 | 2.099125 | 0.137451 |
| 50% | 148.79 | 175.829 | 104.315 | 0.00494 | 0.00494 | 0.0025 | 0.00269 | 0.00749 | 0.02297 | 0.02297 | ... | 0.01826 | 0.03836 | 0.01166 | 22.085 | 0.495954 | 0.722254 | -5.720868 | 0.218885 | 2.361532 | 0.194052 |
| 75% | 182.769 | 224.2055 | 140.0185 | 0.007365 | 0.007365 | 0.003835 | 0.003955 | 0.011505 | 0.037885 | 0.037885 | ... | 0.0294 | 0.060795 | 0.02564 | 25.0755 | 0.587562 | 0.761881 | -5.046192 | 0.279234 | 2.636456 | 0.25298 |
| max | 260.105 | 592.03 | 239.17 | 0.03316 | 0.03316 | 0.02144 | 0.01958 | 0.06433 | 0.11908 | 0.11908 | ... | 0.13778 | 0.16942 | 0.31482 | 33.047 | 0.685151 | 0.825288 | -2.434031 | 0.450493 | 3.671155 | 0.527367 |


In [20]:
# Split the data into training and testing data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
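
With a small dataset, a purely random split can leave the two classes in different proportions in the training and test sets. One common option, shown here only as a sketch, is to pass `stratify=` to `train_test_split` so both splits keep the overall class ratio. The data below is synthetic, with a 75/25 class ratio chosen for illustration; it does not reflect the actual class balance of this dataset, and the `*_demo` and `Xs_*`/`ys_*` names are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in data (hypothetical; 150 positives, 50 negatives)
rng = np.random.default_rng(42)
X_demo = pd.DataFrame(rng.normal(size=(200, 4)), columns=list("abcd"))
y_demo = pd.Series([1] * 150 + [0] * 50, name="status")

# stratify=y_demo preserves the class ratio in both splits
Xs_train, Xs_test, ys_train, ys_test = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

print("Train class shares:")
print(ys_train.value_counts(normalize=True))
print("Test class shares:")
print(ys_test.value_counts(normalize=True))
```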

In [21]:
# Check the sizes of our splits
print("Size of X_train:", X_train.shape)
print("Size of X_test:", X_test.shape)
print("Size of y_train:", y_train.shape)
print("Size of y_test:", y_test.shape)

Size of X_train: (156, 22)
Size of X_test: (39, 22)
Size of y_train: (156, 1)
Size of y_test: (39, 1)


In [23]:
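# Train the model
model = XGBClassifier()
# ravel() flattens the (n, 1) target column into a 1-D label array,
# the usual shape for a single-target classifier
model.fit(X_train.values, y_train.values.ravel())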