# What is Parkinson's Disease? 
![](http://www.parkinson.org/sites/default/files/styles/1902x600/public/images/gettingdiagnosed.jpg?h=83615a45&itok=wkmPjJFp)
**Parkinson's disease (PD)** is a movement disorder of the nervous system that gets worse over time. As nerve cells (neurons) in parts of the brain weaken, are damaged, or die, people may begin to notice problems with movement, tremor, stiffness in the limbs or the trunk of the body, or impaired balance. As symptoms progress, people may have difficulty walking, talking, or completing other simple tasks. Not everyone with one or more of these symptoms has PD, as the symptoms appear in other diseases as well.

# Objective
To build a model to accurately detect the presence of Parkinson’s disease in an individual.

# About the Notebook
For the dataset, we'll use the **UC Irvine Parkinson's Dataset**, which is composed of a range of biomedical voice measurements from 31 people, 23 of whom have Parkinson's disease (PD). Each column in the table is a particular voice measure, and each row corresponds to one of 195 voice recordings from these individuals.

For classification, we'll use the **XGBClassifier** because of its excellent scalability and its ability to handle large datasets efficiently.

In [1]:
# Importing the necessary libraries
import numpy as np
import pandas as pd
import os, sys
from sklearn.preprocessing import MinMaxScaler
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

In [3]:
# Reading the data
df = pd.read_csv('/kaggle/input/uci-ml-parkinsons-dataset/parkinsons.data')
df.head()

| | name | MDVP:Fo(Hz) | MDVP:Fhi(Hz) | MDVP:Flo(Hz) | MDVP:Jitter(%) | MDVP:Jitter(Abs) | MDVP:RAP | MDVP:PPQ | Jitter:DDP | MDVP:Shimmer | ... | Shimmer:DDA | NHR | HNR | status | RPDE | DFA | spread1 | spread2 | D2 | PPE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | phon_R01_S01_1 | 119.992 | 157.302 | 74.997 | 0.00784 | 7e-05 | 0.0037 | 0.00554 | 0.01109 | 0.04374 | ... | 0.06545 | 0.02211 | 21.033 | 1 | 0.414783 | 0.815285 | -4.813031 | 0.266482 | 2.301442 | 0.284654 |
| 1 | phon_R01_S01_2 | 122.4 | 148.65 | 113.819 | 0.00968 | 8e-05 | 0.00465 | 0.00696 | 0.01394 | 0.06134 | ... | 0.09403 | 0.01929 | 19.085 | 1 | 0.458359 | 0.819521 | -4.075192 | 0.33559 | 2.486855 | 0.368674 |
| 2 | phon_R01_S01_3 | 116.682 | 131.111 | 111.555 | 0.0105 | 9e-05 | 0.00544 | 0.00781 | 0.01633 | 0.05233 | ... | 0.0827 | 0.01309 | 20.651 | 1 | 0.429895 | 0.825288 | -4.443179 | 0.311173 | 2.342259 | 0.332634 |
| 3 | phon_R01_S01_4 | 116.676 | 137.871 | 111.366 | 0.00997 | 9e-05 | 0.00502 | 0.00698 | 0.01505 | 0.05492 | ... | 0.08771 | 0.01353 | 20.644 | 1 | 0.434969 | 0.819235 | -4.117501 | 0.334147 | 2.405554 | 0.368975 |
| 4 | phon_R01_S01_5 | 116.014 | 141.781 | 110.655 | 0.01284 | 0.00011 | 0.00655 | 0.00908 | 0.01966 | 0.06425 | ... | 0.1047 | 0.01767 | 19.649 | 1 | 0.417356 | 0.823484 | -3.747787 | 0.234513 | 2.33218 | 0.410335 |


In [4]:
# Getting the features (every column except 'name' and 'status') and the labels ('status')
features = df.drop(columns=['name', 'status']).values
labels = df['status'].values

In [5]:
# Getting the count of each label (0 and 1) in labels
print(labels[labels==1].shape[0], labels[labels==0].shape[0])

147 48
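
The two classes are imbalanced, so it helps to know what accuracy a trivial classifier would reach before judging a model. A minimal sketch, using only the counts printed above (the variable names are illustrative):

```python
# Counts printed by the cell above: 147 recordings with status 1, 48 with status 0
n_status_1, n_status_0 = 147, 48
total = n_status_1 + n_status_0

# A classifier that always predicts the majority class (status 1) is correct
# on every majority-class recording and wrong on all the others.
baseline_accuracy = n_status_1 / total
print(f"Majority-class baseline accuracy: {baseline_accuracy:.1%}")
```

Any model trained on this data should be compared against that baseline of roughly 75%, not against 50%.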


In [6]:
# Scaling the features to between -1 and 1
scaler=MinMaxScaler((-1,1))
x=scaler.fit_transform(features)
y=labels
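
`MinMaxScaler((-1,1))` maps each feature linearly so that its minimum lands on -1 and its maximum on 1. The pure-Python sketch below spells out that mapping for a single feature, using the five `MDVP:Fo(Hz)` values shown in the `df.head()` output; the helper names `fit_min_max` and `scale` are hypothetical. It also illustrates a common variant in which the minimum and maximum are learned from the training rows only, so that held-out values do not influence the scaling. The cell above fits the scaler on all rows instead.

```python
def fit_min_max(values):
    """Return the (min, max) learned from a list of feature values."""
    return min(values), max(values)

def scale(value, data_range, feature_range=(-1.0, 1.0)):
    """Linearly map value from [data_min, data_max] onto feature_range."""
    data_min, data_max = data_range
    out_min, out_max = feature_range
    return (value - data_min) / (data_max - data_min) * (out_max - out_min) + out_min

# MDVP:Fo(Hz) values of the first five recordings (from df.head())
fo = [119.992, 122.4, 116.682, 116.676, 116.014]

# Hypothetical split for illustration: first four rows train, last row held out
train, held_out = fo[:4], fo[4:]

params = fit_min_max(train)                       # learned from training rows only
scaled_train = [scale(v, params) for v in train]
scaled_held_out = [scale(v, params) for v in held_out]

print(scaled_train)      # training values span exactly [-1, 1]
print(scaled_held_out)   # a held-out value can fall outside [-1, 1]
```

With scikit-learn, the same idea is expressed by calling `fit_transform` on the training features and `transform` on the test features.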

In [7]:
# Splitting the dataset into a training set and a testing set
x_train,x_test,y_train,y_test=train_test_split(x, y, test_size=0.2, random_state=7)
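
The dataset holds several recordings per person (195 recordings from 31 people), and a purely random split can place recordings from the same person in both sets. One way to rule that out is to split by person instead of by recording. The sketch below assumes that the `name` column encodes the speaker as everything before the final underscore (for example `phon_R01_S01` in `phon_R01_S01_1`). That reading of the naming scheme, and the helper names `subject_id` and `split_by_subject`, are assumptions and are not established by this notebook.

```python
import random

def subject_id(name):
    """Assumed convention: 'phon_R01_S01_3' -> 'phon_R01_S01'."""
    return name.rsplit("_", 1)[0]

def split_by_subject(names, test_fraction=0.2, seed=7):
    """Return (train_idx, test_idx) such that no subject appears in both."""
    subjects = sorted({subject_id(n) for n in names})
    rng = random.Random(seed)
    rng.shuffle(subjects)
    n_test = max(1, round(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])
    train_idx = [i for i, n in enumerate(names) if subject_id(n) not in test_subjects]
    test_idx = [i for i, n in enumerate(names) if subject_id(n) in test_subjects]
    return train_idx, test_idx

# Toy names following the same pattern as the `name` column (made up, not real rows)
names = [f"phon_R01_S{s:02d}_{r}" for s in range(1, 6) for r in range(1, 4)]
train_idx, test_idx = split_by_subject(names)

# Check that the two sets share no subject
train_subjects = {subject_id(names[i]) for i in train_idx}
test_subjects = {subject_id(names[i]) for i in test_idx}
print(len(train_idx), len(test_idx), train_subjects & test_subjects)
```

On the real data, the index lists could be obtained with `split_by_subject(df['name'].tolist())` and used to index `x` and `y`. scikit-learn's `GroupShuffleSplit` offers the same kind of group-aware split.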

In [8]:
# Training the model
model=XGBClassifier()
model.fit(x_train,y_train)

In [9]:
# Calculating the accuracy
y_pred=model.predict(x_test)
print(accuracy_score(y_test, y_pred)*100)

94.87179487179486
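
With `test_size=0.2`, the test set holds 39 of the 195 recordings, so this score corresponds to 37 correct predictions. Since roughly three quarters of the recordings have status 1, accuracy alone does not show how the errors are distributed between the two classes. A minimal pure-Python sketch of a per-class breakdown follows; the labels in it are made up for illustration and are not the notebook's predictions, and `confusion_counts` is a hypothetical helper.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

# Made-up labels for illustration only
y_true_demo = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
y_pred_demo = [1, 1, 1, 1, 1, 0, 0, 0, 1, 1]

tp, tn, fp, fn = confusion_counts(y_true_demo, y_pred_demo)
accuracy = (tp + tn) / len(y_true_demo)
sensitivity = tp / (tp + fn)   # share of status-1 recordings predicted as 1
specificity = tn / (tn + fp)   # share of status-0 recordings predicted as 0
print(accuracy, sensitivity, specificity)
```

The same helper can be applied to the notebook's arrays with `confusion_counts(list(y_test), list(y_pred))`. scikit-learn's `confusion_matrix` and `classification_report` in `sklearn.metrics` report equivalent information.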


# Summary
In this notebook we trained a model to detect the presence of Parkinson’s disease in individuals from biomedical voice measurements. We used an XGBClassifier for classification and the sklearn library to scale and split the dataset. The model reached an accuracy of 94.87% on the held-out test set (20% of the recordings).