## Building Classifiers for the Diabetes Dataset Using AdaBoost and XGBoost

The objective is to predict whether or not a patient has diabetes, based on the diagnostic measurements included in the dataset. All patients here are females at least 21 years old of Pima Indian heritage.

**Independent variables**
	
1. Pregnancies : number of times pregnant
2. Glucose : plasma glucose concentration 
3. BloodPressure : Diastolic blood pressure (mm Hg)
4. SkinThickness : Triceps skin fold thickness (mm)
5. Insulin : 2-Hour serum insulin (mu U/ml)
6. BMI : Body mass index
7. DiabetesPedigreeFunction : Diabetes pedigree function
8. Age : age in years

**Target variable**

Outcome : categorical variable (0 or 1) 




In [1]:
import pandas
from sklearn import model_selection
from sklearn.ensemble import AdaBoostClassifier

In [2]:
dataframe = pandas.read_csv('pima-indians-diabetes.csv')
array = dataframe.values
X = array[:,0:8]
Y = array[:,8]
seed = 7
num_trees = 30

In [4]:
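# 10-fold CV without shuffling; random_state is omitted because recent scikit-learn
# versions raise an error when it is set while shuffle=False
kfold = model_selection.KFold(n_splits=10)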
model = AdaBoostClassifier(n_estimators=num_trees, random_state=seed)
results = model_selection.cross_val_score(model, X, Y, cv=kfold)
print(results.mean())

0.760457963089542
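
To make the 10-fold cross-validation step less opaque, here is a minimal sketch of roughly what `cross_val_score` does for each fold: clone the model, fit it on nine folds, and score it on the held-out fold. The data is synthetic (`make_classification` stands in for the diabetes CSV), and the names `X_demo`, `y_demo`, `manual_scores` and `library_scores` are illustrative assumptions, not part of the original notebook.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in data (assumption): 8 features, binary target
X_demo, y_demo = make_classification(n_samples=200, n_features=8, random_state=7)

model = AdaBoostClassifier(n_estimators=30, random_state=7)
kfold = KFold(n_splits=10)

# Manual loop: train on 9 folds, measure accuracy on the remaining fold
manual_scores = []
for train_idx, test_idx in kfold.split(X_demo):
    fold_model = clone(model)  # fresh, unfitted copy for every fold
    fold_model.fit(X_demo[train_idx], y_demo[train_idx])
    manual_scores.append(fold_model.score(X_demo[test_idx], y_demo[test_idx]))
manual_scores = np.array(manual_scores)

# The library helper should give the same per-fold accuracies
library_scores = cross_val_score(model, X_demo, y_demo, cv=kfold)
print(manual_scores.mean(), library_scores.mean())
```

The printed mean is the average accuracy over the ten held-out folds, which is the kind of number reported in the cell output above.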


In [5]:
from xgboost import XGBClassifier

# Reuse X, Y, seed and num_trees from the cells above
kfold = model_selection.KFold(n_splits=10)
model = XGBClassifier(n_estimators=num_trees, random_state=seed)
results = model_selection.cross_val_score(model, X, Y, cv=kfold)
print(results.mean())

0.768215994531784


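Only two classifiers are compared here; XGBoost's mean accuracy (about 0.768) is slightly higher than AdaBoost's (about 0.760).

You can try other classifiers and compare their results.
Evaluate each one with the same 10-fold cross-validation so the scores are comparable.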
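
One possible way to run such a comparison is sketched below. It is an illustration under assumptions, not the notebook's own code: the data is synthetic (`make_classification` with 8 features replaces the diabetes CSV), the choice of candidate classifiers is arbitrary, and scikit-learn's `GradientBoostingClassifier` is used in place of XGBoost so that the example depends only on scikit-learn. The names `X_demo`, `y_demo`, `candidates` and `scores` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in data (assumption): 8 features, binary target
X_demo, y_demo = make_classification(
    n_samples=768, n_features=8, n_informative=5, random_state=7
)

seed = 7
# The same folds are used for every model so the scores are comparable
kfold = KFold(n_splits=10, shuffle=True, random_state=seed)

candidates = {
    "AdaBoost": AdaBoostClassifier(n_estimators=30, random_state=seed),
    "GradientBoosting": GradientBoostingClassifier(n_estimators=30, random_state=seed),
    "RandomForest": RandomForestClassifier(n_estimators=30, random_state=seed),
    "LogisticRegression": LogisticRegression(max_iter=1000),
}

scores = {}
for name, estimator in candidates.items():
    fold_scores = cross_val_score(estimator, X_demo, y_demo, cv=kfold)
    scores[name] = (fold_scores.mean(), fold_scores.std())
    print(f"{name}: {fold_scores.mean():.3f} +/- {fold_scores.std():.3f}")
```

Reporting the standard deviation next to the mean helps judge whether a small gap between two models, such as the one between AdaBoost and XGBoost above, is larger than the fold-to-fold variation.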