**ENSEMBLE TECHNIQUE**

Ensemble methods, what are they? Ensemble methods are a machine learning technique that combines several base models to produce one stronger predictive model. To understand this definition, recall the ultimate goal of machine learning and model building: a model that predicts well on data it has not seen.

The most popular ensemble methods are boosting, bagging, and stacking. Ensemble methods apply to both regression and classification, where they can improve accuracy by reducing bias, variance, or both: bagging mainly reduces variance, while boosting mainly reduces bias.
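
As a rough illustration of the three families named above, the sketch below fits one bagging, one boosting, and one stacking classifier from scikit-learn. The synthetic data, the choice of base models, and all parameter values are assumptions made for this example; they are not part of the notebook's analysis.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption; not the WBCD data used in this notebook).
X_demo, y_demo = make_classification(n_samples=300, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=0)

models = {
    # Bagging: independent trees trained on bootstrap samples, predictions voted.
    "bagging": BaggingClassifier(n_estimators=25, random_state=0),
    # Boosting: weak learners trained in sequence, each focusing on earlier errors.
    "boosting": AdaBoostClassifier(n_estimators=25, random_state=0),
    # Stacking: a final model learns how to combine different base models.
    "stacking": StackingClassifier(
        estimators=[
            ("tree", DecisionTreeClassifier(max_depth=3, random_state=0)),
            ("nb", GaussianNB()),
        ],
        final_estimator=LogisticRegression(),
    ),
}

# Test-set accuracy of each ensemble.
scores = {name: model.fit(X_tr, y_tr).score(X_te, y_te) for name, model in models.items()}
print(scores)
```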

AdaBoost. AdaBoost is an ensemble machine learning algorithm for classification problems. It belongs to a group of ensemble methods called boosting, which add new models in sequence so that each subsequent model attempts to fix the prediction errors made by the prior models.

**PROBLEM STATEMENT**

Apply AdaBoost and Extreme Gradient Boosting (XGBoost) to the Wisconsin breast cancer (WBCD) dataset.

In [1]:
#Importing the required packages and libraries
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
import xgboost as xgb

In [2]:
#loading the dataset and exploring the columns
data = pd.read_csv(r"C:\Users\D\Desktop\New Assignments  Keys\Datasets\wbcd.csv")

data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 569 entries, 0 to 568
Data columns (total 32 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   id                 569 non-null    int64  
 1   diagnosis          569 non-null    object 
 2   radius_mean        569 non-null    float64
 3   texture_mean       569 non-null    float64
 4   perimeter_mean     569 non-null    float64
 5   area_mean          569 non-null    float64
 6   smoothness_mean    569 non-null    float64
 7   compactness_mean   569 non-null    float64
 8   concavity_mean     569 non-null    float64
 9   points_mean        569 non-null    float64
 10  symmetry_mean      569 non-null    float64
 11  dimension_mean     569 non-null    float64
 12  radius_se          569 non-null    float64
 13  texture_se         569 non-null    float64
 14  perimeter_se       569 non-null    float64
 15  area_se            569 non-null    float64
 16  smoothness_se      569 non

**DATA UNDERSTANDING**

Features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell nuclei present in the image.

Attribute Information:

1) ID number
2) Diagnosis (M = malignant, B = benign)
3-32) Ten real-valued features are computed for each cell nucleus:

a) radius (mean of distances from center to points on the perimeter)

b) texture (standard deviation of gray-scale values)

c) perimeter

d) area

e) smoothness (local variation in radius lengths)

f) compactness (perimeter^2 / area - 1.0)

g) concavity (severity of concave portions of the contour)

h) concave points (number of concave portions of the contour)

i) symmetry

j) fractal dimension ("coastline approximation" - 1)

The mean, standard error and "worst" or largest (mean of the three
largest values) of these features were computed for each image,
resulting in 30 features. For instance, field 3 is Mean Radius, field
13 is Radius SE, field 23 is Worst Radius.

All feature values are recoded with four significant digits.

Missing attribute values: none

Class distribution: 357 benign, 212 malignant



**DATA PREPROCESSING**

There are no null values in the dataset.

All the independent variable columns have numerical values.

Only the target variable has categorical values.

So we converted the target (output column) to numbers using label encoding.

In [3]:
# Use LabelEncoder to convert the categorical diagnosis column to integers (B -> 0, M -> 1)
lb = LabelEncoder()
data["diagnosis"] = lb.fit_transform(data["diagnosis"])
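
To check which integer each class receives, the fitted encoder can be inspected. The sketch below uses a short made-up list of labels rather than the real column; `LabelEncoder` sorts the classes, so `B` maps to 0 and `M` maps to 1.

```python
from sklearn.preprocessing import LabelEncoder

# Made-up sample of diagnosis labels (illustrative only).
labels = ["M", "B", "B", "M"]

enc = LabelEncoder()
codes = enc.fit_transform(labels)

# classes_ holds the original labels in sorted order; the index is the encoded value.
mapping = {str(label): int(code) for code, label in enumerate(enc.classes_)}
print(mapping)         # {'B': 0, 'M': 1}
print(codes.tolist())  # [1, 0, 0, 1]
```

With this mapping, row and column 0 of the confusion matrices further down refer to benign cases, and row and column 1 to malignant cases.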

**AdaBoost**   
AdaBoost, or Adaptive Boosting, was one of the first practical boosting ensemble models. The method adapts to the data based on the performance observed in the current iteration: both the weights used to re-weight the training samples and the weights used for the final aggregation of the learners are recomputed at every iteration.

In practice, this boosting technique is used with simple classification trees or stumps as base learners, which typically improves performance compared with a single tree or another single base learner.
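
The re-weighting described above can be sketched for a single boosting round. This is the textbook (discrete) AdaBoost update with labels in {-1, +1}, written as a minimal illustration; the function name `adaboost_round` and the four-sample example are assumptions, and scikit-learn's `AdaBoostClassifier` may differ in its details.

```python
import numpy as np

def adaboost_round(weights, y_true, y_pred):
    """One round of textbook AdaBoost; assumes a weighted error strictly between 0 and 1.

    Returns the learner's vote weight (alpha) and the updated sample weights.
    """
    miss = y_true != y_pred
    err = np.sum(weights[miss])              # weighted error of this weak learner
    alpha = 0.5 * np.log((1.0 - err) / err)  # larger alpha for a more accurate learner
    # Increase the weights of misclassified samples, decrease the rest, then renormalise.
    new_weights = weights * np.exp(np.where(miss, alpha, -alpha))
    return alpha, new_weights / new_weights.sum()

y_true = np.array([1, 1, -1, -1])
y_pred = np.array([1, 1, -1, 1])   # the weak learner gets the last sample wrong
weights = np.full(4, 0.25)         # start from uniform weights

alpha, weights = adaboost_round(weights, y_true, y_pred)
print(round(float(alpha), 4))      # 0.5493
print(weights.round(4).tolist())   # [0.1667, 0.1667, 0.1667, 0.5]
```

After the update, the single misclassified sample carries half of the total weight, so the next weak learner is pushed to classify it correctly.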

**Gradient Boosting**
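Gradient Boosting is a robust machine learning algorithm that combines gradient descent with boosting. The word ‘gradient’ refers to the gradient of the loss function with respect to the current model's predictions: each new weak learner is fitted to the negative gradient, which for squared error is simply the residuals. Gradient Boosting has three main components: an additive model, a loss function, and a weak learner.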

The technique yields a direct interpretation of boosting methods from the perspective of numerical optimisation in a function space and generalises them by allowing optimisation of an arbitrary loss function.
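
A minimal sketch of this idea for squared-error loss is shown below: each round fits a depth-1 regression tree to the current residuals (the negative gradient) and adds a scaled copy of it to the model. The synthetic data, the helper names, and the parameter values are assumptions for illustration; XGBoost adds regularisation and second-order information on top of this basic scheme.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gradient_boost_fit(X, y, n_rounds=50, learning_rate=0.1):
    """Fit an additive model of regression stumps under squared-error loss."""
    base = float(np.mean(y))             # initial constant prediction
    pred = np.full(len(y), base)
    trees = []
    for _ in range(n_rounds):
        residuals = y - pred             # negative gradient of 0.5 * (y - F)^2 w.r.t. F
        tree = DecisionTreeRegressor(max_depth=1, random_state=0)
        tree.fit(X, residuals)           # weak learner approximates the negative gradient
        pred = pred + learning_rate * tree.predict(X)
        trees.append(tree)
    return base, trees

def gradient_boost_predict(X, base, trees, learning_rate=0.1):
    """Sum the constant start value and the scaled contribution of every tree."""
    pred = np.full(X.shape[0], base)
    for tree in trees:
        pred = pred + learning_rate * tree.predict(X)
    return pred

# Synthetic regression data (an assumption for this sketch).
rng = np.random.default_rng(0)
X_demo = np.linspace(0, 6, 80).reshape(-1, 1)
y_demo = np.sin(X_demo[:, 0]) + rng.normal(scale=0.1, size=80)

base, trees = gradient_boost_fit(X_demo, y_demo)
mse_start = np.mean((y_demo - base) ** 2)
mse_end = np.mean((y_demo - gradient_boost_predict(X_demo, base, trees)) ** 2)
print(mse_start, mse_end)  # the training error shrinks as trees are added
```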



**Model Building**

Split the data into predictors and target.

Then split the data into train and test sets.

Applied the AdaBoost and XGBoost classification models.

Created the confusion matrix on the test set.

Calculated the accuracy score on the test set and on the training set.

In [4]:
#Splitting the dataset to predictors(Independent variables) and Target(Output)
predictors = data.loc[:, data.columns!="diagnosis"]
type(predictors)

pandas.core.frame.DataFrame

In [5]:
target = data["diagnosis"]
type(target)

pandas.core.series.Series
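
Note that the selection in cell `In [4]` excludes only `diagnosis`, so the `id` column stays among the predictors, and the results in this notebook were produced that way. An identifier normally carries no diagnostic information, so one possible variant is to drop it as well. The sketch below shows this on a tiny made-up frame; the values and the names `demo`, `demo_predictors`, and `demo_target` are illustrative assumptions.

```python
import pandas as pd

# Tiny made-up frame with the same leading columns as the dataset (values are illustrative only).
demo = pd.DataFrame({
    "id": [101, 102, 103],
    "diagnosis": [1, 0, 0],
    "radius_mean": [17.9, 12.4, 11.8],
    "texture_mean": [10.4, 15.7, 18.2],
})

# Drop both the target and the identifier so only measurements remain as features.
demo_predictors = demo.drop(columns=["id", "diagnosis"])
demo_target = demo["diagnosis"]
print(list(demo_predictors.columns))  # ['radius_mean', 'texture_mean']
```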

In [6]:
#splitting the predictors and Target into train and test dataset
x_train, x_test, y_train, y_test = train_test_split(predictors, target, test_size = 0.2, random_state=0)
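
Because the classes are imbalanced (357 benign, 212 malignant), a stratified split is a possible variant that keeps the class ratio the same in both subsets. The sketch below is not what the notebook does; it uses placeholder features and stand-in labels with the stated class counts to show the `stratify` argument of `train_test_split`.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in labels with the class distribution listed above: 357 benign (0), 212 malignant (1).
y_demo = np.array([0] * 357 + [1] * 212)
X_demo = np.arange(len(y_demo)).reshape(-1, 1)  # placeholder feature column

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0, stratify=y_demo
)

print(len(y_tr), len(y_te))                     # 455 114
print(y_demo.mean(), y_tr.mean(), y_te.mean())  # malignant share stays nearly identical
```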

In [7]:
#Applying Adaboosting classification Technique
# ada_clf = AdaBoostClassifier(learning_rate = 0.02, n_estimators = 3500)
ada_clf = AdaBoostClassifier(learning_rate = 0.001, n_estimators = 3500)

In [8]:
# Fit (train) the model
ada_clf.fit(x_train, y_train)

AdaBoostClassifier(learning_rate=0.001, n_estimators=3500)

In [9]:
#Creating the confusion matrix on the test set (rows: true class, columns: predicted class)
confusion_matrix(y_test, ada_clf.predict(x_test))

array([[77,  0],
       [ 4, 33]], dtype=int64)

In [10]:
#calculating the accuracy on the test set
accuracy_score(y_test, ada_clf.predict(x_test))

0.9649122807017544

In [11]:
#calculating the accuracy on the training set
accuracy_score(y_train, ada_clf.predict(x_train))

0.9758241758241758
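
The test-set confusion matrix above can be unpacked into the usual metrics by hand. In scikit-learn's layout the rows are true classes and the columns are predicted classes, and with label encoding class 0 is benign and class 1 is malignant. The sketch below only redoes the arithmetic on the matrix already reported; the variable names are illustrative.

```python
import numpy as np

# Confusion matrix reported above for the test set.
cm = np.array([[77, 0],
               [4, 33]])

# For a binary problem, ravel() yields: true negatives, false positives,
# false negatives, true positives (malignant = positive class).
tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()   # share of all test cases classified correctly
recall = tp / (tp + fn)           # share of malignant cases that were detected
precision = tp / (tp + fp)        # share of malignant predictions that were right

print(round(float(accuracy), 4))   # 0.9649
print(round(float(recall), 4))     # 0.8919
print(round(float(precision), 4))  # 1.0
```

The four misclassified cases are all malignant tumours predicted as benign, which accuracy alone does not reveal.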

In [12]:
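#Applying the XGBoost classification technique
# NOTE: `max_depths` is a misspelling of `max_depth`; XGBoost does not use it (see the warning
# below), so the default max_depth=6 applies. The spelling is kept because the outputs below
# were produced with it.
xgb_clf = xgb.XGBClassifier(max_depths = 5, n_estimators = 10000, learning_rate = 0.3, n_jobs = -1)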

In [13]:
#Fit (train) the model
xgb_clf.fit(x_train, y_train)



Parameters: { "max_depths" } might not be used.

  This could be a false alarm, with some parameters getting used by language bindings but
  then being mistakenly passed down to XGBoost core, or some parameter actually being used
  but getting flagged wrongly here. Please open an issue if you find any such cases.




XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
              colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
              gamma=0, gpu_id=-1, importance_type=None,
              interaction_constraints='', learning_rate=0.3, max_delta_step=0,
              max_depth=6, max_depths=5, min_child_weight=1, missing=nan,
              monotone_constraints='()', n_estimators=10000, n_jobs=-1,
              num_parallel_tree=1, predictor='auto', random_state=0,
              reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
              tree_method='exact', validate_parameters=1, verbosity=None)

In [14]:
#creating the confusion matrix on the test set
confusion_matrix(y_test, xgb_clf.predict(x_test))

array([[77,  0],
       [ 4, 33]], dtype=int64)

In [15]:
#calculating the accuracy score on the test set
accuracy_score(y_test, xgb_clf.predict(x_test))

0.9649122807017544

In [16]:
#calculating the accuracy score on the training set
accuracy_score(y_train,xgb_clf.predict(x_train))

1.0

**Results**

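Both models reach the same test accuracy (0.9649) and produce the same confusion matrix on the test set: all 77 benign cases are classified correctly, and 4 of the 37 malignant cases are misclassified as benign.

On the training set, XGBoost reaches an accuracy of 1.0 against 0.9758 for AdaBoost, which suggests that XGBoost fits the training data more closely without improving the test result.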


## Hyperparameter Tuning

### AdaBoost

In [17]:
#Applying Adaboosting classification Technique
# ada1 = AdaBoostClassifier(learning_rate = 0.02, n_estimators = 3500)
ada1 = AdaBoostClassifier(learning_rate = 0.001, n_estimators = 3500)

In [18]:
# Fit (train) the model
ada1.fit(x_train, y_train)

AdaBoostClassifier(learning_rate=0.001, n_estimators=3500)

In [19]:
#Creating the confusion matrix on the test set
confusion_matrix(y_test, ada1.predict(x_test))

array([[77,  0],
       [ 4, 33]], dtype=int64)

In [20]:
#calculating the accuracy on the test set
accuracy_score(y_test, ada1.predict(x_test))

0.9649122807017544

In [21]:
#calculating the accuracy on the training set
accuracy_score(y_train, ada1.predict(x_train))

0.9758241758241758
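
The AdaBoost run above uses one fixed parameter setting. A systematic search over several settings could look like the sketch below, which uses scikit-learn's `GridSearchCV` with cross-validation. Two assumptions are made for the sake of a self-contained example: scikit-learn's bundled breast cancer data stands in for `wbcd.csv` (its columns and label coding differ from the CSV), and the small parameter grid is illustrative rather than recommended.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Assumption: scikit-learn's bundled breast cancer data stands in for wbcd.csv.
X_bc, y_bc = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X_bc, y_bc, test_size=0.2, random_state=0)

# A deliberately small, illustrative grid; the values are not recommendations.
param_grid = {"n_estimators": [50, 100], "learning_rate": [0.1, 1.0]}

search = GridSearchCV(
    AdaBoostClassifier(random_state=0),
    param_grid,
    cv=3,                 # 3-fold cross-validation on the training data only
    scoring="accuracy",
)
search.fit(X_tr, y_tr)

print(search.best_params_)
print(search.best_score_)        # mean cross-validated accuracy of the best setting
print(search.score(X_te, y_te))  # accuracy of the refitted best model on held-out data
```

Selecting parameters by cross-validation on the training data keeps the test set untouched until the final evaluation.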

### XGBoost

In [22]:
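#Applying the XGBoost classification technique with gamma and min_child_weight changed
# NOTE: `max_depths` is a misspelling of `max_depth`; XGBoost does not use it (see the warning
# below), so the default max_depth=6 applies. The spelling is kept because the outputs below
# were produced with it.
xgb1 = xgb.XGBClassifier(max_depths = 5, n_estimators = 10000, learning_rate = 0.3, n_jobs = -1,gamma = 5,min_child_weight=0)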

In [23]:
#Fit (train) the model
xgb1.fit(x_train, y_train)



Parameters: { "max_depths" } might not be used.

  This could be a false alarm, with some parameters getting used by language bindings but
  then being mistakenly passed down to XGBoost core, or some parameter actually being used
  but getting flagged wrongly here. Please open an issue if you find any such cases.




XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
              colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
              gamma=5, gpu_id=-1, importance_type=None,
              interaction_constraints='', learning_rate=0.3, max_delta_step=0,
              max_depth=6, max_depths=5, min_child_weight=0, missing=nan,
              monotone_constraints='()', n_estimators=10000, n_jobs=-1,
              num_parallel_tree=1, predictor='auto', random_state=0,
              reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
              tree_method='exact', validate_parameters=1, verbosity=None)

In [24]:
#creating the confusion matrix on the test set
confusion_matrix(y_test, xgb1.predict(x_test))

array([[76,  1],
       [ 5, 32]], dtype=int64)

In [25]:
#calculating the accuracy score on the test set
accuracy_score(y_test, xgb1.predict(x_test))

0.9473684210526315

In [26]:
#calculating the accuracy score on the training set
accuracy_score(y_train,xgb1.predict(x_train))

0.9868131868131869