## Train Auto-ML Model with TPOT

### Set Up the Environment
You need to install the following Python packages:

sudo pip install tpot <br />
sudo pip install scikit-learn  <br />
sudo pip install numpy
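
To confirm that the packages above are visible to the Python interpreter you are using, a quick check along these lines may help. This is a minimal sketch that relies only on the standard library; the helper name `installed_versions` is made up for illustration and is not part of any of the packages.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Report the three packages this notebook depends on
for name, ver in installed_versions(["tpot", "scikit-learn", "numpy"]).items():
    print(f"{name}: {ver or 'not installed'}")
```

If a package is reported as not installed, it was probably installed for a different interpreter than the one running the notebook.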

### Load packages

In [None]:
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
import joblib  # sklearn.externals.joblib was removed in scikit-learn 0.23
from tpot import TPOTClassifier

### Load Dataset
The Breast Cancer Wisconsin (Diagnostic) dataset from the UCI Machine Learning Repository is used for demonstration. The dataset is described at the following URL: https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic) .

In [None]:
breast_cancer_dataset = load_breast_cancer()
print(breast_cancer_dataset.data[0:2,])
print(breast_cancer_dataset.target[0:2,])
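
Printing raw rows says little about what the columns mean. The sketch below inspects the metadata that scikit-learn ships with the dataset; the variable name `dataset` is chosen here only to keep the example self-contained.

```python
from sklearn.datasets import load_breast_cancer

dataset = load_breast_cancer()

# 569 samples, each described by 30 numeric features
print(dataset.data.shape)
# Class labels: index 0 is 'malignant', index 1 is 'benign'
print(list(dataset.target_names))
# Names of the first three feature columns
print(list(dataset.feature_names[:3]))
```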

### Split dataset into training and testing
80% of the data is used to train the model and the remaining 20% is held out for model evaluation.

In [None]:
X_train, X_test, y_train, y_test = train_test_split(breast_cancer_dataset.data, 
                                                    breast_cancer_dataset.target, 
                                                    test_size=0.20) 
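
The split above is random, so the scores later in this notebook change from run to run. One possible variant, shown as a sketch, fixes the shuffle with `random_state` and keeps the class ratio similar in both subsets with `stratify`. The seed value 42 is an arbitrary assumption, and the example reloads the data so that it runs on its own.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

data = load_breast_cancer()

# random_state fixes the shuffle; stratify keeps the class ratio
# similar in the training and testing subsets.
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.20, random_state=42, stratify=data.target
)

print(X_train.shape, X_test.shape)
print("benign share (train):", round(y_train.mean(), 3))
print("benign share (test): ", round(y_test.mean(), 3))
```

Note that running this cell overwrites `X_train`, `X_test`, `y_train`, and `y_test` from the cell above.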

### Train a Classical Machine Learning Model
Train a Random Forest model as a baseline, so that you can check whether the Auto-ML model performs better.

In [None]:
from sklearn.ensemble import RandomForestClassifier
rf = RandomForestClassifier(n_estimators=10, random_state=0)
rf.fit(X_train, y_train)
mean_accuracy = rf.score(X_test, y_test)
mean_accuracy
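
A single 20% test split contains only about a hundred samples, so the baseline accuracy can shift noticeably between splits. As a rough sketch, cross-validation gives a steadier estimate to compare against. It is run here on the full dataset purely to keep the example self-contained; the name `baseline` and the choice of 5 folds are assumptions of this sketch.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

data = load_breast_cancer()
baseline = RandomForestClassifier(n_estimators=10, random_state=0)

# 5-fold cross-validation: each fold serves once as the held-out set
scores = cross_val_score(baseline, data.data, data.target, cv=5)
print("fold accuracies:", scores.round(3))
print("mean +/- std: %.3f +/- %.3f" % (scores.mean(), scores.std()))
```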

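### Train an Auto-ML Model
TPOT searches for a good pipeline over 10 generations, with a population of 20 candidate pipelines per generation.

In [None]: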
clf = TPOTClassifier(generations=10, population_size=20, verbosity=2)
fit = clf.fit(X_train, y_train)

In [None]:
print("Test Score: ", clf.score(X_test, y_test))

### Save model
Save your trained model for future use and to deploy in production environments.

In [None]:
import os
os.makedirs('outputs', exist_ok=True)  # create the target directory if it is missing
filename = 'outputs/breast_cancer_model.pkl'
joblib.dump(clf.fitted_pipeline_, filename)  # save the best fitted pipeline found by TPOT
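
To use a saved model later, it has to be loaded back with `joblib.load`. The sketch below shows the round trip. Because a TPOT search takes a while, it uses a small scikit-learn pipeline as a stand-in for `clf.fitted_pipeline_` (an assumption of this example, not the pipeline TPOT would find) and writes to a temporary directory instead of `outputs/`.

```python
import os
import tempfile

import joblib
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()

# Stand-in for clf.fitted_pipeline_ (assumption made for this sketch)
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
pipeline.fit(data.data, data.target)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "breast_cancer_model.pkl")
    joblib.dump(pipeline, path)
    restored = joblib.load(path)  # reload the model as a later session would

# The restored pipeline should reproduce the original predictions
same = bool((restored.predict(data.data) == pipeline.predict(data.data)).all())
print("predictions match:", same)
```

Pickled models are generally safest to load with the same library versions that saved them, and only from sources you trust.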

### Export Script
Export the best fitted pipeline as a Python script.

In [None]:
clf.export('tpot_breast_cancer_pipeline.py')

## Next
Work through the [examples](https://epistasislab.github.io/tpot/examples/) in the TPOT documentation yourself, and experiment with different parameters and configurations.
