# Numerical Classification on Clean IRIS Dataset using TPOT

|Specification|Value|
|----|----|
|AutoML Algorithm|TPOT|
|Task|Numerical Classification|
|Dataset|IRIS Dataset|
|Dataset Clean|Yes|

## Load Packages

In [1]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
random_state = 42

In [2]:
from tpot import TPOTClassifier
from tpot.export_utils import set_param_recursive

## Load Dataset

In [3]:
iris_df = pd.read_csv("../datasets/clean/iris.csv")


In [4]:
# Features are every column except 'target'; reuse the seed declared in the first cell
X_train, X_test, y_train, y_test = train_test_split(iris_df.loc[:, iris_df.columns != 'target'], 
                                                    iris_df.loc[:, 'target'], 
                                                    train_size=0.75, test_size=0.25, random_state=random_state)
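
The split above is not stratified, so the class proportions in the 25% test set depend on the shuffle. The sketch below shows one way to check the class balance of a stratified split. It is only an illustration: it assumes scikit-learn's bundled `load_iris` data is an acceptable stand-in for `../datasets/clean/iris.csv`, and the `stratify=y` argument is an addition that the original cell does not use.

```python
from collections import Counter

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Assumption: the bundled iris data stands in for ../datasets/clean/iris.csv
X, y = load_iris(return_X_y=True)

# Same proportions and seed as the cell above, plus stratification (an addition)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, train_size=0.75, test_size=0.25, random_state=42, stratify=y
)

# Count how many samples of each class land in each part of the split
train_counts = Counter(y_tr.tolist())
test_counts = Counter(y_te.tolist())
print("train:", dict(train_counts))
print("test: ", dict(test_counts))
```

With stratification, the per-class counts in each part differ by at most one sample, which makes a score on a small test set a little less sensitive to the particular shuffle.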

## Train the Model

In [5]:
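# Evolve 5 generations of 50 candidate pipelines; reuse the seed declared in the first cell
tpot = TPOTClassifier(generations=5, population_size=50, verbosity=2, random_state=random_state)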
tpot.fit(X_train, y_train)
print(tpot.score(X_test, y_test))

Generation 1 - Current best internal CV score: 1.0

Generation 2 - Current best internal CV score: 1.0

Generation 3 - Current best internal CV score: 1.0

Generation 4 - Current best internal CV score: 1.0

Generation 5 - Current best internal CV score: 1.0

Best pipeline: KNeighborsClassifier(input_matrix, n_neighbors=3, p=1, weights=distance)
1.0


In [6]:
# Uncomment to write the best pipeline found by TPOT to a standalone Python script
# tpot.export('best_model_pipeline.py')

In [8]:
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import Normalizer

In [9]:
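# Average CV score on the training set was: 0.9826086956521738
# Note: this pipeline adds a Normalizer step that the "Best pipeline" printed above does not show
exported_pipeline = make_pipeline(
    Normalizer(norm="l2"),
    KNeighborsClassifier(n_neighbors=3, p=1, weights="distance")
)
# Fix random state for all the steps in exported pipeline
# (neither Normalizer nor KNeighborsClassifier takes a random_state, so this should have no effect here)
set_param_recursive(exported_pipeline.steps, 'random_state', random_state)

# Refit on the training split and predict labels for the held-out test split
exported_pipeline.fit(X_train, y_train)
results = exported_pipeline.predict(X_test)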
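
A single accuracy figure hides which classes are confused with each other. The sketch below rebuilds the same two-step pipeline and prints a confusion matrix for its test predictions. It is a hedged illustration, not the notebook's own cell: it assumes scikit-learn's bundled `load_iris` data in place of the CSV, and the names `pipeline`, `predictions`, `cm`, and `n_wrong` are hypothetical.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import Normalizer

# Assumption: the bundled iris data stands in for ../datasets/clean/iris.csv
X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, train_size=0.75, test_size=0.25, random_state=42
)

# Same steps and hyperparameters as the exported pipeline
pipeline = make_pipeline(
    Normalizer(norm="l2"),
    KNeighborsClassifier(n_neighbors=3, p=1, weights="distance"),
)
pipeline.fit(X_tr, y_tr)
predictions = pipeline.predict(X_te)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_te, predictions, labels=[0, 1, 2])
n_wrong = int((predictions != y_te).sum())
print(cm)
print("misclassified test samples:", n_wrong)
```

Off-diagonal entries of the matrix are the misclassified samples, so their sum equals `n_wrong`.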


In [10]:
exported_pipeline.score(X_test, y_test)

0.9736842105263158
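
If the CSV holds the usual 150 iris rows, the test split has 38 samples, and a score of about 0.9737 corresponds to 37 of 38 correct (37/38 ≈ 0.9737), so the gap to the 1.0 reported by `tpot.score` amounts to a single sample. One difference between the two models is the `Normalizer` step, which appears in the exported pipeline but not in the printed best pipeline. The sketch below is one way to compare the two variants with cross-validation instead of a single small test set. It assumes scikit-learn's bundled `load_iris` data as a stand-in for the CSV; the 5-fold `StratifiedKFold` setup and the names `candidates` and `mean_scores` are illustrative choices, not taken from the notebook.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import Normalizer

# Assumption: the bundled iris data stands in for ../datasets/clean/iris.csv
X, y = load_iris(return_X_y=True)

# Hyperparameters reported by TPOT for the k-nearest-neighbours step
knn_params = dict(n_neighbors=3, p=1, weights="distance")

candidates = {
    "knn_only": KNeighborsClassifier(**knn_params),
    "normalizer_plus_knn": make_pipeline(
        Normalizer(norm="l2"), KNeighborsClassifier(**knn_params)
    ),
}

# Shuffled, stratified 5-fold CV with a fixed seed so both variants see the same folds
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
mean_scores = {
    name: cross_val_score(model, X, y, cv=cv).mean()
    for name, model in candidates.items()
}

for name, score in mean_scores.items():
    print(f"{name}: mean CV accuracy = {score:.4f}")
```

Averaging over folds uses every sample for evaluation once, so the comparison is less dominated by one or two borderline flowers than a 38-sample test split.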