# Auto-Keras (0.4)
This notebook contains experiments with Auto-Keras. It analyzes the quality and performance of the generated models and compares them to handcrafted ones.

Main questions:
*   What does the structure of the generated networks look like?
*   How does their quality compare to handmade nets?

"Auto-Keras is an open source software library for automated machine learning (AutoML). It is developed by DATA Lab at Texas A&M University and community contributors. The ultimate goal of AutoML is to provide easily accessible deep learning tools to domain experts with limited data science or machine learning background. Auto-Keras provides functions to automatically search for architecture and hyperparameters of deep learning models." - *autokeras.com*

## Packages and Imports
The following section contains the packages that need to be installed and imported.

Note: In Google Colab you have to **restart the runtime** after installing autokeras.

In [None]:
!pip install autokeras==0.4.0 # Version 0.4

In [None]:
!python -V # Should output 3.6.* to work
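
The same version check can be done from Python instead of the shell. The following is a minimal sketch; the helper name `is_supported_python` is hypothetical, and the only assumption taken from this notebook is that Python 3.6.x is required.

```python
import sys

def is_supported_python(version_info=sys.version_info):
    """Return True for Python 3.6.x, the version this notebook targets."""
    # Only major and minor version matter; the patch level may vary.
    return tuple(version_info[:2]) == (3, 6)

print("Python", sys.version.split()[0], "supported:", is_supported_python())
```

Calling `is_supported_python((3, 7, 0))` returns `False`, which makes the check easy to use as a guard at the top of the notebook.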

In [None]:
%tensorflow_version 1.x  # Colab magic; it only takes effect before TensorFlow is first imported
import autokeras as ak
from autokeras.image.image_supervised import ImageClassifier
from autokeras.utils import pickle_from_file
import keras
from keras.models import Sequential
from keras.layers import Dense, Conv2D, Dropout, Flatten, MaxPooling2D
import torch
import numpy as np
from sklearn.model_selection import train_test_split
from google.colab import files

## Load data set
In the following sections you can load the data set you are going to use to test Auto-Keras. The training data is contained in `x_train` and `y_train`, while the test data is in `x_test` and `y_test`.
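
The image loaders below add a trailing channel axis to grayscale arrays. As a sketch of that step, the hypothetical helper `ensure_channel_axis` (not part of Auto-Keras or Keras) adds the axis only when the input has shape `(N, H, W)` and leaves arrays that already carry a channel axis untouched. The zero-filled arrays are placeholders, not real data.

```python
import numpy as np

def ensure_channel_axis(x):
    """Return x with a trailing channel axis, adding one only for (N, H, W) input."""
    x = np.asarray(x)
    if x.ndim == 3:
        # Grayscale images without a channel axis: (N, H, W) -> (N, H, W, 1)
        return x[..., np.newaxis]
    # Already (N, H, W, C), or not image-shaped: leave unchanged
    return x

gray = np.zeros((5, 28, 28), dtype=np.uint8)      # placeholder grayscale batch
color = np.zeros((5, 32, 32, 3), dtype=np.uint8)  # placeholder RGB batch
print(ensure_channel_axis(gray).shape)   # (5, 28, 28, 1)
print(ensure_channel_axis(color).shape)  # (5, 32, 32, 3)
```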

### [MNIST](http://yann.lecun.com/exdb/mnist/)

In [None]:
from keras.datasets import mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(x_train.shape + (1,))
x_test = x_test.reshape(x_test.shape + (1,))
data_set_name = "mnist"

### [Fashion-MNIST](https://research.zalando.com/welcome/mission/research-projects/fashion-mnist/)

In [None]:
from keras.datasets import fashion_mnist
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
x_train = x_train.reshape(x_train.shape + (1,))
x_test = x_test.reshape(x_test.shape + (1,))
data_set_name = "fashion_mnist"

### [CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html)

In [None]:
from keras.datasets import cifar10
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
# CIFAR-10 images already have shape (N, 32, 32, 3), so no channel axis is added
data_set_name = "cifar_10"

### [MHSMA](https://github.com/soroushj/mhsma-dataset)

In [None]:
!git clone https://github.com/soroushj/mhsma-dataset

In [None]:
# 64x64-pixel version of x
x_train = np.load("mhsma-dataset/mhsma/x_64_train.npy")
x_test = np.load("mhsma-dataset/mhsma/x_64_test.npy")
x_train = x_train.reshape(x_train.shape + (1,))
x_test = x_test.reshape(x_test.shape + (1,))

# Different y labels
y_acrosome_train = np.load("mhsma-dataset/mhsma/y_acrosome_train.npy")
y_acrosome_test = np.load("mhsma-dataset/mhsma/y_acrosome_test.npy")
y_head_train = np.load("mhsma-dataset/mhsma/y_head_train.npy")
y_head_test = np.load("mhsma-dataset/mhsma/y_head_test.npy")
y_vacuole_train = np.load("mhsma-dataset/mhsma/y_vacuole_train.npy")
y_vacuole_test = np.load("mhsma-dataset/mhsma/y_vacuole_test.npy")

data_set_name = "mhsma"

In [None]:
# Adapt for other y
y_train = y_vacuole_train
y_test = y_vacuole_test
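
Before switching between the label sets, it can help to look at how the classes are distributed, since a skewed distribution affects how an accuracy score should be read. This is a minimal sketch with made-up labels; the helper name `class_counts` is hypothetical.

```python
import numpy as np

def class_counts(y):
    """Map each distinct integer label in y to its number of occurrences."""
    labels, counts = np.unique(np.asarray(y).ravel(), return_counts=True)
    return {int(label): int(count) for label, count in zip(labels, counts)}

y_example = np.array([0, 0, 1, 0, 1, 0])  # made-up labels, not MHSMA data
print(class_counts(y_example))  # {0: 4, 1: 2}
```

In the notebook, `class_counts(y_train)` and `class_counts(y_test)` would show the distribution of the currently selected label set.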

### [Breast Cancer](https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic))
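Beware: This data set consists of 30 numeric features per sample, not images, so the `Conv2D` comparison model further below does not accept it as is.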

In [None]:
from sklearn.datasets import load_breast_cancer
X, y = load_breast_cancer(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.15)
x_train = x_train.reshape(x_train.shape + (1,))
x_test = x_test.reshape(x_test.shape + (1,))
data_set_name = "breast_cancer"
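
Because these features are not pixel intensities, dividing them by 255 as in the normalization cell later in this notebook is not meaningful for this data set. One common alternative, shown here as a sketch and not as what the original experiments did, is per-feature standardization with statistics taken from the training split only. The helper name `standardize` and the random example data are assumptions for illustration.

```python
import numpy as np

def standardize(train, test, eps=1e-8):
    """Scale features to zero mean and unit variance using training statistics only."""
    train = np.asarray(train, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    mean = train.mean(axis=0, keepdims=True)  # per-feature mean over samples
    std = train.std(axis=0, keepdims=True)    # per-feature standard deviation
    # eps avoids division by zero for constant features
    return (train - mean) / (std + eps), (test - mean) / (std + eps)

rng = np.random.default_rng(0)
train_example = rng.normal(loc=50.0, scale=10.0, size=(100, 30))  # made-up features
test_example = rng.normal(loc=50.0, scale=10.0, size=(20, 30))
train_scaled, test_scaled = standardize(train_example, test_example)
print(train_scaled.shape, test_scaled.shape)
```

Since the statistics are computed along axis 0, the function also works on arrays that carry the extra trailing axis added above.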

## Create Auto-Keras model
Because `ImageClassifier` is the only working classifier in version 0.4, it is used here for all data sets.

In [None]:
%%time
%tensorflow_version 1.x
clf = ImageClassifier(verbose=True, augment=True, path=None, resume=False, searcher_args=None)
clf.fit(x_train, y_train, time_limit=1 * 60 * 60) # 1 Hour
clf.final_fit(x_train, y_train, x_test, y_test, retrain=False, trainer_args={'max_no_improvement_num': 5})

### Export model
You can export the model for later use.

In [None]:
clf.export_autokeras_model(data_set_name + ".pkl")

In [None]:
# Download model
files.download(data_set_name + ".pkl")

### Load model
Note that when loading a model, the class of `clf` changes to `PortableImageSupervised`.

In [None]:
# Upload model
files.upload()

In [None]:
clf = pickle_from_file(data_set_name + ".pkl")

## Create model for comparison

In [None]:
# Normalizing data
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255.
x_test /= 255.

In [None]:
%%time
%tensorflow_version 1.x
# Creating a Sequential Model and adding the layers
model = Sequential()
model.add(Conv2D(28, kernel_size=(3,3), input_shape=x_train[0].shape))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(128, activation="relu"))
model.add(Dropout(0.2))
model.add(Dense(len(np.unique(y_train)), activation="softmax"))  # one output per class seen in training

# Train
model.compile(optimizer='adam', 
              loss='sparse_categorical_crossentropy', 
              metrics=['accuracy'])
history = model.fit(x=x_train, y=y_train, epochs=10, verbose=1)

## Visualize and Evaluate

### Auto-Keras model

In [None]:
# Evaluate
print(clf.evaluate(x_test, y_test))

In [None]:
torch_model = clf.cnn.best_model.produce_model() # For trained models

In [None]:
torch_model = clf.graph.produce_model() # For portable models that got imported
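
Only one of the two cells above applies, depending on whether `clf` was trained in this session or loaded from a file. As a sketch, the choice can be made automatically by checking which attribute is present. The helper name `get_torch_model` is hypothetical, and the check assumes that only freshly trained classifiers have a `cnn` attribute, which follows the two cells above but has not been verified against the library source. The `SimpleNamespace` objects are stand-ins that only mimic the attribute layout.

```python
from types import SimpleNamespace

def get_torch_model(clf):
    """Return the PyTorch module behind clf for a trained or an imported classifier."""
    if hasattr(clf, "cnn"):
        # Classifier trained in this session
        return clf.cnn.best_model.produce_model()
    # Portable classifier loaded from a file
    return clf.graph.produce_model()

# Stand-in objects for illustration only; real classifiers come from Auto-Keras
trained = SimpleNamespace(
    cnn=SimpleNamespace(
        best_model=SimpleNamespace(produce_model=lambda: "trained-model")
    )
)
portable = SimpleNamespace(graph=SimpleNamespace(produce_model=lambda: "portable-model"))

print(get_torch_model(trained))
print(get_torch_model(portable))
```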

In [None]:
# Print model structure
torch_model

In [None]:
# Save and download model
torch.save(torch_model, data_set_name + ".pth")
files.download(data_set_name + ".pth")

Afterwards, open the downloaded file in [Netron](https://lutzroeder.github.io/netron/) to visualize the model.

### Keras model

In [None]:
# Evaluate
score = model.evaluate(x_test, y_test)
print("Test loss", score[0])
print("Test accuracy", score[1])
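
To compare both models on the same footing, accuracy can also be computed directly from predicted class probabilities, such as those a softmax output layer produces. This is a minimal sketch with made-up numbers; the helper name `accuracy_from_probabilities` is hypothetical.

```python
import numpy as np

def accuracy_from_probabilities(probabilities, y_true):
    """Fraction of samples whose most probable class equals the true label."""
    predicted = np.argmax(probabilities, axis=1)  # most probable class per sample
    return float(np.mean(predicted == np.asarray(y_true).ravel()))

# Made-up probabilities for four samples and two classes
probs = np.array([[0.9, 0.1],
                  [0.2, 0.8],
                  [0.6, 0.4],
                  [0.3, 0.7]])
labels = np.array([0, 1, 1, 1])
print(accuracy_from_probabilities(probs, labels))  # 0.75
```

In the notebook, the probabilities would come from `model.predict(x_test)` and the labels from `y_test`.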

In [None]:
# Print model structure
model.summary()

In [None]:
# Save and download model
model.save(data_set_name + ".h5")
files.download(data_set_name + ".h5")

Afterwards, open the downloaded file in [Netron](https://lutzroeder.github.io/netron/) to visualize the model.