# AutoKeras Practice
We will practice with Auto Keras using a simple example.  
You need neither expertise nor a GPU for this practice.  
All you need to do is import autokeras for image classification.

In [9]:
from IPython.display import Image
Image(url= "https://github.com/jhfjhfj1/autokeras/blob/master/logo.png?raw=true", width=500, height=250)

Auto-Keras is an open source software library for automated machine learning (AutoML).  
The ultimate goal of AutoML is to make deep learning models easily accessible to domain experts with a limited data science or machine learning background. Auto-Keras provides functions to automatically search for the architecture and hyperparameters of deep learning models.  
http://autokeras.com/

# Citing this work
If you use Auto-Keras in a scientific publication, you are highly encouraged (though not required) to cite the following paper:

Efficient Neural Architecture Search with Network Morphism. Haifeng Jin, Qingquan Song, and Xia Hu. arXiv:1806.10282.

# Why Auto Keras rather than other AutoML tools?
### Don't spend time on hyperparameter tuning or experimenting with different layers.
Auto Keras will find them for you automatically.
### Auto Keras has no vendor or cloud platform dependencies.  
For example, if you use Google Cloud AutoML, you depend on Google Cloud.  
With Auto Keras, you can practice AutoML on your laptop, or on your GPU cluster if you have one.

# Practice
We will train an MNIST image classifier on a personal laptop.

In [1]:
from keras.datasets import mnist
from autokeras.classifier import ImageClassifier

Using TensorFlow backend.


## Load MNIST data
We will use the MNIST data that ships with the Keras datasets module.

In [2]:
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(x_train.shape + (1,))
x_test = x_test.reshape(x_test.shape + (1,))
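
The two reshape calls append a trailing channel axis, so each 28x28 grayscale image becomes a 28x28x1 array. The sketch below shows the same operation on a small zero-filled stand-in array. The name `fake_x_train` and the count of 100 images are assumptions made for illustration, so the snippet runs without downloading MNIST.

```python
import numpy as np

# Zero-filled stand-in for a batch of MNIST images (hypothetical data):
# 100 grayscale images of 28x28 pixels, with no channel axis yet.
fake_x_train = np.zeros((100, 28, 28), dtype=np.uint8)

# Append a single channel axis, the same way the cell above does.
reshaped = fake_x_train.reshape(fake_x_train.shape + (1,))

print(fake_x_train.shape)
print(reshaped.shape)
```

The pixel values are untouched; only the shape changes from `(N, 28, 28)` to `(N, 28, 28, 1)`.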

## Train


The main advantage of using Auto Keras is that you don't even need to know  
which neural network to use for your image classifier.  
Auto Keras will try multiple CNN-based neural networks with different layers and find the best one for you.

Simply running **clf = ImageClassifier()** will work. However, in order to  
1) see how training is going, and  
2) shorten the maximum number of iterations for faster training,  
I passed a few arguments in this practice.  
That said, you don't even need to know the number of iterations for training your image classifier.

In [3]:
clf = ImageClassifier(verbose=True, searcher_args={'trainer_args':{'max_iter_num':5}})

Importantly, I set a time limit of 5 hours so that this practice finishes within 5 hours.  
By default, the time limit in the current version of Auto Keras is 24 hours.

In [4]:
clf.fit(x_train, y_train, time_limit=5 * 60 * 60)

Initializing search.
Initialization finished.
Training model  0
Saving model.
Model ID: 0
Loss: tensor(5.2479)
Accuracy 96.00399999999999
Training model  1
Father ID:  0
[('to_wider_model', 1, 64)]
Saving model.
Model ID: 1
Loss: tensor(5.0132)
Accuracy 96.25600000000001
Training model  2
Father ID:  1
[('to_wider_model', 19, 64)]
Saving model.
Model ID: 2
Loss: tensor(3.0112)
Accuracy 97.64000000000001
Training model  3
Father ID:  2
[('to_wider_model', 1, 128)]
Saving model.
Model ID: 3
Loss: tensor(2.3075)
Accuracy 98.296


From the above result, you can see that Auto Keras searches for the best model by repeatedly modifying the previous CNN model (in this log, by widening layers) and keeping track of each candidate's accuracy.
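
The `to_wider_model` entries in the log suggest that each new candidate is derived from its "father" model by widening a layer, which matches the network morphism idea in the paper cited earlier. The sketch below illustrates function-preserving widening on a tiny two-layer dense network in NumPy. It is not Auto Keras code: `widen_dense` and `forward` are hypothetical helpers, and a dense layer is used instead of a convolution to keep the example short.

```python
import numpy as np

def widen_dense(w1, b1, w2, new_width, rng):
    """Widen the hidden layer between w1 (in x hidden) and w2 (hidden x out)
    to new_width units without changing the network's output.
    Hypothetical helper for illustration only."""
    old_width = w1.shape[1]
    # Choose existing hidden units to duplicate for the extra slots.
    extra = rng.integers(0, old_width, size=new_width - old_width)
    mapping = np.concatenate([np.arange(old_width), extra])
    # Each new unit copies the incoming weights and bias of the unit it maps to.
    new_w1 = w1[:, mapping]
    new_b1 = b1[mapping]
    # Split each unit's outgoing weights evenly among its copies,
    # so the copies together contribute exactly what the original did.
    counts = np.bincount(mapping, minlength=old_width)
    new_w2 = w2[mapping, :] / counts[mapping][:, None]
    return new_w1, new_b1, new_w2

def forward(x, w1, b1, w2):
    # Dense layer, ReLU, dense layer.
    return np.maximum(x @ w1 + b1, 0.0) @ w2

rng = np.random.default_rng(0)
w1 = rng.normal(size=(4, 3))
b1 = rng.normal(size=3)
w2 = rng.normal(size=(3, 2))
x = rng.normal(size=(5, 4))

nw1, nb1, nw2 = widen_dense(w1, b1, w2, new_width=6, rng=rng)
same_output = np.allclose(forward(x, w1, b1, w2), forward(x, nw1, nb1, nw2))
print(nw1.shape, nw2.shape, same_output)
```

Because the widened network starts out computing the same function as its parent, further training can begin from the parent's accuracy rather than from scratch, which is one plausible reason the accuracies in the log rise steadily from model to model.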

## Train the best model
The final_fit function chooses the best model and fits it to your data.  
In this example, I allowed 10 more iterations so that the model is trained further on the data.  
If you pass retrain=True, the weights and biases of the model are re-initialized and the architecture is trained again from scratch.

In [5]:
clf.final_fit(x_train, y_train, x_test, y_test, retrain=False, trainer_args={'max_iter_num':10})

...............................................
Epoch 1: loss 3.737567901611328, accuracy 98.43
...............................................
Epoch 2: loss 3.925536870956421, accuracy 98.34
...............................................
Epoch 3: loss 3.422757148742676, accuracy 98.53
...............................................
Epoch 4: loss 3.3036224842071533, accuracy 98.62
...............................................
Epoch 5: loss 4.0281524658203125, accuracy 98.45
...............................................
Epoch 6: loss 3.3080132007598877, accuracy 98.63
...............................................
Epoch 7: loss 3.359560966491699, accuracy 98.6
...............................................
Epoch 8: loss 3.4960057735443115, accuracy 98.59
...............................................
Epoch 9: loss 3.6699087619781494, accuracy 98.51
...............................................
Epoch 10: loss 3.0567498207092285, accuracy 98.74


## Test
Test your best model with the test dataset.

In [6]:
y = clf.evaluate(x_test, y_test)
print(y * 100)

98.58
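
Since the cell multiplies the result by 100 to obtain a percentage, `evaluate` appears to return accuracy as a fraction between 0 and 1. The sketch below shows how such a fraction can be computed by hand. The labels and predictions are made-up values for illustration, not output from the trained model.

```python
import numpy as np

# Hypothetical true labels and predictions (illustration only).
y_true = np.array([7, 2, 1, 0])
y_pred = np.array([7, 2, 1, 9])

# Accuracy is the fraction of predictions that match the true labels.
accuracy = np.mean(y_true == y_pred)

print(accuracy * 100)
```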


# Best Model Architecture Overview
Let's take a look at the architecture of the best image classifier model.

In [10]:
best_model = clf.load_searcher().load_best_model()

We can find the total number of layers with the command below.

In [19]:
best_model.n_layers

51

You can view the model architecture with the command below.

In [22]:
from torchvision import models
print(best_model.produce_model())

TorchModel(
  (0): ReLU()
  (1): Conv2d(1, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1.5, 1.5))
  (2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (3): Dropout2d(p=0.25)
  (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (5): ReLU()
  (6): Conv2d(128, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1.5, 1.5))
  (7): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (8): Dropout2d(p=0.25)
  (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (10): ReLU()
  (11): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1.5, 1.5))
  (12): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (13): Dropout2d(p=0.25)
  (14): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (15): TorchFlatten()
  (16): Linear(in_features=576, out_features=10, bias=True)
  (17): LogSoftmax()
  (18): ReLU()
  (19): C