## In Tensorflow.keras_learn(1) we learned how to compile and fit a model, and how to build one with the `keras.Model` class, `keras.Sequential`, and the Functional API

### Now it's time to learn more. First we will learn how to tune the hyperparameters of a model by using `Keras Tuner`

#### First, let's look at what hyperparameters are:

Hyperparameters are the parameters that are set before training and are not changed by it. They control both the training process and the topology of the model, and they come in two kinds:

1. Model hyperparameters: these influence the complexity of the model, such as the width and the number of hidden layers, or whether the model uses `Dropout` or `BatchNorm`.

2. Algorithm hyperparameters: these influence the speed and quality of the learning algorithm, such as the `learning rate` or the k of `knn`.

In [1]:
import tensorflow as tf
from tensorflow import keras

**Keras Tuner needs to be installed with pip**

To make sure it is only installed in the virtual environment, we install it from the command line, because I am running this notebook on a Raspberry Pi

#### The command is `pip install -U keras-tuner`; the `-U` flag upgrades the package if a newer version is available

In [4]:
import keras_tuner as kt

We download the Fashion MNIST dataset for this task; data loading will be covered later

In [10]:
(img_train, lable_train), (img_test, lable_test) = keras.datasets.fashion_mnist.load_data()

In [9]:
print(len(img_train))
print(len(lable_train))

60000
60000


In [11]:
print(len(img_test))
print(len(lable_test))

10000
10000


Now we need to normalize the data to the range `0-1`. Before that, we should check the dtype, because many models only accept float32 or float64

In [14]:
img_train[0].dtype

dtype('uint8')

In [15]:
img_test[0].dtype

dtype('uint8')

The data type is unsigned 8-bit integer, which is commonly used for images: it has no sign (plus or minus), so it covers exactly the values 0 to 255
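To see what "unsigned 8-bit" means in practice, here is a small sketch that only uses the Python standard library. The three byte values are made up for illustration and are not taken from Fashion MNIST.

```python
# Illustrative only: three hypothetical pixel bytes, not real dataset values.
raw = bytes([0, 128, 255])

# Read as unsigned 8-bit integers, the values span 0..255.
unsigned = [int.from_bytes(raw[i:i + 1], "big", signed=False) for i in range(len(raw))]

# Read as signed 8-bit integers, the same bytes would span -128..127.
signed = [int.from_bytes(raw[i:i + 1], "big", signed=True) for i in range(len(raw))]

# Dividing the unsigned values by 255.0 maps them into the range 0-1.
normalized = [value / 255.0 for value in unsigned]

print(unsigned)
print(signed)
print(normalized[0], normalized[-1])
```

Because the largest unsigned value is 255, dividing by 255.0 is enough to bring every pixel into the range 0-1.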

**Data loaded by `keras.datasets` is always a NumPy array, so we can use the `astype()` method to change its dtype. `astype()` is a NumPy array method; if the data is a TensorFlow tensor, we use `tf.cast()` to change the dtype instead**

In [16]:
import numpy as np

In [20]:
img_train = img_train.astype(np.float32) / 255.0
img_test = img_test.astype(np.float32) / 255.0

In [21]:
img_train[0].dtype

dtype('float32')

In [22]:
img_test[0].dtype

dtype('float32')

### Now we need to define the hyperparameter model

We need to define not only the model but also the search space of its hyperparameters. We call the model and this configuration together a "hyperparameter search model" (Keras Tuner calls it a hypermodel)

There are two ways to define a hyperparameter search model:

1. Write a `model_builder(hp)` function. `hp` is a `keras_tuner.HyperParameters` object, which keras_tuner passes in automatically when it calls `model_builder()`


2. Use the `HyperModel` class of the Keras Tuner API: we can subclass it (extend and override) to create the hyperparameter model.

For **computer vision**, the ready-made `HyperXception` and `HyperResNet` hypermodels can be used to find the best hyperparameters

#### The first method: create a hyperparameter model with `model_builder(hp)`

In [3]:
def model_builder(hp):
    model = keras.Sequential()
    model.add(keras.layers.Input(shape=(28, 28)))
    model.add(keras.layers.Flatten())

    # Search range for the number of units (the output size) of the hidden layer
    hp_units = hp.Int("units", min_value=32, max_value=512, step=32)
    model.add(keras.layers.Dense(units=hp_units, activation="relu"))
    model.add(keras.layers.Dense(10))

    # Search range for the learning rate of the optimizer
    hp_learning_rate = hp.Choice("learning_rate", values=[1e-2, 1e-3, 1e-4])
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=hp_learning_rate),
        # from_logits=True: the last layer has no softmax, so the model outputs logits
        # and the loss applies softmax internally to get probabilities
        loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        # Name the metric "accuracy" so that the logged key "val_accuracy" matches the tuner objective
        metrics=[keras.metrics.SparseCategoricalAccuracy(name="accuracy")]
    )

    return model

#### The cell above is method one: it defines a function called `model_builder`. A few things are worth noticing

1. For every model built with `tensorflow.keras`, we should no longer use the `input_shape` parameter; the official TensorFlow guidance is to use `keras.layers.Input()` to specify the input shape.

2. `hp` is a `keras_tuner.HyperParameters` object that keras_tuner creates automatically; it provides the methods we need for defining hyperparameters.

3. Both the model hyperparameters (the hidden layers) and the algorithm hyperparameters can be tuned by keras_tuner.
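The `from_logits=True` setting used in `model_builder` can be illustrated with a plain-Python sketch. This shows only the idea (softmax applied inside the loss) and is not how Keras implements it; the helper names and the logits are hypothetical.

```python
import math

def softmax(logits):
    """Turn raw scores (logits) into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def sparse_ce_from_probs(label, probs):
    """Cross-entropy when the model already outputs probabilities."""
    return -math.log(probs[label])

def sparse_ce_from_logits(label, logits):
    """Cross-entropy when the model outputs logits: softmax happens inside the loss."""
    return sparse_ce_from_probs(label, softmax(logits))

logits = [2.0, 1.0, 0.1]  # hypothetical output of a final Dense layer without softmax
label = 0                 # integer class label, as used by the "sparse" losses

probs = softmax(logits)
loss = sparse_ce_from_logits(label, logits)
print(probs)
print(loss)
```

If the last layer already applied softmax, the loss would have to be created with `from_logits=False`, otherwise softmax would be applied twice.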

**The useful methods of the `hp` object**

1. `hp.Int(name, min_value, max_value, step)`: defines an integer hyperparameter that ranges from min_value to max_value (inclusive), with an optional step, eg: `hp.Int("dense1", 1, 10, 2)` gives the candidate values `[1, 3, 5, 7, 9]`

2. `hp.Float(name, min_value, max_value, step, sampling)`: defines a float hyperparameter that ranges from min_value to max_value, also with an optional step. The `sampling` parameter controls how a value is selected, and there are two common choices: `"linear"` selects values uniformly, which is good for **normalization and dropout** settings, while `"log"` selects values uniformly in log space, which works well for the **learning_rate**

3. `hp.Choice(name, values)`: provides several candidate values, and exactly one of them is selected in each trial, eg: `hp.Choice("activation_fn", ["relu", "sigmoid", "softmax", "tanh"])`


There are more methods available, but for now these are enough
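The candidate values described above can be imitated in plain Python to get a feeling for the size of a search space. This is only a sketch: `int_candidates` and `sample_log_uniform` are hypothetical helpers, not part of keras_tuner, and they do not reproduce its internal sampling code.

```python
import math
import random

def int_candidates(min_value, max_value, step):
    """Values an integer range with a step can take, with max_value included."""
    return list(range(min_value, max_value + 1, step))

def sample_log_uniform(min_value, max_value, rng):
    """Draw one value uniformly in log space between min_value and max_value."""
    low, high = math.log(min_value), math.log(max_value)
    return math.exp(rng.uniform(low, high))

# Same arguments as the hp.Int example in the list above.
print(int_candidates(1, 10, 2))

# Search space of model_builder: 16 unit sizes times 3 learning rates.
units = int_candidates(32, 512, 32)
learning_rates = [1e-2, 1e-3, 1e-4]
print(len(units), len(units) * len(learning_rates))

# Log sampling spreads the draws evenly across the orders of magnitude.
rng = random.Random(0)
samples = [sample_log_uniform(1e-4, 1e-2, rng) for _ in range(5)]
print(samples)
```

Even this small model has 48 combinations, which is why a tuner that does not train every combination to the end is useful.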

#### The second method: subclass the `HyperModel` class

`HyperModel` is a base class of keras_tuner; we need to override its `build(self, hp)` method and define both the model and the hyperparameter search space in it

In [1]:
import keras_tuner as kt
from tensorflow import keras

In [8]:
class MyHyperModel(kt.HyperModel):
    
    def build(self, hp):
        inputs = keras.layers.Input(shape=(28, 28))
        x = keras.layers.Flatten()(inputs)

        # Define the model hyperparameters
        units = hp.Int("units", min_value=64, max_value=512, step=64)
        dropout = hp.Float("dropout", min_value=0.0, max_value=0.6, step=0.1)

        x = keras.layers.Dense(units=units, activation="relu")(x)
        x = keras.layers.Dropout(dropout)(x)
        outputs = keras.layers.Dense(10)(x)

        model = keras.Model(inputs, outputs)

        # Now the model has been built; define the algorithm hyperparameters
        optimizer_name = hp.Choice("optimizer", ["adam", "sgd"])
        if optimizer_name=="adam":
            learning_rate = hp.Choice("learning_rate_adam", [1e-3, 5e-4, 1e-4])
            optimizer = keras.optimizers.Adam(learning_rate=learning_rate)

        else:
            learning_rate = hp.Choice("learning_rate_sgd", [1e-2, 5e-3, 1e-3])
            momentum = hp.Choice("momentum", [0.0, 0.9])
            optimizer = keras.optimizers.SGD(learning_rate=learning_rate, momentum=momentum)

        model.compile(
            optimizer=optimizer,
            loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
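            # Name the metric "accuracy" so that the logged key "val_accuracy" matches the tuner objective
            metrics=[keras.metrics.SparseCategoricalAccuracy(name="accuracy")]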
        )

        return model

So the second method is similar to the first one. The first one is very convenient, while the second one can also tune the hyperparameters of `model.fit()`, such as batch_size or callbacks, by overriding the `fit()` method of `HyperModel`

**Both methods share the same steps: each one creates the model, defines the hyperparameters (model and algorithm), and compiles the model to prepare it for training**

### After defining the model and its hyperparameters, we need to instantiate the tuner. `keras_tuner` offers four kinds of tuner.

1. RandomSearch: as the name says, it selects hyperparameter combinations at random; good for small search spaces.

2. Hyperband: more efficient than RandomSearch, because it trains many candidates for a few epochs and only continues with the most promising ones.

3. BayesianOptimization: performs well on continuous search spaces.

4. Sklearn: lets us use keras_tuner on scikit-learn models.

We use `Hyperband` in this notebook

**To instantiate the Hyperband tuner, we must specify the hypermodel, the `objective` (the metric) we want to optimize, and the maximum number of epochs (`max_epochs`)**

The parameters mean:

1. hypermodel: the model we want to optimize; it accepts a `model_builder()` function or a `HyperModel` object.

2. objective: the metric we want to optimize; `"val_accuracy"` is the accuracy on the validation dataset.

3. max_epochs: the maximum number of epochs used to train any single model.

4. factor: the reduction factor for the number of models and the number of epochs in each bracket. With factor=3, roughly the best third of the models in a round is kept and trained further in the next round.
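The role of `factor` can be sketched with one round of successive halving, the idea that Hyperband builds on. This is a simplified illustration under stated assumptions: the function, the configuration names, and the learning curves are all hypothetical, and the real Hyperband tuner runs several such brackets with its own schedule.

```python
def successive_halving(scores_by_config, factor, start_epochs, max_epochs):
    """Keep the best 1/factor of the configurations each round and train them longer.

    scores_by_config maps a configuration name to a function: epochs -> score.
    """
    survivors = list(scores_by_config)
    epochs = start_epochs
    history = []
    while True:
        ranked = sorted(survivors, key=lambda name: scores_by_config[name](epochs), reverse=True)
        history.append((epochs, list(ranked)))
        if len(ranked) == 1 or epochs >= max_epochs:
            return ranked[0], history
        keep = max(1, len(ranked) // factor)       # keep roughly the best 1/factor
        survivors = ranked[:keep]
        epochs = min(max_epochs, epochs * factor)  # survivors get a larger budget

def make_score(quality):
    """Hypothetical learning curve: the score rises with epochs toward `quality`."""
    return lambda epochs: quality * epochs / (epochs + 1)

configs = {f"cfg{i}": make_score(0.50 + 0.05 * i) for i in range(9)}
best, history = successive_halving(configs, factor=3, start_epochs=1, max_epochs=10)
print(best)
print([(epochs, len(ranked)) for epochs, ranked in history])
```

In this sketch 9 configurations are trained for 1 epoch, the best 3 for 3 epochs, and the single best one for 9 epochs, so most of the budget goes to the promising candidates.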

In [9]:
# Create the tuner from the model_builder(hp) function
tuner_1 = kt.Hyperband(
    hypermodel=model_builder,
    objective="val_accuracy",
    max_epochs=10,
    factor=3
)

In [10]:
# Create the tuner from the kt.HyperModel subclass
tuner_2 = kt.Hyperband(
    hypermodel=MyHyperModel(),
    objective='val_accuracy',
    max_epochs=15,
    factor=3
)

#### In both `keras_tuner` and `tensorflow.keras`, the `objective` and the `metrics` can be given as `strings`. When `compile()` receives the string `accuracy`, Keras looks at the loss and at the shapes of the targets and the model output, then automatically selects the matching accuracy metric. `val_accuracy` is the same metric computed on the `validation dataset`, so it only exists when validation data is passed to training, and the objective string must match a metric name that appears in the training logs

Now we don't want the search to run for too long, so we add `Early stopping`
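The idea behind early stopping can be sketched in plain Python: training stops once the monitored value has not improved for `patience` epochs in a row. This loosely mirrors the `monitor` and `patience` settings of `keras.callbacks.EarlyStopping`, but `stop_epoch` is a hypothetical helper and the validation losses are made up.

```python
def stop_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop, or None if it never stops early."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None

# Hypothetical validation losses, one per epoch.
val_losses = [0.60, 0.45, 0.40, 0.41, 0.42, 0.43, 0.39]
print(stop_epoch(val_losses, patience=3))
print(stop_epoch(val_losses, patience=5))
```

A small patience saves time but can stop a model just before it improves again, as the last loss value in this sketch shows.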