# Neural Network

## Sequential Neural Network

### Perceptron -> Multiple Perceptrons (MLP)

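- The perceptron is one of the simplest ANN (artificial neural network) architectures; stacking layers of perceptrons gives a multilayer perceptron (MLP).

- Its training rule is similar to Stochastic Gradient Descent: the weights are updated one training instance at a time.

- Perceptrons don't output a class probability; rather, they make predictions based on a hard threshold.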

`from sklearn.linear_model import Perceptron`
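To make the hard-threshold behaviour concrete, here is a minimal from-scratch sketch of the perceptron learning rule on a toy AND dataset. It is not scikit-learn's implementation; the helper names `step`, `predict`, `train_perceptron` and the learning rate `eta` are illustrative assumptions.

```python
def step(z):
    """Hard threshold: returns a class label, not a probability."""
    return 1 if z >= 0 else 0


def predict(weights, bias, x):
    """Weighted sum of the inputs followed by the hard threshold."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return step(z)


def train_perceptron(X, y, eta=1.0, max_epochs=50):
    """Perceptron rule: update the weights after every misclassified instance."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(max_epochs):
        errors = 0
        for x, target in zip(X, y):
            error = target - predict(weights, bias, x)
            if error != 0:
                weights = [w + eta * error * xi for w, xi in zip(weights, x)]
                bias += eta * error
                errors += 1
        if errors == 0:  # every instance is classified correctly
            break
    return weights, bias


# Toy, linearly separable data: logical AND
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 0, 0, 1]

weights, bias = train_perceptron(X, y)
predictions = [predict(weights, bias, x) for x in X]
print(predictions)
```

The output of `predict` is always exactly 0 or 1, which is why a perceptron cannot report how confident it is in a prediction.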

- MLP (with two hidden layers)

    - scale the input features first

    - Sequential API
        ```
        from tensorflow import keras  # assuming tf.keras

        # classifier: 28x28 images in, 10 class probabilities out
        model = keras.models.Sequential([
            keras.layers.Flatten(input_shape = [28, 28]),
            keras.layers.Dense(300, activation = 'relu'),
            keras.layers.Dense(100, activation = 'relu'),
            keras.layers.Dense(10, activation = 'softmax')
        ])

        # regressor: one hidden layer, single output neuron with no activation
        model = keras.models.Sequential([
            keras.layers.Dense(30, activation = 'relu', input_shape = X_train.shape[1:]),
            keras.layers.Dense(1)
        ])
        ```
    - compile the model

        - loss: for sparse labels (one integer class index per instance, 0-9, classes mutually exclusive) => sparse_categorical_crossentropy; for one-hot vectors => categorical_crossentropy; for binary output => binary_crossentropy

        - optimizer: sgd => Stochastic Gradient Descent; the learning rate needs tuning

            ```
            # classification
            model.compile(loss = 'sparse_categorical_crossentropy', optimizer = 'sgd', metrics = ['accuracy'])

            # regression
            model.compile(loss = 'mean_squared_error', optimizer = 'sgd')
            ```

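    - train the model

        ```
        import pandas as pd
        import matplotlib.pyplot as plt

        history = model.fit(X_train, y_train, epochs = 30, validation_data = (X_valid, y_valid))
        history.params   # training parameters
        history.epoch    # list of epochs it went through
        history.history  # dict of loss and metrics measured at the end of each epoch

        # plot the learning curves
        pd.DataFrame(history.history).plot(figsize = (8, 5))
        plt.grid(True)
        plt.gca().set_ylim(0, 1)
        plt.show()
        ```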
    - tune

        - learning rate

        - other optimizer (also need to tune learning rate)

        - model hyperparameters (# layers; # neurons; activation function; batch size)

    - test

        `model.evaluate(X_test, y_test)`

    - some layer info
        ```
        h = model.layers[1]
        h.name  # e.g. 'dense'
        model.get_layer('dense') is h  # True
        weights, biases = h.get_weights()
        ```
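The loss choice in the compile step above depends only on how the labels are encoded, not on the model. The sketch below computes the three cross-entropies by hand for a single instance. It is an illustration in plain Python, not the Keras implementation (Keras also clips probabilities for numerical stability), and the toy probabilities and labels are made up for the example.

```python
import math


def sparse_categorical_crossentropy(class_index, probs):
    """Target is an integer class index."""
    return -math.log(probs[class_index])


def categorical_crossentropy(one_hot, probs):
    """Target is a one-hot vector."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))


def binary_crossentropy(label, prob):
    """Target is 0 or 1; prob is the predicted probability of class 1."""
    return -(label * math.log(prob) + (1 - label) * math.log(1 - prob))


probs = [0.1, 0.7, 0.2]    # softmax output for a toy 3-class problem
sparse_label = 1           # class index
one_hot_label = [0, 1, 0]  # same class, one-hot encoded

loss_sparse = sparse_categorical_crossentropy(sparse_label, probs)
loss_one_hot = categorical_crossentropy(one_hot_label, probs)
loss_binary = binary_crossentropy(1, 0.7)

# Both encodings describe the same target, so the losses agree.
print(loss_sparse, loss_one_hot, loss_binary)
```

All three values equal `-log(0.7)` here, since each is the negative log of the probability assigned to the true class.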

## Nonsequential Neural Network (Functional API)

- when a task needs both a regression output and a classification output

- when there are multiple independent tasks on the same data

- when adding auxiliary outputs

```
input_A = keras.layers.Input(shape=[5], name="wide_input")
input_B = keras.layers.Input(shape=[6], name="deep_input")
hidden1 = keras.layers.Dense(30, activation="relu")(input_B)
hidden2 = keras.layers.Dense(30, activation="relu")(hidden1)
concat = keras.layers.concatenate([input_A, hidden2])
output = keras.layers.Dense(1, name="output")(concat)
aux_output = keras.layers.Dense(1, name = 'aux_output')(hidden2)
model = keras.Model(inputs=[input_A, input_B], outputs=[output, aux_output])

# single loss, applied to every output
model.compile(loss="mse", optimizer=keras.optimizers.SGD(learning_rate=1e-3))
# with the aux_output: one loss per output, main output weighted higher
model.compile(loss=["mse", "mse"], loss_weights=[0.9, 0.1], optimizer="sgd")

X_train_A, X_train_B = X_train[:, :5], X_train[:, 2:]
X_valid_A, X_valid_B = X_valid[:, :5], X_valid[:, 2:]
X_test_A, X_test_B = X_test[:, :5], X_test[:, 2:]
X_new_A, X_new_B = X_test_A[:3], X_test_B[:3]

# single-output variant (model built with outputs=[output])
history = model.fit((X_train_A, X_train_B), y_train, epochs=20, validation_data=((X_valid_A, X_valid_B), y_valid))

mse_test = model.evaluate((X_test_A, X_test_B), y_test)
# with the aux_output: pass one set of labels per output
total_loss, main_loss, aux_loss = model.evaluate([X_test_A, X_test_B], [y_test, y_test])

y_pred = model.predict((X_new_A, X_new_B))

```
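The 0.9 / 0.1 loss weights passed to `compile` above combine the per-output losses into one total loss as a weighted sum. A plain-Python sketch of that arithmetic follows; the target and prediction values are made up for illustration, and `mse` is a hand-written helper, not a Keras function.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error over a list of targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [1.0, 2.0, 3.0]
main_pred = [1.1, 1.9, 3.2]  # hypothetical predictions of the main output
aux_pred = [0.5, 2.5, 2.0]   # hypothetical predictions of the auxiliary output

main_loss = mse(y_true, main_pred)
aux_loss = mse(y_true, aux_pred)

# Weighted sum: the main output dominates the total loss being optimized.
total_loss = 0.9 * main_loss + 0.1 * aux_loss
print(main_loss, aux_loss, total_loss)
```

Because the auxiliary loss only contributes 10% of the total, a poor auxiliary output has limited influence on training.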

## Subclassing API => Dynamic Models

```
class WideAndDeepModel(keras.Model):
    def __init__(self, units=30, activation="relu", **kwargs):
        super().__init__(**kwargs) # handles standard args (e.g., name)
        self.hidden1 = keras.layers.Dense(units, activation=activation)
        self.hidden2 = keras.layers.Dense(units, activation=activation)
        self.main_output = keras.layers.Dense(1)
        self.aux_output = keras.layers.Dense(1)

    def call(self, inputs):
        input_A, input_B = inputs
        hidden1 = self.hidden1(input_B)
        hidden2 = self.hidden2(hidden1)
        concat = keras.layers.concatenate([input_A, hidden2])
        main_output = self.main_output(concat)
        aux_output = self.aux_output(hidden2)
        return main_output, aux_output

model = WideAndDeepModel()
```

# TensorFlow

[playground](https://playground.tensorflow.org/)

# Save and Load TensorFlow

[save&load](https://www.tensorflow.org/tutorials/keras/save_and_load)

## Save & Restore

```
model = keras.models.Sequential([...])
model.compile([...])
model.fit([...])
model.save('my_keras_model.h5')  # saves architecture, weights, and optimizer state

model = keras.models.load_model('my_keras_model.h5')
```

## Callbacks

- checkpoints and early stopping
    ```
    checkpoint_cb = keras.callbacks.ModelCheckpoint('my_keras_model.h5')
    history = model.fit(X_train, y_train, epochs = 10, callbacks = [checkpoint_cb])
    ```

    - save_best_only = True -> only save the model when its performance on the validation set is the best so far (requires validation_data in fit)

    - use EarlyStopping

        ```
        early_stopping_cb = keras.callbacks.EarlyStopping(patience = 10, restore_best_weights = True)

        history = model.fit(X_train, y_train, epochs = 100, validation_data = (X_valid, y_valid), callbacks = [checkpoint_cb, early_stopping_cb])
        ```
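The bookkeeping behind `patience` and `restore_best_weights` can be sketched in plain Python. This is a simplified illustration, assuming the monitored quantity is the validation loss and that any decrease counts as an improvement; `early_stopping_epoch` is a hypothetical helper, not a Keras function, and the loss values are made up.

```python
def early_stopping_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training stops once `patience` consecutive epochs pass without a new
    best validation loss. `best_epoch` is the epoch whose weights would be
    restored when restoring the best weights.
    """
    best_loss = float("inf")
    best_epoch = -1
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Ran out of epochs before patience was exhausted.
    return len(val_losses) - 1, best_epoch


val_losses = [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.9]
stop_epoch, best_epoch = early_stopping_epoch(val_losses, patience=3)
print(stop_epoch, best_epoch)
```

With these losses, training stops at epoch index 5 (three epochs without beating 0.7) and the weights from epoch index 2 are the ones worth keeping. This is why a large `epochs` value is safe when early stopping is in place.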



# TensorBoard

```
import os
import time

root_logdir = os.path.join(os.curdir, "my_logs")

def get_run_logdir():
    run_id = time.strftime("run_%Y_%m_%d-%H_%M_%S")
    return os.path.join(root_logdir, run_id)

run_logdir = get_run_logdir() # e.g., './my_logs/run_2019_06_07-15_15_22'

# after building and compiling the model
tensorboard_cb = keras.callbacks.TensorBoard(run_logdir)
history = model.fit(X_train, y_train, epochs = 30, validation_data = (X_valid, y_valid), callbacks = [tensorboard_cb])
```