# Purpose

This notebook works through an example workflow for tracking Keras experiments with [MLflow](https://www.mlflow.org).

# Data

This example uses the California housing data, which is a simple regression task. It will be loaded from the `sklearn` data loader. I'll split off 20% as a test set, then hold out 20% of the remaining data as a validation set. Finally, I'll standardize the features using `StandardScaler`, fit on the training set only, ahead of modeling.

In [1]:
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import pandas as pd

housing = fetch_california_housing()

X_train_full, X_test, y_train_full, y_test = train_test_split(
    housing.data, housing.target, test_size=0.2
)
X_train, X_valid, y_train, y_valid = train_test_split(
    X_train_full, y_train_full, test_size=0.2
)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_valid = scaler.transform(X_valid)
X_test = scaler.transform(X_test)

pd.DataFrame(X_train, columns=housing.feature_names).describe()


| | MedInc | HouseAge | AveRooms | AveBedrms | Population | AveOccup | Latitude | Longitude |
|---|---|---|---|---|---|---|---|---|
| count | 13209.0 | 13209.0 | 13209.0 | 13209.0 | 13209.0 | 13209.0 | 13209.0 | 13209.0 |
| mean | 3.553521e-15 | 1.25874e-16 | -5.648193e-16 | -7.127482e-15 | -4.195801e-17 | 6.8639e-16 | 3.574365e-14 | -1.464878e-13 |
| std | 1.000038 | 1.000038 | 1.000038 | 1.000038 | 1.000038 | 1.000038 | 1.000038 | 1.000038 |
| min | -1.779689 | -2.192862 | -1.841598 | -1.438534 | -1.26319 | -0.3395727 | -1.44559 | -2.358488 |
| 25% | -0.6883022 | -0.8439976 | -0.399789 | -0.1862247 | -0.5677128 | -0.08730434 | -0.7943021 | -1.110999 |
| 50% | -0.1717385 | 0.02879702 | -0.08497085 | -0.101238 | -0.2297322 | -0.02983685 | -0.6443653 | 0.5356868 |
| 75% | 0.4589652 | 0.6635568 | 0.2442975 | 0.001356624 | 0.2626018 | 0.03705159 | 0.9768266 | 0.7801947 |
| max | 5.886135 | 1.853731 | 50.95891 | 65.45067 | 24.07471 | 86.64295 | 2.958804 | 2.626478 |
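
The `count` of 13209 rows in the summary can be checked by hand. The sketch below assumes the dataset has 20,640 rows and that `train_test_split` rounds the held-out share up; both are my understanding of scikit-learn rather than something shown in this notebook. Because the second split takes 20% of the remaining 80%, the validation set is roughly 16% of the full data and the training set roughly 64%.

```python
import math


def split_sizes(n_samples, test_size):
    """Return (n_train, n_test), assuming the held-out share is rounded up."""
    n_test = math.ceil(test_size * n_samples)
    return n_samples - n_test, n_test


n_total = 20640  # assumed row count of the California housing data
n_train_full, n_test = split_sizes(n_total, 0.2)
n_train, n_valid = split_sizes(n_train_full, 0.2)

print(n_train, n_valid, n_test)  # 13209 3303 4128
```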


# Model

The model used for this example will be a wide and deep network with the following characteristics:
- a deep path with `n_hidden` hidden layers with `n_neurons` at each layer
- a wide path connecting all inputs to the output
- all layers are fully connected
- two outputs:
    - one from the deep path alone, fit to the target
    - one from the concatenated wide and deep paths, fit to the target

This type of multi-output architecture is typically used as a regularization technique, but I'm employing it here simply so my example has more than one loss to minimize simultaneously.  This model is very similar to the regression example in my [intro to Keras](https://github.com/mcnewcp/book-geron-ml-sklearn-keras-tensorflow/blob/main/10-intro-ann-keras/10-intro-ann-keras.ipynb) notebook and to the one in Chapter 10 of [Hands on ML](https://www.oreilly.com/library/view/hands-on-machine-learning/9781492032632/), so I won't explain the Keras code below.

The model-building code should be wrapped in a function so that the hyperparameters are exposed as arguments to the build and compile steps, ready for integration into hyperparameter tuning.  I'm pulling out the following hyperparameters for tuning:
- `n_hidden`: number of hidden layers
- `n_neurons`: number of neurons per layer
- `activation`: activation function used in hidden layers
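
As a rough illustration of how these three hyperparameters could feed a search, the sketch below expands a small grid into one dictionary per candidate. The candidate values and the `expand_grid` helper are hypothetical, chosen only for illustration; they are not the search space used in this notebook.

```python
from itertools import product

# Hypothetical candidate values, for illustration only
search_space = {
    "n_hidden": [1, 2, 3],
    "n_neurons": [10, 20],
    "activation": ["relu", "tanh"],
}


def expand_grid(space):
    """Return one dict per combination of the listed values."""
    keys = list(space)
    return [dict(zip(keys, values)) for values in product(*space.values())]


candidates = expand_grid(search_space)
print(len(candidates))  # 12
print(candidates[0])  # {'n_hidden': 1, 'n_neurons': 10, 'activation': 'relu'}
```

Because the keys match the argument names, each candidate could be passed to a build function as `build_model(**candidate)`.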

*Note*: I'm not tuning learning rate here.  In general I think it's best practice to choose a sufficiently low learning rate and a high number of epochs, and to use early stopping.  The goal of this stage of hyperparameter tuning is simply to identify promising model candidates.  Once promising candidates have been identified, the learning rate will be fine-tuned.
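
To make the early stopping idea concrete, here is a simplified pure-Python sketch of patience-based stopping. It is not the Keras implementation (it ignores options such as a minimum improvement threshold), and the loss values are made up.

```python
def early_stopping_epochs(val_losses, patience):
    """Return (best_epoch, stop_epoch) for a list of per-epoch validation losses."""
    best_loss = float("inf")
    best_epoch = 0
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return best_epoch, epoch  # stop here, keep the best epoch's weights
    return best_epoch, len(val_losses) - 1  # ran out of epochs without stopping


made_up_losses = [1.0, 0.8, 0.9, 0.85, 0.81, 0.7]
print(early_stopping_epochs(made_up_losses, patience=2))  # (1, 3)
```

With a patience of 2, training stops at epoch 3 and never reaches the better loss at epoch 5, which is the trade-off to keep in mind when choosing the patience.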


In [2]:
import tensorflow as tf
from tensorflow import keras

print("tf version:", tf.__version__, ", keras version:", keras.__version__)



2023-02-15 10:05:31.142284: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.


tf version: 2.11.0 , keras version: 2.11.0


In [3]:
def build_model(n_hidden=1, n_neurons=20, activation="relu"):
    """Build and compile a wide and deep regression model with two outputs."""
    inp = keras.layers.Input(shape=[8], name="input")  # 8 housing features
    hl = inp
    for _ in range(n_hidden):  # sequentially add hidden layers (deep path)
        hl = keras.layers.Dense(n_neurons, activation=activation)(hl)
    concat = keras.layers.Concatenate()([hl, inp])  # concat deep and wide paths
    main_output = keras.layers.Dense(1, name="main_output")(concat)  # combined output
    aux_output = keras.layers.Dense(1, name="aux_output")(hl)  # deep output
    model = keras.Model(inputs=[inp], outputs=[main_output, aux_output])
    model.compile(
        loss=["mse", "mse"],
        loss_weights=[0.9, 0.1],  # weighting heavily towards main output
        optimizer=keras.optimizers.SGD(learning_rate=1e-3),
    )
    return model
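
One quick sanity check on the architecture is to count the trainable parameters by hand. The helper below is a hypothetical sketch (the name `expected_param_count` is mine); it assumes at least one hidden layer, and I'd expect its result to agree with `model.count_params()` for a model built by `build_model`.

```python
def expected_param_count(n_hidden=1, n_neurons=20, n_inputs=8):
    """Trainable parameters of the wide and deep model, counted by hand."""
    total = 0
    width = n_inputs
    for _ in range(n_hidden):
        total += (width + 1) * n_neurons  # weights plus one bias per neuron
        width = n_neurons
    total += (width + n_inputs) + 1  # main output sees deep and wide features
    total += width + 1  # aux output sees the deep path only
    return total


print(expected_param_count())  # 230
print(expected_param_count(n_hidden=3, n_neurons=10))  # 340
```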


# MLflow Experiment Tracking

[MLflow](https://www.mlflow.org) is a full-featured, end-to-end ML lifecycle management platform, but in this example I'll only be using it for experiment tracking.  The [documentation on tracking](https://www.mlflow.org/docs/latest/tracking.html) is quite good, so I'm working primarily from that.  There is even an automatic logging submodule for Keras and TensorFlow, `mlflow.tensorflow.autolog()`, which I will try out first. In addition, nearly anything can be logged manually under the categories of metrics, parameters, tags, and artifacts.  Artifacts can be nearly anything, including plots or the models themselves.

## Auto-Logging

First I'll give the auto logging a shot and see what it logs.  I've had issues with the auto logging submodule for scikit-learn because it simply logged too many parameters to be useful and after the experiment count reached a certain threshold, performance in the dashboard tool suffered.

**Note**: the auto-logging submodule only works for tensorflow versions 2.3.0 - 2.11.0, which I had to specify manually with `pip` as the `conda` installer chose a version outside of that range.

By default, on first execution MLflow creates the directory `./mlruns` and stores all experiment-related information as individual files within it.  Other options are to store the information in a SQLite database or to log to Databricks.

In [4]:
import mlflow

mlflow.set_experiment(experiment_name="auto-log")  # this will create ./mlruns
mlflow.tensorflow.autolog()  # turn on auto logging

model = build_model(n_hidden=2)

with mlflow.start_run(run_name="auto-log-1") as run:
    history = model.fit(
        X_train,
        {"main_output": y_train, "aux_output": y_train},
        epochs=250,
        validation_data=(X_valid, {"main_output": y_valid, "aux_output": y_valid}),
        callbacks=[
            keras.callbacks.EarlyStopping(patience=5, restore_best_weights=True)
        ],
        verbose=0,
    )
    total_loss, main_loss, aux_loss = model.evaluate(
        X_test, {"main_output": y_test, "aux_output": y_test}
    )



2023/02/15 10:05:37 INFO mlflow.tracking.fluent: Experiment with name 'auto-log' does not exist. Creating a new experiment.






INFO:tensorflow:Assets written to: /var/folders/m8/0_prp1tj41s9n5xm0bfqp6wm0000gn/T/tmph4i6eno7/model/data/model/assets
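
The three values returned by `model.evaluate` are related through the `loss_weights` set at compile time: with no other loss terms in the model, the total should be the weighted sum of the two output losses. A small sketch with placeholder numbers (not results from this run):

```python
def weighted_total(losses, weights):
    """Combine per-output losses into the total loss Keras minimizes."""
    return sum(w * l for w, l in zip(weights, losses))


# Placeholder values, not results from the run above
main_loss, aux_loss = 0.40, 0.50
total_loss = weighted_total([main_loss, aux_loss], [0.9, 0.1])
print(round(total_loss, 4))  # 0.41
```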






## Local Server

To serve the exploration tool locally, run the following on the command line:

```zsh
mlflow ui
```

By default, this will look for local logged files in `./mlruns` and it will launch on port 5000.  If you need to change the location use `--backend-store-uri` and if you need to specify the port use `-p`.

The UI provides a simple table comparison of all runs within an experiment so you can quickly check the parameters used for each run and corresponding metrics to choose promising model candidates.  

It looks like the auto logger captures a lot of useful information, including many parameters inferred from the model, early stopping results, and learning curves for each loss (see below).  Unsurprisingly, though, it doesn't log anything about model architecture. That matters in my case, since most of my tuning will involve architecture changes, so these will need to be logged manually.

![MLflow UI Screenshot](images/mlflow_ui_sn.png)

## Manual Logging

It's quite easy to log additional hyperparameters via `mlflow.log_params()`.  It accepts a dictionary of parameters and logs them to the corresponding run, as long as it's called under `with mlflow.start_run():`.  This means I'll need to define my hyperparameters in a dictionary at the start of my run, which will also aid in integrating optimization later.  I'll modify the above workflow to include both of these changes below.

In [5]:
hyper_params = {"n_hidden": 1, "n_neurons": 20, "activation": "relu"}

mlflow.set_experiment(experiment_name="auto-and-manual")
mlflow.tensorflow.autolog()  # turn on auto logging

model = build_model(**hyper_params)  # names must match

with mlflow.start_run(run_name="run-1") as run:
    history = model.fit(
        X_train,
        {"main_output": y_train, "aux_output": y_train},
        epochs=250,
        validation_data=(X_valid, {"main_output": y_valid, "aux_output": y_valid}),
        callbacks=[
            keras.callbacks.EarlyStopping(patience=5, restore_best_weights=True)
        ],
        verbose=0,
    )
    mlflow.log_params(hyper_params)  # log all hyperparams



2023/02/15 10:06:45 INFO mlflow.tracking.fluent: Experiment with name 'auto-and-manual' does not exist. Creating a new experiment.






INFO:tensorflow:Assets written to: /var/folders/m8/0_prp1tj41s9n5xm0bfqp6wm0000gn/T/tmp9gt000uv/model/data/model/assets




In [6]:
hyper_params = {"n_hidden": 3, "n_neurons": 10, "activation": "relu"}
model = build_model(**hyper_params)  # names must match

with mlflow.start_run(run_name="run-2") as run:
    history = model.fit(
        X_train,
        {"main_output": y_train, "aux_output": y_train},
        epochs=250,
        validation_data=(X_valid, {"main_output": y_valid, "aux_output": y_valid}),
        callbacks=[
            keras.callbacks.EarlyStopping(patience=5, restore_best_weights=True)
        ],
        verbose=0,
    )
    mlflow.log_params(hyper_params)  # log all hyperparams







INFO:tensorflow:Assets written to: /var/folders/m8/0_prp1tj41s9n5xm0bfqp6wm0000gn/T/tmpg6p6qect/model/data/model/assets




# Train/Val Workflow

To use the above workflow in hyperparameter tuning experiments, I'm going to wrap it in a function below.  The inputs include the following:
- run_name
- training data
- validation data
- model
- hyperparameters (for logging)

The function should output a measure of loss to pass to optimization.

In [7]:
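def run_log_exp(
    run_name: str,
    train_data: tuple,
    val_data: tuple,
    model: keras.Model,
    hyper_params: dict = None,
) -> float:
    """Fit `model` inside an MLflow run and return the total validation loss."""
    with mlflow.start_run(run_name=run_name):
        model.fit(
            train_data[0],
            train_data[1],
            epochs=250,
            validation_data=val_data,
            callbacks=[
                keras.callbacks.EarlyStopping(patience=5, restore_best_weights=True)
            ],
            verbose=0,
        )
        if hyper_params:
            mlflow.log_params(hyper_params)  # architecture is not auto-logged
    # evaluate returns [total, main, aux]; only the total is passed to optimization
    val_total_loss, _, _ = model.evaluate(val_data[0], val_data[1], verbose=0)
    return val_total_loss


In [8]:
mlflow.set_experiment(experiment_name="exp-2")

hyper_params = {"n_hidden": 1, "n_neurons": 15, "activation": "relu"}
model = build_model(**hyper_params)  # names must match
loss = run_log_exp(
    run_name="run-1",
    train_data=(X_train, [y_train, y_train]),
    val_data=(X_valid, [y_valid, y_valid]),
    model=model,
    hyper_params=hyper_params,
)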

