# Tutorial 5: Model

## Overview

In this tutorial we will cover:

* [Instantiating and Compiling a Model](#t05compile)
* [The Model Function](#t05model)
    * [Custom Models](#t05custom)
    * [FastEstimator Models](#t05fe)
    * [Pre-Trained Models](#t05trained)
* [The Optimizer Function](#t05optimizer)
* [Loading Model Weights](#t05weights)
* [Specifying a Model Name](#t05name)
* [Related Apphub Examples](#t05apphub)

<a id='t05compile'></a>

## Instantiating and Compiling a Model

We need to specify two things to instantiate and compile a model:
* `model_fn`
* `optimizer_fn`

Model definitions can be implemented in TensorFlow or PyTorch. Calling **`fe.build`** constructs a model instance and associates it with the specified optimizer.

<a id='t05model'></a>

## Model Function

`model_fn` should be a function/lambda function which returns either a `tf.keras.Model` or `torch.nn.Module`. FastEstimator provides several ways to specify the model architecture:

* Custom model architecture
* Importing a pre-built model architecture from FastEstimator
* Importing pre-trained models/architectures from PyTorch or TensorFlow
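As a rough illustration of the first option, the sketch below wraps a small custom PyTorch architecture in a zero-argument function of the kind `model_fn` expects. `TinyClassifier` and `tiny_model_fn` are hypothetical names invented for this example and are not part of FastEstimator; the layer sizes are arbitrary.

```python
import torch
import torch.nn as nn


class TinyClassifier(nn.Module):
    """Hypothetical small CNN for 3x32x32 inputs (illustration only)."""
    def __init__(self, num_classes: int = 10) -> None:
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(8, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = torch.relu(self.conv(x))
        # Global average pooling, then flatten to (batch, channels)
        x = self.pool(x).flatten(1)
        return self.fc(x)


def tiny_model_fn() -> nn.Module:
    # Zero-argument callable that returns a fresh torch.nn.Module on each call
    return TinyClassifier(num_classes=10)


model = tiny_model_fn()
logits = model(torch.zeros(4, 3, 32, 32))
print(logits.shape)
```

A function like this could then be passed to `fe.build`, for example `fe.build(model_fn=tiny_model_fn, optimizer_fn="adam")`.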

<a id='t05custom'></a>

### Custom model architecture
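Let's build a custom model for demonstration. We first show how a torchvision ResNet-50 can be given a custom classification head, then train a TensorFlow LeNet on ciFAIR-100 and fine-tune a second LeNet on ciFAIR-10.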

#### tf.keras.Model

In [None]:
import os
import tempfile

import torch
import torch.nn as nn
import torch.nn.functional as fn
from torchvision import models

import fastestimator as fe
from fastestimator.architecture.tensorflow import LeNet
from fastestimator.dataset.data import cifair10, cifair100
from fastestimator.op.tensorop.loss import CrossEntropy
from fastestimator.op.tensorop.model import ModelOp, UpdateOp
from fastestimator.trace.io import BestModelSaver
from fastestimator.trace.metric import Accuracy
from fastestimator.op.numpyop.univariate import Minmax


In [None]:
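# Use the GPU when one is available
use_cuda = torch.cuda.is_available()

# Load a torchvision ResNet-50 with pre-trained weights;
# model.cuda() moves every submodule, including model.fc, to the GPU
model = models.resnet50(pretrained=True)
model = model.cuda() if use_cuda else model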

In [None]:
class my_model_torch():
    """Wraps a torchvision ResNet and replaces its final layer with a custom head."""
    def __init__(self, model):
        self.model = model

    def get_classification_model(self, num_classes=10):
        base_model = self.model
        # Input size of the original fully connected layer
        num_ftrs = base_model.fc.in_features
        # Replace the original head with a stack of dropout and linear layers
        base_model.fc = nn.Sequential(
            torch.nn.Dropout(0.5),
            torch.nn.Linear(num_ftrs, 1024),
            torch.nn.Dropout(0.2),
            torch.nn.Linear(1024, 512),
            torch.nn.Dropout(0.2),
            torch.nn.Linear(512, 256),
            torch.nn.Dropout(0.2),
            torch.nn.Linear(256, 128),
            torch.nn.Dropout(0.2),
            torch.nn.Linear(128, num_classes))
        return base_model

In [None]:
model_dir = tempfile.mkdtemp()

In [None]:
train_data, eval_data = cifair100.load_data()

pipeline = fe.Pipeline(train_data=train_data,
                       eval_data=eval_data,
                       batch_size=32,
                       ops=[Minmax(inputs="x", outputs="x")])

In [None]:
base_model = LeNet(input_shape=(32, 32, 3), classes=100)

model = fe.build(model_fn=lambda: base_model, optimizer_fn="adam")

network = fe.Network(ops=[
    ModelOp(model=model, inputs="x", outputs="y_pred"),
    CrossEntropy(inputs=("y_pred", "y"), outputs="ce"),
    UpdateOp(model=model, loss_name="ce")
])

In [None]:
traces = [Accuracy(true_key="y", pred_key="y_pred"),
          BestModelSaver(model=model, save_dir=model_dir, metric="accuracy", save_best_mode="max")]

estimator = fe.Estimator(pipeline=pipeline,
                         network=network,
                         epochs=10,
                         traces=traces)

In [None]:
estimator.fit()

In [None]:
# TensorFlow: save the trained weights as lenet_tf.h5 in model_dir
fe.backend.save_model(model, save_dir=model_dir, model_name="lenet_tf")

In [None]:
import numpy as np
import random

data = eval_data[random.choice(range(0, len(eval_data)))]
data = pipeline.transform(data, mode="eval")
data = network.transform(data, mode="eval")

print("Ground truth class is {}".format(data["y"][0]))
print("Predicted class is {}".format(np.argmax(data["y_pred"])))
img = fe.util.ImgData(x=data["x"])
fig = img.paint_figure()

Next, we load the saved pre-trained weights into a new model:

In [None]:
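model_path = os.path.join(model_dir, 'lenet_tf.h5')
# TensorFlow: build the same architecture with a 10-class output layer
custom_model2 = LeNet(input_shape=(32, 32, 3), classes=10)

# Load weights by layer name; skip_mismatch=True skips layers whose shapes differ
# (here the final layer, which has 10 outputs instead of 100) instead of raising an error
custom_model2.load_weights(model_path, by_name=True, skip_mismatch=True)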

In [None]:
# Freeze the first three layers so that only the remaining layers are updated
num_frozen_layers = 3
for layer in custom_model2.layers[:num_frozen_layers]:
    layer.trainable = False

In [None]:
custom_model2 = fe.build(model_fn=lambda: custom_model2, optimizer_fn="adam")

custom_network = fe.Network(ops=[
        ModelOp(model=custom_model2, inputs="x", outputs="y_pred"),
        CrossEntropy(inputs=("y_pred", "y"), outputs="ce"),
        UpdateOp(model=custom_model2, loss_name="ce") 
    ])

In [None]:
custom_train_data, custom_eval_data = cifair10.load_data()

custom_pipeline = fe.Pipeline(train_data=custom_train_data,
                       eval_data=custom_eval_data,
                       batch_size=32,
                       ops=[Minmax(inputs="x", outputs="x")])

In [None]:
custom_traces = [Accuracy(true_key="y", pred_key="y_pred"),
          BestModelSaver(model=custom_model2, save_dir=model_dir, metric="accuracy", save_best_mode="max")]

customer_estimator = fe.Estimator(pipeline=custom_pipeline,
                         network=custom_network,
                         epochs=5,
                         traces=custom_traces)

In [None]:
customer_estimator.fit()

In [None]:
# Unfreeze every layer so the whole model can be fine-tuned
for layer in custom_model2.layers:
    layer.trainable = True

In [None]:
customer_estimator.fit()

In [None]:
import numpy as np

data = custom_eval_data[random.choice(range(0, len(custom_eval_data)))]
data = custom_pipeline.transform(data, mode="eval")
data = custom_network.transform(data, mode="eval")

print("Ground truth class is {}".format(data["y"][0]))
print("Predicted class is {}".format(np.argmax(data["y_pred"])))
img = fe.util.ImgData(x=data["x"])
fig = img.paint_figure()

<a id='t05trained'></a>

### Importing pre-trained models/architectures from PyTorch or TensorFlow

Below we show how to define a model function using a pre-trained ResNet model provided by TensorFlow and PyTorch respectively. In each case, a lambda function loads the pre-trained model.

#### Pre-trained model from tf.keras.applications 

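In [None]:
from tensorflow.keras.applications import ResNet50

resnet50_tf = fe.build(model_fn=lambda: ResNet50(weights='imagenet'), optimizer_fn="adam")

#### Pre-trained model from torchvision

In [None]:
from torchvision import models

resnet50_torch = fe.build(model_fn=lambda: models.resnet50(pretrained=True), optimizer_fn="adam")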

<a id='t05optimizer'></a>

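## Optimizer Function

`optimizer_fn` can be a string such as `"adam"`, as used throughout this tutorial, or a lambda function that returns an optimizer instance. If a model function returns multiple models, provide a list of optimizers, one per model. See the **[pggan apphub](../../apphub/image_generation/pggan/pggan.ipynb)** for an example with multiple models and optimizers.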
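The pairing of several models with several optimizers can be pictured with the pure-Python sketch below. It assumes that models and optimizers are matched one-to-one by position. `pair_models_with_optimizers` is a hypothetical helper written for this illustration, not FastEstimator code, and plain strings stand in for real model objects.

```python
from typing import Any, Callable, List, Sequence, Tuple, Union


def pair_models_with_optimizers(
        model_fn: Callable[[], Any],
        optimizer_fn: Union[str, Sequence[str]]) -> List[Tuple[Any, str]]:
    """Hypothetical helper (NOT FastEstimator code): pair each model with one optimizer."""
    models = model_fn()
    # Normalize a single model or a single optimizer into a one-element list
    if not isinstance(models, (list, tuple)):
        models = [models]
    optimizers = [optimizer_fn] if isinstance(optimizer_fn, str) else list(optimizer_fn)
    if len(models) != len(optimizers):
        raise ValueError(f"got {len(models)} models but {len(optimizers)} optimizers")
    # Assumption: the i-th optimizer belongs to the i-th model
    return list(zip(models, optimizers))


# Strings stand in for real model objects here
pairs = pair_models_with_optimizers(lambda: ("generator", "discriminator"), ["adam", "sgd"])
print(pairs)
```

With this convention, supplying two models but only one optimizer would be rejected rather than silently sharing the optimizer.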

<a id='t05weights'></a>

## Loading Model Weights

We often need to load the weights of a saved model. To do so, pass the path of the saved weights to `fe.build` through the `weights_path` parameter. Let's use the ResNet models created earlier to showcase this.

#### Saving model weights
Here, we create a temporary directory and use the FastEstimator backend to save the weights of our previously created ResNet-50 models:

In [None]:
import os
import tempfile

model_dir = tempfile.mkdtemp()

# TensorFlow
fe.backend.save_model(resnet50_tf, save_dir=model_dir, model_name="resnet50_tf")

# PyTorch
fe.backend.save_model(resnet50_torch, save_dir=model_dir, model_name="resnet50_torch")

#### Loading weights for TensorFlow and PyTorch models

In [None]:
# TensorFlow: start from untrained weights, then load the saved ones via weights_path
resnet50_tf = fe.build(model_fn=lambda: ResNet50(weights=None),
                       optimizer_fn="adam",
                       weights_path=os.path.join(model_dir, "resnet50_tf.h5"))

In [None]:
# PyTorch
resnet50_torch = fe.build(model_fn=lambda: models.resnet50(pretrained=False), 
                          optimizer_fn="adam", 
                          weights_path=os.path.join(model_dir, "resnet50_torch.pt"))
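For PyTorch, the general idea of saving weights and loading them into a freshly constructed model can be sketched with plain `torch.save` and `load_state_dict`. This is a conceptual illustration only, assuming the weights are stored as a state dict; it is not a description of what `fe.build` does internally. The file name `demo_weights.pt` and the tiny `nn.Linear` models are made up for the example.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Two independently initialized copies of the same (hypothetical) architecture
source = nn.Linear(4, 2)
target = nn.Linear(4, 2)

# Save only the parameters (the state dict), not the whole module
weights_file = os.path.join(tempfile.mkdtemp(), "demo_weights.pt")
torch.save(source.state_dict(), weights_file)

# Load the saved parameters into the second model
target.load_state_dict(torch.load(weights_file))

# Both models should now produce the same output for the same input
x = torch.ones(1, 4)
same_output = torch.allclose(source(x), target(x))
print(same_output)
```

The architecture built by the model function must match the one that produced the saved weights; otherwise loading fails with missing or mismatched keys.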

<a id='t05name'></a>

## Specifying a Model Name

The name of a model can be specified using the `model_name` parameter. The name of the model is helpful in distinguishing models when multiple are present.

In [None]:
model = fe.build(model_fn=LeNet, optimizer_fn="adam", model_name="LeNet")
print("Model Name: ", model.model_name)

If a model function returns multiple models, a list of names can be passed to `model_name`. See the **[pggan apphub](../../apphub/image_generation/pggan/pggan.ipynb)** for an illustration with multiple models and model names.

<a id='t05apphub'></a>

## Apphub Examples
You can find some practical examples of the concepts described here in the following FastEstimator Apphubs:

* [PG-GAN](../../apphub/image_generation/pggan/pggan.ipynb)
* [Uncertainty Weighted Loss](../../apphub/multi_task_learning/uncertainty_weighted_loss/uncertainty_loss.ipynb)