<sub>&copy; 2020 Neuralmagic, Inc., Confidential // [Neural Magic Evaluation License Agreement](https://neuralmagic.com/evaluation-license-agreement/)</sub> 

# TensorFlow Transfer Learning with Adam Optimizer

This notebook provides a step-by-step walkthrough for downloading a recalibrated model from the Neural Magic Model Repo and using it for transfer learning. You will:
- Set up the environment
- Select a model*
- Set up the model and dataset
- Perform transfer learning
- Export to [ONNX](https://onnx.ai/)
 
\* Models available in the Neural Magic Model Repo are called out in this notebook. You can familiarize yourself with the models from which you can transfer learn.


This notebook can be read through fairly quickly to gain an intuition for what is happening. Rough time estimates for running the transfer learning step are given below. Note that training with the TensorFlow CPU implementation is much slower than training on a GPU:
- 30 minutes on a GPU
- 3 hours on a laptop CPU

## Background
Neural networks can take a long time to train. Model optimization techniques such as [model pruning](https://towardsdatascience.com/pruning-deep-neural-network-56cae1ec5505) may be necessary to achieve both performance and optimization goals. However, these optimizations can involve much trial and error because of the large number of hyperparameters. Fortunately, in the computer vision and natural language domains, pruned (sparsified) neural networks [transfer learn well](https://towardsdatascience.com/a-comprehensive-hands-on-guide-to-transfer-learning-with-real-world-applications-in-deep-learning-212bf3b2f27a), giving end users faster time to value with their deep learning deployments without having to start from scratch.

To make it easier to use pruned models, [Neural Magic](https://neuralmagic.com/) is actively:
- Creating pruned versions of popular models and datasets
- Thoroughly testing these models with the Neural Magic Inference Engine to ensure performance
- Updating the Neural Magic Repo with these models and datasets

## Before you begin…
Be sure to read through the README found in the Neural Magic ML Tooling package.



## Step 1 - Setting Up the Environment

In this step, the notebook checks your environment setup to ensure the rest of the notebook will flow smoothly.
Before running, install the neuralmagicML package by running the following command from the parent of the package directory:

`pip install neuralmagicML-python/`


In [None]:
notebook_name = "transfer_learning_adam_tensorflow"
print("checking setup for {}...".format(notebook_name))

# filter because of tensorboard future warnings
import warnings

warnings.filterwarnings("ignore", category=FutureWarning)

try:
    # make sure neuralmagicML is installed
    import neuralmagicML
except ImportError as ex:
    # chain the original error so the underlying import failure stays visible
    raise ImportError(
        "please install neuralmagicML using the setup.py file before continuing"
    ) from ex

from neuralmagicML.utilsnb import check_tensorflow_notebook_setup

check_tensorflow_notebook_setup()

## Step 2 - Selecting a Model

Repositories may hold many models, so a simple UI is provided to make the selection process easier. Within the UI, filters can be applied for models trained in specific domains or on specific datasets. Each network architecture listed also includes options for the dataset it was trained on and the type. The type refers to how the model was trained and/or recalibrated, specifically:
- base - baseline model, trained generally as in the original paper
- recal - a model recalibrated to the point of fully recovering the baseline model’s metrics
- recal-perf - a model recalibrated for performance to the point of recovering 99% of the baseline model’s metrics
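
To make the recovery thresholds concrete, the check below is a minimal sketch of how a recalibrated model's metric might be compared with its baseline. The function name `recovers_baseline` and the accuracy values are hypothetical and not part of neuralmagicML; the 99% fraction is the recal-perf threshold described above.

```python
def recovers_baseline(baseline_metric, recal_metric, fraction=1.0):
    """Return True if recal_metric reaches `fraction` of baseline_metric.

    Assumes a higher-is-better metric such as top-1 accuracy.
    """
    return recal_metric >= fraction * baseline_metric


# Hypothetical accuracies, for illustration only
baseline_acc = 0.760
recal_perf_acc = 0.755

# recal requires full recovery; recal-perf requires 99% of the baseline
print(recovers_baseline(baseline_acc, recal_perf_acc))
print(recovers_baseline(baseline_acc, recal_perf_acc, fraction=0.99))
```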


In [None]:
from neuralmagicML.utilsnb import ModelSelectWidgetContainer

print("Creating ui...")
container = ModelSelectWidgetContainer(["tensorflow"], ["imagenet"])
display(container.create())

## Step 3 - Performing Transfer Learning

By default, this notebook transfer learns onto the [Imagewoof](https://github.com/fastai/imagenette) dataset, which consists of 10 classes of dogs. It is used to show how to transfer learn quickly on a simple dataset. If you would like to try transfer learning on your own dataset, replace the appropriate lines with your own:
- `num_classes = 10`
- `class_type = "single"`
- `train_dataset = ...`
- `val_dataset = ...`

More information on creating and working with TensorFlow datasets can be found [here](https://www.tensorflow.org/guide/data). Take care to keep the variable names the same, because the rest of the notebook depends on them.
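
When swapping in your own data, keep in mind that the step counts used by the training loop also depend on the dataset length and the batch size. The following is a minimal sketch of that arithmetic, mirroring the `math.ceil(train_len / float(batch_size))` computation in the training cell. The helper `steps_per_epoch` and the dataset sizes are hypothetical.

```python
import math


def steps_per_epoch(num_examples, batch_size):
    """Number of batches needed to see every example once (last batch may be partial)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return math.ceil(num_examples / batch_size)


# Hypothetical dataset sizes, for illustration only
train_steps = steps_per_epoch(num_examples=9000, batch_size=64)
val_steps = steps_per_epoch(num_examples=3900, batch_size=64)
print(train_steps, val_steps)
```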

The model is created in the graph based on the previous UI selections.

With that setup, you will begin transfer learning from the selected model onto the dataset. The library that enables this is designed to plug easily into nearly any TensorFlow training setup. The cell below shows an example of such an integration. The implementation here trains all layers in the selected model. If you do not want that, you can freeze specific layers with standard TensorFlow code. Only a few lines are needed to integrate fully:
- Create a `ConstantKSModifier()`. This keeps the sparsity the same for any sparsified layers.
- Create a `ScheduledModifierManager()`. This is used in combination with the `ConstantKSModifier`.
- Invoke `manager.create_ops()` for the desired graph. This updates the TensorFlow graph with the proper operators that modify the training process.
- Use `manager.max_epochs` to know how many epochs are needed for training.
- Invoke `manager.initialize_session()` after the pretrained weights are loaded. This initializes the modifiers from the pretrained model.
- Invoke `sess.run(mod_ops)` on each optimizer step. This updates the modifying operators and variables in the TensorFlow graph.
- Invoke `manager.complete_graph()` once training has completed. This will clean up the graph and set any final state for graph export and saving.
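
To build intuition for what keeping sparsity constant means, here is a toy sketch in plain Python. It is not the neuralmagicML implementation and does not use TensorFlow; the helper names (`sparsity_mask`, `masked_sgd_step`, `sparsity`) and all numbers are hypothetical, and plain SGD stands in for Adam to keep the example short. The idea shown is that pruned (zero) weights are recorded in a mask once, and the mask is reapplied after every optimizer step so those weights stay at zero.

```python
def sparsity_mask(weights):
    """Record which weights are non-zero (1.0) and which are pruned (0.0)."""
    return [0.0 if w == 0.0 else 1.0 for w in weights]


def masked_sgd_step(weights, grads, mask, lr=0.1):
    """Apply a plain SGD update, then re-zero the pruned positions."""
    updated = [w - lr * g for w, g in zip(weights, grads)]
    return [w * m for w, m in zip(updated, mask)]


def sparsity(weights):
    """Fraction of weights that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)


# Hypothetical pruned layer: half of the weights are zero
weights = [0.5, 0.0, -0.3, 0.0]
mask = sparsity_mask(weights)

# Gradients are non-zero everywhere, so an unmasked update would
# turn the pruned weights dense again
grads = [0.2, -0.4, 0.1, 0.3]
for _ in range(3):
    weights = masked_sgd_step(weights, grads, mask)

print(sparsity(weights))
```

The unpruned weights still move with each update, while the pruned positions remain zero, so the layer's sparsity is the same after training as before.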

Once the training objects are created (optimizer, loss function, etc.), a `ScheduledModifierManager` is instantiated from the configuration. Most logging and updates are done through TensorBoard for this notebook. The use of TensorBoard is entirely optional. Finally, regular training and testing code is used to go through the process.

Note that, for convenience, the cell below launches a TensorBoard instance pointed at `localhost`. If you are running this notebook on a remote server, you will need to update the TensorBoard settings accordingly.


In [None]:
import os
import math
from tqdm import auto
import numpy

from neuralmagicML.utils import create_unique_dir, clean_path, create_dirs
from neuralmagicML.tensorflow.models import ModelRegistry
from neuralmagicML.tensorflow.datasets import (
    ImagewoofDataset,
    ImagenetteSize,
    create_split_iterators_handle,
)
from neuralmagicML.tensorflow.utils import (
    tf_compat,
    batch_cross_entropy_loss,
    accuracy,
    write_simple_summary,
)

with tf_compat.Graph().as_default() as graph:
    batch_size = 64
    num_classes = 10
    class_type = "single"
    num_epochs = 20
    repo_model = container.selected_model
    model_name = repo_model.registry_key
    input_shape = ModelRegistry.input_shape(model_name)
    input_size = input_shape[0]

    # create the datasets
    with tf_compat.device("/cpu:0"):
        print("loading datasets")
        train_dataset = ImagewoofDataset(
            train=True, rand_trans=True, dataset_size=ImagenetteSize.s320, image_size=input_size
        )
        train_len = len(train_dataset)
        train_dataset = train_dataset.build(
            batch_size,
            shuffle_buffer_size=1000,
            prefetch_buffer_size=batch_size,
            num_parallel_calls=4,
        )
        train_steps = math.ceil(train_len / float(batch_size))

        val_dataset = ImagewoofDataset(
            train=False, rand_trans=False, dataset_size=ImagenetteSize.s320, image_size=input_size
        )
        val_len = len(val_dataset)
        val_dataset = val_dataset.build(
            batch_size,
            shuffle_buffer_size=1000,
            prefetch_buffer_size=batch_size,
            num_parallel_calls=4,
        )
        val_steps = math.ceil(val_len / float(batch_size))

    handle, iterator, (train_iter, val_iter) = create_split_iterators_handle(
        [train_dataset, val_dataset]
    )
    images, labels = iterator.get_next()
    training = tf_compat.placeholder(dtype=tf_compat.bool, shape=[])

    # create the model and graph
    print("Creating model graph for {}".format(model_name))
    logits = ModelRegistry.create(
        model_name,
        inputs=images,
        training=training,
        num_classes=num_classes,
        class_type=class_type,
    )

    print("Creating loss, accuracy, and optimizer in graph")
    loss = batch_cross_entropy_loss(logits, labels)
    acc = accuracy(logits, labels)
    global_step = tf_compat.train.get_or_create_global_step()
    train_op = tf_compat.train.AdamOptimizer(learning_rate=1e-4).minimize(
        loss, global_step=global_step
    )
    update_ops = tf_compat.get_collection(tf_compat.GraphKeys.UPDATE_OPS)
    
    #######################################################
    # First lines required for transfer learning from a sparse model in TensorFlow
    #######################################################
    print("Creating constant sparse ops in graph")
    from neuralmagicML.tensorflow.recal import (
        ScheduledModifierManager,
        ConstantKSModifier,
    )
    manager = ScheduledModifierManager([ConstantKSModifier(params="__ALL__")])
    mod_ops, mod_extras = manager.create_ops(train_steps, global_step)

    with tf_compat.Session() as sess:
        tensorboard_path = create_unique_dir(
            os.path.join(".", "tensorboard-logs", notebook_name, model_name)
        )
        print("logging tensorboard to {}".format(tensorboard_path))
        summary_writer = tf_compat.summary.FileWriter(tensorboard_path, sess.graph)
        summaries = tf_compat.summary.merge_all()
        
        # startup tensorboard
        %load_ext tensorboard
        %tensorboard --logdir ./tensorboard-logs

        print("initializing")
        sess.run(
            [
                tf_compat.global_variables_initializer(),
                tf_compat.local_variables_initializer(),
            ]
        )
        train_iter_handle, val_iter_handle = sess.run(
            [train_iter.string_handle(), val_iter.string_handle()]
        )

        print("restoring pre-trained model weights")
        ModelRegistry.load_pretrained(
            model_name, pretrained=repo_model.desc, remove_dynamic_tl_vars=True,
        )

        #######################################################
        # Initialization line required for transfer learning from a sparse model in TensorFlow
        # Note, initialization is called after load_pretrained to initialize from pretrained
        #######################################################
        manager.initialize_session()

        for epoch in auto.tqdm(range(num_epochs), desc="transfer learning"):
            print("training for epoch {}...".format(epoch))
            sess.run(train_iter.initializer)

            for step in range(train_steps):
                _, __, meas_step, meas_loss, meas_acc, meas_summ = sess.run(
                    [train_op, update_ops, global_step, loss, acc, summaries],
                    feed_dict={handle: train_iter_handle, training: True},
                )

                if step >= train_steps - 1:
                    # log the general summaries on the last training step
                    summary_writer.add_summary(meas_summ, meas_step)

                #######################################################
                # Modifier update ops line for transfer learning from a sparse model in TensorFlow
                #######################################################
                sess.run(mod_ops)

                write_simple_summary(summary_writer, "Train/Loss", meas_loss, meas_step)
                write_simple_summary(
                    summary_writer, "Train/Acc", meas_acc * 100.0, meas_step
                )

            print("validating for epoch {}...".format(epoch))
            sess.run(val_iter.initializer)
            val_losses = []
            val_acc = []

            for step in range(val_steps):
                meas_loss, meas_acc = sess.run(
                    [loss, acc], feed_dict={handle: val_iter_handle, training: False},
                )
                val_losses.append(meas_loss)
                val_acc.append(meas_acc)

            # log the epoch-level validation metrics once all batches are measured
            write_simple_summary(
                summary_writer, "Val/Loss", numpy.mean(val_losses).item(), epoch
            )
            write_simple_summary(
                summary_writer, "Val/Acc", numpy.mean(val_acc).item() * 100.0, epoch
            )
            print(
                "completed epoch {} with val acc {}".format(
                    epoch, numpy.mean(val_acc).item() * 100
                )
            )

        #######################################################
        # Final line for transfer learning from a sparse model in TensorFlow, complete the graph
        #######################################################
        manager.complete_graph()

        checkpoint_path = create_unique_dir(
            os.path.join(".", notebook_name, model_name, "checkpoint")
        )
        checkpoint_path = os.path.join(checkpoint_path, "model")
        create_dirs(checkpoint_path)
        saver = ModelRegistry.saver(model_name)
        saver.save(sess, checkpoint_path)
        print("saved model checkpoint to {}".format(checkpoint_path))


## Step 4 - Exporting to ONNX

Now that the model is fully recalibrated, you need to export it to the ONNX format, which is the format used by the Neural Magic Inference Engine. TensorFlow does not natively support exporting to ONNX. To add support, you will use the `tf2onnx` Python package. In the cell below, a convenience class, `GraphExporter()`, handles exporting. It wraps the somewhat complicated `tf2onnx` API in an easy-to-use interface.

Note that, for some configurations, the `tf2onnx` code does not work properly in a Jupyter Notebook. To remedy this, run the `exporter.export_onnx()` function call in a Python console or script.

Once the model is saved as an ONNX file, it is ready to be used for inference with Neural Magic.


In [None]:
from neuralmagicML.utils import create_unique_dir, clean_path, create_dirs
from neuralmagicML.tensorflow.utils import GraphExporter

export_path = clean_path(os.path.join(".", notebook_name, model_name, "exported"))
exporter = GraphExporter(export_path)

with tf_compat.Graph().as_default() as graph:
    print("Recreating graph...", flush=True)

    input_shape = ModelRegistry.input_shape(model_name)
    images = tf_compat.placeholder(
        tf_compat.float32, [None, input_size, input_size, 3], name="inputs"
    )
    logits = ModelRegistry.create(
        model_name,
        inputs=images,
        training=False,
        num_classes=num_classes,
        class_type=class_type,
    )

    input_names = [images.name]
    output_names = [logits.name]

    with tf_compat.Session() as sess:
        sess.run(tf_compat.global_variables_initializer())
        print("Restoring previous weights...", flush=True)
        saver = ModelRegistry.saver(model_name)
        saver.restore(sess, checkpoint_path)

        print("Exporting to pb...", flush=True)
        exporter.export_pb(outputs=[logits])
        print("Exported pb file to {}".format(exporter.pb_path), flush=True)

print("Exporting to onnx...", flush=True)
exporter.export_onnx(inputs=input_names, outputs=output_names)
print("Exported onnx file to {}".format(exporter.onnx_path))

## Next Step

Run your model (ONNX file) through the Neural Magic Inference Engine. The following is an example of code that you can run in your Python console. Be sure to enter your ONNX file path and batch size.

```
import numpy
from neuralmagic import create_model

model = create_model(onnx_file_path="some/path/to/model.onnx", batch_size=1)
inp = [numpy.random.rand(1, 3, 224, 224).astype(numpy.float32)]
out = model.forward(inp)
print(out)
```
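
To turn raw output into a class prediction, something like the following sketch could be used. It assumes, without confirmation from the Neural Magic documentation, that each row of the output holds one score per class; the helper `top_class` and the sample scores are hypothetical.

```python
def top_class(scores):
    """Return the index of the highest score in one row of model output."""
    return max(range(len(scores)), key=lambda i: scores[i])


# Hypothetical scores for a 10-class model, for illustration only
sample_scores = [0.01, 0.02, 0.70, 0.05, 0.03, 0.04, 0.05, 0.04, 0.03, 0.03]
print(top_class(sample_scores))
```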