**HOW TO INSTALL THE PROXY PLUGIN ON MEAD**
**ATTENTION** THIS SECTION WILL BE REMOVED AFTER THE MEAD PROXY PLUGIN IS INSTALLED IN MEAD USER DATA. THESE INSTRUCTIONS WILL NOT BE PART OF THE NOTEBOOK FOR GA.

OPEN A TERMINAL IN JUPYTER:
File->Open->New->Terminal

```
sudo su
source /home/ec2-user/anaconda3/bin/activate JupyterSystemEnv
pip install git+https://github.com/jupyterhub/nbserverproxy@v0.3.2
jupyter serverextension enable --py nbserverproxy --sys-prefix
source deactivate
restart part-003
```

```restart part-003``` will restart the Jupyter notebook server so that it loads the plugin required to run TensorBoard.
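To confirm that the package was installed, one option is to check that it is importable. This is a minimal sketch, assuming it is run from a Python prompt in the same terminal while `JupyterSystemEnv` is still active (before `source deactivate`); the helper name `is_installed` is illustrative and not part of any library.

```python
import importlib.util


def is_installed(module_name):
    """Return True if `module_name` can be found on the current Python path."""
    return importlib.util.find_spec(module_name) is not None


# Check the package installed by the pip command above.
print("nbserverproxy importable:", is_installed("nbserverproxy"))
```

A result of `False` most likely means the package was installed into a different environment than the one the check ran in.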

# ResNet CIFAR-10 with TensorBoard

This notebook shows how to use TensorBoard and how the training job writes checkpoints to an external bucket.
The model used in this notebook is a ResNet model, trained on the CIFAR-10 dataset.
See the following papers for more background:

[Deep Residual Learning for Image Recognition](https://arxiv.org/pdf/1512.03385.pdf) by Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun, Dec 2015.

[Identity Mappings in Deep Residual Networks](https://arxiv.org/pdf/1603.05027.pdf) by Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun, Jul 2016.

### Let's start by setting up the environment.

In [1]:
import os
import sagemaker
import tensorflow

sagemaker_session = sagemaker.Session()

# Replace with a role (either name or full ARN) that gives SageMaker access to S3 and CloudWatch
role='SageMakerRole'

### Downloading CIFAR-10 dataset
Downloading the test and training data will take around 5 minutes.

In [2]:
import utils

utils.cifar10_download()

cifar dataset already downloaded


### Uploading the data to an S3 bucket

In [3]:
inputs = sagemaker_session.upload_data(path='/tmp/cifar10_data', key_prefix='data/cifar10')

**sagemaker_session.upload_data** uploads the CIFAR-10 dataset from your machine to a bucket named **sagemaker-{*your aws account number*}**. If you don't have this bucket yet, sagemaker_session will create it for you.
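As a rough illustration of the value stored in `inputs`, the sketch below builds an S3 URI from a bucket name and a key prefix. It assumes the returned value has the form `s3://{bucket}/{key_prefix}` when a directory is uploaded; the helper `expected_s3_uri` and the account number `123456789012` are hypothetical and used only for illustration.

```python
def expected_s3_uri(bucket, key_prefix):
    """Build the S3 URI for a directory uploaded under `key_prefix`."""
    return "s3://{}/{}".format(bucket, key_prefix.strip("/"))


# Hypothetical bucket name following the pattern described above.
bucket = "sagemaker-123456789012"
print(expected_s3_uri(bucket, "data/cifar10"))
```

In the notebook itself, `print(inputs)` shows the actual URI, which is the value later passed to `estimator.fit`.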

### Complete source code

In [4]:
!tree

.
├── __init__.py
├── __pycache__
│   └── utils.cpython-36.pyc
├── source_dir
│   ├── __init__.py
│   ├── resnet_cifar_10.py
│   └── resnet_model.py
├── tensorflow_resnet_cifar10_with_tensorboard.ipynb
└── utils.py

2 directories, 7 files


## Running TensorFlow training on SageMaker

In [None]:
from sagemaker.tensorflow import TensorFlow


# Directory containing the training script and the model definition
source_dir = os.path.join(os.getcwd(), 'source_dir')
estimator = TensorFlow(entry_point='resnet_cifar_10.py',
                       source_dir=source_dir,
                       role=role,
                       hyperparameters={'training_steps': 1000, 'evaluation_steps': 100},
                       train_instance_count=2, train_instance_type='ml.p2.xlarge',
                       base_job_name='tensorboard-example')

estimator.fit(inputs, run_tensorboard_locally=True)

The **```fit```** method will create a training job named **```tensorboard-example-{unique identifier}```** with 2 p2 instances. These instances write checkpoints to the S3 bucket **```sagemaker-{your aws account number}```**. If you don't have this bucket yet, sagemaker_session will create it for you. The checkpoints can be used to restore the training job and to analyze training job metrics with **TensorBoard**.

The parameter **```run_tensorboard_locally=True```** runs **TensorBoard** on the machine where this notebook is running. Every time the training job writes a new checkpoint to the S3 bucket, **fit** downloads that checkpoint to the temporary folder that **TensorBoard** is pointing to.
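The download step can be pictured as a one-way sync that copies only the files it has not seen yet. The sketch below illustrates that idea and is not the SageMaker SDK's implementation: `list_keys` and `download` are hypothetical callables standing in for an S3 listing and an S3 download, and the key names are made up.

```python
def sync_new_checkpoints(list_keys, download, seen):
    """Download every remote key not already in `seen`; return the new keys.

    list_keys: callable returning an iterable of remote object keys.
    download:  callable taking one key and copying it to the local folder.
    seen:      set of keys downloaded so far (updated in place).
    """
    new_keys = [key for key in sorted(list_keys()) if key not in seen]
    for key in new_keys:
        download(key)
        seen.add(key)
    return new_keys


# Simulated bucket contents; a real version would list the S3 bucket instead.
remote = ["checkpoints/model.ckpt-0.index"]
downloaded = []
seen = set()

sync_new_checkpoints(lambda: remote, downloaded.append, seen)
remote.append("checkpoints/model.ckpt-100.index")  # training writes a new checkpoint
sync_new_checkpoints(lambda: remote, downloaded.append, seen)
print(downloaded)
```

Each call copies only what appeared since the previous call, so repeated polling never downloads the same checkpoint twice.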

When the **```fit```** method starts the training, it logs the port that **TensorBoard** is using to display the metrics. The default port is **6006**, but another port may be chosen if 6006 is not available.
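One common way to pick a port based on availability is to try binding candidate ports in order, starting from the default. The sketch below shows that approach using only the standard library. It is an assumption about how such a search could work, not a description of how the SDK selects the port; the function name `first_free_port` and the limit of 100 attempts are illustrative.

```python
import socket


def first_free_port(start=6006, attempts=100, host="127.0.0.1"):
    """Return the first port in [start, start + attempts) that can be bound."""
    for port in range(start, start + attempts):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                continue  # port is in use, try the next one
            return port
    raise RuntimeError("no free port found")


print(first_free_port())
```

The port logged by `fit` remains the authoritative value; this check only shows why it may differ from 6006.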

**TensorBoard** may take a few minutes to start displaying metrics, depending on how long the training job containers take to start.

You can access **TensorBoard** locally at [http://localhost:6006](http://localhost:6006) or through your SageMaker workspace at [proxy/6006](/proxy/6006). If the logged port is not 6006, use that port in the URL instead.

# Deploy the trained model to prepare for predictions

The `deploy()` method creates an endpoint that serves prediction requests in real time.

In [None]:
predictor = estimator.deploy(initial_instance_count=1, instance_type='ml.c4.xlarge')

# Deleting the endpoint

In [None]:
estimator.delete_endpoint()