<a href="https://www.nvidia.com/dli"> <img src="images/DLI Header.png" alt="Header" style="width: 400px;"/> </a>

# Deploying Others' Models

So far, you have learned to solve problems with deep learning by loading pre-organized data, choosing and training a network, and deploying the results into an application. As we begin our discussion of *performance*, we're going to start at the other end of the spectrum.

Recall that deploying a trained model consists of:


1.   Creating a "Classifier" object from the network's architecture and weights
2.   Transforming your data into the input the model expects, and transforming the model's output into something useful
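
The second half of step 2, turning the model's output into something useful, can be sketched as follows. This is only an illustration: the function name `top_k`, the example probabilities, and the labels are hypothetical and do not come from the course code.

```python
def top_k(probabilities, labels, k=5):
    """Return the k most likely (label, probability) pairs, best first."""
    if len(probabilities) != len(labels):
        raise ValueError("need exactly one label per probability")
    # Pair each label with its probability, then sort by probability, highest first.
    ranked = sorted(zip(labels, probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical output of a three-class classifier (not real model output).
example_probs = [0.10, 0.85, 0.05]
example_labels = ["cat", "dog", "bird"]
print(top_k(example_probs, example_labels, k=2))
```

A real ImageNet model returns 1000 probabilities, so the same function would be called with a list of 1000 class names.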

In this section, you'll learn to deploy other people's networks so that you can get the performance gains of their research, compute time, and data curation.

Recall that the specific deep learning workflow this course started with is *image classification.* One reason we start with this task is that it is one of the most thoroughly studied problems in deep learning. It has benefited from the research community refining solutions to a competition called ["ImageNet."](https://qz.com/1034972/the-data-that-changed-the-direction-of-ai-research-and-possibly-the-world/)

"ImageNet" is a large dataset with 1000 classes of common images. The competition granted awards to the research teams with the lowest classification error on this dataset. The network we have been working with, AlexNet, won the ImageNet competition in 2012. Teams from Google and Microsoft have been among the winners since then.

Here's the exciting part. Not only can we use their network architecture, we can even use their trained weights, which they obtained by working the four levers introduced earlier: data, hyperparameters, training time, and network architecture. Without any training or data collection, we can *deploy* award-winning neural networks.

All we need to deploy one of these models are the model's architecture and weights. A quick Google search for "pretrained model alexnet imagenet caffe" returns multiple pages where this model can be downloaded.

We'll download both files with a tool called wget. Wget downloads data from the web directly to the server you're working on, without pulling it to your local machine first.

In [2]:
!wget http://dl.caffe.berkeleyvision.org/bvlc_alexnet.caffemodel

--2018-03-05 23:23:54--  http://dl.caffe.berkeleyvision.org/bvlc_alexnet.caffemodel
Resolving dl.caffe.berkeleyvision.org (dl.caffe.berkeleyvision.org)... 169.229.222.251
Connecting to dl.caffe.berkeleyvision.org (dl.caffe.berkeleyvision.org)|169.229.222.251|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 243862414 (233M) [application/octet-stream]
Saving to: 'bvlc_alexnet.caffemodel.1'


2018-03-05 23:24:04 (23.7 MB/s) - 'bvlc_alexnet.caffemodel.1' saved [243862414/243862414]



In [14]:
!wget https://raw.githubusercontent.com/BVLC/caffe/master/models/bvlc_alexnet/deploy.prototxt

--2018-03-05 23:28:18--  https://raw.githubusercontent.com/BVLC/caffe/master/models/bvlc_alexnet/deploy.prototxt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.32.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.32.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3629 (3.5K) [text/plain]
Saving to: 'deploy.prototxt.2'


2018-03-05 23:28:18 (79.1 MB/s) - 'deploy.prototxt.2' saved [3629/3629]
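
Note that the logs above report saving to `bvlc_alexnet.caffemodel.1` and `deploy.prototxt.2`. Wget appends a numeric suffix when a file of the same name already exists, so these names indicate earlier copies were already in the directory. Passing `-nc` (`--no-clobber`) to wget skips the download in that case. The same idea can be sketched in Python; the helper name `fetch_once` is hypothetical and not part of the course code.

```python
import os
import urllib.request


def fetch_once(url, dest):
    """Download url to dest unless dest already exists.

    Returns True if a download happened, False if the existing file was kept.
    """
    if os.path.exists(dest):
        return False
    urllib.request.urlretrieve(url, dest)
    return True
```

Calling `fetch_once` twice with the same destination downloads the file only the first time, which avoids both the wasted transfer and the renamed duplicate.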



Those are the same two files that DIGITS generated when we trained a model from scratch. The only other file we took from DIGITS was the mean image that was used during training. We can download that below.

In [4]:
!wget https://github.com/BVLC/caffe/blob/master/python/caffe/imagenet/ilsvrc_2012_mean.npy

--2018-03-05 23:24:11--  https://github.com/BVLC/caffe/blob/master/python/caffe/imagenet/ilsvrc_2012_mean.npy
Resolving github.com (github.com)... 192.30.253.112, 192.30.253.113
Connecting to github.com (github.com)|192.30.253.112|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: 'ilsvrc_2012_mean.npy.1'

    [ <=>                                   ] 46,923      --.-K/s   in 0.004s  

2018-03-05 23:24:11 (11.5 MB/s) - 'ilsvrc_2012_mean.npy.1' saved [46923]
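
The log above reports `[text/html]`, which means this download is the GitHub web page for the file rather than the NumPy array itself. It was saved as `ilsvrc_2012_mean.npy.1`, while the cells further down load `./ilsvrc_2012_mean.npy`, so the earlier copy is the one actually used. A minimal sketch of how to check a downloaded file, and how to build a direct-download URL, is shown below. The helper names are hypothetical, and the URL rewrite assumes GitHub's usual `/blob/` to `/raw/` layout.

```python
# Every .npy file begins with these six bytes.
NPY_MAGIC = b"\x93NUMPY"


def looks_like_npy(path):
    """Return True if the file at path starts with the .npy magic string."""
    with open(path, "rb") as handle:
        return handle.read(len(NPY_MAGIC)) == NPY_MAGIC


def github_raw_url(blob_url):
    """Rewrite a github.com '/blob/' page URL into the matching '/raw/' URL.

    Assumption: the repository follows GitHub's standard URL layout.
    """
    return blob_url.replace("/blob/", "/raw/", 1)


print(github_raw_url(
    "https://github.com/BVLC/caffe/blob/master/python/caffe/imagenet/ilsvrc_2012_mean.npy"
))
```

An HTML page saved under a `.npy` name fails the `looks_like_npy` check, which is a quicker diagnosis than a confusing error from `numpy.load` later on.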



In [1]:
!pwd

/dli/tasks/task5/task


You now have the three files we need to repeat the workflow you learned in the previous section. This time the code lives in a standalone Python script rather than in notebook cells, but take note that the steps are the same.

Examine the code here and compare it to what was written in the previous notebook:

[deploying_imagenet_model.py](/dli/tasks/task5/task/deploying_imagenet_model.py)

In [15]:
arch = './deploy.prototxt.2'
weights = './bvlc_alexnet.caffemodel'
img = '/dli/tasks/task4/task/images/LouieReady.png'
mean = './ilsvrc_2012_mean.npy'

In [16]:
!cat ./deploy.prototxt.2

name: "AlexNet"
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 10 dim: 3 dim: 227 dim: 227 } }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "norm1"
  type: "LRN"
  bottom: "conv1"
  top: "norm1"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "norm1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolutio
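
The `input_param` line near the top of the output above tells us what the network expects as input. Caffe orders blob dimensions as batch size, channels, height, width, so this network takes batches of 10 three-channel images of 227 x 227 pixels. As a rough sketch, the shape can be pulled out of the file with a regular expression; the function name `input_shape` is hypothetical, and a real project would more likely read the shape from the loaded network.

```python
import re


def input_shape(prototxt_text):
    """Return the dims listed in the first `input_param { shape ... }` entry."""
    match = re.search(r"input_param\s*\{\s*shape:?\s*\{([^}]*)\}", prototxt_text)
    if match is None:
        raise ValueError("no input_param shape found")
    return [int(dim) for dim in re.findall(r"dim:\s*(\d+)", match.group(1))]


# The input line copied from the deploy.prototxt output above.
snippet = "input_param { shape: { dim: 10 dim: 3 dim: 227 dim: 227 } }"
batch, channels, height, width = input_shape(snippet)
print(batch, channels, height, width)
```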

In [17]:
!python /dli/tasks/task5/task/deploying_imagenet_model.py $arch $weights $img $mean

libdc1394 error: Failed to initialize libdc1394
I0305 23:28:35.014855   411 net.cpp:52] Initializing net from parameters: 
name: "AlexNet"
state {
  phase: TEST
}
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param {
    shape {
      dim: 10
      dim: 3
      dim: 227
      dim: 227
    }
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "norm1"
  type: "LRN"
  bottom: "conv1"
  top: "norm1"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "norm1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "

I0305 23:28:35.146833   411 net.cpp:144] Setting up fc6
I0305 23:28:35.146912   411 net.cpp:151] Top shape: 10 4096 (40960)
I0305 23:28:35.146924   411 net.cpp:159] Memory required for data: 82333240
I0305 23:28:35.146948   411 layer_factory.hpp:77] Creating layer relu6
I0305 23:28:35.146970   411 net.cpp:94] Creating Layer relu6
I0305 23:28:35.146983   411 net.cpp:435] relu6 <- fc6
I0305 23:28:35.146998   411 net.cpp:396] relu6 -> fc6 (in-place)
I0305 23:28:35.147022   411 net.cpp:144] Setting up relu6
I0305 23:28:35.147035   411 net.cpp:151] Top shape: 10 4096 (40960)
I0305 23:28:35.147045   411 net.cpp:159] Memory required for data: 82497080
I0305 23:28:35.147056   411 layer_factory.hpp:77] Creating layer drop6
I0305 23:28:35.147078   411 net.cpp:94] Creating Layer drop6
I0305 23:28:35.147089   411 net.cpp:435] drop6 <- fc6
I0305 23:28:35.147106   411 net.cpp:396] drop6 -> fc6 (in-place)
I0305 23:28:35.147147   411 net.cpp:144] Setting up drop6
I0305 23:28:35.147162   411 net.cpp:15
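
The `Top shape` and `Memory required for data` lines in the log above can be checked by hand. The running total grows from 82333240 to 82497080 bytes when `relu6` is set up, and that difference matches a 10 x 4096 blob of 4-byte values. The sketch below reproduces the arithmetic, assuming single-precision (4-byte) floats, which is consistent with the logged numbers.

```python
BYTES_PER_FLOAT32 = 4  # assumption: blobs hold single-precision floats


def blob_bytes(shape):
    """Number of bytes needed for a blob with the given shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count * BYTES_PER_FLOAT32


# Values copied from the log above.
total_after_fc6 = 82333240
total_after_relu6 = 82497080
relu6_top_shape = (10, 4096)

print(blob_bytes(relu6_top_shape))          # bytes for one 10 x 4096 blob
print(total_after_relu6 - total_after_fc6)  # increase reported by the log
```

Both lines print the same value, which shows how the batch size of 10 in the input layer scales the memory each layer's output needs.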

Notice how we've created a general program for classification where the variables are the architecture, weights, and mean image. We can now classify images by finding these files for any model that anyone else has trained. Go ahead and see what else you can find to deploy.
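
To make the idea of a general classification program concrete, here is a minimal sketch of a script that takes the same four arguments. It is not the contents of `deploying_imagenet_model.py`. The function names are hypothetical, the preprocessing follows the usual Caffe Python interface, and the output blob name `prob` is an assumption that holds for the BVLC AlexNet definition but may differ for other models.

```python
import argparse


def parse_args(argv=None):
    """Read the four inputs that define a deployment."""
    parser = argparse.ArgumentParser(
        description="Classify one image with a pretrained Caffe model.")
    parser.add_argument("arch", help="network definition, e.g. deploy.prototxt")
    parser.add_argument("weights", help="trained weights, e.g. a .caffemodel file")
    parser.add_argument("image", help="image to classify")
    parser.add_argument("mean", help="mean image saved as a .npy file")
    return parser.parse_args(argv)


def classify(arch, weights, image, mean):
    """Return the index of the most likely class for one image."""
    # Imported here so the argument parsing can be used without Caffe installed.
    import numpy as np
    import caffe

    net = caffe.Net(arch, weights, caffe.TEST)
    # Reduce the mean image to one average value per color channel.
    mean_pixel = np.load(mean).mean(1).mean(1)

    transformer = caffe.io.Transformer({"data": net.blobs["data"].data.shape})
    transformer.set_transpose("data", (2, 0, 1))     # H x W x C -> C x H x W
    transformer.set_mean("data", mean_pixel)         # subtract the training mean
    transformer.set_raw_scale("data", 255)           # [0, 1] -> [0, 255]
    transformer.set_channel_swap("data", (2, 1, 0))  # RGB -> BGR

    net.blobs["data"].data[...] = transformer.preprocess(
        "data", caffe.io.load_image(image))
    # Assumption: the network's output blob is named "prob".
    probabilities = net.forward()["prob"][0]
    return int(probabilities.argmax())
```

Swapping in a different pretrained model then only means passing different file paths, provided its input and output blobs use the same names.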

Next, how can we use this to solve our challenge of a dog/cat classifier?