This chapter introduces how to put PyTorch models into production. Some of the content has runtime environment requirements that Google Colab cannot satisfy, so this notebook mainly introduces the ideas.

Flask is a framework for creating web services with Python. Let's build a neural network service based on Flask.

In [None]:
# check https://medium.com/@kshitijvijay271199/flask-on-google-colab-f6525986797b 
# and https://www.geeksforgeeks.org/how-to-run-flask-app-on-google-colab/ for 
# details about using Flask in Google Colab
!pip install flask-ngrok

Collecting flask-ngrok
  Downloading flask_ngrok-0.0.25-py3-none-any.whl (3.1 kB)
Installing collected packages: flask-ngrok
Successfully installed flask-ngrok-0.0.25


Defining the model:

In [None]:
from torchvision import models
import torch.nn as nn
from torchvision import transforms

CatFishClasses = ["cat", "fish"]

CatFishModel = models.resnet50()
CatFishModel.fc = nn.Sequential(
    nn.Linear(CatFishModel.fc.in_features, 500),
    nn.ReLU(),
    nn.Dropout(),
    nn.Linear(500, 2)
)

img_transforms = transforms.Compose([
    transforms.Resize((64, 64)), # resize image
    transforms.ToTensor(), # store image data in tensor
    transforms.Normalize(mean = [0.485, 0.456, 0.406], std = [0.229, 0.224, 0.225])
    # the above normalization follows distribution of ImageNet dataset
    ])
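
To make the normalization step concrete, here is a minimal pure-Python sketch of the per-channel arithmetic that `transforms.Normalize` applies, namely `(value - mean) / std`. The helper name `normalize_pixel` and the sample pixel are made up for illustration; the real transform works on whole tensors rather than single pixels.

```python
# ImageNet channel statistics, copied from the transform above
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]

def normalize_pixel(rgb):
    """Apply (value - mean) / std to one RGB pixel with values in [0, 1]."""
    return [(v - m) / s for v, m, s in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)]

# hypothetical mid-gray pixel, already scaled to [0, 1] as ToTensor would do
gray = normalize_pixel([0.5, 0.5, 0.5])
print([round(c, 4) for c in gray])
```

A pixel equal to the channel means maps to zero in every channel, which is the point of centering the data this way.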


Now start the Flask service:

In [None]:
from flask import Flask, jsonify, request
from flask_ngrok import run_with_ngrok
from PIL import Image
import torch.nn.functional as F
import urllib.request
import torch

app = Flask(__name__)
run_with_ngrok(app)
@app.route("/")
def status():
    return jsonify({"status" : "hahaha"})

@app.route("/predict", methods = ['GET', 'POST'])
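def predict():
    img_url = request.args.get("image_url")
    # download the image to a temporary local file
    urllib.request.urlretrieve(img_url, "temp_img")
    # convert to RGB so grayscale or RGBA images match the 3-channel Normalize
    img = Image.open("temp_img").convert("RGB")
    img = img_transforms(img)
    img = torch.unsqueeze(img, 0) # add the batch dimension
    CatFishModel.eval() # disable Dropout for inference
    with torch.no_grad(): # gradients are not needed for prediction
        prediction = F.softmax(CatFishModel(img), dim = 1)
    predicted_class = CatFishClasses[prediction.argmax().item()]
    return jsonify({"image" : img_url, "prediction" : predicted_class})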

# check the ngrok web link in the output, and you should see the status returned
app.run()

 * Serving Flask app "__main__" (lazy loading)
 * Environment: production
   Use a production WSGI server instead.
 * Debug mode: off


 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)


 * Running on http://8856a1622fed.ngrok.io
 * Traffic stats available on http://127.0.0.1:4040
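
When ngrok is not available, the routes can still be exercised in-process with Flask's built-in test client. The sketch below is a stand-in rather than the service above: `demo_app` and `fake_predict` are hypothetical names, and the model call is replaced by a stub so that the example runs without downloading an image.

```python
from flask import Flask, jsonify, request

demo_app = Flask(__name__)

def fake_predict(img_url):
    # hypothetical stub standing in for the CatFishModel forward pass
    return "cat"

@demo_app.route("/predict", methods=["GET"])
def predict():
    img_url = request.args.get("image_url")
    if img_url is None:
        # report a client error instead of failing inside the handler
        return jsonify({"error": "missing image_url"}), 400
    return jsonify({"image": img_url, "prediction": fake_predict(img_url)})

# the test client sends requests without starting a server
client = demo_app.test_client()
ok = client.get("/predict", query_string={"image_url": "http://example.com/cat.jpg"})
bad = client.get("/predict")
print(ok.status_code, ok.get_json())
print(bad.status_code, bad.get_json())
```

Checking for a missing `image_url` is a design choice of this sketch; the original handler assumes the parameter is always present.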


Install Postman and send a GET request to the service with the URL of an image, such as http://farm2.static.flickr.com/1245/1259825348_6a2aa94e8d.jpg.
For example, if the Flask service is running at http://f1e9ea4c7c0a.ngrok.io,
then in Postman, send a GET request to http://f1e9ea4c7c0a.ngrok.io/predict with a query parameter whose
key is "image_url" and whose value is "http://farm2.static.flickr.com/1245/1259825348_6a2aa94e8d.jpg".
Note that CatFishModel as defined above has randomly initialized weights, so you need to train or fine-tune it to get meaningful predictions.
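
The same request can be built without Postman. This is a minimal sketch using only the standard library; `build_predict_url` is a hypothetical helper, and the ngrok address is the example one from the text, so yours will differ.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_predict_url(service_url, image_url):
    """Return the /predict URL with image_url passed as a query parameter."""
    query = urlencode({"image_url": image_url})
    return service_url.rstrip("/") + "/predict?" + query

request_url = build_predict_url(
    "http://f1e9ea4c7c0a.ngrok.io",  # example address from the text
    "http://farm2.static.flickr.com/1245/1259825348_6a2aa94e8d.jpg",
)
print(request_url)

# To actually send the request while the service is running:
# import json, urllib.request
# with urllib.request.urlopen(request_url) as response:
#     print(json.load(response))
```

`urlencode` percent-encodes the image URL so that its own `:` and `/` characters cannot be confused with the structure of the service URL.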

Some of the content in this chapter covers Docker, cloud services and Kubernetes, which are either not well supported on Google Colab or require a paid cloud service. See https://stackoverflow.com/a/61275992 and https://colab.research.google.com/github/tensorflow/federated/blob/master/docs/tutorials/high_performance_simulation_with_kubernetes.ipynb for more information. We focus on the PyTorch material here and skip the cloud deployment part.

Next we are going to learn TorchScript. It is a model representation provided by PyTorch that can be saved and run outside Python, with the advantages of performance optimization, stability and flexibility.

In [None]:
import torch
from torchvision import models

my_model = models.AlexNet()
traced_model = torch.jit.trace(my_model, torch.rand(1, 3, 224, 224))

  return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
With rtol=1e-05 and atol=1e-05, found 999 element(s) (out of 1000) whose difference(s) exceeded the margin of error (including 0 nan comparisons). The greatest difference was 0.012500333599746227 (0.0003694472834467888 vs. 0.012869780883193016), which occurred at index (0, 406).
  _module_class,


In [None]:
from google.colab import drive
drive.mount('/content/drive')
# make a directory named libtorch_test under ./drive/MyDrive,
# which will be used later

Mounted at /content/drive


In [None]:
# call my_model.eval() before tracing: the model uses Dropout, which is random
# in training mode, so the tracer's sanity check sees differing outputs and warns.
# In eval mode Dropout is disabled and no warning is printed
import torch
from torchvision import models

my_model = models.AlexNet()
my_model.eval()
traced_model = torch.jit.trace(my_model, torch.rand(1, 3, 224, 224))
print("the traced model")
print(traced_model)
print("the traced model's code")
print(traced_model.code)

# the traced model, together with its parameters, can be saved by JIT too
# note that we pass an absolute path here,
# because this notebook switches working directory later
torch.jit.save(traced_model, "/content/drive/MyDrive/libtorch_test/traced_model")

the traced model
AlexNet(
  original_name=AlexNet
  (features): Sequential(
    original_name=Sequential
    (0): Conv2d(original_name=Conv2d)
    (1): ReLU(original_name=ReLU)
    (2): MaxPool2d(original_name=MaxPool2d)
    (3): Conv2d(original_name=Conv2d)
    (4): ReLU(original_name=ReLU)
    (5): MaxPool2d(original_name=MaxPool2d)
    (6): Conv2d(original_name=Conv2d)
    (7): ReLU(original_name=ReLU)
    (8): Conv2d(original_name=Conv2d)
    (9): ReLU(original_name=ReLU)
    (10): Conv2d(original_name=Conv2d)
    (11): ReLU(original_name=ReLU)
    (12): MaxPool2d(original_name=MaxPool2d)
  )
  (avgpool): AdaptiveAvgPool2d(original_name=AdaptiveAvgPool2d)
  (classifier): Sequential(
    original_name=Sequential
    (0): Dropout(original_name=Dropout)
    (1): Linear(original_name=Linear)
    (2): ReLU(original_name=ReLU)
    (3): Dropout(original_name=Dropout)
    (4): Linear(original_name=Linear)
    (5): ReLU(original_name=ReLU)
    (6): Linear(original_name=Linear)
  )
)
the tra
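
A saved TorchScript model can be read back with `torch.jit.load`, and it is worth confirming that the round trip preserves the outputs. The sketch below assumes a tiny stand-in network (`small_model`, a hypothetical name) and an in-memory buffer instead of the Google Drive path, so that it runs anywhere.

```python
import io

import torch
import torch.nn as nn

# hypothetical small network used in place of AlexNet to keep the example fast
small_model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
small_model.eval()

example_input = torch.rand(1, 4)
traced = torch.jit.trace(small_model, example_input)

# torch.jit.save and torch.jit.load accept file-like objects as well as paths
buffer = io.BytesIO()
torch.jit.save(traced, buffer)
buffer.seek(0)
reloaded = torch.jit.load(buffer)

with torch.no_grad():
    same = torch.allclose(traced(example_input), reloaded(example_input))
print("outputs match after reload:", same)
```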

The following shows examples of converting Python code to TorchScript:

In [None]:
import torch

def example(x, y):
    if x.min() > y.min():
        r = x
    else:
        r = y
    return r

print("Python example")
print(example(torch.rand(3), torch.rand(3)))

tensor([0.2596, 0.1074, 0.3777])


In [None]:
import torch

@torch.jit.script
def example(x, y):
    if x.min() > y.min():
        r = x
    else:
        r = y
    return r

print("TorchScript example")
print(example(torch.rand(3), torch.rand(3)))

TorchScript example
tensor([0.9065, 0.3825, 0.6518])
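
Scripting matters because tracing only records the operations executed for the example inputs, so a data-dependent `if` is frozen into whichever branch ran during tracing. The following sketch illustrates the difference; the function `branchy_diff` and the sample tensors are made up for illustration.

```python
import warnings

import torch

def branchy_diff(x, y):
    # which subtraction runs depends on the data, not only on the shapes
    if x.sum() > y.sum():
        r = x - y
    else:
        r = y - x
    return r

a = torch.tensor([2.0, 3.0])
b = torch.tensor([0.0, 1.0])

# tracing with (a, b) takes the first branch and records only "x - y";
# the TracerWarning about the tensor-to-bool conversion is silenced here
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    traced = torch.jit.trace(branchy_diff, (a, b))

# scripting compiles the source, so both branches are kept
scripted = torch.jit.script(branchy_diff)

# swap the arguments so that the second branch should run
traced_out = traced(b, a)
scripted_out = scripted(b, a)
print("traced:  ", traced_out)
print("scripted:", scripted_out)
```

With the arguments swapped, the traced function still computes `x - y`, while the scripted one follows the `else` branch as the Python function would.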


The following is an example of defining a model in TorchScript.

In [None]:
# the main difference is that the class should 
# inherit from torch.jit.ScriptModule,
# instead of nn.Module
class FeaturesCNNNet(torch.jit.ScriptModule):
    def __init__(self, num_classes = 2):
        super(FeaturesCNNNet, self).__init__()
        self.features = torch.jit.trace(
            nn.Sequential(
                nn.Conv2d(3, 64, kernel_size = 11, stride = 4, padding = 2),
                nn.ReLU(),
                nn.MaxPool2d(kernel_size = 3, stride = 2),
                nn.Conv2d(64, 192, kernel_size = 5, padding = 2),
                nn.ReLU(),
                nn.MaxPool2d(kernel_size = 3, stride = 2),
                nn.Conv2d(192, 384, kernel_size= 3, padding = 1),
                nn.ReLU(),
                nn.Conv2d(384, 256, kernel_size = 3, padding = 1),
                nn.ReLU(),
                nn.Conv2d(256, 256, kernel_size = 3, padding = 1),
                nn.ReLU(),
                nn.MaxPool2d(kernel_size = 3, stride = 2)
            ),
            torch.rand(1, 3, 224, 224)
        )
    @torch.jit.script_method
    def forward(self, x):
        x = self.features(x)
        return x

my_model = FeaturesCNNNet()


In the following we are going to demonstrate some limitations of TorchScript.

In [None]:
# returning different types in one function is not supported in TorchScript
# check the error message when running

@torch.jit.script
def str_or_int(x):
    if (x > 3):
        return "haha"
    else:
        return 0

RuntimeError: ignored
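
One way around this restriction is to give every branch the same return type. A minimal sketch, assuming a string result is acceptable to the caller; the function name `str_only` is made up here.

```python
import torch

@torch.jit.script
def str_only(x: int) -> str:
    # both branches return str, so TorchScript sees a single return type
    if x > 3:
        return "haha"
    else:
        return "0"

print(str_only(5), str_only(1))
```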

In [None]:
# this is to show by default TorchScript functions
# operate on tensors
@torch.jit.script
def add_int(x, y):
    return x + y
print(add_int.code)

def add_int(x: Tensor,
    y: Tensor) -> Tensor:
  return torch.add(x, y)



In [None]:
# add type annotations to pass the arguments as int instead of Tensor
@torch.jit.script
def add_int_v2(x : int, y : int) -> int:
    return x + y
print(add_int_v2.code)

def add_int_v2(x: int,
    y: int) -> int:
  return torch.add(x, y)



In [None]:
# also, you can't declare a new member variable in a class method
# other than __init__
import torch

# note the following code won't work in Google Colab, even if 
# you declare self.y in __init__
# the reason can be found in https://github.com/pytorch/pytorch/issues/28258
# basically it's a notebook limitation.
@torch.jit.script
class class_example:

    def __init__(self, x):
        self.x = x

    def set_y(self, y):
        self.y = y


TypeError: ignored

Now let's run PyTorch in C++. First, set up the dependencies:

In [None]:
!apt install cmake g++
# see https://stackoverflow.com/a/57212513 for the difference between % and ! in Google Colab
%cd /content/drive/MyDrive/libtorch_test
!wget https://download.pytorch.org/libtorch/cpu/libtorch-cxx11-abi-shared-with-deps-1.9.0%2Bcpu.zip
!unzip libtorch-cxx11-abi-shared-with-deps-1.9.0+cpu.zip

In ./drive/MyDrive/libtorch_test, put the following into a CMakeLists.txt.

In [None]:
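cmake_minimum_required(VERSION 3.0 FATAL_ERROR)
project(libtorch_test)

find_package(Torch REQUIRED)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${TORCH_CXX_FLAGS}")

add_executable(hello_world hello_world.cpp)
target_link_libraries(hello_world "${TORCH_LIBRARIES}")
set_property(TARGET hello_world PROPERTY CXX_STANDARD 14)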

In ./drive/MyDrive/libtorch_test, put the following into hello_world.cpp.

In [None]:
#include <iostream>
#include <torch/torch.h>

int main()
{
    torch::Tensor tensor = torch::ones({2, 2});
    std::cout << tensor << std::endl;
}

Then compile and run the project.

In [17]:
%cd /content/drive/MyDrive/libtorch_test
!rm -rf build
!mkdir build
%cd build
!cmake -DCMAKE_PREFIX_PATH=libtorch ..
!make
!./hello_world

/content/drive/MyDrive/libtorch_test
/content/drive/MyDrive/libtorch_test/build
-- The C compiler identification is GNU 7.5.0
-- The CXX compiler identification is GNU 7.5.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found T

Next we are going to run a real model with LibTorch.

Add the following to the CMakeLists.txt created above.

In [None]:
add_executable(load_model load_model.cpp)
target_link_libraries(load_model "${TORCH_LIBRARIES}")

Create a new source file load_model.cpp with the following content.

In [None]:
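#include <torch/script.h>
#include <iostream>
#include <memory>
#include <vector>

int main()
{
    // load the model saved earlier with torch.jit.save;
    // the path is relative to the build directory
    torch::jit::Module module =
        torch::jit::load("../traced_model");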
    
    std::cout << "model loaded" << std::endl;
    std::vector<torch::jit::IValue> inputs;
    inputs.push_back(torch::rand({1, 3, 224, 224}));
    torch::Tensor output = module.forward(inputs).toTensor();
    std::cout << output << std::endl;
}

Compile and run the executable.

In [20]:
%cd /content/drive/MyDrive/libtorch_test
!rm -rf build
!mkdir build
%cd build
!cmake -DCMAKE_PREFIX_PATH=libtorch ..
!make
!./load_model

/content/drive/MyDrive/libtorch_test
/content/drive/MyDrive/libtorch_test/build
-- The C compiler identification is GNU 7.5.0
-- The CXX compiler identification is GNU 7.5.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found T