<center> 
    <a href="https://discord.com/invite/RbeQMu886J" target="_blank" style="text-decoration: none;"> Join the community </a> |
    <a href="https://nebuly.gitbook.io/nebuly/welcome/questions-and-contributions" target="_blank" style="text-decoration: none;"> Contribute to the library </a>
</center>

<center> 
    <a href="https://github.com/nebuly-ai/nebullvm#how-nebullvm-works" target="_blank" style="text-decoration: none;"> How nebullvm works </a> •
    <a href="https://github.com/nebuly-ai/nebullvm#installation" target="_blank" style="text-decoration: none;"> Installation </a> •
    <a href="https://github.com/nebuly-ai/nebullvm#get-started" target="_blank" style="text-decoration: none;"> Get Started </a> •
    <a href="https://github.com/nebuly-ai/nebullvm#benchmarks" target="_blank" style="text-decoration: none;"> Benchmarks </a>
</center>

# Accelerate PyTorch ResNet50 with nebullvm
Hi and welcome 👋

In this notebook we will see how, in just a few steps, you can speed up deep learning model inference using the open-source library `nebullvm`.

We will:
1. Install nebullvm and the deep learning compilers used by the library.
2. Speed up a PyTorch ResNet50 without any loss of accuracy.
3. Achieve a greater speedup on the same model by applying more aggressive optimization techniques (e.g. pruning, quantization), while sacrificing at most 5% accuracy.

Let's jump to the code.

# Installation

In [None]:
!pip install nebullvm

This is an optional step. Run it if you want to contribute to the continuous improvement of `nebullvm` by sharing the performance you achieve with it. You can find full details in the [docs](https://nebuly.gitbook.io/nebuly/nebullvm/how-nebullvm-works/fostering-continuous-improvement#sharing-feedback-to-improve-nebullvm).

In [None]:
import json
from pathlib import Path

# Opt in to feedback collection
json_feedback = {
    "allow_feedback_collection": True
}

# Write the setting to ~/.nebullvm/collect.json
(Path.home() / ".nebullvm").mkdir(exist_ok=True)
with open(Path.home() / ".nebullvm/collect.json", "w") as f:
    json.dump(json_feedback, f)
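To confirm what was stored, the file can be read back. The snippet below is a minimal sketch: `read_feedback_flag` is a hypothetical helper (not part of nebullvm), and its optional `home` argument exists only so the check can be pointed at a different directory.

```python
import json
from pathlib import Path


def read_feedback_flag(home=None):
    """Return the stored flag, or None if the settings file does not exist."""
    base = Path(home) if home is not None else Path.home()
    config_path = base / ".nebullvm" / "collect.json"
    if not config_path.exists():
        return None
    with open(config_path) as f:
        return json.load(f).get("allow_feedback_collection")


# Prints True after the cell above has run, None if the file is absent
print(read_feedback_flag())
```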

Let's now import nebullvm. During the import, nebullvm installs any of the deep learning compilers it uses that are not yet installed on your machine.

The installation of the compilers may take a few minutes.
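If you want a rough idea of how much will need to be installed, you can check beforehand which compilers are already importable. This is only a sketch: the entries in `BACKENDS` are assumed import names for ONNX Runtime, OpenVINO, TensorRT, TVM and DeepSparse, and `installed_backends` is a hypothetical helper, not a nebullvm function.

```python
import importlib.util

# Assumed import names of the compilers; adjust if your setup differs.
BACKENDS = ["onnxruntime", "openvino", "tensorrt", "tvm", "deepsparse"]


def installed_backends(names=BACKENDS):
    """Map each module name to True if Python can locate it, else False."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


for name, found in installed_backends().items():
    print(f"{name:12s} {'found' if found else 'missing'}")
```

A module being importable does not guarantee that nebullvm considers the installation valid, so treat the result as a hint only.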

In [None]:
import nebullvm

  "No valid onnxruntime installation found. Trying to install it..."
  "No valid OpenVino installation has been found. "
  "No TensorRT valid installation has been found. "
  "Not found any valid tvm installation. "
  "No deepsparse installation found. Trying to install it..."


# Optimization example with PyTorch

In the following example we will optimize a standard ResNet50 loaded directly from torchvision.

Nebullvm can accelerate neural networks without any loss in a user-defined precision metric (e.g. accuracy), or it can achieve a greater speedup by applying more aggressive optimization techniques, such as pruning and quantization, that may have a negative impact on the selected metric. The maximum acceptable loss in the metric is set by the `metric_drop_ths` parameter. Read more in the [docs](https://nebuly.gitbook.io/nebuly/nebullvm/get-started).

Let's first test the optimization without accuracy loss (`metric_drop_ths=0`, the default value), and then accelerate the model further under the constraint of losing up to 5% accuracy (`metric="accuracy"`, `metric_drop_ths=0.05`).
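To make the role of the threshold concrete, here is a small sketch of how an accuracy drop can be checked against `metric_drop_ths`. It is not nebullvm's implementation: `accuracy` and `within_drop_threshold` are hypothetical helpers, and measuring the drop as an absolute difference is an assumption made for illustration.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def within_drop_threshold(baseline_acc, optimized_acc, metric_drop_ths=0.0):
    """True if the accuracy lost by the optimized model is within the threshold.

    Assumption: the drop is measured as an absolute difference.
    """
    drop = baseline_acc - optimized_acc
    return drop <= metric_drop_ths + 1e-12  # small tolerance for float rounding


# Toy data: 20 labels, the "optimized" predictions get exactly one wrong.
labels = [i % 2 for i in range(20)]
baseline_preds = list(labels)
optimized_preds = list(labels)
optimized_preds[0] = 1 - optimized_preds[0]

base = accuracy(baseline_preds, labels)   # 1.0
opt = accuracy(optimized_preds, labels)   # 0.95
print(within_drop_threshold(base, opt, metric_drop_ths=0.0))   # False
print(within_drop_threshold(base, opt, metric_drop_ths=0.05))  # True
```

With a threshold of 0 the lossy candidate is rejected, while a threshold of 0.05 accepts it. That is the trade-off the two scenarios below explore.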

## Scenario 1 - No accuracy drop

First we load the model and optimize it using the nebullvm API:

In [None]:
import torch
import torchvision.models as models
from nebullvm.api.functions import optimize_model

# Load a ResNet50 as an example
model = models.resnet50()

# Provide input data for the model
input_data = [((torch.randn(1, 3, 256, 256), ), 0)]

# Run nebullvm optimization
optimized_model = optimize_model(
    model, input_data=input_data, optimization_time="unconstrained"
)

# Try the optimized model
x = torch.randn(1, 3, 256, 256)
res = optimized_model(x)

After the optimization step, we can compare the optimized model with the baseline to measure the speed improvement.

In [None]:
import time
import numpy as np

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

def benchmark(model, input_shape=(1, 3, 256, 256), nwarmup=50, nruns=1000):
    input_data = torch.randn(input_shape)
    input_data = input_data.to(device)
    # torch.cuda.synchronize() is only valid when running on CUDA
    use_cuda = device.type == "cuda"

    print("Warm up ...")
    with torch.no_grad():
        for _ in range(nwarmup):
            features = model(input_data)
    if use_cuda:
        torch.cuda.synchronize()
    print("Start timing ...")
    timings = []
    with torch.no_grad():
        for i in range(1, nruns + 1):
            # perf_counter is a monotonic, high-resolution clock
            start_time = time.perf_counter()
            features = model(input_data)
            if use_cuda:
                # Wait for the GPU to finish before stopping the clock
                torch.cuda.synchronize()
            end_time = time.perf_counter()
            timings.append(end_time - start_time)
            if i % 100 == 0:
                print('Iteration %d/%d, avg batch time %.2f ms' % (i, nruns, np.mean(timings) * 1000))

    if isinstance(features, tuple):
        features = features[0]

    print("Input shape:", input_data.size())
    print("Output features size:", features.size())
    print('Average throughput: %.2f images/second' % (input_shape[0] / np.mean(timings)))

In [None]:
# Set the model to eval mode and move it to the available device

model.eval()
model.to(device)

Here we compute the average throughput for the baseline model:

In [None]:
benchmark(model)

Here we compute the average throughput for the optimized model:



In [None]:
benchmark(optimized_model)
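The two runs are easier to compare as a single speedup figure. The sketch below assumes you have the per-iteration timings (in seconds) of both runs as lists, for example by returning `timings` from `benchmark`, which the function above does not do as written. `summarize` and `speedup` are hypothetical helpers and the numbers are made up for illustration.

```python
import statistics


def summarize(timings_s):
    """Return mean, median and 95th-percentile latency in milliseconds."""
    ordered = sorted(timings_s)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n), nearest-rank method
    return {
        "mean_ms": statistics.mean(ordered) * 1000,
        "median_ms": statistics.median(ordered) * 1000,
        "p95_ms": ordered[rank - 1] * 1000,
    }


def speedup(baseline_timings_s, optimized_timings_s):
    """Ratio of mean baseline latency to mean optimized latency."""
    return statistics.mean(baseline_timings_s) / statistics.mean(optimized_timings_s)


# Made-up timings in seconds, for illustration only.
baseline = [0.020, 0.022, 0.021, 0.019, 0.023]
optimized = [0.010, 0.011, 0.0105, 0.0095, 0.0115]

print(summarize(baseline))
print(f"speedup: {speedup(baseline, optimized):.2f}x")
```

Reporting the median and a high percentile alongside the mean helps spot runs where a few slow iterations distort the average.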

## Scenario 2 - Accuracy drop

In this scenario, we set the maximum allowed accuracy drop to 5%.

In [None]:
# Already done in Scenario 1; uncomment to run this cell on its own
# import torch
# import torchvision.models as models
# from nebullvm.api.functions import optimize_model

# Load a ResNet50 as an example
# model = models.resnet50()

# Provide 100 random input samples for the model
input_data = [((torch.randn(1, 3, 256, 256), ), 0) for _ in range(100)]

# Run nebullvm optimization, allowing up to a 5% drop in accuracy
optimized_model = optimize_model(
    model,
    input_data=input_data,
    optimization_time="unconstrained",
    metric_drop_ths=0.05,
    metric="accuracy",
)

# Try the optimized model
x = torch.randn(1, 3, 256, 256)
res = optimized_model(x)
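Both scenarios pass `input_data` as a list of `((inputs,), label)` tuples filled with random tensors and a constant label. If you have labelled samples of your own, a helper along these lines could wrap them in the same structure. This is a sketch only: `to_input_data` is a hypothetical name, and the placeholder strings stand in for image tensors so the example runs without any dependencies.

```python
def to_input_data(samples):
    """Wrap (input, label) pairs as ((input,), label) tuples."""
    return [((x,), y) for x, y in samples]


# Placeholder strings stand in for image tensors.
samples = [("image_0", 3), ("image_1", 7)]
wrapped = to_input_data(samples)
print(wrapped[0])  # (('image_0',), 3)
```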

In [None]:
# The model is already in eval mode and on the available device (see Scenario 1)

# model.eval()
# model.to(device)

Here we compute the average throughput for the baseline model:

In [None]:
benchmark(model)

Here we compute the average throughput for the optimized model:

In [None]:
benchmark(optimized_model)
