PyTorch Profiler

Profile your PyTorch model with model-level, layer-level, and operator-level metrics.

Introduction

When deploying a model, identifying its bottleneck is crucial. Typically, we analyze the cost from the model level down to the operator level. This tutorial is a step-by-step guide to profiling your PyTorch models.

File Structure

.
├── README.md # main documentation
├── requirements.txt # dependencies
├── assets # temp files (images, logs, etc)
├── quickstart.ipynb # custom model profiling
├── resnet.ipynb # resnet50 profiling (TBD)
└── vit.ipynb # vision transformer profiling (TBD)

Popular Efficiency Metrics

image source: Basics of Neural Networks (MIT 6.5940, Fall 2023)

Memory-Related
- #Parameters: the number of parameters in the network
- Model Size: the storage required for the network's weights
- Peak #Activations: the largest number of intermediate outputs (activations) held in memory at any point
Computation-Related
- MAC: multiply-accumulate operations
- FLOP, FLOPS: floating-point operations, floating-point operations per second
Latency: the delay from input to output
Throughput: the amount of data processed per unit of time
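As a rough illustration of how the memory- and computation-related metrics are derived, the sketch below computes them by hand for a single 2D convolution. `Conv2dSpec` and `model_size_bytes` are hypothetical helpers written for this example, not part of PyTorch; the layer shape (3 input channels, 3 output channels, 3x3 kernel), the 32x32 output resolution, float32 weights, and the "1 MAC counts as 2 FLOPs" convention are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conv2dSpec:
    """Shape of a 2D convolution (hypothetical helper, not a PyTorch class)."""
    in_channels: int
    out_channels: int
    kernel_size: int
    bias: bool = True

    def num_params(self) -> int:
        # One weight per (output channel, input channel, kernel position),
        # plus one bias per output channel if enabled.
        weights = self.out_channels * self.in_channels * self.kernel_size ** 2
        return weights + (self.out_channels if self.bias else 0)

    def macs(self, out_h: int, out_w: int) -> int:
        # Every weight is used once per output position; bias adds are ignored.
        per_position = self.out_channels * self.in_channels * self.kernel_size ** 2
        return per_position * out_h * out_w

    def num_output_activations(self, out_h: int, out_w: int) -> int:
        return self.out_channels * out_h * out_w


def model_size_bytes(num_params: int, bytes_per_param: int = 4) -> int:
    """Weight storage; 4 bytes per parameter assumes float32."""
    return num_params * bytes_per_param


conv = Conv2dSpec(in_channels=3, out_channels=3, kernel_size=3)
params = conv.num_params()                       # 3*3*3*3 + 3 = 84
size = model_size_bytes(params)                  # 84 * 4 = 336 bytes
macs = conv.macs(out_h=32, out_w=32)             # 81 * 1024 = 82944
flops = 2 * macs                                 # assumed convention: 1 MAC = 2 FLOPs
activations = conv.num_output_activations(32, 32)  # 3 * 32 * 32 = 3072

print(f"params={params} size={size}B macs={macs} flops={flops} activations={activations}")
```

Hand calculations like this are mainly useful as a sanity check against the numbers a profiling tool reports for the same layer.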

Toolboxes

If you are not familiar with common profiling tools, please refer to the following tutorials:

pytorch-benchmark - model-level
Flops Profiler - layer-level
pytorch_memlab - layer-level
torch.fx - layer-level
PyTorch Profiler - operator-level

Installation

conda create -n pytorch_profiler python=3.9 -y
conda activate pytorch_profiler
pip install -r requirements.txt

Quickstart

Go through the quickstart notebook (quickstart.ipynb) to learn how to profile a custom model.

import torch.nn as nn


# custom model: four stacked 3x3 convolutions that preserve the input shape
class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 3, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(3, 3, kernel_size=3, padding=1)
        self.conv3 = nn.Conv2d(3, 3, kernel_size=3, padding=1)
        self.conv4 = nn.Conv2d(3, 3, kernel_size=3, padding=1)

    def forward(self, x1):
        x1 = self.conv1(x1)
        x1 = self.conv2(x1)
        x1 = self.conv3(x1)
        x1 = self.conv4(x1)
        return x1
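For a first model-level number, latency and throughput can be estimated with a plain timing loop before reaching for a dedicated tool. The sketch below is a minimal, standard-library-only illustration: `measure` and `stand_in_forward` are hypothetical names, and the stand-in workload takes the place of a real forward pass such as `model(x)` so that the example runs without PyTorch installed. The warm-up count, run count, and batch size are arbitrary assumptions.

```python
import statistics
import time
from typing import Callable, Dict


def measure(fn: Callable[[], object], batch_size: int,
            warmup: int = 3, runs: int = 20) -> Dict[str, float]:
    """Time fn() and report median latency and the implied throughput."""
    # Warm-up iterations are discarded so one-time setup costs do not skew the result.
    for _ in range(warmup):
        fn()

    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)

    latency = statistics.median(timings)  # seconds per call
    return {
        "latency_s": latency,
        "throughput_per_s": batch_size / latency,  # samples processed per second
    }


def stand_in_forward() -> int:
    # Placeholder workload; replace with a real forward pass on one batch.
    return sum(i * i for i in range(10_000))


result = measure(stand_in_forward, batch_size=8)
print(f"latency: {result['latency_s'] * 1e3:.3f} ms, "
      f"throughput: {result['throughput_per_s']:.1f} samples/s")
```

When timing a real PyTorch model on a GPU, the device should be synchronized (for example with `torch.cuda.synchronize()`) before each timer read, since CUDA kernels launch asynchronously; otherwise the loop measures only launch overhead.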

[Figures: profiling results at the model level, layer level, and operator level]

Examples

Go through the resnet notebook and the vit notebook to see the profiling results for ResNet50 and Vision Transformer.

Best Practices

Modify the quickstart notebook to profile your own model.

Other Resources

If you want a line-by-line analysis of non-PyTorch Python scripts, please refer to line_profiler and memory-profiler. Basic usage is as follows:

"""
test_profile.py
"""
from line_profiler import profile
# from memory_profiler import profile  # swap in for line-by-line memory usage


@profile
def test_profile(x: int) -> int:
    """Test function for profiling."""
    total = 0  # named "total" to avoid shadowing the built-in sum()
    for _ in range(x):
        total += 1

    return total


def main():
    test_profile(1000)


if __name__ == "__main__":
    main()

[Figures: line-by-line latency and memory profiling results]
