# Speed and Memory Benchmarking

Comparing language models only by their performance on a specific task or benchmark is no longer sufficient. We must also account for the computational cost of a model in a given environment (RAM, CPU, GPU, TPU), measured as memory usage and speed. The two main costs to measure are training and inference in production. Two classes in the Transformers library, PyTorchBenchmark and TensorFlowBenchmark, make it possible to benchmark models in PyTorch and TensorFlow, respectively.
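To make the two quantities concrete, here is a minimal sketch that measures them for an arbitrary Python callable using only the standard library. It is not how PyTorchBenchmark is implemented; `measure_speed`, `measure_peak_memory`, and `workload` are hypothetical names, and `tracemalloc` only sees Python heap allocations, not GPU memory.

```python
import timeit
import tracemalloc


def measure_speed(func, repeat=3, number=10):
    """Return the best average wall-clock time (seconds) of one call to func."""
    runtimes = timeit.repeat(func, repeat=repeat, number=number)
    # Take the fastest repetition to reduce noise from other processes
    return min(runtimes) / number


def measure_peak_memory(func):
    """Return the peak Python heap allocation (bytes) while func runs."""
    tracemalloc.start()
    try:
        func()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def workload():
    # Hypothetical stand-in for a model forward pass
    return [i * i for i in range(100_000)]


seconds = measure_speed(workload)
peak_bytes = measure_peak_memory(workload)
print(f"{seconds:.6f} s per call, peak {peak_bytes / 1024**2:.2f} MiB")
```

Taking the minimum over several repetitions is a common choice because slower runs usually reflect interference from the rest of the system rather than the code under test.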

In [None]:
!nvidia-smi

In [None]:
import torch

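# Report the total memory of the first CUDA device, if one is available
if torch.cuda.is_available():
    total_gib = torch.cuda.get_device_properties(0).total_memory / 1024**3
    print(f"The GPU total memory is {total_gib:.2f} GiB")
else:
    print("No CUDA device found")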

In [None]:
!pip install transformers
!pip install py3nvml==0.2.5

In [None]:
from transformers import PyTorchBenchmark, PyTorchBenchmarkArguments

models = ["distilbert-base-uncased", "distilroberta-base", "albert-base-v2"]
batch_sizes = [16]
sequence_lengths = [64, 128, 256, 512]

args = PyTorchBenchmarkArguments(
    models=models, batch_sizes=batch_sizes, sequence_lengths=sequence_lengths
)
benchmark = PyTorchBenchmark(args)

In [None]:
from transformers import PyTorchBenchmark, PyTorchBenchmarkArguments

models = [
    "bert-base-uncased",
    "distilbert-base-uncased",
    "distilroberta-base",
    "distilbert-base-german-cased",
]
batch_sizes = [4]
sequence_lengths = [32, 64, 128, 256, 512]
args = PyTorchBenchmarkArguments(
    models=models,
    batch_sizes=batch_sizes,
    sequence_lengths=sequence_lengths,
    multi_process=False,
)
benchmark = PyTorchBenchmark(args)

In [None]:
# This takes a while, depending on your CPU/GPU capacity and the models selected

In [None]:
results = benchmark.run()

In [None]:
import matplotlib.pyplot as plt

plt.figure(figsize=(8, 8))
t = sequence_lengths
models_perf = [
    list(results.time_inference_result[m]["result"][batch_sizes[0]].values())
    for m in models
]
plt.xlabel("Sequence Length")
plt.ylabel("Time in Seconds")
plt.title("Inference Speed Result")
# One line style per model, in the same order as `models`
styles = ["rs--", "g--.", "b--^", "c--o"]
for perf, style in zip(models_perf, styles):
    plt.plot(t, perf, style)
plt.legend(models)
plt.show()
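The plot shows absolute timings. To compare models against a baseline, the nested result dictionary can be reduced to speed ratios. The sketch below assumes the `result[model]["result"][batch_size][sequence_length]` shape used in the plotting cell; `relative_speed` is a hypothetical helper, and the timings in `example` are placeholders invented for illustration, not measurements.

```python
def relative_speed(time_result, baseline, models, batch_size, sequence_lengths):
    """Return {model: [baseline_time / model_time for each sequence length]}."""
    base = time_result[baseline]["result"][batch_size]
    ratios = {}
    for model in models:
        timings = time_result[model]["result"][batch_size]
        # A ratio above 1 means the model is faster than the baseline
        ratios[model] = [base[n] / timings[n] for n in sequence_lengths]
    return ratios


# Placeholder timings in seconds, invented for illustration only
example = {
    "model-a": {"result": {4: {32: 0.02, 64: 0.04}}},
    "model-b": {"result": {4: {32: 0.01, 64: 0.02}}},
}
ratios = relative_speed(example, "model-a", ["model-a", "model-b"], 4, [32, 64])
for model, values in ratios.items():
    print(model, [round(v, 2) for v in values])
```

With the real output, the call would be `relative_speed(results.time_inference_result, "bert-base-uncased", models, batch_sizes[0], sequence_lengths)`. If a run failed, its entry may not be numeric, so it is worth checking the values before dividing.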