# PyTorch: NumPy on Steroids

PyTorch, like NumPy, is a tensor library, but **with GPU support** for hardware acceleration. On top of this, PyTorch is a deep learning library.
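To make the NumPy parallel concrete, here is a minimal sketch that runs on the CPU, so it works without a GPU. It computes the same matrix product in both libraries and converts between them. The variable names are illustrative only.

```python
import numpy as np
import torch

# Small example matrix; float64 is NumPy's default floating-point dtype
a_np = np.arange(6, dtype=np.float64).reshape(2, 3)

# torch.from_numpy shares memory with the NumPy array and keeps its dtype
a_t = torch.from_numpy(a_np)

# The same matrix product, written in each library
prod_np = a_np @ a_np.T
prod_t = a_t @ a_t.T

# Convert the tensor back to NumPy to compare the two results
same = np.allclose(prod_np, prod_t.numpy())
print(same)
```

Because `torch.from_numpy` does not copy, an in-place change to `a_np` is also visible through `a_t`.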

Let's compare the running time of a simple operation in plain NumPy with that of its PyTorch counterpart on a GPU.

In [1]:
import numpy as np
import torch
import time

In [2]:
n = 10000
A = np.random.randn(n, n)

In [3]:
start = time.time()
A2 = np.matmul(A, A)
print('%.4f seconds' % (time.time() - start))

7.4536 seconds


In [4]:
A = torch.from_numpy(A).cuda() # `.cuda()` copies the tensor to the current GPU; the dtype (float64 here) is unchanged
start = time.time()
A2 = torch.mm(A, A)
print('%.4f seconds' % (time.time() - start))

0.2152 seconds


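In this simple example, the matrix multiplication takes 7.45 s with NumPy and 0.22 s with PyTorch on a Titan X GPU, a speedup of roughly 35x. Note that CUDA operations run asynchronously and the GPU timing above does not call `torch.cuda.synchronize()`, so the GPU figure may understate the actual compute time.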
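Because CUDA kernels are launched asynchronously, wall-clock timings are usually taken around `torch.cuda.synchronize()` calls. The following is a minimal sketch of that pattern, not the notebook's original measurement. The helper `timed_matmul` and the 512 x 512 size are assumptions chosen for illustration, and the code falls back to the CPU when no GPU is present.

```python
import time
import torch

def timed_matmul(x):
    """Return (result, seconds) for x @ x, waiting for the GPU if one is used."""
    if x.is_cuda:
        torch.cuda.synchronize()  # finish pending work before starting the clock
    start = time.perf_counter()
    result = torch.mm(x, x)
    if x.is_cuda:
        torch.cuda.synchronize()  # wait for the matmul kernel to complete
    return result, time.perf_counter() - start

# Fall back to the CPU so the sketch still runs without a GPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
x = torch.randn(512, 512, device=device)  # hypothetical size, far smaller than n = 10000

timed_matmul(x)  # warm-up call: the first CUDA call includes one-time setup cost
result, seconds = timed_matmul(x)
print('%.4f seconds on %s' % (seconds, device))
```

The warm-up call matters on a GPU because the first operation typically pays for context and library initialization, which would otherwise be counted as compute time.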

Let's look more closely at how to map a computation to a specific GPU on a multi-GPU system.

In [5]:
print('Number of GPUs: %i' % torch.cuda.device_count())
print('ID of the GPU used: %i' % torch.cuda.current_device()) # current default GPU
torch.cuda.set_device(1) # switch to GPU 1
print('ID of the GPU used: %i' % torch.cuda.current_device())

Number of GPUs: 4
ID of the GPU used: 0
ID of the GPU used: 1
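The cell above assumes a machine with at least two GPUs; `torch.cuda.set_device(1)` fails when GPU 1 does not exist. A minimal sketch of a guarded version follows. The helper name `select_gpu` is hypothetical and not part of PyTorch.

```python
import torch

def select_gpu(preferred):
    """Make `preferred` the default GPU if it exists.

    Returns the index of the GPU in use, or None when CUDA is unavailable.
    """
    if not torch.cuda.is_available():
        return None
    count = torch.cuda.device_count()
    if 0 <= preferred < count:
        torch.cuda.set_device(preferred)
    # If `preferred` is out of range, the current default GPU is left unchanged
    return torch.cuda.current_device()

gpu_id = select_gpu(1)
print('GPU in use: %s' % gpu_id)
```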


In [6]:
# Use a context manager to place operations on a given device
with torch.cuda.device(0):
    A = torch.randn(n, n).cuda()
    A2 = A.mm(A)
print('A is on GPU %i' % (A.get_device()))
      
with torch.cuda.device(3):
    A = torch.randn(n, n).cuda()
    A2 = A.mm(A)
print('A is on GPU %i' % (A.get_device()))

A is on GPU 0
A is on GPU 3
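An alternative to the context manager is to name the device explicitly with a `torch.device` object and pass it through the `device=` argument, which allocates the tensor directly on that device rather than creating it on the CPU and copying it. This is a minimal sketch under that assumption; the names `B`, `B2`, and `m` are illustrative, and the code falls back to the CPU when no GPU is available.

```python
import torch

# Pick a device explicitly; "cuda:0" names GPU 0, with the CPU as a fallback
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Allocate directly on the target device instead of creating on the CPU and copying
m = 1000  # hypothetical size, smaller than n above to keep the sketch quick
B = torch.randn(m, m, device=device)
B2 = B.mm(B)

# The result lives on the same device as its inputs
print('B2 is on %s' % B2.device)
```

Existing tensors can be moved the same way with `B.to(device)`, which keeps the code identical on CPU-only and GPU machines.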
