<a href='https://ai.meng.duke.edu'><img align="left" style="padding-top:10px;" src="https://storage.googleapis.com/aipi_datasets/Duke-AIPI-Logo.png"></a>

# Introduction to PyTorch

In [4]:
# Install pytorch and torchvision if you have not already done so
# pip3 install torch torchvision

In [5]:
import numpy as np
import pandas as pd
import torch
from torchvision import datasets, transforms
from torch.utils.data import DataLoader, TensorDataset

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import matplotlib.pyplot as plt

torch.manual_seed(0)

<torch._C.Generator at 0x107921870>

# Introduction to Tensors
The basic object in PyTorch is the `Tensor`, the counterpart of NumPy's `ndarray`. As in NumPy, tensors come in several data types, e.g. Float, Double, Int, and Long. We will generally use 32-bit float tensors, which is the default type returned by most tensor-creation functions.
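As a quick illustration of the tensor types mentioned above, the sketch below inspects the `dtype` of a few tensors and converts between them. The variable names are illustrative only, and the sketch assumes the default dtype has not been changed with `torch.set_default_dtype`.

```python
import torch

# Floating-point literals produce float32 tensors by default
t_float = torch.tensor([1.0, 2.0, 3.0])
print(t_float.dtype)   # torch.float32

# Integer literals produce int64 ("Long") tensors
t_long = torch.tensor([1, 2, 3])
print(t_long.dtype)    # torch.int64

# Convert between types with .to(), or with shortcuts such as .float() and .long()
t_double = t_float.to(torch.float64)
t_back = t_long.float()
print(t_double.dtype, t_back.dtype)   # torch.float64 torch.float32
```

Mixing types in one operation can trigger implicit promotion or errors, so it is usually simplest to keep all model inputs in float32.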

In [6]:
# Create a tensor manually
x_manual = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
x_manual

tensor([[1., 2.],
        [3., 4.]])

In [7]:
x_ones = torch.ones(3,4)
print(x_ones)

x_zeros = torch.zeros(3,4)
print(x_zeros)

x_uniform = torch.rand(3,4)
print(x_uniform)

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])
tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])
tensor([[0.4963, 0.7682, 0.0885, 0.1320],
        [0.3074, 0.6341, 0.4901, 0.8964],
        [0.4556, 0.6323, 0.3489, 0.4017]])


In [8]:
# Create a tensor from a NumPy array
np_array = np.array([1., 2., 3.], dtype=np.float32)
print(np_array)
torch_tensor = torch.from_numpy(np_array)
print(torch_tensor)

[1. 2. 3.]
tensor([1., 2., 3.])


In [9]:
# Create a NumPy array from a tensor
another_tensor = torch.rand(3)
print(another_tensor)
another_np_array = another_tensor.numpy()
print(another_np_array)


tensor([0.0223, 0.1689, 0.2939])
[0.02232575 0.16885895 0.29388845]
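One detail worth knowing about the two conversions above: `torch.from_numpy` and `.numpy()` (for CPU tensors) share memory with the original object rather than copying it. The sketch below demonstrates this with illustrative variable names that are not part of the notebook.

```python
import numpy as np
import torch

# torch.from_numpy creates a tensor that shares memory with the source array
shared_array = np.array([1., 2., 3.], dtype=np.float32)
shared_tensor = torch.from_numpy(shared_array)
shared_array[0] = 10.0          # modify the NumPy array in place
print(shared_tensor)            # the tensor reflects the change

# torch.tensor(...) copies the data, so later edits to the array do not affect it
copied_tensor = torch.tensor(shared_array)
shared_array[1] = 20.0
print(copied_tensor)            # element 1 is still 2.0 in the copy

# .numpy() on a CPU tensor also returns a view of the same memory
cpu_tensor = torch.zeros(3)
np_view = cpu_tensor.numpy()
cpu_tensor.add_(1.0)            # in-place update of the tensor
print(np_view)                  # the NumPy view now holds ones
```

If an independent copy is needed, `torch.tensor(array)` or `tensor.clone()` avoids accidental aliasing.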


In [10]:
# Use indexing to get slices from a tensor
A = torch.rand(3,3)
print(A)
print(A[:, 1])
print(A[:2, :])

tensor([[0.5185, 0.6977, 0.8000],
        [0.1610, 0.2823, 0.6816],
        [0.9152, 0.3971, 0.8742]])
tensor([0.6977, 0.2823, 0.3971])
tensor([[0.5185, 0.6977, 0.8000],
        [0.1610, 0.2823, 0.6816]])


In [11]:
A = torch.rand(3,3)
B = torch.rand(3,3)

# Add tensors together
print("A+B")
print(A+B)

# Element-wise multiply tensors
print()
print("Elementwise multiplication (Hadamard product)")
print(A*B)

# Matrix-Matrix multiplication of tensors
print()
print("Matrix multiplication (matrix product)")
print(torch.mm(A,B))

A+B
tensor([[0.6892, 0.7036, 0.9845],
        [0.2443, 1.1150, 1.0965],
        [1.0474, 1.4583, 0.4196]])

Elementwise multiplication (Hadamard product)
tensor([[0.1132, 0.0833, 0.0302],
        [0.0075, 0.1722, 0.2700],
        [0.2265, 0.4905, 0.0429]])

Matrix multiplication (matrix product)
tensor([[0.9355, 1.0787, 0.6453],
        [0.3255, 0.3742, 0.2261],
        [0.4069, 1.0051, 0.7265]])
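For reference, `torch.mm` is one of several equivalent ways to write a matrix product of two 2-D tensors. The sketch below, using illustrative names `M` and `N`, compares it with `torch.matmul` and the `@` operator and contrasts it with the element-wise product.

```python
import torch

torch.manual_seed(0)
M = torch.rand(3, 3)
N = torch.rand(3, 3)

# Three equivalent ways to compute the matrix product of two 2-D tensors
prod_mm = torch.mm(M, N)
prod_matmul = torch.matmul(M, N)
prod_operator = M @ N
print(torch.allclose(prod_mm, prod_matmul), torch.allclose(prod_mm, prod_operator))

# Each entry of the matrix product is a row-by-column dot product
entry_00 = (M[0, :] * N[:, 0]).sum()
print(torch.allclose(prod_mm[0, 0], entry_00))

# The element-wise (Hadamard) product is a different operation: torch.mul or *
prod_elementwise = torch.mul(M, N)
print(torch.equal(prod_elementwise, M * N))
```

`torch.mm` only accepts 2-D tensors, whereas `torch.matmul` and `@` also handle batched and broadcast inputs.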


In [12]:
# Check if GPU is available, otherwise use CPU
cuda = torch.cuda.is_available()
cuda

False

In [13]:
# Move a tensor to the GPU (only if one is available)
mat_gpu = torch.rand(5000, 5000)
if cuda:
    mat_gpu = mat_gpu.cuda()
mat_gpu

tensor([[0.5846, 0.0332, 0.1387,  ..., 0.9534, 0.2357, 0.3334],
        [0.8576, 0.6120, 0.8924,  ..., 0.3778, 0.3465, 0.4203],
        [0.1008, 0.9075, 0.2329,  ..., 0.8757, 0.6707, 0.0709],
        ...,
        [0.9011, 0.0352, 0.5583,  ..., 0.3135, 0.2705, 0.3187],
        [0.0967, 0.0548, 0.4999,  ..., 0.4541, 0.5116, 0.8959],
        [0.6136, 0.4996, 0.0217,  ..., 0.3558, 0.1079, 0.0682]])
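A common alternative to the boolean flag used above is a `torch.device` object, which lets the same code run on either CPU or GPU. This is a minimal sketch with illustrative names (`device`, `mat_direct`, `mat_moved`), not the notebook's own code.

```python
import torch

# Pick the GPU when one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create a tensor directly on the chosen device ...
mat_direct = torch.rand(100, 100, device=device)

# ... or move an existing tensor with .to(), which returns a tensor on that device
mat_moved = torch.rand(100, 100).to(device)
print(mat_direct.device, mat_moved.device)

# Operands must live on the same device; move the result back to the CPU
# before converting it to a NumPy array
result = (mat_direct @ mat_moved).cpu().numpy()
print(result.shape)   # (100, 100)
```

Creating the tensor on the target device avoids an extra host-to-device copy.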

## Autograd
The key capability PyTorch provides is Autograd, its automatic differentiation engine. A Tensor stores both its value and, in its `.grad` attribute, the gradient of an output with respect to that value. Almost all built-in PyTorch operations support automatic differentiation. To use it, we call `.backward()` on the output of a computational graph (e.g. the loss of a neural network) once the forward computation is finished. PyTorch then populates the accumulated gradient for each Tensor in the graph that was created with `requires_grad=True`.

In [14]:
x = torch.tensor(2.0, requires_grad=False)
w = torch.tensor(0.5, requires_grad=True)
b = torch.tensor(0.1, requires_grad=True)
print('x =',x)
print('w =',w)
print('b =',b)

# Define a computational graph
y = w*x + b #y = 0.5x + 0.1 and y(2) = 1.1
print('y =',y)

x = tensor(2.)
w = tensor(0.5000, requires_grad=True)
b = tensor(0.1000, requires_grad=True)
y = tensor(1.1000, grad_fn=<AddBackward0>)


Now let's calculate the derivatives of the function $y = wx + b$ above with respect to the weight $w$ and the bias term $b$. We can work them out by hand:

For w:
$$
\frac{\partial y}{\partial w} = \frac{\partial}{\partial w}\left(wx + b\right) = x\\
\text{and}\\
\displaystyle \frac{\partial y}{\partial w}\Bigr|_{x=2} = 2 
$$
For b:
$$
\frac{\partial y}{\partial b} = \frac{\partial}{\partial b}\left(wx + b\right) = 1\\
\text{and}\\
\displaystyle \frac{\partial y}{\partial b}\Bigr|_{x=2} = 1 
$$

In [15]:
# Compute derivatives of y with respect to w and b
# (x was created with requires_grad=False, so no gradient is computed for it)
y.backward()

print('Gradient with respect to w:',w.grad)
print('Gradient with respect to b:',b.grad)

Gradient with respect to w: tensor(2.)
Gradient with respect to b: tensor(1.)
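The Autograd introduction describes gradients as "accumulated". The sketch below illustrates what that means in practice: each call to `.backward()` adds to `.grad` rather than overwriting it. It rebuilds the same $y = wx + b$ graph with separate, illustrative variable names so that it does not interfere with the cells above.

```python
import torch

x_in = torch.tensor(2.0)                       # input, no gradient tracked
w_acc = torch.tensor(0.5, requires_grad=True)  # weight
b_acc = torch.tensor(0.1, requires_grad=True)  # bias

# First backward pass: dy/dw = x = 2, dy/db = 1
(w_acc * x_in + b_acc).backward()
grad_w_first = w_acc.grad.clone()
print(w_acc.grad, b_acc.grad)   # tensor(2.) tensor(1.)

# Second backward pass on a freshly built graph: results are added to .grad
(w_acc * x_in + b_acc).backward()
grad_w_second = w_acc.grad.clone()
print(w_acc.grad, b_acc.grad)   # tensor(4.) tensor(2.)

# Reset the accumulated gradients before the next pass
w_acc.grad.zero_()
b_acc.grad.zero_()
print(w_acc.grad, b_acc.grad)   # tensor(0.) tensor(0.)

# x_in never required a gradient, so its .grad stays None
print(x_in.grad)                # None
```

This is why training loops typically zero the gradients once per iteration before calling `.backward()`.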


In [16]:
# Convert y from tensor to a NumPy array
# We get an error when we try this
y = y.numpy()
type(y)

RuntimeError: Can't call numpy() on Tensor that requires grad. Use tensor.detach().numpy() instead.

In [17]:
# Must first detach y from the computational graph
y = y.detach().numpy()
type(y)

numpy.ndarray
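Besides `.detach()`, there are a couple of other common ways to get a plain value out of a tensor that is part of a graph. The sketch below uses illustrative names (`w_d`, `y_d`) and shows `.detach()`, `.item()`, and the `torch.no_grad()` context manager side by side.

```python
import torch

w_d = torch.tensor(0.5, requires_grad=True)
y_d = w_d * 2.0 + 0.1          # y_d is part of the graph (it has a grad_fn)

# .detach() returns a tensor that shares data with y_d but is cut off from the graph
y_detached = y_d.detach()
print(y_d.requires_grad, y_detached.requires_grad)   # True False

# For a one-element tensor, .item() returns a plain Python number
y_value = y_d.item()
print(type(y_value))            # <class 'float'>

# Inside torch.no_grad() no graph is built, so .numpy() works directly
with torch.no_grad():
    y_no_grad = w_d * 2.0 + 0.1
print(y_no_grad.numpy())
```

`.item()` is convenient for logging scalar losses, while `torch.no_grad()` is the usual choice for evaluation code where gradients are never needed.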