In [None]:
%matplotlib inline

<div class="alert alert-info"><h4>Further reading:</h4><p>This notebook is adapted from the <a href="https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html">PyTorch: A 60 Minute Blitz</a> tutorial on the PyTorch website. For documentation and more tutorials, visit <a href="https://pytorch.org">pytorch.org</a>.</p></div>


# PyTorch

PyTorch is a free, open-source machine learning framework with a Python interface that is widely used in both research and industry. For the purposes of this class, it lets us easily set up and train neural network models. These lab notebooks are only a brief introduction to PyTorch, but they should get you comfortable with the most important terms and tools, starting with perhaps the most fundamental: tensors.



# Tensors

Tensors are a data structure—a way of storing numbers. They are very similar to arrays, which you may have encountered before (such as the ndarrays in NumPy). A tensor can be as simple as a one-dimensional list of numbers:

$$
\begin{bmatrix} 1 & 2 & 3 \end{bmatrix}
$$

But a tensor can also be a 2-dimensional array of numbers:

$$
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}
$$

Or it can have 3, 4, or any other number of dimensions. We describe a tensor by its shape: you might talk about a 1x3 tensor (the first example above, written as a single-row matrix), a 3x3 tensor (the second example), or an Nx1x32x32 tensor (which we'll see in a later notebook). So, at least as far as the field of machine learning is concerned, a tensor is just a big block of numbers: a multidimensional array.

PyTorch uses tensors to store the data that make up the input to a neural network model (e.g. images for classification), and also uses them to store the parameters that define the model (its weights and biases).
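To make the idea of shape concrete, here is a small sketch that builds the examples described above and prints their shapes. The batch size of 8 is an arbitrary stand-in for N, chosen only for illustration.

```python
import torch

# A plain list of numbers gives a one-dimensional tensor of shape (3,)
row = torch.tensor([1, 2, 3])

# Wrapping the list in another list gives a two-dimensional 1x3 tensor
row_matrix = torch.tensor([[1, 2, 3]])

# A 3x3 tensor
square = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# An N x 1 x 32 x 32 tensor, with N = 8 picked arbitrarily for this sketch
batch = torch.zeros(8, 1, 32, 32)

for name, t in [("row", row), ("row_matrix", row_matrix),
                ("square", square), ("batch", batch)]:
    print(f"{name}: shape={tuple(t.shape)}, dimensions={t.ndim}")
```

Note that PyTorch treats a flat list of three numbers (shape `(3,)`) and a single-row matrix (shape `(1, 3)`) as different shapes, even though they hold the same values.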

--------------

In [None]:
import torch
import numpy as np

## Initializing Tensors

There are several ways to initialize a tensor—let's look at some examples:

**Directly from data**

Tensors can be created directly from data. The data type is automatically inferred.

In [None]:
data = [[1, 2], [3, 4]]
data_tensor = torch.tensor(data)
print(data_tensor, data_tensor.dtype)
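As a quick illustration of type inference, the sketch below creates tensors from integer and floating-point data and then overrides the inferred type with the `dtype` argument. It assumes PyTorch's default floating-point type has not been changed from `float32`.

```python
import torch

# Python integers are inferred as 64-bit integers
int_tensor = torch.tensor([[1, 2], [3, 4]])

# Python floats are inferred as the default float type (float32 unless changed)
float_tensor = torch.tensor([[1.0, 2.0], [3.0, 4.0]])

# The inferred type can be overridden explicitly
explicit_tensor = torch.tensor([[1, 2], [3, 4]], dtype=torch.float64)

print(int_tensor.dtype)
print(float_tensor.dtype)
print(explicit_tensor.dtype)
```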

**From a NumPy array**

Tensors can be created from NumPy arrays. Note that the new tensor and the NumPy array will share the same memory.

In [None]:
np_array = np.array(data)
data_tensor = torch.from_numpy(np_array)
print(data_tensor)
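Sharing memory means that a change made through the NumPy array shows up in the tensor, and vice versa. The sketch below demonstrates this, and contrasts it with `torch.tensor`, which copies the data instead. The variable names `shared` and `copied` are only illustrative.

```python
import numpy as np
import torch

np_array = np.array([[1, 2], [3, 4]])

shared = torch.from_numpy(np_array)  # shares memory with np_array
copied = torch.tensor(np_array)      # makes an independent copy

# Modify the NumPy array after both tensors have been created
np_array[0, 0] = 100

print(shared[0, 0].item())  # 100: the shared tensor sees the change
print(copied[0, 0].item())  # 1: the copy does not
```

If you need a tensor that is safe from later changes to the array, copying with `torch.tensor` is the simpler choice.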

**With random or constant values**

``shape`` is a tuple giving the size of each tensor dimension; below, it requests tensors with 2 rows and 3 columns.



In [None]:
shape = (2, 3)
rand_tensor = torch.rand(shape)
ones_tensor = torch.ones(shape)
zeros_tensor = torch.zeros(shape)

print(f"Random Tensor: \n {rand_tensor} \n")
print(f"Ones Tensor: \n {ones_tensor} \n")
print(f"Zeros Tensor: \n {zeros_tensor}")

**Using another tensor as a template**

The new tensor retains the properties (shape, datatype) of the argument tensor, unless explicitly overridden.



In [None]:
data_tensor = torch.tensor([[1, 2], [3, 4]])

x_ones = torch.ones_like(data_tensor) # retains the properties of data_tensor
print(f"Ones Tensor: \n {x_ones} \n")

x_rand = torch.rand_like(data_tensor, dtype=torch.float) # overrides the datatype of data_tensor
print(f"Random Tensor: \n {x_rand} \n")

--------------




## Tensor Attributes

Tensor attributes describe their shape, datatype, and the device on which they are stored.



In [None]:
tensor = torch.rand(3, 4)

print(f"Shape of tensor: {tensor.shape}")
print(f"Datatype of tensor: {tensor.dtype}")
print(f"Device tensor is stored on: {tensor.device}")

--------------




## Tensor Operations

A full list of tensor operations can be found [here](https://pytorch.org/docs/stable/torch.html).

**Arithmetic (element-wise)**

You can use the usual +, -, \*, and / operators; each is applied element by element.

In [None]:
x = torch.randint(10, (3, 2))
y = torch.randint(10, (3, 2))
print(x, '\n')
print(y, '\n')
z = x + y
print(z)

**Matrix multiplication**

In [None]:
x = torch.tensor([[0, 1], [-1, 0]])
y = torch.tensor([[4, 0], [0, 4]])
print(x, '\n')
print(y, '\n')
z = x @ y
print(z)
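It is easy to confuse `*` with `@`, so here is a small sketch that applies both to the same pair of matrices. The values in `a` and `b` are made up for illustration.

```python
import torch

a = torch.tensor([[1, 2], [3, 4]])
b = torch.tensor([[10, 20], [30, 40]])

# Element-wise product: each entry of a is multiplied by the matching entry of b
elementwise = a * b
print(elementwise)  # tensor([[ 10,  40], [ 90, 160]])

# Matrix product: rows of a combined with columns of b
matrix_product = a @ b
print(matrix_product)  # tensor([[ 70, 100], [150, 220]])

# torch.matmul is the function form of the @ operator
same_result = torch.equal(matrix_product, torch.matmul(a, b))
print(same_result)  # True
```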

**Standard numpy-like indexing and slicing**

In [None]:
tensor = torch.zeros(3, 3)
print(tensor, '\n')
tensor[0, 0] = 3
tensor[:, 2] = 5
print(tensor)

**In-place operations**

Operations with a ``_`` suffix are in-place: they modify the tensor they are called on. For example, ``x.copy_(y)`` and ``x.t_()`` both change ``x``.

In [None]:
print(tensor, "\n")
tensor.add_(10)
print(tensor)

<div class="alert alert-info"><h4>Note</h4><p>In-place operations save some memory, but can potentially overwrite values required to compute gradients, so their use is discouraged.</p></div>
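The sketch below contrasts `add` (which returns a new tensor) with `add_` (which overwrites the original). It also shows one way the gradient issue can surface: PyTorch refuses an in-place update on a tensor that was created with `requires_grad=True`. Gradients are not covered in this notebook, so treat the second half as a preview rather than a full explanation.

```python
import torch

a = torch.zeros(3)

# Out-of-place: returns a new tensor and leaves a untouched
b = a.add(10)
a_unchanged = torch.equal(a, torch.zeros(3))
print(a_unchanged)  # True

# In-place: a itself is overwritten
a.add_(10)
print(torch.equal(a, b))  # True

# A tensor that tracks gradients cannot be modified in place directly
w = torch.ones(3, requires_grad=True)
try:
    w.add_(1)
    raised = False
except RuntimeError as err:
    raised = True
    print(f"In-place update rejected: {err}")
```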

**Joining tensors**

You can use ``torch.cat`` to concatenate a sequence of tensors along a given dimension. See also [torch.stack](https://pytorch.org/docs/stable/generated/torch.stack.html), which joins tensors along a new dimension instead of an existing one.

In [None]:
t1 = torch.cat([tensor, tensor], dim=1)
print(t1)
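The difference between the two is easiest to see in the resulting shapes. This sketch uses a fresh 3x3 tensor named `t` (a name chosen for this example) so it can run on its own.

```python
import torch

t = torch.zeros(3, 3)

# cat extends an existing dimension
cat_rows = torch.cat([t, t], dim=0)   # shape (6, 3)
cat_cols = torch.cat([t, t], dim=1)   # shape (3, 6)

# stack creates a new dimension and places the tensors along it
stacked = torch.stack([t, t], dim=0)  # shape (2, 3, 3)

print(tuple(cat_rows.shape))
print(tuple(cat_cols.shape))
print(tuple(stacked.shape))
```

A rule of thumb: `cat` keeps the number of dimensions the same, while `stack` adds one.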

**Changing device**

Large tensor operations, such as the matrix multiplication below, will typically run faster on a GPU (if one is available) than on a CPU.

In [None]:
import time

In [None]:
x = torch.rand(18_000, 18_000)
y = torch.rand(18_000, 18_000)
start_time = time.time()

# this should take several seconds to run:
z = x @ y

end_time = time.time()
elapsed_time = end_time - start_time
print(np.round(elapsed_time, 3), " seconds")

In [None]:
# move our tensors to the GPU if one is available

# CUDA: Nvidia GPUs
if torch.cuda.is_available():
    x = x.to('cuda')
    y = y.to('cuda')
# MPS: Mac GPUs (checked only if CUDA is not available)
elif torch.backends.mps.is_available():
    x = x.to('mps')
    y = y.to('mps')

print(f"Tensors are stored on: {x.device}")
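A common pattern, sketched below, is to pick the device once and reuse it, creating new tensors directly on that device with the `device` argument. The variable names here are illustrative, and the code falls back to the CPU when no GPU is found.

```python
import torch

# Pick the best available device once and reuse it
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

# Create a tensor directly on the chosen device
on_device = torch.rand(2, 3, device=device)
print(f"Created on: {on_device.device}")

# Move a copy back to the CPU, e.g. before converting to NumPy
back_on_cpu = on_device.cpu()
print(f"Moved back to: {back_on_cpu.device}")
```

Creating a tensor on the target device avoids first allocating it on the CPU and then copying it over.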

In [None]:
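# if the tensors were moved to a GPU, this should be faster
z = 0
start_time = time.time()
z = x @ y

# GPU work runs asynchronously, so wait for it to finish before stopping the clock
if x.device.type == 'cuda':
    torch.cuda.synchronize()
elif x.device.type == 'mps':
    torch.mps.synchronize()  # assumes a recent PyTorch release (2.0 or later)

end_time = time.time()
elapsed_time = end_time - start_time
print(np.round(elapsed_time, 3), " seconds")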