<a href="https://colab.research.google.com/github/pkro/pytorch_for_deep_learning/blob/main/00_pytorch_fundamentals_video.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## 00. PyTorch Fundamentals

Resource notebook: https://www.learnpytorch.io/00_pytorch_fundamentals/

In [1]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
print(torch.__version__)


2.0.1+cu118


In [2]:
!nvidia-smi # works only if connected to a runtime with GPU

Mon Jul  3 07:30:02 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.12    Driver Version: 525.85.12    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   52C    P8    10W /  70W |      0MiB / 15360MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

## Introduction to Tensors

### Creating tensors

PyTorch tensors are created with `torch.tensor()`; the resulting objects are instances of [torch.Tensor](https://pytorch.org/docs/stable/tensors.html).

In [3]:
# scalar

scalar = torch.tensor(7)
scalar

tensor(7)

In [4]:
print(scalar.ndim) # 0 -> a scalar is a 0-dimensional tensor
print(scalar.item()) # 7, get tensor back as python int
print(scalar.shape)

0
7
torch.Size([])


In [5]:
vector = torch.tensor([3,4]) # vector: magnitude and direction, indicated by x/y coordinates that represent a vector from [0,0] to [x,y]
vector

tensor([3, 4])

In [6]:
print(vector.ndim) # 1-dimensional tensor
print(vector.shape) # torch.Size([2])
# print(vector.item()) # error as a vector can't be converted to a scalar

1
torch.Size([2])


In [7]:
# matrices and tensors are usually written in all uppercase

MATRIX = torch.tensor([
                      [7,8],
                      [9,10]
                       ])

MATRIX

tensor([[ 7,  8],
        [ 9, 10]])

In [8]:
print(MATRIX.ndim) # 2-dimensional tensor
print(MATRIX.shape) # torch.Size([2,2]) 2 by 2

2
torch.Size([2, 2])


In [9]:
TENSOR = torch.tensor([[[1,2,3],
                        [4,5,6],
                        [7,8,9]]])
TENSOR

tensor([[[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]])

In [10]:
TENSOR[0]

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [11]:
TENSOR [0,1] # same as TENSOR[0][1]

tensor([4, 5, 6])

In [12]:
print(TENSOR.ndim) # 3-dimensional tensor
print(TENSOR.shape) # torch.Size([1, 3, 3]) 1 * 3 * 3

3
torch.Size([1, 3, 3])


In [13]:
TENSOR2 = torch.tensor([[[1,2,3],
                        [4,5,6],
                        [7,8,9]],
                        [[11,12,13],
                        [14,15,16],
                        [17,18,19]]])
TENSOR2

tensor([[[ 1,  2,  3],
         [ 4,  5,  6],
         [ 7,  8,  9]],

        [[11, 12, 13],
         [14, 15, 16],
         [17, 18, 19]]])

In [14]:
print(TENSOR2.ndim) # 3
print(TENSOR2.shape) # [2,3,3]


3
torch.Size([2, 3, 3])


### Random tensors

Why random tensors?

Random tensors are important because many neural networks start learning with tensors of random numbers and adjust those numbers during the learning / training process to better represent the data.

Flow:

- start with random numbers
- look at data
- update random numbers
- look at data
- update random numbers
- etc
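
The flow above can be sketched as a tiny training loop. This is a minimal illustration, not the course's code: the toy data (`y = 2 * x`), the learning rate of 0.01 and the 100 steps are assumptions chosen so that the example converges quickly.

```python
import torch

torch.manual_seed(0)

# Toy data (assumed for illustration): the number to learn is 2.0
x = torch.tensor([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x

w = torch.rand(1)  # start with a random number
initial_loss = ((w * x - y) ** 2).mean()

for _ in range(100):
    pred = w * x                        # look at data
    grad = (2 * (pred - y) * x).mean()  # gradient of the mean squared error w.r.t. w
    w = w - 0.01 * grad                 # update the (formerly random) number

final_loss = ((w * x - y) ** 2).mean()
print(w, initial_loss, final_loss)
```

After the loop, `w` should be close to 2.0 and the loss much smaller than at the start.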

In [15]:
# create a random tensor of size (3,4)

random_tensor = torch.rand(3,4) # number of items per dimension, torch.rand(1,3,4) would create a 3-dimensional tensor
random_tensor

tensor([[0.0848, 0.1336, 0.1499, 0.6614],
        [0.2968, 0.3000, 0.4738, 0.7710],
        [0.8006, 0.7803, 0.5569, 0.1489]])

In [16]:
random_tensor.ndim, random_tensor.shape

(2, torch.Size([3, 4]))

In [17]:
# create a random tensor with a similar shape to an image tensor
random_image_size_tensor = torch.rand(size=(224,224,3)) # height, width, color channels (R,G,B); sometimes color channels are at the beginning (e.g. 3,224,224)

In [18]:
random_image_size_tensor.ndim, random_image_size_tensor.shape

(3, torch.Size([224, 224, 3]))

### zeroes and ones

Note: ones and zeros are created as float32, not integers

In [19]:
zeros = torch.zeros(size=(3,4))
zeros

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [20]:
ones = torch.ones(size=(3,4))
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [21]:
ones.dtype # float32 is the default dtype for floating-point tensors (integer input defaults to int64)

torch.float32

### create range of tensors and tensors-like

In [22]:
torch.range(0,10) # deprecated, produces range 0-10
torch.arange(0,10) # works like python range, creates range 0-9 (10 is exclusive)

  torch.range(0,10) # deprecated, produces range 0-10


tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [23]:
one_to_ten = torch.arange(1,11)
one_to_ten

tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [24]:
zero_to_1000_step_50 = torch.arange(0,1001,50)
zero_to_1000_step_50

tensor([   0,   50,  100,  150,  200,  250,  300,  350,  400,  450,  500,  550,
         600,  650,  700,  750,  800,  850,  900,  950, 1000])

In [25]:
# same as
torch.arange(start=0, end=1001, step=50)

tensor([   0,   50,  100,  150,  200,  250,  300,  350,  400,  450,  500,  550,
         600,  650,  700,  750,  800,  850,  900,  950, 1000])

In [26]:
# tensors like
# returns a tensor with the same shape and dtype as the input tensor, with all values zeroed
ten_zeros = torch.zeros_like(one_to_ten)
ten_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

## Tensor datatypes

Using the wrong tensor datatype is one of the 3 big error sources when working with PyTorch & deep learning:

1. Tensor is not the right datatype
2. Tensor is not the right shape
3. Tensor is not on the right device

In [27]:
float_32_tensor = torch.tensor([3.0,6.0,9.0],
                               dtype=None, # specify datatype (defaults to float32 if floats are given)
                               device="cpu", # what device is the tensor on; "cuda" for gpu
                               requires_grad=False) # whether to track gradients in tensor operations or not
float_32_tensor.dtype

torch.float32

In [28]:
int64_tensor = torch.tensor([3,6,9]) # note we use ints, so the tensor will default to int64
int64_tensor.dtype

torch.int64

In [29]:
# convert tensor
float_16_tensor = float_32_tensor.type(torch.float16)
float_16_tensor

tensor([3., 6., 9.], dtype=torch.float16)

In [30]:
# multiplication between tensors of different datatypes is possible,
# the output is promoted to the higher-precision datatype (float32 here)
float_16_tensor * float_32_tensor

tensor([ 9., 36., 81.])

In [31]:
(int64_tensor * float_32_tensor).dtype # float32: in type promotion, a floating-point dtype takes priority over an integer dtype

torch.float32

In [32]:
(torch.tensor([1,2,3], dtype=torch.int32) * torch.tensor([4,5,6], dtype=torch.float32)).dtype # float32, same as above

torch.float32
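
The dtype of a mixed-type operation can be checked without running the operation itself. The following is a small sketch using `torch.result_type`; the variable names are made up for this example.

```python
import torch

f16 = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float16)
f32 = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float32)
i64 = torch.tensor([3, 6, 9])

# torch.result_type reports the dtype an operation on both inputs would produce
float_mix = torch.result_type(f16, f32)      # the wider float wins
int_float_mix = torch.result_type(i64, f16)  # floating point outranks integer
product_dtype = (f16 * f32).dtype

print(float_mix, int_float_mix, product_dtype)
```

Note that int64 combined with float16 gives float16: the category (floating point over integer) decides first, not the number of bits.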

### getting information from tensors

1. datatype (tensor.dtype)
2. shape (tensor.shape, tensor.size())
3. device (tensor.device)


In [33]:
some_tensor = torch.rand(3,4)

def tensor_info(tensor):
  print(f"shape: {tensor.shape}, dimensions: {tensor.ndim}") # no str(tensor.shape) needed in formatted strings / string literals
  print(f"dtype: {tensor.dtype}")
  print(f"device: {tensor.device}")

tensor_info(some_tensor)


shape: torch.Size([3, 4]), dimensions: 2
dtype: torch.float32
device: cpu


### manipulating tensors (tensor operations)

- addition
- subtraction
- division
- multiplication (element-wise)
- matrix multiplication

In [34]:
tensor = torch.tensor([1,2,3])
tensor + 10 # adds 10 to each element

tensor([11, 12, 13])

In [35]:
tensor * 10 # multiplies each element by 10

tensor([10, 20, 30])

In [36]:
print(tensor - 10) # a bare "tensor - 10" is only echoed when it is the last expression in the cell and has no trailing ";"

tensor([-9, -8, -7])


In [37]:
# using pytorch in-built functions
torch.mul(tensor, 10), torch.add(tensor, 5), torch.sub(tensor, 20)

(tensor([10, 20, 30]), tensor([6, 7, 8]), tensor([-19, -18, -17]))

In general, the course recommends using the standard Python operators instead of the torch functions for better readability.

For more complex operations such as matrix multiplication, torch's built-in implementation (`torch.matmul` or the `@` operator) is much faster than a hand-written Python loop.


### Matrix multiplication

[how to multiply matrices](https://www.mathsisfun.com/algebra/matrix-multiplying.html)

[visualization](http://matrixmultiplication.xyz/)

Two main ways of performing multiplication in neural networks and deep learning:

1. Element-wise multiplication / scalar multiplication
  - multiply by single number: just multiply each matrix element with that number
2. Matrix multiplication / dot product (most common operation in neural networks)
  - symbol: **&middot;** (just a fat dot), e.g. "a **&middot;** b"
  - pytorch function: `torch.matmul(t1, t2)` or `torch.mm(...)`, python operator: `@`, e.g. `t1 @ t2`
  - **inner dimensions** must match:
    - (3, **2**) @ (**3**, 2) will NOT work
    - (2, **3**) @ (**3**, 2) WILL work
    - (3, **2**) @ (**2**, 3) WILL work
  - the resulting matrix has the shape of the **outer dimensions**  
    - (**2**, 3) @ (3, **2**) -> resulting shape: (2, 2)
    - (**3**, 2) @ (2, **3**) -> resulting shape: (3, 3)
    - torch.rand(7,10) @ torch.rand(10,2) -> shape: (7,2)
  - order counts!
  - dot product of rows and columns
  - rows of first matrix are multiplied by columns of second matrix
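
The shape rules above can be written down as a small check. `matmul_shape` is a hypothetical helper made up for this note (it is not part of PyTorch) and only covers the 2-D case.

```python
import torch

def matmul_shape(shape_a, shape_b):
    """Hypothetical helper: result shape of a 2-D matmul, or None if the inner dimensions differ."""
    if shape_a[1] != shape_b[0]:  # inner dimensions must match
        return None
    return (shape_a[0], shape_b[1])  # result takes the outer dimensions

no_match = matmul_shape((3, 2), (3, 2))
two_by_two = matmul_shape((2, 3), (3, 2))
three_by_three = matmul_shape((3, 2), (2, 3))

# Compare the prediction with what PyTorch actually returns
actual = tuple((torch.rand(7, 10) @ torch.rand(10, 2)).shape)
predicted = matmul_shape((7, 10), (10, 2))
print(no_match, two_by_two, three_by_three, actual, predicted)
```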



In [38]:
(torch.rand(7,10) @ torch.rand(10,2)).shape

torch.Size([7, 2])

In [39]:
# element-wise multiplication
torch.tensor([[1,2,3], [4,5,6]]) * torch.tensor([[1,2,3], [4,5,6]])

tensor([[ 1,  4,  9],
        [16, 25, 36]])

In [40]:
# matrix multiplication with vectors
# tensor from before is a [1,2,3]
torch.matmul(tensor, tensor) # 1*1 + 2*2 + 3*3

tensor(14)

In [41]:
# dot product / matrix multiplication
torch.matmul(torch.tensor(
    [ [1,2,3],
      [4,5,6]
    ]),torch.tensor(
        [
          [7,8],
          [9,10],
          [11,12]
        ]))

tensor([[ 58,  64],
        [139, 154]])
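
To see where these numbers come from, the same product can be computed by hand: each result element is the dot product of one row of the first matrix with one column of the second. This loop is only a sketch for illustration and is far slower than `torch.matmul`.

```python
import torch

A = torch.tensor([[1, 2, 3],
                  [4, 5, 6]])
B = torch.tensor([[7, 8],
                  [9, 10],
                  [11, 12]])

manual = torch.zeros(A.shape[0], B.shape[1], dtype=A.dtype)
for i in range(A.shape[0]):        # rows of the first matrix
    for j in range(B.shape[1]):    # columns of the second matrix
        manual[i, j] = (A[i, :] * B[:, j]).sum()  # dot product of row i and column j

print(manual)
```

For example, the top-left element is 1*7 + 2*9 + 3*11 = 58.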

### One of the most common errors in deep learning: shape errors



In [42]:
# shapes for matrix multiplication
tensor_A = torch.tensor([[1,2],
                         [3,4],
                         [5,6]])

tensor_B = torch.tensor([[7,10],
                         [8,11],
                         [9,12]])

tensor_info(tensor_A)
tensor_info(tensor_B)

try:
  torch.matmul(tensor_A, tensor_B) # error
except RuntimeError:
  print("can't multiply tensors: incompatible shapes")



shape: torch.Size([3, 2]), dimensions: 2
dtype: torch.int64
device: cpu
shape: torch.Size([3, 2]), dimensions: 2
dtype: torch.int64
device: cpu
can't multiply tensors: incompatible shapes


To fix the error, we can transpose one of our tensors so that the inner dimensions match.

Transpose switches the axes or dimensions of a given tensor.

In [43]:
tensor_info(tensor_B)
print(tensor_B)
tensor_info(tensor_B.T) # transposed
print(tensor_B.T)


shape: torch.Size([3, 2]), dimensions: 2
dtype: torch.int64
device: cpu
tensor([[ 7, 10],
        [ 8, 11],
        [ 9, 12]])
shape: torch.Size([2, 3]), dimensions: 2
dtype: torch.int64
device: cpu
tensor([[ 7,  8,  9],
        [10, 11, 12]])


In [44]:
# the matrix multiplication works when tensor_B is transposed so the inner dimensions match with tensor_A
torch.matmul(tensor_A, tensor_B.T) # the resulting tensors shape is the outer dimensions of the source tensors (3/3)

tensor([[ 27,  30,  33],
        [ 61,  68,  75],
        [ 95, 106, 117]])

## Tensor aggregation: Finding the min, max, mean, sum etc.

In [45]:
x = torch.arange(0,100,10)
x.min(), x.max() # or torch.min(x)

(tensor(0), tensor(90))

In [46]:
# mean
try:
  print(x.mean())
except RuntimeError:
  print("mean doesn't work with integer / long types as the result is usually a floating point")

x_float = x.type(torch.float32)

x_float.mean() #  or torch.mean(x.type(torch.float32))


mean doesn't work with integer / long types as the result is usually a floating point


tensor(45.)
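
There are two common ways to take the mean of an integer tensor, sketched below: cast the tensor first, or pass the `dtype` argument of `mean()` so the cast happens as part of the call.

```python
import torch

x = torch.arange(0, 100, 10)  # int64 tensor

mean_via_cast = x.float().mean()              # cast first, then average
mean_via_dtype = x.mean(dtype=torch.float32)  # let mean() cast the input

print(mean_via_cast, mean_via_dtype)
```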

In [47]:
x.sum(), x_float.sum()

(tensor(450), tensor(450.))

In [48]:
# argmin/argmax return the index of the min / max value
torch.argmin(x), torch.argmax(x)

(tensor(0), tensor(9))

In [49]:
# what does it do with multidimensional tensors (matrices)?
# for multidimensional tensors, argmin/argmax without a dim argument work on the flattened tensor and return the index of the value in that one-dimensional view
print(tensor_A)
def tensor_aggregation_info(tensor):
  print(f"min: {tensor.min()}")
  print(f"max: {tensor.max()}")
  print(f"sum: {tensor.sum()}")
  try:
    print(f"mean: {tensor.mean()}")
  except RuntimeError:
    print("mean: wrong datatype (must be floating point or complex)")
  print(f"argmin: {tensor.argmin()}")
  print(f"argmax: {tensor.argmax()}")

def tensor_info_all(tensor):
  tensor_info(tensor)
  tensor_aggregation_info(tensor)

tensor_info_all(tensor_A)

tensor([[1, 2],
        [3, 4],
        [5, 6]])
shape: torch.Size([3, 2]), dimensions: 2
dtype: torch.int64
device: cpu
min: 1
max: 6
sum: 21
mean: wrong datatype (must be floating point or complex)
argmin: 0
argmax: 5
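
A flat index such as 5 can be converted back to a row and column, and `argmax` also accepts a `dim` argument to search along one dimension only. A short sketch, reusing the values of `tensor_A`:

```python
import torch

tensor_A = torch.tensor([[1, 2],
                         [3, 4],
                         [5, 6]])

flat_index = tensor_A.argmax().item()                # index into the flattened tensor
row, col = divmod(flat_index, tensor_A.shape[1])     # convert to (row, column)

per_column = tensor_A.argmax(dim=0)  # row index of the max in each column
per_row = tensor_A.argmax(dim=1)     # column index of the max in each row

print(flat_index, (row, col), per_column, per_row)
```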


## Reshaping, stacking, squeezing and unsqueezing tensors

- Reshaping: reshapes an input to a defined shape
- View: return a view of an input tensor of certain shape but keep the same memory as the original tensor (saving memory)
- Stacking: combine multiple tensors on top of each other (torch.vstack) or side by side (torch.hstack) or on whatever dimension (torch.stack)
- Squeeze: remove all `1` dimensions from a tensor
- Unsqueeze: add a `1` dimension to a tensor
- Permute: return a view of the input with dimensions permuted (swapped) in a certain way



In [50]:
x = torch.arange(1., 10.)
x, x.shape

(tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.]), torch.Size([9]))

In [51]:
# reshape
x_reshaped = torch.reshape(x, [3,3])
print(x_reshaped)

try:
  torch.reshape(x, [3,4]) # would require x to have 12 values
except RuntimeError as e:
  print(str(e))
  print("tensor can only be reshaped in a tensor with the same amount of elements")

tensor([[1., 2., 3.],
        [4., 5., 6.],
        [7., 8., 9.]])
shape '[3, 4]' is invalid for input of size 9
tensor can only be reshaped in a tensor with the same amount of elements


In [52]:
x.reshape(9,1)

tensor([[1.],
        [2.],
        [3.],
        [4.],
        [5.],
        [6.],
        [7.],
        [8.],
        [9.]])

In [53]:
tensor_8_elements = torch.arange(1,9)
tensor_8_elements.reshape(2,2,2)

tensor([[[1, 2],
         [3, 4]],

        [[5, 6],
         [7, 8]]])

In [54]:
# create a view
# can be used to reshape a tensor but using the original's memory

z = x.view(3,3)
z

tensor([[1., 2., 3.],
        [4., 5., 6.],
        [7., 8., 9.]])

In [55]:
# changing an element in the view changes it in the original tensor
z[0,0] = 99.0
print(z)
print(x)

tensor([[99.,  2.,  3.],
        [ 4.,  5.,  6.],
        [ 7.,  8.,  9.]])
tensor([99.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.])


In [56]:
# and vice versa
x[len(x)-1] = 77.0
print(f"x: {x}")
print(f"z: {z}")


x: tensor([99.,  2.,  3.,  4.,  5.,  6.,  7.,  8., 77.])
z: tensor([[99.,  2.,  3.],
        [ 4.,  5.,  6.],
        [ 7.,  8., 77.]])
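
`view` only works when the requested shape is compatible with the tensor's memory layout, while `reshape` returns a view when it can and a copy otherwise. The sketch below uses a transposed tensor as an example of a non-contiguous layout; the variable names are made up for this note.

```python
import torch

base = torch.arange(6).reshape(2, 3)
transposed = base.T  # shape (3, 2): same memory as base, but no longer contiguous

try:
    transposed.view(6)
    view_failed = False
except RuntimeError:
    view_failed = True  # view cannot express this layout without copying

flat = transposed.reshape(6)  # reshape falls back to a copy here
shares_memory = flat.data_ptr() == base.data_ptr()

print(transposed.is_contiguous(), view_failed, flat, shares_memory)
```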


In [57]:
# reset to original values / shape for further code
x = torch.arange(1., 10.)
x

tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.])

In [58]:
# stacking tensors on top of each other
# torch.vstack(x, x) # error, argument must be a tuple or list of tensors
torch.vstack((x,x)) # argument as tuple

tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.],
        [1., 2., 3., 4., 5., 6., 7., 8., 9.]])

In [59]:
torch.hstack([x,x]) # argument as list

tensor([1., 2., 3., 4., 5., 6., 7., 8., 9., 1., 2., 3., 4., 5., 6., 7., 8., 9.])

In [60]:
torch.stack([x,x,x,x], dim=1)

tensor([[1., 1., 1., 1.],
        [2., 2., 2., 2.],
        [3., 3., 3., 3.],
        [4., 4., 4., 4.],
        [5., 5., 5., 5.],
        [6., 6., 6., 6.],
        [7., 7., 7., 7.],
        [8., 8., 8., 8.],
        [9., 9., 9., 9.]])

In [61]:
# stacking matrices
print(tensor_A)
print(tensor_B)

print(torch.stack((tensor_A, tensor_B), dim=0))
print(torch.vstack((tensor_A, tensor_B)))
print(torch.hstack((tensor_A, tensor_B)))

tensor([[1, 2],
        [3, 4],
        [5, 6]])
tensor([[ 7, 10],
        [ 8, 11],
        [ 9, 12]])
tensor([[[ 1,  2],
         [ 3,  4],
         [ 5,  6]],

        [[ 7, 10],
         [ 8, 11],
         [ 9, 12]]])
tensor([[ 1,  2],
        [ 3,  4],
        [ 5,  6],
        [ 7, 10],
        [ 8, 11],
        [ 9, 12]])
tensor([[ 1,  2,  7, 10],
        [ 3,  4,  8, 11],
        [ 5,  6,  9, 12]])


The video is wrong: hstack and vstack aren't the same as stack, just with a different "dim" parameter.

Good gpt explanation:

- torch.stack(): This method joins a sequence of tensors along a new dimension. All tensors need to be of the same size. You can specify the new axis along which the tensors will be stacked with the dim argument. For example, if you have two tensors of shape (3,) and you use torch.stack() with dim=0, you'll get a tensor of shape (2, 3). If you set dim=1, you'll get a tensor of shape (3, 2).

- torch.hstack(): This method is equivalent to numpy.hstack(). It stacks tensors horizontally (i.e., column wise): it concatenates along the second axis, except for 1-D tensors, where it concatenates along the first axis. Because it concatenates along an existing axis, the inputs may differ in size along that axis but must match in all other dimensions. For one-dimensional tensors it behaves like torch.cat() along the first axis, unlike torch.stack(dim=0), which introduces a new axis.

- torch.vstack(): This method is equivalent to numpy.vstack(). It stacks tensors vertically (i.e., row wise): it concatenates along the first axis for 2-D (or higher) tensors. 1-D tensors are first treated as rows of shape (1, N), so for them the result matches torch.stack(dim=0).

So, while it's tempting to think of vstack and hstack as just convenience methods for stack with different dim parameters, they actually have slightly different functionality: they concatenate along an existing dimension, so the inputs may differ in size along that dimension, whereas stack creates a new dimension and requires all tensors to be of the same size.
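
The difference is easiest to see in the resulting shapes. The following sketch uses small made-up tensors: two 1-D tensors of equal size, and two 2-D tensors that differ only in their second dimension.

```python
import torch

a = torch.tensor([1, 2, 3])
b = torch.tensor([4, 5, 6])

shapes_1d = {
    "stack dim=0": tuple(torch.stack((a, b), dim=0).shape),  # new leading dimension
    "stack dim=1": tuple(torch.stack((a, b), dim=1).shape),  # new trailing dimension
    "vstack": tuple(torch.vstack((a, b)).shape),             # rows on top of each other
    "hstack": tuple(torch.hstack((a, b)).shape),             # one longer 1-D tensor
}

wide = torch.zeros(3, 2)
narrow = torch.zeros(3, 1)
hstack_shape = tuple(torch.hstack((wide, narrow)).shape)  # sizes may differ along dim 1

try:
    torch.stack((wide, narrow))  # stack needs identical shapes
    stack_failed = False
except RuntimeError:
    stack_failed = True

print(shapes_1d, hstack_shape, stack_failed)
```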

In [62]:
# squeeze and unsqueeze
# squeeze removes all dimensions of size 1, so the following tensor becomes a 1d-tensor
tensor_2d = torch.tensor([[1,2,3]])
print(tensor_2d)
tensor_info_all(tensor_2d)

tensor_1d = torch.squeeze(tensor_2d)
tensor_1d

tensor([[1, 2, 3]])
shape: torch.Size([1, 3]), dimensions: 2
dtype: torch.int64
device: cpu
min: 1
max: 3
sum: 6
mean: wrong datatype (must be floating point or complex)
argmin: 0
argmax: 2


tensor([1, 2, 3])

In [63]:
# unsqueeze adds a dimension of size 1 to a tensor at the specified position
print(tensor_1d.unsqueeze(0))
print(tensor_1d.unsqueeze(dim=1))
# print(tensor_1d.unsqueeze(2)) # error

tensor([[1, 2, 3]])
tensor([[1],
        [2],
        [3]])
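
`squeeze` also accepts a `dim` argument to remove only one specific size-1 dimension, which is useful when other size-1 dimensions must be kept. A short sketch with a made-up tensor of shape (1, 3, 1):

```python
import torch

batch = torch.zeros(1, 3, 1)

all_squeezed = batch.squeeze()          # removes every size-1 dimension
first_squeezed = batch.squeeze(dim=0)   # removes only dimension 0
untouched = batch.squeeze(dim=1)        # dimension 1 has size 3, so nothing changes
restored = all_squeezed.unsqueeze(0).unsqueeze(-1)  # add the size-1 dimensions back

print(all_squeezed.shape, first_squeezed.shape, untouched.shape, restored.shape)
```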


In [64]:
# torch.permute returns a view of a tensor with dimensions rearranged in a specified order

print(tensor_2d)
tensor_info_all(tensor_2d)

permuted = torch.permute(tensor_2d, [1,0])
print(permuted)
tensor_info_all(permuted)

tensor([[1, 2, 3]])
shape: torch.Size([1, 3]), dimensions: 2
dtype: torch.int64
device: cpu
min: 1
max: 3
sum: 6
mean: wrong datatype (must be floating point or complex)
argmin: 0
argmax: 2
tensor([[1],
        [2],
        [3]])
shape: torch.Size([3, 1]), dimensions: 2
dtype: torch.int64
device: cpu
min: 1
max: 3
sum: 6
mean: wrong datatype (must be floating point or complex)
argmin: 0
argmax: 2


In [65]:
# more permute examples from course
# Example: we have image data with x/y/color_channels that we want to convert to color_channels/x/y

x_original = torch.rand(size=(224,224,3))

print(x_original.shape)

x_permuted = torch.permute(x_original, [2, 0, 1])
print(x_permuted.shape)

torch.Size([224, 224, 3])
torch.Size([3, 224, 224])
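
Because `permute` returns a view, the permuted tensor shares its data with the original. The sketch below uses a small stand-in "image" (4 x 5 pixels, assumed for brevity) to show that a change in the original is visible in the permuted tensor.

```python
import torch

image = torch.zeros(4, 5, 3)             # height, width, color channels
channels_first = image.permute(2, 0, 1)  # color channels, height, width

image[0, 0, 0] = 1.0                     # change the original ...
shared_value = channels_first[0, 0, 0].item()  # ... and read it through the view

print(channels_first.shape, shared_value)
```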


## Indexing (selecting data from tensors)

- similar to indexing with numpy


In [66]:
x = torch.arange(1,10).reshape(1, 3,3)
x, x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]))

In [67]:
# index on tensor


In [68]:
x[0, 0] # same as x[0][0]

tensor([1, 2, 3])

In [69]:
x[0,2,0]

tensor(7)

In [70]:
# use ":" to select all of a target dimension
print(x[0,:,0]) # first column
print(x[0,:,2]) # last column
print(x[0,:,-1]) # also last column (but we don't have to know the actual length of the rows)
print(x[0,:,-2]) # last minus 1 column (in our case the second column)
print(x[0,1,:]) # get second row

tensor([1, 4, 7])
tensor([3, 6, 9])
tensor([3, 6, 9])
tensor([2, 5, 8])
tensor([4, 5, 6])
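
One detail worth noting: an integer index removes a dimension, while `:` keeps it. A short sketch on the same `x`:

```python
import torch

x = torch.arange(1, 10).reshape(1, 3, 3)

nine = x[0, 2, 2]           # three integer indices: a 0-dimensional tensor
middle_kept = x[:, 1, 1]    # ":" keeps dimension 0, so the result has shape (1,)
last_column = x[:, :, 2]    # two dimensions kept: shape (1, 3)

print(nine, middle_kept, last_column)
```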


## Pytorch tensors & numpy

NumPy is a popular scientific computing library. PyTorch can interact with it.

- data in NumPy, want PyTorch tensor: `torch.from_numpy(ndarray)`
- PyTorch tensor to NumPy: `torch.Tensor.numpy()`

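`torch.from_numpy` keeps the dtype of the NumPy array (NumPy's default is float64, see below), and `Tensor.numpy()` keeps the dtype of the given tensor (no implicit conversion).

Both return an object that **shares memory** with the original array or tensor (not a copy), so in-place changes on one side are visible on the other. Reassigning a name (e.g. `array = array + 1`) creates a new object and leaves the other side unchanged.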

In [71]:
array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array)
array, tensor

(array([1., 2., 3., 4., 5., 6., 7.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [72]:
print(array.dtype) # numpy default is float64
print(tensor.dtype) # torch default is float32 (but is float64 here as it uses the dtype of the given numpy array)

float64
torch.float64


In [73]:
# convert to float32
tensor = torch.from_numpy(array).type(torch.float32)
tensor.dtype

torch.float32

In [74]:
# tensor to numpy
tensor = torch.ones(7)
print(tensor)
numpy_tensor = tensor.numpy();
tensor.dtype, numpy_tensor

tensor([1., 1., 1., 1., 1., 1., 1.])


(torch.float32, array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))
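
Whether a change shows up on the other side depends on whether it happens in place. The sketch below contrasts an in-place update with a reassignment for CPU tensors; the variable names are chosen for this example.

```python
import numpy as np
import torch

array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array)

array += 1                        # in-place: modifies the shared memory
after_inplace = tensor[0].item()

array = array + 1                 # reassignment: creates a new array
after_rebind = tensor[0].item()   # the tensor still points at the old memory

ones = torch.ones(3)
ones_np = ones.numpy()
ones.add_(1)                      # in-place op on the tensor side
after_add = float(ones_np[0])

print(after_inplace, after_rebind, array[0], after_add)
```

If an independent copy is needed, `torch.tensor(array)` or `tensor.clone()` can be used instead.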

## Reproducibility

A neural network learns by:

- start with random numbers
- tensor operations
- update random numbers to make them better representations of the data
- repeat from step 2

To test / reproduce results (reduce randomness), set a random seed (basically a starting value for the pseudo-random number generator).

```python
# pytorch
import torch
torch.manual_seed(0)
```

Note: depending on what is used, the python and numpy random number generator may have to be seeded as well:

```python
# python
import random
random.seed(0)

# numpy
import numpy as np
np.random.seed(0)
```

[more](https://pytorch.org/docs/stable/notes/randomness.html)



In [75]:
# random but reproducible

RANDOM_SEED = 42
torch.manual_seed(RANDOM_SEED)
random_tensor_A = torch.rand(3,4)
random_tensor_B = torch.rand(3,4) # still different

random_tensor_A, random_tensor_B, random_tensor_A == random_tensor_B

(tensor([[0.8823, 0.9150, 0.3829, 0.9593],
         [0.3904, 0.6009, 0.2566, 0.7936],
         [0.9408, 0.1332, 0.9346, 0.5936]]),
 tensor([[0.8694, 0.5677, 0.7411, 0.4294],
         [0.8854, 0.5739, 0.2666, 0.6274],
         [0.2696, 0.4414, 0.2969, 0.8317]]),
 tensor([[False, False, False, False],
         [False, False, False, False],
         [False, False, False, False]]))

In [76]:
torch.manual_seed(RANDOM_SEED)
random_tensor_A = torch.rand(3,4)
torch.manual_seed(RANDOM_SEED) # reseed with same seed
random_tensor_B = torch.rand(3,4) # same as A

random_tensor_A, random_tensor_B, random_tensor_A == random_tensor_B

(tensor([[0.8823, 0.9150, 0.3829, 0.9593],
         [0.3904, 0.6009, 0.2566, 0.7936],
         [0.9408, 0.1332, 0.9346, 0.5936]]),
 tensor([[0.8823, 0.9150, 0.3829, 0.9593],
         [0.3904, 0.6009, 0.2566, 0.7936],
         [0.9408, 0.1332, 0.9346, 0.5936]]),
 tensor([[True, True, True, True],
         [True, True, True, True],
         [True, True, True, True]]))
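
An alternative to reseeding the global generator is to pass a dedicated `torch.Generator` to the random function. This is a sketch of that option, not something shown in the course:

```python
import torch

gen_a = torch.Generator().manual_seed(42)
gen_b = torch.Generator().manual_seed(42)

sample_a = torch.rand(3, 4, generator=gen_a)
sample_b = torch.rand(3, 4, generator=gen_b)  # same seed, same values
next_a = torch.rand(3, 4, generator=gen_a)    # the generator has advanced, new values

print(torch.equal(sample_a, sample_b), torch.equal(sample_a, next_a))
```

This keeps the reproducible part of the code independent of other calls that consume numbers from the global generator.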

## Running tensors and PyTorch objects on GPUs for faster computations

### Getting a GPU

1. Easiest - use Google Colab for a free GPU (Colab Pro for faster GPUs)
2. Use your own (requires CUDA / nvidia GPU)
3. Cloud computing

In [77]:
# Check for GPU access
!nvidia-smi

Mon Jul  3 07:37:17 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.12    Driver Version: 525.85.12    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   40C    P8     9W /  70W |      0MiB / 15360MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

In [79]:
# Check for GPU access with PyTorch
torch.cuda.is_available()

True

### Set up device-agnostic code

[PyTorch best practices](https://pytorch.org/docs/stable/notes/cuda.html#best-practices)

In [80]:
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

In [81]:
# count number of devices

torch.cuda.device_count()

1

## Putting tensors (and models) on the GPU for faster computations / tensor operations




In [82]:
# Create a tensor (default is CPU)

tensor = torch.tensor([1,2,3])

print(tensor, tensor.device)

tensor([1, 2, 3]) cpu


In [83]:
# Move tensor to GPU (if available)
tensor_on_gpu = tensor.to(device) # device as defined above; code will work if we only have CPU, too
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')
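
Tensors can also be created directly on the target device, and models are moved with the same `.to(device)` call. A minimal sketch; the `torch.nn.Linear` layer is only a stand-in model for illustration:

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

created_on_device = torch.rand(3, 4, device=device)  # no CPU copy needed first
moved = torch.tensor([1, 2, 3]).to(device)           # created on CPU, then moved
model = torch.nn.Linear(4, 2).to(device)             # moves the layer's parameters

output = model(created_on_device)  # inputs and model must be on the same device
print(created_on_device.device, moved.device, output.shape)
```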

### Moving tensors back to the CPU

Why we might want to do this:

- NumPy only works with data in CPU memory, so tensors on the GPU can't be converted to NumPy directly

In [87]:
try:
  tensor_on_gpu.numpy()
except TypeError: # calling .numpy() on a CUDA tensor raises a TypeError
  print("can't transform a GPU tensor to numpy")

tensor_on_gpu.cpu().numpy() # copy tensor to cpu memory first

can't transform a GPU tensor to numpy


array([1, 2, 3])

In [88]:
tensor_on_cpu = tensor_on_gpu.to('cpu') # same as tensor_on_gpu.cpu()
tensor_on_cpu.numpy()

array([1, 2, 3])