# Introduction to PyTorch
https://pytorch.org/tutorials/beginner/nlp/pytorch_tutorial.html

Notes:
- Deep learning uses __tensors__ to perform computations.

In [1]:
import torch
import numpy as np
import pandas as pd
import torch.autograd as autograd
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

torch.manual_seed(1);

## Creating Tensors
$\vec{v}$ to $\mathcal{T}$

https://en.wikiversity.org/wiki/Tensors/Definitions

https://pytorch.org/docs/stable/tensors.html

__Tensors__ can be constructed using:
1. List of lists
1. `pandas` series
1. `numpy` 1D-arrays
1. List of numbers in `Python`.

In [2]:
# My first lil' tensor :')
torch.tensor(1)

tensor(1)

Create a Tensor from a Python list

In [3]:
data = list(range(0, 10))

In [4]:
vector = torch.tensor(data)

Create a Tensor Matrix from:
1. List of lists
1. List of Series
1. `numpy` matrix
1. `pandas` matrix

In [5]:
# List of lists
lists = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]

torch.tensor(lists)

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [6]:
# List of Series
a = pd.Series(np.arange(0, 10))

torch.tensor((a, a, a))

tensor([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
        [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
        [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]], dtype=torch.int32)

In [7]:
# numpy matrix
matrix_A = np.random.uniform(1, 10, size=(3,3))

torch.tensor(matrix_A)

tensor([[9.3745, 3.2861, 1.7952],
        [1.9264, 9.5940, 6.1870],
        [3.3367, 2.1279, 4.0479]], dtype=torch.float64)

In [8]:
# pandas matrix
matrix_B = pd.DataFrame(
    np.random.randint(low=1,high=100,size=(10,10))
)

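# Passing the DataFrame itself raised an error in this environment:
# torch.as_tensor(matrix_B) >>> TypeError: not a sequence
# torch.tensor(matrix_B) >>> TypeError: not a sequence
# Converting the underlying numpy array (.values) works.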
torch.tensor(matrix_B.values)

tensor([[ 4, 38, 93, 65, 64, 61, 15, 60, 81,  2],
        [80, 39, 16, 30, 63, 20,  2, 73, 37, 17],
        [73, 50, 81, 53, 23, 77, 86, 36, 24,  7],
        [96, 15, 97, 69,  5, 42, 29, 46, 71, 83],
        [59, 65, 89, 48, 49, 73,  1, 49, 43, 72],
        [ 3, 20,  1, 34, 97, 36, 28, 69, 86, 73],
        [80, 98, 17, 22, 73, 99, 19, 64, 41, 61],
        [86, 27, 94, 52,  7, 65, 22, 86, 19, 92],
        [42, 61, 37,  3, 20, 52, 17, 14, 72,  8],
        [96, 40, 13, 20, 54, 85, 20, 44, 66, 22]], dtype=torch.int32)

Create a multi-dimensional Tensor from:

1. List of lists
2. List of Series
3. `numpy` matrix
4. `pandas` matrix

In [9]:
# List of lists
multi_lists = [[[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]],
               [[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]],
               [[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]]]

Tensor = torch.tensor(multi_lists)
Tensor.shape

torch.Size([3, 3, 3])

In [10]:
# List of Series

In [11]:
# numpy matrix/ndarray

In [12]:
# pandas matrix
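
The three cells above were left empty in these notes. A minimal sketch of how they might be filled in, assuming `torch.stack` is an acceptable way to combine equally shaped 2-D pieces into a 3-D tensor. The names `series_matrix`, `from_series`, `from_numpy`, `frames`, and `from_frames` are illustrative only.

```python
import numpy as np
import pandas as pd
import torch

# List of Series: build a matrix from the Series, then stack copies of it.
s = pd.Series(np.arange(0, 3))
series_matrix = torch.tensor(np.array([s.to_numpy(), s.to_numpy(), s.to_numpy()]))
from_series = torch.stack([series_matrix, series_matrix])

# numpy ndarray: a 3-D ndarray converts directly.
from_numpy = torch.tensor(np.arange(24).reshape(2, 3, 4))

# pandas: a DataFrame is 2-D, so convert each frame and stack the results.
frames = [pd.DataFrame(np.arange(6).reshape(2, 3)) for _ in range(4)]
from_frames = torch.stack([torch.tensor(f.to_numpy()) for f in frames])

print(from_series.shape)  # torch.Size([2, 3, 3])
print(from_numpy.shape)   # torch.Size([2, 3, 4])
print(from_frames.shape)  # torch.Size([4, 2, 3])
```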

# Tensor Indexing

What is a multi-dimensional (n-D) tensor?

1. Indexing into a 3-D tensor returns a matrix
2. Indexing into a matrix returns a vector
3. Indexing into a vector returns a scalar
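
A small sketch of that hierarchy, using a made-up tensor `t` (not one of the tensors defined in these notes). Each extra index removes one dimension.

```python
import torch

t = torch.arange(24).reshape(2, 3, 4)  # 3-D tensor: 2 matrices of shape (3, 4)

matrix = t[0]        # one index    -> 2-D tensor (matrix)
row = t[0, 1]        # two indices  -> 1-D tensor (vector)
scalar = t[0, 1, 2]  # three indices -> 0-D tensor (scalar)

print(matrix.shape)   # torch.Size([3, 4])
print(row.shape)      # torch.Size([4])
print(scalar.shape)   # torch.Size([])
print(scalar.item())  # 6
```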


## Indexing into a Vector
Let's look at this visually, starting with the vector we created, $\vec{v}$ `vector`.

In [13]:
print(vector.shape)
print(vector)

torch.Size([10])
tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])


### Vectors, Elements, and Scalars
$\vec{v}$ = \[0, 1, 2, 3, 4, 5, 6, 7, 8, 9\]

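$\vec{v}$ has a shape of (10,): a single dimension holding 10 elements. Written as a column vector in linear algebra, that is _m_ x _n_ = 10 rows, 1 column.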

__Plain English__:

- We have an array with 10 numbers
- As a PyTorch tensor its shape is `torch.Size([10])`; written as a column vector it is 10 rows x 1 column

__Explanation__:

- We're working with a vector with 10 elements.
- The elements are _scalars_ with values 0 - 9.
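
PyTorch distinguishes a 1-D tensor from an explicit column or row. A quick sketch, assuming `unsqueeze` is the preferred way to add the missing dimension (`view` or `reshape` would also work). The names `v`, `column`, and `row` are illustrative.

```python
import torch

v = torch.arange(10)     # 1-D tensor, same values as `vector`
column = v.unsqueeze(1)  # explicit column: 10 rows x 1 column
row = v.unsqueeze(0)     # explicit row: 1 row x 10 columns

print(v.shape, v.ndim)   # torch.Size([10]) 1
print(column.shape)      # torch.Size([10, 1])
print(row.shape)         # torch.Size([1, 10])
```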

> __Elements__ in _linear algebra_ are also called entries, coefficients, or components.
>
> * Elements of a vector
> * Entries of a vector
> * Coefficients of a vector
> * Components of a vector
>
> They all mean the same thing!

> To explain this difference between _elements_ and _scalars_, let's imagine you're sitting down to a nice breakfast with your family. 

>There are pancakes, sausage, toast, yogurt, fruit, bacon, tea, coffee, orange juice, and water. Like a vulture, you circle each platter for the perfect piece and add it to your plate. Once you're finished you take a seat.

> 1. Your _plate is the vector_. It holds your delicious breakfast.
> 2. Each piece of fresh food on your plate is an _element of your vector_; counting the pieces gives the _number of elements_.
> 3. The __names__ of the pieces (e.g. 3 strips of bacon, 2 pieces of fruit, 1 dollop of yogurt, 4 pancakes) are your _scalars_!

>Food for thought: You have a preference, the pieces on your plate are the direct result of weighted decisions _you make_.

> ### Let's set up our vector
> How many _elements_ do we have?
>-  3 (bacon strips)
>- \+ 2 (pieces of fruit)
>- \+ 1 (dollop of yogurt)
>- \+ 4 (pancakes) 
>- = 10 (pieces of food)
 
>Let's weight each food item:
>- Bacon - 6
>- Fruit - 9
>- Yogurt - 10
>- Pancakes - 11

>Let's make our plate:
>
>$\vec{plate}$ = \[6(bacon strip),
6(bacon strip),
6(bacon strip), 9(fruit), 9(fruit), 10(yogurt), 11(pancake), 11(pancake), 11(pancake), 11(pancake)\]

>Awesome. Let's not forget about our computer's breakfast, I mean _vector_:
>
>$\vec{v}$ = \[6, 6, 6, 9, 9, 10, 11, 11, 11, 11\]

> Help your family with the dishes :)

In [14]:
# Indexing into a vector returns a single value, our scalar.
# n is the number of elements in the vector, in our case 10.
# The i-th element is v[i], with i running from 0 to n-1 (zero-based).
# PyTorch returns the scalar wrapped in a 0-dimensional tensor.
print(vector[0])
print(type(vector[0]))

tensor(0)
<class 'torch.Tensor'>


In [15]:
# To get a value, we need to pull the scalar out of the vector.
print(vector[0].item())
type(vector[0].item())

0


int

### Matrices and Vectors

Let's see what our matrix looks like

In [16]:
print(type(matrix_A))
print(matrix_A.ndim, 'dimensions')
print(matrix_A.shape)
print(matrix_A)

<class 'numpy.ndarray'>
2 dimensions
(3, 3)
[[9.37453347 3.2861392  1.79524279]
 [1.92636318 9.59396917 6.18704059]
 [3.33666875 2.1279054  4.04792622]]


## Indexing into the Matrix
- Whether you index into a matrix by row or column, you will get a vector.
> The size of the vector that is returned depends on the shape of the matrix. If a matrix has a shape of (4, 3)
> - Indexing by __row__ yields a vector of size 3 (one value per column).
> - Indexing by __column__ returns a vector of size 4 (one value per row).
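
A short check of the (4, 3) case with a made-up matrix `m`: a row holds one value per column, and a column holds one value per row.

```python
import numpy as np

m = np.arange(12).reshape(4, 3)  # 4 rows, 3 columns

first_row = m[0, :]  # one row: one value per column
first_col = m[:, 0]  # one column: one value per row

print(first_row.shape)  # (3,)
print(first_col.shape)  # (4,)
```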

In [17]:
# Indexing into our matrix using, well, index/bracket notation :D
# Selecting row 0 returns every column value in the first row.
# print(matrix_A[0]) 
print(matrix_A[0,:])  # used explicit index notation
print(matrix_A[0].shape)

[9.37453347 3.2861392  1.79524279]
(3,)


In [18]:
# Selecting column 1 returns the value from every row of the second column.
print(matrix_A[:,1].shape)
print(matrix_A[:,1])

(3,)
[3.2861392  9.59396917 2.1279054 ]


## Indexing into a Tensor

In [19]:
# Print out our Tensor
Tensor

tensor([[[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]],

        [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]],

        [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]])

In [20]:
print(Tensor.ndim)
print(Tensor.shape)

3
torch.Size([3, 3, 3])


In [21]:
# Indexing into a Tensor returns the first Matrix at index 0!
Tensor[0]

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

### Notes

* Tensors can be created from other numerical datatypes.
    * Similar to numpy and pandas, where you can specify the datatype when you create an ndarray or dataframe/series, you can specify the datatype when you create a tensor.
* The most common datatypes used in Tensors are `float` (`torch.float32`) and `long` (`torch.int64`).
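
A minimal sketch of setting and converting datatypes. The variable names are illustrative; `dtype=` and `.long()` are standard PyTorch calls.

```python
import numpy as np
import torch

floats = torch.tensor([1, 2, 3], dtype=torch.float32)          # explicit dtype at creation
longs = torch.tensor([1.0, 2.0, 3.0]).long()                   # convert afterwards
from_np = torch.tensor(np.array([1, 2, 3], dtype=np.float64))  # numpy dtype is preserved

print(floats.dtype)   # torch.float32
print(longs.dtype)    # torch.int64
print(from_np.dtype)  # torch.float64
```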


# Creating Random Tensors

In [22]:
tensor_A = torch.randn(3, 3)

In [23]:
matrix_C = np.random.randn(3, 3)

In [24]:
tensor_A * matrix_C

tensor([[-0.1479,  0.1754, -0.0624],
        [ 1.0879, -0.3806, -0.1507],
        [ 0.6763,  0.1481, -0.4151]], dtype=torch.float64)

# Tensor Operations
https://pytorch.org/docs/stable/torch.html

In [25]:
x = torch.tensor([1, 2, 3])
y = torch.tensor([5., 6., 7.])
z = x + y
print(z)

tensor([ 6.,  8., 10.])
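
The sum above mixes an integer tensor with a float tensor, and the result is a float tensor. A small sketch of that type promotion, using `torch.result_type` to ask which dtype an operation would produce:

```python
import torch

x = torch.tensor([1, 2, 3])     # integers default to torch.int64
y = torch.tensor([5., 6., 7.])  # floats default to torch.float32

print(x.dtype, y.dtype)         # torch.int64 torch.float32
print(torch.result_type(x, y))  # torch.float32
print((x + y).dtype)            # torch.float32
```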


In [26]:
torch.get_default_dtype()

torch.float32

In [27]:
z.numel()

3

In [28]:
torch.from_numpy(np.linspace(1, 10, 20))

tensor([ 1.0000,  1.4737,  1.9474,  2.4211,  2.8947,  3.3684,  3.8421,  4.3158,
         4.7895,  5.2632,  5.7368,  6.2105,  6.6842,  7.1579,  7.6316,  8.1053,
         8.5789,  9.0526,  9.5263, 10.0000], dtype=torch.float64)

In [29]:
torch.linspace(1, 10, 20)

tensor([ 1.0000,  1.4737,  1.9474,  2.4211,  2.8947,  3.3684,  3.8421,  4.3158,
         4.7895,  5.2632,  5.7368,  6.2105,  6.6842,  7.1579,  7.6316,  8.1053,
         8.5789,  9.0526,  9.5263, 10.0000])

In [30]:
torch.arange(1, 10)

tensor([1, 2, 3, 4, 5, 6, 7, 8, 9])

# Tensor Concatenation

`torch.cat` cannot concatenate a `torch` tensor with a `numpy` array; every element must be a tensor.
``` python
torch.cat([tensor_A, matrix_C], 1)
#TypeError: expected Tensor as element 1 in argument 0, but got numpy.ndarray
```

In [31]:
# Using torch.as_tensor() to transform a numpy array into a tensor.
tensor_C = torch.as_tensor(matrix_C)

In [32]:
# Like pd.concat(), you can specify the axis you want to join the
# tensors on (torch.cat also accepts the keyword `dim`).
print("Column Tensor Join")
torch.cat([tensor_A, tensor_C], axis=1)

Column Tensor Join


tensor([[ 0.6614,  0.2669,  0.0617, -0.2236,  0.6571, -1.0125],
        [ 0.6213, -0.4519, -0.1661,  1.7509,  0.8423,  0.9069],
        [-1.5228,  0.3817, -1.0276, -0.4442,  0.3881,  0.4040]],
       dtype=torch.float64)

In [33]:
print("Row Tensor Join")
torch.cat([tensor_A, tensor_C], axis=0)

Row Tensor Join


tensor([[ 0.6614,  0.2669,  0.0617],
        [ 0.6213, -0.4519, -0.1661],
        [-1.5228,  0.3817, -1.0276],
        [-0.2236,  0.6571, -1.0125],
        [ 1.7509,  0.8423,  0.9069],
        [-0.4442,  0.3881,  0.4040]], dtype=torch.float64)

# Reshaping Tensors

In [34]:
# There are 24 elements in this tensor
tensor_x = torch.randn(2, 3, 4)

# .view() is the equivalent of np.reshape().
# Using .view() on a tensor to remove a dimension
tensor_x_view = tensor_x.view(2, 12)

print('Original Tensor N-Dimensions:', tensor_x.ndim)
print('Reshaped Tensor N-Dimensions:', tensor_x_view.ndim)

Original Tensor N-Dimensions: 3
Reshaped Tensor N-Dimensions: 2


In [35]:
# -1 in view will cause torch to infer the dimension.
tensor_x_view2 = tensor_x.view(2, -1)
print("Another way to reshape a tensor:", tensor_x_view2.shape)

Another way to reshape a tensor: torch.Size([2, 12])
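
One detail worth seeing in code: `.view()` does not copy data, so the reshaped tensor shares memory with the original. A sketch with a made-up tensor `t`:

```python
import torch

t = torch.arange(24).reshape(2, 3, 4)

flat = t.view(-1)      # infer the only dimension: 24 elements
pairs = t.view(-1, 2)  # infer the first dimension: 24 / 2 = 12 rows

flat[0] = 100             # writing through the view...
print(t[0, 0, 0].item())  # ...changes the original tensor: 100
print(flat.shape)         # torch.Size([24])
print(pairs.shape)        # torch.Size([12, 2])
```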


# Computation Graphs and Automatic Differentiation

Computation graphs are essential to efficient deep learning programming.
- The programmer doesn't need to write the backpropagation code by hand.

A computation graph shows how your tensors were combined to give you the output.
- It records which parameters were involved in creating a tensor.
- It can be used to calculate derivatives.
> `requires_grad` must be set to `True` in order to calculate derivatives and perform backpropagation.
>
>"...All this output tensor knows is its data and shape. It has no idea that it was the sum of two other tensors (it could have been read in from a file, it could be the result of some other operation, etc.)
>
>If `requires_grad`=`True`, the Tensor object keeps track of how it was created. Lets see it in action."
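
A minimal worked example of a derivative computed from the graph, using a made-up input rather than the tensors in these notes. For $s = \sum_i x_i^2$, the gradient is $\partial s / \partial x_i = 2 x_i$.

```python
import torch

x = torch.tensor([1., 2., 3.], requires_grad=True)

# Forward pass: each operation is recorded in the graph.
y = x ** 2   # y_i = x_i^2
s = y.sum()  # s = sum_i x_i^2

# Backward pass: autograd walks the graph and applies the chain rule.
s.backward()

print(x.grad)  # tensor([2., 4., 6.])
```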

In [36]:
x = torch.tensor([6., 9., 4., 3., 8.], requires_grad=True)
y = torch.tensor([7., 3., 25., 84., 12.], requires_grad=True)
z = x + y
print(z)
print(z.grad_fn)

tensor([13., 12., 29., 87., 20.], grad_fn=<AddBackward0>)
<AddBackward0 object at 0x0000026CC1805700>


In [37]:
s = z.sum()
print(s)
print(s.grad_fn)

tensor(161., grad_fn=<SumBackward0>)
<SumBackward0 object at 0x0000026CC19A2DF0>


In [38]:
# "...note if you run this block multiple times, the gradient will
# increment. That is because Pytorch accumulates the gradient into
# the .grad property, since for many models this is very convenient."
s.backward()
print(x.grad)

tensor([1., 1., 1., 1., 1.])
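
The accumulation mentioned in the comment can be seen with a fresh tensor. In this sketch, `x.grad.zero_()` is one common way to reset the gradient; optimizers offer `zero_grad()` for the same purpose.

```python
import torch

x = torch.tensor([6., 9., 4.], requires_grad=True)

(x * 2).sum().backward()
first = x.grad.clone()
print(first)   # tensor([2., 2., 2.])

(x * 2).sum().backward()  # a second backward pass adds to .grad
second = x.grad.clone()
print(second)  # tensor([4., 4., 4.])

x.grad.zero_()  # reset before the next pass
print(x.grad)   # tensor([0., 0., 0.])
```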


# The Cornerstone Fundamentals
##### \*\*\*Crucial for being a successful programmer in deep learning***
Tensors "know" how they were created when `requires_grad` = `True`

### Tensors without Backpropagation: `requires_grad=False`

In [39]:
a = torch.randn(4, 4)
b = torch.randn(4, 4)
c = a + b

In [40]:
print(f"Does Tensor `a` have grad? {a.requires_grad}")
print(f"Does Tensor `b` have grad? {b.requires_grad}")
print(f"Does Tensor `c` have grad? {c.requires_grad}")

Does Tensor `a` have grad? False
Does Tensor `b` have grad? False
Does Tensor `c` have grad? False


If tensors are created without setting `requires_grad` = True, any tensor computed from these "grad-less" tensors records no history, so autograd cannot compute derivatives for it. No derivatives, no backpropagation.
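
A short sketch of what happens when `backward()` is called on such a tensor. The `failed` flag is only there to make the outcome visible.

```python
import torch

a = torch.randn(4, 4)  # requires_grad defaults to False
b = torch.randn(4, 4)
c = a + b

print(c.grad_fn)  # None: no graph was recorded

try:
    c.sum().backward()
    failed = False
except RuntimeError as err:
    failed = True
    print("backward() failed:", err)
```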

### Tensors _with_ Backpropagation: `requires_grad=True`

In [41]:
# The method `requires_grad_()` sets requires_grad = True in place.
a.requires_grad_()
b.requires_grad_()

tensor([[ 0.4100,  0.4085,  0.2579,  1.0950],
        [-0.5065,  0.0998, -0.6540,  0.7317],
        [-1.4567,  1.6089,  0.0938, -1.2597],
        [ 0.2546, -0.5020, -1.0412,  0.7323]], requires_grad=True)

In [42]:
# Separate cells to show `requires_grad_()` changes the Tensors
# in place.
a

tensor([[ 0.4533,  1.1422,  0.2486, -1.7754],
        [-0.0255, -1.0233, -0.5962, -1.0055],
        [ 0.4285,  1.4761, -1.7869,  1.6103],
        [-0.7040, -0.1853, -0.9962, -0.8313]], requires_grad=True)

In [43]:
b

tensor([[ 0.4100,  0.4085,  0.2579,  1.0950],
        [-0.5065,  0.0998, -0.6540,  0.7317],
        [-1.4567,  1.6089,  0.0938, -1.2597],
        [ 0.2546, -0.5020, -1.0412,  0.7323]], requires_grad=True)

In [44]:
c = a + b
print(c.grad_fn)  # Our Tensors know how they were created!
print(c.requires_grad)

<AddBackward0 object at 0x0000026CBB4B1970>
True


### Removing autograd from Tensors

In [45]:
# Using tensor.detach(), we can pull the values/scalars out of
# a tensor without its 'memory' (the ability to tell how it was created)
new_c = c.detach()

In [46]:
# new_c is not as cool as old 'c', it cannot perform backpropagation.

# "In essence, we have broken the Tensor away from its past history"


print(new_c.requires_grad)
print(new_c.grad_fn)

False
None
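
A sketch of the practical effect, using made-up tensors: a detached tensor acts as a constant during backpropagation, and it still shares storage with the tensor it came from.

```python
import torch

x = torch.tensor([1., 2., 3.], requires_grad=True)
y = x * 2
frozen = y.detach()  # same values, no graph history

# Only the path through `y` contributes to the gradient;
# `frozen` is treated as a constant.
(y + frozen).sum().backward()
print(x.grad)  # tensor([2., 2., 2.])

# The detached tensor still shares storage with `y`.
print(frozen.data_ptr() == y.data_ptr())  # True
```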


In [47]:
# Wrap a code block in torch.no_grad() to turn off gradient tracking
# for every operation performed inside it.
with torch.no_grad():
    print((x * 42).requires_grad)

False


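`torch.no_grad()` only affects the results of operations performed inside the body of the `with` statement. A tensor that already has `requires_grad=True` and is not operated on, changed, or transformed inside the block is left __unchanged__.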
``` python
with torch.no_grad():
    print(x.requires_grad)
```
\>\>\> __True__
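
The same idea side by side, as a small sketch with illustrative names `inside` and `outside`: tracking is off only for results created inside the block and resumes afterwards.

```python
import torch

x = torch.tensor([1., 2., 3.], requires_grad=True)

with torch.no_grad():
    inside = x * 42  # created inside the block: not tracked

outside = x * 42     # created after the block: tracked again

print(x.requires_grad)        # True
print(inside.requires_grad)   # False
print(outside.requires_grad)  # True
```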

# Tests

In [48]:
# torch.tensor?

In [49]:
a = pd.Series(np.arange(0, 10))

# torch.tensor((a, b, c).as_matrix())
# torch.cat((a, b, c))

torch.tensor(a)

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=torch.int32)

``` python
np.ndarray(vector) # raised an error here: np.ndarray() is the low-level constructor and expects a shape, not data.
np.array(vector)
pd.Series(vector)
pd.DataFrame(vector)
```
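
A sketch of going from a tensor back to numpy and pandas, assuming a CPU tensor. `Tensor.numpy()` returns an array that shares memory with the tensor, while `np.array()` makes a copy. The variable names are illustrative.

```python
import numpy as np
import pandas as pd
import torch

vector = torch.arange(10)

as_array = vector.numpy()              # shares memory with the tensor (CPU only)
as_copy = np.array(vector)             # independent copy
as_series = pd.Series(vector.numpy())  # pandas accepts the numpy array

vector[0] = 99
print(as_array[0])     # 99, because the array is a view of the tensor's memory
print(as_copy[0])      # 0, because np.array copied the data
print(len(as_series))  # 10
```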

In [50]:
T = torch.arange(0, 10_000_000).reshape(1000, 100, 100)
M = np.arange(0, 10_000_000).reshape(1000, 100, 100)

In [51]:
%%timeit -r 10 -n 1_000
1 + T

15 ms ± 39.5 µs per loop (mean ± std. dev. of 10 runs, 1000 loops each)


In [52]:
%%timeit -r 10 -n 1_000
1 + M

10.8 ms ± 407 µs per loop (mean ± std. dev. of 10 runs, 1000 loops each)
