
> Why we need to learn PyTorch:
- **Automatic differentiation** is a powerful tool
- PyTorch implements the common functions used in deep learning
- Data processing with PyTorch's `Dataset`
- **Mixed-precision** training in PyTorch (reduces memory usage)

In [1]:
#!pip install torch
import torch
import torch.nn as nn
import torch.nn.functional as F  # functional API (activations, losses, ...)

from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt

import numpy as np

torch.manual_seed(446)
np.random.seed(446)

## Tensors and their relation to ndarray
PyTorch's basic building block, the `tensor`, is similar to NumPy's `ndarray`.

In [2]:
# We create tensors in a similar way to numpy ndarrays
x_numpy = np.array([.1, .2, .3])
x_torch = torch.tensor([.1, .2 ,.3])
print(f'x_numpy = {x_numpy}, \nx_torch = {x_torch}\n')

# We can also do basic operations like +-*/
y_numpy = np.array([3,4,5.])
y_torch = torch.tensor([3,4,5.])
print(f'x+y=\n{x_numpy+y_numpy}\n{x_torch+y_torch}\n')

# Many functions that are in numpy are also in PyTorch
print(f'norm\nnp.linalg.norm(x_numpy) = {np.linalg.norm(x_numpy)}\ntorch.norm(x_torch) = {torch.norm(x_torch)}\n')

# To apply an operation along a dimension,
# we use the dim keyword argument instead of axis
x_numpy = np.array([[1,2],[3,4.]])
x_torch = torch.tensor([[1,2],[3,4.]])
print(f'mean along the 1st(0th) dimension\nnp.mean(numpy, axis=0) = {np.mean(x_numpy, axis=0)},\ntorch.mean(x_torch, dim=0) = {torch.mean(x_torch, dim=0)}')

x_numpy = [0.1 0.2 0.3], 
x_torch = tensor([0.1000, 0.2000, 0.3000])

x+y=
[3.1 4.2 5.3]
tensor([3.1000, 4.2000, 5.3000])

norm
np.linalg.norm(x_numpy) = 0.37416573867739417
torch.norm(x_torch) = 0.37416574358940125

mean along the 1st(0th) dimension
np.mean(numpy, axis=0) = [2. 3.],
torch.mean(x_torch, dim=0) = tensor([2., 3.])
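
The relation between the two types also shows in how they convert into each other. The following is a minimal sketch, assuming CPU tensors; the variable names are chosen only for illustration. `torch.from_numpy` and `Tensor.numpy()` share memory with the source, while `torch.tensor` copies its input.

```python
import numpy as np
import torch

# torch.from_numpy shares memory with the source array (no copy is made)
a_numpy = np.array([1.0, 2.0, 3.0])
a_torch = torch.from_numpy(a_numpy)
a_numpy[0] = 10.0            # the change is visible through the tensor too
print(a_torch)

# torch.tensor always copies its input, so later edits do not propagate
b_torch = torch.tensor(a_numpy)
a_numpy[1] = 20.0
print(b_torch)

# Tensor.numpy() converts a CPU tensor back to an ndarray (again sharing memory)
back_to_numpy = a_torch.numpy()
print(type(back_to_numpy), back_to_numpy.dtype)
```

Sharing memory avoids a copy, but it also means that in-place changes on one side silently affect the other.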


## `tensor.view`
We can use the `tensor.view()` function to reshape tensors, similarly to `numpy.reshape()`.

It can also infer the size of one dimension automatically if `-1` is passed in. This is useful when we work with batches whose size is unknown.


In [3]:
# MNIST-like batch: N images, C channels, 28x28 pixels (real MNIST is grayscale, C = 1)
N, C, W, H = 10000, 3, 28, 28
X = torch.randn((N, C, W, H))

print(f"X.shape = {X.shape}\n")
print(f"X.view(N, C, 784).shape = {X.view(N, C, 784).shape}\n")
print(f'X.view(-1, C, 784).shape = {X.view(-1, C, 784).shape}\n')

X.shape = torch.Size([10000, 3, 28, 28])

X.view(N, C, 784).shape = torch.Size([10000, 3, 784])

X.view(-1, C, 784).shape = torch.Size([10000, 3, 784])
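
Two properties of `view()` are easy to miss: the result shares storage with the original tensor, and it only works when the requested shape is compatible with the tensor's memory layout. The sketch below uses a small tensor chosen for illustration and falls back to `reshape()`, which copies when it has to.

```python
import torch

x = torch.arange(12)          # tensor([0, 1, ..., 11])
v = x.view(3, -1)             # -1 is inferred as 12 / 3 = 4
print(v.shape)

# A view shares storage with the original tensor
v[0, 0] = 100
print(x[0])

# After a transpose the tensor is no longer contiguous, so view() raises;
# reshape() copies the data when it has to
t = v.t()
try:
    t.view(-1)
except RuntimeError as err:
    print("view failed:", type(err).__name__)
flat = t.reshape(-1)
print(flat.shape)
```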



## Broadcasting semantics
Two tensors are **broadcastable** if the following rules hold:

1. Each tensor has at least one dimension.
2. Dimensions are compared **from right to left**, so only the leading (leftmost) dimensions of the lower-dimensional tensor may be missing.
3. When iterating over the dimension sizes, starting at the trailing dimension, each pair of sizes must either be **equal**, have **one of them equal to 1**, or have **one of them missing**.
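
The dimension-matching rules can be sketched as a small shape calculator. `broadcast_shape` is a hypothetical helper written for this example, not a PyTorch function; it covers only the right-to-left comparison (rules 2 and 3) and is checked against the shape PyTorch itself produces.

```python
import torch

def broadcast_shape(shape_a, shape_b):
    """Return the broadcast shape of two shapes, or None if they are incompatible.

    Hypothetical helper for illustration; not part of PyTorch.
    """
    result = []
    # Walk both shapes from the trailing dimension;
    # a missing leading dimension is treated as size 1
    for i in range(1, max(len(shape_a), len(shape_b)) + 1):
        a = shape_a[-i] if i <= len(shape_a) else 1
        b = shape_b[-i] if i <= len(shape_b) else 1
        if a != b and a != 1 and b != 1:
            return None  # sizes differ and neither is 1
        result.append(b if a == 1 else a)
    return tuple(reversed(result))

print(broadcast_shape((3, 2), (2,)))             # (3, 2)
print(broadcast_shape((5, 1, 4, 1), (3, 1, 1)))  # (5, 3, 4, 1)
print(broadcast_shape((3, 2), (3,)))             # None: trailing sizes 2 and 3 clash

# Compare with the shape PyTorch computes for the same inputs
actual = tuple((torch.empty(5, 1, 4, 1) + torch.empty(3, 1, 1)).shape)
print(actual == broadcast_shape((5, 1, 4, 1), (3, 1, 1)))
```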

#### Try 1
We create a matrix $x \in R^{3\times2}$ and a vector $y \in R^{2}$, then compute $x+y$ to see how broadcasting works.

In [9]:
# See how broadcasting works
x = torch.tensor([list(range(3*2))]).view([3,2])
y = torch.ones(2)
print(f'a =\n{x}\nb = \n{y}\na+b=\n{x+y}')

a =
tensor([[0, 1],
        [2, 3],
        [4, 5]])
b = 
tensor([1., 1.])
a+b=
tensor([[1., 2.],
        [3., 4.],
        [5., 6.]])


In the example above, the vector behaves as if it were copied along the first dimension to form a $3\times2$ matrix before the addition.
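
The implicit copy can be made explicit with `Tensor.expand`. This is a minimal sketch that rebuilds the same `x` and `y`; the name `y_expanded` is introduced here for illustration. No data is actually duplicated, since the expanded dimension gets a stride of 0.

```python
import torch

x = torch.arange(6).view(3, 2)
y = torch.ones(2)

# expand() returns a broadcast view of y with shape (3, 2) without copying data
y_expanded = y.expand(3, 2)
print(y_expanded)
print(torch.equal(x + y, x + y_expanded))

# The expanded dimension has stride 0: every row reads the same two values
print(y_expanded.stride())
```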

#### Try 2
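We create a tensor $x \in R^{5\times1\times4\times1}$ and a tensor $y \in R^{3\times1\times1}$ (i.e., $y$ is a third-order tensor). Aligned from the right, the sizes pair up as $(1,1)$, $(4,1)$, $(1,3)$, and $5$ with a missing dimension, so the result has shape $5\times3\times4\times1$.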

In [10]:
# PyTorch operations support NumPy broadcasting semantics
x=torch.empty(5,1,4,1)
y=torch.empty( 3,1,1)
print(f"(x+y).size()={(x+y).size()}")

(x+y).size()=torch.Size([5, 3, 4, 1])


In [6]:
a = torch.tensor(list(range(4*3*2)))
a = a.view([4,3,2,1])
b = torch.tensor(list(range(4*2)))
b = b.view([4,1,2,1])
print(f"(a+b).size()={(a+b).size()}")

(a+b).size()=torch.Size([4, 3, 2, 1])


## Computation graphs
What's special about PyTorch's `tensor` object is that it implicitly creates a computation graph in the background when operations involve tensors with `requires_grad=True`. A computation graph is a way of writing a mathematical expression as a graph. There is an algorithm (reverse-mode automatic differentiation, i.e. backpropagation) that **computes the gradients of all variables of a computation graph in time** of the same order as computing the function itself.


Consider the expression $e=(a+b)*(b+1)$ with values $a=2,\ b=1$. We can draw the evaluated computation graph as

![tree-img](https://colah.github.io/posts/2015-08-Backprop/img/tree-eval.png)

In PyTorch, this expression can be written directly with tensor operations, and autograd records the graph as it is evaluated.
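
A minimal sketch of the expression $e=(a+b)*(b+1)$ with autograd follows. The intermediate names `c` and `d` are chosen here for illustration. Calling `backward()` on `e` fills in the gradients of the leaf tensors `a` and `b`.

```python
import torch

# Leaf tensors: requires_grad=True asks autograd to track operations on them
a = torch.tensor(2.0, requires_grad=True)
b = torch.tensor(1.0, requires_grad=True)

c = a + b      # 3
d = b + 1      # 2
e = c * d      # 6
print(e)       # tensor(6., grad_fn=<MulBackward0>)

# Backpropagate through the recorded graph
e.backward()
print(a.grad)  # de/da = d = 2
print(b.grad)  # de/db = c + d = 5
```

By hand, $\partial e/\partial a = b+1 = 2$ and $\partial e/\partial b = (b+1)+(a+b) = 5$, which matches the values stored in `a.grad` and `b.grad`.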


Sources:

[PyTorch Tutorial (Hung-yi Lee)](https://www.youtube.com/watch?v=kQeezFrNoOg&ab_channel=Hung-yiLee)

[PyTorch official tutorials](https://pytorch.org/tutorials/)