In [5]:
import torch
import torch.nn as nn
import numpy as np 
import matplotlib.pyplot as plt
import time

[Jupyter notebook notes](https://medium.com/game-of-data/12-things-to-know-about-jupyter-notebook-markdown-3f6cef811707/)

# Numpy Vs. PyTorch

A numpy **array** and a pytorch **tensor** can be created in the same way:

In [9]:
n = np.linspace(0,1,5)
t = torch.linspace(0,1,5)

They can be reshaped in similar ways:

In [8]:
n = np.arange(48).reshape(3,4,4)
t = torch.arange(48).reshape(3,4,4)

Most importantly, they follow the same broadcasting rules. To use **pytorch** (and **numpy**) efficiently, one needs a very strong grasp of the **broadcasting rules**.

# General Broadcasting Rules

When operating on two arrays, Numpy compares their shapes element-wise. It starts with the trailing (i.e. rightmost) dimensions and works its way left. Two dimensions are compatible when
1. they are equal, or
2. one of them is 1

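**Example**: The following shapes are compatible. In every position the two sizes are either equal or one of them is 1, and the result takes the larger size:

Shape 1: (1,6,4,1,7,2)

Shape 2: (5,6,1,3,1,2)

Result: (5,6,4,3,7,2)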
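The rule can be sketched in a few lines of plain Python. The helper name `broadcast_shape` is hypothetical, introduced only for this illustration; its result is compared against Numpy's own `np.broadcast_shapes`, which assumes Numpy 1.20 or newer.

```python
import numpy as np

def broadcast_shape(shape1, shape2):
    """Hypothetical helper: apply the broadcasting rule to two shapes."""
    # Pad the shorter shape with ones on the left so both have equal length.
    ndim = max(len(shape1), len(shape2))
    s1 = (1,) * (ndim - len(shape1)) + tuple(shape1)
    s2 = (1,) * (ndim - len(shape2)) + tuple(shape2)

    result = []
    for d1, d2 in zip(s1, s2):
        if d1 == d2 or d1 == 1 or d2 == 1:
            # Compatible: keep the size that is not the stretched 1.
            result.append(d2 if d1 == 1 else d1)
        else:
            raise ValueError(f"incompatible dimensions: {d1} vs {d2}")
    return tuple(result)

shape1 = (1, 6, 4, 1, 7, 2)
shape2 = (5, 6, 1, 3, 1, 2)
print(broadcast_shape(shape1, shape2))
print(np.broadcast_shapes(shape1, shape2))
```

Both calls should report the same shape, while a pair such as `(2, 3)` and `(4, 3)` raises an error because 2 and 4 are neither equal nor 1.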



In [26]:
a = np.ones((6,5))
b = np.arange(5).reshape((1,5))

In [27]:
a+b

array([[1., 2., 3., 4., 5.],
       [1., 2., 3., 4., 5.],
       [1., 2., 3., 4., 5.],
       [1., 2., 3., 4., 5.],
       [1., 2., 3., 4., 5.],
       [1., 2., 3., 4., 5.]])

In [29]:
a = torch.ones((6,5))
b = torch.arange(5).reshape(1,5)

In [None]:
a+b

The arrays/tensors don't even need to have the same number of dimensions. If one of them has fewer dimensions than the other, its shape is padded with ones on the left before the dimensions are compared.

**Example**: Scaling each of the color channels of an image by a different amount:

    Image  (3d array): 256 x 256 x 3
    Scale  (1d array):             3
    Result (3d array): 256 x 256 x 3 

In [30]:
Scale = torch.tensor([0.5,1.5,1])
Image = torch.randn((256,256,3))

In [31]:
Result = Image*Scale

**Example**: One has an array of 2 images and wants to scale the color channels of each image by a slightly different amount:

    Images  (4d array): 2 x 256 x 256 x 3
    Scales  (4d array): 2 x   1 x   1 x 3
    Results (4d array): 2 x 256 x 256 x 3

In [41]:
Images = torch.randn(2,256,256,3)
Scale = torch.tensor([0.5,1.1,0.7,0.6,0.3,0.9]).reshape(2,1,1,3)

In [45]:
Results = Images*Scale
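The reshape to `(2,1,1,3)` matters here: a `(2,3)` tensor of scales would not broadcast against `(2,256,256,3)`, because its 2 would line up against a 256. A minimal sketch of this, where `flat_scales` is a name introduced only for illustration:

```python
import torch

images = torch.randn(2, 256, 256, 3)
flat_scales = torch.tensor([[0.5, 1.1, 0.7],
                            [0.6, 0.3, 0.9]])   # shape (2, 3)

# (2, 3) is padded on the left to (1, 1, 2, 3); 2 vs 256 is incompatible.
try:
    images * flat_scales
    broadcast_failed = False
except RuntimeError:
    broadcast_failed = True

# Inserting two singleton axes gives shape (2, 1, 1, 3), which broadcasts.
scales = flat_scales[:, None, None, :]
results = images * scales
print(broadcast_failed, scales.shape, results.shape)
```

Indexing with `None` is one way to add the singleton axes; `reshape(2,1,1,3)` as used above gives the same tensor.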

# Operations across Dimensions

Reducing across dimensions is fundamental in pytorch. Simple operations can be done on 1-dimensional tensors:



In [48]:
t = torch.tensor([0.5,13,4])
torch.mean(t), torch.std(t), torch.max(t), torch.min(t)

(tensor(5.8333), tensor(6.4485), tensor(13.), tensor(0.5000))

Suppose we have a 2d tensor and want to compute the mean value of each of the columns:



In [57]:
t = torch.arange(20, dtype=float).reshape(5,4)
torch.mean(t, axis=0)

tensor([ 8.,  9., 10., 11.], dtype=torch.float64)
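Reductions combine naturally with the broadcasting rules. As a sketch, passing `keepdim=True` keeps the reduced axis with size 1, so the means can be subtracted from the original tensor directly. Here `dim` is pytorch's native name for the argument that the cell above passes as `axis`.

```python
import torch

t = torch.arange(20, dtype=torch.float64).reshape(5, 4)

# Column means: shape (1, 4) instead of (4,), broadcasts against (5, 4).
col_mean = torch.mean(t, dim=0, keepdim=True)
centered_cols = t - col_mean

# Row means: shape (5, 1). Without keepdim the shape would be (5,),
# which does not broadcast against (5, 4).
row_mean = torch.mean(t, dim=1, keepdim=True)
centered_rows = t - row_mean

print(col_mean.shape, row_mean.shape)
print(torch.mean(centered_cols, dim=0))
print(torch.mean(centered_rows, dim=1))
```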

In [58]:
t = torch.rand(5,256,256,3)

Take the mean value across the batch (size 5):

In [61]:
torch.mean(t,axis=0).shape

torch.Size([256, 256, 3])

In [None]:
# Take the mean across the color channels

torch.mean(t,axis=-1).shape

Take only the maximum color channel value (and get the corresponding indices):

- This is done all the time in segmentation models (i.e. take an image and decide which pixels correspond to, say, a car)

In [62]:
values, indices = torch.max(t,axis=-1)

In [65]:
values.shape

torch.Size([5, 256, 256])
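In a segmentation setting the last axis would hold one score per class rather than color values. A minimal sketch with made-up scores (the names `scores` and `num_classes` and the sizes are assumptions for illustration): the indices returned by `torch.max` act as a per-pixel class map and, barring ties, match `torch.argmax`.

```python
import torch

torch.manual_seed(0)
num_classes = 4
scores = torch.rand(5, 64, 64, num_classes)   # (batch, height, width, class)

# Reduce over the class axis: best score and the class index that achieved it.
best_score, class_map = torch.max(scores, dim=-1)

print(best_score.shape, class_map.shape)
print(class_map.dtype)
print(torch.equal(class_map, torch.argmax(scores, dim=-1)))
```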

# Pytorch vs Numpy Differences

**Pytorch** starts to differ from numpy in that it can automatically compute the gradients of operations:






In [69]:
x = torch.tensor([[5.,8.],[4.,6.]], requires_grad=True)
y = x.pow(3).sum()
y

tensor(917., grad_fn=<SumBackward0>)

In [70]:
y.backward()  # compute dy/dx by backpropagation
x.grad        # gradient of y with respect to x, here 3*x**2

tensor([[ 75., 192.],
        [ 48., 108.]])

The automatic computation of gradients is the backbone of training deep learning models. For a network with many layers, deriving the gradient by hand is impractical, so automatic computation is essential. In general, if one has y = F(x),

then pytorch can compute dy/dx_i for each element x_i of the vector x. In the context of machine learning, x contains all the weights (parameters) of the neural network and y is the ***Loss Function*** of the neural network.
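As a sanity check, the gradient from the cell above can be compared with the hand-derived formula dy/dx_i = 3*x_i**2. The second half is a toy setup, not a real network: the names `w`, `inputs` and `targets` and their values are assumptions chosen for illustration, with `w` playing the role of x and `loss` the role of y.

```python
import torch

# Same function as above: y = sum(x_i^3), so dy/dx_i = 3 * x_i^2.
x = torch.tensor([[5., 8.], [4., 6.]], requires_grad=True)
y = x.pow(3).sum()
y.backward()

analytic = 3 * x.detach() ** 2
print(torch.allclose(x.grad, analytic))

# Toy "model": a linear map with parameters w and a mean squared error loss.
w = torch.zeros(3, requires_grad=True)
inputs = torch.tensor([[1., 2., 3.], [4., 5., 6.]])
targets = torch.tensor([1., 2.])
loss = ((inputs @ w - targets) ** 2).mean()
loss.backward()

print(w.grad)   # one gradient entry per parameter
```

An optimizer would then use `w.grad` to update `w` in the direction that lowers the loss.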

# Additional benefits 

***In addition, large matrix multiplication problems are typically faster with torch tensors than with numpy arrays, especially when running on a GPU***
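One rough way to check this on a given machine is to time the same product in both libraries. The following is only a sketch: the size `n` is an arbitrary choice, a single run is a noisy measurement, and the first GPU call includes start-up overhead, so the numbers are indicative at best.

```python
import time
import numpy as np
import torch

n = 500  # arbitrary size, kept small so the example runs quickly
A_np = np.random.rand(n, n).astype(np.float32)
B_np = np.random.rand(n, n).astype(np.float32)

# Time the numpy product.
start = time.perf_counter()
C_np = A_np @ B_np
numpy_seconds = time.perf_counter() - start

# Same data as torch tensors, moved to the GPU when one is available.
device = "cuda" if torch.cuda.is_available() else "cpu"
A_t = torch.from_numpy(A_np).to(device)
B_t = torch.from_numpy(B_np).to(device)

start = time.perf_counter()
C_t = A_t @ B_t
if device == "cuda":
    torch.cuda.synchronize()  # GPU kernels run asynchronously; wait for completion
torch_seconds = time.perf_counter() - start

print(f"numpy: {numpy_seconds:.4f}s, torch ({device}): {torch_seconds:.4f}s")
```

For a more reliable comparison one would repeat each measurement several times and discard the first run.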



In [None]:
A = torch.rand((1000,1000))