## Data Manipulation
- We need to understand how to manage n-dimensional tensors before diving into deep learning
- Why use PyTorch / TensorFlow instead of NumPy?
  - Their tensors can leverage GPUs to accelerate numerical computation, whereas NumPy runs only on CPUs

In [7]:
import torch
x = torch.arange(12, dtype=torch.float32)
x, x.numel(), x.shape

(tensor([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11.]),
 12,
 torch.Size([12]))

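- The `reshape` method restructures the tensor into the requested shape without changing its values
- The same result can be achieved with the tensor method `view()`, e.g. `x.view(3, -1)`, where `-1` tells PyTorch to infer that dimension
- We can find the dimensions using the `size()` method or the `shape` attribute of the tensor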

In [40]:
id_before_reshape = id(x)
x_reshape = x.reshape(3, 4)
x_view = x.view(3, -1)
x_reshape, x_reshape.shape, x_view, x_view.size()

(tensor([[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.]]),
 torch.Size([3, 4]),
 tensor([[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.]]),
 torch.Size([3, 4]))
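
The cell above gives the same result for `reshape` and `view`, but the two are not always interchangeable. The sketch below is an illustration under the assumption of a contiguous starting tensor; it uses `data_ptr()` rather than `id()` to check whether two tensors share memory, and shows a case where `view` refuses to work while `reshape` falls back to a copy.

```python
import torch

x = torch.arange(12, dtype=torch.float32)

# On a contiguous tensor, both calls return views over the same storage.
x_reshape = x.reshape(3, 4)
x_view = x.view(3, -1)
print(x.data_ptr() == x_reshape.data_ptr() == x_view.data_ptr())

# Writing through the view is therefore visible in the original tensor.
x_view[0, 0] = 100.0
print(x[0])

# A transposed tensor is not contiguous, so view() cannot flatten it,
# while reshape() falls back to making a copy.
xt = x_reshape.t()
try:
    xt.view(12)
    view_failed = False
except RuntimeError:
    view_failed = True
flat_copy = xt.reshape(12)
print(view_failed, flat_copy.data_ptr() == x.data_ptr())
```

A practical takeaway: `view` never copies, so it is the stricter choice when shared memory matters, whereas `reshape` works in more cases at the cost of sometimes copying.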

- `torch.zeros` and `torch.ones` create a tensor of the desired shape filled with zeros or ones
- `torch.zeros_like` and `torch.ones_like` create a tensor with the same shape as another tensor

In [44]:
A = torch.zeros((2,3))
B = torch.ones((1,3))
C = torch.ones_like(A)
A, B, C

(tensor([[0., 0., 0.],
         [0., 0., 0.]]),
 tensor([[1., 1., 1.]]),
 tensor([[1., 1., 1.],
         [1., 1., 1.]]))
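
Besides the shape, the `*_like` functions also copy the dtype of the reference tensor unless told otherwise. A small sketch, where the integer tensor `ints` is just an illustrative input:

```python
import torch

ints = torch.arange(6).reshape(2, 3)             # dtype is torch.int64
Z = torch.zeros_like(ints)                       # same shape and dtype as ints
O = torch.ones_like(ints, dtype=torch.float32)   # same shape, dtype overridden

print(Z.shape, Z.dtype)
print(O.shape, O.dtype)
```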

- We can also initialize tensors with `torch.randn`, which samples each value independently from a standard normal distribution (`mean = 0`, `sd = 1`)

In [47]:
X = torch.randn((2,3,4))
X

tensor([[[-1.0703, -0.6006,  0.6795,  0.7969],
         [-0.6514, -0.6167,  0.3797, -0.7482],
         [ 0.6418,  0.2202,  0.2801,  0.2707]],

        [[-1.1957, -1.7886, -0.0371,  0.7337],
         [-0.8295,  1.0147, -0.8393,  0.2184],
         [ 0.4706,  0.1834,  0.1571, -1.5279]]])
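
With only 24 values it is hard to see the distribution, so here is a rough check on a larger sample. The seed and sample size are arbitrary choices for illustration; the last lines show the usual shift-and-scale trick for getting a different mean and standard deviation.

```python
import torch

torch.manual_seed(0)  # fixed seed so the run is repeatable

# The sample statistics of a large draw should be close to mean 0 and sd 1.
samples = torch.randn(100_000)
mean, std = samples.mean().item(), samples.std().item()
print(mean, std)

# Scaling by 2 and shifting by 5 gives samples with mean about 5 and sd about 2.
shifted = 5.0 + 2.0 * torch.randn(100_000)
print(shifted.mean().item(), shifted.std().item())
```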

In [50]:
# Standard mathematical operators and functions such as torch.exp are applied elementwise
X_exp = torch.exp(X)
x = torch.tensor([1.0, 2, 4, 8])
y = torch.tensor([2, 2, 2, 2])
# These are all elementwise operations; linear algebra operations such as matrix multiplication are also available
X_exp, x + y, x - y, x * y, x / y, x ** y

(tensor([[[0.3429, 0.5485, 1.9729, 2.2187],
          [0.5213, 0.5397, 1.4618, 0.4732],
          [1.8999, 1.2463, 1.3233, 1.3109]],
 
         [[0.3025, 0.1672, 0.9636, 2.0828],
          [0.4363, 2.7586, 0.4320, 1.2441],
          [1.6010, 1.2013, 1.1702, 0.2170]]]),
 tensor([ 3.,  4.,  6., 10.]),
 tensor([-1.,  0.,  2.,  6.]),
 tensor([ 2.,  4.,  8., 16.]),
 tensor([0.5000, 1.0000, 2.0000, 4.0000]),
 tensor([ 1.,  4., 16., 64.]))
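
Note that `x` and `y` above have different dtypes, yet every result is a float tensor. A short sketch of the type promotion involved, assuming PyTorch's default float dtype (`float32`) has not been changed:

```python
import torch

x = torch.tensor([1.0, 2, 4, 8])   # float32, because one literal is a float
y = torch.tensor([2, 2, 2, 2])     # int64, because all literals are ints

print(x.dtype, y.dtype)
print((x + y).dtype)    # mixed float/int arithmetic is promoted to float
print((y / y).dtype)    # true division of integer tensors yields floats
print((y // y).dtype)   # floor division keeps the integer dtype
```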

In [52]:
X = torch.arange(12, dtype=torch.float32).reshape((3,4))
Y = torch.tensor([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])
X == Y

tensor([[False,  True, False,  True],
        [False, False, False, False],
        [False, False, False, False]])
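
The comparison returns a boolean tensor, which is useful as a mask. A minimal sketch reusing the same `X` and `Y`; the names `mask`, `num_equal` and `equal_values` are just illustrative:

```python
import torch

X = torch.arange(12, dtype=torch.float32).reshape((3, 4))
Y = torch.tensor([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])

mask = X == Y                   # boolean tensor with the same shape as X and Y
num_equal = mask.sum().item()   # True counts as 1 when summed
equal_values = X[mask]          # boolean indexing keeps only the matching elements
print(mask.dtype, num_equal, equal_values)
```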

### Broadcasting
- Two tensors are “broadcastable” if the following rules hold:
  - Each tensor has at least one dimension.
  - When iterating over the dimension sizes, starting at the trailing dimension, the dimension sizes must either be equal, one of them is 1, or one of them does not exist
- Broadcasting works according to the following two-step procedure:
  - Expand one or both tensors by copying elements along axes of length 1 so that, after this transformation, the two tensors have the same shape
  - Perform an elementwise operation on the resulting tensors
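
The compatibility rule can be written down as a few lines of plain Python. `broadcast_shape` below is a hypothetical helper written for these notes, not part of PyTorch; it only works on shape tuples and treats a missing dimension like a dimension of size 1.

```python
from itertools import zip_longest


def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError if incompatible."""
    result = []
    # Walk both shapes from the trailing dimension; a missing dimension acts like size 1.
    for da, db in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if da == db or db == 1:
            result.append(da)
        elif da == 1:
            result.append(db)
        else:
            raise ValueError(f"cannot broadcast {shape_a} with {shape_b}")
    return tuple(reversed(result))


print(broadcast_shape((3, 1), (1, 2)))      # (3, 2)
print(broadcast_shape((3, 2, 2), (1, 2)))   # (3, 2, 2)
```

Recent PyTorch releases also provide `torch.broadcast_shapes`, which performs the same computation on real shapes.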

In [53]:
a = torch.arange(3).reshape((3, 1)) # 3 x 1
b = torch.arange(2).reshape((1, 2)) # 1 x 2
a, b

(tensor([[0],
         [1],
         [2]]),
 tensor([[0, 1]]))

In [78]:
a + b

tensor([[0, 1],
        [1, 2],
        [2, 3]])

In [59]:
a_broadcasted_internally = torch.tensor([[0,0], [1,1], [2,2]])
b_broadcasted_internally = torch.tensor([[0,1], [0,1], [0,1]])
a_broadcasted_internally + b_broadcasted_internally

tensor([[0, 1],
        [1, 2],
        [2, 3]])
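
Instead of typing the expanded tensors by hand, we can ask PyTorch for them. The sketch below uses `torch.broadcast_tensors`; the expanded results are views over the original data (stride 0 along the stretched axis), so nothing is actually copied.

```python
import torch

a = torch.arange(3).reshape((3, 1))  # 3 x 1
b = torch.arange(2).reshape((1, 2))  # 1 x 2

# Both outputs have the common broadcast shape 3 x 2.
a_exp, b_exp = torch.broadcast_tensors(a, b)
print(a_exp)
print(b_exp)

# A stride of 0 means the same element is reused along that axis.
print(a_exp.stride(), b_exp.stride())

# Adding the expanded tensors matches the implicit broadcast.
print(torch.equal(a_exp + b_exp, a + b))
```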

### We can save memory using `Y += X` instead of `Y = X + Y`

In [70]:
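X, Y = torch.randn((2,3))  # unpacking along the first axis gives two tensors of shape (3,)
before_id = id(Y)
Y = X + Y  # allocates a new tensor and rebinds Y, so id(Y) changes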
before_id, id(Y), Y

(5276189424, 5275877008, tensor([-0.8869,  1.1944,  1.5369]))

In [71]:
before_id = id(Y)
Y += X  # updates Y in place, so id(Y) stays the same
before_id, id(Y), Y

(5275877008, 5275877008, tensor([-0.9505,  2.0809,  2.3685]))
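
Two other ways to write into an existing tensor are slice assignment and the `out=` argument. The sketch below compares `data_ptr()` values, which point at the underlying memory, rather than Python object ids; the input values are arbitrary.

```python
import torch

X = torch.ones(3)
Y = torch.arange(3, dtype=torch.float32)
ptr_before = Y.data_ptr()

# Slice assignment copies the result into Y's existing memory
# (the temporary X + Y is still allocated before the copy).
Y[:] = X + Y
same_after_slice = Y.data_ptr() == ptr_before

# out= asks the operation to write its result directly into Y.
torch.add(X, Y, out=Y)
same_after_out = Y.data_ptr() == ptr_before

# A plain expression allocates fresh memory for the result.
Y_new = X + Y
print(same_after_slice, same_after_out, Y_new.data_ptr() == ptr_before)
```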

### Exercise
- Run the code in this section. Change the conditional statement `X == Y` to `X < Y` or `X > Y`, and then see what kind of tensor you can get.
- Replace the two tensors that operate by element in the broadcasting mechanism with other shapes, e.g., 3-dimensional tensors. Is the result the same as expected?

In [74]:
X = torch.arange(12, dtype=torch.float32).reshape((3,4))
Y = torch.tensor([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])
X, Y, X > Y # elementwise comparison: True where the element of X is greater than the corresponding element of Y

(tensor([[ 0.,  1.,  2.,  3.],
         [ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.]]),
 tensor([[2., 1., 4., 3.],
         [1., 2., 3., 4.],
         [4., 3., 2., 1.]]),
 tensor([[False, False, False, False],
         [ True,  True,  True,  True],
         [ True,  True,  True,  True]]))

3 x 2 x 2<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;1 x 2<br>
From the broadcasting rules: the last dimensions match (2 and 2), the second-to-last dimension of the second tensor is 1, and its first dimension does not exist, hence the two tensors can be broadcast together

In [81]:
a = torch.arange(12).reshape((3, 2, 2)) # 3 x 2 x 2
b = torch.arange(2).reshape((1, 2)) # 1 x 2
a, b

(tensor([[[ 0,  1],
          [ 2,  3]],
 
         [[ 4,  5],
          [ 6,  7]],
 
         [[ 8,  9],
          [10, 11]]]),
 tensor([[0, 1]]))
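
To finish the exercise, we can add the two tensors and check that the result has the expected shape. This is a short sketch using the same `a` and `b` as above; `b` is treated as a row `[0, 1]` that gets added to every row of every 2 x 2 block of `a`.

```python
import torch

a = torch.arange(12).reshape((3, 2, 2))  # 3 x 2 x 2
b = torch.arange(2).reshape((1, 2))      # 1 x 2

c = a + b
print(c.shape)  # torch.Size([3, 2, 2])
print(c)
```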

## Data Preprocessing