<a href="https://colab.research.google.com/github/kenhuangsy/General-Deep-Learning/blob/main/Data_Manipulation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Basics

In [1]:
import torch

Let's get started by understanding what a tensor is. A tensor is simply an array of numerical values with one or more axes. A tensor with one axis is called a vector, and a tensor with two axes is called a matrix.

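One of the most basic functions for creating new tensors is torch.arange(). Calling torch.arange(n) returns the evenly spaced values 0, 1, ..., n - 1. We can then inspect the total number of elements with numel() and the length along each axis with shape.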

In [2]:
x = torch.arange(12, dtype = torch.float32)
x

tensor([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11.])

In [3]:
x.numel()

12

In [4]:
x.shape

torch.Size([12])

In [5]:
X = x.reshape(3, 4)
X

tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.]])

In [6]:
X = x.reshape(-1, 4)
X

tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.]])

In [7]:
X = x.reshape(3, -1)
X

tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.]])
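The three reshape calls above give the same result because -1 asks PyTorch to infer that dimension from the number of elements. The following is a minimal sketch of that behavior; the variable names (flat, by_cols, by_rows) are made up for illustration.

```python
import torch

flat = torch.arange(12, dtype=torch.float32)

by_cols = flat.reshape(-1, 4)   # 12 elements / 4 columns -> 3 rows inferred
by_rows = flat.reshape(3, -1)   # 12 elements / 3 rows -> 4 columns inferred
print(by_cols.shape, by_rows.shape)

# The inferred size must divide the element count evenly,
# otherwise PyTorch raises a RuntimeError.
try:
    flat.reshape(5, -1)
    reshape_failed = False
except RuntimeError:
    reshape_failed = True
print(reshape_failed)
```

Only one dimension can be left as -1, since otherwise the shape would be ambiguous.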

Practitioners often need to work with tensors initialized with all zeros or ones.

In [8]:
torch.zeros((2, 3, 4))

tensor([[[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]],

        [[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]]])

In [9]:
torch.ones((2, 3, 4))

tensor([[[1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.]],

        [[1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.]]])

torch.randn() draws each element independently from a standard normal distribution (mean 0, standard deviation 1). Random values are useful for initializing the weights of a neural network because they break symmetry. If every weight in a layer started at the same value, every neuron in that layer would compute the same output and receive the same gradient update, so the layer could never learn more than one distinct feature. Starting from random weights lets the neurons diverge during training, although it does not by itself guarantee that training reaches a global minimum.

In [10]:
torch.randn(3, 4)

tensor([[ 0.4378, -2.5085,  1.8293, -1.6031],
        [-0.2617,  1.1642, -1.7660,  0.8106],
        [-0.9764, -0.6006,  0.2010,  1.6216]])
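To make the symmetry argument concrete, here is a small sketch that compares a layer with identical weights against one with random weights. The layer sizes, the tanh activation, and all variable names are assumptions chosen for illustration, not part of the original notebook.

```python
import torch

torch.manual_seed(0)                      # fixed seed so the run is repeatable
inputs = torch.randn(5, 3)                # 5 samples with 3 features each

W_const = torch.full((3, 4), 0.5)         # 4 neurons, every weight identical
W_rand = torch.randn(3, 4)                # 4 neurons, weights drawn from N(0, 1)

h_const = torch.tanh(inputs @ W_const)    # outputs of the 4 neurons, shape (5, 4)
h_rand = torch.tanh(inputs @ W_rand)

# Compare every neuron's output column against the first neuron's column.
const_units_identical = torch.allclose(h_const, h_const[:, :1].expand_as(h_const))
rand_units_identical = torch.allclose(h_rand, h_rand[:, :1].expand_as(h_rand))
print(const_units_identical, rand_units_identical)
```

With constant weights all four neurons produce the same column of outputs, so they are redundant. With random weights each neuron responds differently to the same inputs.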

In [11]:
torch.tensor([[1,2,3,4],
              [5,6,7,8],
              [9,10,11,12]])

tensor([[ 1,  2,  3,  4],
        [ 5,  6,  7,  8],
        [ 9, 10, 11, 12]])

## Indexing and Slicing

In [13]:
X

tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.]])

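We can access a range of indices by slicing a tensor. The syntax is X[start:stop], where start is included and stop is excluded. Negative indices count from the end, so X[-1] selects the last row.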

In [17]:
X[-1]

tensor([ 8.,  9., 10., 11.])

In [18]:
X[1: 3]

tensor([[ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.]])
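Slicing also works along the other axes and accepts a step. A short sketch, using a fresh tensor M (a name introduced here so that X is left untouched):

```python
import torch

M = torch.arange(12, dtype=torch.float32).reshape(3, 4)

last_row = M[-1]           # negative index counts from the end
middle_cols = M[:, 1:3]    # all rows, columns 1 and 2 (stop index 3 is excluded)
every_other_row = M[::2]   # rows 0 and 2, using a step of 2

print(last_row)
print(middle_cols)
print(every_other_row.shape)
```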

We can also assign specific values to elements or slices of a tensor. Note that tensors, like Python lists, are zero-indexed.

In [19]:
X

tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.]])

In [22]:
X[1, 2] = 17
X

tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5., 17.,  7.],
        [ 8.,  9., 10., 11.]])

In [24]:
X[:2, :] = 12
X

tensor([[12., 12., 12., 12.],
        [12., 12., 12., 12.],
        [ 8.,  9., 10., 11.]])
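The assignments above also changed x, because X was created with x.reshape and, for a contiguous tensor like this one, reshape returns a view that shares memory with the original. This is why x no longer holds the values 0 to 11 when it is displayed in the next section. A minimal sketch of the same effect, with the hypothetical names base and grid:

```python
import torch

base = torch.arange(6, dtype=torch.float32)
grid = base.reshape(2, 3)     # a view onto the same memory in this case

grid[0, :] = -1               # write through the view
shares_memory = base.data_ptr() == grid.data_ptr()

print(base)                   # the first three entries of base changed too
print(shares_memory)

independent = base.clone().reshape(2, 3)   # clone() first to get a separate copy
independent[1, :] = 100
print(base)                   # unchanged by the write to the copy
```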

## Operations

Some of the most useful operations are the elementwise operations. These apply a standard scalar operation to each element of a tensor.

In [26]:
x

tensor([12., 12., 12., 12., 12., 12., 12., 12.,  8.,  9., 10., 11.])

In [28]:
# Apply e^x to every element of the tensor
torch.exp(x)

tensor([162754.7969, 162754.7969, 162754.7969, 162754.7969, 162754.7969,
        162754.7969, 162754.7969, 162754.7969,   2980.9580,   8103.0840,
         22026.4648,  59874.1406])

We can also apply binary scalar operators elementwise. What does this mean? Given two vectors u and v of the same shape and a binary operator f, we produce a new vector c by applying f to each pair of corresponding elements, so that c[i] = f(u[i], v[i]).

In [30]:
u = torch.tensor([1, 2, 3, 4])
v = torch.tensor([5, 6, 7, 8])
c = u + v
c

tensor([ 6,  8, 10, 12])

In [31]:
c = u * v
c

tensor([ 5, 12, 21, 32])

In [32]:
c = u-v
c

tensor([-4, -4, -4, -4])

In [34]:
c = u/v
c

tensor([0.2000, 0.3333, 0.4286, 0.5000])
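Other arithmetic operators follow the same elementwise pattern. The sketch below adds exponentiation and also checks the result types, since dividing two integer tensors with / produces a floating-point tensor (this assumes PyTorch's default floating-point type has not been changed).

```python
import torch

u = torch.tensor([1, 2, 3, 4])
v = torch.tensor([5, 6, 7, 8])

total = u + v        # stays an integer tensor
power = u ** 2       # each element squared
quotient = u / v     # true division promotes to floating point

print(total.dtype, quotient.dtype)
print(power)
```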

We can also concatenate multiple tensors, stacking them end to end to form a larger tensor. We just need to provide a list (or tuple) of tensors and tell PyTorch along which axis to concatenate. The tensors must have the same size along every other axis.

Why is concatenation useful?
In a neural network, for example, concatenating multiple feature maps along the channel dimension can allow for increased model capacity and the ability to learn more complex representations of the input data. Additionally, concatenating multiple input sequences or feature vectors can be useful for tasks such as machine translation or image captioning.

In [42]:
X = torch.arange(12, dtype = torch.float32).reshape((3,4))
Y = torch.tensor([[2.0, 1, 4, 3], 
                  [1, 2, 3, 4],
                  [4, 3, 2, 1]])
display(X)
display(Y)

tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.]])

tensor([[2., 1., 4., 3.],
        [1., 2., 3., 4.],
        [4., 3., 2., 1.]])

In [43]:
concat_horizontal_stack = torch.cat((X, Y), dim=1)  # join along columns
concat_vertical_stack = torch.cat((X, Y), dim=0)    # join along rows
display(concat_horizontal_stack)
display(concat_vertical_stack)

tensor([[ 0.,  1.,  2.,  3.,  2.,  1.,  4.,  3.],
        [ 4.,  5.,  6.,  7.,  1.,  2.,  3.,  4.],
        [ 8.,  9., 10., 11.,  4.,  3.,  2.,  1.]])

tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.],
        [ 2.,  1.,  4.,  3.],
        [ 1.,  2.,  3.,  4.],
        [ 4.,  3.,  2.,  1.]])
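As a rough illustration of the feature-map case mentioned earlier, the sketch below concatenates two batches of feature maps along the channel dimension. The (batch, channels, height, width) layout and the sizes are assumptions chosen for the example.

```python
import torch

maps_a = torch.randn(1, 3, 4, 4)   # 3 channels of 4x4 features
maps_b = torch.randn(1, 5, 4, 4)   # 5 channels of 4x4 features

merged = torch.cat((maps_a, maps_b), dim=1)   # channels add up: 3 + 5 = 8
print(merged.shape)

# Every dimension other than the one being concatenated must match.
mismatched = torch.randn(1, 5, 2, 2)
try:
    torch.cat((maps_a, mismatched), dim=1)
    cat_failed = False
except RuntimeError:
    cat_failed = True
print(cat_failed)
```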

We can compare two tensors elementwise through logical statements such as == and >. The result is a boolean tensor of the same shape.

In [44]:
X == Y

tensor([[False,  True, False,  True],
        [False, False, False, False],
        [False, False, False, False]])

In [46]:
X > Y

tensor([[False, False, False, False],
        [ True,  True,  True,  True],
        [ True,  True,  True,  True]])

We can use .sum() to add up all the elements of a tensor, which yields a tensor with only one element.

In [45]:
X.sum()

tensor(66.)
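Comparisons and sums combine naturally: summing a boolean tensor counts its True entries, and sum also accepts a dim argument to reduce along a single axis. A small sketch using the same X and Y values as above:

```python
import torch

X = torch.arange(12, dtype=torch.float32).reshape(3, 4)
Y = torch.tensor([[2.0, 1, 4, 3],
                  [1, 2, 3, 4],
                  [4, 3, 2, 1]])

greater = X > Y                  # boolean tensor, same shape as X
num_greater = greater.sum()      # True counts as 1, False as 0
column_sums = X.sum(dim=0)       # collapse the rows, one sum per column

print(num_greater)
print(column_sums)
```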

## Broadcasting

Sometimes the shapes of two tensors differ, but we still want to perform an elementwise binary operation such as addition or multiplication. Broadcasting is the mechanism that makes this possible: under certain conditions, PyTorch virtually expands the smaller tensor so that both operands have the same shape, and then applies the operation elementwise.

In [50]:
u = torch.randn(2, 3)
v = torch.randn(3)
display(u)
display(v)


tensor([[-1.0341, -0.8639, -0.8430],
        [-1.8832,  0.3893, -0.7829]])

tensor([-0.0814,  0.0213, -0.0141])

In [51]:
c = u + v
c

tensor([[-1.1156, -0.8426, -0.8571],
        [-1.9647,  0.4106, -0.7969]])

In the example, we have a tensor u with shape (2, 3) and a tensor v with shape (3,). PyTorch compares shapes starting from the last dimension: the last dimension of u is 3 and the only dimension of v is also 3, so they match. PyTorch then treats v as if it had shape (1, 3) and repeats it along the first dimension to match u. This lets us add the two tensors elementwise. The resulting tensor c has the same shape as u, which is (2, 3).
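More generally, two dimensions are compatible for broadcasting when they are equal or when one of them is 1. The sketch below shows a case where both operands are expanded, and a case where the shapes are incompatible. The names col, row, and grid are made up for this example.

```python
import torch

col = torch.arange(3).reshape(3, 1)   # shape (3, 1)
row = torch.arange(2).reshape(1, 2)   # shape (1, 2)

grid = col + row                      # both are expanded to shape (3, 2)
print(grid)

# Shapes (2, 3) and (2,) do not line up: the last dimensions are 3 and 2.
try:
    torch.ones(2, 3) + torch.ones(2)
    broadcast_failed = False
except RuntimeError:
    broadcast_failed = True
print(broadcast_failed)
```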

## Saving Memory
This part is really important if you want to create your own machine learning model.

In [53]:
display(u)
display(v)

tensor([[-1.0341, -0.8639, -0.8430],
        [-1.8832,  0.3893, -0.7829]])

tensor([-0.0814,  0.0213, -0.0141])

In [54]:
before_id = id(u)
u = u + v
after_id = id(u)
before_id == after_id

False

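Notice that id(u) changes after the elementwise operation: u = u + v allocates a new tensor for the result and then rebinds the name u to it. This is undesirable for two reasons. First, we allocate memory unnecessarily on every update, which adds up quickly when tensors are large and updated often. Second, any other variable that still refers to the old tensor keeps the old values. This is not a memory leak in the strict sense, since Python frees the old tensor once nothing references it, but the extra allocations still cost time and memory. So how do we fix this? Fortunately, the fix is simple: use an in-place operation.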

In [55]:
before_id = id(u)
u += v
after_id = id(u)
before_id == after_id

True
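Besides +=, PyTorch offers a few other ways to write a result into existing memory. The following sketch shows the stale-reference problem and then several in-place forms; the variable names (weights, alias, params, step) are made up, and data_ptr() is used here to check that the underlying storage address stays the same.

```python
import torch

step = torch.full((3,), 0.5)

# Rebinding: a new tensor is created, so an older reference keeps the old values.
weights = torch.ones(2, 3)
alias = weights
weights = weights + step
alias_is_stale = alias is not weights      # the two names now refer to different tensors

# In-place updates: each line below reuses the storage of params.
params = torch.ones(2, 3)
ptr_before = params.data_ptr()
params += step                             # augmented assignment
params[:] = params + step                  # copy the result into the existing buffer
params.add_(step)                          # a trailing underscore marks an in-place method
torch.add(params, step, out=params)        # write the result into an existing tensor
same_storage = params.data_ptr() == ptr_before

print(alias_is_stale, same_storage)
print(params)
```

Note that params[:] = params + step still builds a temporary tensor for the right-hand side before copying it over, so the += or add_ forms are usually the better choice when the goal is to avoid allocations.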

## Conversion to other Python objects

In [56]:
A = u.numpy()
B = torch.from_numpy(A)
display(type(A))
display(type(B))

numpy.ndarray

torch.Tensor
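One detail worth knowing is that, for tensors on the CPU, numpy() and torch.from_numpy() do not copy the data: the tensor and the array share the same memory. The sketch below illustrates this and also shows item() and tolist() for converting to plain Python values. It assumes NumPy is installed alongside PyTorch; the variable names are made up for illustration.

```python
import torch

t = torch.zeros(3)
arr = t.numpy()                 # NumPy array backed by the same memory as t
t += 1                          # an in-place change to the tensor...
seen_by_array = arr.tolist()    # ...is visible through the array

back = torch.from_numpy(arr)    # this direction shares memory as well
arr[0] = 5.0                    # a change to the array shows up in both tensors

single = torch.tensor([3.5])
as_float = single.item()        # one-element tensor -> Python float
as_list = t.tolist()            # tensor -> nested Python list

print(seen_by_array, back, as_float, as_list)
```

If an independent copy is needed, calling clone() on the tensor before converting avoids the shared memory.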

## Exercises

1. Create a tensor of shape (4, 3) filled with random values, and then extract the second column of the tensor
2. Create a tensor filled with zeros
3. Create a tensor filled with ones
4. Create two tensors A and B filled with random integers and then store their elementwise sum into a new tensor C
5. Create a tensor with 3 dimensions
6. Create two tensors A and B with different shapes and use broadcasting to obtain their elementwise sum
7. Compare two tensors using logical statements
8. Create two tensors of shape (3, 4) and concatenate them horizontally and vertically

In [59]:
# 1
A = torch.randn(4, 3)
display(A)
display(A[:, 1])

tensor([[-1.0187,  1.0221,  0.5933],
        [-1.7533,  0.4169,  0.3206],
        [-1.3601,  1.4218, -1.9415],
        [-0.8130,  0.5767, -1.0583]])

tensor([1.0221, 0.4169, 1.4218, 0.5767])

In [61]:
# 2
A = torch.zeros(4, 3)
A

tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]])

In [62]:
# 3
A = torch.ones(4, 3)
A

tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])

In [63]:
# 4
A = torch.randn(4, 3)
B = torch.randn(4, 3)
C = A + B
C

tensor([[ 2.3987,  0.9844, -2.1502],
        [-1.7218,  2.2971,  1.8848],
        [-1.4289,  1.4740,  2.2923],
        [ 0.4790,  0.7110, -0.9562]])
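Exercise 4 asks for random integers, while torch.randn draws floating-point values from a normal distribution. A possible variant uses torch.randint instead; the range 0 to 9 and the seed are arbitrary choices made for this sketch.

```python
import torch

torch.manual_seed(0)                 # optional: makes the random draws repeatable

A = torch.randint(0, 10, (4, 3))     # integers from 0 up to, but not including, 10
B = torch.randint(0, 10, (4, 3))
C = A + B                            # elementwise sum, still an integer tensor

print(C)
print(C.dtype)
```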

In [64]:
# 5
A = torch.arange(27)
display(A)
A = A.reshape(3, 3, 3)
display(A)

tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
        18, 19, 20, 21, 22, 23, 24, 25, 26])

tensor([[[ 0,  1,  2],
         [ 3,  4,  5],
         [ 6,  7,  8]],

        [[ 9, 10, 11],
         [12, 13, 14],
         [15, 16, 17]],

        [[18, 19, 20],
         [21, 22, 23],
         [24, 25, 26]]])

In [72]:
# 6
A = torch.randn((2, 3), dtype = torch.float32)
B = torch.tensor(2.0)
display(A)
display(B)
display(A + B)

tensor([[ 0.4759,  0.3167, -1.5098],
        [-0.0407, -1.8093, -0.2850]])

tensor(2.)

tensor([[2.4759, 2.3167, 0.4902],
        [1.9593, 0.1907, 1.7150]])

In [73]:
# 7
A = torch.tensor([1, 2, 3])
B = torch.tensor([4, 5, 6])
A > B

tensor([False, False, False])

In [74]:
# 8
A = torch.randn(3, 4)
B = torch.randn(3, 4)
display(torch.cat((A, B), dim=1))  # horizontal: shape (3, 8)
display(torch.cat((A, B), dim=0))  # vertical: shape (6, 4)

tensor([[-1.6037, -1.2166,  0.7917,  0.4260, -0.3731, -1.0789, -0.7706, -0.8480],
        [ 0.8385, -0.3906, -0.2613,  0.5359, -0.4539, -1.3881,  0.7464, -0.4369],
        [ 2.6878, -1.6598,  0.1895,  1.2110,  1.5444,  0.3877, -0.0809,  1.0686]])

tensor([[-1.6037, -1.2166,  0.7917,  0.4260],
        [ 0.8385, -0.3906, -0.2613,  0.5359],
        [ 2.6878, -1.6598,  0.1895,  1.2110],
        [-0.3731, -1.0789, -0.7706, -0.8480],
        [-0.4539, -1.3881,  0.7464, -0.4369],
        [ 1.5444,  0.3877, -0.0809,  1.0686]])