# **Data Manipulation**


## **1. *n-dimensional* arrays**

A *tensor* in PyTorch and TensorFlow resembles NumPy's ndarray, with two additional features:
- The *Tensor* class supports automatic differentiation
- The *Tensor* class can leverage GPUs to accelerate numerical computation, whereas NumPy runs only on CPUs
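
As a minimal sketch of these two features, the snippet below places tensors on a GPU when one is available (falling back to the CPU otherwise) and lets autograd compute a gradient. The names `device`, `w`, and `loss` are illustrative and not part of the original notebook.

```python
import torch

# Pick a GPU if PyTorch can see one, otherwise stay on the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.arange(6, dtype=torch.float32, device=device)
w = torch.ones(6, device=device, requires_grad=True)  # track gradients for w

# loss = sum_i w_i * x_i, so d(loss)/d(w_i) = x_i
loss = (w * x).sum()
loss.backward()

print(device)
print(w.grad)  # same values as x
```

Since the loss is linear in `w`, the gradient stored in `w.grad` should simply equal `x`, which makes the result easy to check by eye.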

Note:
A *tensor* represents a (possibly multi-dimensional) array of numerical values.
- A *tensor* with one axis is called a vector
- A *tensor* with two axes is called a matrix
- A *tensor* with $k > 2$ axes is referred to as a $k^{th}$-order tensor
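
The terminology in the note can be checked directly: the attribute `ndim` reports the number of axes and `shape` the length along each one. The variable names below are illustrative only.

```python
import torch

vector = torch.arange(4)             # one axis
matrix = torch.zeros((2, 3))         # two axes
third_order = torch.ones((2, 3, 4))  # three axes, i.e. a 3rd-order tensor

for name, t in [("vector", vector), ("matrix", matrix), ("third_order", third_order)]:
    # ndim is the number of axes; shape lists the length along each axis
    print(name, t.ndim, tuple(t.shape))
```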

In [1]:
import torch

In [6]:
# arange(n) - create a vector of evenly spaced values, 0 inclusive, n exclusive
x = torch.arange(12, dtype=torch.float32)
x

tensor([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11.])

In [7]:
# numel() - total number of elements in a tensor
x.numel()

12

In [10]:
# tensor's shape is the length along each axis
x.shape

torch.Size([12])

In [11]:
X = x.reshape(3, 4)
X

tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.]])

In [12]:
X = x.reshape(-1, 3)
X

tensor([[ 0.,  1.,  2.],
        [ 3.,  4.,  5.],
        [ 6.,  7.,  8.],
        [ 9., 10., 11.]])
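
The `-1` in `reshape(-1, 3)` asks PyTorch to infer that axis length from the total number of elements. A small sketch of how this might be explored, including a case where no valid length exists:

```python
import torch

x = torch.arange(12, dtype=torch.float32)

a = x.reshape(-1, 3)  # 12 elements / 3 columns -> 4 rows inferred
b = x.reshape(2, -1)  # 12 elements / 2 rows -> 6 columns inferred
print(a.shape, b.shape)

# 12 is not divisible by 5, so no axis length can be inferred
try:
    x.reshape(-1, 5)
except RuntimeError as err:
    print("reshape failed:", err)
```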

In [15]:
# Zero tensor
y = torch.zeros((2, 3, 4))
y

tensor([[[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]],

        [[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]]])

In [16]:
y.numel()

24

In [17]:
y.shape

torch.Size([2, 3, 4])

In [20]:
# tensor of all ones
torch.ones((3, 2, 5))

tensor([[[1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.]],

        [[1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.]],

        [[1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.]]])

In [21]:
# tensors of random values
# torch.randn() - create a tensor whose elements are drawn from a standard Gaussian distribution with mean 0 and standard deviation 1.
torch.randn(4,5)

tensor([[-0.2568, -0.7483, -0.1985, -0.8295, -0.2876],
        [ 0.4790, -0.2002, -2.0866, -1.1152,  2.3277],
        [-0.2953,  2.6795,  1.2310, -1.4715,  1.2286],
        [-0.5123,  0.5501, -0.9516, -1.4438, -1.1373]])

In [23]:
z = torch.tensor([[2,5,7],[1,3,6],[8,9,0]])
print(z)
print(z.shape)

tensor([[2, 5, 7],
        [1, 3, 6],
        [8, 9, 0]])
torch.Size([3, 3])


## **2. *Indexing* and *Slicing***

*Tensor* elements are indexed starting from 0.
- Negative indices access elements by their position relative to the end of the axis
- $X[start:stop]$ returns the elements from index start (inclusive) to index stop (exclusive)
- When only one index (or slice) is specified for a $k^{th}$-order tensor, it is applied along axis 0
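
To illustrate these rules on a tensor with more than two axes, here is a short sketch using a hypothetical third-order tensor `T` (not part of the original notebook):

```python
import torch

T = torch.arange(24).reshape(2, 3, 4)

last_block = T[-1]    # a single index applies to axis 0: the last 3 x 4 block
first_slice = T[0:1]  # a slice keeps axis 0, with length 1
middle_rows = T[:, 1] # take everything along axis 0, index 1 along axis 1
corner = T[1, 2, -1]  # one index per axis picks a single element

print(last_block.shape, first_slice.shape, middle_rows.shape)
print(corner)
```

Indexing with an integer removes that axis from the result, while slicing keeps it, which is why `T[-1]` and `T[0:1]` have different numbers of axes.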

In [33]:
print(X)
print(X.shape)
print(X[-1]) # selects the last row
print(X[1:3]) # selects the second and third rows

tensor([[ 0.,  1.,  2.],
        [ 3.,  4.,  5.],
        [ 6.,  7.,  8.],
        [ 9., 10., 11.]])
torch.Size([4, 3])
tensor([ 9., 10., 11.])
tensor([[3., 4., 5.],
        [6., 7., 8.]])


In [36]:
print(X[1,2])
X[1,2] = 17
print(X)

tensor(5.)
tensor([[ 0.,  1.,  2.],
        [ 3.,  4., 17.],
        [ 6.,  7.,  8.],
        [ 9., 10., 11.]])


In [67]:
# assign multiple elements the same value
print(X[:3, :]) # selects the 1st, 2nd and 3rd rows and all columns
X[:3, :] = -7
print(X)

tensor([[-7., -7., -7.],
        [-7., -7., -7.],
        [-7., -7., -7.]])
tensor([[-7., -7., -7.],
        [-7., -7., -7.],
        [-7., -7., -7.],
        [ 9., 10., 11.]])
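
One consequence of in-place assignment worth keeping in mind: for a contiguous tensor, `reshape` returns a view that shares memory with the original, so writes through `X` can also show up in `x`. A minimal sketch of this behavior, using `clone()` to obtain an independent copy (the name `Y` is illustrative):

```python
import torch

x = torch.arange(12, dtype=torch.float32)
X = x.reshape(-1, 3)  # a view onto the same storage as x

X[:3, :] = -7         # in-place write through the view
print(x)              # the first nine entries of x are now -7 as well

# Same underlying memory address for both tensors
print(x.data_ptr() == X.data_ptr())

# clone() copies the data, so later writes do not affect x
Y = x.clone().reshape(-1, 3)
Y[3, 0] = 100
print(x[9])           # still 9.
```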


## **3. *Operations***

In [41]:
# elementwise operation (x shows the values written through X, since X is a view of x)
print(x)
print(torch.exp(x))

tensor([-7., -7., -7., -7., -7., -7., -7., -7., -7.,  9., 10., 11.])
tensor([9.1188e-04, 9.1188e-04, 9.1188e-04, 9.1188e-04, 9.1188e-04, 9.1188e-04,
        9.1188e-04, 9.1188e-04, 9.1188e-04, 8.1031e+03, 2.2026e+04, 5.9874e+04])


In [45]:
x = torch.tensor([1.0, 4.5, 2.9, -3.4])
y = torch.tensor([2, 7, -9, 5])
x + y, x - y, x * y, x / y

(tensor([ 3.0000, 11.5000, -6.1000,  1.6000]),
 tensor([-1.0000, -2.5000, 11.9000, -8.4000]),
 tensor([  2.0000,  31.5000, -26.1000, -17.0000]),
 tensor([ 0.5000,  0.6429, -0.3222, -0.6800]))
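
In the cell above, `x` holds floats and `y` holds integers, yet the arithmetic works because PyTorch promotes the operands to a common dtype. A short sketch for inspecting this, assuming the default floating-point dtype has not been changed from `float32`:

```python
import torch

x = torch.tensor([1.0, 4.5, 2.9, -3.4])  # float32 by default
y = torch.tensor([2, 7, -9, 5])          # int64 by default

print(x.dtype, y.dtype)
print(torch.result_type(x, y))  # dtype a binary operation on x and y would produce
print((x + y).dtype)            # float32: the integer operand is promoted
print((y / y).dtype)            # true division of integers also yields a float
```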

In [61]:
# Concatenating multiple tensors
A = torch.randn(12).reshape((3,4))
print(A)
print(A.shape)
B = torch.tensor([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])
print(B)
print(B.shape)

tensor([[-0.0799, -0.1928,  0.1779,  0.3879],
        [-0.3728,  1.0827,  0.4407, -0.2011],
        [-0.4438,  1.5737, -0.6165, -0.4128]])
torch.Size([3, 4])
tensor([[2., 1., 4., 3.],
        [1., 2., 3., 4.],
        [4., 3., 2., 1.]])
torch.Size([3, 4])


In [62]:
# concatenation along rows (Axis = 0)
C = torch.cat((A,B), dim = 0)
print(C)
print(C.shape)

tensor([[-0.0799, -0.1928,  0.1779,  0.3879],
        [-0.3728,  1.0827,  0.4407, -0.2011],
        [-0.4438,  1.5737, -0.6165, -0.4128],
        [ 2.0000,  1.0000,  4.0000,  3.0000],
        [ 1.0000,  2.0000,  3.0000,  4.0000],
        [ 4.0000,  3.0000,  2.0000,  1.0000]])
torch.Size([6, 4])


In [63]:
# concatenation along columns (Axis = 1)
D = torch.cat((A,B), dim = 1)
print(D)
print(D.shape)

tensor([[-0.0799, -0.1928,  0.1779,  0.3879,  2.0000,  1.0000,  4.0000,  3.0000],
        [-0.3728,  1.0827,  0.4407, -0.2011,  1.0000,  2.0000,  3.0000,  4.0000],
        [-0.4438,  1.5737, -0.6165, -0.4128,  4.0000,  3.0000,  2.0000,  1.0000]])
torch.Size([3, 8])


In [64]:
print(A == B)

tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])


In [66]:
# sum of all the elements of the tensor
C.sum() 

tensor(31.3435)

## **4. *Broadcasting***

- When the shapes of two tensors differ but are compatible, we can still perform elementwise binary operations using the *broadcasting mechanism*
- *Broadcasting* is a two-step process:
    - expand one or both arrays by copying elements along axes with length 1 so that, after this transformation, the two tensors have the same shape
    - perform an elementwise operation on the resulting arrays
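
The two steps can be spelled out by hand as a rough sketch. Here `expand` stands in for the conceptual "copying" step (it returns a view and does not actually duplicate memory), and the names `a_exp`, `b_exp`, and `manual` are illustrative:

```python
import torch

a = torch.arange(3).reshape((3, 1))
b = torch.arange(2).reshape((1, 2))

# Step 1: stretch each length-1 axis so both tensors have shape (3, 2)
a_exp = a.expand(3, 2)  # repeats the single column of a
b_exp = b.expand(3, 2)  # repeats the single row of b

# Step 2: ordinary elementwise addition on equally shaped tensors
manual = a_exp + b_exp

print(manual)
print(torch.equal(manual, a + b))                 # matches implicit broadcasting
print(torch.broadcast_shapes(a.shape, b.shape))   # shape the result will have
```

In practice `a + b` does all of this implicitly; the explicit version is only meant to make the mechanism visible.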

In [70]:
a = torch.arange(3).reshape((3,1))
b = torch.arange(2).reshape((1,2))
a, b

(tensor([[0],
         [1],
         [2]]),
 tensor([[0, 1]]))

In [72]:
# Since a and b are 3 × 1 and 1 × 2 matrices, respectively, their shapes do not match up.
# Broadcasting produces a larger 3 × 2 matrix by replicating matrix a along the columns and
# matrix b along the rows before adding them elementwise
a + b


tensor([[0, 1],
        [1, 2],
        [2, 3]])