## 2.1. Data Manipulation

*Studying and coding along with the printed book __„Dive into Deep Learning“__ by Aston Zhang, Zachary C. Lipton, Mu Li & Alexander J. Smola. The accompanying website for the chapter Preliminaries > Data Manipulation can be found at [d2l.ai](https://d2l.ai/chapter_preliminaries/ndarray.html).*

### 2.1.1. Getting started with Data Manipulation

In [2]:
# importing the PyTorch library
import torch

#### __What is a tensor?__

A tensor represents a (possibly multidimensional) array of numerical values. 

- In the one-dimensional case (only one axis is needed for the data) a tensor is called a **vector**.
- With two axes, a tensor is called a **matrix**.
- With *k > 2* axes, we drop the specialized names and just refer to the object as a ***k<sup>th</sup>*-order tensor**.
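The naming above can be checked directly: a minimal sketch, assuming only that `torch` is installed. The variable names (`vector`, `matrix`, `third_order`) are illustrative and not taken from the book.

```python
import torch

vector = torch.arange(3)              # one axis
matrix = torch.zeros((2, 3))          # two axes
third_order = torch.ones((2, 3, 4))   # three axes

# ndim reports the number of axes, i.e. the order of the tensor
for name, t in [("vector", vector), ("matrix", matrix), ("3rd-order", third_order)]:
    print(name, t.ndim, tuple(t.shape))
```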

In [3]:
# creating new tensors prepopulated with values with arange(n)
# creates a vector of evenly spaced values
# it starts at 0 (included) and ends at n (not included). 
# by default the interval size is 1
# by default new tensors are stored in main memory and designated for CPU-based computation
x = torch.arange(12, dtype=torch.float32)
x

tensor([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11.])

In [4]:
# element of the tensor
x[0]

tensor(0.)

In [5]:
# total number of elements in a tensor
x.numel()

12

In [6]:
# accessing a tensor’s shape (the length along each axis) by inspecting its shape attribute
x.shape

torch.Size([12])

In [7]:
# changing the shape of a tensor without altering its size or values by invoking reshape
# transform vector x to a matrix X with shape (3, 4)
# the elements of the vector are laid out one row at a time: x[3] == X[0, 3]
X = x.reshape(3,4)
X

tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.]])

In [8]:
X[0, 3]

tensor(3.)

In [9]:
X[1, 3]

tensor(7.)
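The row-by-row layout mentioned in the reshape cell can be written as an index rule: entry *(i, j)* of a matrix with *w* columns comes from position *i · w + j* of the vector. The sketch below checks this rule for every entry; the names `v`, `M`, `h` and `w` are illustrative and chosen so that they do not overwrite the notebook's `x` and `X`.

```python
import torch

v = torch.arange(12, dtype=torch.float32)
h, w = 3, 4
M = v.reshape(h, w)

# row-major layout: entry (i, j) of the matrix is entry i * w + j of the vector
for i in range(h):
    for j in range(w):
        assert M[i, j] == v[i * w + j]

print(M[1, 3], v[1 * w + 3])
```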

In [10]:
X.numel()

12

If we know the size of a tensor, we can work out one component of the shape from the others:
- Given a tensor of size *n* and target shape *(h, w)*, we know that *w = n/h*.
- In the example above: given a tensor of size 12 and target shape *(3, w)*, we know that *w = 12/3 = 4*.
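The arithmetic above only works when the size is divisible by the known component. A small sketch, using the illustrative names `v`, `n`, `h`, `w` and `inferred`, shows the case that works and one that does not: 12 elements cannot be arranged in 5 rows, so PyTorch raises a `RuntimeError`.

```python
import torch

v = torch.arange(12, dtype=torch.float32)
n = v.numel()

h = 3
w = n // h                               # 12 / 3 = 4
assert v.reshape(h, -1).shape == (h, w)

# 12 is not divisible by 5, so no width can be inferred
try:
    v.reshape(5, -1)
    inferred = True
except RuntimeError as err:
    inferred = False
    print("reshape failed:", err)
```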

In [11]:
# to automatically infer one component of the shape, we can place a -1 for the shape component that should be inferred automatically 
Y = x.reshape(-1, 4)
Y

tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.]])

In [12]:
Z = x.reshape(3, -1)
Z

tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.]])

In [13]:
# example: a tensor initialized to contain all 0s or 1s
# constructing a tensor with all elements set to 0 and a shape of (2, 3, 4)
torch.zeros((2, 3, 4))

tensor([[[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]],

        [[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]]])

In [14]:
torch.ones((2, 3, 4))

tensor([[[1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.]],

        [[1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.]]])

#### __Sampling elements randomly from a given probability distribution__

- It can be advantageous to sample each element randomly (and independently) from a given probability distribution.
- For example, the parameters of neural networks are often initialized randomly. 
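Random draws differ from run to run, which makes results hard to compare. The sketch below is an addition to these notes, not part of the book chapter: it uses `torch.manual_seed` to make the draws repeatable. The seed value `0` is arbitrary, and the names `a`, `b`, `c` are illustrative.

```python
import torch

torch.manual_seed(0)    # the seed value 0 is arbitrary
a = torch.randn(3, 4)

torch.manual_seed(0)    # resetting the seed replays the same draws
b = torch.randn(3, 4)

c = torch.randn(3, 4)   # continuing without reseeding gives fresh values

print(torch.equal(a, b))
print(torch.equal(a, c))
```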

In [15]:
# creating a tensor with elements drawn from a standard Gaussian (normal) distribution with mean 0 and standard deviation 1
torch.randn(3,4)

tensor([[-0.2981, -0.6713, -0.5142,  1.5903],
        [ 0.4454, -0.6757, -1.0581, -0.4133],
        [ 0.2685,  0.9962,  0.8409,  0.4096]])

### 2.1.2. Indexing and Slicing

In [16]:
# tensor elements can be accessed by indexing, starting with 0
# whole ranges of indices can be accessed via slicing (e.g., X[start:stop]),
# where the returned value includes the first index (start) but not the last (stop)
X[1:3]

tensor([[ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.]])

When only one index (or slice) is specified for a *k<sup>th</sup>*-order tensor, it is applied along axis 0.<br/>
Thus, in the previous code, [1:3] selects the second and third rows; in the next cell, [-1] selects the last row.
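To make the "applied along axis 0" rule concrete, the sketch below compares a single slice with its spelled-out two-axis form and shows how to index along axis 1 instead. The matrix `M` and the other names are illustrative; `M` has the same values as `X` at this point in the notebook.

```python
import torch

M = torch.arange(12, dtype=torch.float32).reshape(3, 4)

rows = M[1:3]           # a single slice is applied along axis 0
same_rows = M[1:3, :]   # spelled out: rows 1 and 2, every column
last_row = M[-1]        # negative index counts from the end of axis 0
second_col = M[:, 1]    # to index along axis 1, take all of axis 0 first

print(torch.equal(rows, same_rows))
print(last_row)
print(second_col)
```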

In [17]:
# to access an element based on its position relative to the end of the list, we can use negative indexing
X[-1]

tensor([ 8.,  9., 10., 11.])

In [18]:
# writing elements of a matrix by specifying indices
print(X)
X[1, 2] = 17
X[2, 3] = 33
X

tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.]])


tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5., 17.,  7.],
        [ 8.,  9., 10., 33.]])

#### __Assigning multiple elements the same value__

In [19]:
# to assign multiple elements the same value, we apply the indexing on the left-hand side of the assignment operation
# for instance, [:2, :] accesses the first and second rows,
# where : takes all the elements along axis 1 (columns)
X[:2, :] = 12
X

tensor([[12., 12., 12., 12.],
        [12., 12., 12., 12.],
        [ 8.,  9., 10., 33.]])

### 2.1.3. Operations

Manipulating tensors with various mathematical operations.

__Elementwise operations__ apply a standard scalar operation to each element of a tensor. We can create an elementwise function from any function that maps from a scalar to a scalar.

__Functions that take two tensors as inputs:__ Here, elementwise operations apply a standard binary operator to each pair of corresponding elements.

Mathematical notation (signature) for a function that maps from a scalar to a scalar (a unary scalar operator):

$f: \mathbb{R} \rightarrow \mathbb{R}$


This just means that the function maps any real number to some real number.
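As a rough illustration of lifting a scalar function to an elementwise one, the sketch below applies Python's scalar `math.exp` to every element by hand and compares the result with the built-in `torch.exp`. The names `v`, `by_hand` and `builtin` are illustrative.

```python
import math
import torch

v = torch.tensor([0.0, 1.0, 2.0])

# applying the scalar function math.exp to every element by hand ...
by_hand = torch.tensor([math.exp(e) for e in v.tolist()])

# ... matches the built-in elementwise version
builtin = torch.exp(v)

print(torch.allclose(by_hand, builtin))
```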

In [20]:
# elementwise exp; x shares memory with X (reshape returned a view), so it reflects the writes made to X above
torch.exp(x)

tensor([1.6275e+05, 1.6275e+05, 1.6275e+05, 1.6275e+05, 1.6275e+05, 1.6275e+05,
        1.6275e+05, 1.6275e+05, 2.9810e+03, 8.1031e+03, 2.2026e+04, 2.1464e+14])
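The output above is not exp(0), ..., exp(11): for a contiguous tensor such as the result of `arange`, `reshape` returns a view that shares memory with the original, so the earlier writes through `X` also changed `x`. The sketch below reproduces this with illustrative names (`base`, `view`, `independent`) and shows `clone()` as one way to get an independent copy. PyTorch's documentation notes that `reshape` may copy in other situations, so this sharing should not be relied upon in general.

```python
import torch

base = torch.arange(12, dtype=torch.float32)
view = base.reshape(3, 4)    # contiguous input: reshape returns a view

view[:2, :] = 12             # write through the matrix ...
view[2, 3] = 33

print(base)                  # ... and the vector changes as well

independent = base.clone().reshape(3, 4)   # clone() copies the data first
independent[0, 0] = -1
print(base[0])               # unaffected by the write to the copy
```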

In [21]:
x = torch.tensor([1.0, 2, 4, 8])
y = torch.tensor([2, 2, 2, 2])
print(x)
print(y)

tensor([1., 2., 4., 8.])
tensor([2, 2, 2, 2])


In [23]:
# common standard arithmetic operators for addition (+), subtraction (-), multiplication (*), division (/)
# and exponentiation (**) have all been lifted to elementwise operations for identically-shaped tensors of arbitrary shape
x + y, x - y, x * y, x / y, x ** y

(tensor([ 3.,  4.,  6., 10.]),
 tensor([-1.,  0.,  2.,  6.]),
 tensor([ 2.,  4.,  8., 16.]),
 tensor([0.5000, 1.0000, 2.0000, 4.0000]),
 tensor([ 1.,  4., 16., 64.]))
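Note that `x` above is a float tensor while `y` holds integers, yet every result is a float tensor. The sketch below, with the illustrative names `a` and `b` and assuming the default dtype has not been changed, inspects the dtypes and uses `torch.result_type` to show the type that PyTorch's type promotion selects for a binary operation.

```python
import torch

a = torch.tensor([1.0, 2, 4, 8])   # one float literal makes the whole tensor float32
b = torch.tensor([2, 2, 2, 2])     # all-integer literals give int64

print(a.dtype, b.dtype)
print(torch.result_type(a, b))     # dtype that a binary op on a and b produces
print((a + b).dtype)
```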

#### __Concatenating Multiple Tensors__

In [29]:
# concatenating multiple tensors by stacking them end-to-end to form a larger one
X = torch.arange(12, dtype=torch.float32).reshape((3,4))
print("Tensor X:")
print(X)
Y = torch.tensor([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])
print("Tensor Y:")
print(Y)

Tensor X:
tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.]])
Tensor Y:
tensor([[2., 1., 4., 3.],
        [1., 2., 3., 4.],
        [4., 3., 2., 1.]])


In [30]:
# provide a sequence of tensors to torch.cat() and specify the axis along which to concatenate
print("X and Y concatenated with dim=0:")
torch.cat((X, Y), dim=0)

X and Y concatenated with dim=0:


tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.],
        [ 2.,  1.,  4.,  3.],
        [ 1.,  2.,  3.,  4.],
        [ 4.,  3.,  2.,  1.]])

In [31]:
print("X and Y concatenated with dim=1:")
torch.cat((X, Y), dim=1)

X and Y concatenated with dim=1:


tensor([[ 0.,  1.,  2.,  3.,  2.,  1.,  4.,  3.],
        [ 4.,  5.,  6.,  7.,  1.,  2.,  3.,  4.],
        [ 8.,  9., 10., 11.,  4.,  3.,  2.,  1.]])
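The shapes of the two results follow a simple rule: lengths add up along the concatenation axis and must agree on every other axis. A minimal sketch with the illustrative tensors `A`, `B` and `C`:

```python
import torch

A = torch.zeros((3, 4))
B = torch.ones((3, 4))

rows = torch.cat((A, B), dim=0)   # lengths along axis 0 add up: 3 + 3
cols = torch.cat((A, B), dim=1)   # lengths along axis 1 add up: 4 + 4
print(rows.shape, cols.shape)

# shapes must agree on every axis except the one being concatenated
C = torch.ones((2, 4))
ok = torch.cat((A, C), dim=0)     # fine: both have 4 columns
print(ok.shape)
```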

#### __Constructing a binary tensor via logical statements__

In [33]:
# example: X == Y
# for each position i, j:
# if X[i, j] and Y[i, j] are equal -> the corresponding entry in the result is True (1)
# if X[i, j] and Y[i, j] are not equal -> the corresponding entry in the result is False (0)
X == Y

tensor([[False,  True, False,  True],
        [False, False, False, False],
        [False, False, False, False]])
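A boolean tensor like the one above can be counted, used as a mask, or converted to an explicit 0/1 encoding. The sketch below rebuilds the two matrices under the illustrative names `A` and `B` so that it runs on its own; `mask` is likewise an illustrative name.

```python
import torch

A = torch.arange(12, dtype=torch.float32).reshape(3, 4)
B = torch.tensor([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])

mask = A == B
print(mask.dtype)          # the comparison yields a boolean tensor
print(mask.sum().item())   # number of matching positions
print(A[mask])             # values of A at positions where the mask is True
print(mask.float())        # explicit 0./1. encoding
```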

#### __Summing all the elements in a tensor__

In [34]:
# summing all the elements in the tensor yields a tensor with only one element
X.sum()

tensor(66.)
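The result of `sum()` is a zero-dimensional tensor; `item()` converts it to a plain Python number. As a small extension beyond this section, the sketch also passes the `dim` argument to sum along a single axis. The names `A`, `total`, `col_sums` and `row_sums` are illustrative.

```python
import torch

A = torch.arange(12, dtype=torch.float32).reshape(3, 4)

total = A.sum()           # zero-dimensional tensor
print(total.shape)        # empty shape: no axes left
print(total.item())       # plain Python float

col_sums = A.sum(dim=0)   # collapse axis 0: one sum per column
row_sums = A.sum(dim=1)   # collapse axis 1: one sum per row
print(col_sums, row_sums)
```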