# Data Manipulation

In order to get anything done, we need some way to store and manipulate data. Generally, there are two important things we need to do with data:

1. Acquire them.
2. Process them once they are inside the computer.

There is no point in acquiring data without some way to store it.

## Tensors
A tensor is a (possibly multidimensional) array of numerical values. The number of axes, k, determines what we call it:

- k = 1 (one axis): the tensor is a vector.
- k = 2 (two axes): the tensor is a matrix.
- k > 2: the tensor has no special name and is called a kth-order tensor.

### Framework structures for tensors
For all modern deep learning frameworks, the tensor class (ndarray in MXNet, Tensor in PyTorch and TensorFlow) resembles NumPy’s ndarray, with a few killer features added. First, the tensor class supports automatic differentiation. Second, it leverages GPUs to accelerate numerical computation.

- NumPy - ndarray
- MXNet - ndarray
- PyTorch - Tensor
- TensorFlow - Tensor
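The two extra features can be sketched in a few lines. This is only an illustrative example, not part of the original notebook: the tensor `w` and the toy `loss` are made-up names, and the code falls back to the CPU when no GPU is present.

```python
import torch

# Pick a GPU if one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# requires_grad=True asks autograd to track operations on w.
w = torch.tensor([1.0, 2.0, 3.0], device=device, requires_grad=True)

# Toy loss (an assumption for illustration): loss = sum(w_i^2),
# so d(loss)/d(w_i) = 2 * w_i.
loss = (w ** 2).sum()
loss.backward()

# Move the gradient to the CPU so it prints the same on any machine.
grad = w.grad.cpu()
print(grad)
```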



# PyTorch Tensor

The tensor class is the main interface for storing and manipulating data in deep learning libraries. Tensors provide a variety of functionalities including construction routines; indexing and slicing; basic mathematics operations; broadcasting; memory-efficient assignment; and conversion to and from other Python objects.

## Basics

### arange

In [1]:
import torch

x = torch.arange(12, dtype=torch.float32)
x

tensor([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11.])

In [2]:
y = torch.arange(5, dtype=torch.int8)
y

tensor([0, 1, 2, 3, 4], dtype=torch.int8)

- **arange(n)**: creates n evenly spaced values, starting at 0 (included) and ending at n (not included). By default the interval size is 1.
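`arange` also accepts an explicit start and step. A small sketch, with values chosen only for illustration:

```python
import torch

a = torch.arange(5)            # start defaults to 0, step defaults to 1
b = torch.arange(2, 10, 3)     # start=2, end=10 (excluded), step=3
c = torch.arange(0, 1, 0.25)   # a fractional step produces a float tensor

print(a, b, c)
```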

Unless otherwise specified, new tensors are stored in main memory and designated for CPU-based computation.

### numel

In [3]:
x.numel()

12

In [4]:
y.numel()

5

- **numel()**: returns the total number of elements in the tensor (the product of the entries of its shape).

### shape attribute

In [5]:
x.shape

torch.Size([12])

In [6]:
y.shape

torch.Size([5])

- **shape** *attribute*: the length along each axis.

### reshape

In [7]:
X = x.reshape(3,4)
X

tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.]])

In [8]:
x[3] == X[0,3]

tensor(True)

In [9]:
x[4] == X[1,0]

tensor(True)

**reshape(n_rows, n_cols)**: change the shape of a tensor without altering its size or values. This new tensor retains all elements but reconfigures them into a matrix. Notice that the elements of our vector are laid out one row at a time.

In [10]:
X = x.reshape(-1,4)
X

tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.]])

In [11]:
X = x.reshape(3,-1)
X

tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.]])

Because we already know our tensor’s size, we can work out one component of the shape given the rest. To automatically infer one component of the shape, we can place a -1 for the shape component that should be inferred automatically. In our case, instead of calling **x.reshape(3, 4)**, we could have equivalently called **x.reshape(-1, 4)** or **x.reshape(3, -1)**.
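The inference behind `-1` is plain division of the total size by the known components. The sketch below checks this, and also shows what happens when no valid value exists; the shape `(-1, 5)` is just an example picked to fail.

```python
import torch

x = torch.arange(12, dtype=torch.float32)

# 12 elements / 4 columns = 3 rows, so -1 is resolved to 3.
inferred = x.reshape(-1, 4)
rows = inferred.shape[0]

# 12 is not divisible by 5, so there is no valid row count
# and PyTorch refuses the reshape.
try:
    x.reshape(-1, 5)
    failed = False
except RuntimeError:
    failed = True

print(rows, failed)
```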

### zeros, ones and randn



In [12]:
torch.zeros(24)

tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

In [13]:
torch.ones(24)

tensor([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
        1., 1., 1., 1., 1., 1.])

In [14]:
torch.randn(24)

tensor([ 0.6471, -0.3765, -1.1447, -1.1126,  1.1664, -0.0358,  2.6865,  0.5641,
         1.0015, -0.7671, -1.4421,  1.9788,  0.5308, -0.1627, -0.0152, -1.3389,
         1.4683,  0.9464,  0.3038,  0.0337, -0.8737,  0.5834,  0.3447,  0.6763])

In [15]:
torch.zeros(6,4)

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [16]:
torch.ones(6,4)

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [17]:
torch.randn(6,4)

tensor([[ 1.1617, -0.2120,  0.0806, -1.6618],
        [ 0.5942, -0.1601, -0.0933,  0.5663],
        [ 0.3005,  0.4619,  1.3231, -0.6663],
        [-1.4457, -1.1787,  0.7740, -1.5015],
        [ 0.1963, -0.0198,  1.0102, -0.5460],
        [-0.7538, -1.0498,  1.5255,  0.3199]])

In [18]:
torch.zeros((2,3,4))

tensor([[[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]],

        [[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]]])

In [19]:
torch.ones(2,3,4)

tensor([[[1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.]],

        [[1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.]]])

In [20]:
torch.randn(2,3,4)

tensor([[[-1.7369,  0.1288,  0.4205,  0.6310],
         [-0.4131,  0.5017, -0.6827, -0.6478],
         [-0.2334, -0.2708, -0.4676, -0.7220]],

        [[-0.5790,  0.4647, -0.1110,  0.0577],
         [ 1.2521, -1.4204, -0.7753, -0.7342],
         [-0.5215,  0.2952, -0.4017, -1.0301]]])

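- **zeros([shape])**: creates a tensor of the given shape filled with zeros
- **ones([shape])**: creates a tensor of the given shape filled with ones
- **randn([shape])**: creates a tensor of the given shape filled with random numbers drawn from a standard normal distribution (mean 0, standard deviation 1)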
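Two related points, sketched under the assumption that reproducible random values are wanted: `torch.manual_seed` fixes the random generator so `randn` returns the same numbers again, and `torch.tensor` builds a tensor from explicit values given as (nested) Python lists.

```python
import torch

# Seeding the generator makes randn reproducible.
torch.manual_seed(0)
first = torch.randn(2, 3)
torch.manual_seed(0)
second = torch.randn(2, 3)
same = torch.equal(first, second)

# Exact values can be supplied as nested lists; the outer list is axis 0.
t = torch.tensor([[2, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])

print(same, t.shape)
```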

## Indexing and Slicing

In [21]:
X = torch.arange(12).reshape(3,4)
X

tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]])

### Indexing

In [22]:
X[0], X[-1], X[2,3]

(tensor([0, 1, 2, 3]), tensor([ 8,  9, 10, 11]), tensor(11))

In [23]:
X[1,2] = 17
X

tensor([[ 0,  1,  2,  3],
        [ 4,  5, 17,  7],
        [ 8,  9, 10, 11]])

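- **Indexing**: selects elements of the tensor by index. To reach a single element you need as many indices as the tensor has axes; fewer indices select a whole sub-tensor (for example, `X[0]` is the first row). Indices start at 0, and negative indices count from the end. You can also assign through an index to change a value.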
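To see how the number of indices relates to the number of axes, here is a sketch on a 3-axis tensor. The tensor `T` is an illustrative example, not one used elsewhere in these notes.

```python
import torch

T = torch.arange(24).reshape(2, 3, 4)

one_index = T[1]           # drops axis 0, leaving a 3x4 matrix
two_indices = T[1, 2]      # drops axes 0 and 1, leaving a vector of length 4
three_indices = T[1, 2, 3] # one index per axis gives a single element

print(one_index.shape, two_indices.shape, three_indices)
```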

### Slicing

In [24]:
X[1:3] 

tensor([[ 4,  5, 17,  7],
        [ 8,  9, 10, 11]])

In [25]:
X[:2], X[:1]

(tensor([[ 0,  1,  2,  3],
         [ 4,  5, 17,  7]]),
 tensor([[0, 1, 2, 3]]))

In [26]:
X[:-1], X[:-2]

(tensor([[ 0,  1,  2,  3],
         [ 4,  5, 17,  7]]),
 tensor([[0, 1, 2, 3]]))

- **Slicing**: `X[start:stop]` selects a whole range of indices, from `start` (included) to `stop` (excluded). A slice can also appear on the left-hand side of an assignment to write several elements at once.
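Slices can be given for every axis, separated by commas, and may include a step. A short sketch on a fresh 3x4 matrix (redefined here so the example is self-contained):

```python
import torch

X = torch.arange(12).reshape(3, 4)

column = X[:, 1]        # every row, column 1
block = X[1:, 2:]       # rows 1..2, columns 2..3
every_other = X[::2]    # rows 0 and 2 (step of 2 along axis 0)

print(column, block, every_other, sep="\n")
```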

### Assign multiple elements

In [27]:
X[:2,:-1] = 12
X

tensor([[12, 12, 12,  3],
        [12, 12, 12,  7],
        [ 8,  9, 10, 11]])

In [28]:
X[:2,3:4] = 12
X

tensor([[12, 12, 12, 12],
        [12, 12, 12, 12],
        [ 8,  9, 10, 11]])

## Operations



### Elementwise operations
These apply a standard scalar operation to each element of a tensor. For functions that take two tensors as inputs, elementwise operations apply some standard binary operator on each pair of corresponding elements. We can create an elementwise function from any function that maps from a scalar to a scalar.

**Unary Scalar Operators**

$$ f : \Bbb{R} \to \Bbb{R}$$

This means that every function mapping a real number to another real number can be applied elementwise, for example $\exp(x)$.

In [29]:
x = torch.arange(12)
torch.exp(x)

tensor([1.0000e+00, 2.7183e+00, 7.3891e+00, 2.0086e+01, 5.4598e+01, 1.4841e+02,
        4.0343e+02, 1.0966e+03, 2.9810e+03, 8.1031e+03, 2.2026e+04, 5.9874e+04])
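The claim that any scalar-to-scalar function gives an elementwise operation can be checked by hand. In this sketch the Python loop applies `math.exp` to one element at a time and the result is compared with `torch.exp`; the loop is only for illustration, since the built-in version is the one to use in practice.

```python
import math
import torch

x = torch.arange(4, dtype=torch.float32)

# Apply the scalar function math.exp to each element by hand.
manual = torch.tensor([math.exp(v) for v in x.tolist()])

# The built-in elementwise version.
builtin = torch.exp(x)

close = torch.allclose(manual, builtin)
print(close)
```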

**Binary Scalar Operators**

Binary scalar operators map pairs of real numbers to a (single) real number via the signature:

$$ f: \Bbb{R}, \Bbb{R} \to \Bbb{R} $$

Given two vectors $u$ and $v$ of the same *shape* and a binary operator $f$, we can produce a vector $c = F(u,v)$ by setting $c_i \leftarrow f(u_i,v_i)$ for all $i$, where $c_i$, $u_i$ and $v_i$ are the $i$-th elements of $c$, $u$ and $v$. This lifts the scalar function $f$ to a vector-valued function with the signature:

$$ F:\Bbb{R}^d, \Bbb{R}^d \to \Bbb{R}^d $$

The standard arithmetic operations (addition, subtraction, multiplication, division, exponentiation) can be lifted to elementwise operations for identically-shaped tensors.

In [30]:
x = torch.tensor([1.0,2,4,8])
y = torch.tensor([2,2,2,2])

x+y, x-y, x*y, x/y, x**y

(tensor([ 3.,  4.,  6., 10.]),
 tensor([-1.,  0.,  2.,  6.]),
 tensor([ 2.,  4.,  8., 16.]),
 tensor([0.5000, 1.0000, 2.0000, 4.0000]),
 tensor([ 1.,  4., 16., 64.]))
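The lifting from a scalar operator f to a vector operator F can be written out explicitly. In the sketch below, `f` and `F` are hypothetical helpers named after the symbols in the text, and `f(a, b) = a * b + 1` is an arbitrary choice for illustration.

```python
import torch

def f(a, b):
    """Hypothetical scalar binary operator: f(a, b) = a * b + 1."""
    return a * b + 1

def F(u, v):
    """Lift f to vectors of the same shape: c_i = f(u_i, v_i)."""
    assert u.shape == v.shape
    return torch.tensor([f(a, b) for a, b in zip(u.tolist(), v.tolist())])

u = torch.tensor([1.0, 2.0, 4.0, 8.0])
v = torch.tensor([2.0, 2.0, 2.0, 2.0])

c = F(u, v)

# The same computation using the built-in elementwise operators.
vectorized = u * v + 1

print(c, vectorized)
```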

**Concatenation**  
Concatenate multiple tensors, stacking them end-to-end to form a larger one. We just need to provide a list of tensors and tell the system along which axis to concatenate.


In [31]:
X = torch.arange(12, dtype=torch.float32).reshape((3,4))
Y = torch.tensor([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])

#concatenate matrices along rows
torch.cat((X, Y), dim=0)

tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.],
        [ 2.,  1.,  4.,  3.],
        [ 1.,  2.,  3.,  4.],
        [ 4.,  3.,  2.,  1.]])

In [32]:
#concatenate matrices along columns
torch.cat((X, Y), dim=1)

tensor([[ 0.,  1.,  2.,  3.,  2.,  1.,  4.,  3.],
        [ 4.,  5.,  6.,  7.,  1.,  2.,  3.,  4.],
        [ 8.,  9., 10., 11.,  4.,  3.,  2.,  1.]])
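A quick way to reason about concatenation is through shapes: the length of the chosen axis is the sum of the inputs' lengths along it, and every other axis must match. A sketch, where the 2x4 tensor `W` is introduced only to trigger the mismatch:

```python
import torch

X = torch.arange(12, dtype=torch.float32).reshape(3, 4)
Y = torch.ones(3, 4)

rows_shape = torch.cat((X, Y), dim=0).shape   # axis 0 grows: 3 + 3
cols_shape = torch.cat((X, Y), dim=1).shape   # axis 1 grows: 4 + 4

# W has 2 rows, so it cannot sit next to X (3 rows) along axis 1.
W = torch.ones(2, 4)
try:
    torch.cat((X, W), dim=1)
    mismatch_rejected = False
except RuntimeError:
    mismatch_rejected = True

print(rows_shape, cols_shape, mismatch_rejected)
```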

**Logical Statements**

Sometimes, we want to construct a binary tensor via logical statements. Take X == Y as an example. For each position i, j, if X[i, j] and Y[i, j] are equal, then the corresponding entry in the result is True; otherwise it is False.


In [33]:
X == Y

tensor([[False,  True, False,  True],
        [False, False, False, False],
        [False, False, False, False]])
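A boolean tensor like this is often used as a mask. The sketch below counts the matches, converts the mask to 0/1 integers, and uses it to pick out the matching values; `eq` is just an illustrative name.

```python
import torch

X = torch.arange(12, dtype=torch.float32).reshape((3, 4))
Y = torch.tensor([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])

eq = X == Y

n_equal = eq.sum().item()   # True counts as 1 when summed
as_int = eq.int()           # explicit 0/1 version of the mask
matches = X[eq]             # values of X at the positions where eq is True

print(n_equal, matches)
```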

**Stats**

Statistical functions such as sum, median, mean and std reduce over the entire tensor by default and return a tensor with a single element.

In [34]:
X.sum()

tensor(66.)

In [35]:
X.median()

tensor(5.)

In [36]:
X.mean()

tensor(5.5000)

In [37]:
X.std()

tensor(3.6056)
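The same functions can also reduce along a single axis through the `dim` argument. This goes slightly beyond the cells above, so treat it as a supplementary sketch:

```python
import torch

X = torch.arange(12, dtype=torch.float32).reshape((3, 4))

col_sums = X.sum(dim=0)    # collapse axis 0: one sum per column
row_sums = X.sum(dim=1)    # collapse axis 1: one sum per row
row_means = X.mean(dim=1)  # one mean per row

print(col_sums, row_sums, row_means)
```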

## Broadcasting

Under certain conditions, even when shapes differ, we can still perform elementwise binary operations by invoking the broadcasting mechanism. Broadcasting works according to the following two-step procedure:

1. Expand one or both arrays by copying elements along axes with length 1, so that after this transformation the two tensors have the same shape.

2. Perform an elementwise operation on the resulting arrays.

In [38]:
a = torch.arange(3).reshape((3, 1))
b = torch.arange(2).reshape((1, 2))
a, b

(tensor([[0],
         [1],
         [2]]),
 tensor([[0, 1]]))

In [39]:
a + b

tensor([[0, 1],
        [1, 2],
        [2, 3]])

In [40]:
a-b

tensor([[ 0, -1],
        [ 1,  0],
        [ 2,  1]])

In [41]:
a*b

tensor([[0, 0],
        [0, 1],
        [0, 2]])

In [42]:
a**b

tensor([[1, 0],
        [1, 1],
        [1, 2]])

Since a and b are 3x1 and 1x2 matrices, their shapes do not match. Broadcasting produces a larger 3x2 matrix by replicating matrix a along the columns and matrix b along the rows before combining them elementwise.
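The two steps can be carried out by hand to confirm they give the same result. This sketch uses `expand` to stretch the length-1 axes; it yields the same values as copying, which is all that matters for the comparison.

```python
import torch

a = torch.arange(3).reshape((3, 1))
b = torch.arange(2).reshape((1, 2))

# Step 1: stretch the length-1 axes so both tensors are 3x2.
a_big = a.expand(3, 2)   # each row of a repeated across 2 columns
b_big = b.expand(3, 2)   # the single row of b repeated across 3 rows

# Step 2: ordinary elementwise addition on equal shapes.
manual = a_big + b_big

same = torch.equal(manual, a + b)
print(a_big, b_big, same, sep="\n")
```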

## Saving Memory

Running operations can cause new memory to be allocated to host results. For example, if we write Y = X + Y, we dereference the tensor that Y used to point to and instead point Y at the newly allocated memory.

In [43]:
before = id(Y)
before

2298404168224

In [44]:
Y = Y+X
id(Y)

2298404309088

In [45]:
id(Y)==before

False

This might be undesirable for two reasons:

1. We do not want to run around allocating memory unnecessarily all the time. In machine learning, we often have hundreds of megabytes of parameters and update all of them multiple times per second. Whenever possible, we want to perform these updates in place.
2. We might point at the same parameters from multiple variables. If we do not update in place, we must be careful to update all of these references, lest we spring a memory leak or inadvertently refer to stale parameters.

### In-place operations

In [46]:
Z = torch.zeros_like(Y)
Z

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

- **zeros_like(A)**: produces a tensor of zeros with the same shape and data type as the tensor A

In [47]:
print('id(Z):', id(Z))
Z[:] = X + Y
print('id(Z):', id(Z))
Z

id(Z): 2298404307168
id(Z): 2298404307168


tensor([[ 2.,  3.,  8.,  9.],
        [ 9., 12., 15., 18.],
        [20., 21., 22., 23.]])

If the value of X is not reused in subsequent computations, we can also use X[:] = X + Y or X += Y to reduce the memory overhead of the operation.

In [48]:
before = id(X)
X += Y
id(X) == before

True
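The stale-reference problem from the second reason can be made visible with a second variable pointing at the same tensor. In this sketch, `alias` is a hypothetical extra reference, and `data_ptr()` is used to compare the underlying memory address instead of the Python object id.

```python
import torch

X = torch.arange(12, dtype=torch.float32).reshape(3, 4)
Y = torch.ones(3, 4)

alias = X                  # second name for the same tensor
ptr = X.data_ptr()         # address of the underlying storage

# In-place update: same memory, and alias sees the new values.
X += Y
in_place_same = (X.data_ptr() == ptr) and (alias is X)

# Rebinding: X now names a new tensor, alias still names the old one.
X = X + Y
rebinding_same = X.data_ptr() == ptr

print(in_place_same, rebinding_same, alias[0, 0].item(), X[0, 0].item())
```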

## Conversion to other Python objects

### Between NumPy ndarray and torch tensor

In [49]:
import numpy 

A = X.numpy()
B = torch.from_numpy(A)
type(A), type(B)

(numpy.ndarray, torch.Tensor)
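One detail worth checking: for CPU tensors, `numpy()` and `torch.from_numpy` share the underlying memory, so a change through one object shows up in the other. The sketch below demonstrates this and uses `torch.tensor(arr)` as one way to get an independent copy.

```python
import numpy as np
import torch

t = torch.zeros(3)
arr = t.numpy()              # shares memory with t
back = torch.from_numpy(arr) # shares memory with arr (and therefore t)

t[0] = 5.0                   # visible through arr
arr[1] = 7.0                 # visible through t and back

# torch.tensor copies the data, so later changes do not propagate.
independent = torch.tensor(arr)
arr[2] = 9.0

print(arr, t, back, independent)
```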

### Tensor element to python scalar

In [50]:
a = torch.tensor([3.5])
a, a.item(), float(a), int(a)

(tensor([3.5000]), 3.5, 3.5, 3)
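`item()` is meant for tensors holding a single value. For larger tensors, `tolist()` converts everything to nested Python lists, and a single element can still be extracted by indexing first. A small sketch with illustrative values:

```python
import torch

t = torch.tensor([[1.5, 2.5], [3.5, 4.5]])

nested = t.tolist()        # nested Python lists of floats
single = t[0, 1].item()    # index down to one element, then convert

print(nested, single, type(single))
```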

## Exercises

In [53]:
#1

X = torch.arange(12, dtype=torch.float32).reshape((3,4))
Y = torch.tensor([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])

X == Y, X < Y, X > Y

(tensor([[False,  True, False,  True],
         [False, False, False, False],
         [False, False, False, False]]),
 tensor([[ True, False,  True, False],
         [False, False, False, False],
         [False, False, False, False]]),
 tensor([[False, False, False, False],
         [ True,  True,  True,  True],
         [ True,  True,  True,  True]]))

In [116]:
#2

a = torch.arange(4).reshape((2, 2, 1))
a

b = torch.arange(16).reshape(1,2,8)
b

a+b



tensor([[[ 0,  1,  2,  3,  4,  5,  6,  7],
         [ 9, 10, 11, 12, 13, 14, 15, 16]],

        [[ 2,  3,  4,  5,  6,  7,  8,  9],
         [11, 12, 13, 14, 15, 16, 17, 18]]])

# Notes
- Review the concept of broadcasting
- Be careful with memory: variables are not overwritten at the same memory location; use [:] to update in place
