
<a href="https://colab.research.google.com/github/bpesquet/machine-learning-handbook/blob/master/python-data-science/numpy_tensor_management.ipynb"><img align="left" src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open in Colab" title="Open in Google Colaboratory"></a>


# Tensor Management with NumPy

**NumPy** is a Python library that supports large, multi-dimensional arrays and matrices, together with a broad collection of mathematical functions that operate on them.

It is the fundamental package for scientific computing with Python.

In [2]:
# Import the NumPy package under the alias "np"
import numpy as np

## Tensors

In the context of data science, a **tensor** is a set of primitive values (almost always numbers) shaped into an array of any number of dimensions.

Tensors are the core data structures for machine learning.

### Tensor properties

- A tensor's **rank** is its number of dimensions. 
- A dimension is often called an **axis**. 
- The tensor's **shape** describes the number of entries along each axis.
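
The three properties above can be read directly from a NumPy array. The following is a minimal sketch; the array `t` and its interpretation as a batch of images are assumptions made for illustration only.

```python
import numpy as np

# Hypothetical example: a batch of 2 grayscale "images", 3 pixels high and 4 wide
t = np.zeros((2, 3, 4))

print(t.ndim)   # rank: number of axes
print(t.shape)  # number of entries along each axis
print(t.size)   # total number of entries (product of the shape)
```

Here the rank is 3, the shape is `(2, 3, 4)` and the tensor holds 2 × 3 × 4 = 24 entries.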

### Scalars (0D tensors)

In [96]:
x = np.array(12)
print(x)
print ('Dimension: ' + str(x.ndim))
print ('Shape: ' + str(x.shape))

12
Dimension: 0
Shape: ()


### Vectors (1D tensors)

In [97]:
x = np.array([12, 3, 6, 14])
print(x)
print ('Dimension: ' + str(x.ndim))
print ('Shape: ' + str(x.shape))

[12  3  6 14]
Dimension: 1
Shape: (4,)


### Matrices (2D tensors)

In [98]:
x = np.array([[5, 78, 2, 34, 0],
              [6, 79, 3, 35, 1],
              [7, 80, 4, 36, 2]])
print(x)
print ('Dimension: ' + str(x.ndim))
print ('Shape: ' + str(x.shape))

[[ 5 78  2 34  0]
 [ 6 79  3 35  1]
 [ 7 80  4 36  2]]
Dimension: 2
Shape: (3, 5)


### 3D tensors

In [99]:
x = np.array([[[5, 78, 2, 34, 0],
               [6, 79, 3, 35, 1]],
              [[5, 78, 2, 34, 0],
               [6, 79, 3, 35, 1]],
              [[5, 78, 2, 34, 0],
               [6, 79, 3, 35, 1]]])
print(x)
print ('Dimension: ' + str(x.ndim))
print ('Shape: ' + str(x.shape))

[[[ 5 78  2 34  0]
  [ 6 79  3 35  1]]

 [[ 5 78  2 34  0]
  [ 6 79  3 35  1]]

 [[ 5 78  2 34  0]
  [ 6 79  3 35  1]]]
Dimension: 3
Shape: (3, 2, 5)


## Tensor shape management

The number of entries along a specific axis is also called a **dimension**, which can be confusing.

A 3-dimensional *vector* (one axis holding three entries) is not the same as a 3-dimensional *tensor* (three axes).

In [100]:
x = np.array([12, 3, 6]) # x is a 3-dimensional vector (1D tensor)
print(x)
print ('Dimension: ' + str(x.ndim))
print ('Shape: ' + str(x.shape))

[12  3  6]
Dimension: 1
Shape: (3,)
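
To make the distinction concrete, the sketch below contrasts a vector with three entries and a tensor with three axes. The variable names are hypothetical and chosen only for this illustration.

```python
import numpy as np

vector_3_entries = np.array([12, 3, 6])  # one axis holding 3 entries
tensor_3_axes = np.zeros((2, 2, 2))      # three axes holding 2 entries each

print(vector_3_entries.ndim, vector_3_entries.shape)  # 1 (3,)
print(tensor_3_axes.ndim, tensor_3_axes.shape)        # 3 (2, 2, 2)
```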


### Tensors with single-dimensional entries


In [101]:
x = np.array([[12, 3, 6, 14]]) # x is a one row matrix (2D tensor)
print(x)
print ('Dimension: ' + str(x.ndim))
print ('Shape: ' + str(x.shape))

[[12  3  6 14]]
Dimension: 2
Shape: (1, 4)


In [102]:
x = np.array([[12], [3], [6], [14]]) # x is a one column matrix (2D tensor)
print(x)
print ('Dimension: ' + str(x.ndim))
print ('Shape: ' + str(x.shape))

[[12]
 [ 3]
 [ 6]
 [14]]
Dimension: 2
Shape: (4, 1)


### Removing single-dimensional entries from a tensor

In [103]:
x = np.array([[12, 3, 6, 14]])
x = np.squeeze(x) # x is now a vector (1D tensor)
print(x)
print ('Dimension: ' + str(x.ndim))
print ('Shape: ' + str(x.shape))

[12  3  6 14]
Dimension: 1
Shape: (4,)


In [104]:
x = np.array([[12], [3], [6], [14]])
x = np.squeeze(x) # x is now a vector (1D tensor)
print(x)
print ('Dimension: ' + str(x.ndim))
print ('Shape: ' + str(x.shape))

[12  3  6 14]
Dimension: 1
Shape: (4,)
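
`np.squeeze` removes every axis of size 1 by default. As a complement, this sketch shows how to remove only one chosen axis and how to do the reverse with `np.expand_dims` or `np.newaxis`. The arrays used are illustrative assumptions.

```python
import numpy as np

x = np.zeros((1, 4, 1))

# Remove only the first axis; the trailing axis of size 1 is kept
a = np.squeeze(x, axis=0)
print(a.shape)  # (4, 1)

# Reverse operation: insert a new axis of size 1 at position 0
v = np.array([12, 3, 6, 14])
row = np.expand_dims(v, axis=0)
print(row.shape)  # (1, 4)

# Equivalent indexing syntax, here adding a trailing axis
col = v[:, np.newaxis]
print(col.shape)  # (4, 1)
```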


### Reshaping a tensor

In [105]:
x = np.array([[1, 2],
              [3, 4],
              [5, 6]])
x = x.reshape(2, 3)
print(x)
print ('Dimension: ' + str(x.ndim))
print ('Shape: ' + str(x.shape))

[[1 2 3]
 [4 5 6]]
Dimension: 2
Shape: (2, 3)


In [106]:
# Reshape a matrix into a vector
x = np.array([[1, 2],
              [3, 4],
              [5, 6]])
x = x.reshape(6, )
print(x)
print ('Dimension: ' + str(x.ndim))
print ('Shape: ' + str(x.shape))

[1 2 3 4 5 6]
Dimension: 1
Shape: (6,)


In [107]:
# Reshape a 3D tensor into a matrix
x = np.array([[[5, 6],
               [7, 8]],
              [[9, 10],
               [11, 12]],
              [[13, 14],
               [15, 16]]])
print ('Original dimension: ' + str(x.ndim))
print ('Original shape: ' + str(x.shape))
x = x.reshape(3, 2*2)
print(x)
print ('Dimension: ' + str(x.ndim))
print ('Shape: ' + str(x.shape))

Original dimension: 3
Original shape: (3, 2, 2)
[[ 5  6  7  8]
 [ 9 10 11 12]
 [13 14 15 16]]
Dimension: 2
Shape: (3, 4)
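
Reshaping never changes the total number of entries, so one axis size can be left for NumPy to infer by passing `-1`. The sketch below illustrates this on a small tensor built with `np.arange`; the variable names are assumptions for this example.

```python
import numpy as np

x = np.arange(12).reshape(3, 2, 2)  # 12 entries arranged as a 3D tensor

# -1 asks NumPy to infer the size of that axis from the total number of entries
flat = x.reshape(-1)
print(flat.shape)  # (12,)

m = x.reshape(3, -1)
print(m.shape)  # (3, 4)

# A target shape that cannot hold exactly 12 entries is rejected
try:
    x.reshape(5, -1)
except ValueError as error:
    print("Invalid reshape:", error)
```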


### Transposing a tensor

In [108]:
# Transpose a vector (no effect)
x = np.array([12, 3, 6, 14])
x = x.T # alternative syntax: x = np.transpose(x)
print(x)
print ('Dimension: ' + str(x.ndim))
print ('Shape: ' + str(x.shape))

[12  3  6 14]
Dimension: 1
Shape: (4,)


In [109]:
# Transpose a matrix
x = np.array([[5, 78, 2, 34],
              [6, 79, 3, 35],
              [7, 80, 4, 36]])
x = x.T
print(x)
print ('Dimension: ' + str(x.ndim))
print ('Shape: ' + str(x.shape))

[[ 5  6  7]
 [78 79 80]
 [ 2  3  4]
 [34 35 36]]
Dimension: 2
Shape: (4, 3)
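
For tensors with more than two axes, `.T` reverses the order of all axes, while `np.transpose` accepts an explicit axis order. Since transposing a vector has no effect, turning it into a column requires a reshape instead. This is a minimal sketch with illustrative arrays.

```python
import numpy as np

x = np.zeros((3, 2, 5))

# .T reverses the order of the axes
print(x.T.shape)  # (5, 2, 3)

# Swap only the first two axes by giving the new axis order explicitly
y = np.transpose(x, axes=(1, 0, 2))
print(y.shape)  # (2, 3, 5)

# Turn a vector into a one-column matrix
v = np.array([12, 3, 6, 14])
column = v.reshape(-1, 1)
print(column.shape)  # (4, 1)
```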


## Tensor slicing

In [110]:
# Slice a vector
x = np.array([1, 2, 3, 4, 5, 6, 7])
print(x[:3])
print(x[3:])

[1 2 3]
[4 5 6 7]


In [111]:
# Slice a matrix
x = np.array([[5, 78, 2, 34],
              [6, 79, 3, 35],
              [7, 80, 4, 36]])
print(x[:2, :])
print(x[2:, :])
print(x[:, :2])
print(x[:, 2:])

[[ 5 78  2 34]
 [ 6 79  3 35]]
[[ 7 80  4 36]]
[[ 5 78]
 [ 6 79]
 [ 7 80]]
[[ 2 34]
 [ 3 35]
 [ 4 36]]
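
Two details are worth knowing when slicing: indexing with a single integer drops the axis whereas a slice keeps it, and basic slices are views on the original data rather than copies. The sketch below reuses the matrix from the previous cell; the other variable names are assumptions.

```python
import numpy as np

x = np.array([[5, 78, 2, 34],
              [6, 79, 3, 35],
              [7, 80, 4, 36]])

# An integer index drops the axis, a slice keeps it
last_col = x[:, -1]
print(last_col.shape)  # (3,)
last_col_2d = x[:, -1:]
print(last_col_2d.shape)  # (3, 1)

# A basic slice is a view: writing to it modifies the original tensor
view = x[:2, :2]
view[0, 0] = 100
print(x[0, 0])  # 100

# Use .copy() to get independent data
independent = x[:2, :2].copy()
independent[0, 0] = -1
print(x[0, 0])  # still 100
```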


## Creating tensors

NumPy provides several useful functions for initializing tensors with particular values.

### Filling a tensor with zeros

In [112]:
x = np.zeros(3)
print(x)
print ('Dimension: ' + str(x.ndim))
print ('Shape: ' + str(x.shape))

[ 0.  0.  0.]
Dimension: 1
Shape: (3,)


In [113]:
x = np.zeros((3,4))
print(x)
print ('Dimension: ' + str(x.ndim))
print ('Shape: ' + str(x.shape))

[[ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]]
Dimension: 2
Shape: (3, 4)
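
`np.zeros` has several siblings that follow the same calling convention. The sketch below shows a few of them; the shapes and the fill value are arbitrary choices for illustration.

```python
import numpy as np

a = np.ones((2, 3))         # filled with 1.0
b = np.full((2, 2), 7)      # filled with a chosen constant
c = np.zeros_like(b)        # zeros with the same shape and dtype as b
d = np.zeros(3, dtype=int)  # integer zeros instead of the default float64

print(a.shape, a.dtype)  # (2, 3) float64
print(b)
print(c.shape)  # (2, 2)
print(d)
```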


### Filling a tensor with random numbers

Values are sampled from the standard normal (Gaussian) distribution, with mean 0 and variance 1.

In [114]:
x = np.random.randn(5,2)
print(x)
print ('Dimension: ' + str(x.ndim))
print ('Shape: ' + str(x.shape))

[[ 0.040915 -1.3023  ]
 [-1.487269 -0.483633]
 [ 0.400504 -1.054659]
 [ 2.510167  1.305077]
 [-1.214831  0.847603]]
Dimension: 2
Shape: (5, 2)
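
Random values change at every run unless the generator is seeded. As an alternative to `np.random.randn`, the sketch below uses NumPy's `Generator` API (`np.random.default_rng`), which makes results reproducible. The seed value and the distribution parameters are arbitrary assumptions.

```python
import numpy as np

# Seeding the generator makes the sampled values reproducible
rng = np.random.default_rng(seed=42)

# Standard normal distribution (mean 0, standard deviation 1)
x = rng.standard_normal((5, 2))
print(x.shape)  # (5, 2)

# Normal distribution with a chosen mean (loc) and standard deviation (scale)
y = rng.normal(loc=10.0, scale=2.0, size=(5, 2))
print(y.shape)  # (5, 2)

# Uniform distribution over [0, 1)
u = rng.random(3)
print(u.shape)  # (3,)
```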


## Operations between tensors

**Element-wise** operations are applied independently to each entry in the tensors being considered. 

Other operations, like dot product, combine entries in the input tensors to produce a differently shaped result.


### Element-wise addition

In [115]:
# Element-wise addition between two vectors
x = np.array([2, 5, 4])
y = np.array([1, -1, 4])
z = x + y
print(z)
print ('Dimension: ' + str(z.ndim))
print ('Shape: ' + str(z.shape))

[3 4 8]
Dimension: 1
Shape: (3,)


### Element-wise product

In [116]:
# Element-wise product between two matrices (shapes must be identical, or compatible for broadcasting)
x = np.array([[1, 2, 3], 
              [3, 2, 1]])
y = np.array([[3, 0, 2], 
              [1, 4, -2]])
z = x * y
print(z)
print ('Dimension: ' + str(z.ndim))
print ('Shape: ' + str(z.shape))

[[ 3  0  6]
 [ 3  8 -2]]
Dimension: 2
Shape: (2, 3)


### Dot product

![Dot product](images/02fig05.jpg)

In [117]:
# Dot product between two matrices (shapes must be compatible)
x = np.array([[1, 2, 3], 
              [3, 2, 1]]) # x has shape (2, 3)
y = np.array([[3, 0], 
              [2, 1], 
              [4, -2]]) # y has shape (3, 2)
z = np.dot(x, y) # alternative syntax: z = x.dot(y)
print(z)
print ('Dimension: ' + str(z.ndim))
print ('Shape: ' + str(z.shape))

[[19 -4]
 [17  0]]
Dimension: 2
Shape: (2, 2)
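
"Compatible" means that the last axis of the first matrix has the same size as the first axis of the second one: a `(n, k)` matrix times a `(k, m)` matrix gives a `(n, m)` result. The sketch below restates the example with the `@` operator, which is equivalent to `np.dot` for matrices, and shows what happens when shapes do not match. The vectors `u` and `v` are assumptions added for illustration.

```python
import numpy as np

x = np.array([[1, 2, 3],
              [3, 2, 1]])  # shape (2, 3)
y = np.array([[3, 0],
              [2, 1],
              [4, -2]])    # shape (3, 2)

# The @ operator performs matrix multiplication
z = x @ y
print(z.shape)  # (2, 2)

# The dot product of two vectors is a scalar
u = np.array([1, 2, 3])
v = np.array([4, 5, 6])
print(np.dot(u, v))  # 32

# (2, 3) times (2, 3) is not defined: inner sizes 3 and 2 differ
try:
    x @ x
except ValueError as error:
    print("Incompatible shapes:", error)
```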


## Broadcasting

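Broadcasting is NumPy's mechanism for applying element-wise operations to tensors of different shapes.

When the shapes are compatible, the smaller tensor is implicitly "broadcast" (virtually repeated, without copying data) to match the larger tensor's shape before the operation is applied. Shapes are compared axis by axis, starting from the last one; two axes are compatible when their sizes are equal or when one of them is 1.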

### Broadcasting between a vector and a scalar

In [118]:
x = np.array([12, 3, 6, 14])
x = x + 3
print(x)
print ('Dimension: ' + str(x.ndim))
print ('Shape: ' + str(x.shape))

[15  6  9 17]
Dimension: 1
Shape: (4,)


### Broadcasting between a matrix and a scalar

In [119]:
x = np.array([[0, 1, 2], 
              [-2, 5, 3]])
x = x - 1
print(x)
print ('Dimension: ' + str(x.ndim))
print ('Shape: ' + str(x.shape))

[[-1  0  1]
 [-3  4  2]]
Dimension: 2
Shape: (2, 3)


### Broadcasting between a matrix and a vector

In [120]:
x = np.array([[0, 1, 2], 
              [-2, 5, 3]])
y = np.array([1, 2, 3])
z = x + y
print(z)
print ('Dimension: ' + str(z.ndim))
print ('Shape: ' + str(z.shape))

[[ 1  3  5]
 [-1  7  6]]
Dimension: 2
Shape: (2, 3)
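
Broadcasting also works along the other axis, provided the smaller tensor has a size of 1 where the shapes differ. The sketch below adds a one-column matrix to each column of `x`, uses `np.broadcast_to` to display the virtual repetition, and shows a shape mismatch. The arrays `col` and `bad` are assumptions for this example.

```python
import numpy as np

x = np.array([[0, 1, 2],
              [-2, 5, 3]])    # shape (2, 3)

# A (2, 1) column is repeated along the second axis
col = np.array([[10], [20]])  # shape (2, 1)
z = x + col
print(z)
# [[10 11 12]
#  [18 25 23]]

# np.broadcast_to shows the tensor that broadcasting virtually builds
y = np.array([1, 2, 3])       # shape (3,)
print(np.broadcast_to(y, (2, 3)))
# [[1 2 3]
#  [1 2 3]]

# Shapes (2, 3) and (2,) are not compatible: last axes have sizes 3 and 2
bad = np.array([1, 2])
try:
    x + bad
except ValueError as error:
    print("Cannot broadcast:", error)
```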


## Summing tensors

### Summing on all axes

In [121]:
x = np.array([[0, 1, 2], 
              [-2, 5, 3]])
x = np.sum(x) # x is now a scalar (0D tensor)
print(x)
print ('Dimension: ' + str(x.ndim))
print ('Shape: ' + str(x.shape))

9
Dimension: 0
Shape: ()


### Summing on a specific axis

In [122]:
# Sum a matrix along its first axis (rows): one total per column
x = np.array([[0, 1, 2], 
              [-2, 5, 3]])
x = np.sum(x, axis=0) # x is now a vector (1D tensor)
print(x)
print ('Dimension: ' + str(x.ndim))
print ('Shape: ' + str(x.shape))

[-2  6  5]
Dimension: 1
Shape: (3,)


In [123]:
# Sum a matrix along its second axis (columns): one total per row
x = np.array([[0, 1, 2], 
              [-2, 5, 3]])
x = np.sum(x, axis=1) # x is now a vector (1D tensor)
print(x)
print ('Dimension: ' + str(x.ndim))
print ('Shape: ' + str(x.shape))

[3 6]
Dimension: 1
Shape: (2,)


### Keeping tensor dimensions while summing

In [124]:
# Sum a matrix along its first axis (rows), keeping the same number of dimensions
x = np.array([[0, 1, 2], 
              [-2, 5, 3]])
x = np.sum(x, axis=0, keepdims=True) # x is still a matrix (2D tensor)
print(x)
print ('Dimension: ' + str(x.ndim))
print ('Shape: ' + str(x.shape))

[[-2  6  5]]
Dimension: 2
Shape: (1, 3)
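
Keeping the summed axis with size 1 is mostly useful because the result can then be broadcast against the original tensor. The sketch below divides each row by its own total; the input values are arbitrary assumptions.

```python
import numpy as np

x = np.array([[1.0, 1.0, 2.0],
              [2.0, 5.0, 3.0]])

# Shape (2, 1): one total per row, still a matrix
row_sums = x.sum(axis=1, keepdims=True)
print(row_sums.shape)  # (2, 1)

# (2, 3) / (2, 1) broadcasts: each row is divided by its own total
proportions = x / row_sums
print(proportions.sum(axis=1))  # each row now sums to 1 (up to rounding)
```

Without `keepdims=True`, the row totals would have shape `(2,)`, which cannot be broadcast against a `(2, 3)` matrix.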


## Normalizing tensor entries

In [125]:
x = np.random.randn(3,4)
print(x)
print("Mean: " + str(x.mean(axis=0)))
print("Standard deviation: " + str(x.std(axis=0)))
x -= x.mean(axis=0)
x /= x.std(axis=0)
print(x)
print("Final mean: " + str(x.mean(axis=0)))
print("Final standard deviation: " + str(x.std(axis=0)))

[[-0.898209  0.07939  -1.969532  1.198997]
 [-1.37814   1.085242 -0.331007  0.827235]
 [-0.176263  0.673271  1.298829  1.527151]]
Mean: [-0.817537  0.612634 -0.333904  1.184461]
Standard deviation: [ 0.493969  0.41287   1.334304  0.285924]
[[-0.163314 -1.291555 -1.225829  0.050839]
 [-1.134894  1.14469   0.002171 -1.249373]
 [ 1.298208  0.146865  1.223658  1.198534]]
Final mean: [  7.401487e-17  -1.387779e-16   7.401487e-17  -5.181041e-16]
Final standard deviation: [ 1.  1.  1.  1.]
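
The cell above standardizes each column: subtracting the column mean centers the values on 0, and dividing by the column standard deviation rescales them to unit spread. The sketch below wraps the same steps in a helper. The function name `standardize` and the `eps` safeguard against division by zero are assumptions, not part of the original notebook.

```python
import numpy as np


def standardize(x, axis=0, eps=1e-8):
    """Return a copy of x with mean ~0 and standard deviation ~1 along axis.

    eps is a hypothetical safeguard that avoids dividing by zero
    when all entries along the axis are identical.
    """
    mean = x.mean(axis=axis, keepdims=True)
    std = x.std(axis=axis, keepdims=True)
    return (x - mean) / (std + eps)


rng = np.random.default_rng(seed=0)
data = rng.normal(loc=5.0, scale=3.0, size=(100, 4))

normalized = standardize(data)
print(normalized.mean(axis=0))  # close to 0 for every column
print(normalized.std(axis=0))   # close to 1 for every column
```

Unlike the in-place `-=` and `/=` operators used above, this version leaves the input tensor unchanged.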
