# <font color='pink'>Numpy Refresher</font>

### <font style="color:rgb(8,133,37)">Why do we need a special library for math and DL?</font>
Python provides data types such as lists and tuples out of the box. So why do we use special libraries such as PyTorch or TensorFlow for deep learning tasks instead of the standard types?

The main reason is efficiency. Pure Python has no primitive numeric types like those in C. Every value in Python is an object with many properties and methods, as the `dir` function shows:

In [1]:
a = 3
dir(a)[-10:]

['as_integer_ratio',
 'bit_count',
 'bit_length',
 'conjugate',
 'denominator',
 'from_bytes',
 'imag',
 'numerator',
 'real',
 'to_bytes']

In [13]:
import numpy as np
l = [[1,2,3],[2,3,4],[2,4,5]]

A = np.array([[[[1,2,3],[2,3,4],[2,4,5]],[[1,2,3],[2,3,4],[2,4,5]],[[1,2,3],[2,3,4],[2,4,5]],[[1,2,3],[2,3,4],[2,4,5]]],
              [[[1,2,3],[2,3,4],[2,4,5]],[[1,2,3],[2,3,4],[2,4,5]],[[1,2,3],[2,3,4],[2,4,5]],[[1,2,3],[2,3,4],[2,4,5]]]])

B = np.array(l)

print(l)
print(A)
print(B)

print(A.shape)
print(A.dtype)

[[1, 2, 3], [2, 3, 4], [2, 4, 5]]
[[[[1 2 3]
   [2 3 4]
   [2 4 5]]

  [[1 2 3]
   [2 3 4]
   [2 4 5]]

  [[1 2 3]
   [2 3 4]
   [2 4 5]]

  [[1 2 3]
   [2 3 4]
   [2 4 5]]]


 [[[1 2 3]
   [2 3 4]
   [2 4 5]]

  [[1 2 3]
   [2 3 4]
   [2 4 5]]

  [[1 2 3]
   [2 3 4]
   [2 4 5]]

  [[1 2 3]
   [2 3 4]
   [2 4 5]]]]
[[1 2 3]
 [2 3 4]
 [2 4 5]]
(2, 4, 3, 3)
int64


### <font style="color:rgb(8,133,37)">Python Issues</font>

- slow in tasks that require tons of simple math operations on numbers
- huge memory overhead due to storing plain numbers as objects
- runtime overhead during memory dereferencing - cache issues
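To make the memory overhead concrete, here is a minimal sketch that compares the bytes used by a list of Python integers with the bytes used by an `ndarray` holding the same values. The variable names are illustrative, and the exact list size depends on the Python build, so no figure is quoted for it.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)

# A list stores pointers; every element is a separate int object.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# An ndarray stores the raw 8-byte values in one contiguous buffer.
array_bytes = np_arr.nbytes

print('list:   ', list_bytes, 'bytes')
print('ndarray:', array_bytes, 'bytes')
```

The contiguous buffer is also what helps with the cache issue from the last bullet: neighbouring values sit next to each other in memory instead of behind separate pointers.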


NumPy is short for "Numerical Python". As the name suggests, it provides a rich collection of operations on numerical data types behind a Python interface. The core data structure of NumPy is the `ndarray`, a multidimensional array. Let's look at its interface in comparison with plain Python lists.

# <font color='green'>Performance comparison of Numpy array and Python lists </font>

Let's imagine a simple task: we have several 2-dimensional points and want to represent them as a list of points for further processing. To keep things simple, we will not create a `Point` class; instead, each point is a list of 2 elements holding its coordinates (`x` and `y`):

In [2]:
# create points list using explicit specification of coordinates of each point
points = [[0, 1], [10, 5], [7, 3]]
points

[[0, 1], [10, 5], [7, 3]]

In [43]:
# create random points
from random import randint

num_dims = 2
num_points = 10
x_range = (0, 10)
y_range = (1, 50)
points = [[randint(*x_range), randint(*y_range)] for _ in range(num_points)]
points

[[7, 13],
 [0, 17],
 [5, 6],
 [0, 7],
 [1, 41],
 [6, 3],
 [7, 32],
 [10, 14],
 [9, 38],
 [6, 42]]

**How can we do the same using Numpy? Easy!**

In [44]:
import numpy as np
points = np.array(points)  # we are able to create numpy arrays from python lists
points

array([[ 7, 13],
       [ 0, 17],
       [ 5,  6],
       [ 0,  7],
       [ 1, 41],
       [ 6,  3],
       [ 7, 32],
       [10, 14],
       [ 9, 38],
       [ 6, 42]])

In [45]:
# create random points using numpy library
num_dims = 2
num_points = 10
x_range = (0, 11)
y_range = (1, 51)
points = np.random.randint(
    low=(x_range[0], y_range[0]), high=(x_range[1], y_range[1]), size=(num_points, num_dims)
)
points

array([[ 3, 33],
       [ 4, 46],
       [ 1, 16],
       [10,  4],
       [ 2, 19],
       [ 6,  8],
       [ 0, 30],
       [ 6, 49],
       [ 0,  5],
       [ 1, 20]])

**Using NumPy to create such a list may look like an over-complication, and its advantages are not obvious yet. But let's take a look at the performance side.**

In [46]:
num_dims = 2
num_points = 100000
x_range = (0, 10)
y_range = (1, 50)

### <font style="color:rgb(8,133,37)">Python performance</font>

In [47]:
%timeit \
points = [[randint(*x_range), randint(*y_range)] for _ in range(num_points)]

86.4 ms ± 2.43 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


### <font style="color:rgb(8,133,37)">NumPy performance</font>

In [48]:
%timeit \
points = np.random.randint(low=(x_range[0], y_range[0]), high=(x_range[1], y_range[1]), size=(num_points, num_dims))

3.4 ms ± 26.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


Wow, NumPy is **around 25 times faster** than pure Python on this task (86.4 ms vs. 3.4 ms)! One may object that the array we're generating is relatively large, but this size is quite reasonable given the dimensions of inputs (and weights) in neural networks, or in numerical problems such as hydrodynamics.
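`%timeit` is an IPython magic and only works inside a notebook. As a rough sketch, the same comparison can be run from a plain script with the standard `timeit` module. The function names below are hypothetical, and the timings will differ from machine to machine.

```python
import timeit
from random import randint

import numpy as np

num_points = 10_000

def python_points():
    # randint includes both end points
    return [[randint(0, 10), randint(1, 50)] for _ in range(num_points)]

def numpy_points():
    # high is exclusive in NumPy, so 11 and 51 match the inclusive 10 and 50
    return np.random.randint(low=(0, 1), high=(11, 51), size=(num_points, 2))

runs = 5
t_python = timeit.timeit(python_points, number=runs) / runs
t_numpy = timeit.timeit(numpy_points, number=runs) / runs

print('pure Python: {:.3f} ms per call'.format(t_python * 1e3))
print('NumPy:       {:.3f} ms per call'.format(t_numpy * 1e3))
```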

# <font style="color:pink">Basics of Numpy </font>
We will go over some useful operations on NumPy arrays that are most commonly used in ML tasks.

## <font color='pink'>1. Basic Operations </font>


### <font style="color:rgb(8,133,37)">1.1. Python list to numpy array</font>

In [49]:
py_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

np_array = np.array(py_list)
np_array

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [50]:
py_list = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]

np_array= np.array(py_list)
np_array

array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])

### <font style="color:rgb(8,133,37)">1.2. Slicing and Indexing</font>

In [51]:
print('First row:\t\t\t{}'.format(np_array[0]))
print('First column:\t\t\t{}'.format(np_array[:, 0]))
print('3rd row 2nd column element:\t{}'.format(np_array[2][1]))
print('2nd onwards row and 2nd onwards column:\n{}'.format(np_array[1:, 1:]))
print('Last 2 rows and last 2 columns:\n{}'.format(np_array[-2:, -2:]))
print('Array with 3rd, 1st and 4th row:\n{}'.format(np_array[[2, 0, 3]]))

First row:			[1 2 3]
First column:			[ 1  4  7 10]
3rd row 2nd column element:	8
2nd onwards row and 2nd onwards column:
[[ 5  6]
 [ 8  9]
 [11 12]]
Last 2 rows and last 2 columns:
[[ 8  9]
 [11 12]]
Array with 3rd, 1st and 4th row:
[[ 7  8  9]
 [ 1  2  3]
 [10 11 12]]
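One detail worth keeping in mind: a basic slice such as `np_array[1:, 1:]` returns a view onto the same memory, while indexing with a list of rows such as `np_array[[2, 0, 3]]` returns a copy. The sketch below uses a small hypothetical array `arr` to show the difference.

```python
import numpy as np

arr = np.arange(12).reshape(4, 3)

view = arr[1:, 1:]       # basic slicing -> view on arr's data
view[0, 0] = -1          # also changes arr[1, 1]

copy = arr[[2, 0, 3]]    # indexing with a list -> independent copy
copy[0, 0] = 100         # arr is not affected

print(arr[1, 1])                     # -1
print(arr[2, 0])                     # 6
print(np.shares_memory(arr, view))   # True
print(np.shares_memory(arr, copy))   # False
```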


### <font style="color:rgb(8,133,37)">1.3. Basic attributes of NumPy array</font>

Get a full list of attributes of an ndarray object [here](https://numpy.org/devdocs/user/quickstart.html).

In [52]:
print('Data type:\t{}'.format(np_array.dtype))
print('Array shape:\t{}'.format(np_array.shape))

Data type:	int64
Array shape:	(4, 3)


Let's create a function named `array_info` that prints a NumPy array, its shape, and its data type. We will use this function to print arrays in the rest of this section.


In [53]:
def array_info(array):
    print('Array:\n{}'.format(array))
    print('Data type:\t{}'.format(array.dtype))
    print('Array shape:\t{}\n'.format(array.shape))
    
array_info(np_array)

Array:
[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]
Data type:	int64
Array shape:	(4, 3)



### <font style="color:rgb(8,133,37)">1.4. Creating NumPy array using built-in functions and datatypes</font>

The full list of supported data types can be found [here](https://numpy.org/devdocs/user/basics.types.html).


**Sequence Array**

`np.arange([start, ]stop, [step, ]dtype=None)`

Return evenly spaced values in `[start, stop)`.

More details of the function can be found [here](https://docs.scipy.org/doc/numpy/reference/generated/numpy.arange.html).

In [54]:
# sequence array
array = np.arange(10, dtype=np.int64)
array_info(array)

Array:
[0 1 2 3 4 5 6 7 8 9]
Data type:	int64
Array shape:	(10,)



In [55]:
# sequence array
array = np.arange(5, 10, dtype=np.float32)
array_info(array)

Array:
[5. 6. 7. 8. 9.]
Data type:	float32
Array shape:	(5,)



**Zeroes Array**

In [56]:
# Zero array/matrix
zeros = np.zeros((2, 3), dtype=np.float32)
array_info(zeros)

Array:
[[0. 0. 0.]
 [0. 0. 0.]]
Data type:	float32
Array shape:	(2, 3)



**Ones Array**

In [57]:
# ones array/matrix
ones = np.ones((3, 2), dtype=np.int8)
array_info(ones)

Array:
[[1 1]
 [1 1]
 [1 1]]
Data type:	int8
Array shape:	(3, 2)



**Constant Array**

In [58]:
# constant array/matrix
array = np.full((3, 3), 3.14)
array_info(array)

Array:
[[3.14 3.14 3.14]
 [3.14 3.14 3.14]
 [3.14 3.14 3.14]]
Data type:	float64
Array shape:	(3, 3)



**Identity Array**

In [59]:
# identity array/matrix
identity = np.eye(5, dtype=np.float32)      # identity matrix of shape 5x5
array_info(identity)

Array:
[[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]
Data type:	float32
Array shape:	(5, 5)



**Random Integers Array**

`np.random.randint(low, high=None, size=None, dtype='l')`

Returns random integers from the discrete uniform distribution in `[low, high)`. If `high` is `None`, the returned values are in `[0, low)`.

More details can be found [here](https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.random.randint.html).

In [60]:
# random integers array/matrix
rand_int = np.random.randint(5, 10, (2,3)) # random integer array of shape 2x3, values lie in [5, 10)
array_info(rand_int)

Array:
[[6 5 5]
 [6 9 8]]
Data type:	int64
Array shape:	(2, 3)



**Random Array**

`np.random.random(size=None)`

Results are from the `continuous uniform` distribution in `[0.0, 1.0)`.

These kinds of functions are useful for initializing weights in Deep Learning. More details and similar functions can be found [here](https://docs.scipy.org/doc/numpy-1.14.0/reference/generated/numpy.random.random.html).

In [61]:
# random array/matrix
random_array = np.random.random((5, 5))   # random array of shape 5x5
array_info(random_array)

Array:
[[0.22968976 0.63979286 0.92004906 0.64210426 0.85976043]
 [0.85332005 0.17993543 0.11128144 0.53575031 0.0116344 ]
 [0.5336474  0.39730277 0.0175296  0.35712673 0.46156763]
 [0.30105639 0.51682369 0.87648164 0.32592055 0.71648524]
 [0.37338526 0.30771026 0.03550093 0.55577503 0.79802747]]
Data type:	float64
Array shape:	(5, 5)



**Boolean Array**

If we compare the above `random_array` with a constant or with an array of the same shape, we get a boolean array.

In [62]:
# Boolean array/matrix
bool_array = random_array > 0.5
array_info(bool_array)

Array:
[[False  True  True  True  True]
 [ True False False  True False]
 [ True False False False False]
 [False  True  True False  True]
 [False False False  True  True]]
Data type:	bool
Array shape:	(5, 5)



A boolean array can be used to select values from an array. If we index with a boolean array of the same shape, we get the values at the positions where the boolean array is `True`; all other values are dropped, and the result is a 1-D array.

Let's use the above `bool_array` to get values from `random_array`.

In [63]:
# Use boolean array/matrix to get values from array/matrix
values = random_array[bool_array]
array_info(values)

Array:
[0.63979286 0.92004906 0.64210426 0.85976043 0.85332005 0.53575031
 0.5336474  0.51682369 0.87648164 0.71648524 0.55577503 0.79802747]
Data type:	float64
Array shape:	(12,)



In effect, we are keeping only the values that are greater than `0.5`.
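Boolean masks are useful for more than selection. The short sketch below, on a hypothetical array `x`, shows three common uses: selecting values, counting matches, and assigning through a mask.

```python
import numpy as np

x = np.array([0.2, 0.7, 0.5, 0.9])
mask = x > 0.5

selected = x[mask]         # values where the mask is True
count = int(mask.sum())    # True counts as 1, False as 0

y = x.copy()
y[~mask] = 0.0             # zero out everything that is not > 0.5

print(selected)   # [0.7 0.9]
print(count)      # 2
print(y)          # [0.  0.7 0.  0.9]
```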

**Linspace**

`np.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None, axis=0)`

Returns `num` evenly spaced samples, calculated over the interval `[start, stop]`.

More details about the function can be found [here](https://docs.scipy.org/doc/numpy/reference/generated/numpy.linspace.html)

In [64]:
# Linspace
linespace = np.linspace(0, 5, 7, dtype=np.float32)   # 7 elements between 0 and 5
array_info(linespace)

Array:
[0.        0.8333333 1.6666666 2.5       3.3333333 4.1666665 5.       ]
Data type:	float32
Array shape:	(7,)
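The difference between `np.linspace` and `np.arange` is easy to mix up: `linspace` takes the number of samples and includes `stop` by default, while `arange` takes a step and excludes `stop`. A small sketch with illustrative values:

```python
import numpy as np

inclusive = np.linspace(0, 1, 5)                      # stop is included
exclusive = np.linspace(0, 1, 5, endpoint=False)      # stop is excluded
samples, step = np.linspace(0, 5, 7, retstep=True)    # also return the spacing
stepped = np.arange(0, 10, 2.5)                       # fixed step, stop excluded

print(inclusive)   # [0.   0.25 0.5  0.75 1.  ]
print(exclusive)   # [0.  0.2 0.4 0.6 0.8]
print(step)        # spacing between samples, 5 / 6
print(stepped)     # [0.  2.5 5.  7.5]
```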



### <font style="color:rgb(8,133,37)">1.5. Data type conversion</font>

Sometimes it is essential to convert one data type to another data type.

In [65]:
age_in_years = np.random.randint(0, 100, 10)
array_info(age_in_years)

Array:
[12 71  0 21 15 58 68 95 59 21]
Data type:	int64
Array shape:	(10,)



Do we really need an `int64` data type to store age?

So let's convert it to `uint8`.

In [66]:
age_in_years = age_in_years.astype(np.uint8)
array_info(age_in_years)

Array:
[12 71  0 21 15 58 68 95 59 21]
Data type:	uint8
Array shape:	(10,)
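The conversion to `uint8` is safe here only because every age fits into the range 0 to 255. The sketch below, with hypothetical values, shows the memory saving and what happens when a value does not fit: `astype` does not raise, the value silently wraps around.

```python
import numpy as np

ages = np.array([12, 71, 0, 95], dtype=np.int64)
small = ages.astype(np.uint8)
print(ages.nbytes, small.nbytes)       # 32 4

info = np.iinfo(np.uint8)
print(info.min, info.max)              # 0 255

too_big = np.array([12, 300], dtype=np.int64)
wrapped = too_big.astype(np.uint8)     # 300 wraps around modulo 256
print(wrapped)                         # [12 44]
```

Checking the value range with `np.iinfo` (or `np.finfo` for floats) before narrowing a type avoids this kind of silent corruption.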



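Let's convert it to `float128` (note that `np.float128` is not available on every platform, for example on Windows builds of NumPy). 😜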

In [67]:
age_in_years = age_in_years.astype(np.float128)
array_info(age_in_years)

Array:
[12. 71.  0. 21. 15. 58. 68. 95. 59. 21.]
Data type:	float128
Array shape:	(10,)



## <font color='pink'>2. Mathematical functions </font>

NumPy supports many mathematical operations on arrays and matrices. Here we will look at a few that are useful in Deep Learning. All supported functions can be found [here](https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.math.html).

### <font style="color:rgb(8,133,37)">2.1. Exponential Function </font>
The exponential function (`exp`) appears in neural networks inside activation functions. For example, it is used in the softmax function, which is widely used in classification tasks.

`np.exp` returns the element-wise exponential of an array.

More details of `np.exp` can be found **[here](https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.exp.html#numpy.exp)**

In [68]:
array = np.array([np.full(3, -1), np.zeros(3), np.ones(3)])
array_info(array)

# exponential of an array/matrix
print('Exponential of an array:')
exp_array = np.exp(array)
array_info(exp_array)

Array:
[[-1. -1. -1.]
 [ 0.  0.  0.]
 [ 1.  1.  1.]]
Data type:	float64
Array shape:	(3, 3)

Exponential of an array:
Array:
[[0.36787944 0.36787944 0.36787944]
 [1.         1.         1.        ]
 [2.71828183 2.71828183 2.71828183]]
Data type:	float64
Array shape:	(3, 3)
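Since softmax is the main place where `np.exp` shows up in classification, here is a minimal sketch of it. The function name `softmax` and the sample `logits` are illustrative. Subtracting the row maximum before exponentiating is a common stability trick that does not change the result.

```python
import numpy as np

def softmax(logits):
    """Softmax over the last axis of the input array."""
    # Shift by the row maximum so that np.exp never overflows.
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=-1, keepdims=True)

logits = np.array([[1.0, 2.0, 3.0],
                   [1000.0, 1000.0, 1000.0]])
probs = softmax(logits)

print(probs)
print(probs.sum(axis=1))   # each row sums to 1
```

Without the shift, `np.exp(1000.0)` would overflow to `inf` and the second row would come out as `nan`.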



### <font style="color:rgb(8,133,37)">2.2. Square Root </font>

`np.sqrt` returns the element-wise non-negative square root of an array.

More details of the function can be found [here](https://docs.scipy.org/doc/numpy/reference/generated/numpy.sqrt.html)

`Root Mean Square Error` (`RMSE`) and `Mean Absolute Error` (`MAE`) are commonly used to measure the accuracy of predictions for continuous variables; RMSE is where the square root comes in.

In [69]:
array = np.arange(10)
array_info(array)

print('Square root:')
root_array = np.sqrt(array)
array_info(root_array)

Array:
[0 1 2 3 4 5 6 7 8 9]
Data type:	int64
Array shape:	(10,)

Square root:
Array:
[0.         1.         1.41421356 1.73205081 2.         2.23606798
 2.44948974 2.64575131 2.82842712 3.        ]
Data type:	float64
Array shape:	(10,)
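As a small illustration of the two error measures mentioned above, the sketch below computes MAE and RMSE for made-up targets and predictions (the values in `y_true` and `y_pred` are purely illustrative).

```python
import numpy as np

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

errors = y_pred - y_true

mae = np.mean(np.abs(errors))          # mean absolute error
rmse = np.sqrt(np.mean(errors ** 2))   # root mean square error

print('MAE: ', mae)
print('RMSE:', rmse)
```

RMSE squares the errors before averaging, so the single error of size 1 weighs more heavily in it than in MAE.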



### <font style="color:rgb(8,133,37)">2.3. Logarithm </font>

`np.log` returns the element-wise natural logarithm of an array.

More details of the function can be found [here](https://docs.scipy.org/doc/numpy/reference/generated/numpy.log.html)

`Cross-Entropy` / `log loss` is one of the most commonly used losses in Machine Learning classification problems.

In [70]:
array = np.array([0, np.e, np.e**2, 1, 10])
array_info(array)

print('Logarithm:')
log_array = np.log(array)
array_info(log_array)

Array:
[ 0.          2.71828183  7.3890561   1.         10.        ]
Data type:	float64
Array shape:	(5,)

Logarithm:
Array:
[      -inf 1.         2.         0.         2.30258509]
Data type:	float64
Array shape:	(5,)



  log_array = np.log(array)


<font color='red'>**Note:** NumPy emits a `RuntimeWarning` here because we are trying to calculate `log(0)`, which evaluates to `-inf`.</font>
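The `log(0)` problem matters in practice for cross-entropy, where a predicted probability of exactly 0 or 1 would produce `-inf`. A common workaround is to clip the probabilities first. The sketch below is one way to do this; the function name and the `eps` value are assumptions made for illustration.

```python
import numpy as np

def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Mean binary cross-entropy with probabilities clipped away from 0 and 1."""
    p = np.clip(p_pred, eps, 1.0 - eps)   # avoids log(0)
    return -np.mean(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))

y_true = np.array([1.0, 0.0, 1.0])

p_pred = np.array([0.9, 0.2, 0.6])
loss = binary_cross_entropy(y_true, p_pred)

p_extreme = np.array([1.0, 0.0, 0.0])     # last prediction is confidently wrong
loss_extreme = binary_cross_entropy(y_true, p_extreme)

print(loss)
print(loss_extreme)   # large but finite
```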

### <font style="color:rgb(8,133,37)">2.4. Power </font>

`numpy.power(x1, x2)`

Returns the elements of the first array raised to the powers from the second array, element-wise.

The two arrays must be broadcastable to a common shape.

What is **broadcasting**? We will cover it in section 5.3.

More details about the function can be found [here](https://docs.scipy.org/doc/numpy/reference/generated/numpy.power.html)

In [71]:
array = np.arange(0, 6, dtype=np.int64)
array_info(array)

print('Power 3:')
pow_array = np.power(array, 3)
array_info(pow_array)

Array:
[0 1 2 3 4 5]
Data type:	int64
Array shape:	(6,)

Power 3:
Array:
[  0   1   8  27  64 125]
Data type:	int64
Array shape:	(6,)



### <font style="color:rgb(8,133,37)">2.5. Clip Values </font>

`np.clip(a, a_min, a_max)`

Returns the values of `a` clipped element-wise to the interval `[a_min, a_max]`.

More details of the function can be found [here](https://docs.scipy.org/doc/numpy/reference/generated/numpy.clip.html)

`Rectified Linear Unit` (`ReLU`) is one of the most commonly used activation functions in Deep Learning.

What does ReLU do?

If a value is less than zero, ReLU sets it to zero; otherwise it leaves the value unchanged. In the NumPy assignment, you will implement this activation function using NumPy.

In [72]:
array = np.random.random((3, 3))
array_info(array)

# clipped between 0.2 and 0.5
print('Clipped between 0.2 and 0.5')
cliped_array = np.clip(array, 0.2, 0.5)
array_info(cliped_array)

# clipped from below at 0.2 (no upper bound)
print('Clipped to 0.2')
cliped_array = np.clip(array, 0.2, np.inf)
array_info(cliped_array)

Array:
[[0.29200765 0.16539496 0.22904352]
 [0.3849782  0.96187213 0.46891598]
 [0.94090639 0.58064724 0.46449503]]
Data type:	float64
Array shape:	(3, 3)

Clipped between 0.2 and 0.5
Array:
[[0.29200765 0.2        0.22904352]
 [0.3849782  0.5        0.46891598]
 [0.5        0.5        0.46449503]]
Data type:	float64
Array shape:	(3, 3)

Clipped to 0.2
Array:
[[0.29200765 0.2        0.22904352]
 [0.3849782  0.96187213 0.46891598]
 [0.94090639 0.58064724 0.46449503]]
Data type:	float64
Array shape:	(3, 3)
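To connect `np.clip` back to ReLU: clipping with a lower bound of 0 and no upper bound gives the ReLU behaviour described above. The sketch below shows this next to `np.maximum`, another common way to write it; the input values are illustrative.

```python
import numpy as np

x = np.array([-2.0, -0.5, 0.0, 1.5])

relu_clip = np.clip(x, 0, None)   # lower bound 0, no upper bound
relu_max = np.maximum(x, 0)       # element-wise maximum of x and 0

print(relu_clip)   # [0.  0.  0.  1.5]
print(relu_max)    # [0.  0.  0.  1.5]
```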



## <font color='pink'>3. Reshape ndarray </font>

Reshaping an array or matrix is required very often in Machine Learning and Computer Vision.

### <font style="color:rgb(8,133,37)">3.1. Reshape </font>

`np.reshape` gives an array a new shape without changing its data.

More details of the function can be found [here](https://docs.scipy.org/doc/numpy/reference/generated/numpy.reshape.html)

In [73]:
a = np.arange(1, 10, dtype=np.int8)
array_info(a)

print('Reshape to 3x3:')
a_3x3 = a.reshape(3, 3)
array_info(a_3x3)

print('Reshape 3x3 to 3x3x1:')
a_3x3x1 = a_3x3.reshape(3, 3, 1)
array_info(a_3x3x1)

Array:
[1 2 3 4 5 6 7 8 9]
Data type:	int8
Array shape:	(9,)

Reshape to 3x3:
Array:
[[1 2 3]
 [4 5 6]
 [7 8 9]]
Data type:	int8
Array shape:	(3, 3)

Reshape 3x3 to 3x3x1:
Array:
[[[1]
  [2]
  [3]]

 [[4]
  [5]
  [6]]

 [[7]
  [8]
  [9]]]
Data type:	int8
Array shape:	(3, 3, 1)



### <font style="color:rgb(8,133,37)">3.2. Expand Dim </font>

`np.expand_dims`

In the last reshape, we added a new axis. We can use `np.expand_dims` or `np.newaxis` to do the same thing.

More details for `np.expand_dims` can be found [here](https://docs.scipy.org/doc/numpy/reference/generated/numpy.expand_dims.html)

In [74]:
print('Using np.expand_dims:')
a_expand = np.expand_dims(a_3x3, axis=2)
array_info(a_expand)

print('Using np.newaxis:')
a_newaxis = a_3x3[..., np.newaxis]
# or 
# a_newaxis = a_3x3[:, :, np.newaxis]
array_info(a_newaxis)

Using np.expand_dims:
Array:
[[[1]
  [2]
  [3]]

 [[4]
  [5]
  [6]]

 [[7]
  [8]
  [9]]]
Data type:	int8
Array shape:	(3, 3, 1)

Using np.newaxis:
Array:
[[[1]
  [2]
  [3]]

 [[4]
  [5]
  [6]]

 [[7]
  [8]
  [9]]]
Data type:	int8
Array shape:	(3, 3, 1)



### <font style="color:rgb(8,133,37)">3.3. Squeeze </font>

Sometimes we need to remove the redundant axis (single-dimensional entries). We can use `np.squeeze` to do this.

More details of `np.squeeze` can be found [here](https://docs.scipy.org/doc/numpy/reference/generated/numpy.squeeze.html)

Deep Learning very often uses this functionality.

In [75]:
print('Squeeze along axis=2:')
a_squeezed = np.squeeze(a_newaxis, axis=2)
array_info(a_squeezed)

# should raise ValueError
print('Squeeze along axis=1, should get ValueError')
a_squeezed_error = np.squeeze(a_newaxis, axis=1)  # raises ValueError because the size of
                                                  # axis 1 is 3, not 1

Squeeze along axis=2:
Array:
[[1 2 3]
 [4 5 6]
 [7 8 9]]
Data type:	int8
Array shape:	(3, 3)

Squeeze along axis=1, should get ValueError


ValueError: cannot select an axis to squeeze out which has size not equal to one

<font color='red'>**Note:** We get an error because the size of axis 1 is not equal to one.</font>
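When no `axis` is given, `np.squeeze` removes every axis of size one. The sketch below shows both forms on a hypothetical array of shape `(1, 3, 1)` and catches the `ValueError` instead of letting it stop the cell.

```python
import numpy as np

x = np.zeros((1, 3, 1))

all_squeezed = np.squeeze(x)            # removes both size-1 axes
first_squeezed = np.squeeze(x, axis=0)  # removes only axis 0

print(all_squeezed.shape)     # (3,)
print(first_squeezed.shape)   # (3, 1)

try:
    np.squeeze(x, axis=1)     # axis 1 has size 3, so this fails
    squeeze_failed = False
except ValueError as err:
    squeeze_failed = True
    print('ValueError:', err)
```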

### <font style="color:rgb(8,133,37)">3.4. Reshape revisit </font>

We have a 1-d array of length n. We want to reshape it into a 2-d array with two columns, without having to work out the number of rows. Passing `-1` for a dimension tells NumPy to infer its size from the length of the array.

In [76]:
a = np.arange(10)
array_info(a)

print('Reshape such that number of column is 2:')
a_col_2 = a.reshape(-1, 2)
array_info(a_col_2)

Array:
[0 1 2 3 4 5 6 7 8 9]
Data type:	int64
Array shape:	(10,)

Reshape such that number of column is 2:
Array:
[[0 1]
 [2 3]
 [4 5]
 [6 7]
 [8 9]]
Data type:	int64
Array shape:	(5, 2)
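The same `-1` trick is often used to flatten a batch of images into one row per image. The sketch below uses a hypothetical batch of 8 images of size 28x28 and also shows that the inferred dimension must divide the total size evenly.

```python
import numpy as np

batch = np.zeros((8, 28, 28))               # hypothetical batch of 8 images
flat = batch.reshape(batch.shape[0], -1)    # one row per image
print(flat.shape)                           # (8, 784)

column = np.arange(6).reshape(-1, 1)        # column vector
print(column.shape)                         # (6, 1)

try:
    np.arange(10).reshape(-1, 3)            # 10 is not divisible by 3
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)                       # True
```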



## <font color='pink'>4. Combine Arrays / Matrix </font>

Combining two or more arrays is a frequent operation in machine learning. Let's have a look at a few methods. 


### <font style="color:rgb(8,133,37)">4.1. Concatenate </font>

`np.concatenate` joins a sequence of arrays along an existing axis.

More details of the function can be found [here](https://docs.scipy.org/doc/numpy/reference/generated/numpy.concatenate.html)

In [77]:
a1 = np.array([[1, 2, 3], [4, 5, 6]])
a2 = np.array([[7, 8, 9]])

print('Concatenate along axis zero:')
array = np.concatenate((a1, a2), axis=0)
array_info(array)

Concatenate along axis zero:
Array:
[[1 2 3]
 [4 5 6]
 [7 8 9]]
Data type:	int64
Array shape:	(3, 3)



### <font style="color:rgb(8,133,37)">4.2. hstack </font>

`np.hstack` stacks arrays in sequence horizontally (column-wise).

More details of the function can be found [here](https://docs.scipy.org/doc/numpy/reference/generated/numpy.hstack.html#numpy.hstack)

In [78]:
a1 = np.array((1, 2, 3))
a2 = np.array((4, 5, 6))
a_hstacked = np.hstack((a1,a2))

print('Horizontal stack:')
array_info(a_hstacked)

Horizontal stack:
Array:
[1 2 3 4 5 6]
Data type:	int64
Array shape:	(6,)



In [79]:
a1 = np.array([[1],[2],[3]])
a2 = np.array([[4],[5],[6]])
a_hstacked = np.hstack((a1,a2))

print('Horizontal stack:')
array_info(a_hstacked)

Horizontal stack:
Array:
[[1 4]
 [2 5]
 [3 6]]
Data type:	int64
Array shape:	(3, 2)



### <font style="color:rgb(8,133,37)">4.3. vstack </font>

`np.vstack` stacks arrays in sequence vertically (row-wise).

More details of the function can be found [here](https://docs.scipy.org/doc/numpy/reference/generated/numpy.vstack.html#numpy.vstack)

In [80]:
a1 = np.array([1, 2, 3])
a2 = np.array([4, 5, 6])
a_vstacked = np.vstack((a1, a2))

print('Vertical stack:')
array_info(a_vstacked)

Vertical stack:
Array:
[[1 2 3]
 [4 5 6]]
Data type:	int64
Array shape:	(2, 3)



In [81]:
a1 = np.array([[1, 11], [2, 22], [3, 33]])
a2 = np.array([[4, 44], [5, 55], [6, 66]])
a_vstacked = np.vstack((a1, a2))

print('Vertical stack:')
array_info(a_vstacked)

Vertical stack:
Array:
[[ 1 11]
 [ 2 22]
 [ 3 33]
 [ 4 44]
 [ 5 55]
 [ 6 66]]
Data type:	int64
Array shape:	(6, 2)
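For 2-d arrays, the three functions in this section are closely related: `np.vstack` matches `np.concatenate` along axis 0, and `np.hstack` matches it along axis 1. A short check, reusing the values from the last example:

```python
import numpy as np

a1 = np.array([[1, 11], [2, 22], [3, 33]])
a2 = np.array([[4, 44], [5, 55], [6, 66]])

rows = np.concatenate((a1, a2), axis=0)   # shape (6, 2)
cols = np.concatenate((a1, a2), axis=1)   # shape (3, 4)

same_v = np.array_equal(rows, np.vstack((a1, a2)))
same_h = np.array_equal(cols, np.hstack((a1, a2)))

print(rows.shape, cols.shape)   # (6, 2) (3, 4)
print(same_v, same_h)           # True True
```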



## <font color='pink'>5. Element wise Operations </font>


Let's generate random arrays to show element-wise operations.

In [82]:
a = np.random.random((4,4))
b = np.random.random((4,4))
array_info(a)
array_info(b)

Array:
[[0.28055804 0.89750495 0.32290294 0.52530489]
 [0.94692486 0.62762753 0.9070513  0.7303294 ]
 [0.04341395 0.33595116 0.69466514 0.99016402]
 [0.70813161 0.8224878  0.53109625 0.81882403]]
Data type:	float64
Array shape:	(4, 4)

Array:
[[0.14750968 0.75134626 0.3228003  0.19924315]
 [0.04339902 0.98453058 0.69117909 0.0122185 ]
 [0.42056818 0.98801316 0.71156809 0.1177584 ]
 [0.6271482  0.67193596 0.23233437 0.0907897 ]]
Data type:	float64
Array shape:	(4, 4)



### <font style="color:rgb(8,133,37)">5.1. Element wise Scalar Operation </font>

**Scalar Addition**

In [83]:
a + 5 # element wise scalar addition

array([[5.28055804, 5.89750495, 5.32290294, 5.52530489],
       [5.94692486, 5.62762753, 5.9070513 , 5.7303294 ],
       [5.04341395, 5.33595116, 5.69466514, 5.99016402],
       [5.70813161, 5.8224878 , 5.53109625, 5.81882403]])

**Scalar Subtraction**

In [None]:
a - 5 # element wise scalar subtraction

array([[-4.79381179, -4.01385302, -4.10804192, -4.91274206],
       [-4.84598429, -4.79524191, -4.90177379, -4.64899849],
       [-4.62045477, -4.35334043, -4.20724509, -4.62784018],
       [-4.57977082, -4.08801497, -4.09955634, -4.24919991]])

**Scalar Multiplication**

In [None]:
a * 10 # element wise scalar multiplication

array([[2.06188205, 9.86146978, 8.91958082, 0.87257943],
       [1.54015706, 2.04758087, 0.98226213, 3.51001514],
       [3.79545229, 6.46659571, 7.92754911, 3.72159823],
       [4.20229177, 9.11985025, 9.00443658, 7.50800086]])

**Scalar Division**

In [None]:
a/10 # element wise scalar division

array([[0.02061882, 0.0986147 , 0.08919581, 0.00872579],
       [0.01540157, 0.02047581, 0.00982262, 0.03510015],
       [0.03795452, 0.06466596, 0.07927549, 0.03721598],
       [0.04202292, 0.0911985 , 0.09004437, 0.07508001]])

### <font style="color:rgb(8,133,37)">5.2. Element wise Array Operations </font>

**Arrays Addition**

In [None]:
a + b # element wise array/vector addition

array([[0.7816227 , 1.25863032, 0.92743483, 0.11178089],
       [0.9928362 , 0.85439993, 0.92546319, 0.91142549],
       [0.7612639 , 0.75585012, 1.63082277, 0.55096238],
       [1.11731998, 1.10243276, 1.66527624, 1.58494226]])

**Arrays Subtraction**

In [None]:
a - b # element wise array/vector subtraction

array([[-0.36924629,  0.71366364,  0.85648134,  0.06273499],
       [-0.68480479, -0.44488376, -0.72901076, -0.20942246],
       [-0.00217345,  0.53746902, -0.04531295,  0.19335727],
       [-0.27686162,  0.72153729,  0.13561107, -0.08334209]])

**Arrays Multiplication**

In [None]:
a * b # element wise array/vector multiplication

array([[0.11864781, 0.26870862, 0.03164377, 0.00213982],
       [0.12919153, 0.13301942, 0.08125636, 0.19670966],
       [0.1448795 , 0.07060912, 0.66438241, 0.06654313],
       [0.29293789, 0.17368548, 0.68868865, 0.62627402]])

**Arrays Division**

In [None]:
a / b # element wise array/vector division

array([[ 0.35831742,  3.61910924, 25.14205038,  3.55821584],
       [ 0.18360985,  0.31518611,  0.11874011,  0.62631424],
       [ 0.99430616,  5.92230332,  0.94593165,  2.08140104],
       [ 0.60283277,  4.7886368 ,  1.17730818,  0.90008647]])

Notice that both arrays have the same shape in the element-wise operations above. **What if the shapes are not equal?** Let's check!

In [84]:
print('Array "a":')
array_info(a)
print('Array "c":')
c = np.random.rand(2, 2)
array_info(c)
# Should throw ValueError
a + c

Array "a":
Array:
[[0.28055804 0.89750495 0.32290294 0.52530489]
 [0.94692486 0.62762753 0.9070513  0.7303294 ]
 [0.04341395 0.33595116 0.69466514 0.99016402]
 [0.70813161 0.8224878  0.53109625 0.81882403]]
Data type:	float64
Array shape:	(4, 4)

Array "c":
Array:
[[0.87914623 0.19482509]
 [0.72945478 0.73562304]]
Data type:	float64
Array shape:	(2, 2)



ValueError: operands could not be broadcast together with shapes (4,4) (2,2) 

<font color='red'>**Oh, we got a ValueError!**</font>

What is this error?

<font color='red'>ValueError</font>: operands could not be broadcast together with shapes `(4,4)` `(2,2)` 

**Let's see it next.**


### <font style="color:rgb(8,133,37)">5.3. Broadcasting </font>

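Broadcasting is NumPy's mechanism for element-wise operations on arrays with different shapes. The shapes are compared starting from the trailing dimension; two dimensions are compatible when they are equal or when one of them is 1. An array with size 1 (or a missing dimension) along some axis is then treated as if it were repeated along that axis, without actually copying the data, so that both operands end up with the same shape. The shapes `(4,4)` and `(2,2)` above fail this rule because 4 and 2 differ and neither is 1.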

Let's try to understand with a simple example.

In [None]:
a = np.array([[1, 2, 3], [4, 5, 6],[7, 8, 9]])
b = np.array([0, 1, 0])

print('Array "a":')
array_info(a)
print('Array "b":')
array_info(b)

print('Array "a+b":')
array_info(a+b)  # b is broadcast so that it can be added to a.


# b = [0,1,0] is broadcast to       [[0, 1, 0],
#                                    [0, 1, 0],
#                                    [0, 1, 0]]  and added to a.

Array "a":
Array:
[[1 2 3]
 [4 5 6]
 [7 8 9]]
Data type:	int64
Array shape:	(3, 3)

Array "b":
Array:
[0 1 0]
Data type:	int64
Array shape:	(3,)

Array "a+b":
Array:
[[1 3 3]
 [4 6 6]
 [7 9 9]]
Data type:	int64
Array shape:	(3, 3)
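Broadcasting can also stretch both operands at once. In the sketch below (illustrative values), a column of shape `(3, 1)` and a row of shape `(3,)` combine into a `(3, 3)` result, and the incompatible `(4, 4)` and `(2, 2)` case is caught rather than left to stop the cell.

```python
import numpy as np

col = np.array([[10], [20], [30]])   # shape (3, 1)
row = np.array([1, 2, 3])            # shape (3,), treated as (1, 3)

total = col + row                    # (3, 1) and (1, 3) broadcast to (3, 3)
print(total)
# [[11 12 13]
#  [21 22 23]
#  [31 32 33]]

# (4, 4) and (2, 2): 4 != 2 and neither is 1, so broadcasting fails
try:
    np.ones((4, 4)) + np.ones((2, 2))
    broadcast_failed = False
except ValueError:
    broadcast_failed = True
print(broadcast_failed)              # True
```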



## <font color='pink'>6. Linear Algebra</font>

Here we look at linear algebra operations commonly used in Machine Learning.

### <font style="color:rgb(8,133,37)">6.1. Transpose </font>

In [None]:
a = np.random.random((2,3))
print('Array "a":')
array_info(a)

print('Transpose of "a":')
a_transpose = a.transpose()
array_info(a_transpose)

Array "a":
Array:
[[0.32678018 0.52626284 0.12422799]
 [0.86600758 0.94664089 0.30984851]]
Data type:	float64
Array shape:	(2, 3)

Transpose of "a":
Array:
[[0.32678018 0.86600758]
 [0.52626284 0.94664089]
 [0.12422799 0.30984851]]
Data type:	float64
Array shape:	(3, 2)
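Two related forms are worth knowing: the `.T` attribute is shorthand for `transpose()`, and `np.transpose` accepts an explicit axis order for arrays with more than two dimensions. The sketch below uses a hypothetical image stored as height x width x channels.

```python
import numpy as np

m = np.arange(6).reshape(2, 3)
print(np.array_equal(m.T, m.transpose()))        # True

image_hwc = np.zeros((32, 48, 3))                # height, width, channels
image_chw = np.transpose(image_hwc, (2, 0, 1))   # channels, height, width
print(image_chw.shape)                           # (3, 32, 48)
```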



### <font style="color:rgb(8,133,37)">6.2. Matrix Multiplication</font>
We will discuss 2 ways of performing Matrix Multiplication.

- `matmul`
- Python `@` operator

**Using the `matmul` function in NumPy**
`np.matmul` is a common way to multiply two matrices with NumPy. [See docs](https://docs.scipy.org/doc/numpy/reference/generated/numpy.matmul.html)

In [None]:
a = np.random.random((3, 4))
b = np.random.random((4, 2))

print('Array "a":')
array_info(a)
print('Array "b"')
array_info(b)

c = np.matmul(a,b) # matrix multiplication of a and b

print('matrix multiplication of a and b:')
array_info(c)

print('{} x {} --> {}'.format(a.shape, b.shape, c.shape)) # axis 1 of a and axis 0 of b must have
                                                        # the same size for matrix multiplication

Array "a":
Array:
[[0.19837622 0.54867477 0.29697277 0.9020018 ]
 [0.25028083 0.47372491 0.57636356 0.99023833]
 [0.9210784  0.78337408 0.73200099 0.09821953]]
Data type:	float64
Array shape:	(3, 4)

Array "b"
Array:
[[0.33795095 0.8520493 ]
 [0.92371395 0.61533368]
 [0.49767764 0.92149655]
 [0.56722858 0.53479792]]
Data type:	float64
Array shape:	(4, 2)

matrix multiplication of a and b:
Array:
[[1.23329788 1.26269245]
 [1.37070369 1.56544493]
 [1.45490633 1.99390465]]
Data type:	float64
Array shape:	(3, 2)

(3, 4) x (4, 2) --> (3, 2)


**Using `@` operator**
The `@` operator for matrix multiplication was introduced in Python 3.5; for NumPy arrays it is equivalent to `np.matmul`. [See docs](https://www.python.org/dev/peps/pep-0465/)

In [None]:
a = np.random.random((3, 4))
b = np.random.random((4, 2))

print('Array "a":')
array_info(a)
print('Array "b"')
array_info(b)

c = a@b # matrix multiplication of a and b
array_info(c)

Array "a":
Array:
[[0.25692542 0.98916152 0.14395255 0.32123955]
 [0.8701551  0.32276888 0.67997767 0.98174477]
 [0.73770333 0.47442233 0.40709742 0.1758751 ]]
Data type:	float64
Array shape:	(3, 4)

Array "b"
Array:
[[0.86007706 0.63512662]
 [0.39461333 0.86837326]
 [0.58309036 0.23802624]
 [0.40909642 0.49526679]]
Data type:	float64
Array shape:	(4, 2)

Array:
[[0.82666727 1.21550535]
 [1.67388604 1.48102064]
 [1.13101955 1.06451566]]
Data type:	float64
Array shape:	(3, 2)
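It is easy to confuse `@` with `*`. The sketch below uses two small illustrative matrices to show that `@` is the matrix product while `*` multiplies element by element, and that mismatched inner dimensions raise a `ValueError`.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

matrix_product = a @ b    # rows of a times columns of b
elementwise = a * b       # entry-by-entry product

print(matrix_product)     # [[19 22]
                          #  [43 50]]
print(elementwise)        # [[ 5 12]
                          #  [21 32]]

try:
    np.ones((3, 4)) @ np.ones((3, 4))   # inner dimensions 4 and 3 differ
    matmul_failed = False
except ValueError:
    matmul_failed = True
print(matmul_failed)      # True
```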



### <font style="color:rgb(8,133,37)">6.3. Inverse</font>

In [None]:
A = np.random.random((3,3))
print('Array "A":')
array_info(A)
A_inverse = np.linalg.inv(A)
print('Inverse of "A" ("A_inverse"):')
array_info(A_inverse)

print('"A x A_inverse = Identity" should be true:')
A_X_A_inverse = np.matmul(A, A_inverse)  # A x A_inverse = I = Identity matrix
array_info(A_X_A_inverse)

Array "A":
Array:
[[0.69602469 0.30991938 0.6412185 ]
 [0.46201512 0.97870859 0.60559073]
 [0.10836314 0.64972568 0.11656206]]
Data type:	float64
Array shape:	(3, 3)

Inverse of "A" ("A_inverse"):
Array:
[[ 4.21177468 -5.73591594  6.63122873]
 [-0.1774372  -0.17555812  1.88819913]
 [-2.92647193  6.31102815 -8.11063382]]
Data type:	float64
Array shape:	(3, 3)

"A x A_inverse = Identity" should be true:
Array:
[[ 1.00000000e+00 -3.33958446e-16 -2.74140832e-17]
 [ 5.90561966e-16  1.00000000e+00  3.74849809e-17]
 [ 1.00480120e-16 -1.00453515e-16  1.00000000e+00]]
Data type:	float64
Array shape:	(3, 3)
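The product above is only approximately the identity because of floating-point rounding, so an exact comparison would fail; `np.allclose` is the usual check. The sketch below also shows `np.linalg.solve`, which solves `A x = b` without forming the inverse explicitly. The matrix and vector are illustrative.

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
b = np.array([1.0, 2.0])

A_inv = np.linalg.inv(A)
is_identity = np.allclose(A @ A_inv, np.eye(2))   # tolerant comparison
print(is_identity)                                # True

x_inv = A_inv @ b                  # solve via the explicit inverse
x_solve = np.linalg.solve(A, b)    # solve directly
print(x_solve)                     # close to [0.1 0.6]
```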



### <font style="color:rgb(8,133,37)">6.4. Dot Product</font>

In [None]:
a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])

dot_pro = np.dot(a, b)  # It will be a scalar, so its shape will be empty
array_info(dot_pro)

Array:
70
Data type:	int64
Array shape:	()



## <font color='pink'>7. Array statistics</font>

### <font style="color:rgb(8,133,37)">7.1. Sum</font>

In [None]:
a = np.array([1, 2, 3, 4, 5])

print(a.sum())

15


### <font style="color:rgb(8,133,37)">7.2. Sum along Axis</font>

In [None]:
a = np.array([[1, 2, 3], [4, 5, 6]])
print(a)
print('')

print('sum along 0th axis = ',a.sum(axis = 0)) # sum along 0th axis ie: 1+4, 2+5, 3+6
print("")
print('sum along 1st axis = ',a.sum(axis = 1)) # sum along 1st axis ie: 1+2+3, 4+5+6

[[1 2 3]
 [4 5 6]]

sum along 0th axis =  [5 7 9]

sum along 1st axis =  [ 6 15]
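Summing along an axis removes that axis from the result. Passing `keepdims=True` keeps it with size 1, which makes the result broadcast cleanly against the original array. The sketch below uses this to scale each row so that it sums to 1.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

row_sums = a.sum(axis=1, keepdims=True)   # shape (2, 1) instead of (2,)
normalized = a / row_sums                 # each row divided by its own sum

print(row_sums.shape)            # (2, 1)
print(normalized.sum(axis=1))    # each row now sums to 1
```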


### <font style="color:rgb(8,133,37)">7.3. Minimum and Maximum</font>

In [None]:
a = np.array([-1.1, 2, 5, 100])

print('Minimum = ', a.min())
print('Maximum = ', a.max())

Minimum =  -1.1
Maximum =  100.0


### <font style="color:rgb(8,133,37)">7.4. Min and Max along Axis</font>

In [None]:
a = np.array([[-2, 0, 2], [1, 2, 3]])

print('a =\n',a,'\n')
print('Minimum = ', a.min())
print('Maximum = ', a.max())
print()
print('Minimum along axis 0 = ', a.min(0))
print('Maximum along axis 0 = ', a.max(0))
print()
print('Minimum along axis 1 = ', a.min(1))
print('Maximum along axis 1 = ', a.max(1))

a =
 [[-2  0  2]
 [ 1  2  3]] 

Minimum =  -2
Maximum =  3

Minimum along axis 0 =  [-2  0  2]
Maximum along axis 0 =  [1 2 3]

Minimum along axis 1 =  [-2  1]
Maximum along axis 1 =  [2 3]
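Often the position of the extreme value matters more than the value itself, for example when picking the predicted class from a row of scores. `argmin` and `argmax` follow the same axis convention as `min` and `max`; the sketch reuses the array from the example above.

```python
import numpy as np

a = np.array([[-2, 0, 2], [1, 2, 3]])

print(a.argmax())          # 5, index into the flattened array
print(a.argmax(axis=1))    # [2 2], column of the maximum in each row
print(a.argmin(axis=1))    # [0 0], column of the minimum in each row
```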


### <font style="color:rgb(8,133,37)">7.5. Mean and Standard Deviation</font>

In [None]:
a = np.array([-1, 0, -0.4, 1.2, 1.43, -1.9, 0.66])

print('mean of the array = ',a.mean())
print('standard deviation of the array = ',a.std())

mean of the array =  -0.001428571428571414
standard deviation of the array =  1.1142252730860458


### <font style="color:rgb(8,133,37)">7.6. Standardizing the Array</font>

Rescale the array elements so that `mean=0` and `std=1`.

In [None]:
a = np.array([-1, 0, -0.4, 1.2, 1.43, -1.9, 0.66])

print('mean of the array = ',a.mean())
print('standard deviation of the array = ',a.std())
print()

standardized_a = (a - a.mean())/a.std()
print('Standardized Array = ', standardized_a)
print()

print('mean of the standardized array = ',standardized_a.mean()) # close to 0
print('standard deviation of the standardized  array = ',standardized_a.std()) # equal to 1

mean of the array =  -0.001428571428571414
standard deviation of the array =  1.1142252730860458

Standardized Array =  [-8.96202458e-01  1.28212083e-03 -3.57711711e-01  1.07826362e+00
  1.28468507e+00 -1.70393858e+00  5.93621943e-01]

mean of the standardized array =  -3.172065784643304e-17
standard deviation of the standardized  array =  1.0
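In machine learning, the data is usually a 2-d array with one row per sample and one column per feature, and each feature is standardized separately. The sketch below does this with `axis=0` statistics and broadcasting; the matrix `X` is made up for illustration.

```python
import numpy as np

X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])

mean = X.mean(axis=0)         # one mean per column
std = X.std(axis=0)           # one standard deviation per column
X_std = (X - mean) / std      # shape (2,) broadcasts across the rows

print(X_std.mean(axis=0))     # close to [0, 0]
print(X_std.std(axis=0))      # close to [1, 1]
```

A column with zero variance would cause a division by zero here, so real code usually guards against that case.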


# <font color='pink'>References </font>

https://numpy.org/devdocs/user/quickstart.html

https://numpy.org/devdocs/user/basics.types.html

https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.astype.html

https://coolsymbol.com/emojis/emoji-for-copy-and-paste.html

https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.math.html

https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.exp.html#numpy.exp

https://docs.scipy.org/doc/numpy/reference/generated/numpy.clip.html

https://docs.scipy.org/doc/numpy/reference/generated/numpy.sqrt.html

https://docs.scipy.org/doc/numpy/reference/generated/numpy.log.html

https://docs.scipy.org/doc/numpy/reference/generated/numpy.power.html

https://docs.scipy.org/doc/numpy/reference/generated/numpy.reshape.html

https://docs.scipy.org/doc/numpy/reference/generated/numpy.expand_dims.html

https://docs.scipy.org/doc/numpy/reference/generated/numpy.squeeze.html

https://docs.scipy.org/doc/numpy/reference/generated/numpy.concatenate.html

https://docs.scipy.org/doc/numpy/reference/generated/numpy.hstack.html#numpy.hstack

https://docs.scipy.org/doc/numpy/reference/generated/numpy.vstack.html#numpy.vstack