# Machine learning zoomcamp
* https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp

## Numpy
* Tutorial: https://mlbookcamp.com/article/numpy
* Video: https://www.youtube.com/watch?v=Qa0-jYtRdbY
* Notebook from Video: https://github.com/alexeygrigorev/mlbookcamp-code/blob/master/course-zoomcamp/01-intro/notebooks/07-numpy.ipynb
* Notebook for exercise: https://github.com/alexeygrigorev/mlbookcamp-code/blob/master/appendix-c-numpy.ipynb

## My Exercise and Notes

NumPy is short for “Numerical Python”: it is a Python library for numerical computation. NumPy plays a central role in the Python machine learning ecosystem, and many of its core libraries depend on it. For example, Pandas, Scikit-Learn, and TensorFlow all rely on NumPy for numerical operations.

NumPy arrays are similar to Python lists, but they are better optimized for number-crunching tasks such as machine learning.
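A minimal sketch of what this optimization looks like in practice: arithmetic on an array is applied to every element at once, while a plain list needs an explicit loop. The variable names here are illustrative only.

```python
import numpy as np

prices_list = [1.5, 2.0, 3.25]
prices_array = np.array(prices_list)

# A list needs an explicit loop (here a comprehension) to double each value
doubled_list = [p * 2 for p in prices_list]

# An array applies the operation to every element at once
doubled_array = prices_array * 2

print(doubled_list)
print(doubled_array)
```

Note that `prices_list * 2` would not double the values: for a list, `*` repeats the list instead.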

In [62]:
import numpy as np

## Creating numpy arrays

### One-dimensional arrays

A one-dimensional array is a vector.

#### np.zeros
Create a NumPy array with the specified number of zeros. The elements are floats by default.

In [63]:
zeros = np.zeros(10)
zeros

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

#### np.ones
Create a NumPy array with the specified number of ones.

In [64]:
ones = np.ones(7)
ones

array([1., 1., 1., 1., 1., 1., 1.])

#### np.full

Create a NumPy array with the specified number of elements, all set to a given value.

np.full(N, V): N = number of elements, V = value of each element

In [65]:
np.full(10,2)

array([2, 2, 2, 2, 2, 2, 2, 2, 2, 2])
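The outputs above show floats for np.zeros and np.ones but integers for np.full(10,2). A small sketch of why, assuming default settings: np.full takes its element type from the fill value unless dtype is given.

```python
import numpy as np

ints = np.full(3, 2)                      # integer fill value -> integer array
floats = np.full(3, 2.0)                  # float fill value -> float array
zeros_int = np.zeros(3, dtype=np.int64)   # override the float default of np.zeros

print(ints.dtype, floats.dtype, zeros_int.dtype)
```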

#### np.repeat

You can also use np.repeat to create an array filled with a given value.

np.repeat(V, N): V = value, N = number of times to repeat

In [66]:
np.repeat(2,10)

array([2, 2, 2, 2, 2, 2, 2, 2, 2, 2])

With np.repeat you can also repeat each of several values instead of a single number.

In [67]:
np.repeat([1,3],5)

array([1, 1, 1, 1, 1, 3, 3, 3, 3, 3])

You can also specify a different repeat count for each value.

In [68]:
np.repeat([0,1],[2,6])

array([0, 0, 1, 1, 1, 1, 1, 1])
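As a quick check of the per-value counts, here is a sketch that builds a vector of class labels (the names `classes` and `counts` are made up for illustration) and confirms that the result length is the sum of the counts.

```python
import numpy as np

classes = [0, 1]
counts = [2, 6]   # repeat 0 twice and 1 six times

labels = np.repeat(classes, counts)

print(labels)
print(len(labels) == sum(counts))
```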

#### np.linspace

Create a NumPy array of '*x*' equally spaced elements (so *x* is the length of the array) between the '*start*' and '*end*' values, both included by default.

np.linspace(start,end,x)

In [69]:
np.linspace(0,10,5)

array([ 0. ,  2.5,  5. ,  7.5, 10. ])

If you do not want the '*end*' value to be included, use endpoint=False (the default is endpoint=True).

In [70]:
np.linspace(0,10,5,endpoint=False)

array([0., 2., 4., 6., 8.])
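The two outputs above differ because the spacing changes with endpoint. A short sketch using the retstep=True option of np.linspace, which also returns the spacing, makes this visible.

```python
import numpy as np

start, end, n = 0, 10, 5

# retstep=True also returns the spacing between consecutive values
with_end, step_with_end = np.linspace(start, end, n, retstep=True)
without_end, step_without_end = np.linspace(start, end, n, endpoint=False, retstep=True)

print(step_with_end)      # (end - start) / (n - 1)
print(step_without_end)   # (end - start) / n
```

With the end value included there are n - 1 gaps between n points; without it there are n gaps.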

You can define the type of the elements (the kind of values the array can hold) using dtype, e.g. unsigned integers (uint8, uint16, uint32 and uint64, of size 8, 16, 32 and 64 bits respectively), signed integers (int8, int16, int32 and int64) or floats (float16, float32 and float64).
dtype can also be passed to np.zeros, np.ones and np.full (np.repeat does not accept it). With an integer dtype, fractional values are truncated (2.5 becomes 2).

In [71]:
print(np.linspace(0,10,5,dtype=np.uint8))

[ 0  2  5  7 10]
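To make the truncation explicit, here is a sketch that converts the float result with astype, which gives the same values as passing dtype=np.uint8 directly, and compares the storage size per element.

```python
import numpy as np

floats = np.linspace(0, 10, 5)        # 0, 2.5, 5, 7.5, 10 as float64
as_uint8 = floats.astype(np.uint8)    # fractional parts are dropped, not rounded

print(floats)
print(as_uint8)
print(floats.itemsize, as_uint8.itemsize)   # bytes per element
```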


If you don't specify the number of elements, it defaults to 50.

In [72]:
np.linspace(0,1)

array([0.        , 0.02040816, 0.04081633, 0.06122449, 0.08163265,
       0.10204082, 0.12244898, 0.14285714, 0.16326531, 0.18367347,
       0.20408163, 0.2244898 , 0.24489796, 0.26530612, 0.28571429,
       0.30612245, 0.32653061, 0.34693878, 0.36734694, 0.3877551 ,
       0.40816327, 0.42857143, 0.44897959, 0.46938776, 0.48979592,
       0.51020408, 0.53061224, 0.55102041, 0.57142857, 0.59183673,
       0.6122449 , 0.63265306, 0.65306122, 0.67346939, 0.69387755,
       0.71428571, 0.73469388, 0.75510204, 0.7755102 , 0.79591837,
       0.81632653, 0.83673469, 0.85714286, 0.87755102, 0.89795918,
       0.91836735, 0.93877551, 0.95918367, 0.97959184, 1.        ])

#### np.arange

Create a NumPy array with a range of numbers from '*start*' up to, but not including, '*end*', in increments of '*step*'. If omitted, start defaults to 0 and step to 1.

np.arange(start,end,step)

*Note: with np.linspace you specify the number of elements (which are equally spaced), while with np.arange you specify the step (the difference between consecutive elements).*

In [73]:
np.arange(0,10,5)

array([0, 5])

In [74]:
np.arange(1,10)

array([1, 2, 3, 4, 5, 6, 7, 8, 9])

In [75]:
np.arange(10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
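The relationship between the two functions can be sketched as follows: a step of 2.5 over the range 0 to 10 (end excluded) yields the same values as asking np.linspace for 4 elements with endpoint=False.

```python
import numpy as np

by_step = np.arange(0, 10, 2.5)                     # the end value 10 is excluded
by_count = np.linspace(0, 10, 4, endpoint=False)    # 4 values, end excluded

print(by_step)
print(by_count)
print(np.allclose(by_step, by_count))
```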

#### np.array

Create a NumPy array from a list of specified values

In [76]:
a = np.array([1,21,5,7,11,15])
a

array([ 1, 21,  5,  7, 11, 15])

#### Accessing array elements

Access a single element of the array by its index (indices start at 0)

In [77]:
a[2]

5

Access multiple elements of the array by passing a list of indices

In [78]:
a[[0,3]]

array([1, 7])

Assign a value to an element of the array

In [79]:
a[2]=99
a

array([ 1, 21, 99,  7, 11, 15])
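A few related cases, sketched on a fresh copy of the same array: negative indices count from the end, a list of indices can also be used for assignment, and assigning a float to an integer array drops the fractional part.

```python
import numpy as np

a = np.array([1, 21, 5, 7, 11, 15])

last = a[-1]      # negative indices count from the end
a[[0, 3]] = 0     # assign the same value to several positions at once
a[2] = 9.7        # the array holds integers, so the fraction is dropped

print(last)
print(a)
```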

### Two-dimensional arrays

A two-dimensional array is a matrix.

Pass a tuple, e.g. (5,2), to define the dimensions of the array: here 5 rows x 2 columns.

In [80]:
zeros = np.zeros((5, 2), dtype=np.float32)
zeros

array([[0., 0.],
       [0., 0.],
       [0., 0.],
       [0., 0.],
       [0., 0.]], dtype=float32)

#### Shape (dimensions) of an array

In [81]:
zeros.shape

(5, 2)

#### np.reshape

Change the dimensions of the array without changing its data (e.g. convert a 16-element vector to 4 x 4 or 2 x 8). The new shape must hold the same total number of elements.

In [82]:
numbers = np.ones(16)
print(numbers.shape)
numbers

(16,)


array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])

In [83]:
numbers = numbers.reshape(4,4)
print(numbers.shape)
numbers

(4, 4)


array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

In [84]:
numbers.reshape(2,8)

array([[1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1.]])
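Two details of reshape, shown as a small sketch: passing -1 for one dimension lets NumPy infer it from the total number of elements, and a shape that does not match the element count raises a ValueError.

```python
import numpy as np

numbers = np.arange(16)

# -1 asks NumPy to work out that dimension from the total number of elements
inferred = numbers.reshape(4, -1)

# A shape whose product is not 16 is rejected
try:
    numbers.reshape(3, 5)
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True

print(inferred.shape)
print(mismatch_rejected)
```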

#### Accessing elements of a multi-dimensional array

In [85]:
numbers = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
]

numbers = np.array(numbers)
numbers

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

Access elements of the array.

array[row,column]

In [86]:
numbers[1,0]

4

To get an entire row, provide only the row index

In [87]:
numbers[1]

array([4, 5, 6])

To get an entire column (that is, a specific column across all rows), use : in place of the row index and specify the column index

In [88]:
numbers[:,1]

array([2, 5, 8])
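The list-of-indices form shown earlier for one-dimensional arrays also works here. A sketch that picks several rows or several columns at once:

```python
import numpy as np

numbers = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

first_and_last_rows = numbers[[0, 2]]        # rows 0 and 2, all columns
first_and_last_cols = numbers[:, [0, 2]]     # all rows, columns 0 and 2

print(first_and_last_rows)
print(first_and_last_cols)
```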

You can also assign values to the elements of the array: to a specific element, an entire row, or an entire column.

In [89]:
numbers

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

Assign a value to a specific element

In [90]:
numbers[1,1] = 10
numbers

array([[ 1,  2,  3],
       [ 4, 10,  6],
       [ 7,  8,  9]])

Assign values to all elements of a row

In [91]:
numbers[0] = [2,1,0]
numbers

array([[ 2,  1,  0],
       [ 4, 10,  6],
       [ 7,  8,  9]])

Assign values to all elements of a column

In [92]:
numbers[:,1] = [2,3,4]
numbers

array([[2, 2, 0],
       [4, 3, 6],
       [7, 4, 9]])
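Instead of a list with one value per element, a single number can be assigned to a whole row or column; it is then copied into every selected position. A minimal sketch on a fresh array:

```python
import numpy as np

numbers = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

numbers[:, 0] = 0    # set the first column of every row to 0
numbers[2] = -1      # set every element of the last row to -1

print(numbers)
```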

### NumPy Operations

NumPy comes with a wide range of operations that work with the NumPy arrays. 

#### Element-wise operations

NumPy arrays support all the arithmetic operations: addition ("+"), subtraction ("-"), multiplication ("*"), division ("/") and others. When one operand is a single number, it is applied to every element of the array.

In [93]:
numbers = np.ones(9)
numbers = numbers.reshape(3,3)
numbers

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

In [94]:
numbers * 2

array([[2., 2., 2.],
       [2., 2., 2.],
       [2., 2., 2.]])

In [95]:
numbers + 3

array([[4., 4., 4.],
       [4., 4., 4.],
       [4., 4., 4.]])

In [96]:
(numbers + 3) / 2

array([[2., 2., 2.],
       [2., 2., 2.],
       [2., 2., 2.]])
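The same operators also work between two arrays of the same shape, pairing up elements by position. The sketch below illustrates this; the matrix product via the @ operator is not covered in these notes and is included only to show that "*" is not matrix multiplication.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[10, 20],
              [30, 40]])

added = a + b          # element-wise sum
multiplied = a * b     # element-wise product
matrix_product = a @ b # matrix multiplication, for contrast only

print(added)
print(multiplied)
print(matrix_product)
```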

#### Summarizing operations

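The summarizing operations min, max, mean, std and sum reduce an array to a single number, or to one value per column (axis=0) or per row (axis=1).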

In [97]:
numbers = np.array([1, 2, 3, 4, 5, 6])
numbers = numbers.reshape(3,2)
print(numbers)
print("min of all elements : ",numbers.min())
print("max of all elements : ",numbers.max())
print("max of all elements column-wise: ",numbers.max(axis=0))
print("max of all elements row-wise: ",numbers.max(axis=1))
print("mean of elements in a particular row (row 0) : ",numbers[0].mean())
print("mean of all elements : ",numbers.mean())
print("standard deviation of all elements : ",numbers.std())
print("sum of all elements : ",numbers.sum())

[[1 2]
 [3 4]
 [5 6]]
min of all elements :  1
max of all elements :  6
max of all elements column-wise:  [5 6]
max of all elements row-wise:  [2 4 6]
mean of elements in a particular row (row 0) :  1.5
mean of all elements :  3.5
standard deviation of all elements :  1.707825127659933
sum of all elements :  21
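The axis argument works the same way for the other summarizing operations. The sketch below applies it to sum and mean, and also shows that std computes the population standard deviation by default; passing ddof=1 gives the sample version instead.

```python
import numpy as np

numbers = np.array([1, 2, 3, 4, 5, 6]).reshape(3, 2)

column_sums = numbers.sum(axis=0)     # one value per column
row_sums = numbers.sum(axis=1)        # one value per row
column_means = numbers.mean(axis=0)

population_std = numbers.std()        # divides by N (the default)
sample_std = numbers.std(ddof=1)      # divides by N - 1

print(column_sums, row_sums, column_means)
print(population_std, sample_std)
```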


#### Other useful element-wise operations 

Other operations that we might need for machine learning applications are exponent, logarithm and square root, which are applied element-wise, as well as comparison and sorting. Note that np.log is the natural logarithm.

In [98]:
data = np.array([[0.55, 0.44],
                [0.29, 0.76],
                [0.62, 0.62]])
data

array([[0.55, 0.44],
       [0.29, 0.76],
       [0.62, 0.62]])

In [99]:
data_exp = np.exp(data) # exponent
data_log = np.log(data) # logarithm
data_sqrt = np.sqrt(data) # square root
print("exponent : \n",data_exp)
print("log : \n", data_log)
print("square root : \n", data_sqrt)

exponent : 
 [[1.73325302 1.55270722]
 [1.33642749 2.13827622]
 [1.85892804 1.85892804]]
log : 
 [[-0.597837   -0.82098055]
 [-1.23787436 -0.27443685]
 [-0.4780358  -0.4780358 ]]
square root : 
 [[0.74161985 0.66332496]
 [0.53851648 0.87177979]
 [0.78740079 0.78740079]]
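As a sanity check on these functions, here is a sketch confirming that np.log undoes np.exp (so it is the natural logarithm) and that squaring undoes np.sqrt, up to floating-point rounding.

```python
import numpy as np

data = np.array([[0.55, 0.44],
                 [0.29, 0.76],
                 [0.62, 0.62]])

log_of_exp = np.log(np.exp(data))     # natural log is the inverse of exp
sqrt_squared = np.sqrt(data) ** 2     # squaring is the inverse of sqrt

# np.allclose compares with a small tolerance instead of exact equality
print(np.allclose(log_of_exp, data))
print(np.allclose(sqrt_squared, data))
```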


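#### Comparisons

Comparing an array with a number is also done element-wise and returns a boolean array of the same shape.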

In [100]:
data > 0.5

array([[ True, False],
       [False,  True],
       [ True,  True]])
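A boolean array like this one is typically used as a mask. The sketch below counts how many elements satisfy the condition and selects only those elements; the variable names are illustrative.

```python
import numpy as np

data = np.array([[0.55, 0.44],
                 [0.29, 0.76],
                 [0.62, 0.62]])

mask = data > 0.5

count_above = mask.sum()    # True counts as 1, False as 0
values_above = data[mask]   # 1-D array of the elements where mask is True

print(count_above)
print(values_above)
```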

#### np.sort

np.sort returns a sorted copy of the array; the original array is left unchanged.

In [101]:
numbers = np.array([5, 3, 6, 0, 1, 2])
print(np.sort(numbers))
print(numbers)

[0 1 2 3 5 6]
[5 3 6 0 1 2]


To sort the elements of the array in place, without creating another array, call the sort method on the array itself.

In [102]:
numbers.sort()
print(numbers)

[0 1 2 3 5 6]


argsort returns the indices that would sort the array, rather than the sorted values themselves.

In [103]:
numbers = np.array([5, 6, 7, 0, 1, 2])
idx = numbers.argsort()
print(idx)

[3 4 5 0 1 2]
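A sketch of how these indices are typically used: indexing the array with them gives the sorted values, reversing them gives descending order, and the same indices can reorder a second, parallel array. The `names` array is a made-up example.

```python
import numpy as np

numbers = np.array([5, 6, 7, 0, 1, 2])
names = np.array(["a", "b", "c", "d", "e", "f"])   # hypothetical labels, one per number

idx = numbers.argsort()

ascending = numbers[idx]           # same result as np.sort(numbers)
descending = numbers[idx[::-1]]    # reverse the index order for largest first
names_by_number = names[idx]       # reorder the labels to match the sorted numbers

print(ascending)
print(descending)
print(names_by_number)
```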


#### To add
* randomly generated arrays
* np.concatenate
* np.hstack
* np.column_stack
* np.vstack
* transpose
* slicing and filtering