# ML-Fundamentals - Python Numpy Basics

## Table of Contents
* [Introduction](#Introduction)
* [Requirements](#Requirements) 
  * [Knowledge](#Knowledge) 
  * [Modules](#Python-Modules)
* [Exercises](#Exercises)
* [Summary and Outlook](#Summary-and-Outlook)
* [Literature](#Literature) 
* [Licenses](#Licenses)

## Introduction

In this exercise you will practice the NumPy operations and features that you will need in nearly all data science tasks when working with Python.

## Requirements
### Knowledge

You should have a basic knowledge of:
- numpy

Suitable sources for acquiring this knowledge are:
- [numpy quickstart](https://docs.scipy.org/doc/numpy-1.15.1/user/quickstart.html)
- [numpy slicing](https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html)

### Python Modules

By [deep.TEACHING](https://www.deep-teaching.org/) convention, all python modules needed to run the notebook are loaded centrally at the beginning. 


In [1]:
# External Modules
import numpy as np

## Exercises

**Task:**

Generate a 1-D NumPy array of length 10 in which all elements are 0 except the 5th, which is 1.

In [2]:
### Your Solution
a = np.zeros(10)
a[4] = 1
print(a)


[0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]


**Task:**

Reverse the order of vector `z`: the first element becomes the last, the second becomes the second to last, and so on.

In [3]:
### Your Solution
z = np.arange(50)
z[::-1]

# or

np.flip(z)

array([49, 48, 47, 46, 45, 44, 43, 42, 41, 40, 39, 38, 37, 36, 35, 34, 33,
       32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16,
       15, 14, 13, 12, 11, 10,  9,  8,  7,  6,  5,  4,  3,  2,  1,  0])
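
Both expressions produce the same values. It may help to know that slicing with a negative step returns a view on the original data rather than a copy. A minimal sketch of the difference (the names `reversed_view` and `reversed_copy` are illustrative and not part of the exercise):

```python
import numpy as np

z = np.arange(5)
reversed_view = z[::-1]          # view: shares memory with z
reversed_copy = z[::-1].copy()   # independent copy of the reversed data

z[0] = 99                        # modify the original array

print(reversed_view)             # the view reflects the change in its last element
print(reversed_copy)             # the copy still ends in 0
print(np.shares_memory(z, reversed_view))  # True
```

If the reversed array must stay independent of `z`, calling `.copy()` as above is one way to get that.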

**Task:**

Find the indices of all elements that are nonzero.

In [4]:
### Your Solution
z = np.array([1,2,0,0,4,0])
np.where(z != 0)

(array([0, 1, 4]),)
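
The result is a tuple because `np.where` with a single argument returns one index array per axis. A short sketch of two related calls, `np.nonzero` and `np.flatnonzero`, assuming the same example vector as above:

```python
import numpy as np

z = np.array([1, 2, 0, 0, 4, 0])

idx_tuple = np.nonzero(z)      # same result as np.where(z != 0): a tuple with one array per axis
idx_flat = np.flatnonzero(z)   # plain 1-D index array, convenient for vectors

print(idx_tuple)
print(idx_flat)                # [0 1 4]
print(z[idx_flat])             # the nonzero values themselves: [1 2 4]
```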

**Task:**

Generate a 10x10 array with random values and find the smallest and largest value.

In [5]:
### Your Solution
a = np.random.randint(1000, size=(10, 10))
print(a, '\n Minimum: ', np.min(a),'Maximum: ', np.max(a))


[[597 476 741 534 382 387 457 853 311 705]
 [ 46 892  70 804 405 738  31 939 143 231]
 [534   3 193 190  40 650 476 297 561 417]
 [167 981 825 220 114 179 748 735 672 214]
 [221 490 950 951 481 111 319 363 786 600]
 [607 379 894  59 748 659 546 103 836 600]
 [531  44 648 778 128  88 676 189 498 654]
 [518 923  73 649 231 580  16 298 693 708]
 [329 748 319 174 222 264 976 419 743 961]
 [238 138 175 476  12  88 207 447 960 406]] 
 Minimum:  3 Maximum:  981


**Task:**

Generate a vector of length 50 with random values and calculate the mean.

In [6]:
### Your Solution
v = np.random.randint(1000, size=(50))
mean = np.mean(v)
print(v, '\n Mean: ', mean)


[582 784 206 592 949 870 530 333 591 808 775 110 325 446 267 170 175 738
 546 739 648 208 315 835 779 822  49 388 485 503 454 533 159 240 846 997
 856 951 414 583 202 858 149 727  11 624 830 353 661 708] 
 Mean:  534.48


**Task:**

Explain the following results:

In [7]:
print(0 * np.nan) # arithmetic with NaN propagates NaN, so the result is NaN
print(np.nan == np.nan) # by IEEE 754, NaN compares unequal to everything, including itself
print(np.inf > np.nan) # every ordered comparison involving NaN is False, even against infinity
print(np.nan - np.nan) # again, arithmetic involving NaN returns NaN
print(0.3 == 3 * 0.1) # 0.1 has no exact binary representation; 3 * 0.1 evaluates to 0.30000000000000004

nan
False
False
nan
False
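
Because `==` is unreliable for both NaN and rounded floating-point values, dedicated helpers are usually a better fit. The following sketch uses a small made-up array `x` and shows one possible way to handle both cases:

```python
import numpy as np

x = np.array([1.0, np.nan, 3.0])

print(np.isnan(x))               # [False  True False]: detect NaN explicitly instead of using ==
print(np.nanmean(x))             # 2.0: mean that skips NaN entries
print(3 * 0.1)                   # 0.30000000000000004
print(np.isclose(0.3, 3 * 0.1))  # True: compare floats within a tolerance
```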


**Task:**

Generate an 8x8 matrix and fill it with a chess pattern like:

```
array([[1., 0., 1., 0., 1., 0., 1., 0.],
       [0., 1., 0., 1., 0., 1., 0., 1.],
       [1., 0., 1., 0., 1., 0., 1., 0.],
       [0., 1., 0., 1., 0., 1., 0., 1.],
       [1., 0., 1., 0., 1., 0., 1., 0.],
       [0., 1., 0., 1., 0., 1., 0., 1.],
       [1., 0., 1., 0., 1., 0., 1., 0.],
       [0., 1., 0., 1., 0., 1., 0., 1.]])
```

In [8]:
### Your Solution
# Variant 1: start from ones and zero out the cells where row + column is odd
chess = np.ones((8, 8))
for (i, j), val in np.ndenumerate(chess):
    if (i + j) % 2 != 0:
        chess[i, j] = 0
print(chess)


# or
# Variant 2: start from zeros and set the pattern with strided slices
chessArray = np.zeros([8,8],dtype = int)
chessArray[::2,::2] = 1 # even rows (0, 2, ...): every second column starting at column 0
chessArray[1::2,1::2] = 1 # odd rows (1, 3, ...): every second column starting at column 1
print(chessArray)

[[1. 0. 1. 0. 1. 0. 1. 0.]
 [0. 1. 0. 1. 0. 1. 0. 1.]
 [1. 0. 1. 0. 1. 0. 1. 0.]
 [0. 1. 0. 1. 0. 1. 0. 1.]
 [1. 0. 1. 0. 1. 0. 1. 0.]
 [0. 1. 0. 1. 0. 1. 0. 1.]
 [1. 0. 1. 0. 1. 0. 1. 0.]
 [0. 1. 0. 1. 0. 1. 0. 1.]]
[[1 0 1 0 1 0 1 0]
 [0 1 0 1 0 1 0 1]
 [1 0 1 0 1 0 1 0]
 [0 1 0 1 0 1 0 1]
 [1 0 1 0 1 0 1 0]
 [0 1 0 1 0 1 0 1]
 [1 0 1 0 1 0 1 0]
 [0 1 0 1 0 1 0 1]]
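
The pattern can also be built without loops or slice assignment. The sketch below shows two further options, index arithmetic with `np.indices` and repetition of a 2x2 block with `np.tile`; the variable names are illustrative:

```python
import numpy as np

rows, cols = np.indices((8, 8))       # row index and column index of every cell
board = (rows + cols + 1) % 2         # 1 where row + column is even, 0 otherwise
print(board)

block = np.array([[1, 0],
                  [0, 1]])
tiled = np.tile(block, (4, 4))        # repeat the 2x2 block 4 times along each axis
print(np.array_equal(board, tiled))   # True
```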


**Task:**

Generate a random 5x5 matrix and normalize (scale) it. That is, the smallest value should become 0.0 and the largest 1.0.

In [9]:
### Your Solution
a = np.random.randint(100, size=(5, 5))
normalized = (a - np.min(a)) / (np.max(a) - np.min(a))  # (a - a_min)/(a_max - a_min)
print(normalized)

[[0.13684211 0.98947368 0.38947368 0.97894737 0.53684211]
 [0.91578947 0.70526316 0.53684211 0.71578947 0.09473684]
 [0.51578947 0.50526316 1.         0.46315789 0.93684211]
 [0.13684211 0.35789474 0.28421053 0.         0.64210526]
 [0.30526316 0.13684211 0.12631579 0.35789474 0.64210526]]


**Task:**

From each row, subtract the maximum value of that row.

In [10]:
### Your Solution
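z = np.arange(12).reshape((3,4))
print(z)
# Subtract each row's maximum via broadcasting; keepdims=True keeps the shape (3, 1).
# An element-wise loop that overwrites z in place is fragile: the row maximum
# changes as soon as the largest element of a row has been overwritten.
z = z - z.max(axis=1, keepdims=True)
print(z)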

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
[[-3 -2 -1  0]
 [-3 -2 -1  0]
 [-3 -2 -1  0]]
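
Row-wise operations like this one can be expressed with broadcasting. The sketch below uses a small hypothetical matrix whose row maxima are not in the last column, and it also illustrates why recomputing the maximum while overwriting entries in place is fragile:

```python
import numpy as np

z = np.array([[3, 1, 2],
              [9, 7, 8]])

row_max = z.max(axis=1, keepdims=True)   # shape (2, 1): one maximum per row
result = z - row_max                     # (2, 3) - (2, 1) broadcasts across the columns
print(result)

# Overwriting in place while recomputing the maximum gives a different answer,
# because the row maximum changes once the largest entry has been overwritten.
w = z.copy()
for (i, j), val in np.ndenumerate(w):
    w[i, j] = val - w[i].max()
print(w)
```

With this input the broadcasting version yields `-2` and `-1` in the last two columns, while the in-place loop does not, since the original maximum is gone after the first column has been processed.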


**Task:**
    
Divide each column by the sum of that column. Then verify that each column adds up to 1.

In [11]:
### Your Solution
z = np.arange(12).reshape((3,4)).transpose()
print(z)
x = np.copy(z).astype(float)
for (i,j),val in np.ndenumerate(z):
    x[i,j] = val / z[i].sum()
print("Transposed matrix:")
print(x)
z = np.copy(x.transpose())
print('Final result: ')
print(z)
print("Sum of each column: ")
print(np.sum(z, axis=0)) # 0 = columns | 1 = rows




[[ 0  4  8]
 [ 1  5  9]
 [ 2  6 10]
 [ 3  7 11]]
Transposed matrix:
[[0.         0.33333333 0.66666667]
 [0.06666667 0.33333333 0.6       ]
 [0.11111111 0.33333333 0.55555556]
 [0.14285714 0.33333333 0.52380952]]
Final result: 
[[0.         0.06666667 0.11111111 0.14285714]
 [0.33333333 0.33333333 0.33333333 0.33333333]
 [0.66666667 0.6        0.55555556 0.52380952]]
Sum of each column: 
[1. 1. 1. 1.]
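
The same result can be obtained without transposing or looping. A minimal sketch using broadcasting, assuming the same 3x4 input as above and that no column sums to zero:

```python
import numpy as np

z = np.arange(12).reshape((3, 4))

col_sums = z.sum(axis=0)         # shape (4,): one sum per column
normalized = z / col_sums        # (3, 4) / (4,) is applied to every row

print(normalized)
print(normalized.sum(axis=0))    # each column now adds up to 1
```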


**Task:**

Negate all elements between 3 and 8 in place.

In [12]:
### Your Solution
Z = np.arange(11)
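# np.negative() returns a new array unless out= is given; *= on a slice modifies Z in place
Z[3:8] *= -1  # positions 3..7, which hold the values 3..7 because Z = np.arange(11)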
print(Z)

[ 0  1  2 -3 -4 -5 -6 -7  8  9 10]
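
Slicing selects by position, which matches the values here only because the array is `np.arange(11)`. For arbitrary data, a boolean mask selects by value. The sketch below assumes the interval includes 3 and excludes 8, matching the output above; the array `Z` is made up for illustration:

```python
import numpy as np

Z = np.array([5, 10, 3, 7, 0, 8])
mask = (Z >= 3) & (Z < 8)     # assumption: 3 is included, 8 is excluded
Z[mask] *= -1                 # in-place update of the selected elements only
print(Z)

Y = np.arange(11)
# A ufunc can also write into its input: out=Y stores the result in Y itself,
# and where= limits the operation to the selected elements.
np.negative(Y, out=Y, where=(Y >= 3) & (Y < 8))
print(Y)
```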


**Task:**

Explain the result (output) of the following code:

In [13]:
### Your Solution
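print(sum(range(5),-1)) # built-in sum: the second argument is the start value, so -1 + 0+1+2+3+4 = 9
from numpy import * # shadows the built-in sum (and other built-ins such as min and max) with the numpy versions
# numpy.sum: axis=-1 refers to the last axis, which is the only axis of a 1-D input
print(sum(range(5), axis=-1)) # 0+1+2+3+4 = 10
del sum # removes numpy's sum from the namespace, so the built-in sum is visible again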

9
10
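
The two results differ because two different functions named `sum` are involved. A sketch that calls each one through its own namespace, which avoids the star import (the variable names are illustrative):

```python
import builtins
import numpy as np

total_builtin = builtins.sum(range(5), -1)   # second argument is the start value
total_numpy = np.sum(range(5), axis=-1)      # axis=-1 sums over the last axis

print(total_builtin)   # 9
print(total_numpy)     # 10

m = np.arange(6).reshape((2, 3))
print(np.sum(m, axis=-1))   # last axis of a 2-D array: one sum per row, [ 3 12]
```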


**Task:**

Generate a random vector of length 100 and sort it.

In [14]:
### Your Solution
vector100 = np.random.randint(10,size=100)
np.sort(vector100)

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
       4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6,
       6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8,
       8, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9])
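
`np.sort` returns a sorted copy and leaves its input unchanged. The sketch below contrasts it with in-place sorting and with `np.argsort`; the seeded generator is an assumption added only to make the run repeatable:

```python
import numpy as np

rng = np.random.default_rng(0)            # seeded generator (assumption, for repeatability)
v = rng.integers(0, 10, size=100)

sorted_copy = np.sort(v)                  # sorted copy, v itself is unchanged
order = np.argsort(v)                     # indices that would sort v
print(np.array_equal(v[order], sorted_copy))   # True

v.sort()                                  # sorts v in place and returns None
print(np.array_equal(v, sorted_copy))     # True
```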

**Task:**

Check if two arrays are equal:

1. All elements should be exactly the same
2. Equality within a tolerance

In [15]:
### Your Solution
A = np.random.random((3,4))
B = A.copy()
B[1,2] = A[1,2] * 1.00000000000001
print(A)

np.array_equal(A,B) # exact comparison: False (not displayed, only the last expression of a cell is echoed)
np.allclose(A,B,rtol=0.01) # relative tolerance of 0.01: True

[[0.34702813 0.25909624 0.17445103 0.421659  ]
 [0.48484589 0.34318098 0.73683324 0.73404156]
 [0.45006762 0.74833719 0.83428002 0.92361665]]


True
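
`np.allclose` takes a relative tolerance `rtol` and an absolute tolerance `atol`, and two elements count as close when `|a - b| <= atol + rtol * |b|`. A small sketch with made-up values that shows how the tolerance changes the outcome:

```python
import numpy as np

a = np.array([1.0, 100.0])
b = np.array([1.0, 100.5])

exact = np.array_equal(a, b)                    # exact comparison
default_close = np.allclose(a, b)               # defaults: rtol=1e-05, atol=1e-08
loose_close = np.allclose(a, b, rtol=0.01)      # 0.5 <= 1e-08 + 0.01 * 100.5
elementwise = np.abs(a - b) <= 1e-08 + 0.01 * np.abs(b)  # the test applied per element

print(exact, default_close, loose_close)        # False False True
print(elementwise)                              # [ True  True]
```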

**Task:**

Using as little code as possible, generate the following matrix with `np.array` and save it as the variable `arr`.

$$
\begin{bmatrix}
1 & 1 & 1 & 1 & 1 \\
1 & 2 & 1 & 1 & 1 \\
1 & 1 & 3 & 1 & 1 \\
1 & 1 & 1 & 4 & 1
\end{bmatrix}
$$

And calculate:
- the sum of each column
- the sum of each row

In [16]:
### Your Solution
arr = np.ones((4,5))
dia = np.arange(4)
arr[dia,dia] += dia  # add 0, 1, 2, 3 to the diagonal entries
print(arr)
print("Sum of each column:")
print(np.sum(arr, axis=0))
print("Sum of each row:")
print(np.sum(arr, axis=1))



[[1. 1. 1. 1. 1.]
 [1. 2. 1. 1. 1.]
 [1. 1. 3. 1. 1.]
 [1. 1. 1. 4. 1.]]
Sum of each column:
[4. 5. 6. 7. 4.]
Sum of each row:
[5. 6. 7. 8.]
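
As an alternative construction, the diagonal can be built from `np.eye`, which also supports non-square shapes. This is only one possible sketch, not the intended reference solution:

```python
import numpy as np

# np.eye(4, 5) has ones on the main diagonal of a 4x5 matrix. Scaling row i by i
# and adding the result to a matrix of ones puts 1, 2, 3, 4 on the diagonal.
arr = np.ones((4, 5)) + np.eye(4, 5) * np.arange(4)[:, np.newaxis]

print(arr)
print(arr.sum(axis=0))   # axis=0 collapses the rows: one sum per column
print(arr.sum(axis=1))   # axis=1 collapses the columns: one sum per row
```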


**Task:**

Generate a 2x2 matrix from `arr` that consists of the 4 values at the intersection of the 2nd and 4th columns with the even rows (the 2nd and 4th rows, counting from 1).

Use different methods:
(see http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html)
- integer array indexes
- slices
- boolean arrays


In [17]:
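### Your Solution
print(arr)
# slices: every second row and column, starting with the second one in each case
m1 = arr[1::2,1::2]
print(m1)
# integer array indexes: one (row, column) index pair per selected value
m2 = arr[[1, 1, 3, 3], [1, 3, 1, 3]].reshape(2,2)
print(m2)
# integer array indexes via np.ix_, which combines every listed row with every listed column
rows = np.array([1, 3], dtype=np.intp)
columns = np.array([1, 3], dtype=np.intp)
print(arr[np.ix_(rows, columns)])
# boolean arrays: masks that are True for the 2nd and 4th row / column
row_mask = np.arange(arr.shape[0]) % 2 == 1
col_mask = np.arange(arr.shape[1]) % 2 == 1
m3 = arr[row_mask][:, col_mask]
print(m3)



[[1. 1. 1. 1. 1.]
 [1. 2. 1. 1. 1.]
 [1. 1. 3. 1. 1.]
 [1. 1. 1. 4. 1.]]
[[2. 1.]
 [1. 4.]]
[[2. 1.]
 [1. 4.]]
[[2. 1.]
 [1. 4.]]
[[2. 1.]
 [1. 4.]]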


**Task:**

(see http://docs.scipy.org/doc/numpy-1.10.1/user/basics.broadcasting.html)

Explain the following operations on `arr`

In [22]:
### Your Solution
print(arr)
print('--------1-------')
# 1. element-wise multiplication: the scalar 5 is broadcast to every element
print(arr * 5.)
print('--------2-------')
# 2. element-wise multiplication, not a matrix product: the vector [0, 1, 2, 3, 4]
#    has shape (5,) and is broadcast across the 4 rows, so column j is scaled by j
print(np.arange(arr.shape[1]))
print(arr * np.arange(arr.shape[1]))
print('--------3------')
# 3. arr.T has shape (5, 4), so the vector [0, 1, 2, 3] of shape (4,) is broadcast
#    across its 5 rows; again element-wise, not a matrix product
print(arr.T * np.arange(arr.shape[0]))
print('--------4-------')
# 4. arr * np.arange(arr.shape[0]) raises a ValueError: shapes (4, 5) and (4,) cannot
#    be broadcast because the trailing dimensions (5 and 4) differ
# original: print(arr * np.arange(arr.shape[0]))
# instead this would work:
print(arr * np.arange(arr.shape[1]))

[[1. 1. 1. 1. 1.]
 [1. 2. 1. 1. 1.]
 [1. 1. 3. 1. 1.]
 [1. 1. 1. 4. 1.]]
--------1-------
[[ 5.  5.  5.  5.  5.]
 [ 5. 10.  5.  5.  5.]
 [ 5.  5. 15.  5.  5.]
 [ 5.  5.  5. 20.  5.]]
--------2-------
[0 1 2 3 4]
[[ 0.  1.  2.  3.  4.]
 [ 0.  2.  2.  3.  4.]
 [ 0.  1.  6.  3.  4.]
 [ 0.  1.  2. 12.  4.]]
--------3------
[[ 0.  1.  2.  3.]
 [ 0.  2.  2.  3.]
 [ 0.  1.  6.  3.]
 [ 0.  1.  2. 12.]
 [ 0.  1.  2.  3.]]
--------4-------
[[ 0.  1.  2.  3.  4.]
 [ 0.  2.  2.  3.  4.]
 [ 0.  1.  6.  3.  4.]
 [ 0.  1.  2. 12.  4.]]
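
Broadcasting compares shapes from the last axis backwards, and two dimensions are compatible when they are equal or one of them is 1. The sketch below, using a 4x5 matrix of ones as a stand-in for `arr`, shows the failing case and one way to scale the rows instead of the columns:

```python
import numpy as np

arr = np.ones((4, 5))                   # stand-in with the same shape as arr
row_vec = np.arange(arr.shape[1])       # shape (5,)
col_vec = np.arange(arr.shape[0])       # shape (4,)

print((arr * row_vec).shape)            # (4, 5): (5,) is matched against the last axis

try:
    arr * col_vec                       # (4, 5) vs (4,): last axes 5 and 4 do not match
    broadcast_failed = False
except ValueError as err:
    broadcast_failed = True
    print("broadcast error:", err)

scaled_rows = arr * col_vec[:, np.newaxis]   # (4, 1) is stretched across the 5 columns
print(scaled_rows)
```

Adding a trailing axis with `np.newaxis` turns the vector into a column, so row `i` is multiplied by `i`.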


**Task:**

Calculate the matrix-vector product (dot product) of `arr` and $v$:

with:

$
v = (1,2,3,4,5)^T
$

In [None]:
### Your Solution
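print(arr)
v = np.array([1, 2, 3, 4, 5])  # a 1-D array has no row/column orientation, so .transpose() would be a no-op
np.dot(arr, v)  # matrix-vector product, equivalent to arr @ v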
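
Since the cell above has not been executed, the following self-contained sketch rebuilds the matrix and shows the product. It also contrasts the matrix-vector product with element-wise multiplication followed by a row sum, which gives the same numbers:

```python
import numpy as np

arr = np.ones((4, 5))
arr[np.arange(4), np.arange(4)] = [1, 2, 3, 4]   # same matrix as in the exercises
v = np.array([1, 2, 3, 4, 5])

product = arr @ v                     # matrix-vector product, same as np.dot(arr, v)
print(product)                        # [15. 17. 21. 27.]

row_sums = (arr * v).sum(axis=1)      # multiply element-wise, then sum each row
print(np.array_equal(product, row_sums))   # True
```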

## Summary and Outlook

In this notebook you've picked up many of the essential numpy operations for maths and data science tasks.

The more you use the library, the more of its functionality you'll discover and its usage will grow more intuitive and familiar.

## Licenses

### Notebook License (CC-BY-SA 4.0)

*The following license applies to the complete notebook, including code cells. It does however not apply to any referenced external media (e.g., images).*

Exercise: Python Numpy Basics <br/>
by Klaus Strohmenger <br/>
is licensed under a [Creative Commons Attribution-ShareAlike 4.0 International License](http://creativecommons.org/licenses/by-sa/4.0/).<br/>
Based on a work at https://gitlab.com/deep.TEACHING.


### Code License (MIT)

*The following license only applies to code cells of the notebook.*

Copyright 2019 Klaus Strohmenger

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.