# ML-Fundamentals - Python Numpy Basics

## Table of Contents
* [Introduction](#Introduction)
* [Requirements](#Requirements) 
  * [Knowledge](#Knowledge) 
  * [Modules](#Python-Modules)
* [Exercises](#Exercises)
* [Summary and Outlook](#Summary-and-Outlook)
* [Literature](#Literature) 
* [Licenses](#Licenses)

## Introduction

In this exercise you will practice the numpy operations and features that come up in nearly every data science task when working with Python.

## Requirements
### Knowledge

You should have a basic knowledge of:
- numpy

Suitable sources for acquiring this knowledge are:
- [numpy quickstart](https://docs.scipy.org/doc/numpy-1.15.1/user/quickstart.html)
- [numpy slicing](https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html)

### Python Modules

By [deep.TEACHING](https://www.deep-teaching.org/) convention, all python modules needed to run the notebook are loaded centrally at the beginning. 


In [18]:
# External Modules
import numpy as np

## Exercises

**Task:**

Generate a numpy-1D array of length 10, all elements being 0, except the 5th element which is 1

In [19]:
### Your Solution
a =  np.array( [0,0,0,0,1,0,0,0,0,0] )
print(a)

[0 0 0 0 1 0 0 0 0 0]
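The same vector can also be built without typing out every element. A minimal sketch, assuming the "5th element" means index 4 (zero-based); the name `a_alt` is only illustrative:

```python
import numpy as np

a_alt = np.zeros(10, dtype=int)  # ten zeros
a_alt[4] = 1                     # the 5th element has index 4
print(a_alt)
```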


**Task:**

Reverse the order of vector `z`: The first element becomes the last, the second becomes the second last etc.

In [20]:
### Your Solution
z = np.arange(50)
z = np.flip(z)
print(z)

[49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26
 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2
  1  0]
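Reversal can also be written as a slice with a negative step. A short sketch with illustrative names (`z_demo`, `z_rev`); the slice is a view on the original data rather than a copy:

```python
import numpy as np

z_demo = np.arange(50)
z_rev = z_demo[::-1]  # step -1 walks the array from back to front

print(np.array_equal(z_rev, np.flip(z_demo)))  # both approaches agree
print(np.shares_memory(z_rev, z_demo))         # the slice is a view, not a copy
```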


**Task:**

Find the indices of all elements that are nonzero.

In [21]:
### Your Solution
z = np.array([1,2,0,0,4,0])
index = np.where(z!=0)
print(index)

(array([0, 1, 4], dtype=int64),)
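`np.where` with a single argument returns a tuple with one index array per dimension, which is why the output is wrapped in parentheses. For a 1-D input, a plain index array may be more convenient. A sketch with an illustrative array name:

```python
import numpy as np

z_demo = np.array([1, 2, 0, 0, 4, 0])

idx = np.flatnonzero(z_demo)  # 1-D array of indices of nonzero entries
print(idx)
print(np.nonzero(z_demo))     # tuple form, equivalent to np.where(z_demo != 0)
```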


**Task:**

Generate a 10x10 array with random values and find the smallest and largest value.

In [22]:
### Your Solution
random = np.random.rand(10,10)
minValue = np.amin(random)
maxValue = np.amax(random)
print(random)
print(minValue)
print(maxValue)

[[0.75804249 0.59857783 0.4393791  0.05616395 0.87864833 0.57515599
  0.91703802 0.11375905 0.71192068 0.23358937]
 [0.78052476 0.75477095 0.13069233 0.05353046 0.17878921 0.91565577
  0.40405964 0.56673008 0.58489715 0.95228043]
 [0.27675718 0.13439001 0.01257745 0.17306302 0.52499982 0.77326289
  0.17802649 0.05219761 0.25119305 0.0542259 ]
 [0.05726599 0.89197815 0.88624905 0.28345523 0.0922124  0.35442576
  0.00729181 0.98943543 0.53497151 0.89941367]
 [0.69340746 0.50403294 0.03327165 0.76181356 0.63904287 0.78872926
  0.4911802  0.59812475 0.62800798 0.31865878]
 [0.98615948 0.59625133 0.23636446 0.46104211 0.23952292 0.24468949
  0.33181406 0.34056705 0.47430393 0.63639779]
 [0.68024536 0.19480246 0.46993574 0.37915991 0.77804387 0.97110008
  0.66472901 0.64851617 0.7798892  0.62532298]
 [0.77101031 0.07405639 0.21555262 0.46689483 0.99626344 0.05528242
  0.33198462 0.49172746 0.60946793 0.10804002]
 [0.34675944 0.62929475 0.32628785 0.89355125 0.33688239 0.86978454
  0.29140575

**Task:**

Generate a vector of length 50 with random values and calculate the mean.

In [23]:
### Your Solution
vector = np.random.rand(50)
mean = np.mean(vector)
print(vector)
print(mean)

[0.03104714 0.9525371  0.08318769 0.65135068 0.80139371 0.30062688
 0.24728455 0.79601275 0.98127645 0.59179886 0.30042847 0.24258581
 0.41363076 0.63053878 0.34224012 0.18325035 0.70011372 0.01661546
 0.40564879 0.45943167 0.48193566 0.54313287 0.51829327 0.00498174
 0.02496405 0.3212136  0.62994728 0.4588785  0.36025419 0.36951233
 0.31213261 0.21615415 0.81347953 0.67143815 0.19748034 0.84398485
 0.07863972 0.9463197  0.18670271 0.99731305 0.59598863 0.27917913
 0.30010034 0.53072818 0.65369731 0.66579348 0.81041459 0.88539253
 0.19941396 0.07340293]
0.4620373820841897


**Task:**

Explain the following results:

print(0 * np.nan)        # nan: any arithmetic operation involving NaN yields NaN
print(np.nan == np.nan)  # False: IEEE 754 defines NaN as unequal to every value, including itself
print(np.inf > np.nan)   # False: every ordered comparison involving NaN is False
print(np.nan - np.nan)   # nan: NaN minus NaN is still NaN
print(0.3 == 3 * 0.1)    # False: 0.1 has no exact binary representation, so 3 * 0.1 is 0.30000000000000004; see https://www.sololearn.com/Discuss/1524665/why-is-0-1-3-not-equal-to-0-3-in-python
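These results follow from IEEE 754 floating-point rules. Because `==` is unreliable for NaN and for rounded decimals, dedicated helpers are usually used instead. A minimal sketch:

```python
import math
import numpy as np

print(np.isnan(0 * np.nan))        # test for NaN with isnan, never with ==
print(np.nan is np.nan)            # identity check: it is the same Python object
print(3 * 0.1)                     # shows the rounding error directly
print(np.isclose(0.3, 3 * 0.1))    # comparison within a tolerance
print(math.isclose(0.3, 3 * 0.1))  # standard-library equivalent for scalars
```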

**Task:**

Generate an 8x8 matrix and fill it with a chess pattern like:

`
array([[1., 0., 1., 0., 1., 0., 1., 0.],
       [0., 1., 0., 1., 0., 1., 0., 1.],
       [1., 0., 1., 0., 1., 0., 1., 0.],
       [0., 1., 0., 1., 0., 1., 0., 1.],
       [1., 0., 1., 0., 1., 0., 1., 0.],
       [0., 1., 0., 1., 0., 1., 0., 1.],
       [1., 0., 1., 0., 1., 0., 1., 0.],
       [0., 1., 0., 1., 0., 1., 0., 1.]])
`

In [24]:
### Your Solution
x = np.zeros((8, 8))
x[::2, ::2] = 1    # even rows, even columns
x[1::2, 1::2] = 1  # odd rows, odd columns
print(x)

[[1. 0. 1. 0. 1. 0. 1. 0.]
 [0. 1. 0. 1. 0. 1. 0. 1.]
 [1. 0. 1. 0. 1. 0. 1. 0.]
 [0. 1. 0. 1. 0. 1. 0. 1.]
 [1. 0. 1. 0. 1. 0. 1. 0.]
 [0. 1. 0. 1. 0. 1. 0. 1.]
 [1. 0. 1. 0. 1. 0. 1. 0.]
 [0. 1. 0. 1. 0. 1. 0. 1.]]
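Another way to obtain the 8x8 pattern from the task is to compute it from the row and column indices: a field is 1 exactly when the sum of its row and column index is even. A sketch with illustrative names:

```python
import numpy as np

rows, cols = np.indices((8, 8))                 # index grids of shape (8, 8)
board = ((rows + cols) % 2 == 0).astype(float)  # 1.0 where row + col is even
print(board)
```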


**Task:**

Generate a random 5x5 matrix and normalize (scale) it. That means, the smallest value should become 0.0, the largest 1.0

In [25]:
### Your Solution
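r = np.random.randint(10, size=10)  # note: a 1-D vector; the 5x5 matrix from the task would need size=(5, 5)
print(r)
scaled = np.interp(r, (r.min(), r.max()), (0, 1))
print(scaled)
# same result as the min-max formula (r - r.min()) / (r.max() - r.min())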

[5 4 3 5 8 4 3 2 7 8]
[0.5        0.33333333 0.16666667 0.5        1.         0.33333333
 0.16666667 0.         0.83333333 1.        ]
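For the 5x5 matrix the task asks for, the min-max formula can be applied directly to the whole array. A minimal sketch; the seed and the names `m` and `m_scaled` are assumptions made only for this illustration:

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the example is reproducible
m = rng.random((5, 5))          # uniform values in [0, 1)

m_scaled = (m - m.min()) / (m.max() - m.min())  # min-max scaling
print(m_scaled.min(), m_scaled.max())
```

The formula assumes that the matrix is not constant; otherwise the denominator would be zero.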


**Task:**

From each row, subtract the maximum value of that row.

In [26]:
### Your Solution
z = np.arange(12).reshape((3,4))
print(z)
res = z.max(axis=1)
print(res)
result = z-res[:, None]
print(result) #https://howtothink.readthedocs.io/en/latest/PvL_06.html



[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
[ 3  7 11]
[[-3 -2 -1  0]
 [-3 -2 -1  0]
 [-3 -2 -1  0]]
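The `[:, None]` index turns the row maxima from shape `(3,)` into shape `(3, 1)` so that they broadcast across the columns. The `keepdims` argument achieves the same without the extra indexing step. A sketch with illustrative names:

```python
import numpy as np

z_demo = np.arange(12).reshape(3, 4)

row_max = z_demo.max(axis=1, keepdims=True)  # shape (3, 1) instead of (3,)
print(row_max.shape)
print(z_demo - row_max)                      # (3, 4) - (3, 1) broadcasts per row
```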


**Task:**
    
Divide each column by the sum of that column. Then verify that each column adds up to 1.

In [27]:
### Your Solution
z = np.arange(12).reshape((3,4))
print(z)
columnSum = z.sum(axis=0)
print(columnSum)
result = z/columnSum
print(result) #https://howtothink.readthedocs.io/en/latest/PvL_06.html

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
[12 15 18 21]
[[0.         0.06666667 0.11111111 0.14285714]
 [0.33333333 0.33333333 0.33333333 0.33333333]
 [0.66666667 0.6        0.55555556 0.52380952]]
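The task also asks to verify that each column adds up to 1. A sketch of that check, using a tolerance-based comparison because floating-point sums may be off by rounding; the names are illustrative:

```python
import numpy as np

z_demo = np.arange(12).reshape(3, 4)
normalized = z_demo / z_demo.sum(axis=0)  # (3, 4) / (4,) divides each column by its sum

col_totals = normalized.sum(axis=0)
print(col_totals)
print(np.allclose(col_totals, 1.0))       # True if every column sums to 1 within tolerance
```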


**Task:**

Negate all elements between 3 and 8 in place.

In [28]:
### Your Solution
Z = np.arange(11)
m = (Z > 2) & (Z < 9) #mask
Z[m] *= -1
print(Z)


[ 0  1  2 -3 -4 -5 -6 -7 -8  9 10]


**Task:**

Explain the result (output) of the following code:

In [29]:
### Your Solution
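print(sum(range(5),-1))        # built-in sum: -1 is the start value, so 10 + (-1) = 9
from numpy import *            # the wildcard import shadows the built-in sum with numpy.sum
print(sum(range(5), axis=-1))  # numpy.sum: -1 selects the last axis, so the result is 10
del sum  # remove the imported name so that the built-in sum is visible again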

9
10
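The two results differ because the second argument means something different in each function. A small sketch that calls both functions by their qualified names, so no wildcard import is needed:

```python
import builtins
import numpy as np

# Built-in sum(iterable, start): the second positional argument is the start value.
py_result = builtins.sum(range(5), -1)  # (0 + 1 + 2 + 3 + 4) + (-1)

# numpy.sum(a, axis): the second positional argument is the axis to reduce over.
np_result = np.sum(range(5), -1)        # sum along the last (and only) axis

print(py_result, np_result)
```

This is one reason why `import numpy as np` is generally preferred over `from numpy import *`.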


**Task:**

Generate a random vector of length 100 and sort it.

In [30]:
### Your Solution
r = np.random.randint(1000, size=100)
print(r)
sortedArray = np.sort(r)
print(sortedArray)

[420 398 149 114 385 615 129 385 837 969 624 514 980 364 854 452  91 357
 435 697  13 455 662 934 124 478 366 491 669 247  64 481 372 110 292  68
 385  61 336 193 175 543 319 648 731 497 800 818 490 111 907  54 831 778
  44 294 681 866 628 171 641 463 986 493 376 909  60 353 659 858 599 595
 674 627 654 934 694 477 555 457 284 453 291 831 840 538 586 676  20 583
 707 924 930 523 616 483  16 173  53 697]
[ 13  16  20  44  53  54  60  61  64  68  91 110 111 114 124 129 149 171
 173 175 193 247 284 291 292 294 319 336 353 357 364 366 372 376 385 385
 385 398 420 435 452 453 455 457 463 477 478 481 483 490 491 493 497 514
 523 538 543 555 583 586 595 599 615 616 624 627 628 641 648 654 659 662
 669 674 676 681 694 697 697 707 731 778 800 818 831 831 837 840 854 858
 866 907 909 924 930 934 934 969 980 986]


**Task:**

Check if two arrays are equal:

1. All elements should be exactly the same
2. Equality within a tolerance

In [31]:
### Your Solution
A = np.random.random((3,4))
B = A.copy()
B[1,2] = A[1,2] * 1.00000000000001
print (A)
print(B)
np.allclose(A, B)

[[0.84084476 0.35082536 0.00671228 0.52767661]
 [0.30550531 0.69287849 0.47801212 0.40278275]
 [0.62794648 0.26649191 0.14366374 0.30060645]]
[[0.84084476 0.35082536 0.00671228 0.52767661]
 [0.30550531 0.69287849 0.47801212 0.40278275]
 [0.62794648 0.26649191 0.14366374 0.30060645]]


True
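`np.allclose` covers the second part of the task (equality within a tolerance). For the first part, an exact comparison is needed. A sketch showing both checks side by side, with fixed illustrative values instead of random ones:

```python
import numpy as np

A_demo = np.array([[0.25, 0.5], [0.75, 1.0]])
B_demo = A_demo.copy()
B_demo[1, 0] = A_demo[1, 0] * 1.00000000000001  # relative change of about 1e-14

exact = np.array_equal(A_demo, B_demo)  # 1. all elements exactly the same
close = np.allclose(A_demo, B_demo)     # 2. equal within rtol=1e-05, atol=1e-08 (defaults)
print(exact, close)
```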

**Task:**

Generate (with as little code as possible) the following matrix with `np.array` and save it as the variable `arr`.

\begin{bmatrix}
1 & 1 & 1 &1  &1 \\ 
1 & 2 & 1 & 1 & 1\\ 
1 & 1 & 3 & 1 & 1\\ 
1 &1  & 1 & 4 & 1
\end{bmatrix}

And calculate:
- the sum of each column
- the sum of each row

In [32]:
### Your Solution
arr = np.ones((4,5), dtype=int)
arr[1,1]=2
arr[2,2]=3
arr[3,3]=4
print(arr)
# alternative: a diagonal helper such as np.fill_diagonal could set these values in one call

[[1 1 1 1 1]
 [1 2 1 1 1]
 [1 1 3 1 1]
 [1 1 1 4 1]]
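The task also asks for the sums along both directions. A minimal sketch that rebuilds the matrix under an illustrative name (`arr_demo`) using `np.fill_diagonal`, which is one possible shortcut and not necessarily the intended solution, and then computes the sums with the `axis` argument:

```python
import numpy as np

arr_demo = np.ones((4, 5), dtype=int)
np.fill_diagonal(arr_demo, [1, 2, 3, 4])  # writes the main diagonal in place
print(arr_demo)

row_sums = arr_demo.sum(axis=1)  # one value per row (reduces over the columns)
col_sums = arr_demo.sum(axis=0)  # one value per column (reduces over the rows)
print(row_sums)
print(col_sums)
```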


**Task:**

Generate a 2x2 matrix from `arr`. It shall consist of the 4 values found in the 2nd and 4th columns of the even-numbered (2nd and 4th) rows of `arr`.

Use different methods:
(see http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html)
- integer array indexes
- slices
- boolean arrays


In [39]:
### Your Solution

##integer array
indexed = arr[[1, 1, 3, 3], [1, 3, 1, 3]]
indexed = indexed.reshape(2, 2)
print(indexed)


##slices
sliced = arr[1::2, 1::2]  # every second row and column, starting at index 1
print(sliced)

##boolean arrays

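rows = (arr.sum(-1) % 2) == 0  # True for rows whose sum is even; for this arr that is rows 1 and 3
columns = [1,3]
boolean = arr[np.ix_(rows, columns)]  # np.ix_ pairs every selected row with every selected column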
print(boolean)





[[2 1]
 [1 4]]
[[2 1]
 [1 4]]
[[2 1]
 [1 4]]
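The boolean variant can also be written with explicit masks for both axes, which does not rely on a property of the data such as the row sums. A sketch with illustrative names; the matrix is rebuilt so that the block is self-contained:

```python
import numpy as np

arr_demo = np.ones((4, 5), dtype=int)
arr_demo[[1, 2, 3], [1, 2, 3]] = [2, 3, 4]

row_mask = np.array([False, True, False, True])         # 2nd and 4th row
col_mask = np.array([False, True, False, True, False])  # 2nd and 4th column

sub = arr_demo[row_mask][:, col_mask]  # filter rows first, then columns
print(sub)
```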


**Task:**

(see http://docs.scipy.org/doc/numpy-1.10.1/user/basics.broadcasting.html)

Explain the following operations on `arr`

In [17]:
### Your Solution
print(arr)
print('--------1-------')
print(arr * 5.)
print('--------2-------')
print(np.arange(arr.shape[1]))
print(arr * np.arange(arr.shape[1]))
print('--------3------')
print(np.arange(arr.shape[0]))
print(arr.T * np.arange(arr.shape[0]))
print('--------4-------')
print(arr * np.arange(arr.shape[0]))

[[1 1 1 1 1]
 [1 2 1 1 1]
 [1 1 3 1 1]
 [1 1 1 4 1]]
--------1-------
[[ 5.  5.  5.  5.  5.]
 [ 5. 10.  5.  5.  5.]
 [ 5.  5. 15.  5.  5.]
 [ 5.  5.  5. 20.  5.]]
--------2-------
[0 1 2 3 4]
[[ 0  1  2  3  4]
 [ 0  2  2  3  4]
 [ 0  1  6  3  4]
 [ 0  1  2 12  4]]
--------3------
[0 1 2 3]
[[ 0  1  2  3]
 [ 0  2  2  3]
 [ 0  1  6  3]
 [ 0  1  2 12]
 [ 0  1  2  3]]
--------4-------


ValueError: operands could not be broadcast together with shapes (4,5) (4,) 
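Broadcasting compares the shapes from the last dimension backwards; two dimensions are compatible when they are equal or when one of them is 1. In case 1 the scalar is applied to every element, and the result becomes float. In case 2 the shapes `(4, 5)` and `(5,)` match in the last dimension, so column `j` is multiplied by `j`. In case 3 the transposed array has shape `(5, 4)`, which matches `(4,)`. In case 4 the last dimensions are 5 and 4, so the operation fails. A sketch of the failure and one possible fix, using an illustrative array of ones with the same shape as `arr`:

```python
import numpy as np

arr_demo = np.ones((4, 5), dtype=int)
w = np.arange(arr_demo.shape[0])  # shape (4,)

# (4, 5) * (4,) fails because the trailing dimensions 5 and 4 differ.
try:
    arr_demo * w
except ValueError as err:
    print("ValueError:", err)

# A trailing axis gives shape (4, 1), which broadcasts along the columns:
# row i is multiplied by i.
scaled_rows = arr_demo * w[:, np.newaxis]
print(scaled_rows)
```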

**Task:**

Calculate the matrix-vector product (dot product) of `arr` and $v$:

with:

$
v = (1,2,3,4,5)^T
$

In [20]:
### Your Solution
u = np.arange(1, 6)  # v = (1, 2, 3, 4, 5)
dotProduct = np.dot(arr, u)  # transposing a 1-D array is a no-op, so .T is not needed
print(dotProduct)

[15 17 21 27]
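The same product can be written with the `@` operator, which is the matrix multiplication operator in Python 3.5 and later. A sketch that rebuilds the matrix under an illustrative name so that the block runs on its own:

```python
import numpy as np

arr_demo = np.ones((4, 5), dtype=int)
arr_demo[[1, 2, 3], [1, 2, 3]] = [2, 3, 4]
v = np.arange(1, 6)      # [1 2 3 4 5]

product = arr_demo @ v   # (4, 5) @ (5,) gives shape (4,)
print(product)
```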


## Summary and Outlook

In this notebook you've picked up many of the essential numpy operations for maths and data science tasks.

The more you use the library, the more of its functionality you'll discover and its usage will grow more intuitive and familiar.

## Licenses

### Notebook License (CC-BY-SA 4.0)

*The following license applies to the complete notebook, including code cells. It does however not apply to any referenced external media (e.g., images).*

Exercise: Python Numpy Basics <br/>
by Klaus Strohmenger <br/>
is licensed under a [Creative Commons Attribution-ShareAlike 4.0 International License](http://creativecommons.org/licenses/by-sa/4.0/).<br/>
Based on a work at https://gitlab.com/deep.TEACHING.


### Code License (MIT)

*The following license only applies to code cells of the notebook.*

Copyright 2019 Klaus Strohmenger

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.