# Python Programming Crash Course -
# Numbers and the Matrix

<br>
<div>
<img src="data/Python-logo-notext.svg" width="200"/>
</div>


## Introduction

As we have seen, we can do math with basic Python just fine. However, it's not very efficient when you have large amounts of data.\
Or, if you want to do linear algebra, you will have a hard time doing that with basic Python.

For such specialized tasks, you usually employ libraries or <font color="green">**packages**</font> that offer more functionality than the <font color="green">**built-in**</font> functions.

The most basic package for working with numerical data in all parts of science and engineering is <font color="green">**NumPy**</font>.
With the NumPy package, you get a data structure that is optimized for exactly this kind of work: <font color="green">**arrays**</font>.

The benefits:

 - efficient mathematical operations on arrays/matrices
 - lots of mathematical functions
 - linear algebra operations
 - memory efficiency
 - random number generators
 - interoperability, since it is the basis for many other libraries (e.g. pandas)
 - integration with C++


What is an <font color="green">**array**</font>?

An array is a central data structure of the <font color="green">**NumPy**</font> library. It's like a <font color="green">**vector**</font> or <font color="green">**matrix**</font> from linear algebra.
You can think of a <font color="green">**vector**</font> as a list of numbers, but more efficient. A multi-dimensional <font color="green">**array**</font> is a grid of values and it contains information about the raw data, how to locate an element, and how to interpret an element. Each element in an <font color="green">**array**</font> is of the same type!
That means an <font color="green">**array**</font> has a data type which is referred to as <font color="green">**dtype**</font>.

You will most often encounter <font color="green">**arrays**</font> as the columns of a <font color="green">**table**</font>.
When working with tabular data, we will use another package called <font color="green">**pandas**</font>. It uses <font color="green">**NumPy**</font> under the hood, and you will see
that NumPy also plays a role when creating figures and plots because it has some useful functionality for this.

If you use <font color="green">**packages**</font>, you first have to import them using the keyword <font color="#3da831">**import**</font>:

In [1]:
# import
import numpy as np

With the keyword <font color="#3da831">**as**</font> you create an <font color="green">**alias**</font> for the name numpy ... that's because we don't want to type so much! You can thereafter refer to it as np.

## 1. Arrays

<font color="green">**Arrays**</font> are data structures that can be used to represent mathematical <font color="green">**vectors**</font> or <font color="green">**matrices**</font>. 

One-dimensional <font color="green">**arrays**</font> are similar to <font color="green">**lists**</font>, i.e. they can be used in a similar way. In fact, you can initialize an array with a <font color="green">**list**</font>.

But <font color="green">**arrays**</font> can be multi-dimensional, hence we refer to them also as <font color="green">**ndarray**</font>, meaning n-dimensional array.

A quick reminder:
 - vector = 1D array
 - matrix = 2D array

An array object can represent all of those.

A 1D array is a <font color="green">**vector**</font>:
<br>
<div>
<img src="data/vector.svg" width="200"/>
</div>


A 2D array is a <font color="green">**matrix**</font>:
<br>
<div>
<img src="data/matrix.svg" width="200"/>
</div>

You can also have higher-dimensional <font color="green">**arrays**</font>:

<br>
<div>
<img src="data/tensor.svg" width="600"/>
</div>



Creating an <font color="green">**array**</font> works with the function **array()**. Since this comes from the <font color="green">**NumPy**</font> package which we imported, we would call it using the dot notation:

```Python
numpy.array()
```

And since we have used an <font color="green">**alias**</font>, we can just type:

```Python
np.array()
```

This will work for all the <font color="green">**NumPy**</font> functions.

In [2]:
# 1d array
my_array = np.array([1, 2, 3, 4, 5, 6, 7])
my_array

array([1, 2, 3, 4, 5, 6, 7])

In [3]:
# 2d arrays
my_2d_array = np.array(
    [
        [1, 2, 3, 4], 
        [5, 6, 7, 8], 
        [9, 10, 11, 12]
    ]
)
my_2d_array

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

We can access elements by <font color="green">**indexing**</font> and <font color="green">**slicing**</font>, same as with <font color="green">**lists**</font>:

<br>
<div>
<img src="data/slicing_numpy.svg" width="900"/>
</div>


In [4]:
# index
print(my_array)
my_array[0]

[1 2 3 4 5 6 7]


1

In [5]:
# slicing
my_array[1:3]

array([2, 3])

In [6]:
# indexing in 2d
my_2d_array[0] # select a row

array([1, 2, 3, 4])

In [7]:
# select an element
my_2d_array[1][1]

6
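
As a small addition to the chained indexing above, NumPy also accepts the row and the column index inside a single pair of brackets. The sketch below rebuilds the same 3 by 4 example matrix under the name `grid` (a name chosen only for this illustration) and also shows how `:` selects everything along one axis.

```python
import numpy as np

# the same 3 by 4 matrix as above, rebuilt so this block runs on its own
grid = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8],
                 [9, 10, 11, 12]])

# row index and column index in one pair of brackets
element = grid[1, 1]
print(element)  # 6, the same element as grid[1][1]

# ":" means "everything along this axis", so this selects the second column
column = grid[:, 1]
print(column)  # [ 2  6 10]

# slices can be combined: first two rows, last two columns
block = grid[:2, 2:]
print(block)
```

The single-bracket form is the one you will usually see in NumPy code, because it also works for selecting columns, which chained indexing cannot do.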

What attributes does an <font color="green">**array**</font> have?

<font color="green">**Arrays**</font> are fixed-size containers in which all elements have the same type. An <font color="green">**array**</font> has a <font color="green">**shape**</font>, i.e. the number of elements along each dimension.\
In a NumPy <font color="green">**array**</font>, the dimensions are also referred to as axes (singular: <font color="green">**axis**</font>).

For a 2D array, axis = 0 runs down the rows and axis = 1 runs across the columns!


<br>
<div>
<img src="data/axis_and_shape.png" width="600"/>
</div>


In [8]:
# shape 
print(my_array)
print(my_array.shape)

[1 2 3 4 5 6 7]
(7,)


In [9]:
# this array is a 2d array and therefore has 2 axes
print(my_2d_array)
my_2d_array.shape

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]


(3, 4)
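
To make the axis convention more concrete, here is a minimal sketch that sums the same 3 by 4 matrix along each axis. The name `grid` is only used for this illustration, and the `sum()` method is used here simply because its result is easy to check by hand.

```python
import numpy as np

grid = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8],
                 [9, 10, 11, 12]])

print(grid.shape)  # (3, 4): 3 entries along axis 0, 4 entries along axis 1

# axis=0 collapses the rows: one result per column
column_sums = grid.sum(axis=0)
print(column_sums)  # [15 18 21 24]

# axis=1 collapses the columns: one result per row
row_sums = grid.sum(axis=1)
print(row_sums)  # [10 26 42]
```

A helpful way to remember it: the axis you pass is the one that disappears from the shape of the result.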

The attribute `size` tells you the number of elements:

In [10]:
# size
print(my_array)
my_array.size

[1 2 3 4 5 6 7]


7

In [11]:
my_2d_array.size

12

If you need the number of dimensions, you need `ndim`:

In [12]:
# ndim
my_2d_array.ndim

2

Finally, the type of the elements and the whole <font color="green">**array**</font> can be accessed by `dtype`:

In [13]:
# what is the dtype?
print(my_array)
my_array.dtype

[1 2 3 4 5 6 7]


dtype('int64')
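
Because all elements share one dtype, NumPy has to pick a common type when the input values are mixed. The following sketch (variable names are illustrative) shows what happens when an integer list contains a single float, and how `astype()` converts an array to another dtype.

```python
import numpy as np

whole = np.array([1, 2, 3])
mixed = np.array([1, 2.5, 3])

print(whole.dtype)  # an integer type, e.g. int64 (the exact size depends on the platform)
print(mixed.dtype)  # float64: one float is enough to turn every element into a float
print(mixed)        # [1.  2.5 3. ]

# astype() returns a converted copy; converting to int drops the fractional part
truncated = mixed.astype(int)
print(truncated)  # [1 2 3]
```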

## 2. Creating arrays

We have already created 1D and 2D <font color="green">**arrays**</font> from lists using **`np.array()`**:

In [15]:
a = np.array([1, 2, 3])
print(a)

[1 2 3]


But there are other ways:

In [16]:
# an array of zeros
np.zeros(5)

array([0., 0., 0., 0., 0.])

In [17]:
# an array of ones
np.ones(5)

array([1., 1., 1., 1., 1.])

In [18]:
# an "empty" array - its memory is not initialized, so it contains arbitrary leftover values
np.empty(4)

array([4.68990233e-310, 0.00000000e+000, 4.68881048e-310, 4.68881054e-310])

In [22]:
# create an array from a range of values
np.arange(11)

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [23]:
# create an array that is linearly spaced in an interval
np.linspace(1, 10, 5)

array([ 1.  ,  3.25,  5.5 ,  7.75, 10.  ])

- **`np.zeros()`** creates an array of zeros
- **`np.ones()`** creates an array of ones
- **`np.empty()`** creates an uninitialized array; its contents are arbitrary until you fill it
- **`np.arange()`** creates an array from a range of numbers; you specify it just like the **`range()`** function
- **`np.linspace()`** creates an array of linearly spaced numbers in a given interval

You need these functions often when you create figures, e.g. to initialize the x-axis values.
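
As an illustration of how these functions are typically used to prepare x-axis values, here is a minimal sketch. The variable names and the function y = x² are chosen only for this example.

```python
import numpy as np

# start, stop (exclusive) and step, just like range()
evens = np.arange(2, 11, 2)
print(evens)  # [ 2  4  6  8 10]

# five evenly spaced x values from 0 to 1, both end points included
x = np.linspace(0, 1, 5)
print(x)  # [0.   0.25 0.5  0.75 1.  ]

# matching y values, e.g. for plotting y = x squared
y = x ** 2
print(y)
```

Use `np.arange()` when you know the step size and `np.linspace()` when you know how many points you want.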

You can even specify the dtype directly:

In [24]:
# default is float
a = np.ones(5)
print(a)
print(a.dtype)

[1. 1. 1. 1. 1.]
float64


In [29]:
# int
a = np.ones(5, dtype=int)
print(a)
print(a.dtype)

[1 1 1 1 1]
int64


In [28]:
# bool
np.zeros(5, dtype=bool)

array([False, False, False, False, False])

## 3. Sorting elements

<font color="green">**Arrays**</font> can be sorted using **`np.sort()`**. The <font color="green">**function**</font> returns a copy of the <font color="green">**array**</font>, sorted in ascending order.

In [30]:
sortme = np.array([4, 3, 6, 1, 2, 5])
print(sortme)
np.sort(sortme)

[4 3 6 1 2 5]


array([1, 2, 3, 4, 5, 6])

In [31]:
# is sortme now a sorted array?
sortme

array([4, 3, 6, 1, 2, 5])
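
If you do want to change the array itself, arrays also have a `sort()` method that sorts in place. The sketch below contrasts the two on a fresh array called `values` (an illustrative name).

```python
import numpy as np

values = np.array([4, 3, 6, 1, 2, 5])

# the function returns a new, sorted array and leaves the original alone
sorted_copy = np.sort(values)
print(values)       # [4 3 6 1 2 5]
print(sorted_copy)  # [1 2 3 4 5 6]

# the method sorts the array itself and returns None
result = values.sort()
print(result)  # None
print(values)  # [1 2 3 4 5 6]
```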

What about sorting 2D arrays?

In [32]:
# create a 2d array
sortme2d = np.array([[4, 1, 3, 6], [4, 3, 5, 6],  [2, 7, 3, 5]])
print(sortme2d)

[[4 1 3 6]
 [4 3 5 6]
 [2 7 3 5]]


In [33]:
# sort it!
np.sort(sortme2d)

array([[1, 3, 4, 6],
       [3, 4, 5, 6],
       [2, 3, 5, 7]])

What happens if we call **`np.sort()`**?
<details>
    <summary><font color="orange"><b>Click me!</b></font></summary>
    The matrix is sorted along the last axis by default, which means "by row" in this case!
    You can tell sort which axis to sort along, for example if you want to sort "by column".
</details>


In [34]:
# sort it top  to bottom
np.sort(sortme2d, axis=0)

array([[2, 1, 3, 5],
       [4, 3, 3, 6],
       [4, 7, 5, 6]])

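Sometimes it is useful not to sort the <font color="green">**array**</font> itself but to find out which indices would sort it, i.e. for each position in the sorted <font color="green">**array**</font>, the index of that element in the original <font color="green">**array**</font>.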

This is done using the function **`np.argsort()`**.

In [35]:
print(sortme)
print(np.sort(sortme))
np.argsort(sortme)

[4 3 6 1 2 5]
[1 2 3 4 5 6]


array([3, 4, 1, 0, 5, 2])
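
A typical use of these indices is to reorder a second array in the same way. The sketch below assumes two hypothetical arrays, `scores` and `labels`, that belong together element by element. Passing an array of indices in square brackets picks the elements in that order.

```python
import numpy as np

scores = np.array([4, 3, 6, 1, 2, 5])
labels = np.array(["d", "c", "f", "a", "b", "e"])

# indices of the elements of scores, from smallest to largest
order = np.argsort(scores)
print(order)  # [3 4 1 0 5 2]

# indexing with these indices gives the sorted array ...
print(scores[order])  # [1 2 3 4 5 6]

# ... and reorders any companion array consistently
print(labels[order])  # ['a' 'b' 'c' 'd' 'e' 'f']
```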

## 4. Adding and removing elements

Since <font color="green">**arrays**</font> are of fixed size, you cannot simply add and remove elements, like we did with lists.\
But we can <font color="green">**concatenate**</font> arrays and thereby create a new combined <font color="green">**array**</font>!

This is done with **`np.concatenate()`**.

In [36]:
# concatenate
print(my_array)
another_array = np.arange(22, 26)
print(another_array)
np.concatenate([my_array, another_array])

[1 2 3 4 5 6 7]
[22 23 24 25]


array([ 1,  2,  3,  4,  5,  6,  7, 22, 23, 24, 25])
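
For single elements, NumPy also provides `np.append()` and `np.delete()`. As a minimal sketch (with an illustrative array `values`), note that both follow the same principle as concatenation: they return a new array and leave the original untouched.

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5, 6, 7])

# append values at the end; the result is a new array
longer = np.append(values, [8, 9])
print(longer)  # [1 2 3 4 5 6 7 8 9]

# remove the elements at index 0 and index 2; again a new array
shorter = np.delete(values, [0, 2])
print(shorter)  # [2 4 5 6 7]

# the original array still has its fixed size
print(values)  # [1 2 3 4 5 6 7]
```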

In [37]:
# concatenate 2d
print(my_2d_array)
another_array2 = np.zeros((2, 4))
print(another_array2)
np.concatenate((my_2d_array, another_array2))

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
[[0. 0. 0. 0.]
 [0. 0. 0. 0.]]


array([[ 1.,  2.,  3.,  4.],
       [ 5.,  6.,  7.,  8.],
       [ 9., 10., 11., 12.],
       [ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.]])

What happened?

<details>
    <summary><font color="orange"><b>Click me!</b></font></summary>
    Concatenating integer arrays with float arrays works, but it changes the dtype to float for the new array.
</details>


In [38]:
# concatenate 2d
print(my_2d_array)
another_array2 = np.zeros((3, 4))
print(another_array2)
np.concatenate((my_2d_array, another_array2), axis=1)

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]


array([[ 1.,  2.,  3.,  4.,  0.,  0.,  0.,  0.],
       [ 5.,  6.,  7.,  8.,  0.,  0.,  0.,  0.],
       [ 9., 10., 11., 12.,  0.,  0.,  0.,  0.]])

You can also stack them together vertically or horizontally using **`np.vstack()`** or **`np.hstack()`**.

In [39]:
vector1 = np.array([1, 2, 3, 4])
vector2 = np.array([5, 6, 7, 8])
matrix1 = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
matrix2 = np.array([[9, 10, 11, 12], [13, 14, 15, 16]])

print("vector1 =", vector1)
print("\nvector2 =", vector2)
print("\nMatrix1 =\n", matrix1)
print("\nMatrix2 =\n", matrix2)

vector1 = [1 2 3 4]

vector2 = [5 6 7 8]

Matrix1 =
 [[1 2 3 4]
 [5 6 7 8]]

Matrix2 =
 [[ 9 10 11 12]
 [13 14 15 16]]


In [40]:
# stack two "rows" onto each other:
np.vstack((vector1, vector2))

array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

In [41]:
# stack two "rows" next to each other
np.hstack((vector1, vector2))

array([1, 2, 3, 4, 5, 6, 7, 8])

In [42]:
# stack two matrices onto each other
np.vstack((matrix1, matrix2))

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12],
       [13, 14, 15, 16]])

In [43]:
# stack two matrices next to each other
np.hstack((matrix1, matrix2))

array([[ 1,  2,  3,  4,  9, 10, 11, 12],
       [ 5,  6,  7,  8, 13, 14, 15, 16]])

## 5. Reshaping

What if we have values in an <font color="green">**array**</font>, but they are not in the format we need? Can we change the shape of an array? Sure!\
This is done with the **`np.reshape()`** function or the equivalent array method **`reshape()`**, which we use here. The values remain the same but the structure changes.

This is useful if you want to create a 2D <font color="green">**array**</font> from a 1D <font color="green">**array**</font>:

In [44]:
a_vector = np.arange(1, 9)
print("a_vector =", a_vector)
new_a = a_vector.reshape(4, 2)
print("\nThe new vector new_a =\n", new_a)

a_vector = [1 2 3 4 5 6 7 8]

The new vector new_a =
 [[1 2]
 [3 4]
 [5 6]
 [7 8]]


What happened?
<details>
    <summary><font color="orange"><b>Click me!</b></font></summary>
    The resulting reshaped array is a 2D array, a 4 by 2 matrix. You can also create a differently shaped array,
    but you must make sure that the number of elements fits exactly into the new shape!
</details>


In [45]:
# a 2 by 4 matrix
a_vector.reshape(2, 4)

array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

In [46]:
# a 2 by 2 by 2 (cubic) shape
a_vector.reshape(2, 2, 2)

array([[[1, 2],
        [3, 4]],

       [[5, 6],
        [7, 8]]])

In [47]:
# reshape new_a as a row vector
new_a.reshape(1, 8)

array([[1, 2, 3, 4, 5, 6, 7, 8]])

In [48]:
# reshape new_a as a column vector
new_a.reshape(8, 1)

array([[1],
       [2],
       [3],
       [4],
       [5],
       [6],
       [7],
       [8]])
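
Two details about reshaping are worth a short sketch: you can pass `-1` for one dimension and let NumPy work out its size, and a shape that does not match the number of elements raises an error. The array `values` below is just an illustrative stand-in for `a_vector`.

```python
import numpy as np

values = np.arange(1, 9)  # 8 elements

# -1 means "work this dimension out for me": 8 elements / 2 columns = 4 rows
inferred = values.reshape(-1, 2)
print(inferred.shape)  # (4, 2)

# 8 elements cannot fill a 3 by 3 grid, so NumPy raises a ValueError
try:
    values.reshape(3, 3)
except ValueError as error:
    print("ValueError:", error)
```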

Sometimes, we need to convert a 1D <font color="green">**array**</font> into a 2D <font color="green">**array**</font>. That means we need to add a new dimension.

This can be done using **`np.expand_dims()`**. The parameter `axis` tells it where the new <font color="green">**axis**</font> should be inserted. Alternatively, you can index with `np.newaxis`.

In [49]:
# start with a single (row) array
print("The original a_vector = ", a_vector)
print("The shape of a_vector is", a_vector.shape)
print("If we concatenate a_vector with itself, we get:\n", np.concatenate((a_vector, a_vector)))


The original a_vector =  [1 2 3 4 5 6 7 8]
The shape of a_vector is (8,)
If we concatenate a_vector with itself, we get:
 [1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8]


In [51]:
# create a "row vector"
row_vector_a = np.expand_dims(a_vector, axis=0)
print("A 1 by 8 matrix (2D array):\n", row_vector_a)
print("The shape is", row_vector_a.shape)

# or use np.newaxis
row_vector_a = a_vector[np.newaxis,:]
print("\nA 1 by 8 matrix (2D array):\n", row_vector_a)
print("The shape is", row_vector_a.shape)

# or use None
row_vector_a = a_vector[None,:]
print("\nA 1 by 8 matrix (2D array):\n", row_vector_a)
print("The shape is", row_vector_a.shape)

row_vector_b = np.array([[9, 10, 11, 12, 13, 14, 15, 16]])

# concatenate them
print(
    "\nIf we concatenate this matrix along axis 0 (rows) with row_vector_b, we get:\n", 
    np.concatenate((row_vector_a, row_vector_b))
)
print("\nOr, if we concatenate along axis 1 (columns), we get:\n", 
      np.concatenate((row_vector_a, row_vector_b), axis=1))

A 1 by 8 matrix (2D array):
 [[1 2 3 4 5 6 7 8]]
The shape is (1, 8)

A 1 by 8 matrix (2D array):
 [[1 2 3 4 5 6 7 8]]
The shape is (1, 8)

A 1 by 8 matrix (2D array):
 [[1 2 3 4 5 6 7 8]]
The shape is (1, 8)

If we concatenate this matrix along axis 0 (rows) with row_vector_b, we get:
 [[ 1  2  3  4  5  6  7  8]
 [ 9 10 11 12 13 14 15 16]]

Or, if we concatenate along axis 1 (columns), we get:
 [[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16]]


What happened?
<details>
    <summary><font color="orange"><b>Click me!</b></font></summary>
    Using `np.newaxis` or `None` in the slicing notation just adds a new `axis` at this position. The result is the same.
    All three methods add another dimension to the array!
</details>
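
The equivalence of the three approaches can be checked directly. The following sketch (variable names are illustrative) also shows why `None` works: `np.newaxis` is simply another name for `None`.

```python
import numpy as np

values = np.arange(1, 9)

via_expand = np.expand_dims(values, axis=0)
via_newaxis = values[np.newaxis, :]
via_none = values[None, :]

# np.newaxis is just an alias for None
print(np.newaxis is None)  # True

# all three results have the same shape and the same content
print(via_none.shape)  # (1, 8)
print(np.array_equal(via_expand, via_newaxis))  # True
print(np.array_equal(via_newaxis, via_none))    # True
```

Writing `np.newaxis` is a little longer but tells the reader what you intend, which is why it is often preferred over a bare `None`.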

We can also create a column vector by specifying the <font color="green">**axis**</font> to add:

In [53]:
# create a column vector
column_vector_a = np.expand_dims(a_vector, axis=1)
print("A 8 by 1 matrix (2D array):\n", column_vector_a)
print("The shape is", column_vector_a.shape)

column_vector_a = a_vector[:,None]
print("\nA 8 by 1 matrix (2D array):\n", column_vector_a)
print("The shape is", column_vector_a.shape)

# concatenate them
column_vector_b = np.array([
    [9], 
    [10], 
    [11], 
    [12], 
    [13], 
    [14], 
    [15], 
    [16]
    ]
)

print(
    "\nIf we concatenate these matrices along the row axis (0), we get:\n", 
    np.concatenate((column_vector_a, column_vector_b), axis=0)
)
print("\nOr if we concatenate them along the column axis (1) and we get:\n", 
      np.concatenate((column_vector_a, column_vector_b), axis=1)
     )

A 8 by 1 matrix (2D array):
 [[1]
 [2]
 [3]
 [4]
 [5]
 [6]
 [7]
 [8]]
The shape is (8, 1)

A 8 by 1 matrix (2D array):
 [[1]
 [2]
 [3]
 [4]
 [5]
 [6]
 [7]
 [8]]
The shape is (8, 1)

If we concatenate these matrices along the row axis (0), we get:
 [[ 1]
 [ 2]
 [ 3]
 [ 4]
 [ 5]
 [ 6]
 [ 7]
 [ 8]
 [ 9]
 [10]
 [11]
 [12]
 [13]
 [14]
 [15]
 [16]]

Or if we concatenate them along the column axis (1) and we get:
 [[ 1  9]
 [ 2 10]
 [ 3 11]
 [ 4 12]
 [ 5 13]
 [ 6 14]
 [ 7 15]
 [ 8 16]]


In [54]:
# you can always add another axis
print(row_vector_a)
print(row_vector_a.shape)
a_in_3d = row_vector_a[None,:]
print("This is a cube!")
print(a_in_3d)
print(a_in_3d.shape)

[[1 2 3 4 5 6 7 8]]
(1, 8)
This is a cube!
[[[1 2 3 4 5 6 7 8]]]
(1, 1, 8)


## 6. Selecting, <font color="green">**boolean expressions**</font> and <font color="green">**masking**</font>

<font color="green">**NumPy arrays**</font> can also be useful for <font color="green">**filtering**</font> tables. We will see that later, when we get to the pandas package.
You can use <font color="green">**boolean expressions**</font> to select which elements to access and retrieve, which is called <font color="green">**masking**</font>.

In <font color="green">**NumPy**</font>, <font color="green">**masking**</font> refers to the process of using <font color="green">**boolean arrays (masks)**</font> to filter or manipulate values within an <font color="green">**array**</font> based on certain conditions. <font color="green">**Masks**</font> are <font color="green">**arrays**</font> of <font color="green">**boolean values**</font>, where each element indicates whether the corresponding element in the original <font color="green">**array**</font> satisfies a condition.

You can create a <font color="green">**mask**</font> by applying a condition to an array. For example, you might want to create a <font color="green">**mask**</font> that identifies all elements greater than a certain threshold.

In [55]:
my_array

array([1, 2, 3, 4, 5, 6, 7])

In [56]:
# create a mask
mask_even = my_array%2==0  # select only even numbers
print(mask_even)

[False  True False  True False  True False]


In [57]:
# create another mask
mask_larger_3 = my_array > 3  # select only numbers larger than 3
print(mask_larger_3)

[False False False  True  True  True  True]


If you create a <font color="green">**mask**</font> using <font color="green">**boolean expressions**</font>, you can use it to <font color="green">**filter**</font> your <font color="green">**array**</font>.

In [58]:
# select only even numbers
filtered_array = my_array[mask_even]
print(filtered_array)

[2 4 6]


In [59]:
# select only numbers larger than 3
filtered_array = my_array[mask_larger_3]
print(filtered_array)

[4 5 6 7]


Of course, you can also plug in the <font color="green">**boolean expression**</font> directly:

In [60]:
filtered_array = my_array[my_array%2==0]
print(filtered_array)

[2 4 6]


In [65]:
print(mask_even)
print(mask_larger_3)
mask_even & mask_larger_3

[False  True False  True False  True False]
[False False False  True  True  True  True]


array([False, False, False,  True, False,  True, False])

If you have different filter criteria, you can join them logically or negate them. 
However, instead of using the <font color="green">**boolean keywords**</font> <font color="#3da831">**and**</font>, <font color="#3da831">**or**</font> and <font color="#3da831">**not**</font>, you must use <font color="green">**bitwise operators**</font> here.

|operator| meaning |
|-|-|
|<font color="#a71ed9">**&**</font> | bitwise and|
|<font color="#a71ed9">**\|**</font>| bitwise or|
|<font color="#a71ed9">**~**</font> | bitwise not|

The reason for this is that we are evaluating our expressions <font color="green">**element-wise**</font> on each element of an <font color="green">**array**</font>!

The <font color="green">**boolean operators**</font> <font color="#3da831">**and**</font>, <font color="#3da831">**or**</font> and <font color="#3da831">**not**</font> work on boolean expressions that evaluate to a single <font color="#3da831">**True**</font> or <font color="#3da831">**False**</font>. This is equivalent to comparing just a single element of an array.

However, in our case, our conditions evaluate to an <font color="#3da831">**array**</font> of <font color="#3da831">**True**</font> and <font color="#3da831">**False**</font>, i.e. they can be either <font color="#3da831">**True**</font> or <font color="#3da831">**False**</font> depending on the element we are looking at!
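
To see the difference in practice, here is a minimal sketch on an illustrative array `values`. Using `and` on two masks raises a `ValueError`, because Python cannot reduce a whole array to a single True or False, while `&` combines the masks element by element.

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5, 6, 7])

# "and" needs one single truth value per side, which an array cannot provide
try:
    (values % 2 == 0) and (values > 3)
except ValueError as error:
    print("ValueError:", error)

# "&" works element-wise; the parentheses are required because
# & binds more tightly than the comparison operators
combined = (values % 2 == 0) & (values > 3)
print(combined)          # [False False False  True False  True False]
print(values[combined])  # [4 6]
```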

In [66]:
# combine filter criteria
filtered_array = my_array[mask_even & mask_larger_3]
print(filtered_array)

[4 6]


In [67]:
# combine filter criteria
filtered_array = my_array[mask_even | mask_larger_3]
print(filtered_array)

[2 4 5 6 7]


In [68]:
# select all odd ones
filtered_array = my_array[ ~mask_even ]
print(filtered_array)

[1 3 5 7]


In [69]:
# select odd ones greater than 4
filtered_array = my_array[~(my_array%2==0) & (my_array>4)]
print(filtered_array)

[5 7]
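
Masks are not only useful for reading values; they can also be used to change or count them. The sketch below is one possible illustration, with a made-up task of capping all values above 5. It works on a copy so that the original array stays intact.

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5, 6, 7])

# assign to all elements selected by the mask
capped = values.copy()
capped[capped > 5] = 5
print(capped)  # [1 2 3 4 5 5 5]

# True counts as 1, so summing a mask counts the matching elements
n_large = (values > 5).sum()
print(n_large)  # 2
```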


## 7. Math with NumPy

Let's do some calculations. NumPy was designed to be efficient and to support linear algebra.\
Accordingly, the well-known Python operators are defined on arrays with vector calculations in mind.

You can add/subtract/divide/multiply complete arrays element-wise, e.g. for vector addition.\
Or you can add/subtract/divide/multiply (by) a single value (<font color="green">**scalar**</font>) <font color="green">**element-wise**</font>.
Both are done using our standard <font color="green">**operators**</font> <font color="#a71ed9">**+**</font>, <font color="#a71ed9">**-**</font>, <font color="#a71ed9">**\***</font>, <font color="#a71ed9">**/**</font>, <font color="#a71ed9">**%**</font>, <font color="#a71ed9">**//**</font> and <font color="#a71ed9">**\*\***</font>.


In [70]:
a_array = np.array([1, 2, 3, 4, 5])
b_array = np.arange(6, 11)
print(a_array)
print(b_array)

[1 2 3 4 5]
[ 6  7  8  9 10]


In [71]:
# add a scalar
a_array + 4

array([5, 6, 7, 8, 9])

In [72]:
# vector addition
a_array + b_array

array([ 7,  9, 11, 13, 15])

In [75]:
# subtraction
a_array - b_array

array([-5, -5, -5, -5, -5])

In [74]:
# scalar division
a_array / 2

array([0.5, 1. , 1.5, 2. , 2.5])

In [73]:
# element-wise vector division
a_array / b_array

array([0.16666667, 0.28571429, 0.375     , 0.44444444, 0.5       ])
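
Multiplication deserves a separate look, because `*` is also element-wise and therefore not the dot product or matrix product from linear algebra. For those, Python provides the `@` operator. The following sketch uses the short names `a`, `b` and `matrix`, which are chosen only for this illustration.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = np.arange(6, 11)

# element-wise multiplication and power
print(a * b)   # [ 6 14 24 36 50]
print(a ** 2)  # [ 1  4  9 16 25]

# dot product of two vectors: 1*6 + 2*7 + 3*8 + 4*9 + 5*10
print(a @ b)  # 130

matrix = np.array([[1, 2],
                   [3, 4]])

elementwise = matrix * matrix  # every entry squared
product = matrix @ matrix      # matrix product from linear algebra
print(elementwise)
print(product)
```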

You get the idea: <font color="green">**operators**</font> work <font color="green">**element-wise**</font> if you use them with <font color="green">**arrays**</font> of matching shapes, and if you use a single number instead, it's a <font color="green">**scalar**</font> operation!

However, you must make sure that the <font color="green">**array**</font> shapes match or that they can be <font color="green">**broadcast**</font>!

In [76]:
a_array + np.array([1, 2, 3])

ValueError: operands could not be broadcast together with shapes (5,) (3,) 

## 8. <font color="green">**Broadcasting**</font>

Okay, so what does <font color="green">**broadcasting**</font> actually mean?

<font color="green">**Broadcasting**</font> in <font color="green">**NumPy**</font> is a powerful mechanism that allows arrays of different shapes to be combined in arithmetic operations. It enables efficient computation without the need for explicit copying of data or manual alignment of <font color="green">**array**</font> shapes. Broadcasting follows a set of rules to determine how the shapes of the input <font color="green">**arrays**</font> should be aligned during operations.

The key concept behind <font color="green">**broadcasting**</font> is extending smaller <font color="green">**arrays**</font> to match the shape of larger <font color="green">**arrays**</font> so that they can be operated on <font color="green">**element-wise**</font>. This is achieved by automatically replicating the smaller array's elements along the dimensions where they differ from the larger array's shape.

The basic rules are:

1. If the <font color="green">**arrays**</font> have different numbers of dimensions, the <font color="green">**shape**</font> of the one with fewer dimensions is _padded_ with ones on its left side.
2. If the <font color="green">**shape**</font> of any dimension in the input <font color="green">**arrays**</font> does not match, the array with shape equal to 1 in that dimension is stretched to match the other shape.
3. If in any dimension the sizes disagree and neither is equal to 1, an <font color="green">**error**</font> is <font color="green">**raised**</font>.

In [78]:
x = np.array([[1, 2, 3],
              [4, 5, 6]])

y = np.array([10, 20, 30])

# Performing element-wise addition
result = x + y

print(result)


[[11 22 33]
 [14 25 36]]


What happens?

<details>
    <summary><font color="orange"><b>Click me!</b></font></summary>
    In this example, x has shape (2, 3) and y has shape (3,). According to the broadcasting rules:
    The shape of y is padded with ones on the left side to match the number of dimensions of x. So, y effectively becomes (1, 3).
    Since the sizes along the first axis (axis 0) don't match (2 vs. 1), y is stretched along that axis to match the shape of x, i.e. its single row is reused for every row of x.
    Finally, element-wise addition is performed.
</details>

So, in essence, <font color="green">**broadcasting**</font> means that <font color="green">**NumPy**</font> can sometimes infer your meaning, even if you combine arrays with different <font color="green">**shapes**</font>.
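
The rules also allow both arrays to be stretched at the same time. As a minimal sketch, with the illustrative names `col` and `row`, adding a column of shape (3, 1) to a 1D array of shape (4,) produces a full 3 by 4 table:

```python
import numpy as np

col = np.array([[0], [10], [20]])  # shape (3, 1)
row = np.array([1, 2, 3, 4])       # shape (4,), treated as (1, 4)

# col is stretched along axis 1, row is stretched along axis 0
table = col + row
print(table.shape)  # (3, 4)
print(table)
# [[ 1  2  3  4]
#  [11 12 13 14]
#  [21 22 23 24]]
```

Each entry of the result is the sum of one element of `col` and one element of `row`, without any loop or manual copying.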

## 9. More math and functions

<font color="green">**NumPy**</font> offers a ton of mathematical functions beyond basic arithmetic. For most mathematical calculations you need to do, <font color="green">**NumPy**</font> has a function.

- Basic Mathematical Operations
- Trigonometric Functions
- Hyperbolic Functions
- Rounding and Absolute Functions
- Statistical Functions
- Linear Algebra Functions
- Random Number Generation

In [79]:
print(a_array)
print(b_array)

[1 2 3 4 5]
[ 6  7  8  9 10]


### Basic math formulas

In [80]:
# natural logarithm
np.log(a_array)

array([0.        , 0.69314718, 1.09861229, 1.38629436, 1.60943791])

In [81]:
# log 10
np.log10(a_array)

array([0.        , 0.30103   , 0.47712125, 0.60205999, 0.69897   ])

In [82]:
# log 2
np.log2(a_array)

array([0.        , 1.        , 1.5849625 , 2.        , 2.32192809])

In [83]:
# exponential
np.exp(a_array)

array([  2.71828183,   7.3890561 ,  20.08553692,  54.59815003,
       148.4131591 ])

In [86]:
# power
np.power(a_array, 3)

array([  1,   8,  27,  64, 125])

In [87]:
# square root
np.sqrt(a_array)

array([1.        , 1.41421356, 1.73205081, 2.        , 2.23606798])

In [88]:
# sum
np.sum(a_array)

15

### Statistics

In [89]:
# mean
np.mean(a_array)

3.0

In [90]:
# median
np.median(b_array)

8.0

In [91]:
# standard deviation
np.std(a_array)

1.4142135623730951

In [92]:
# min and max
print(np.min(a_array))
print(np.max(a_array))

1
5


In [93]:
# variance
np.var(a_array)

2.0
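
One detail to be aware of: by default, `np.std()` and `np.var()` divide by the number of elements n, which gives the population standard deviation and variance. If you need the sample versions that divide by n - 1, you can pass `ddof=1`. A minimal sketch on the same numbers as `a_array`:

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5])

# default (ddof=0): divide by n
print(np.var(values))  # 2.0
print(np.std(values))  # 1.4142135623730951

# ddof=1: divide by n - 1 (sample variance and standard deviation)
sample_var = np.var(values, ddof=1)
sample_std = np.std(values, ddof=1)
print(sample_var)  # 2.5
print(sample_std)  # the square root of 2.5
```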

In [94]:
np.log10(34)

1.5314789170422551

### Rounding and absolute

In [98]:
# round, floor and ceil
print(np.round(4.6))
print(np.floor(4.6))
print(np.ceil(4.2))

5.0
4.0
5.0
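
Two further properties of `np.round()` can be surprising, so here is a short sketch: values exactly halfway between two integers are rounded to the nearest even number, and an optional second argument sets the number of decimals.

```python
import numpy as np

# halves are rounded to the nearest even value
halves = np.round(np.array([0.5, 1.5, 2.5, 3.5]))
print(halves)  # [0. 2. 2. 4.]

# round to two decimals
two_decimals = np.round(3.14159, 2)
print(two_decimals)  # 3.14
```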


In [99]:
# absolute
print(
    np.abs(
        np.array([-1, 2, -3, 4, -5])
    )
)

[1 2 3 4 5]


### Random numbers

Sometimes you need random numbers, e.g. to create the jitter in a jitter plot. For that, you use a so-called <font color="green">**random number generator**</font>!
<font color="green">**NumPy**</font> offers some functions to generate random numbers:

In [117]:
# generate a random number from a uniform distribution

# a single number between 0 and 1
np.random.rand()

0.10960654294276995

In [122]:
# a random array
np.random.rand(2, 3)

array([[0.35860582, 0.08958341, 0.09200314],
       [0.05220264, 0.74676567, 0.94978286]])

In [139]:
# random numbers from a standard normal distribution
np.random.randn()

0.8702439939082629

In [143]:
# a normal distributed random array
np.random.randn(1, 3)

array([[-0.69020833,  0.86029539,  0.45922823]])

In [157]:
# random integers
print(np.random.randint(10))  # from 0 to 9 (the upper bound is exclusive)
print(np.random.randint(-20, 10))  # from -20 to 9

8
-7


In [158]:
# an array thereof
print(np.random.randint(10, size=(2, 3)))  # from 0 to 9 (the upper bound is exclusive)
print(np.random.randint(-20, 10, (3, 4)))  # from -20 to 9


[[8 2 5]
 [3 4 3]]
[[  1  -2   0 -11]
 [ -5   3   6   1]
 [ -8 -17  -3   4]]
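
Newer NumPy versions (1.17 and later) additionally offer a generator object created with `np.random.default_rng()`. The sketch below is an alternative to the functions used above, not a replacement for them; the seed 42 is an arbitrary choice. Passing a seed makes the "random" numbers reproducible, which is handy when others should be able to repeat your analysis.

```python
import numpy as np

rng = np.random.default_rng(42)  # 42 is an arbitrary seed

print(rng.random())                 # a uniform float between 0 and 1
print(rng.standard_normal(3))       # three draws from a standard normal distribution
print(rng.integers(0, 10, size=5))  # five integers from 0 to 9 (upper bound exclusive)

# two generators with the same seed produce the same sequence
first = np.random.default_rng(42).integers(0, 10, size=5)
second = np.random.default_rng(42).integers(0, 10, size=5)
print(np.array_equal(first, second))  # True
```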


# Summary

Now you should know:

- what are packages
- what is NumPy
- What are arrays? How do you create them?
- How can you calculate with arrays?
- What is masking? 
- How do you filter arrays?
- How can you create random numbers?


## Exercise 1

Create a 1D array of length 10 that contains only the literal True as elements, using a single line of code.
Do the same for the literal False.


## Exercise 2

Create a 1D array of length 10 that contains unordered random numbers, then sort them in descending order.

## Exercise 3

Create a 3x3 matrix with random integers between 1 and 10. Replace all odd numbers with -1.

## Exercise 4

Create an array with 20 random integers. Write a function that finds all elements that are divisible by both 3 and 5.

## Exercise 5

Given a 1D array arr1 of shape (5,) and a 2D array arr2 of shape (5, 3), add arr1 to each column of arr2 without using loops.

## Exercise 6

Create this 2D array and access the indicated elements with slicing.

<br>
<div>
<img src="data/exercise_slice.svg" width="600"/>
</div>


## Exercise 7

Create a 1D array with random integers. Find the sum of all elements that are greater than the mean of the array.

## Exercise 8

Create two arrays a and b of the same shape with random integers. Compute the element-wise maximum, minimum and sum of the two arrays.

## Exercise 9

Create a 2D array of shape (4, 6) with consecutive integers starting from 1. Reshape it into a 3D array of shape (2, 3, 4).

## Exercise 10

Create a 2D array of shape (5, 4) with random integers. Compute the mean and standard deviation of each column.

< [6 - What could possibly go wrong?](Python%20Crash%206%20-%20What%20could%20possibly%20go%20wrong.ipynb) | [Contents](Python%20Crash%20ToC.ipynb) | [8 - Everyone loves pandas](Python%20Crash%208%20-%20Everyone%20loves%20pandas.ipynb) >