[![Fixel Algorithms](https://i.imgur.com/AqKHVZ0.png)](https://fixelalgorithms.gitlab.io/)

# AI Program

## Exercise 0002 - Scientific Python

> Notebook by:
> - Royi Avital RoyiAvital@fixelalgorithms.com

## Revision History

| Version | Date       | User        |Content / Changes                                                   |
|---------|------------|-------------|--------------------------------------------------------------------|
| 0.1.000 | 23/02/2024 | Royi Avital | First version                                                      |

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/FixelAlgorithmsTeam/FixelCourses/blob/master/AIProgram/2024_02/Exercise0002.ipynb)

In [None]:
# Import Packages

# General Tools
import numpy as np
import scipy as sp
import pandas as pd

from numba import float32, float64, jit, njit, vectorize

# Image Processing

# Machine Learning


# Miscellaneous
import os
from platform import python_version
import random
import timeit

# Typing
from typing import Callable, Dict, List, Optional, Set, Tuple, Union

# Visualization
import matplotlib as mpl
import matplotlib.pyplot as plt
import plotly.graph_objects as go
import seaborn as sns

# Jupyter
from IPython import get_ipython
from IPython.display import Image, display
from ipywidgets import Dropdown, FloatSlider, interact, IntSlider, Layout

## Notations

* <font color='red'>(**?**)</font> Question to answer interactively.
* <font color='blue'>(**!**)</font> Simple task to add code for the notebook.
* <font color='green'>(**@**)</font> Optional / Extra self practice.
* <font color='brown'>(**#**)</font> Note / Useful resource / Food for thought.

Code Notations:

```python
someVar    = 2; #<! Notation for a variable
vVector    = np.random.rand(4) #<! Notation for 1D array
mMatrix    = np.random.rand(4, 3) #<! Notation for 2D array
tTensor    = np.random.rand(4, 3, 2, 3) #<! Notation for nD array (Tensor)
tuTuple    = (1, 2, 3) #<! Notation for a tuple
lList      = [1, 2, 3] #<! Notation for a list
dDict      = {1: 3, 2: 2, 3: 1} #<! Notation for a dictionary
oObj       = MyClass() #<! Notation for an object
dfData     = pd.DataFrame() #<! Notation for a data frame
dsData     = pd.Series() #<! Notation for a series
hObj       = plt.Axes() #<! Notation for an object / handler / function handler
```

### Code Exercise

 - Single line fill

 ```python
 valToFill = ???
 ```

 - Multi Line to Fill (At least one)

 ```python
 # You need to start writing
 ????
 ```

 - Section to Fill

```python
#===========================Fill This===========================#
# 1. Explanation about what to do.
# !! Remarks to follow / take under consideration.
mX = ???

???
#===============================================================#
```

In [None]:
# Configuration
# %matplotlib inline

seedNum = 512
np.random.seed(seedNum)
random.seed(seedNum)

# sns.set_theme() #<! Apply Seaborn theme

runInGoogleColab = 'google.colab' in str(get_ipython())

In [None]:
# Constants



In [None]:
# Course Packages


In [None]:
# General Auxiliary Functions



## Question 001 - Implement the $\operatorname{Diag}$ and $\operatorname{diag}$ Operators

This section implements the $\operatorname{Diag}$ and $\operatorname{diag}$ operators without calling them explicitly.  
Namely, they are built from other linear operators, without using `np.diag()`, `np.diagonal()`, `np.diagflat()`, etc.

1. Derive $\operatorname{Diag}$ analytically.
2. Derive $\operatorname{diag}$ analytically.
3. Implement the function `OperatorDiagMat()`.
4. Implement the function `OperatorDiagVec()`.

Let $\boldsymbol{X} \in \mathbb{R}^{d \times d}$:
 * The function $\operatorname{diag} \left( \cdot \right) : \mathbb{R}^{d \times d} \to \mathbb{R}^{d}$ returns the diagonal of a matrix, that is, $\boldsymbol{b} = \operatorname{diag} \left( \boldsymbol{X} \right) \implies \boldsymbol{b} \left[ i \right] = \boldsymbol{X} \left[ i, i\right]$.
 * The function $\operatorname{Diag} \left( \cdot \right) : \mathbb{R}^{d} \to \mathbb{R}^{d \times d}$ returns a diagonal matrix from a vector, that is, $\boldsymbol{B} = \operatorname{Diag} \left( \boldsymbol{x} \right) \implies \boldsymbol{B} \left[ i, j \right] = \begin{cases} {x}_{i} & \text{ if } i = j \\ 0 & \text{ if } i \neq j \end{cases}$.


## Solution 001

1. $\operatorname{Diag}  \left( \boldsymbol{a} \right) = \left( \boldsymbol{a} \boldsymbol{1}^{T} \right) \circ \boldsymbol{I}$.
2. $\operatorname{diag} \left( \boldsymbol{A} \right) = \left( \boldsymbol{A} \circ \boldsymbol{I} \right) \boldsymbol{1}$.
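
As a sanity check of the two identities, here is a worked case for $d = 2$, reading $\circ$ as the element-wise (Hadamard) product and $\boldsymbol{1}$ as the vector of ones:

$$ \operatorname{Diag} \left( \boldsymbol{a} \right) = \left( \boldsymbol{a} \boldsymbol{1}^{T} \right) \circ \boldsymbol{I} = \begin{bmatrix} {a}_{1} & {a}_{1} \\ {a}_{2} & {a}_{2} \end{bmatrix} \circ \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} {a}_{1} & 0 \\ 0 & {a}_{2} \end{bmatrix} $$

$$ \operatorname{diag} \left( \boldsymbol{A} \right) = \left( \boldsymbol{A} \circ \boldsymbol{I} \right) \boldsymbol{1} = \begin{bmatrix} {A}_{11} & 0 \\ 0 & {A}_{22} \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} {A}_{11} \\ {A}_{22} \end{bmatrix} $$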

---

In [None]:
# Implement `Diag`

#===========================Fill This===========================#
# 1. Implement the `OperatorDiagMat()` function.
# 2. The input is a vector, the output is a diagonal matrix with the input as its diagonal.
# !! Try to implement without loops.
# !! You may find `np.ones()` and `np.eye()` useful.

def OperatorDiagMat(vA: np.ndarray) -> np.ndarray:

    numElements = vA.size
    
    return np.outer(vA, np.ones(numElements)) * np.eye(numElements)

#===============================================================#

In [None]:
# Implement `diag`

#===========================Fill This===========================#
# 1. Implement the `OperatorDiagVec()` function.
# 2. The input is a square matrix, the output is a vector.
#    The output vector is the main diagonal of the input matrix.
# !! Try to implement without loops.
# !! You may find `np.ones()` and `np.eye()` useful.

def OperatorDiagVec(mA: np.ndarray) -> np.ndarray:

    numRows = mA.shape[0]
    
    return (mA * np.eye(numRows, numRows)) @ np.ones(numRows)

#===============================================================#

In [None]:
# Verify Implementation 
numRows, numCols = 5, 4

vA = np.random.rand(numRows)
mA = np.random.rand(numRows, numRows)
mB = np.random.rand(numRows, numCols)

print(f'The implementation of `OperatorDiagMat()` is verified: {np.all(np.diag(vA) == OperatorDiagMat(vA))}')
print(f'The implementation of `OperatorDiagVec()` is verified: {np.all(np.diag(mA) == OperatorDiagVec(mA))}')


* <font color='green'>(**@**)</font> Add support for non-square matrices in `OperatorDiagVec()`.
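
One possible way to approach the optional task, offered as a sketch rather than a reference solution: mask the matrix with a rectangular identity, sum each row, then keep the first $\min \left( m, n \right)$ entries using another rectangular identity as a selection matrix. The name `OperatorDiagVecRect()` is hypothetical and not part of the exercise.

```python
import numpy as np

def OperatorDiagVecRect(mA: np.ndarray) -> np.ndarray:
    # Hypothetical helper (not part of the original exercise)
    numRows, numCols = mA.shape
    numDiag = min(numRows, numCols) #<! Length of the main diagonal

    vRowSum = (mA * np.eye(numRows, numCols)) @ np.ones(numCols) #<! Zero the off diagonal, sum each row

    return np.eye(numDiag, numRows) @ vRowSum #<! Keep the first `numDiag` entries

mB = np.arange(20, dtype = np.float64).reshape(5, 4)
vB = OperatorDiagVecRect(mB) #<! Main diagonal of `mB`: 0, 5, 10, 15
print(vB)
```

For a square input the selection matrix is the identity, so the function reduces to the expression used in `OperatorDiagVec()`.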

## Question 002 - Estimating Probability

The Birthday Problem asks for the probability that, given $k$ people in a room, no two of them celebrate their birthday on the same day.  
Although the odds of a shared birthday seem low at first glance, this section shows what they are in practice.

1. Derive analytically the probability that no one shares a birthday with anyone else.  
2. Simulate the case for `k` people with `r` realizations. 

In the analysis assume a year is 365 days.

* <font color='brown'>(**#**)</font> In this context, a _realization_ means a single experiment of sampling `k` people. The statistics are then analyzed over those realizations.
* <font color='brown'>(**#**)</font> The original problem is about the probability that at least 2 people share the same day. This formulation is easier to analyze.
* <font color='brown'>(**#**)</font> Verify your answer with simple edge case tests. For instance, for which $k$ is the probability zero?


## Solution 002

1. For $k$ people the number of possible (ordered) assignments of $k$ birthdays is given by ${n}^{k}$ where $n = 365$.
2. The number of ways to choose $k$ days out of $365$ so that no date is shared is given by ${P}^{n}_{k} = \binom{n}{k} k! = \frac{n!}{ \left( n - k \right)! } = n \left( n - 1 \right) \left( n - 2 \right) \cdots \left( n - k + 1 \right)$.  
   The reasoning is that the order matters: for equally probable events, choosing $\left\{ 1, 2 \right\}$ is twice as probable as choosing $\left\{ 1, 1 \right\}$, hence it has to be counted twice. This means the permutations of the $k$ chosen dates (people) should be counted.
3. Hence the chance of having no shared birthday is $\frac{n!}{ \left( n - k \right)! \cdot {n}^{k} }$.
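
The closed form above can also be evaluated as a running product, which avoids large factorials and makes the edge cases easy to check. This is a minimal sketch; the name `NoSharedBirthdayProbProduct()` is hypothetical and not part of the exercise.

```python
def NoSharedBirthdayProbProduct(k: int, n: int = 365) -> float:
    # Hypothetical helper: evaluates n! / ((n - k)! * n^k) as a running product
    valProb = 1.0
    for ii in range(k):
        valProb *= (n - ii) / n #<! The next person must avoid the `ii` days already taken

    return valProb

print(NoSharedBirthdayProbProduct(1))   #<! 1.0, a single person cannot share a birthday
print(NoSharedBirthdayProbProduct(23))  #<! About 0.4927
print(NoSharedBirthdayProbProduct(366)) #<! 0.0, more people than days
```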


* <font color='brown'>(**#**)</font> In probability, the ratio $\frac{\left | {\Omega}_{E} \right |}{\left | \Omega \right |}$ is valid only when the samples being counted are equally probable.
* <font color='brown'>(**#**)</font> Other analysis in [Scientific American - Probability and the Birthday Paradox](https://www.scientificamerican.com/article/bring-science-home-probability-birthday-paradox/), [Wolfram MathWorld - Birthday Problem](https://mathworld.wolfram.com/BirthdayProblem.html).
* <font color='brown'>(**#**)</font> One interpretation of ${P}^{n}_{k}$ is the number of ways of distributing $k$ distinct objects to $n$ distinct boxes when at most one object may be placed in each box, since it matters which object is placed in which box.
* <font color='brown'>(**#**)</font> Variants & Solutions: [Birthday Problem - Expected Number of Collisions](https://math.stackexchange.com/questions/35791), [Birthday Problem - Expected Value](https://math.stackexchange.com/questions/211295), [Birthday Problem - Probability of Multiple Collisions](https://math.stackexchange.com/questions/535868), [Birthday Problem - Probability of 3 People Having the Same Birthday](https://math.stackexchange.com/questions/25876), [Birthday Problem - Using Combinations Instead of Permutations](https://math.stackexchange.com/questions/2771627).

---

In [None]:
# Generating an Array of Realizations
# This section implements a function which generates an array of `r` realizations.
# Each realization simulates `k` people.
# The dates will be mapped into the range {0, 1, 2, ..., 364}.

#===========================Fill This===========================#
# 1. Implement the `BirthdayRealizations()` function.
# 2. Given `k` people, draw `k` birthday days.
# 3. Concatenate `r` realizations into array `k x r`.
# !! Pay attention to the mapping of the days.
# !! You may find `np.random.randint()` and `np.random.choice()` useful.

def BirthdayRealizations( r: int, k: int ) -> np.ndarray:
    
    return np.random.randint(365, size = (k, r))

#===============================================================#

In [None]:
# Probability of No Shared Birthday
# Given an array of experiments of the same `k`, calculate the probability of no shared birthday.

#===========================Fill This===========================#
# 1. Implement the `NoSharedBirthdayProb()` function.
# 2. Extract the number of realizations.
# 3. Check the number of cases of unique days.
# !! Pay attention to the mapping of the days.
# !! You may find `np.unique()` and `np.bincount()` useful.
# !! You may rewrite everything in your own style.

@njit
def NoSharedBirthdayProb( mR: np.ndarray ) -> float:

    numRealizations = mR.shape[1]
    notSharedCnt = 0
    for ii in range(numRealizations):
        notSharedCnt += np.all(np.bincount(mR[:, ii]) <= 1)
    
    return notSharedCnt / numRealizations

#===============================================================#

In [None]:
# Analytic Calculation of the Probability 
def NoSharedBirthdayProbAnalytic( k: int ) -> float:

    numComb = sp.special.perm(365, k)
    
    return numComb / (365 ** k) #<! `365 ** k` is exact when `k` is a Python `int` (Arbitrary precision)

In [None]:
# Verifying the Realizations
# This section verifies the empirical results.

maxK            = 100
numRealizations = 10_000

vK      = range(1, maxK + 1) #<! Important to keep the numbers as Python integers
vP      = [100 * NoSharedBirthdayProb(BirthdayRealizations(numRealizations, kk)) for kk in vK]
vPRef   = [100 * NoSharedBirthdayProbAnalytic(kk) for kk in vK]

In [None]:
# Plot Results

hF, hA = plt.subplots(figsize = (10, 6))
hA.plot(vK, vP, lw = 2, label = 'Empirical Results')
hA.plot(vK, vPRef, lw = 2, label = 'Analytic Solution')
hA.set_title(f'The Birthday Problem: Empirical ({numRealizations} Realizations) vs. Analytic')
hA.set_xlabel('Number of People')
hA.set_ylabel('Probability [%]')
hA.legend();

* <font color='red'>(**?**)</font> You're entering a party with 40 people. Someone offers you a bet.  
If there are at least 2 people with the same birthday, you get \$50, otherwise he gets \$150. Should you take the bet?
* <font color='blue'>(**!**)</font> Replace `vK = range(1, maxK + 1)` with `vK = np.arange(1, maxK + 1)`. Run the analysis and explain results.
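
A likely ingredient of the explanation for the task above is integer overflow: `np.arange()` yields fixed width integers, so `365 ** k` no longer uses Python's arbitrary precision arithmetic. The sketch below assumes 64 bit integers (set explicitly through `dtype`) and compares the two on a short range; the variable names are illustrative only.

```python
import numpy as np

vK     = np.arange(1, 11, dtype = np.int64) #<! Fixed width 64 bit integers
vPowNp = 365 ** vK                          #<! Wraps around silently on overflow
lPowPy = [365 ** int(kk) for kk in vK]      #<! Python `int`, arbitrary precision

# Values of `k` for which the fixed width result differs from the exact one
lMismatch = [int(kk) for kk, valNp, valPy in zip(vK, vPowNp, lPowPy) if int(valNp) != valPy]
print(lMismatch)
```

Since $365^{7}$ still fits in a signed 64 bit integer while $365^{8}$ does not, the mismatch starts at $k = 8$.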

## Question 003 - Minimizing a Function

This section shows how to find the minimum of a function.  
The function is given by ([Rastrigin Function](https://en.wikipedia.org/wiki/Rastrigin_function)):

$$ f \left( x, y \right) = 20 + {y}^{2} + {x}^{2} - 10 \cos \left( 2 \pi x \right) - 10 \cos \left( 2 \pi y \right) $$

* <font color='brown'>(**#**)</font> See more functions at [Optimization Test Functions](http://www.sfu.ca/~ssurjano/optimization.html) and [Wikipedia - Test Functions for Optimization](https://en.wikipedia.org/wiki/Test_functions_for_optimization).

In this section the function `sp.optimize.minimize(method = 'BFGS')` will be used for minimization.

* <font color='brown'>(**#**)</font> There are several different optimizers in SciPy as seen in [SciPy Optimize](https://docs.scipy.org/doc/scipy/reference/optimize.html).
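
To illustrate the call pattern before applying it to the Rastrigin function, here is a minimal sketch of `sp.optimize.minimize(method = 'BFGS')` on a simple convex quadratic. The function `ShiftedQuadratic()` and the start point are made up for illustration; no gradient is supplied, so SciPy approximates it numerically.

```python
import numpy as np
import scipy as sp
import scipy.optimize

def ShiftedQuadratic(vX: np.ndarray) -> float:
    # Hypothetical test function with a single minimum at (1, -2)
    return (vX[0] - 1.0) ** 2 + (vX[1] + 2.0) ** 2

vX0     = np.array([0.0, 0.0]) #<! Start point
dOptRes = sp.optimize.minimize(ShiftedQuadratic, vX0, method = 'BFGS')

print(dOptRes.x)   #<! Estimated minimizer, close to (1, -2)
print(dOptRes.fun) #<! Objective value at the minimizer, close to 0
print(dOptRes.nit) #<! Number of iterations
```

For a convex function such as this one, the result should not depend on the start point; that property is not guaranteed for a general function.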

In [None]:
# Generate the Data Grid

tuGrid = (-5, 5, 1001)
vX = np.linspace(*tuGrid)
vY = np.linspace(*tuGrid)

In [None]:
# The Rastrigin Function
# Implement the 2D Rastrigin function.
# This section implements the function in 2 ways:
# 1. Vectorized: Classic NumPy implementation.
# 2. Scalar: Evaluation per `x` and `y` where vectorization is done by Numba.


#===========================Fill This===========================#
# 1. Implement the `Rastrigin2DVec()` function (Vectorized style).
# 2. Calculate the function along x.
# 3. Calculate the function along y.
# 4. Merge the 2 by broadcasting.
# !! Try to avoid loops.
# !! You may choose a different implementation path.

# @njit
def Rastrigin2DVec( vX: np.ndarray, vY: np.ndarray ) -> np.ndarray:

    vFx = 10 + np.square(vX) - 10 * np.cos(2 * np.pi * vX)
    vFy = 10 + np.square(vY) - 10 * np.cos(2 * np.pi * vY)
    mF  = vFy[:, None] + vFx[None, :]
    
    return mF

#===============================================================#

In [None]:
#===========================Fill This===========================#
# 1. Implement the `Rastrigin2D()` function.
# 2. Calculate the function given **scalars** `valX` and `valY`.

@vectorize([float32(float32, float32), float64(float64, float64)])
def Rastrigin2D( valX: float, valY: float ) -> float:
    
    return 20 + np.square(valX) - 10 * np.cos(2 * np.pi * valX) + np.square(valY) - 10 * np.cos(2 * np.pi * valY)

#===============================================================#

In [None]:
# Verify Implementations
ε = 1e-6

mFVec = Rastrigin2DVec(vX, vY)
mF    = Rastrigin2D(vX[None, :], vY[:, None])

maxAbsDev = np.max(np.abs(mFVec - mF))

print(f'The maximum absolute deviance between implementations: {maxAbsDev}')
print(f'The implementations are verified: {maxAbsDev < ε}')

In [None]:
# Visualize the Function

hF, hA = plt.subplots(figsize = (10, 10))

rangeXTicks = range(0, tuGrid[2], 50)

oImgPlt = hA.imshow(mF)
hA.set_xticks(rangeXTicks)
hA.set_xticklabels(vX[rangeXTicks])
hA.set_yticks(rangeXTicks)
hA.set_yticklabels(vY[rangeXTicks])
hA.set_title('The 2D Rastrigin Function')
hA.set_xlabel(r'$x$')
hA.set_ylabel(r'$y$')
hF.colorbar(oImgPlt);

* <font color='red'>(**?**)</font> Is the function _Convex_? What does it imply about its minimum?

In [None]:
# Optimization Path

maxNumIter = 1000
numStartPts = 10

hF = lambda vInput: Rastrigin2D(vInput[0], vInput[1])

# Adding some margins to the start points
mX0 = np.vstack(([0.1, 0.15], (tuGrid[0] + 1) + (tuGrid[1] - tuGrid[0] - 2) * np.random.rand(numStartPts - 1, 2))) #<! `np.row_stack()` is deprecated in NumPy 2.0
mPath = np.full(shape = (maxNumIter, 2, numStartPts), fill_value = np.nan)

jj = 0

def MinCallback(mPath: np.ndarray, intermediate_result: sp.optimize.OptimizeResult, startPtIdx: int) -> None:
    
    global jj

    mPath[jj, :, startPtIdx] = intermediate_result['x']

    jj += 1


for ii in range(numStartPts):
    hC = lambda intermediate_result: MinCallback(mPath, intermediate_result, ii)
    dOptRes = sp.optimize.minimize(hF, mX0[ii, :], method = 'BFGS', options = {'maxiter': maxNumIter}, callback = hC)
    jj = 0 #<! New count each start index


In [None]:
# Concatenating the Start Point to the Path Points
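# NumPy is like `C`, the last dimension is contiguous.
# `mX0` is (numStartPts, 2). It is reshaped into (1, 2, numStartPts) to match the layout of `mPath`.
mTMP = np.concatenate((np.transpose(mX0[:, :, None], (2, 1, 0)), mPath)) #<! (maxNumIter + 1, 2, numStartPts)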

In [None]:
# Draw the Optimization Path

# Starting Point marker to be bigger
vMarkerSize = 6 * np.ones(maxNumIter + 1) #<! The start point + up to `maxNumIter` path points
vMarkerSize[0] = 12

# Draw the Path (Using Plotly)
hFig = go.Figure()
hFig.add_trace(go.Heatmap(x = vX, y = vY, z = mF))
for ii in range(numStartPts):
    hFig.add_trace(go.Scatter(x = mTMP[:, 0, ii], y = mTMP[:, 1, ii], mode = 'markers', marker = {'size': vMarkerSize}, name = f'{ii:02d}'))
hFig.update_layout(autosize = False, width = 800, height = 800, title = 'Optimization Path', 
                   legend = {'orientation': 'h', 'yanchor': 'bottom', 'y': 1.02, 'xanchor': 'left', 'x': 0.01})