[![Fixel Algorithms](https://fixelalgorithms.co/images/CCExt.png)](https://fixelalgorithms.gitlab.io/)

# Image Processing with Python

## NumPy Basics

> Notebook by:
> - Royi Avital RoyiAvital@fixelalgorithms.com

## Revision History

| Version | Date       | User        | Content / Changes                                                  |
|---------|------------|-------------|--------------------------------------------------------------------|
| 0.1.000 | 03/10/2023 | Royi Avital | First version                                                      |

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/FixelAlgorithmsTeam/FixelCourses/blob/master/ImageProcessingPython/0001NumPy.ipynb)

In [None]:
# Import Packages

# General Tools
import numpy as np
import scipy as sp
import pandas as pd

from numba import jit, njit

# Image Processing

# Machine Learning


# Miscellaneous
import os
from platform import python_version
import random
import timeit

# Typing
from typing import Callable, List, Tuple

# Visualization
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns
# from bokeh.plotting import figure, show

# Jupyter
from IPython import get_ipython
from IPython.display import Image, display
from ipywidgets import Dropdown, FloatSlider, interact, IntSlider, Layout

## Notations

* <font color='red'>(**?**)</font> Question to answer interactively.
* <font color='blue'>(**!**)</font> Simple task to add code for the notebook.
* <font color='green'>(**@**)</font> Optional / Extra self practice.
* <font color='brown'>(**#**)</font> Note / Useful resource / Food for thought.

### Code Exercise

 - Single line fill

 ```python
 valToFill = ???
 ```

 - Multi Line to Fill (At least one)

 ```python
 # You need to start writing
 ????
 ```

 - Section to Fill

```python
#===========================Fill This===========================#
# 1. Explanation about what to do.
# !! Remarks to follow / take under consideration.
mX = ???

???
#===============================================================#
```

In [None]:
# Configuration
# %matplotlib inline

seedNum = 512
np.random.seed(seedNum)
random.seed(seedNum)

# sns.set_theme() #>! Apply SeaBorn theme

runInGoogleColab = 'google.colab' in str(get_ipython())

In [None]:
# Constants



In [None]:
# Fixel Algorithms Packages


In [None]:
# General Auxiliary Functions

def MatBlockView(mI: np.ndarray, tuBlockShape: Tuple[int, int] = (4, 4)) -> np.ndarray:
    """
    Generates a view of blocks of shape `tuBlockShape` of the input 2D NumPy array.
    Input:
      - mI           : NumPy 2D array.
      - tuBlockShape : A tuple of the block shape.
    Output:
      - tBlockView   : 4D view of shape `(numBlockRows, numBlockCols) + tuBlockShape`,
                       where `tBlockView[ii, jj]` is the block at block row `ii` and block column `jj`.
    Remarks:
      - It is assumed that the shape of the input array `mI` is an integer multiple
        of the block shape.
      - No verification of compatibility of shapes is done.
    """
    # Pay attention to integer division
    # Tuple addition means concatenation of the Tuples
    tuShape   = (mI.shape[0] // tuBlockShape[0], mI.shape[1] // tuBlockShape[1]) + tuBlockShape
    tuStrides = (tuBlockShape[0] * mI.strides[0], tuBlockShape[1] * mI.strides[1]) + mI.strides
    
    return np.lib.stride_tricks.as_strided(mI, shape = tuShape, strides = tuStrides)
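The stride arithmetic of `MatBlockView()` can be checked on a small case. The following is a minimal sketch which repeats the same shape / stride computation inline. The input `mI` and the block shape `(2, 3)` are hypothetical values chosen for illustration.

```python
import numpy as np

mI           = np.arange(24).reshape(4, 6)  # Hypothetical small input
tuBlockShape = (2, 3)                       # Hypothetical block shape

# Same computation as in `MatBlockView()`
tuShape   = (mI.shape[0] // tuBlockShape[0], mI.shape[1] // tuBlockShape[1]) + tuBlockShape
tuStrides = (tuBlockShape[0] * mI.strides[0], tuBlockShape[1] * mI.strides[1]) + mI.strides
tB = np.lib.stride_tricks.as_strided(mI, shape = tuShape, strides = tuStrides)

print(tB.shape)  # (2, 2, 2, 3): 2x2 grid of blocks, each block is 2x3
print(tB[0, 1])  # The block at block row 0 and block column 1, equals `mI[0:2, 3:6]`

# The view shares memory with the input (no copy is made)
print(np.shares_memory(tB, mI))

# For contiguous input, a reshape followed by an axes swap gives the same blocks
tR = mI.reshape(2, 2, 2, 3).swapaxes(1, 2)
print(np.array_equal(tB, tR))
```

Since `as_strided()` performs no bounds checking, wrong shapes / strides may read memory outside the array, hence the remark about compatible shapes in the docstring.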

## NumPy Basics

This _notebook_ exercises some NumPy concepts.  
It focuses on vectorization tricks and on accelerating operations.

* <font color='brown'>(**#**)</font> For performance measurement the package [`timeit`](https://docs.python.org/3/library/timeit.html) or the `%timeit` magic will be used.
* <font color='brown'>(**#**)</font> For visualization the package [Matplotlib](https://github.com/matplotlib/matplotlib) will be used.
* <font color='brown'>(**#**)</font> For acceleration the package [Numba](https://github.com/numba/numba) will be used.
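A minimal sketch of timing a single statement with the `timeit` package. The timed statement and the name `numElements` are hypothetical and serve only as an illustration. An explicit dictionary is passed as `globals` so the statement can see the needed names. In a notebook, `globals = globals()` is a common alternative.

```python
import timeit

import numpy as np

numElements = 1_000  # Hypothetical size, chosen only for this sketch
numRuns     = 1_000  # Number of times the statement is executed

# `timeit.timeit()` returns the total run time [Sec] of `number` executions
totalTime = timeit.timeit('np.arange(numElements)',
                          globals = {'np': np, 'numElements': numElements},
                          number = numRuns)

print(f'Average run time: {totalTime / numRuns:0.3e} [Sec]')
```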

### Array Generation

This section exercises several ways to generate / initialize NumPy arrays.

* <font color='brown'>(**#**)</font> Relevant NumPy functions are: [`zeros()`](https://numpy.org/doc/stable/reference/generated/numpy.zeros.html), [`ones()`](https://numpy.org/doc/stable/reference/generated/numpy.ones.html), [`full()`](https://numpy.org/doc/stable/reference/generated/numpy.full.html), [`empty()`](https://numpy.org/doc/stable/reference/generated/numpy.empty.html).
* <font color='brown'>(**#**)</font> Pay attention to the element type (`dtype`).
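The effect of the element type can be seen in a short sketch. The shape and the fill value below are hypothetical.

```python
import numpy as np

tuShape = (2, 3)  # Hypothetical shape for illustration

mZ = np.zeros(tuShape)                      # Default element type is `float64`
mO = np.ones(tuShape, dtype = np.uint8)     # Explicit element type
mF = np.full(tuShape, fill_value = 7)       # Element type inferred from the fill value (An integer type)
mE = np.empty(tuShape, dtype = np.float32)  # Uninitialized memory, the values are arbitrary

print(mZ.dtype, mO.dtype, mF.dtype, mE.dtype)
```

The values of `mE` should not be read before they are written, since `empty()` only allocates the memory.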

In [None]:
# Parameters

numRows, numCols = 300, 500
numIter = 100

In [None]:
#===========================Fill This===========================#
# 1. Compare the runtime of allocating an array using `ones()` vs. `empty()`.
# 2. Use the `timeit` package to compare run time.
# !! Read documentation about `globals` in `timeit()`.

timeOnes  = ???
timeEmpty = ???

if timeOnes < timeEmpty:
    print(f'Generating array of ones is {timeEmpty / timeOnes} times faster!')
else:
    print(f'Generating empty array is {timeOnes / timeEmpty} times faster!')

#===============================================================#


Another way to time functions / code snippets is using the `%timeit` magic of `Jupyter`.  
This section compares generating an array of zeros with `full()` and `zeros()`.

* <font color='brown'>(**#**)</font> Using `full()` is an effective way to initialize an array with `NaN`.
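A short sketch of the `NaN` initialization, assuming a hypothetical use case where entries which were never written should be detectable later:

```python
import numpy as np

mN = np.full(shape = (2, 3), fill_value = np.nan)  # Floating point array filled with NaN
mN[0, 1] = 5.0                                     # Set a single entry

mIsUnset = np.isnan(mN)  # `True` where no value was written
print(mN.dtype)
print(np.sum(mIsUnset))  # Number of entries which are still NaN
```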

In [None]:
%timeit np.zeros(shape = (numRows, numCols))
%timeit np.full(shape = (numRows, numCols), fill_value = 0.0)

NumPy has advanced pseudo random number generators.  
This section compares the performance of the newer generator interface with the classic (legacy) interface.
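A minimal sketch of the generator interface, on a smaller case than the task below. The name `oRng` and the sizes are hypothetical.

```python
import numpy as np

seedNum = 512  # Same value as the seed in the configuration cell

# The generator interface: An explicit `Generator` object holds the state
oRng = np.random.default_rng(seedNum)

# Draw 5 integers from {0, 1, ..., 9} without replacement
vI = oRng.choice(10, size = 5, replace = False)
print(vI)
```

Unlike `np.random.seed()`, which sets a global state, each `Generator` object carries its own state, so different parts of the code can use independent streams.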

In [None]:
#===========================Fill This===========================#
# 1. Generate 100 integer numbers from {0, 1, ..., 999} without replacement:
#    - Using `np.random.choice()`.
#    - Using the generator API (Done).
# !! Use the `%timeit` magic.

?????

#===============================================================#

### Broadcasting

Broadcasting is a powerful concept which allows vectorization in a broader range of scenarios.
This section shows broadcasting in several scenarios.

To grasp the concept, one may refer to [Lev Maximov - Broadcasting in NumPy](https://towardsdatascience.com/58856f926d73).

![](https://i.imgur.com/zxoQhX3.png)

For broadcasting to _kick in_, the shapes of the arrays must be compatible.  
To achieve this, 2 simple rules are applied:

1. If the two arrays differ in their number of dimensions, the shape of the one with fewer dimensions is padded with ones on its **leading** (left) side.
2. If the shapes of the two arrays do not match in some dimension, the array whose size is 1 in that dimension is stretched to match the other shape.

The shapes must match in all dimensions after these 2 rules are applied.  
If they don't, an error is raised.
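The 2 rules can be traced on a small case. The following is a sketch with hypothetical shapes, where the comments follow the padding and stretching steps.

```python
import numpy as np

mA = np.zeros((4, 3), dtype = int)
vR = np.array([1, 2, 3])         # Shape (3, )
vC = np.array([10, 20, 30, 40])  # Shape (4, )

# (4, 3) + (3, ): Rule 1 pads to (1, 3), rule 2 stretches to (4, 3)
mR = mA + vR
print(mR.shape)

# (4, 3) + (4, ): Rule 1 pads to (1, 4), the last dimension (4 vs. 3) can not be matched
isFailed = False
try:
    mA + vC
except ValueError as oErr:
    isFailed = True
    print(f'Broadcasting failed: {oErr}')

# (4, 3) + (4, 1): Rule 2 stretches the last dimension to (4, 3)
mC = mA + vC[:, None]
print(mC.shape)
```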

The following task deals with broadcasting a vector over rows / columns of a matrix.

In [None]:
#===========================Fill This===========================#
# 1. Generate a matrix of size (2, 3).
# 2. Generate a vector of size (3, ) and a vector of size (2, ).
# 3. Add the first vector to each row of the matrix.
# 4. Add the second vector to each column of the matrix.
# !! For simplicity, use integer valued elements.
# !! Using `vA[:, None]` generates a new axis, which is useful for broadcasting.

mA = ??? #<! The array
vR = ??? #<! To be added to rows
vC = ??? #<! To be added to columns

# Rows Broadcasting
mR = ???
print(f'Result of broadcasting rows: \n{mR}')

# Columns Broadcasting
mC = ???
print(f'Result of broadcasting columns: \n{mC}')

#===============================================================#

The following task deals with broadcasting matrix multiplication.

In [None]:
# This section:
# 1. Generates a tensor (A stack of matrices of the same size).
# 2. Generates a matrix which is compatible in size for matrix multiplication.
# 3. Multiplies each matrix of the tensor by the matrix.
tA = np.random.randn(3, 5, 4)
mB = np.random.randn(4, 3)

tC = np.empty(shape = (tA.shape[0], tA.shape[1], mB.shape[1]))

for ii in range(tA.shape[0]):
    tC[ii] = tA[ii] @ mB

In [None]:
#===========================Fill This===========================#
# 1. Replicate the above using broadcasting.
# !! No loops should be used.
# !! You may read on `np.matmul()`.

tD = ??? #<! Broadcasting

#===============================================================#

# Verify
print(f'Result is valid: {np.allclose(tC, tD)}')

* <font color='red'>(**?**)</font> Could we do the broadcasting if `tA = np.random.randn(5, 4, 3)`? Namely the matrices were on the 3rd axis?

<!-- First, broadcasting adds the (1) at the beginning of the shape. Second, `matmul` expects the matrices to be on the last 2 axes, so `tA[0]` should be a matrix. -->

### Loops

Python, currently without `JIT`, is slow in general and in particular when applying operations in loops.  
When dealing with loops over an `ndarray` one should stick to the following:

1. Use vectorized operations.  
   Vectorized operations trade memory efficiency for speed.
2. NumPy arrays are _row major_ by default.  
   Data is row contiguous.  
   Applying operations on rows will be faster than working on columns.
3. Use NumPy's built in iterators.  
   There are a few tools such as: `numpy.nditer`, `numpy.lib.stride_tricks.as_strided`, `numpy.lib.stride_tricks.sliding_window_view`.
4. Use _Numba_.  
   Numba adds JIT acceleration for code working on NumPy arrays.
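The memory layout and the view based tools from the list above can be inspected directly. A minimal sketch with hypothetical small arrays:

```python
import numpy as np

mA = np.zeros((3, 4))  # `float64`, 8 bytes per element

print(mA.flags['C_CONTIGUOUS'])  # True: Each row is contiguous in memory
print(mA.strides)                # (32, 8): 32 bytes to the next row, 8 bytes to the next column

# A sliding window view over a vector (No data is copied)
vX = np.arange(6)
mW = np.lib.stride_tricks.sliding_window_view(vX, window_shape = 3)

print(mW.shape)  # (4, 3): 4 windows of length 3
print(mW[1])     # [1 2 3]
```

The strides show why working along rows is cache friendly: consecutive elements of a row are adjacent in memory, while consecutive elements of a column are a full row apart.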


This section exercises the ideas above in a simple scenario:

1. A square matrix `mA` is given with shape `(1000, 1000)`.
2. For each 10x10 sub block (Non sliding) the mean will be evaluated.
3. The output should be a `100x100` array.

* <font color='brown'>(**#**)</font> We'll use the `%%timeit` magic to time the whole cell.

In [None]:
# Generating Data
tuMatShape   = (1000, 1000)
tuBlockShape = (10, 10)
tuOutShape   = (100, 100)
mA = np.random.rand(*tuMatShape)

# Pre allocated since `%%timeit` doesn't expose the variables created inside the timed cell
mORef      = np.zeros(shape = tuOutShape)
mOIterCol  = np.zeros(shape = tuOutShape)
mOMatBlock = np.zeros(shape = tuOutShape)
mONumba    = np.zeros(shape = tuOutShape)


In [None]:
%%timeit
# Reference Implementation
# Working along columns

for nn, jj in enumerate(range(0, tuMatShape[1], tuBlockShape[1])):
    for mm, ii in enumerate(range(0, tuMatShape[0], tuBlockShape[0])):
        mORef[mm, nn] = np.mean(mA[ii:(ii + tuBlockShape[0]), jj:(jj + tuBlockShape[1])])

In [None]:
%%timeit
#===========================Fill This===========================#
# 1. Apply the function using a plain loop.
# 2. Iterate on rows instead of columns.
# !! Use `mOIterCol` for the result.

????

#===============================================================#

* <font color='red'>(**?**)</font> Do the timings of iterating over rows vs. columns match your expectations? Why?

In [None]:
%%timeit
#===========================Fill This===========================#
# 1. Apply the function using the above implemented `MatBlockView()`.
# 2. Read and understand the function `MatBlockView()`.
# !! You may use `np.mean(... axis = ())`.
# !! Use `mOMatBlock` for the result.
# !! You may need to use `mOMatBlock[:]`. Why?

????

#===============================================================#

The following implementation uses Numba for acceleration:

1. Read the [5 Minutes Guide with Numba](https://numba.pydata.org/numba-doc/dev/user/5minguide.html).
2. Numba compiles functions, hence the code must be wrapped in a function.

* <font color='brown'>(**#**)</font> Pay attention that `jit` and `njit` are already imported.
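A minimal sketch of the Numba workflow on a different, simpler function. The function `SumOfSquares()` is hypothetical and not part of the notebook. The `ImportError` fallback is an assumption added so the sketch runs (Without acceleration) where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption: Fallback to plain Python when Numba is not available
    def njit(hF):
        return hF

@njit
def SumOfSquares(vX):
    # A plain loop: Slow in pure Python, compiled to machine code by Numba
    valSum = 0.0
    for ii in range(vX.shape[0]):
        valSum += vX[ii] * vX[ii]
    return valSum

vX = np.arange(5, dtype = np.float64)

valOut = SumOfSquares(vX)  # The first call triggers the compilation
print(valOut)              # 30.0
```

The first call includes the compilation time, which is why a warm up call before timing is a common practice.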

In [None]:
#===========================Fill This===========================#
# 1. Apply the function using loops inside a function.
# 2. The function is accelerated by Numba's jit.
# !! Make sure not to use any global variables.

@njit
def CalcBlockMean(mA: np.ndarray, tuBlockShape: Tuple[int, int], mO: np.ndarray):
    """
    Calculates the mean of each block sized `tuBlockShape` in `mA`.  
    The blocks are not overlapping.
    Input:
      - mA           : Numpy 2D array.
      - tuBlockShape : A tuple of the block shape.
      - mO           : Numpy 2D array to be updated in place.
    Remarks:
      - The following must hold (Per element) mA.shape / tuBlockShape = mO.shape.
    """
    
    ????

#===============================================================#

CalcBlockMean(mA, tuBlockShape, mONumba) #<! For the first run of the JIT compilation

In [None]:
%%timeit
# Using Numba for acceleration
CalcBlockMean(mA, tuBlockShape, mONumba)

In [None]:
# Verify results of each method
print(f'The iteration over columns is valid: {np.allclose(mORef, mOIterCol)}')
print(f'The matrix block view is valid: {np.allclose(mORef, mOMatBlock)}')
print(f'The Numba is valid: {np.allclose(mORef, mONumba)}')