[![Fixel Algorithms](https://i.imgur.com/AqKHVZ0.png)](https://fixelalgorithms.gitlab.io/)

# AI Program

## Convex Smooth Optimization - Local Quadratic Model 

> Notebook by:
> - Royi Avital RoyiAvital@fixelalgorithms.com

## Revision History

| Version | Date       | User        | Content / Changes                                                  |
|---------|------------|-------------|--------------------------------------------------------------------|
| 1.0.000 | 09/02/2024 | Royi Avital | First version                                                      |

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/FixelAlgorithmsTeam/FixelCourses/blob/master/AIProgram/2024_02/0010LocalQuadraticModel.ipynb)

In [None]:
# Import Packages

# General Tools
import numpy as np
import scipy as sp
import pandas as pd

# Machine Learning

# Miscellaneous
import os
import math
from platform import python_version
import random

# Typing
from typing import Callable, List, Tuple, Union

# Visualization
from matplotlib.colors import LogNorm, Normalize, PowerNorm
import matplotlib.pyplot as plt
import plotly.express as px
import plotly.graph_objects as go
import seaborn as sns

# Jupyter
from IPython import get_ipython
from IPython.display import Markdown, display
from ipywidgets import Dropdown, FloatSlider, interact, IntSlider, Layout

## Notations

* <font color='red'>(**?**)</font> Question to answer interactively.
* <font color='blue'>(**!**)</font> Simple task to add code for the notebook.
* <font color='green'>(**@**)</font> Optional / Extra self practice.
* <font color='brown'>(**#**)</font> Note / Useful resource / Food for thought.

Code Notations:

```python
someVar    = 2; #<! Notation for a variable
vVector    = np.random.rand(4) #<! Notation for 1D array
mMatrix    = np.random.rand(4, 3) #<! Notation for 2D array
tTensor    = np.random.rand(4, 3, 2, 3) #<! Notation for nD array (Tensor)
tuTuple    = (1, 2, 3) #<! Notation for a tuple
lList      = [1, 2, 3] #<! Notation for a list
dDict      = {1: 3, 2: 2, 3: 1} #<! Notation for a dictionary
oObj       = MyClass() #<! Notation for an object
dfData     = pd.DataFrame() #<! Notation for a data frame
dsData     = pd.Series() #<! Notation for a series
hObj       = plt.Axes() #<! Notation for an object / handler / function handler
```

### Code Exercise

 - Single line fill

 ```python
 valToFill = ???
 ```

 - Multi Line to Fill (At least one)

 ```python
 # You need to start writing
 ????
 ```

 - Section to Fill

```python
#===========================Fill This===========================#
# 1. Explanation about what to do.
# !! Remarks to follow / take under consideration.
mX = ???

???
#===============================================================#
```

## MathJax Macros

Adding _quality of life_ macros.

$$
\newcommand{\MyParen}[1]{\left( #1 \right)}
\newcommand{\MyBrack}[1]{\left\lbrack #1 \right\rbrack}
\newcommand{\MyBrace}[1]{\left\lbrace #1 \right\rbrace}
\newcommand{\MyMat}[1]{\begin{bmatrix} #1 \end{bmatrix}}
\newcommand{\MyNorm}[2]{{\left\| #1 \right\|}_{#2}}
\newcommand{\MyAbs}[1]{\left| #1 \right|}
\newcommand{\MyNormTwo}[1]{\MyNorm{#1}{2}}
\newcommand{\MyCeil}[1]{\lceil #1 \rceil}
\newcommand{\MyInProd}[2]{\langle #1, #2 \rangle}
\newcommand{\MyUndBrace}[2]{\underset{#2}{\underbrace{#1}}}
\newcommand{\RR}[1]{\mathbb{R}^{#1}}
\newcommand{\InR}[1]{\in \mathbb{R}^{#1}}
\newcommand{\InC}[1]{\in \mathbb{C}^{#1}}
\newcommand{\BS}[1]{\boldsymbol{#1}}
\newcommand{\MyClr}[2]{{\color{#1}{#2}}}
\newcommand{\MyQuad}[2]{ {#1}^{T} #2 #1 }
$$

In [None]:
# Configuration
%matplotlib inline

# warnings.filterwarnings("ignore")

seedNum = 512
np.random.seed(seedNum)
random.seed(seedNum)

# sns.set_theme() #<! Apply Seaborn theme
# sns.set_palette("tab10")

runInGoogleColab = 'google.colab' in str(get_ipython())

In [None]:
# Constants

FIG_SIZE_DEF    = (8, 8)
ELM_SIZE_DEF    = 50
CLASS_COLOR     = ('b', 'r')
EDGE_COLOR      = 'k'
MARKER_SIZE_DEF = 10
LINE_WIDTH_DEF  = 2


In [None]:
# Course Packages


In [None]:
# Auxiliary Functions

In [None]:
# Parameters

gradRadius = 4
numGridPts = (2 * gradRadius) + 1

μ = -0.2
σ = 1.5

* <font color='red'>(**?**)</font> For multivariate (2D) Gaussian, $\mu$ should be a vector and $\sigma$ should be a matrix. Yet above they are scalars. Explain.

## Local Quadratic Model

Local quadratic models are useful in many cases for modeling a set of samples.  
One motivation is that, near a local extremum, many smooth functions are well approximated by a quadratic model.

A _Quadratic Model_ in $\mathbb{R}^{n}$ is given by:

$$ f \left( \boldsymbol{x} \right) = \boldsymbol{x}^{T} \boldsymbol{A} \boldsymbol{x} + \boldsymbol{b}^{T} \boldsymbol{x} + c $$

Where the model parameters are the elements of the **matrix** $\boldsymbol{A}$, the **vector** $\boldsymbol{b}$ and the **scalar** $c$.
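
To make the definition concrete, the model can be evaluated at a single point. This is a minimal sketch: the function name `EvalQuadModel` and the numeric values of `mA`, `vB` and `valC` are arbitrary choices for illustration, not values used later in the notebook.

```python
import numpy as np

def EvalQuadModel( vX: np.ndarray, mA: np.ndarray, vB: np.ndarray, valC: float ) -> float:
    # Evaluates f(x) = x^T A x + b^T x + c for a single point (hypothetical helper)
    return vX @ mA @ vX + vB @ vX + valC

mA   = np.array([[-1.0, 0.2], [0.2, -0.5]]) #<! Symmetric matrix (illustrative values)
vB   = np.array([0.3, -0.1])                #<! Illustrative values
valC = 1.0                                  #<! Illustrative value
vX   = np.array([1.0, 2.0])                 #<! Evaluation point

valF = EvalQuadModel(vX, mA, vB, valC)
print(valF)
```

The quadratic term contributes $-1 + 0.8 - 2 = -2.2$, the linear term $0.1$ and the constant $1$.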

This notebook demonstrates how to estimate the parameters of a quadratic model given a set of samples.  
Specifically, a 2D model will be used.

The objective is to estimate the peak location and value of the sampled data.  
The steps are:

1. Estimate the parameters of the polynomial model.
2. Extract the maximum value and the corresponding argument of the 2nd order model.
3. Compare the model results to the actual function (the sampled Gaussian).


* <font color='red'>(**?**)</font> How many parameters does a 2D _Quadratic Model_ have? Think about the properties of $\BS{A}$.

## Generate Data

The data will be sampled from a 2D Gaussian function.
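
The sampling in the next cell builds the 2D function as an outer product of a 1D Gaussian with itself. The sketch below, under the assumption that this is the intended construction, checks that the outer product equals a 2D Gaussian evaluated directly on the grid with the same scalar mean and standard deviation on both axes. The names `valMu`, `valSigma` and `vG` are local stand-ins for the notebook parameters.

```python
import numpy as np

valMu    = -0.2 #<! Same value as μ in the parameters cell
valSigma = 1.5  #<! Same value as σ in the parameters cell
vG       = np.linspace(-4, 4, 9) #<! 1D grid

# Separable construction: outer product of a 1D Gaussian with itself
vY      = np.exp(-0.5 * np.square((vG - valMu) / valSigma))
mYOuter = np.outer(vY, vY)

# Direct evaluation on the 2D grid
mX1, mX2 = np.meshgrid(vG, vG)
mYDirect = np.exp(-0.5 * (np.square(mX1 - valMu) + np.square(mX2 - valMu)) / (valSigma ** 2))

print(np.allclose(mYOuter, mYDirect))
```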

In [None]:
# Generate / Load the Data
vX = np.linspace(-gradRadius, gradRadius, numGridPts) #<! Grid of the Gaussian Function
vY = np.exp(-0.5 * np.square((vX - μ) / σ))
mY = np.outer(vY, vY) #<! 2D Gaussian


In [None]:
# Display the Data

hF = go.Figure()
hF.add_trace(go.Surface(x = vX, y = vX, z = mY, name = 'Gaussian'))
hF.add_trace(go.Scatter3d(x = np.tile(vX, numGridPts), y = np.repeat(vX, numGridPts), z = mY.flat, mode = 'markers', name = 'Samples')) #<! `mY.flat` is row major: x changes fastest
hF.update_layout(title = 'Data Samples', scene = dict(xaxis_title = r'x_1', yaxis_title = r'x_2', zaxis_title = ''),
                 autosize = False, width = 600, height = 500, margin = dict(l = 45, r = 45, b = 45, t = 45)) #<! No LaTeX support in 3D plots


## Build the Linear Model

Given a set of samples $\left\{ \left( \boldsymbol{x}_{i}, {y}_{i} \right) \right\}_{i = 1}^{N}$ the model is given by:

$$ {y}_{i} = \boldsymbol{x}_{i}^{T} \boldsymbol{A} \boldsymbol{x}_{i} + \boldsymbol{b}^{T} \boldsymbol{x}_{i} + c $$

Which could be solved using _Linear Least Squares_ as:

$$ \arg \min_{\BS{A}, \BS{b}, c} \sum_{i = 1}^{N} \MyParen{ {y}_{i} - \boldsymbol{x}_{i}^{T} \boldsymbol{A} \boldsymbol{x}_{i} - \boldsymbol{b}^{T} \boldsymbol{x}_{i} - c }^{2} $$

Yet there is a simpler way to formulate the problem.  
A 2nd order model in 2D has the explicit form: ${y}_{i} = p {x}_{1}^{2} + q {x}_{2}^{2} + 2 r {x}_{1} {x}_{2} + s {x}_{1} + {t} {x}_{2} + u$.  
It is linear in the parameters, hence it can be written in the classic linear form:

$$ \BS{y} = \BS{H} \BS{w} $$

Where $\BS{w} = \MyBrack{ u, t, s, r, q, p }^{T}$ (Or any other permutation).

* <font color='brown'>(**#**)</font> If a model is **linear** in its parameters, it can always be written in the form $\BS{y} = \BS{H} \BS{w}$ for some $\BS{H}$ and $\BS{w}$.



### Question 001

1. Find the connection between the parameters $u, t, s, r, q, p$ and the elements of $\BS{A}, \BS{b}, c$.
2. Derive the matrix $\BS{H}$.  
   The matrix is a combination of the set of $\MyBrace{ \BS{x}_{i} }_{i = 1}^{N}$.
3. Implement in code a function to build $\BS{H}$.

### Solution 001

#### The Mapping

The matrix $\BS{A}$ must be symmetric, hence:

$$ \BS{A} = \begin{bmatrix} {a}_{11} & {a}_{12} \\ {a}_{12} & {a}_{22} \end{bmatrix} $$

For $\BS{x} = \MyBrack{ {x}_{1}, {x}_{2} }^{T}$ the result:

$$ \begin{bmatrix} {x}_{1} & {x}_{2} \end{bmatrix} \begin{bmatrix} {a}_{11} & {a}_{12} \\ {a}_{12} & {a}_{22} \end{bmatrix} \begin{bmatrix} {x}_{1} \\ {x}_{2} \end{bmatrix} = {a}_{11} {x}_{1}^{2} + 2 {a}_{12} {x}_{1} {x}_{2} + {a}_{22} {x}_{2}^{2} $$

For the vector $\BS{b} = \MyBrack{ {b}_{1}, {b}_{2} }^{T}$ the result:

$$ \BS{b}^{T} \BS{x} = {b}_{1} {x}_{1} + {b}_{2} {x}_{2} $$

Hence the mapping is:

$$ u = c, t = {b}_{2}, s = {b}_{1}, r = {a}_{12}, q = {a}_{22}, p = {a}_{11} $$
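
The mapping can be checked numerically by comparing the polynomial form with the matrix form at a random point. This is a minimal sketch: the parameter values are random and purely illustrative.

```python
import numpy as np

oRng = np.random.default_rng(0)

# Random polynomial parameters (illustrative values only)
p, q, r, s, t, u = oRng.standard_normal(6)

# The mapping: A = [[p, r], [r, q]], b = [s, t], c = u
mA   = np.array([[p, r], [r, q]])
vB   = np.array([s, t])
valC = u

vX     = oRng.standard_normal(2) #<! Random evaluation point
x1, x2 = vX

valPoly = p * x1 ** 2 + q * x2 ** 2 + 2 * r * x1 * x2 + s * x1 + t * x2 + u
valMat  = vX @ mA @ vX + vB @ vX + valC

print(np.isclose(valPoly, valMat))
```

The factor 2 on the cross term comes from the two equal off diagonal elements of the symmetric matrix.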

#### The Matrix $\BS{H}$

If $\BS{w} = \MyBrack{ u, t, s, r, q, p }^{T}$ then the columns of $\BS{H}$ should match the elements of $\BS{x}_{i}$:

$$ \BS{H} = \begin{bmatrix} 1 & \BS{x}_{1} \MyBrack{2} & \BS{x}_{1} \MyBrack{1} & 2 \BS{x}_{1} \MyBrack{1} \BS{x}_{1} \MyBrack{2} & \BS{x}_{1} \MyBrack{2}^{2} & \BS{x}_{1} \MyBrack{1}^{2} \\ 1 & \BS{x}_{2} \MyBrack{2} & \BS{x}_{2} \MyBrack{1} & 2 \BS{x}_{2} \MyBrack{1} \BS{x}_{2} \MyBrack{2} & \BS{x}_{2} \MyBrack{2}^{2} & \BS{x}_{2} \MyBrack{1}^{2} \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ 1 & \BS{x}_{N} \MyBrack{2} & \BS{x}_{N} \MyBrack{1} & 2 \BS{x}_{N} \MyBrack{1} \BS{x}_{N} \MyBrack{2} & \BS{x}_{N} \MyBrack{2}^{2} & \BS{x}_{N} \MyBrack{1}^{2} \end{bmatrix} $$

---


In [None]:
# The Linear Model Matrix H

#===========================Fill This===========================#
# 1. Implement the function to build H.
# !! You may find `np.column_stack()` useful.

def BuildMatH( vX1: np.ndarray, vX2: np.ndarray ) -> np.ndarray:
    """
    Build the linear model matrix for 2nd degree polynomial in 2D.
    Input:
      vX1         - The set of the 1st coordinates, Vector (numPts, ).
      vX2         - The set of the 2nd coordinates, Vector (numPts, ).
    Output:
      mH          - The model matrix.
    """

    numPts  = np.size(vX1) #<! Number of points
    mH      = np.column_stack((np.ones(numPts), vX2, vX1, 2 * vX1 * vX2, np.square(vX2), np.square(vX1)))

    return mH

#===============================================================#

## Solve the Linear Model

The linear model is given by $\BS{y} = \BS{H} \BS{w}$. To estimate $\BS{w}$ the model should be solved.  
In practice, equality is not always achievable (noise, model mismatch, etc.), hence the problem is solved in the _Least Squares_ sense:

$$ \arg \min_{\BS{w}} \frac{1}{2} \MyNormTwo{ \BS{H} \BS{w} - \BS{y} }^{2} $$

To solve such cases the function [`np.linalg.lstsq()`](https://numpy.org/doc/stable/reference/generated/numpy.linalg.lstsq.html) is used.
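
As a sanity check of the approach, the solver can be applied to noiseless samples of a known quadratic, where it should recover the parameters. This is a minimal sketch: the reference vector `vWRef` and the random sample locations are made up for illustration, and the matrix is built inline with the same column order as $\BS{w} = \MyBrack{ u, t, s, r, q, p }^{T}$.

```python
import numpy as np

oRng   = np.random.default_rng(0)
numPts = 50

vWRef = np.array([1.0, -0.5, 0.25, 0.1, -2.0, -1.5]) #<! [u, t, s, r, q, p] (illustrative values)

vX1 = oRng.uniform(-1, 1, numPts) #<! 1st coordinates
vX2 = oRng.uniform(-1, 1, numPts) #<! 2nd coordinates

# Columns match the order of the parameters in vWRef
mH = np.column_stack((np.ones(numPts), vX2, vX1, 2 * vX1 * vX2, np.square(vX2), np.square(vX1)))
vY = mH @ vWRef #<! Noiseless samples of the quadratic model

# `np.linalg.lstsq()` returns: solution, residuals, rank, singular values
vWEst, vRes, rankH, vS = np.linalg.lstsq(mH, vY, rcond = None)

print(np.allclose(vWEst, vWRef))
```

With noise or a model mismatch (as with the Gaussian samples in this notebook) the recovered vector is only the best fit in the least squares sense.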

In [None]:
# Estimate the Model Parameters
# Using a LS solver to estimate vW

# Since `mY.flat` is row major:
# 1. x1 changes for each point, the pattern repeats every numGridPts points.
# 2. x2 changes every numGridPts points, constant in between.
vX1 = np.tile(vX, numGridPts) #<! Replicate the vector
vX2 = np.repeat(vX, numGridPts) #<! Replicates the items
mH  = BuildMatH(vX1, vX2)

# vW, _, _, _ = np.linalg.lstsq(mH, mY.flat, rcond = None) #<! _ is "don't care"
vW, *_ = np.linalg.lstsq(mH, mY.flat, rcond = None) #<! See https://stackoverflow.com/questions/431866

In [None]:
# Build the Quadratic Model

#===========================Fill This===========================#
# 1. Calculate `mA`, `vB` and `valC` from `vW`.

mA      = np.array([[vW[5], vW[3]], [vW[3], vW[4]]])
vB      = np.array([vW[2], vW[1]])
valC    = vW[0]

#===============================================================#

In [None]:
# Estimate the Model Values
mX = np.column_stack((vX1, vX2))

# vYEst = mH @ vW #<! The linear model
vYEst = np.array([mX[ii, :] @ mA @ mX[ii, :].T + np.dot(mX[ii, :], vB) + valC for ii in range(numGridPts * numGridPts)])
# vYEst = np.diag(mX @ mA @ mX.T) + mX @ vB + valC

* <font color='red'>(**?**)</font> How can `vYEst` be evaluated using `mA`, `vB` and `valC` without any loops? Is it efficient?
* <font color='blue'>(**!**)</font> Implement the vectorized method. 
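
One possible vectorized evaluation, shown as a sketch on made up data rather than as the required solution, computes the quadratic term row by row without forming the full $N \times N$ product that `np.diag(mX @ mA @ mX.T)` builds. The values of `mX`, `mA`, `vB` and `valC` below are illustrative only.

```python
import numpy as np

oRng = np.random.default_rng(0)

mX   = oRng.standard_normal((81, 2))              #<! Points, one per row (illustrative)
mA   = np.array([[-0.10, 0.02], [0.02, -0.12]])   #<! Illustrative values
vB   = np.array([0.05, -0.03])                    #<! Illustrative values
valC = 0.7                                        #<! Illustrative value

# Reference: explicit loop over the points
vYLoop = np.array([vXi @ mA @ vXi + vXi @ vB + valC for vXi in mX])

# Row wise quadratic form: sum_j (X A)_ij * X_ij, O(N) memory
vYSum = np.sum((mX @ mA) * mX, axis = 1) + mX @ vB + valC

# Same computation using `np.einsum()`
vYEin = np.einsum('ij,jk,ik->i', mX, mA, mX) + mX @ vB + valC

# Full N x N product, only its diagonal is used
vYDiag = np.diag(mX @ mA @ mX.T) + mX @ vB + valC

print(np.allclose(vYLoop, vYSum), np.allclose(vYLoop, vYEin), np.allclose(vYLoop, vYDiag))
```

All variants give the same values. The `np.diag()` variant computes $N^{2}$ entries to keep only $N$ of them, which becomes wasteful as the number of points grows.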

In [None]:
# Estimate `arg max` and Maximum Value

vXMax = -0.5 * np.linalg.solve(mA, vB) #<! Stationary point: 2 * mA @ vX + vB = 0
valYMax = vXMax.T @ mA @ vXMax + np.dot(vXMax, vB) + valC

print(f'The quadratic model peak location is: {vXMax}')
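
For a symmetric $\boldsymbol{A}$ the gradient of the quadratic model is $\nabla f \left( \boldsymbol{x} \right) = 2 \boldsymbol{A} \boldsymbol{x} + \boldsymbol{b}$. Setting it to zero gives the stationary point:

$$ \boldsymbol{x}^{\star} = -\frac{1}{2} \boldsymbol{A}^{-1} \boldsymbol{b} $$

It is a maximum when $\boldsymbol{A}$ is negative definite. The sketch below verifies both statements on arbitrary illustrative values of `mA` and `vB`.

```python
import numpy as np

mA = np.array([[-2.0, 0.5], [0.5, -1.0]]) #<! Symmetric, illustrative values
vB = np.array([1.0, -3.0])                #<! Illustrative values

vXOpt = -0.5 * np.linalg.solve(mA, vB) #<! Solves 2 A x = -b
vGrad = 2 * mA @ vXOpt + vB            #<! Gradient at the stationary point

# All eigenvalues negative -> A is negative definite -> the point is a maximum
isMax = bool(np.all(np.linalg.eigvalsh(mA) < 0))

print(np.allclose(vGrad, 0), isMax)
```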

In [None]:
# Display Data

hF = go.Figure()
hF.add_trace(go.Surface(x = vX, y = vX, z = mY, opacity = 0.35, showscale = False, name = 'Gaussian'))
# hF.add_trace(go.Surface(x = vX, y = vX, z = np.reshape(vYEst, (numGridPts, numGridPts)), name = 'Gaussian'))
hF.add_trace(go.Scatter3d(x = vX1, y = vX2, z = mY.flat, mode = 'markers', name = 'Samples')) #<! Same ordering as used to build `mH`
hF.add_trace(go.Scatter3d(x = vX1, y = vX2, z = vYEst, mode = 'markers', name = 'Quadratic Model'))
hF.add_trace(go.Scatter3d(x = [vXMax[0]], y = [vXMax[1]], z = [valYMax], mode = 'markers', name = 'Quadratic Model Peak'))
hF.update_layout(title = 'Data Samples', scene = dict(xaxis_title = r'x_1', yaxis_title = r'x_2', zaxis_title = ''),
                 autosize = False, width = 900, height = 600, margin = dict(l = 10, r = 10, b = 10, t = 40),
                 coloraxis_showscale = False) #<! No LaTeX support in 3D plots

* <font color='red'>(**?**)</font> Did the model estimate the peak well? Explain.
* <font color='red'>(**?**)</font> How can it be improved? Think about the number of points and the choice.
* <font color='green'>(**@**)</font> Implement the idea and verify it improves the results.
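
One possible direction, not necessarily the intended answer, is to fit the quadratic model only to the samples near the largest sample, where the function is closest to quadratic. The sketch below is self contained: it regenerates the data with the notebook parameters (under the local names `valMu`, `valSigma`, `vG`) and uses a $3 \times 3$ window, which is an arbitrary choice. It assumes the largest sample is not on the border of the grid.

```python
import numpy as np

# Same sampling as in the notebook (local names are stand-ins)
gridRadius = 4
valMu      = -0.2
valSigma   = 1.5
vG = np.linspace(-gridRadius, gridRadius, (2 * gridRadius) + 1)
vY = np.exp(-0.5 * np.square((vG - valMu) / valSigma))
mY = np.outer(vY, vY)

# Location of the largest sample (assumed to be an interior point)
rowIdx, colIdx = np.unravel_index(np.argmax(mY), mY.shape)
vRows = np.arange(rowIdx - 1, rowIdx + 2)
vCols = np.arange(colIdx - 1, colIdx + 2)
mYLoc = mY[np.ix_(vRows, vCols)] #<! 3 x 3 window around the largest sample

# Row major flattening: x1 (columns) changes fastest, x2 (rows) slowest
vX1 = np.tile(vG[vCols], 3)
vX2 = np.repeat(vG[vRows], 3)
mH  = np.column_stack((np.ones(9), vX2, vX1, 2 * vX1 * vX2, np.square(vX2), np.square(vX1)))

vW, *_ = np.linalg.lstsq(mH, mYLoc.ravel(), rcond = None)

mA = np.array([[vW[5], vW[3]], [vW[3], vW[4]]])
vB = np.array([vW[2], vW[1]])

vXMaxLoc = -0.5 * np.linalg.solve(mA, vB) #<! Stationary point of the local model
print(f'The local quadratic model peak location is: {vXMaxLoc}')
```

The result can be compared to the true peak location $\left( \mu, \mu \right)$ and to the estimate obtained from the full grid.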
