# A) Gradient descent on mean-squared error (MSE)

- **<font color='green'>[RUN & OBSERVE]</font>** $\rightarrow$ the cell should be run directly without modification
- **<font color='orange'>[RUN & PLAY]</font>** $\rightarrow$ the cell can be run directly, but some parameters should be changed interactively
- **<font color='red'>[FILL & RUN]</font>**    $\rightarrow$ the cell should be filled before being run
- **<font color='magenta'>[FILL & PLAY]</font>** $\rightarrow$ the cell should be filled, and then some parameters should be changed interactively.


______
## 1) Continuous function
### a) Reference function

We consider the following analytical function defined on the unit square:
$$ u_0 : \left \{ \begin{array}{rcl}
\Omega = [0,1]^2 & \rightarrow & \mathbb R \\
(x_1 , x_2) &\mapsto & (x_1-0.5)^2 + (x_2-0.5)^2
\end{array} \right.$$

|**<font color='green'>[RUN & OBSERVE]</font>**|
|---|

In [None]:
from utils.myGeometries import square
Omega = square(maxh=0.2)          # generates the domain and its discretization

from ngsolve import x, y
x1 = x; x2 = y;
u0 = (x1-0.5)**2 + (x2-0.5)**2    # define the analytic function

from ngsolve.webgui import Draw
Draw(u0, Omega, settings = { "Objects" : { "Wireframe" : False }, "deformation" :  1})

_____
### b) Discretization
The domain is discretized into conforming triangular elements (no overlap, no hanging nodes on edges).

|**<font color='green'>[RUN & OBSERVE]</font>**|
|---|

In [None]:
Draw(Omega)    # draw the discretized domain Omega

____
### c) Function spaces

#### i) $L^2(\Omega)$ (element DoFs, discontinuous)

$$ L^2(\Omega) = \left \{ u : \Omega \rightarrow \mathbb R, \int_\Omega |u|^2 < \infty \right \} $$

The discretized version of $L^2(\Omega)$ is the space of piecewise-polynomial functions defined element by element, with no continuity imposed across element boundaries.
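To make the element-wise picture concrete, here is a minimal 1D sketch (an analogue for illustration, not the NGSolve implementation). On each cell of a uniform grid of $[0,1]$, the best $L^2$ approximation by a constant is the mean of the function over that cell. The helper `cell_means` and the midpoint-rule sampling are assumptions introduced for this example.

```python
def cell_means(f, n_cells, n_quad=200):
    """Approximate the mean of f over each cell of a uniform grid on [0, 1]
    with a composite midpoint rule (n_quad sample points per cell)."""
    h = 1.0 / n_cells
    means = []
    for k in range(n_cells):
        a = k * h                                   # left end of cell k
        samples = [f(a + (j + 0.5) * h / n_quad) for j in range(n_quad)]
        means.append(sum(samples) / n_quad)
    return means

f = lambda x: (x - 0.5) ** 2        # 1D analogue of the reference function
dofs = cell_means(f, n_cells=4)     # one DoF per cell, no continuity between cells
print(dofs)
```

Each entry of `dofs` plays the role of one element DoF at order 0: neighbouring cells carry unrelated values, so jumps between cells are allowed.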

|**<font color='orange'>[RUN & PLAY]</font>**|
|---|

In [None]:
from ngsolve import L2, GridFunction

order = 0  # can be changed
fesL2 = L2(Omega, order = order)                  # define the discretized function space
uL2 = GridFunction(fesL2)                         # define a discretized function
uL2.vec.data[-1] = 1                              # set a DoF to 1
Draw(uL2, settings = {"deformation" :  0.5})      # visualize the discretized function

#### ii) $H^1(\Omega)$ (nodal DoF, continuous)

$$ H^1(\Omega) = \left \{ u \in L^2(\Omega), \nabla u \in L^2(\Omega)^2 \right \} $$

The discretized version of $H^1(\Omega)$ is the space of continuous piecewise-polynomial functions; at order 1, its DoFs are the values at the mesh nodes.
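As a 1D illustration of nodal DoFs (a sketch under simplifying assumptions, not the NGSolve implementation), a continuous piecewise-linear function on a uniform grid is fully determined by its values at the nodes. The helper `p1_eval` is hypothetical and only serves this example.

```python
def p1_eval(nodal_values, x):
    """Evaluate the continuous piecewise-linear function defined by its
    nodal values on a uniform grid of [0, 1]."""
    n_cells = len(nodal_values) - 1
    h = 1.0 / n_cells
    k = min(int(x / h), n_cells - 1)   # index of the cell containing x
    t = (x - k * h) / h                # local coordinate in [0, 1]
    return (1 - t) * nodal_values[k] + t * nodal_values[k + 1]

# Setting a single nodal DoF to 1 gives a "hat" basis function
hat = [0.0, 0.0, 1.0, 0.0, 0.0]        # 5 nodes, 4 cells
values = [p1_eval(hat, x) for x in (0.25, 0.375, 0.5, 0.625, 0.75)]
print(values)
```

Unlike the element-wise case, changing one nodal value affects every cell sharing that node, which is what enforces continuity.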

|**<font color='magenta'>[FILL & PLAY]</font>**|
|---|

In [None]:
from ngsolve import H1

order = 1  # can be changed
fesH1 = H1(Omega, order = order)                   # define the discretized function space
uH1 = GridFunction(fesH1)                          # define a discretized function
uH1.vec.data[-1] = 1                               # set a DoF to 1
Draw(uH1, order=order, settings = {"deformation" :  0.5})      # visualize the function

_____
### d) Least-squares formulation
We look for a function $u^*\in H$, where $H=L^2(\Omega)$ or $H^1(\Omega)$ sets the desired level of regularity, that minimizes half the integral of the squared error with respect to the reference $u_0$:
$$ u^* = \arg \min_{u\in H} J(u), \qquad J(u) = \frac{1}{2}\int_{x\in\Omega} \big( u(x) -u_0(x) \big)^2 \; \mathrm{d} x$$

In what follows, we drop the $x$ dependency for conciseness:
- $u(x)\rightarrow u$
- "$ x\in $" and "$\mathrm{d}x$" are omitted from the integrals (always taken over the spatial domain).
  
|**<font color='red'>[FILL & RUN]</font>**|
|---|

In [None]:
from ngsolve import Integrate

def J(u):
    """ integral of the half squared error """
    return Integrate( (u - u0)**2  / 2 , u.space.mesh)

#### i) Directional derivative
Given any function $v\in H$, the directional derivative evaluates the first-order variation of $J$ for a small step in the direction of $v$:
$$J'(u; v) =  \lim_{t\rightarrow 0} \frac{J(u+tv) - J(u)}{t} $$

| **Exercise** (pen & paper)  : find the expression of $J'(u;v)$ |
|---|

#### ii) Gradient descent

For any scalar $\alpha >0$, we always have

$$ J'(u; -\alpha (u-u_0) ) = -\alpha \int_\Omega (u-u_0)^2 \leq 0 $$

with strict inequality unless $u = u_0$. So $-(u-u_0)$ is a descent direction: for $\alpha$ small enough, a step along it decreases $J$.
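The identity above can be checked numerically with a finite difference. The sketch below is a 1D analogue on $[0,1]$ with a midpoint quadrature; the trial function `u_1d`, the names `u0_1d`, `J_1d`, `integrate` and the values of `alpha` and `t` are all assumptions chosen for illustration.

```python
import math

N = 2000                                      # midpoint-rule quadrature points on [0, 1]
xs = [(i + 0.5) / N for i in range(N)]

def integrate(g):
    """Midpoint-rule approximation of the integral of g over [0, 1]."""
    return sum(g(x) for x in xs) / N

u0_1d = lambda x: (x - 0.5) ** 2              # 1D analogue of the reference function
u_1d = lambda x: math.sin(3 * x)              # arbitrary trial function (assumption)

def J_1d(w):
    """Half the integrated squared error between w and u0_1d."""
    return integrate(lambda x: 0.5 * (w(x) - u0_1d(x)) ** 2)

alpha, t = 0.3, 1e-6
v = lambda x: -alpha * (u_1d(x) - u0_1d(x))   # direction -alpha (u - u0)

# Difference quotient (J(u + t v) - J(u)) / t versus the closed-form expression
finite_diff = (J_1d(lambda x: u_1d(x) + t * v(x)) - J_1d(u_1d)) / t
closed_form = -alpha * integrate(lambda x: (u_1d(x) - u0_1d(x)) ** 2)
print(finite_diff, closed_form)
```

Both numbers should be negative and agree up to a term of order `t`, which is the expected behaviour of a forward difference on a quadratic functional.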

|**<font color='red'>[FILL & RUN]</font>**|
|---|

In [None]:
def descent_direction(u):
    return -( u - u0 )

Having a descent direction means we can decrease $J$ by adding $\alpha (u_0-u)$ iteratively to the current trial function $u$ (or to its dofs, after discretization).
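To see what this iteration does, here is a minimal sketch on a plain list of sample values (an analogue of the DoF vector, ignoring any projection onto a finite element space). With a fixed step, the error $u - u_0$ is scaled by $(1-\alpha)$ at every update, so the discrete objective is scaled by $(1-\alpha)^2$. The names `u0_vals`, `u_vals` and `J_discrete` are assumptions for this example.

```python
u0_vals = [(i / 10 - 0.5) ** 2 for i in range(11)]   # samples of a 1D reference
u_vals = [0.0] * len(u0_vals)                         # initial guess u = 0
alpha = 0.1                                           # fixed step size

def J_discrete(u):
    """Discrete analogue of the half squared error (plain sum, no weights)."""
    return 0.5 * sum((a - b) ** 2 for a, b in zip(u, u0_vals))

history = [J_discrete(u_vals)]
for _ in range(5):
    # u <- u + alpha * (u0 - u)
    u_vals = [a + alpha * (b - a) for a, b in zip(u_vals, u0_vals)]
    history.append(J_discrete(u_vals))

ratios = [history[k + 1] / history[k] for k in range(5)]
print(ratios)   # each ratio is close to (1 - alpha)**2
```

In the notebook itself the update goes through `Set`, which maps the result back into the chosen finite element space, so the observed ratios may differ from this idealized picture.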

|**<font color='red'>[FILL & RUN]</font>**|
|---|

In [None]:
from ngsolve import GridFunction
def myGradientDescent(uInit,  # initial guess (GridFunction, overwritten in place)
                      J : callable = J,   # objective function 
                      d : callable = descent_direction,  # descent direction
                      alpha : float = 0.1, # initial step size
                      niter : int = 100,   # number of iterations
                      ) -> tuple[GridFunction, list[float]]:
    
    """ Simple gradient descent algorithm with basic step size adaptation """

    uOld = uInit                         # alias: accepted steps overwrite uInit itself
    uNew, Jlist = GridFunction(uInit.space), [J(uInit)]
    for _ in range(niter): # optimization loop
        uNew.Set( uOld + alpha * d(uOld) )   # assign uOld + alpha * descent_direction to uNew
        Jlist.append(J(uNew))                # store J(uNew) to inspect convergence
        
        # step size adaptation
        if Jlist[-1] < Jlist[-2]: # if J decreases, the step is accepted
            uOld.Set(uNew)        # copy uNew into uOld
            alpha *= 1.2          # increase the step size
        else:                     # otherwise, the step is rejected
            Jlist.pop()           # forget the last J value
            alpha /= 2            # decrease the step size

    return uOld, Jlist

import matplotlib.pyplot as plt
def plot(listToPlot, xlabel = "Iterations", ylabel = "J", title = "Convergence"):
    """ Shorthand to plot the convergence """
    plt.semilogy(listToPlot)
    plt.xlabel(xlabel)
    plt.ylabel(ylabel)
    plt.title(title)
    plt.grid()
    plt.show()
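
The accept/reject rule of `myGradientDescent` can be studied in isolation on a scalar problem. The following is a sketch, not the notebook's function: `adaptive_descent` and the quadratic objective are hypothetical, and accepted values are appended directly instead of being appended and popped.

```python
def adaptive_descent(f, df, x, alpha=0.1, niter=50):
    """Gradient descent with an accept/reject step-size rule:
    grow the step by 1.2 after a decrease of f, halve it otherwise."""
    values = [f(x)]
    for _ in range(niter):
        x_new = x - alpha * df(x)
        if f(x_new) < values[-1]:      # accepted step
            x = x_new
            values.append(f(x_new))
            alpha *= 1.2
        else:                          # rejected step: keep x, shrink alpha
            alpha /= 2
    return x, values, alpha

f = lambda x: 0.5 * (x - 2.0) ** 2     # hypothetical scalar objective
df = lambda x: x - 2.0                 # its derivative
x_min, values, alpha_end = adaptive_descent(f, df, x=0.0)
print(x_min, len(values) - 1, alpha_end)
```

The step keeps growing while the objective decreases and is cut back once it overshoots, so the final step size hovers around the largest value that still yields a decrease.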

We can now play around and see what happens with interpolations from different function spaces.

| **Exercise**  : Explore & analyze the influence of the function space on the result and the final error. |
|---|


|**<font color='orange'>[RUN & PLAY]</font>**|
|---|

In [None]:
# Discretization
maxh = 0.2                       # max element size (should be bigger than 0.05)
mesh = square(maxh = maxh)       # generate the mesh
#-------------------------------------------------------------------
# Interpolation
order = 0                        # max polynomial order
fes = L2(mesh, order = order)    # generate L2 finite element space
#fes = H1(mesh, order = order)    # generate H1 finite element space
#-------------------------------------------------------------------
# Solve & plot
uInit = GridFunction(fes)
uSol, JList = myGradientDescent(uInit, alpha = 0.1, niter = 100) # what would the optimal step size be?
print(f"Final J (half the integrated squared error) = {JList[-1]:.3e}")
Draw(uSol, settings = {"deformation" :  0.5})
plot(JList, title = f"Convergence for {fes.type[0:2].swapcase()} space " + 
     f"({fes.mesh.ne} elements, order = {fes.globalorder})")

To summarize:
- If $u_0$ is regular (smooth), the interpolation space should be regular too: a continuous, higher-order space reaches a given error with fewer elements.

____________
## 2) Discontinuous function

### a) Reference function

Now we consider the indicator function of a smaller square inside the unit square (the `"in"` region of the mesh built in the next cell):
$$ u_0 : \left \{ \begin{array}{rcl}
\Omega = [0,1]^2 & \rightarrow & \mathbb R \\
(x_1,x_2) &\mapsto & \mathbb{1}_{[0.3,0.7]^2}(x_1,x_2)
\end{array} \right.$$


|**<font color='green'>[RUN & OBSERVE]</font>**|
|---|

In [None]:
maxh = 0.1  # maximum element size
#-----------------------------------------------------------
# define the geometry & the mesh
from ngsolve import Mesh
from netgen.geom2d import CSG2d, Rectangle
def square_in_square(maxh):
    geo = CSG2d()
    box = Rectangle( pmin=(0,0), pmax=(1,1), mat="out")
    rect = Rectangle( pmin=(0.3,0.3), pmax=(0.7,0.7), mat="in")
    geo.Add(box-rect); geo.Add(rect)
    return Mesh(geo.GenerateMesh(maxh=maxh))
#-----------------------------------------------------------
Omega2 = square_in_square(maxh)
u0 = Omega2.MaterialCF({"in" : 1})
Draw(u0, Omega2,  settings = { "Objects" : { "Wireframe" : False }, "deformation" :  0.5})

### b) Interpolation

|**<font color='orange'>[RUN & PLAY]</font>**|
|---|

In [None]:
# Discretization
maxh = 0.2                       # max element size (should be bigger than 0.05)
mesh = square_in_square(maxh)    # generate the mesh
#-------------------------------------------------------------------
# Interpolation
order = 0                        # max polynomial order
fes = L2(mesh, order = order)    # generate L2 finite element space
#fes = H1(mesh, order = order)    # generate H1 finite element space
#-------------------------------------------------------------------
# Solve & plot
uInit = GridFunction(fes)
uSol, JList = myGradientDescent(uInit, alpha = 0.1, niter = 100) # what would the optimal step size be?
print(f"Final J (half the integrated squared error) = {JList[-1]:.3e}")
Draw(uSol, settings = {"deformation" :  0.5})
plot(JList, title = f"Convergence for {fes.type[0:2].swapcase()} space " + 
     f"({fes.mesh.ne} elements, order = {fes.globalorder})")

The result is not satisfactory with $H^1(\Omega)$: the interpolation has too much regularity, so the discontinuity cannot be captured. The result is much better with $L^2(\Omega)$.

### Take-home message
**The interpolation should be carefully chosen!**
 - **Not enough regularity** leads to ***slow convergence***: many elements are needed to reduce the approximation error sufficiently.
 - **Too much regularity** leads to ***wrong results***.
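
Both points can be reproduced in a 1D analogue. The sketch below compares, on a uniform grid of $[0,1]$, a discontinuous piecewise-constant fit (cell means) with a continuous piecewise-linear fit, for a smooth function and for a step function. It rests on several assumptions: the helper `l2_errors` is hypothetical, errors are estimated with a midpoint rule, and the nodal interpolant is used as a stand-in for the minimizer of $J$ in the continuous space.

```python
import math

def l2_errors(f, n_cells, n_quad=100):
    """Return (error_P0, error_P1): L2 errors on [0, 1] of the cell-mean
    piecewise-constant fit and of the continuous piecewise-linear nodal
    interpolant, both estimated with a midpoint rule."""
    h = 1.0 / n_cells
    ts = [(j + 0.5) / n_quad for j in range(n_quad)]   # local coordinates in a cell
    err0 = err1 = 0.0
    for k in range(n_cells):
        a, b = k / n_cells, (k + 1) / n_cells          # cell end points
        fs = [f(a + t * h) for t in ts]                # samples inside the cell
        mean = sum(fs) / n_quad                        # P0 value on the cell
        fa, fb = f(a), f(b)                            # P1 nodal values
        err0 += h * sum((v - mean) ** 2 for v in fs) / n_quad
        err1 += h * sum((v - ((1 - t) * fa + t * fb)) ** 2
                        for v, t in zip(fs, ts)) / n_quad
    return math.sqrt(err0), math.sqrt(err1)

smooth = lambda x: (x - 0.5) ** 2                      # regular reference
step = lambda x: 1.0 if 0.3 <= x <= 0.7 else 0.0       # discontinuous reference

for name, f in (("smooth", smooth), ("step", step)):
    e0, e1 = l2_errors(f, n_cells=10)
    print(f"{name}: P0 error = {e0:.2e}, P1 error = {e1:.2e}")
```

With 10 cells the grid is aligned with the jumps of the step function, which is the 1D counterpart of the mesh produced by `square_in_square`. In that setting the piecewise-constant fit can represent the step, while the continuous fit has to ramp across one cell on each side of the jump.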