# Testing softened (well-differentiable) alternatives to `np.min`

Can we use `npa.min` directly during optimisation, or do we need a softened version? The only reliable way to find out is to compare the autograd gradient of the optimisation figure of merit (FoM) against finite differences, once with `npa.min` and once with the softened version. The softened version is smooth, so its autograd gradient should agree with finite differences. The `npa.min` version may agree as well, in which case the softened version is not needed.
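A minimal sketch of the finite-difference side of that comparison, using plain NumPy and central differences. The helper name `fd_grad` and the step size `h` are assumptions made for illustration and are not part of the notebook or of autograd.

```python
import numpy as np

def fd_grad(func, x, h=1e-6):
    """Central-difference estimate of the gradient of a scalar function."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = h  # perturb one component at a time
        g[i] = (func(x + step) - func(x - step)) / (2 * h)
    return g

# Away from ties, the hard min has a well-defined one-hot gradient
x_test = np.array([1.0, -1.1, 3.0, -1.0])
fd = fd_grad(np.min, x_test)
print(fd)  # close to [0, 1, 0, 0]
```

In the notebook, the same helper could be applied to the FoM and the result compared with the autograd gradient via `np.allclose`. Disagreement would be expected mainly where two or more entries are tied for the minimum, since the hard min has a kink there.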

This problem is different from the issue with building the unit cell grid permittivities (`build_grating_gradable()` in `twobox.py`). There, each grid permittivity is set by an `if` condition on where the grid point lies (inside box1, inside box2, between the boxes, etc.). The `if` conditions themselves are not differentiable (they create discontinuous boundaries between the `if` regions), hence `softmax` is necessary for FoM differentiability.

Softmin and LogSumExp (LSE) can both be made numerically stable, and are both soft approximations to the min function. Which one to use depends on the speed and accuracy of each method, though they are similar enough that it probably doesn't matter.
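For reference, the two approximations can be written out as follows. The symbols $w_i$ (softmin weights) and $n$ (number of entries in $x$) are notation introduced here for illustration.

```latex
w_i = \frac{e^{-\sigma x_i}}{\sum_{j=1}^{n} e^{-\sigma x_j}}, \qquad
\operatorname{min}_{\text{softmin}}(x) = \sum_{i=1}^{n} x_i\, w_i, \qquad
\operatorname{min}_{\text{LSE}}(x) = -\frac{1}{\sigma} \log \sum_{i=1}^{n} e^{-\sigma x_i}

\min(x) - \frac{\log n}{\sigma} \;\le\; \operatorname{min}_{\text{LSE}}(x) \;\le\; \min(x) \;\le\; \operatorname{min}_{\text{softmin}}(x)
```

The softmin version is a convex combination of the entries, so it can only overestimate the minimum, while LSE can only underestimate it, by at most $\log(n)/\sigma$. Both tend to $\min(x)$ as $\sigma$ grows.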

## Functions

In [17]:
import numpy as np
import autograd.numpy as npa
from autograd import grad

def softmin(sigma,p):
    e_x = npa.exp(sigma*(npa.min(p) - p))
    return e_x/npa.sum(e_x)

def softmin_unstable(sigma,p):
    # Unstable: exp(-sigma*p) overflows for large negative values of p and/or large sigma
    e_x = npa.exp(-sigma*p)
    return e_x/npa.sum(e_x)


def f(x):
    return npa.min(x)

def gradable_min_softmin(x,sigma=1):
    """Approximate min using expected value of x with probability distribution given by softmin"""
    return npa.sum(x*softmin(sigma,x))

def gradable_min_LSE(x,sigma=1):
    """Approximate min using LogSumExp"""
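    # From https://mathoverflow.net/questions/35191/a-differentiable-approximation-to-the-minimum-function
    # Shift by min(x) before exponentiating so every exponent is <= 0 (no overflow)
    ex = npa.exp(sigma * (npa.min(x) - x))
    sumexp = npa.sum(ex)
    # Undo the shift: log(sum(exp(-sigma*x))) = log(sumexp) - sigma*min(x)
    logsumexp = npa.log(sumexp) - sigma*npa.min(x)
    return -1/sigma * logsumexp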


x = npa.array([1.,-1.1,3.,-1.])
print(f(x))
print(gradable_min_softmin(x,1))
print(gradable_min_LSE(x,1))

-1.1
-0.8966647010107589
-1.8148433677095008
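To illustrate why the shifted form is the one used above, here is a small NumPy-only sketch comparing the two softmin variants on large negative inputs. The function names `softmin_naive` and `softmin_shifted` and the test vector are assumptions chosen for this example. They mirror `softmin_unstable` and `softmin` but use `np` instead of `npa`.

```python
import numpy as np

def softmin_naive(sigma, p):
    # Direct form: exp(-sigma*p) overflows when -sigma*p is large
    e_x = np.exp(-sigma * p)
    return e_x / np.sum(e_x)

def softmin_shifted(sigma, p):
    # Shifting by min(p) makes every exponent <= 0, so exp() stays in (0, 1]
    e_x = np.exp(sigma * (np.min(p) - p))
    return e_x / np.sum(e_x)

p = np.array([-1000.0, -999.0])

# Silence the overflow / invalid-value warnings raised by the naive form
with np.errstate(over="ignore", invalid="ignore"):
    naive = softmin_naive(1.0, p)
shifted = softmin_shifted(1.0, p)

print(naive)    # both entries are nan (inf / inf)
print(shifted)  # finite weights that sum to 1
```

The two forms are mathematically identical because the factor `exp(sigma*min(p))` cancels between numerator and denominator. Only the floating-point behaviour differs.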


## Visualisation

In [18]:
%matplotlib qt

import matplotlib.pyplot as plt
from matplotlib import cm
from matplotlib.ticker import LinearLocator

sigma = 1.

X = np.arange(-5, 5, 0.25)
Y = np.arange(-5, 5, 0.25)
X, Y = np.meshgrid(X, Y)
XY = np.dstack((X,Y))

standard_mins = np.minimum(X,Y)

softmin_mins = np.zeros(X.shape)
LSE_mins = np.zeros(X.shape)
for i in range(XY.shape[0]):
    for j in range(XY.shape[1]):
        softmin_mins[i,j] = gradable_min_softmin(XY[i,j,:],sigma)
        LSE_mins[i,j] = gradable_min_LSE(XY[i,j,:],sigma)

fig, axs = plt.subplots(3,1, subplot_kw={"projection": "3d"}, figsize=(7,15))
surf = axs[0].plot_surface(X, Y, standard_mins, cmap=cm.coolwarm, linewidth=0, antialiased=False)
surf1 = axs[1].plot_surface(X, Y, softmin_mins, cmap=cm.coolwarm, linewidth=0, antialiased=False)
surf2 = axs[2].plot_surface(X, Y, LSE_mins, cmap=cm.coolwarm, linewidth=0, antialiased=False)

# Customize the z axis.
for ax in axs:
    ax.set_zlim(-5, 5)
    ax.zaxis.set_major_locator(LinearLocator(10))

# Add a color bar which maps values to colors.
fig.colorbar(surf, shrink=0.5, aspect=5)
fig.colorbar(surf1, shrink=0.5, aspect=5)
fig.colorbar(surf2, shrink=0.5, aspect=5)


<matplotlib.colorbar.Colorbar at 0x1561b0bc0>
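The surfaces are drawn for `sigma = 1` only. As a complement, the sketch below sweeps `sigma` at a single test point to show how quickly each approximation approaches the hard minimum. It uses NumPy only. The helper name `soft_min_pair` and the chosen `sigma` values are assumptions made for illustration.

```python
import numpy as np

def soft_min_pair(x, sigma):
    """Return (softmin-weighted mean, LogSumExp min) for the same x and sigma."""
    shift = np.min(x)
    e = np.exp(sigma * (shift - x))       # shifted exponentials, all in (0, 1]
    w = e / e.sum()                       # softmin weights
    softmin_val = np.sum(x * w)           # expected value of x under the weights
    lse_val = shift - np.log(e.sum()) / sigma
    return softmin_val, lse_val

x_pt = np.array([1.0, -1.1, 3.0, -1.0])
results = {}
for sigma in (1.0, 10.0, 100.0):
    results[sigma] = soft_min_pair(x_pt, sigma)
    print(sigma, *results[sigma])
```

For this test vector the softmin value stays above `-1.1` and the LSE value stays below it, with both closing in on `-1.1` as `sigma` increases. A larger `sigma` gives a tighter approximation but also a sharper, less smooth function, so the choice is a trade-off.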

## Gradients

In [19]:
f_grad = grad(f)
min_softmin_grad = grad(gradable_min_softmin)
min_LSE_grad = grad(gradable_min_LSE)

print(f_grad(x))
print(min_softmin_grad(x))
print(min_LSE_grad(x))

[0. 1. 0. 0.]
[-0.05372286  0.58875435 -0.02348758  0.4884561 ]
[0.0599141  0.48926874 0.00810849 0.44270866]
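The two smoothed gradients can also be cross-checked against closed forms. Differentiating by hand, the gradient of the LSE min is the softmin weight vector itself, and the gradient of the softmin-weighted mean is `w * (1 - sigma * (x - s))`, where `s` is the weighted mean. The sketch below evaluates both with NumPy only. The function names are assumptions introduced for this example, and the formulas are a hand derivation rather than something taken from autograd.

```python
import numpy as np

def softmin_weights(x, sigma=1.0):
    e = np.exp(sigma * (np.min(x) - x))   # shifted for numerical stability
    return e / e.sum()

def grad_min_LSE_closed(x, sigma=1.0):
    # d/dx_k of -(1/sigma) * log(sum(exp(-sigma*x))) is the k-th softmin weight
    return softmin_weights(x, sigma)

def grad_min_softmin_closed(x, sigma=1.0):
    # d/dx_k of sum(x*w) = w_k * (1 - sigma * (x_k - s)), with s = sum(x*w)
    w = softmin_weights(x, sigma)
    s = np.sum(x * w)
    return w * (1.0 - sigma * (x - s))

x_chk = np.array([1.0, -1.1, 3.0, -1.0])
g_softmin = grad_min_softmin_closed(x_chk)
g_lse = grad_min_LSE_closed(x_chk)
print(g_softmin)
print(g_lse)
```

For this `x`, both results match the autograd output printed above. Both gradients sum to 1, but only the LSE gradient is guaranteed non-negative. The softmin-weighted mean can have negative components, as the first and third entries show.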

