In [1]:
import numpy as np
import matplotlib.pyplot as plt

<br>

# Functionals and functional derivatives
---

<br>

### Functionals

A functional is a function that takes a function as input and outputs a real number. Some examples:

&emsp; $\displaystyle E[g] = \int_{-\infty}^{\infty} g(x) e^{-x^2} dx$
&emsp; expected value under a Gaussian weight (unnormalized: dividing by $\sqrt{\pi}$ gives a true expectation)

&emsp; $\displaystyle H[p] = - \sum_x p(x) \log p(x)$
&emsp; entropy of a discrete probability distribution

&emsp; $\displaystyle D_{KL}(p||q) = - \int p(x) \log \frac{q(x)}{p(x)} dx$
&emsp; KL divergence between two probability density functions

Most functionals will take the form of an integral (or a sum for discrete functions).
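As a rough illustration, the first two functionals above can be evaluated numerically by sampling the input function on a grid. This is only a sketch: the helper names `gaussian_expectation` and `entropy` are made up for this example, the grid bounds are an arbitrary choice, and the Gaussian weight is normalized by $1/\sqrt{\pi}$ here (an assumption of the sketch) so that the constant function 1 maps to 1.

```python
import numpy as np

def gaussian_expectation(g, x):
    """Riemann-sum estimate of (1/sqrt(pi)) * integral of g(x) exp(-x^2) dx.

    `g` is a callable applied to the grid `x` (assumed uniformly spaced).
    """
    dx = x[1] - x[0]
    weights = np.exp(-x**2) / np.sqrt(np.pi)  # normalized Gaussian weight
    return np.sum(g(x) * weights) * dx

def entropy(p):
    """Entropy -sum p log p of a discrete distribution (zero entries are skipped)."""
    p = np.asarray(p, dtype=float)
    nonzero = p > 0
    return -np.sum(p[nonzero] * np.log(p[nonzero]))

x = np.linspace(-8.0, 8.0, 4001)

# Each call maps a whole function (or distribution) to a single number.
mean_of_one = gaussian_expectation(np.ones_like, x)   # close to 1
second_moment = gaussian_expectation(np.square, x)    # close to 1/2
uniform_entropy = entropy([0.25, 0.25, 0.25, 0.25])   # log(4)

print(mean_of_one, second_moment, uniform_entropy)
```

The point of the sketch is the signature: the input is a function (a callable or an array of probabilities), the output is a scalar.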

<br>

### Functional derivatives

As with derivatives of ordinary functions, the definition of the functional derivative is tightly linked to a first-order Taylor expansion:

&emsp; $\displaystyle F[y(x) + \epsilon \eta(x)] = F[y(x)] + \epsilon \int \frac{\delta F}{\delta y}(x) \eta(x) dx + O(\epsilon^2)$

This expansion defines **how much the value of the functional $F$ changes when moving in the direction** $\eta$ in the space of functions. It resembles the first-order Taylor expansion of a multivariate function, shown below for a step in the direction $u$:

&emsp; $\displaystyle f(x + \epsilon u) = f(x) + \epsilon \; u^T \nabla f + O(\epsilon^2)$
&emsp; or equivalently
&emsp; $\displaystyle f(x + \epsilon u) = f(x) + \epsilon \; \langle u, \nabla f \rangle + O(\epsilon^2)$

* the **functional derivative** $\displaystyle \frac{\delta F}{\delta y}$ is the equivalent of the gradient (and is therefore **a function**)
* the dot product becomes $\displaystyle \Big \langle \frac{\delta F}{\delta y}, \eta \Big \rangle = \int \frac{\delta F}{\delta y}(x) \eta(x) dx$

Note that if the functional is defined in terms of a sum rather than an integral, the same reasoning applies, provided we use a valid dot product that matches the definition of the functional (a sum over $x$ instead of an integral).
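The expansion can be checked numerically on the discrete entropy $H[p] = -\sum_x p(x) \log p(x)$, where the dot product is a sum. The sketch below assumes the derivative $-(\log p(x) + 1)$, obtained by differentiating each term of the sum with respect to $p(x)$; the distribution `p` and direction `eta` are arbitrary values chosen for illustration. If the expansion holds, the gap between the exact change and the first-order term should shrink like $\epsilon^2$.

```python
import numpy as np

def entropy(p):
    """H[p] = -sum_x p(x) log p(x) for a strictly positive distribution."""
    return -np.sum(p * np.log(p))

def entropy_derivative(p):
    """Derivative of H with respect to each p(x): -(log p(x) + 1)."""
    return -(np.log(p) + 1.0)

p = np.array([0.1, 0.2, 0.3, 0.4])
# Arbitrary direction; it sums to zero, so p + eps * eta stays normalized.
eta = np.array([0.5, -1.0, 0.25, 0.25])

# First-order coefficient: the dot product <dH/dp, eta>, here a plain sum.
first_order = np.dot(entropy_derivative(p), eta)

errors = []
for eps in (1e-2, 1e-3, 1e-4):
    exact_change = entropy(p + eps * eta) - entropy(p)
    errors.append(abs(exact_change - eps * first_order))

# Dividing eps by 10 should divide the remainder by roughly 100.
print(errors)
```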

<br>

### Functional differentials

Recall that the differential of a function is itself a function of $x$ and $dx$. It defines the tangent plane, the linear map that best approximates the change in the function in the neighborhood of the point $x$:

&emsp; $df = df(x,dx) = dx^T \nabla f = \langle \nabla f, dx \rangle$
&emsp; or for functions of a single variable
&emsp; $\displaystyle df = df(x,dx) = \frac{df}{dx} dx = \Big \langle \frac{df}{dx}, dx \Big \rangle$

Similarly, we define the functional differential as:

&emsp; $\displaystyle \delta F = \delta F[y, \eta] = \Big \langle \frac{\delta F}{\delta y}, \eta \Big \rangle$
&emsp; where the dot product might be
&emsp; $\displaystyle \Big \langle \frac{\delta F}{\delta y}, \eta \Big \rangle = \int \frac{\delta F}{\delta y}(x) \eta(x) dx$

Once again, the kind of dot product depends on the definition of the functional.

<br>

### Example of functional derivative

Say we compute the expected value of a function $g(x)$ with respect to the distribution $p(x)$:

&emsp; $\displaystyle E[g] = \int g(x) p(x) dx$
&emsp; $\implies$
&emsp; $\displaystyle \delta E = E[g+\eta] - E[g] = \int p(x) \eta(x) dx$
&emsp; $\implies$
&emsp; $\displaystyle \frac{\delta E}{\delta g} = p$
&emsp; (the function)

If our functional $F[y]$ is the value of the integral of $y$ on the interval $[a,b]$, then:

&emsp; $\displaystyle F[y] = \int_a^b y(x) dx$
&emsp; $\implies$
&emsp; $\displaystyle \delta F = F[y+\eta] - F[y] = \int_a^b \eta(x) dx$
&emsp; $\implies$
&emsp; $\displaystyle \frac{\delta F}{\delta y} = 1$
&emsp; (the constant function equal to 1 on $[a,b]$)
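Both results can be verified with finite differences once the functions are sampled on a grid. In the sketch below, the functional becomes a function of the sample values, and the functional derivative at a grid point is estimated by perturbing that single sample and dividing by $\epsilon \, dx$ (the $dx$ accounts for the integral in the dot product). The standard normal density for $p$, the test function $\sin$, and the helper name `functional_derivative` are assumptions made for this example.

```python
import numpy as np

x = np.linspace(-5.0, 5.0, 201)
dx = x[1] - x[0]
p = np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)  # assumed density: standard normal
g = np.sin(x)                                    # arbitrary test function

def expectation(g_values):
    """E[g] = integral of g(x) p(x) dx, as a Riemann sum."""
    return np.sum(g_values * p) * dx

def integral(y_values):
    """F[y] = integral of y(x) dx over the grid interval, as a Riemann sum."""
    return np.sum(y_values) * dx

def functional_derivative(functional, y_values, eps=1e-4):
    """Estimate the functional derivative at each grid point.

    One sample is bumped by eps; the change is divided by eps * dx.
    """
    base = functional(y_values)
    derivative = np.empty_like(y_values)
    for i in range(len(y_values)):
        bumped = y_values.copy()
        bumped[i] += eps
        derivative[i] = (functional(bumped) - base) / (eps * dx)
    return derivative

dE_dg = functional_derivative(expectation, g)  # should match p
dF_dy = functional_derivative(integral, g)     # should be 1 everywhere

print(np.max(np.abs(dE_dg - p)), np.max(np.abs(dF_dy - 1.0)))
```

Because both functionals are linear in their argument, the finite difference has no truncation error and the match is limited only by floating-point rounding.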

Things are often more complex. For instance, if we define $F[q] = D_{KL}(p||q)$, the functional derivative takes more work to obtain:

&emsp; $\displaystyle F[q] = -\int p(x) \log \frac{q(x)}{p(x)} dx$
&emsp; $\implies$
&emsp; $\displaystyle \delta F = F[q+\eta] - F[q] = -\int p(x) \log \Big(1 + \frac{\eta(x)}{q(x)} \Big) dx$

To get $\eta$ out of the log, we use the Taylor series of the natural logarithm:

&emsp; $\log (1+x) = x + O(x^2)$
&emsp; $\implies$
&emsp; $\displaystyle \delta F \simeq -\int p(x) \frac{\eta(x)}{q(x)} dx$
&emsp; $\implies$
&emsp; $\displaystyle \frac{\delta F}{\delta q} = -\frac{p}{q}$
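The first-order term $-\int p(x) \frac{\eta(x)}{q(x)} dx$ can be compared with the exact change in the KL divergence for a small perturbation. The sketch below uses Gaussian densities for $p$ and $q$ and a difference of two Gaussians for $\eta$; all of these, along with the helper names `normal_pdf` and `kl`, are arbitrary choices for illustration.

```python
import numpy as np

x = np.linspace(-6.0, 6.0, 1201)
dx = x[1] - x[0]

def normal_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and standard deviation."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

p = normal_pdf(x, 0.0, 1.0)
q = normal_pdf(x, 0.5, 1.5)
# Perturbation direction: a difference of two densities, so it integrates to zero.
eta = normal_pdf(x, -1.0, 1.0) - normal_pdf(x, 1.0, 1.0)

def kl(p_values, q_values):
    """F[q] = -integral of p log(q / p) dx, as a Riemann sum."""
    return -np.sum(p_values * np.log(q_values / p_values)) * dx

eps = 1e-4
exact_change = kl(p, q + eps * eta) - kl(p, q)

# First-order prediction: eps * <-p/q, eta>, with the integral dot product.
first_order = eps * np.sum(-(p / q) * eta) * dx

print(exact_change, first_order)
```

The two printed values should agree up to a remainder of order $\epsilon^2$, and both carry the minus sign coming from the definition of $F[q]$.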