<table>
 <tr align=left><td><img align=left src="https://i.creativecommons.org/l/by/4.0/88x31.png">
 <td>Text provided under a Creative Commons Attribution license, CC-BY. All code is made available under the FSF-approved MIT license. (c) Kyle T. Mandli</td>
</table>

In [None]:
from __future__ import print_function

%matplotlib inline
import numpy
import matplotlib.pyplot as plt

import scipy.integrate
import scipy.sparse as sparse
import scipy.sparse.linalg as linalg

# Stochastic Spectral Methods

**Purpose:** Construct surrogates (or reduced-order models) to help reduce the cost of UQ approaches.

Examples:
 - Bayesian model calibration, sensitivity analysis, design and control.
 - Posterior density sampling

**Key Idea:** Exploit smoothness of high-dimensional parameter spaces.

## Spectral Representation of Random Processes

Some definitions:

- Define a sequence of random variables
$$
    \left \{ Q_k(\omega) \right \}^\infty_{k=1}
$$
on the sample space $\Omega$ and probability space $(\Omega, \mathcal F, P)$.

- Let $\mathbb P_k$ denote the space of polynomials with arguments $Q_i$ and degree $\leq k$.
- Let $\hat{\mathcal P}_k \subset \mathbb P_k$ be the set of polynomials $P \in \mathbb P_k$ that are orthogonal to $\mathbb P_{k-1}$.

### Polynomial Expansions

A second-order (finite-variance) random variable $u$ can be represented as
$$
    u(\omega) = u_0 \hat{\!P}_0 + \sum^\infty_{i_1 = 1} u_{i_1} \hat{\!P}_1(Q_{i_1}) + \sum^\infty_{i_1 = 1} \sum^\infty_{i_2 = 1} u_{i_1, i_2} \hat{\!P}_2(Q_{i_1}, Q_{i_2}) + \sum^\infty_{i_1 = 1} \sum^\infty_{i_2 = 1} \sum^\infty_{i_3 = 1} u_{i_1, i_2, i_3} \hat{\!P}_3(Q_{i_1}, Q_{i_2}, Q_{i_3}) + \cdots
$$
Here $ \hat{\!P}_k(\cdot)$ represent the interaction between variables and $u_{i_1}, u_{i_1, i_2}, \ldots \in \mathbb R$.

More succinctly this can be written as
$$
    u(Q) = \sum^\infty_{k=0} u_k \Psi_k(Q_1, Q_2, Q_3, \ldots)
$$

If we have a finite set $\left \{Q_i\right\}^p_{i=1}$ and keep interactions up to polynomial degree $n$, then we have a finite sum that represents $u(\omega)$ as
$$
    u^K(Q) = \sum^K_{k=0} u_k \Psi_k(Q_1, Q_2, \ldots, Q_p)
$$
where the number of terms is $K+1 = \frac{(n+p)!}{n!p!}$.
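
As a quick sanity check on the term count, here is a minimal sketch that evaluates the formula above. It assumes $n$ is the maximum total polynomial degree and $p$ the number of random variables; the helper name `num_terms` is ours and not part of the original notes.

```python
import math

def num_terms(n, p):
    """Number of basis polynomials of total degree <= n in p variables, i.e. K + 1."""
    return math.factorial(n + p) // (math.factorial(n) * math.factorial(p))

# p = 3 random variables with interactions up to degree n = 2
print(num_terms(2, 3))
```

The count grows quickly with both $n$ and $p$, which is one reason the choice of truncation matters for high-dimensional parameter spaces.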

For example if the random variable $u$ has only second-order interactions and less we would have
$$
u(\omega) = u_0 \hat{\!P}_0 + \sum^\infty_{i_1 = 1} u_{i_1} \hat{\!P}_1(Q_{i_1}) + \sum^\infty_{i_1 = 1} \sum^\infty_{i_2 = 1} u_{i_1, i_2} \hat{\!P}_2(Q_{i_1}, Q_{i_2})
$$

Now consider a random process $u(t, x, \omega)$ where now we have also added possible time and spatial dependence.  Our polynomial representation then can be written as
$$
    u^K(t, x, Q) = \sum^K_{k=0} u_k(t, x) \Psi_k(Q)
$$
where we note that we have separated the time and spatial dependence from the random variables.  The $\Psi_k$ are then usually viewed as an orthogonal polynomial basis for the random part of the process $u$.

**Example:**

Consider a single random variable $Q \in C_0$ and take $\Psi_k(Q)$ as some set of one-dimensional polynomials that are orthogonal to each other with respect to the density $\rho_Q(q)$ and normalized so that $\Psi_0 = 1$.  Then
$$
    \mathbb E[\psi_0(Q)] = 1
$$
and
$$\begin{aligned}
    \mathbb E[\psi_i(Q) \psi_j(Q)] &= \int \psi_i(q) \psi_j(q) \rho_Q(q) dq \\
    &= \langle \psi_i, \psi_j \rangle_\rho \\
    &= \delta_{ij} \gamma_i
\end{aligned}$$

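Note that this is analogous to projecting onto an orthogonal polynomial basis.  In particular, since $\psi_0 = 1$, orthogonality gives $\mathbb E[\psi_k(Q)] = \mathbb E[\psi_k(Q) \psi_0(Q)] = 0$ for $k \geq 1$.  Compute the mean and variance of the process $u(t, x, \omega)$ given the above rules.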

**Mean:**
$$\begin{aligned}
    \mathbb E [u^K(t, x, Q)] &= \mathbb E \left [ \sum^K_{k=0} u_k(t, x) \psi_k(Q) \right ] \\
    &= u_0(t, x) \mathbb E[\psi_0(Q)] + \sum^K_{k=1} u_k(t, x) \mathbb E[\psi_k(Q)] \\
    &= u_0(t, x)
\end{aligned}$$

**Variance:**
$$\begin{aligned}
    \text{var} [u^K(t, x, Q)] &= \mathbb E \left [\left ( u^K(t, x, Q) - \mathbb E[u^K(t, x, Q)] \right )^2 \right] \\
    &= \mathbb E \left [\left(\sum^K_{k=0} u_k(t, x) \psi_k(Q) - u_0(t,x) \right )^2 \right ] \\
    &= \mathbb E \left [\left(u_0(t,x) + \sum^K_{k=1} u_k(t, x) \psi_k(Q) - u_0(t,x) \right )^2 \right ] \\
    &= \mathbb E \left [\left(\sum^K_{k=1} u_k(t, x) \psi_k(Q) \right )^2 \right ] \\
    &= \sum^K_{k=1} u^2_k(t, x)\gamma_k
\end{aligned}$$
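
The two formulas can be checked numerically. The sketch below assumes a Hermite basis with $\gamma_k = k!$ (the normal-distribution case discussed in the next subsection) and uses made-up coefficients; it compares the mean and variance read off from the coefficients with values computed by Gauss-Hermite quadrature.

```python
import math
import numpy
from numpy.polynomial.hermite_e import hermeval, hermegauss

# Hypothetical expansion coefficients (illustration only):
# u = 4 H_0 + 2 H_1 + 1 H_2 = 3 + 2 Q + Q^2
coeffs = numpy.array([4.0, 2.0, 1.0])

# Mean and variance directly from the coefficients
gamma = numpy.array([math.factorial(k) for k in range(coeffs.size)], dtype=float)
mean_pce = coeffs[0]
var_pce = numpy.sum(coeffs[1:]**2 * gamma[1:])

# Independent check: integrate against the N(0, 1) density by quadrature
q, w = hermegauss(10)
w = w / numpy.sqrt(2.0 * numpy.pi)      # weights now sum to one
u = hermeval(q, coeffs)                 # sum_k coeffs[k] * He_k(q)
mean_quad = numpy.sum(w * u)
var_quad = numpy.sum(w * (u - mean_quad)**2)

print(mean_pce, mean_quad)
print(var_pce, var_quad)
```

No sampling is needed once the coefficients are known, which is the main appeal of the spectral representation.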

### Basis for Distributions

As mentioned we want to have a polynomial basis that is orthogonal w.r.t. a density.

**Normal Distribution** $Q \sim N(0, 1)$
$$
    \rho_Q(q) = \frac{1}{\sqrt{2 \pi}} e^{-q^2 / 2}
$$
defined on $\mathbb R$, use the Hermite polynomials:
$$\begin{aligned}
    &H_0(Q) = 1, & & H_1(Q) = Q, & & H_2(Q) = Q^2 - 1, \\
    &H_3(Q) = Q^3 - 3 Q, & & H_4(Q) = Q^4 - 6Q^2 + 3, & & H_5(Q) = Q^5 - 10 Q^3 + 15 Q, \\
\end{aligned}$$

Normalization constants:
$$
    \gamma_i = \int_{\mathbb R} \psi_i^2(q) \rho_Q(q)dq = i!
$$

**Uniform Distribution** $Q \sim \mathcal{U}(-1, 1)$
$$
    \rho_Q(q) = \frac{1}{2}
$$
defined on $[-1, 1]$.
For a uniform distribution we use the Legendre polynomials:
$$\begin{aligned}
    &P_0(Q) = 1, & & P_1(Q) = Q, & & P_2(Q) = \frac{3}{2} Q^2 - \frac{1}{2}, \\
    &P_3(Q) = \frac{5}{2} Q^3 - \frac{3}{2} Q, & & P_4(Q) = \frac{35}{8} Q^4 - \frac{15}{4} Q^2 + \frac{3}{8}, & & P_5(Q) = \frac{63}{8} Q^5 - \frac{70}{8} Q^3 + \frac{15}{8} Q, \\
\end{aligned}$$
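
A short numerical check of orthogonality for both families may be useful. This sketch builds the Gram matrix $\mathbb E[\psi_i \psi_j]$ with Gauss quadrature; it relies on the standard result that the Legendre normalization constants with density $1/2$ are $\gamma_i = 1/(2i+1)$, which is not stated in the notes above.

```python
import math
import numpy
from numpy.polynomial.hermite_e import hermegauss, hermevander
from numpy.polynomial.legendre import leggauss, legvander

K = 5

# Hermite: the rule uses weight exp(-q^2 / 2), so divide by sqrt(2 pi)
# to integrate against the N(0, 1) density
q, w = hermegauss(K + 1)
V = hermevander(q, K)                       # V[r, k] = He_k(q_r)
G_hermite = V.T.dot(w[:, None] * V) / numpy.sqrt(2.0 * numpy.pi)

# Legendre: the rule uses weight 1 on [-1, 1], so divide by 2
# to integrate against the U(-1, 1) density
q, w = leggauss(K + 1)
V = legvander(q, K)                         # V[r, k] = P_k(q_r)
G_legendre = V.T.dot(w[:, None] * V) / 2.0

gamma_hermite = numpy.array([math.factorial(k) for k in range(K + 1)], dtype=float)
gamma_legendre = 1.0 / (2.0 * numpy.arange(K + 1) + 1.0)

print(numpy.allclose(G_hermite, numpy.diag(gamma_hermite)))
print(numpy.allclose(G_legendre, numpy.diag(gamma_legendre)))
```

A rule with $K+1$ points integrates polynomials up to degree $2K+1$ exactly, so every entry of the Gram matrices is computed without quadrature error.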

**Example:** Take $u \sim N(\mu, \sigma^2)$ that can be represented as
$$
    u = \mu + \sigma Q
$$
where $Q \sim N(0, 1)$.  Compute the coefficients of the polynomial representation of $u$.

$$
    \mu = \mathbb E \left [ \sum^K_{k=0} u_k(t, x) \psi_k(Q) \right ] = u_0(t, x) \Rightarrow u_0 = \mu
$$

$$\begin{aligned}
    \sigma^2 &= \mathbb E \left [\left ( u^K(t, x, Q) - \mathbb E[u^K(t, x, Q)] \right )^2 \right] \\
    &= \sum^K_{k=1} u^2_k(t, x) \gamma_k \\
    &= u^2_1(t, x) \gamma_1 \Rightarrow u_1 = \sigma
\end{aligned}$$

**Example:** Take $u \sim \mathcal U(a, b)$ that has mean and variance
$$
    \mu = \frac{a+b}{2} \quad \quad \sigma^2 = \frac{(b - a)^2}{12}
$$
and can be expressed as
$$
    u = \mu + \sqrt{3} \sigma Q
$$
where $Q \sim \mathcal{U}(-1, 1)$.  Compute the coefficients of the polynomial representation of $u$.

Similarly to the case of a normal distribution we have
$$
    u_0 = \mu = \frac{a + b}{2}
$$
and
$$\begin{aligned}
    u_1 = \sqrt{3} \sigma
\end{aligned}$$
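
The same coefficients can be recovered by direct projection, $u_k = \mathbb E[u \, \psi_k] / \gamma_k$, which follows from orthogonality. The sketch below uses hypothetical endpoints $a = 2$, $b = 5$ and the Legendre constants $\gamma_k = 1/(2k+1)$ (a standard result, not stated above).

```python
import numpy
from numpy.polynomial.legendre import leggauss, legval

a, b = 2.0, 5.0                         # hypothetical interval endpoints
mu = 0.5 * (a + b)
sigma = (b - a) / numpy.sqrt(12.0)

q, w = leggauss(4)
w = w / 2.0                             # absorb the density rho_Q = 1/2

u = mu + numpy.sqrt(3.0) * sigma * q    # u evaluated at the quadrature points

coeffs = []
for k in range(3):
    unit = numpy.zeros(k + 1)
    unit[k] = 1.0
    psi_k = legval(q, unit)             # Legendre polynomial P_k at the points
    gamma_k = 1.0 / (2 * k + 1)
    coeffs.append(numpy.sum(w * u * psi_k) / gamma_k)

print(coeffs)
```

Only the first two coefficients are non-zero because $u$ is linear in $Q$; note that $\sqrt{3} \sigma = (b - a)/2$.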

### Multiple Random Variables

The single random variable case naturally extends to multiple random variables if we assume that the variables are independent of each other (which is not necessarily the case).  This implies that the expectation of their product is the product of their individual expectations and motivates the following.

A *p-dimensional Multi-Index* is a $p$-tuple where
$$
    \boldsymbol{k'} = (k_1, \ldots, k_p) \in \mathbb N^p_0
$$
of non-negative integers with magnitude
$$
    |\boldsymbol{k'}| = \sum^p_{i=1} k_i
$$
and are ordered such that
$$
    \boldsymbol{j'} \leq \boldsymbol{k'} \iff j_i \leq k_i \text{  for  } i=1, \ldots, p.
$$
This is a bit hard to deal with but table 10.1 provides some values for the first few multi-indices as

| $k$ | $\lvert\boldsymbol{k'}\rvert$ | Multi-Index | Polynomial Multiplication              |
|-----|---------------------|-------------|----------------------------------------|
|0    | 0                   | (0, 0, 0)   | $\psi_0(Q_1) \psi_0(Q_2) \psi_0(Q_3)$  |
|1    | 1                   | (1, 0, 0)   | $\psi_1(Q_1) \psi_0(Q_2) \psi_0(Q_3)$  |
|2    |                     | (0, 1, 0)   | $\psi_0(Q_1) \psi_1(Q_2) \psi_0(Q_3)$  |
|3    |                     | (0, 0, 1)   | $\psi_0(Q_1) \psi_0(Q_2) \psi_1(Q_3)$  |
|4    | 2                   | (2, 0, 0)   | $\psi_2(Q_1) \psi_0(Q_2) \psi_0(Q_3)$  |
|5    |                     | (1, 1, 0)   | $\psi_1(Q_1) \psi_1(Q_2) \psi_0(Q_3)$  |
|6    |                     | (1, 0, 1)   | $\psi_1(Q_1) \psi_0(Q_2) \psi_1(Q_3)$  |
|7    |                     | (0, 2, 0)   | $\psi_0(Q_1) \psi_2(Q_2) \psi_0(Q_3)$  |
|8    |                     | (0, 1, 1)   | $\psi_0(Q_1) \psi_1(Q_2) \psi_1(Q_3)$  |
|9    |                     | (0, 0, 2)   | $\psi_0(Q_1) \psi_0(Q_2) \psi_2(Q_3)$  |
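
The ordering in the table can be generated programmatically. The following is a minimal sketch; the function name `multi_indices` is ours, and the ordering rule (graded by magnitude, then reverse lexicographic) is inferred from the table rather than stated in the text.

```python
import itertools

def multi_indices(p, n):
    """All p-tuples of non-negative integers with magnitude <= n, graded by magnitude."""
    indices = []
    for total in range(n + 1):
        level = [k for k in itertools.product(range(total + 1), repeat=p)
                 if sum(k) == total]
        # Reverse lexicographic order within a level reproduces the table ordering
        indices.extend(sorted(level, reverse=True))
    return indices

for k, index in enumerate(multi_indices(3, 2)):
    print(k, sum(index), index)
```

This brute-force enumeration is fine for small $p$ and $n$; it is meant only to make the single index $k$ and the multi-index $\boldsymbol{k'}$ correspondence concrete.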

Now define a vector of random variables 
$$
    Q = [Q_1,\ldots,Q_p]
$$ 
that are mutually independent with the density
$$
    \rho_Q(q) = \prod^p_{i=1} \rho_{Q_i}(q_i).
$$

Let
$$
    \left \{ \psi_k(Q_i) \right \}^K_{k=0}
$$
be the univariate basis functions of degree $\leq K$ for variable $Q_i$.  We can then form the multivariate basis as
$$
    \Psi_{\boldsymbol{i'}}(Q) = \psi_{i_1}(Q_1) \cdots \psi_{i_p}(Q_p)
$$
for $0 \leq |\boldsymbol{i'}| \leq K$.  

The resulting basis functions therefore satisfy
$$\begin{aligned}
    \mathbb E[\Psi_{\boldsymbol{i'}}(Q) \Psi_{\boldsymbol{j'}}(Q)] &= \int \Psi_{\boldsymbol{i'}}(q) \Psi_{\boldsymbol{j'}}(q) \rho_Q(q) dq \\
    &= \langle \Psi_{\boldsymbol{i'}}, \Psi_{\boldsymbol{j'}} \rangle_\rho \\
    &= \delta_{\boldsymbol{i'} \boldsymbol{j'}} \gamma_{\boldsymbol{i'}}
\end{aligned}$$
where
$$
    \gamma_{\boldsymbol{i'}} = \mathbb{E}[\Psi^2_{\boldsymbol{i'}}] = \gamma_{i_1} \cdots \gamma_{i_p}.
$$
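
The product structure of $\gamma_{\boldsymbol{i'}}$ can be verified on a tensor-product quadrature grid. This sketch assumes two independent standard normal variables with Hermite bases, and the multi-index $(2, 1)$ is an arbitrary illustrative choice.

```python
import math
import numpy
from numpy.polynomial.hermite_e import hermegauss, hermeval

def he(k, q):
    """Evaluate the probabilists' Hermite polynomial He_k at q."""
    unit = numpy.zeros(k + 1)
    unit[k] = 1.0
    return hermeval(q, unit)

index = (2, 1)                               # hypothetical multi-index, p = 2

q, w = hermegauss(6)
w = w / numpy.sqrt(2.0 * numpy.pi)           # absorb the N(0, 1) density

# Tensor-product grid and weights for the joint density rho_Q = rho_Q1 * rho_Q2
Q1, Q2 = numpy.meshgrid(q, q, indexing="ij")
W = numpy.outer(w, w)

Psi = he(index[0], Q1) * he(index[1], Q2)    # multivariate basis function
gamma_quad = numpy.sum(W * Psi**2)
gamma_product = math.factorial(index[0]) * math.factorial(index[1])

print(gamma_quad, gamma_product)
```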

Turning back now to the representation of our process we have for $u(t, x, Q)$ the expansion
$$
    u^K(t, x, Q) = \sum^K_{|\boldsymbol{k'}| = 0} u_{\boldsymbol{k'}}(t, x) \Psi_{\boldsymbol{k'}}(Q),
$$
again the projection of $u(t, x, Q)$ onto the basis $\Psi_{\boldsymbol{k'}}$.  Moreover the orthogonality of the basis functions allows us to write down
$$
    u_k(t,x) = \frac{1}{\gamma_k} \mathbb E[u(t, x, Q) \Psi_k(Q) ]
$$

## Galerkin, Collocation, and Discrete Projection Frameworks

We now turn to ways to compute the $u_k(t,x)$ using constraints provided by the assumptions of each approach.

### Finite Elements

As an aside we will briefly describe the relatively similar notation and ideas from finite elements and how they will relate to the methods for computing the coefficients $u_k(t, x)$.

Consider the simple ODE
$$
    \frac{\text{d}^2 u}{\text{d}x^2} = f(x) \quad x \in \Omega \quad u|_{\partial \Omega} = \Gamma(x).
$$

The first thing we will do is write the equation above, a.k.a. the *strong-form* of the equation, in the *weak-form* instead.  We do this by multiplying by a test function $v$ that satisfies the boundary conditions $v|_{\partial \Omega} = 0$ (you can also require $u$ to do this) and integrating to find
$$
    \int_{\Omega} u''(x) v(x) dx = \int_{\Omega} v(x) f(x) dx.
$$

Integrating the LHS by parts leads to
$$\begin{aligned}
    u'(x) v(x) |_{\partial \Omega} - \int_{\Omega} u'(x) v'(x) dx &= \int_{\Omega} v(x) f(x) dx. \\
    - \int_{\Omega} u'(x) v'(x) dx &= \int_{\Omega} v(x) f(x) dx.
\end{aligned}$$

Since the weak form of the equation should hold $\forall v \in H^1_0(\Omega)$ we can restate the problem as 
$$
    \text{find } u \in H^1_0(\Omega) \text{ such that } \forall v \in H^1_0(\Omega) \quad -\int_{\Omega} u'(x) v'(x) dx = \int_{\Omega} v(x) f(x) dx.
$$
This is an infinite-dimensional problem and is where discretization enters.  Instead of the above problem we replace the Sobolev space with finite-dimensional spaces $U$ and $V$ such that the problem is now to
$$
    \text{find } u \in U \text{ such that } \forall v \in V \quad -\int_{\Omega} u'(x) v'(x) dx = \int_{\Omega} v(x) f(x) dx.
$$

For finite element methods we generally pick compactly supported (with regard to the domain) piecewise-defined polynomials such as the hat functions
$$
    \psi_i(x) = \left \{ \begin{aligned}
        &\frac{x - x_{i-1}}{x_i - x_{i-1}} & & \text{if } x \in [x_{i-1}, x_i] \\
        &\frac{x_{i+1} - x}{x_{i+1} - x_{i}} & & \text{if } x \in [x_{i}, x_{i+1}] \\
        &0 & & \text{otherwise}
    \end{aligned} \right .
$$
Note with this example that the basis functions satisfy $\psi_i(x_j) = \delta_{ij}$ at the nodes $x_j$ of the grid, while neighboring functions have overlapping support in the intervals between nodes.  Other choices, such as a Fourier basis, lead to other methods such as spectral methods.

The final element (heh) of turning the problem above into a discretization is to write the solution $u$ as
$$
    u \approx U = \sum^K_{k=0} u_k \psi_k(x)
$$
or in other words assume that the $\psi_k(x) \in \Psi$ spans the space $U$.

Plugging this back into the weak form we have
$$\begin{aligned}
    -\int_{\Omega} u'(x) v'(x) dx &= \int_{\Omega} v(x) f(x) dx \\
    -\int_{\Omega}  v'(x) \sum^K_{k=0} u_k  \psi_k'(x) dx &= \int_{\Omega} v(x) f(x) dx
\end{aligned}$$

We now have to identify the space $V$.  A Galerkin method (also sometimes called model finite elements) assumes that the same set of basis functions also spans the space $V$ (or in most cases that $U = V$).  Keeping the minus sign from the integration by parts, this turns the weak form into
$$\begin{aligned}
    -\int_{\Omega}  \psi_j'(x) \sum^K_{k=0} u_k  \psi_k'(x) dx &= \int_{\Omega} \psi_j(x) f(x) dx \quad \forall \psi_j \in \Psi \\
    \sum^K_{k=0} u_k \int_{\Omega}  \psi_j'(x) \psi_k'(x) dx &= -\int_{\Omega} \psi_j(x) f(x) dx \quad \forall \psi_j \in \Psi
\end{aligned}$$
for suitable assumptions.  This last expression can then be understood as a matrix problem with the entries of the matrix on the LHS given by
$$
    A_{jk} = \int_{\Omega} \psi_j'(x) \psi_k'(x) dx
$$
and the RHS entries $f_j = -\int_{\Omega} \psi_j(x) f(x) dx$ given by the (negated) projection of $f(x)$ onto the space $\Psi$, giving us the discretized problem
$$
    A U = f.
$$
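
To make the matrix problem concrete, here is a minimal sketch for $u'' = f$ on $[0, 1]$ with homogeneous boundary conditions and hat functions on a uniform grid. The grid size, the manufactured solution $u = \sin(\pi x)$, and the one-point approximation of the load integral are our choices for illustration; the sign of the right-hand side follows the integration by parts above.

```python
import numpy

N = 64                                       # number of elements (hypothetical choice)
x = numpy.linspace(0.0, 1.0, N + 1)
h = x[1] - x[0]

def f(x):
    # Chosen so that the exact solution of u'' = f is u = sin(pi x)
    return -numpy.pi**2 * numpy.sin(numpy.pi * x)

# Stiffness matrix A_jk = int psi_j' psi_k' dx for the interior hat functions
n = N - 1
A = (2.0 * numpy.eye(n) - numpy.eye(n, k=1) - numpy.eye(n, k=-1)) / h

# Load vector int psi_j f dx, approximated by h * f(x_j)
F = h * f(x[1:-1])

# Weak form: -int u' v' dx = int v f dx, so the system is A U = -F
U = numpy.linalg.solve(A, -F)

error = numpy.max(numpy.abs(U - numpy.sin(numpy.pi * x[1:-1])))
print(error)
```

Because each hat function overlaps only its neighbors, $A$ is tridiagonal; a dense solve is used here only to keep the sketch short.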

One way to think of the finite-dimensional problem as we have defined it is as a projection onto the function spaces $U$ and $V$.  The function $f(x)$ is being projected onto $V$ and $U$ defines the space of functions we can look at to solve the problem (the search space).  If a problem converges to the true solution then $U \rightarrow H$ where $H$ contains the true solution.

There are three different ways we will approach this problem:
 - *Stochastic Galerkin Approach:*  This proceeds as we did with the finite element discussion and is equivalent to requiring that the residual be orthogonal to a finite-dimensional subspace.  Unfortunately things are not so easy when we switch back to our spectral representation of our process.  Instead the projection now requires the computation of expectations that are not usually needed in a deterministic setting.  This is often then called an *intrusive* method.
 - *Collocation:*  An alternative to Galerkin approaches is collocation or nodal methods.  Here we consider the solution at a discrete set of points, called nodes or collocation points, together with a space of polynomials $\Psi$, and look for the $\psi \in \Psi$ that best approximates the solution at the nodes.  This approach is considered *non-intrusive* as existing solvers can be used.  Note that collocation approaches can be thought of as a specific type of Galerkin method.
 - *Discrete Projection:*  Direct approximation of the integral for the coefficients is used and can often be seen as another form of a Galerkin approach.  This approach often needs to be implemented independently but not always.
 
We now turn to studying these three methods applied to problems of varying complexity via examples.

## Examples

### Scalar Initial Value Problem

Consider the initial value problem
$$
    \frac{\text{d} u}{\text{d}t} = f(t, Q, u) \quad t > 0 \quad u(0, Q) = u_0.
$$
Assume $Q = [Q_1,\ldots,Q_p]$ are mutually independent random variables with range $\Gamma \subset \mathbb R^p$ and joint density $\rho_Q(q)$.  Take the QoI as
$$
    y(t) = \int_{\Gamma} u(t, q) \rho_Q(q) dq
$$
and solutions $u \in L^2(0, T)$.  We also assume that we will be dealing with spaces of functions that have finite norm w.r.t. the $i$th component of the density, i.e. $g \in L^2_{\rho_i}(\Gamma_i)$ if
$$
    ||g||_2 = \left(\int_{\Gamma_i} |g(q_i)|^2 \rho_{Q_i}(q_i) dq_i \right)^{1/2} < \infty.
$$
We can then consider the composed space of functions 
$$
    L^2_\rho(\Gamma) = L^2_{\rho_1}(\Gamma_1) \otimes \cdots \otimes L^2_{\rho_p}(\Gamma_p).
$$

Let $\{\Psi_k\}^K_{k=0}$ be a basis of a finite subspace $Z^K \subset L^2_\rho(\Gamma)$ such that we can project $u(t, Q)$ onto $Z^K$ to find
$$
    u^K(t, Q) = \sum^K_{k=0} u_k(t) \Psi_k(Q).
$$
The resulting coefficients can be computed by
$$
    u_k(t) = \frac{1}{\gamma_k} \int_\Gamma u(t, q) \Psi_k(q) \rho_Q(q) dq.
$$
We can now turn to finding ways to approximate the above integral for the coefficients.  The normalization constants $\gamma_k$ can be computed as we have seen before with
$$
    \gamma_k = \langle \Psi_k, \Psi_k \rangle_\rho
$$

#### Stochastic Galerkin

The basic formulation for this method is to require that the residual
$$
    r = \frac{\text{d} u^K}{\text{d}t} - f
$$
be orthogonal to each of the basis functions $\Psi_i$ considered.  This can be formulated as
$$\begin{aligned}
    0 &= \left \langle \frac{\text{d} u^K}{\text{d}t} - f, \Psi_i \right \rangle_\rho \\
    &= \int_\Gamma \left [ \frac{\text{d} u^K}{\text{d}t}(t, q) - f\left(t, q, u^K(t, q) \right ) \right ] \Psi_i(q) \rho_Q(q)  dq \\
    &= \int_\Gamma \left [ \sum^K_{k=0} \frac{\text{d} u_k}{\text{d}t} \Psi_k(q) - f\left(t, q, \sum^K_{k=0} u_k(t) \Psi_k(q) \right ) \right ] \Psi_i(q) \rho_Q(q) dq \quad \quad \forall i \leq K
\end{aligned}$$

Initial conditions are projected onto the basis we constructed on $L^2_\rho(\Gamma)$.

Discretization of the problem
$$
    \int_\Gamma \left [ \sum^K_{k=0} \frac{\text{d} u_k}{\text{d}t} \Psi_k(q) - f\left(t, q, \sum^K_{k=0} u_k(t) \Psi_k(q) \right ) \right ] \Psi_i(q) \rho_Q(q) dq = 0 \quad \quad \forall i \leq K
$$
then requires a quadrature rule with points $q^r \in \Gamma$ and weights $w^r$ so that we obtain
$$
    \sum^R_{r=1} \Psi_i(q^r) \rho_Q(q^r) w^r \left [\sum^K_{k=0} \frac{\text{d} u_k}{\text{d}t} \Psi_k(q^r) - f\left(t, q^r, \sum^K_{k=0} u_k(t) \Psi_k(q^r) \right ) \right ] = 0
$$
At this point many different quadrature methods can be used to find the $q^r$ and $w^r$.  For low-dimensional $Q$ straight-forward tensorial quadratures can be used.  For higher-dimensional spaces or more difficult problems (large gradients) other more sophisticated methods may be needed.
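
For a low-dimensional $Q$, a tensorial quadrature can be assembled from a one-dimensional rule. The sketch below assumes independent standard normal components and Gauss-Hermite points; the function name `tensor_gauss_hermite` is hypothetical. The returned weights already include the density, so they play the role of $\rho_Q(q^r) w^r$ in the sum above.

```python
import numpy
from numpy.polynomial.hermite_e import hermegauss

def tensor_gauss_hermite(R, p):
    """Tensor-product Gauss-Hermite rule with R points per dimension in p dimensions."""
    q1, w1 = hermegauss(R)
    w1 = w1 / numpy.sqrt(2.0 * numpy.pi)          # absorb the N(0, 1) density
    grids = numpy.meshgrid(*([q1] * p), indexing="ij")
    points = numpy.stack([g.ravel() for g in grids], axis=1)   # shape (R**p, p)
    weights = numpy.ones(R**p)
    for wg in numpy.meshgrid(*([w1] * p), indexing="ij"):
        weights = weights * wg.ravel()            # product of 1D weights
    return points, weights

points, weights = tensor_gauss_hermite(R=5, p=2)

# Example: E[Q_1^2 Q_2^2] = 1 for independent standard normals
integral = numpy.sum(weights * points[:, 0]**2 * points[:, 1]**2)
print(points.shape, integral)
```

The number of points is $R^p$, which is what makes plain tensor grids impractical in higher dimensions.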

#### Collocation

For collocation we choose a set of $M$ points in the parameter space, $q^m \in \Gamma$ for $m = 1, \ldots, M$, such that
$$
    u(t, q^m) = u^K(t, q^m)
$$
This provides a set of constraints
$$
    \begin{bmatrix}
        \Psi_0(q^1) & \cdots & \Psi_K(q^1) \\
        \vdots &  & \vdots \\
        \Psi_0(q^M) & \cdots & \Psi_K(q^M) \\
    \end{bmatrix} \begin{bmatrix}
        u_0(t) \\ \vdots \\ u_K(t)
    \end{bmatrix} = 
    \begin{bmatrix}
        u(t, q^1) \\ \vdots \\ u(t, q^M)
    \end{bmatrix}
$$

The size of the system needs to satisfy $M \geq K + 1$ so that it is not underdetermined.  It can in fact be advantageous to choose $M > K + 1$, although the system can then generally no longer be solved exactly and a least-squares approach is used instead, something that is often done.

Two difficulties with this approach:
1. Choosing the collocation points $q^m$ can be non-trivial.  Again, for smaller parameter spaces this can simply be a tensorial discretization of $Q$.  For higher-dimensional or more complex spaces adaptive methods may need to be used.
1. The choice of basis function can also lead to dense and ill-conditioned matrices.  To avoid this one can use a Lagrange basis where
$$
    L_k(q^m) = \delta_{km}
$$
leading to an identity matrix if the system is not over-determined.  Note that this only works when the nodes used to define the Lagrange basis coincide with the collocation points.
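
The collocation system can be sketched for a single random variable as follows. The function $g(q) = e^{sq}$ is a stand-in for $u(t, q)$ at a fixed time (our choice, since its Hermite coefficients $e^{s^2/2} s^k / k!$ are known in closed form), and Gauss-Hermite points are assumed as collocation points.

```python
import math
import numpy
from numpy.polynomial.hermite_e import hermegauss, hermevander

s = 0.5                                       # hypothetical parameter

def g(q):
    # Stand-in for u(t, q) at a fixed time t
    return numpy.exp(s * q)

K = 6

# Square system: M = K + 1 collocation points
q, _ = hermegauss(K + 1)
V = hermevander(q, K)                         # V[m, k] = He_k(q^m)
u_square = numpy.linalg.solve(V, g(q))
print(numpy.linalg.cond(V))                   # conditioning of the collocation matrix

# Over-determined system: M > K + 1 points, solved in the least-squares sense
q_ls, _ = hermegauss(K + 4)
u_ls = numpy.linalg.lstsq(hermevander(q_ls, K), g(q_ls), rcond=None)[0]

# Exact Hermite coefficients of exp(s q)
u_exact = numpy.exp(s**2 / 2) * numpy.array(
    [s**k / math.factorial(k) for k in range(K + 1)])

print(numpy.max(numpy.abs(u_square - u_exact)))
print(numpy.max(numpy.abs(u_ls - u_exact)))
```

The least-squares fit here is unweighted, so it is not identical to the projection; it trades exact interpolation for a fit across more points.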

#### Discrete Projection

The final discretization uses a direct approximation of the terms as defined (also called a pseudo-spectral method).  In this case we can approximate the problem as
$$
    u_k(t) = \frac{1}{\gamma_k} \sum^R_{r=1} u(t, q^r) \Psi_k(q^r) \rho_Q(q^r) w^r
$$
The computational effort of this formulation is more or less equivalent to that required for the collocation method.

### Scalar initial value problem 

(Example 10.9 in Smith)

$$\frac{du}{dt} = -a(\omega) u$$

$$u(t=0,\omega) = b$$

$$a \sim N(a_0,\sigma_a^2)$$

The damping rate $a$ is random with $a_0 = 1$, $\sigma_a = 0.25$. The initial condition is fixed and deterministic $b=b_0 = 10$. 

The analytical solution is $$u(t) = b e^{-at}$$

In [None]:
a0 = 1
sigma_a = 0.25
b0 = 10

# Exact solution
def u(b, a, t):
    return b * numpy.exp(-a * t)

Closed-form expressions for the mean and variance.
$$E[u(t)] = b e^{-a_0 t} e^{\sigma_a^2 t^2/2}$$

$$var[u(t)] = e^{-2a_0 t} b^2 (e^{2 \sigma_a^2 t^2} - e^{\sigma_a^2 t^2})$$
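
For reference, a short derivation of these expressions, using the moment generating function of a normal random variable, $E[e^{s a}] = e^{a_0 s + \sigma_a^2 s^2 / 2}$, evaluated at $s = -t$ and $s = -2t$:
$$\begin{aligned}
    E[u(t)] &= b \, E[e^{-a t}] = b e^{-a_0 t + \sigma_a^2 t^2 / 2} \\
    E[u(t)^2] &= b^2 E[e^{-2 a t}] = b^2 e^{-2 a_0 t + 2 \sigma_a^2 t^2} \\
    var[u(t)] &= E[u(t)^2] - E[u(t)]^2 = e^{-2 a_0 t} b^2 \left( e^{2 \sigma_a^2 t^2} - e^{\sigma_a^2 t^2} \right)
\end{aligned}$$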

In [None]:
nt = 100
t = numpy.linspace(0, 12, nt)
t2 = t.reshape(nt,1)
umean_exact = b0 * numpy.exp(-a0 * t) * numpy.exp(sigma_a**2 * t**2 / 2)
uvar_exact = b0**2 * numpy.exp(-2 * a0 * t) * (numpy.exp(2 * sigma_a**2 * t**2) - numpy.exp(sigma_a**2 * t**2))

fig, axes = plt.subplots()
axes.plot(t, umean_exact, 'k', label='exact')
axes.plot(t, umean_exact + 2 * numpy.sqrt(uvar_exact), 'k--')
axes.plot(t, umean_exact - 2 * numpy.sqrt(uvar_exact), 'k--')
axes.set_xlabel('Time (s)')
axes.set_ylabel('Displacement (m)')
axes.legend(['Mean'], loc="upper right")
axes.set_title(r'2-$\sigma$ credible intervals')
plt.show()

#### Direct simulation

Sample damping rates and compute trajectories.

In [None]:
fig, axes = plt.subplots()
axes.plot(t, u(b0, numpy.random.normal(a0, sigma_a, 1000), t2))
axes.set_xlabel('Time (s)')
axes.set_ylabel('Displacement (m)')
plt.show()

Nsamples = 100000
udirect = u(b0, numpy.random.normal(a0, sigma_a, Nsamples), t2)
umean = numpy.mean(udirect,axis=1)
uplus = umean + 2 * numpy.std(udirect, axis=1)
uminus = umean - 2 * numpy.std(udirect, axis=1)

fig, axes  = plt.subplots()
axes.plot(t, umean, label='simulation')
axes.plot(t, uplus)
axes.plot(t, uminus)
axes.plot(t, umean_exact, 'k--',label='exact')
axes.plot(t, umean_exact + 2 * numpy.sqrt(uvar_exact), 'k--')
axes.plot(t, umean_exact - 2 * numpy.sqrt(uvar_exact), 'k--')
axes.set_xlabel('Time (s)')
axes.set_ylabel('Displacement (m)')
axes.legend(['Mean'], loc="upper right")
axes.set_title(r'2-$\sigma$ credible intervals')
plt.show()

Note the intervals grow and become unbounded for large $t$ because some of the damping rates are negative.

#### Stochastic Spectral

We seek approximate solutions $$u^K(t,Q) = \sum_{k=0}^K u_k(t) \psi_k(Q)$$
Subject to $$\left\langle \frac{du^K}{dt} + a^N u^K,\psi_i \right\rangle_\rho = 0\,, \qquad i=0,\dots, K$$
Or
$$\left\langle \frac{du^K}{dt},\psi_i \right\rangle_\rho = -\left\langle a^N u^K,\psi_i \right\rangle_\rho \,, \qquad i=0,\dots, K$$
Or
$$ \int \sum_{k=0}^K \frac{d u_k}{dt} (t) \psi_k(q) \psi_i(q) \rho_Q(q) dq = -\int a^N(q) \sum_{k=0}^K u_k(t) \psi_k(q) \psi_i(q) \rho_Q(q) dq$$
where
$$ a^N(q) = \sum_{n=0}^N a_n \psi_n(q) = a_0 + \sigma_a q$$

This yields $K+1$ differential equations $$ \frac{du_i}{dt} = -\frac{1}{\gamma_i} \sum_{n=0}^N \sum_{k=0}^K a_n u_k(t) e_{ink} $$
where $\gamma_i = E[\psi_i^2]$ and $e_{ink} = E[ \psi_i \psi_n \psi_k]$.

Or
$$ \frac{d\mathbf{u}}{dt}= \mathbf{A} \mathbf{u} $$
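
Because $a^N$ contains only $\psi_0$ and $\psi_1$, the Hermite recurrence $q \psi_k = \psi_{k+1} + k \psi_{k-1}$ reduces the system to $\frac{du_i}{dt} = -a_0 u_i - \sigma_a \left( u_{i-1} + (i+1) u_{i+1} \right)$, so $\mathbf{A}$ is tridiagonal. The sketch below builds $\mathbf{A}$ from this reduced form and integrates to $t = 1$ with a hand-written RK4 loop (our choice of integrator, used here only to keep the example free of extra dependencies).

```python
import numpy

a0, sigma_a, b0 = 1.0, 0.25, 10.0            # values from the problem statement
K = 8

# Tridiagonal Galerkin matrix from the recurrence q He_k = He_{k+1} + k He_{k-1}
A = numpy.zeros((K + 1, K + 1))
for i in range(K + 1):
    A[i, i] = -a0
    if i > 0:
        A[i, i - 1] = -sigma_a
    if i < K:
        A[i, i + 1] = -sigma_a * (i + 1)

# Deterministic initial condition: only the mean mode is non-zero
U = numpy.zeros(K + 1)
U[0] = b0

# Classical fourth-order Runge-Kutta integration of dU/dt = A U up to t = T
T, steps = 1.0, 1000
dt = T / steps
for _ in range(steps):
    k1 = A.dot(U)
    k2 = A.dot(U + 0.5 * dt * k1)
    k3 = A.dot(U + 0.5 * dt * k2)
    k4 = A.dot(U + dt * k3)
    U = U + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

mean_exact = b0 * numpy.exp(-a0 * T + 0.5 * sigma_a**2 * T**2)
print(U[0], mean_exact)
```

The first coefficient is the mean of the process, so it can be compared directly with the closed-form expression.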

In [None]:
import math

def e_ink(i, n, k):
    """Triple product E[psi_i psi_n psi_k] for probabilists' Hermite polynomials."""
    s2 = i + n + k
    if s2 % 2 == 1:
        return 0
    s = s2 // 2     # integer half-sum (math.factorial requires integers)
    if s < i or s < n or s < k:
        return 0
    return (math.factorial(i) * math.factorial(n) * math.factorial(k)
            / (math.factorial(s - i) * math.factorial(s - n) * math.factorial(s - k)))

# Construct A
def construct_A(K):
    A = numpy.zeros(shape=(K + 1, K + 1))
    gamma = numpy.zeros(K+1)
    for i in range(K+1):
        gamma[i] = math.factorial(i)
        for k in range(K+1):
            A[i,k] = -1 / gamma[i] * (a0 * e_ink(i, 0, k) + sigma_a * e_ink(i, 1, k))
    return A
        
# Structure of A
K = 6
# K = 8
K = 12
A = construct_A(K)
fig, axes = plt.subplots(1, 2)
fig.set_figwidth(fig.get_figwidth() * 2)
axes[0].spy(A)
plot = axes[1].pcolor(numpy.arange(K + 1), numpy.arange(K + 1), A)
fig.colorbar(plot)
plt.show()

In [None]:
def dU_dt(U, t, A):
    # U holds the K + 1 expansion coefficients u_k(t); returns dU/dt = A U
    return A.dot(U)

# Plot
fig, axes  = plt.subplots(2, 2)
fig.set_figwidth(fig.get_figwidth() * 2)
fig.set_figheight(fig.get_figheight() * 2.5)

K_values = numpy.array([[4, 6], [8, 12]])
for j in range(2):
    for (i, K) in enumerate(K_values[:, j]):
        A = construct_A(K)
        U0 = numpy.zeros(K+1)
        U0[0] = b0
        # gamma_k = k!, built as a cumulative product so the loop index i is not reused
        gamma = numpy.cumprod(numpy.maximum(numpy.arange(K + 1), 1)).astype(float)
        UK = scipy.integrate.odeint(dU_dt, U0, t, args=(A,))
        UKmean = UK[:,0]
        UKvar = numpy.sum(gamma[1:] * UK[:,1:]**2, axis=1)

        axes[i, j].plot(t, UKmean)
        axes[i, j].plot(t, UKmean + 2 * numpy.sqrt(UKvar))
        axes[i, j].plot(t, UKmean - 2 * numpy.sqrt(UKvar))
        axes[i, j].plot(t, umean_exact, 'k--')
        axes[i, j].plot(t, umean_exact + 2 * numpy.sqrt(uvar_exact), 'k--')
        axes[i, j].plot(t, umean_exact - 2 * numpy.sqrt(uvar_exact), 'k--')
        axes[i, j].set_xlabel('Time (s)')
        axes[i, j].set_ylabel('Displacement (m)')
        axes[i, j].set_title('K = %s' % K)
plt.show()

#### Discrete projection
Also called pseudospectral.
$$ u_k(t) = \frac{1}{\gamma_k} \langle u(t,q),\psi_k \rangle_\rho = \frac{1}{\gamma_k} \int u(t,q)\psi_k(q) \rho_Q(q) dq \approx \frac{1}{\gamma_k} \sum_{r=1}^R u(t,q^r)\psi_k(q^r) \rho_Q(q^r) w^r$$
Note that this requires solving for $u(t,q^r)$ at each quadrature point $q^r$, so the method is non-intrusive.

In [None]:
from numpy.polynomial import HermiteE as H

qq = numpy.linspace(-2, 2, 100)
fig, axes = plt.subplots()
for i in range(4): 
    axes.plot(qq, H.basis(i)(qq), lw=2, label="$H_%d$" % i)
plt.legend(loc="lower left")
plt.show()

$$u_k(t) = \frac{1}{\gamma_k} \sum_{r=1}^R u(t,q^r) \psi_k(q^r) \rho_Q(q^r) w^r$$
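Use Gauss-Hermite quadrature points, and check the normalization of the weights since the rules come in different flavors: `hermegauss` uses the probabilists' weight $e^{-q^2/2}$, so its weights sum to $\sqrt{2\pi}$ and are divided by $\sqrt{2\pi}$ to absorb the density $\rho_Q$.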
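
The two common conventions can be compared directly. This sketch contrasts NumPy's physicists' rule (`hermgauss`, weight $e^{-x^2}$) with the probabilists' rule (`hermegauss`, weight $e^{-x^2/2}$) by computing $E[Q^2] = 1$ for $Q \sim N(0, 1)$ with each.

```python
import numpy
from numpy.polynomial.hermite import hermgauss       # weight exp(-x^2)
from numpy.polynomial.hermite_e import hermegauss    # weight exp(-x^2 / 2)

R = 8
x, wx = hermgauss(R)
q, wq = hermegauss(R)

# The raw weights sum to the integral of the respective weight function
sum_phys = numpy.sum(wx)     # sqrt(pi)
sum_prob = numpy.sum(wq)     # sqrt(2 pi)

# Probabilists' rule: divide by sqrt(2 pi)
second_moment_prob = numpy.sum(wq * q**2) / numpy.sqrt(2.0 * numpy.pi)

# Physicists' rule: substitute q = sqrt(2) x and divide by sqrt(pi)
second_moment_phys = numpy.sum(wx * (numpy.sqrt(2.0) * x)**2) / numpy.sqrt(numpy.pi)

print(sum_phys, sum_prob)
print(second_moment_prob, second_moment_phys)
```

Either rule works as long as the points and weights are rescaled consistently; mixing the conventions is a common source of wrong moments.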

In [None]:
R = 16
q, w = numpy.polynomial.hermite_e.hermegauss(R)
w = w / numpy.sqrt(2 * numpy.pi)
numpy.sum(w)

With the density absorbed into the normalized weights this becomes
$$u_k(t) = \frac{1}{\gamma_k} \sum_{r=1}^R u(t,q^r)\psi_k(q^r)  w^r$$

In [None]:
UKp = numpy.zeros(shape=(nt,K+1))
for k in range(K + 1):
    UKp[:,k] = numpy.sum(H.basis(k)(q) * w * u(b0, a0 + sigma_a * q, t2), axis=1) / gamma[k]
UKpmean = UKp[:,0]
UKpvar = numpy.sum(gamma[1:] * UKp[:,1:]**2, axis=1)

fig, ax  = plt.subplots()
ax.plot(t, UKpmean)
ax.plot(t, UKpmean + 2 * numpy.sqrt(UKpvar))
ax.plot(t, UKpmean - 2 * numpy.sqrt(UKpvar))
ax.plot(t, umean_exact, 'k--')
ax.plot(t, umean_exact + 2 * numpy.sqrt(uvar_exact),'k--')
ax.plot(t, umean_exact - 2 * numpy.sqrt(uvar_exact),'k--')
plt.show()

#### Collocation
Find coefficients $u_k(t)$ that make $u(t,q^m) \approx u^K(t,q^m)$, $m=1,\dots,M$. $q^m$ are collocation points. 

Least-squares problem.

$$  u^K(t,q^m) = \sum_{k=0}^K u_k(t) \psi_k(q^m) = u(t,q^m)\,, m = 1,\dots,M$$
Or
$$ 
\begin{bmatrix} 
\psi_0(q^1) & \cdots & \psi_K(q^1) \\
\vdots & & \vdots\\
\psi_0(q^M) & \cdots & \psi_K(q^M)
\end{bmatrix}
\begin{bmatrix} 
u_0(t)\\
\vdots\\
u_K(t)
\end{bmatrix} 
=\begin{bmatrix} 
u(t,q^1)\\
\vdots\\
u(t,q^M)
\end{bmatrix} 
$$

Note that the right-hand side requires $M$ solutions of the deterministic problem, comparable to discrete projection.

Let's just use the Gauss-Hermite points. (Scaling seems to work better. What is a good choice?)

In [None]:
from numpy.polynomial.hermite_e import hermevander
from numpy.polynomial.hermite_e import hermefit

q = q/1.2
rhs = u(b0, a0 + sigma_a * q, t2)
rhs = numpy.swapaxes(rhs, 0, 1)

UKc = hermefit(q,rhs,K)
UKc = numpy.swapaxes(UKc,0,1)

UKcmean = UKc[:,0]
UKcvar = numpy.sum(gamma[1:] * UKc[:,1:]**2, axis=1)

fig, ax  = plt.subplots()
ax.plot(t, UKcmean)
ax.plot(t, UKcmean + 2 * numpy.sqrt(UKcvar))
ax.plot(t, UKcmean - 2 * numpy.sqrt(UKcvar))
ax.plot(t, umean_exact, 'k--')
ax.plot(t, umean_exact + 2 * numpy.sqrt(uvar_exact), 'k--')
ax.plot(t, umean_exact - 2 * numpy.sqrt(uvar_exact), 'k--')
plt.show()

### Elliptic PDEs

Taking the next step in complexity and adding spatial dependence to our problem we have the strong formulation of an elliptic PDE as
$$\begin{aligned}
    &\mathcal N(u, Q) = F(Q) \quad x \in \mathcal D \\
    &B(u, Q) = G(Q) \quad x \in \partial \mathcal D 
\end{aligned}$$
where the first equation represents a possibly non-linear operator $\mathcal N$ and the second the corresponding boundary conditions.  Note that instead of a time dependence we now have strictly a spatial dependence and hence this has become an infinite-dimensional problem.  The quantity of interest in this case will be defined as
$$
    y(x) = \int_\Gamma u(x, q) \rho_Q(q) dq
$$
at $x \in \mathcal D$.

To turn the strong form equation into a weak one we will take a space of test functions $V$ that satisfy zero boundary conditions (essential boundary conditions) and write the differential equation as
$$
    \int_{\mathcal D} \mathcal N(u, Q) \mathcal S(v) dx = \int_{\mathcal D} F(Q) v dx
$$
where the possible non-linearity of $\mathcal N$ means that, after integration by parts, the test function $v$ may enter through a more complex functional $\mathcal S(v)$.

We can now restate the problem as the following:

Find a $u \in V \otimes Z$ which satisfies, for all test functions $v \in V$ and $z \in Z$,
$$
\int_\Gamma \int_{\mathcal{D}} N(u, q) S(v(x)) z(q) \rho_Q(q) dx dq = \int_\Gamma \int_{\mathcal{D}} F(q) v(x) z(q) \rho_Q(q) dx dq
$$
where $V$ is typically an appropriate Sobolev space.

Take two bases that span finite-dimensional subspaces of each of the component spaces that $u$ lives in such that
$$
    V^J = \text{span}\{\phi_j\} \subset V \quad Z^K = \text{span}\{\Psi_k \} \subset Z
$$
where $\{\phi_j\}$ can be any traditional basis, such as a typical finite element basis, and $\{\Psi_k\}$ is the spectral polynomial basis already discussed.

We now write the spectral approximation as
$$
    u^K(x, Q) = \sum^K_{k=0} u_k(x) \Psi_k(Q) = \sum^K_{k=0} \sum^J_{j=1} u_{jk} \phi_j(x) \Psi_k(Q)
$$

#### Stochastic Galerkin

We now need to project the residual onto the test function space such that
$$\begin{aligned}
    \int_\Gamma \int_{\mathcal{D}} N(u, q) S(v(x)) z(q) \rho_Q(q) dx dq &= \int_\Gamma \int_{\mathcal{D}} F(q) v(x) z(q) \rho_Q(q) dx dq &\Rightarrow \\
    \int_\Gamma \int_{\mathcal{D}} N\left(u^K(x, Q), q \right) S(\phi_\ell(x)) \Psi_i(q) \rho_Q(q) dx dq &= \int_\Gamma \int_{\mathcal{D}} F(q) \phi_\ell(x) \Psi_i(q) \rho_Q(q) dx dq& \\
    \int_\Gamma \int_{\mathcal{D}} N\left(\sum^K_{k=0} \sum^J_{j=1} u_{jk} \phi_j(x) \Psi_k(Q), q \right) S(\phi_\ell(x)) \Psi_i(q) \rho_Q(q) dx dq &= \int_\Gamma \int_{\mathcal{D}} F(q) \phi_\ell(x) \Psi_i(q) \rho_Q(q) dx dq&
\end{aligned}$$
which needs to hold $\forall \ell = 1, \ldots, J$ and $\forall i = 0, \ldots K$ (note that the indices $k$ and $j$ are inside the summation and we avoid their use).

Now we turn to finding the coefficients $u_{jk}$ by approximating the integrals via quadrature rules formulated here as
$$
    \int_\Gamma z(q) dq \approx \sum^R_{r=1} w^r z(q^r).
$$
where here we have used $R$ quadrature points $q^r$ and $w^r$ are the weights.  Plugging this into the previous formulation we have
$$\begin{aligned}
    \int_\Gamma \int_{\mathcal{D}} N\left(\sum^K_{k=0} \sum^J_{j=1} u_{jk} \phi_j(x) \Psi_k(Q), q \right) S(\phi_\ell(x)) \Psi_i(q) \rho_Q(q) dx dq &= \int_\Gamma \int_{\mathcal{D}} F(q) \phi_\ell(x) \Psi_i(q) \rho_Q(q) dx dq &\Rightarrow \\
    \sum^R_{r=1} w^r \Psi_i(q^r) \rho_Q(q^r) \int_{\mathcal{D}} N\left(\sum^K_{k=0} \sum^J_{j=1} u_{jk} \phi_j(x) \Psi_k(q^r), q^r \right) S(\phi_\ell(x))  dx &= \sum^R_{r=1} w^r \Psi_i(q^r) \rho_Q(q^r)\int_{\mathcal{D}} F(q^r) \phi_\ell(x) dx &
\end{aligned}$$
If we use standard Gaussian quadrature we have a $J(K+1) \times J(K+1)$ system of fully coupled equations as we still require these to be $\forall \ell = 1, \ldots, J$ and $\forall i = 0, \ldots K$.

We need to also discretize the QoI leading to
$$\begin{aligned}
    y(x) = \int_\Gamma u(x, q) \rho_Q(q) dq \approx \sum^R_{r=1} w^r \rho_Q(q^r) \sum^K_{k=0} \sum^J_{j=1} u_{jk} \phi_j(x) \Psi_k(q^r)
\end{aligned}$$
where we have slightly simplified the notation by using the same quadrature rule as above.  This is then easily evaluated when the $u_{jk}$ have been determined.

#### Collocation

Take $M$ collocation points $q^m \in \Gamma$ and Lagrange polynomials for $\Psi_k$ satisfying the collocation property
$$
    L_k(q^m) = \delta_{km}.
$$
At the collocation points we then have
$$
    u(x, q^m) = u^K(x, q^m) = \sum^J_{j=1} u_{jm} \phi_j(x)
$$
yielding
$$
    \int_{\mathcal{D}} N\left(\sum^J_{j=1} u_{jm} \phi_j(x), q^m \right) S(\phi_\ell(x)) dx = \int_{\mathcal{D}} F(q^m) \phi_\ell(x) dx
$$
where we need this to hold $\forall \ell = 1,\ldots,J$.  For each collocation point $q^m$ we can find the $u_{jm}$ by solving a $J \times J$ system.  Because the $M$ systems are decoupled, this procedure is highly parallelizable and also non-intrusive.  As mentioned previously, the systems resulting from this collocation approach are a special case of the stochastic Galerkin approach: if the collocation points $q^m$ are taken to be the same as the quadrature points $q^r$ and the basis functions $\Psi_k$ are the Lagrange basis functions, the Galerkin formulation reduces to the one we have here.

To compute the QoI we again employ a quadrature rule with $R$ points and weights but now we also use Lagrange basis functions to obtain
$$
    y(x) = \int_\Gamma u(x, q) \rho_Q(q) dq \approx \sum^R_{r=1} w^r \rho_Q(q^r) \sum^R_{m=1} \sum^J_{j=1} u_{jm} \phi_j(x) L_m(q^r) = \sum^R_{r=1} w^r \rho_Q(q^r) \sum^J_{j=1} u_{jr} \phi_j(x).
$$
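
The collocation procedure can be sketched on a toy problem. Purely for illustration, assume the model $\alpha u'' = -1$ on $[-1, 1]$ with $u(\pm 1) = 0$ and $\alpha = \bar{\alpha} + \sigma_\alpha q$ with $q$ standard normal; the solver `solve_deterministic` and the parameter values are hypothetical stand-ins for an existing deterministic code. The collocation points are taken to be Gauss-Hermite nodes, whose normalized weights absorb $\rho_Q$.

```python
import numpy
from numpy.polynomial.hermite_e import hermegauss


def solve_deterministic(alpha, J=20):
    """Solve alpha * u'' = -1 on [-1, 1], u(-1) = u(1) = 0, at J - 1 interior nodes."""
    delta_x = 2.0 / J
    # Second-difference matrix tridiag(-1, 2, -1) / delta_x^2
    A = (2.0 * numpy.eye(J - 1) - numpy.eye(J - 1, k=1)
         - numpy.eye(J - 1, k=-1)) / delta_x**2
    return numpy.linalg.solve(alpha * A, numpy.ones(J - 1))


alpha_bar, sigma_alpha = 1.0, 0.1        # hypothetical parameter values
q, w = hermegauss(5)                     # collocation = quadrature points
w = w / numpy.sqrt(2.0 * numpy.pi)       # weights now include the density

# One independent deterministic solve per collocation point q^m
U = numpy.array([solve_deterministic(alpha_bar + sigma_alpha * q_m) for q_m in q])

# QoI: the Lagrange basis reduces the expansion to a weighted sum of nodal solves
y = w.dot(U)
print(y[9])   # approximate mean of u at x = 0
```

Each row of `U` comes from an unmodified deterministic solve, which is what makes the approach non-intrusive and trivially parallel.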

#### Discrete Projection

As usual, discrete projection is the most direct way to approximate the coefficients we desire.  Here we would employ
$$
    u_k(x) = \frac{1}{\gamma_k} \sum^R_{r=1} u(x, q^r) \Psi_k(q^r) \rho_Q(q^r) w^r
$$
leading to nearly the same method as collocation.
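
A minimal sketch of discrete projection for a scalar response, assuming $Q \sim N(0,1)$, probabilists' Hermite polynomials for $\Psi_k$ and therefore $\gamma_k = k!$. The function name `project` and the test response $u(Q) = e^{\sigma Q}$ are illustrative choices; this response is convenient because its exact coefficients are $u_k = e^{\sigma^2/2} \sigma^k / k!$.

```python
import numpy
from math import factorial
from numpy.polynomial.hermite_e import hermegauss, hermeval


def project(u, K, R=20):
    """Return the coefficients u_0, ..., u_K of u(Q), Q ~ N(0, 1)."""
    q, w = hermegauss(R)
    w = w / numpy.sqrt(2.0 * numpy.pi)             # weights include rho_Q
    coefficients = numpy.empty(K + 1)
    for k in range(K + 1):
        Psi_k = hermeval(q, [0.0] * k + [1.0])     # He_k at the quadrature nodes
        gamma_k = factorial(k)                     # E[He_k(Q)^2] = k!
        coefficients[k] = numpy.sum(u(q) * Psi_k * w) / gamma_k
    return coefficients


sigma = 0.5
u_k = project(lambda q: numpy.exp(sigma * q), K=4)
exact = numpy.array([numpy.exp(sigma**2 / 2.0) * sigma**k / factorial(k)
                     for k in range(5)])
print(numpy.abs(u_k - exact).max())
```

Only evaluations of $u$ at the quadrature nodes are needed, so $u$ can be any black-box solver.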

### Steady-State Heat Equation 

To have a more concrete example consider the Poisson problem or steady-state heat equation (linear case) with
$$
    \alpha \frac{\text{d}^2 u}{\text{d} x^2} = -f(x)
$$
on $x \in [-1, 1]$ and BCs $u(-1) = u(1) = 0$ and
$$
    \alpha \sim N(\bar{\alpha}, \sigma^2_\alpha ).
$$

We can rewrite this problem in the weak form by multiplying by a test function $v$ and integrating by parts, where the boundary term vanishes because $v(\pm 1) = 0$:
$$\begin{aligned}
    \int^1_{-1} \alpha \frac{\text{d}^2 u(x)}{\text{d} x^2} v(x) dx &= -\int^1_{-1} v(x) f(x) dx \\
    \int^1_{-1} \alpha \frac{\text{d} u(x)}{\text{d} x} \frac{\text{d} v(x)}{\text{d} x} dx &= \int^1_{-1} v(x) f(x) dx
\end{aligned}$$
which must hold $\forall v \in V$.  Placing this in our previous notation we then have
$$
    \mathcal{N}(u, Q) = \alpha \frac{\text{d} u(x)}{\text{d} x} \quad S(v) = \frac{\text{d} v(x)}{\text{d} x}
$$

From the last example we have
$$
    \alpha = \alpha^N = \bar{\alpha} + \sigma_\alpha Q = \sum^1_{n=0} \alpha_n \Psi_n(Q)
$$
where $\Psi_0(Q) = 1$ and $\Psi_1(Q) = Q$ are of course the first two Hermite polynomials.  The corresponding density is then
$$
    \rho_Q(q) = \frac{1}{\sqrt{2 \pi}} e^{-q^2 / 2}.
$$

From our formulation of the general problem we have
$$
    \mathcal{N}(u, Q) = (\bar{\alpha} + \sigma_\alpha Q) \frac{\partial u}{\partial x}, \quad \quad S(v) = \frac{\text{d} v}{\text{d} x}
$$
so that the stochastic weak formulation is therefore
$$
    \int_{\mathbb R} \int^1_{-1} (\bar{\alpha} + \sigma_\alpha q) \frac{\partial u}{\partial x} \frac{\text{d} v}{\text{d} x} z(q) \rho_Q(q) dx dq = \int_{\mathbb R} \int^1_{-1} f(x) v(x) z(q) \rho_Q(q) dx dq
$$
which must hold $\forall v \in H^1_0(-1, 1)$ and $\forall z \in L^2_\rho(\mathbb R)$.

#### Construction of $V^J$

To construct the approximate space for the test functions we will use the common hat functions from finite elements:
$$
    \phi_j(x) = \frac{1}{\Delta x} \left \{ \begin{aligned}
        x - x_{j-1} & & x_{j-1} \leq x < x_j \\
        x_{j+1} - x & & x_j \leq x < x_{j+1} \\
        0 & & \text{otherwise}
    \end{aligned} \right .
$$
with the points $x_j = -1 + j \cdot \Delta x$ and $j = 1, \ldots, J-1$ so the essential boundary conditions are maintained.
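
The hat functions can be written compactly as $\phi_j(x) = \max(0, 1 - |x - x_j| / \Delta x)$, which agrees with the piecewise definition above. The following sketch evaluates the interior basis on a fine grid; the helper name `hat` and the choice $J = 8$ are illustrative only.

```python
import numpy


def hat(j, x, J):
    """Evaluate phi_j(x) for a uniform grid of J cells on [-1, 1]."""
    delta_x = 2.0 / J
    x_j = -1.0 + j * delta_x
    return numpy.maximum(0.0, 1.0 - numpy.abs(x - x_j) / delta_x)


J = 8
x = numpy.linspace(-1.0, 1.0, 401)
# Interior basis functions j = 1, ..., J - 1; each vanishes at x = -1 and x = 1
basis = numpy.array([hat(j, x, J) for j in range(1, J)])
print(basis[:, 0], basis[:, -1])
```

Because every $\phi_j$ in the basis vanishes at both endpoints, any linear combination satisfies the homogeneous Dirichlet conditions automatically.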

#### Construction of $Z^K$

The stochastic space will be spanned by the Hermite polynomials as the random variable is normally distributed so that
$$
    Z^K = \text{span}\left \{\Psi_k \right \}^K_{k=0}
$$
where the $\Psi_k(q)$ are Hermite polynomials.
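
As a quick numerical check of this basis, the sketch below evaluates the probabilists' Hermite polynomials $He_k$ with NumPy and verifies that their Gram matrix under the standard normal density is diagonal with entries $k!$. The truncation $K = 4$ is an arbitrary choice for illustration.

```python
import numpy
from math import factorial
from numpy.polynomial.hermite_e import hermegauss, hermeval

K = 4
q, w = hermegauss(K + 1)               # exact for polynomials up to degree 2K + 1
w = w / numpy.sqrt(2.0 * numpy.pi)     # weights include rho_Q

# Row k holds Psi_k = He_k evaluated at the quadrature nodes
Psi = numpy.array([hermeval(q, [0.0] * k + [1.0]) for k in range(K + 1)])

# Gram matrix: gram[k, i] approximates E[Psi_k(Q) Psi_i(Q)]
gram = (Psi * w).dot(Psi.T)
expected = numpy.diag([float(factorial(k)) for k in range(K + 1)])
print(numpy.abs(gram - expected).max())
```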

#### Discretized Solution

The approximate solution is then in the space $V^J \otimes Z^K$,
$$
    u^K(x, Q) = \sum^K_{k=0} \sum^{J-1}_{j=1} u_{jk} \phi_j(x) \Psi_k(Q),
$$
yielding the discretized problem
$$\begin{aligned}
    \int_{\mathbb R} \int^1_{-1} (\bar{\alpha} + \sigma_\alpha q) \frac{\partial u}{\partial x} \frac{\text{d} v}{\text{d} x} z(q) \rho_Q(q) dx dq &= \int_{\mathbb R} \int^1_{-1} f(x) v(x) z(q) \rho_Q(q) dx dq &\Rightarrow \\
    \int_{\mathbb R} (\bar{\alpha} + \sigma_\alpha q) \int^1_{-1} \frac{\partial}{\partial x}\left[\sum^K_{k=0} \sum^{J-1}_{j=1} u_{jk} \phi_j(x) \Psi_k(q) \right] \frac{\text{d}}{\text{d} x} \left[\phi_\ell(x) \right ] z(q) \rho_Q(q) dx dq &= \int_{\mathbb R} z(q) \rho_Q(q) \left[ \int^1_{-1} f(x) \phi_\ell(x) dx \right ] dq & \\
    \int_{\mathbb R} (\bar{\alpha} + \sigma_\alpha q)  \sum^K_{k=0} \sum^{J-1}_{j=1} u_{jk} \Psi_k(q) \left [ \int^1_{-1} \phi'_j(x) \phi'_\ell(x) dx \right] z(q) \rho_Q(q) dq &= \int_{\mathbb R} z(q) \rho_Q(q) \left[ \int^1_{-1} f(x) \phi_\ell(x) dx \right ] dq & \\
    \int_{\mathbb R} (\bar{\alpha} + \sigma_\alpha q)  \sum^K_{k=0} \sum^{J-1}_{j=1} u_{jk} \Psi_k(q) \left [ \int^1_{-1} \phi'_j(x) \phi'_\ell(x) dx \right] \Psi_i(q) \rho_Q(q) dq &= \int_{\mathbb R} \Psi_i(q) \rho_Q(q) \left[ \int^1_{-1} f(x) \phi_\ell(x) dx \right ] dq &
\end{aligned}$$
which must hold $\forall \ell = 1,\ldots,J-1$ and $\forall i = 0, \ldots, K$.

If we simplify the notation a bit with the two projections that we can compute beforehand we can write
$$\begin{aligned}
    &\Phi_{j\ell} = \int^1_{-1} \phi'_j(x) \phi'_\ell(x) dx = \frac{1}{\Delta x} \left \{ \begin{aligned}
        2 & & j = \ell \\
        -1 & & j = \ell - 1 \text{ or } j = \ell + 1 \\
        0 & & \text{otherwise}
    \end{aligned} \right . \\
    &f_\ell = \int^1_{-1} f(x) \phi_\ell(x) dx
\end{aligned}$$
such that
$$\begin{aligned}
    \int_{\mathbb R} (\bar{\alpha} + \sigma_\alpha q)  \sum^K_{k=0} \sum^{J-1}_{j=1} u_{jk} \Psi_k(q) \left [ \int^1_{-1} \phi'_j(x) \phi'_\ell(x) dx \right] \Psi_i(q) \rho_Q(q) dq &= \int_{\mathbb R} \Psi_i(q) \rho_Q(q) \left[ \int^1_{-1} f(x) \phi_\ell(x) dx \right ] dq & \Rightarrow \\
    \sum^{J-1}_{j=1} \Phi_{j\ell} \sum^K_{k=0} u_{jk} \int_{\mathbb R} (\bar{\alpha} + \sigma_\alpha q) \Psi_k(q) \Psi_i(q) \rho_Q(q) dq &= f_\ell \int_{\mathbb R} \Psi_i(q) \rho_Q(q) dq &
\end{aligned}$$

Now due to the orthogonality of the Hermite polynomials
$$
    \int_{\mathbb R} \Psi_k(q) \Psi_i(q) \rho_Q(q) dq = k! \delta_{ki}
$$
and the fact we know
$$
    \int_{\mathbb R} \Psi_\ell(q) \rho_Q(q) dq = \left \{ \begin{aligned} 1 & & \ell = 0 \\ 0 & & \text{otherwise} \end{aligned} \right .
$$
we can construct a linear system of size $(J - 1)(K + 1) \times (J - 1)(K + 1)$ for the coefficients $u_{jk}$.
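
A minimal sketch of assembling and solving this system, under several assumptions that are not part of the derivation above: $f(x) = 1$ (so $f_\ell = \Delta x$), dense NumPy matrices, the values $\bar{\alpha} = 1$ and $\sigma_\alpha = 0.1$, and the right-hand side taken as $+f_\ell \delta_{i0}$, which is the sign obtained by integrating $\alpha u'' = -f$ by parts. The stochastic integrals use orthogonality together with the recurrence $q He_k = He_{k+1} + k He_{k-1}$. The function name `galerkin_steady_heat` is hypothetical.

```python
import numpy
from math import factorial


def galerkin_steady_heat(J=20, K=3, alpha_bar=1.0, sigma_alpha=0.1):
    """Solve the stochastic Galerkin system for f(x) = 1; U[j - 1, k] = u_{jk}."""
    delta_x = 2.0 / J
    # Stiffness matrix Phi[j, l] = int phi_j' phi_l' dx for hat functions
    Phi = (2.0 * numpy.eye(J - 1) - numpy.eye(J - 1, k=1)
           - numpy.eye(J - 1, k=-1)) / delta_x
    # Load vector f_l = int f phi_l dx = delta_x when f = 1
    f = delta_x * numpy.ones(J - 1)

    # G[i, k] = int (alpha_bar + sigma_alpha q) Psi_k Psi_i rho_Q dq
    G = numpy.zeros((K + 1, K + 1))
    for k in range(K + 1):
        G[k, k] = alpha_bar * factorial(k)
        if k < K:
            G[k + 1, k] = G[k, k + 1] = sigma_alpha * factorial(k + 1)

    # Right-hand side f_l * delta_{i0}; unknowns ordered with k outermost
    e_0 = numpy.zeros(K + 1)
    e_0[0] = 1.0
    u = numpy.linalg.solve(numpy.kron(G, Phi), numpy.kron(e_0, f))
    return u.reshape(K + 1, J - 1).T


U = galerkin_steady_heat()
mean = U[:, 0]   # E[u^K] at the interior nodes, since E[Psi_k] = 0 for k >= 1
print(mean[9])   # mean at x = 0
```

With $\sigma_\alpha = 0$ the matrix $G$ is diagonal and the $k = 0$ block reduces to the deterministic finite element system, whose nodal values match $(1 - x^2) / 2$. Note that a normally distributed $\alpha$ can take non-positive values, so this sketch is only sensible for small $\sigma_\alpha / \bar{\alpha}$.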

### Evolution PDEs

We now consider the most complex of the systems when we add time to the mix.  Consider a general evolutionary PDE in the form
$$\begin{aligned}
    &\frac{\partial u}{\partial t} + \mathcal{N}(u, Q) = F(Q) & & x \in \mathcal{D}, t \in [0, \infty) \\
    &B(u, Q) = G(Q) & & x \in \partial \mathcal{D}, t \in [0, \infty) & \text{Boundary Condition} \\
    &u(0, x, Q) = I(Q) & & x \in \mathcal{D} & \text{Initial Condition}
\end{aligned}$$
where we have from before $\mathcal{N}$ containing a spatial differential operator, $F$ a source term, along with the boundary conditions and initial conditions.  To solve this problem we will combine the previous two approaches.

The weak deterministic form for the problem is
$$
    \int_{\mathcal{D}} \frac{\partial u}{\partial t} v dx + \int_{\mathcal{D}} \mathcal{N}(u, Q) S(v) dx = \int_{\mathcal{D}} F(Q) v dx
$$
which must hold $\forall v \in V$ where the functions in $V$ satisfy essential boundary conditions.  Adding the stochastic component to this we have
$$
    \int_\Gamma \int_{\mathcal{D}} \frac{\partial u}{\partial t} v(x) z(q) \rho_Q(q) dx dq + \int_\Gamma \int_{\mathcal{D}} \mathcal{N}(u, Q) S(v(x)) z(q) \rho_Q(q) dx dq = \int_\Gamma \int_{\mathcal{D}} F(Q) v(x) z(q) \rho_Q(q) dx dq
$$
where now this must hold $\forall v \in V$ and $\forall z \in Z$.

The QoI will be taken to be the expected value
$$
    y(t, x) = \int_\Gamma u(t, x, q) \rho_Q(q) dq
$$

Here again we take finite dimensional subspaces of the full space $V \otimes Z$ where
$$
    V^J = \text{span}\{\phi_j\} \subset V \quad Z^K = \text{span}\{\Psi_k \} \subset Z
$$
leading to the approximate solutions
$$
    u^K(t, x, Q) = \sum^K_{k=0} \sum^J_{j=1} u_{jk}(t) \phi_j(x) \Psi_k(Q).
$$

#### Stochastic Galerkin

As before we will use a quadrature rule with $R$ quadrature points $q^r$ and weights $w^r$.  Starting from the stochastic weak form
$$
    \int_\Gamma \int_{\mathcal{D}} \frac{\partial u}{\partial t} v(x) z(q) \rho_Q(q) dx dq + \int_\Gamma \int_{\mathcal{D}} \mathcal{N}(u, Q) S(v(x)) z(q) \rho_Q(q) dx dq = \int_\Gamma \int_{\mathcal{D}} F(Q) v(x) z(q) \rho_Q(q) dx dq
$$
we have the following terms
$$\begin{aligned}
    &\int_\Gamma \int_{\mathcal{D}} \frac{\partial u}{\partial t} v(x) z(q) \rho_Q(q) dx dq &\approx& \sum^R_{r=1} \Psi_i(q^r) \rho_Q(q^r) w^r \sum^K_{k=0} \sum^J_{j=1} \frac{\text{d} u_{jk}}{\text{d}t} \Psi_k(q^r) \int_{\mathcal{D}} \phi_j(x) \phi_\ell(x) dx \\
    &\int_\Gamma \int_{\mathcal{D}} \mathcal{N}(u, Q) S(v(x)) z(q) \rho_Q(q) dx dq &\approx& \sum^R_{r=1} \Psi_i(q^r) \rho_Q(q^r) w^r \int_{\mathcal{D}} N\left(\sum^K_{k=0} \sum^J_{j=1} u_{jk} \phi_j(x) \Psi_k(q^r), q^r \right)  S(\phi_\ell(x)) dx \\
    &\int_\Gamma \int_{\mathcal{D}} F(Q) v(x) z(q) \rho_Q(q) dx dq &\approx& \sum^R_{r=1} \Psi_i(q^r) \rho_Q(q^r) w^r \int_{\mathcal{D}} F(q^r) \phi_\ell(x) dx.
\end{aligned}$$
which again holds $\forall \ell = 1,\ldots, J$ and $\forall i= 0,\ldots,K$.

Once the $u_{jk}(t)$ are found we can evaluate the QoI as
$$
    y(t, x) = \int_\Gamma u(t, x, q) \rho_Q(q) dq \approx \sum^R_{r=1} w^r \rho_Q(q^r) \sum^K_{k=0} \sum^J_{j=1} u_{jk}(t) \phi_j(x) \Psi_k(q^r).
$$

#### Collocation

Again we choose the basis $\Psi_k(q)$ to be Lagrange polynomials on a set of collocation points $q^m$ and, for simplicity, require the quadrature points $q^r$ and the collocation points $q^m$ to be identical.  We then have
$$
    \sum^J_{j=1} \frac{\text{d} u_{jr}}{\text{d} t} \int_{\mathcal{D}} \phi_j(x) \phi_\ell(x) dx + \int_{\mathcal{D}} N\left(\sum^J_{j=1} u_{jr} \phi_j(x), q^r \right) S(\phi_\ell(x)) dx = \int_{\mathcal{D}} F(q^r) \phi_\ell(x) dx
$$
where this holds $\forall \ell=1,\ldots,J$.  The QoI is then approximated with
$$
    y(t, x) = \int_\Gamma u(t, x, q) \rho_Q(q) dq \approx \sum^R_{r=1} w^r \rho_Q(q^r) \sum^J_{j=1} u_{jr}(t) \phi_j(x).
$$

#### Discrete Projection

Here the projection onto the basis forms the systems represented by
$$
    u_k(t,x) = \frac{1}{\gamma_k} \sum^R_{r=1} u(t, x, q^r) \Psi_k(q^r) \rho_Q(q^r) w^r.
$$

#### Example: Heat Equation

$$\begin{aligned}
    &\frac{\partial u}{\partial t} = \kappa(q) \frac{\partial^2 u}{\partial x^2}, \quad x \in (-1,1) \\
    &u(t, -1) = u(t, 1) = \alpha, \quad u(0, x) = \alpha + \beta e^{-x^2 / \sigma^2}
\end{aligned}$$

In [None]:
def solve_heat_equation(x, U_0, t_0, t_final, kappa=1.0, alpha=0.0, C=0.5):
    """Solve the heat equation u_t = kappa * u_xx using Crank-Nicolson.
    
    Returns the interior values at t_final; both boundaries are held at alpha.
    """
    U = U_0[1:-1].copy()
    m = U.shape[0]
    delta_x = x[1] - x[0]
    delta_t = C * delta_x / kappa
    N = int((t_final - t_0) / delta_t)

    # Both boundaries are held at the constant value alpha (g_0 = g_1 = alpha)
    
    r = numpy.ones(m) * delta_t * kappa / (2.0 * delta_x**2)
    A = sparse.spdiags([-r, 1.0 + 2.0 * r, -r], [-1, 0, 1], m, m).tocsr()
    B = sparse.spdiags([r, 1.0 - 2.0 * r, r], [-1, 0, 1],  m, m).tocsr()
    
    # Time stepping loop
    t = t_0
    for n in range(N):
        # Construct right-hand side; the boundary terms add r * (g(t_n) + g(t_{n+1})) = 2 * r * alpha
        b = B.dot(U)
        b[0] += delta_t * kappa / (2.0 * delta_x**2) * 2.0 * alpha
        b[-1] += delta_t * kappa / (2.0 * delta_x**2) * 2.0 * alpha

        # Solve system
        U = linalg.spsolve(A, b)
        t += delta_t
    
    # Take last time step
    delta_t = t_final - (t_0 + delta_t * N)
    r = numpy.ones(m) * delta_t * kappa / (2.0 * delta_x**2)
    A = sparse.spdiags([-r, 1.0 + 2.0 * r, -r], [-1, 0, 1], m, m).tocsr()
    B = sparse.spdiags([r, 1.0 - 2.0 * r, r], [-1, 0, 1],  m, m).tocsr()
    b = B.dot(U)
    b[0] += delta_t * kappa / (2.0 * delta_x**2) * 2.0 * alpha
    b[-1] += delta_t * kappa / (2.0 * delta_x**2) * 2.0 * alpha
    U = linalg.spsolve(A, b)
    
    return U

N = 1000
kappa_values = numpy.random.uniform(low=0.0, high=1.0, size=N)
m = 100
x = numpy.linspace(-1, 1, m + 1)

alpha = 1.0
beta = 1.0
sigma = 0.2
U_0 = beta * numpy.exp(-x**2 / sigma**2) + alpha

U = numpy.ones((N, m + 1)) * alpha
for (i, kappa) in enumerate(kappa_values):
    U[i, 1:-1] = solve_heat_equation(x, U_0, 0.0, 0.1, kappa=kappa, alpha=alpha)
    
U_star = numpy.ones(m + 1) * alpha
U_star[1:-1] = solve_heat_equation(x, U_0, 0.0, 0.1, kappa=0.5, alpha=alpha)
    
# Compute statistics
U_mean = numpy.sum(U, axis=0) / N

# Plot a few solutions
fig = plt.figure()
fig.set_figwidth(fig.get_figwidth() * 2)
fig.set_figheight(fig.get_figheight() * 2)
axes = fig.add_subplot(1, 1, 1)
for i in range(N):
    axes.plot(x, U[i, :], 'gray')
# Axis decorations only need to be set once, outside the loop
axes.set_xlabel("x")
axes.set_ylabel("u(x,t)")
axes.set_title("Solution to Heat Equation using CN")
axes.set_xlim([-1,1])


axes.plot(x, U_mean, 'r', label='MC mean')
axes.plot(x, U_star, 'c', label=r'Solution at mean $\kappa = 0.5$')
axes.legend()
plt.show()
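
The Monte Carlo loop above needs one solve per sample. As a hedged sketch of the collocation alternative for $\kappa \sim U(0, 1)$, the block below uses Gauss-Legendre nodes mapped to $[0, 1]$ and only $R = 5$ solves. To keep the result checkable it makes assumptions that differ from the cell above: it works with $v = u - \alpha$ (zero boundary values), uses the initial condition $v(0, x) = \cos(\pi x / 2)$, for which $E[v] = \frac{1 - e^{-c}}{c} \cos(\pi x / 2)$ with $c = \pi^2 t / 4$, and uses a small dense solver `heat_cn` that is a hypothetical stand-in for `solve_heat_equation`.

```python
import numpy
from numpy.polynomial.legendre import leggauss


def heat_cn(v_0, delta_x, t_final, kappa, num_steps=100):
    """Crank-Nicolson for v_t = kappa * v_xx with v = 0 on both boundaries."""
    m = v_0.shape[0]
    r = kappa * (t_final / num_steps) / (2.0 * delta_x**2)
    L = 2.0 * numpy.eye(m) - numpy.eye(m, k=1) - numpy.eye(m, k=-1)
    A = numpy.eye(m) + r * L
    B = numpy.eye(m) - r * L
    v = v_0.copy()
    for n in range(num_steps):
        v = numpy.linalg.solve(A, B.dot(v))
    return v


m, t_final, R = 50, 0.1, 5
x = numpy.linspace(-1.0, 1.0, m + 1)
v_0 = numpy.cos(numpy.pi * x[1:-1] / 2.0)    # interior values only

# Gauss-Legendre rule on [-1, 1] mapped to kappa in [0, 1] (density 1)
xi, w = leggauss(R)
kappa_nodes = 0.5 * (xi + 1.0)
w = 0.5 * w

# One deterministic solve per collocation point, then a weighted sum
V = numpy.array([heat_cn(v_0, x[1] - x[0], t_final, kappa)
                 for kappa in kappa_nodes])
mean_collocation = w.dot(V)

c = numpy.pi**2 * t_final / 4.0
mean_exact = (1.0 - numpy.exp(-c)) / c * v_0
print(numpy.abs(mean_collocation - mean_exact).max())
```

Because the solution depends smoothly on $\kappa$, a handful of collocation points matches the exact mean to within the discretization error of the deterministic solver, whereas the sampling error of Monte Carlo decays only like $M^{-1/2}$.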

## Closing Remarks Regarding Stochastic Spectral Methods

What follows is a truncated list of comments from Smith section 10.2.4.

### Stochastic Galerkin

The method is optimal in the $L^2$ sense: the residual is projected onto the approximation space and is therefore minimized within that space.

The fully coupled problem, which is $J(K+1)\times J(K+1)$, can sometimes degenerate into decoupled problems of size $J \times J$ when an appropriate basis is chosen.

Sparse grid techniques may need to be used for the quadrature evaluation.

Disadvantages
   - Can only be used for densities $\rho_Q$ that have associated orthogonal polynomials.  This also implies that this method may not be suitable for general Bayesian approaches.
   - Assumes mutually independent parameters.
   - The method is intrusive

### Stochastic Collocation

Convergence analysis for collocation is based on polynomial approximation theory.

These can be constructed by using Lagrange polynomials as the basis and test functions in the Galerkin framework.

If we choose the collocation and quadrature points to be identical then the deterministic and stochastic components decouple.

As the number of collocation points change the approximation space changes as well.

The method is nonintrusive as existing solvers can be used to compute at the $M$ collocation points.

Collocation can be used with general parameter distributions, including those arising from Bayesian techniques.  This is because, unlike stochastic Galerkin, collocation does not require basis functions $\Psi$ that are orthogonal with respect to the density $\rho_Q$.

Interpolation error for $p$ parameters and $M$ collocation points is
 $$
     f - \mathcal{I}_M f = \mathcal{O}(M^{-\alpha/p})
 $$
 where $\mathcal{I}_M$ is the interpolation operator and $\alpha$ is the regularity of the solution.  This implies that the convergence rate of the method degrades with increasing dimension $p$.  In contrast the Monte Carlo convergence rate is $\mathcal{O}(M^{-1/2})$ and hence dimension independent.

### Discrete Projection

This method goes by many names including pseudospectral, nonintrusive PC (polynomial chaos), and nonintrusive spectral projection (NISP).

Like collocation, the method decouples the deterministic and stochastic components of the solution and is non-intrusive.

This method is equivalent to collocation if Lagrange polynomials are used as basis functions.

The method requires the assumption of mutually independent random variables to construct $\rho_Q$.

### Additional Reading
 - *Spectral Methods for Uncertainty Quantification* by O.P. Le Maitre and O.M. Knio