# **VISCOSITY SOLUTIONS THROUGH PENALTY METHODS**

---
---

## **1. CODE**

### Firedrake

In [1]:
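try:
    # Reuse an existing installation if there is one.
    from firedrake import *  # noqa: F401
except ImportError:
    # Otherwise install Firedrake through FEM on Colab, then import it.
    !wget "https://fem-on-colab.github.io/releases/firedrake-install-release-real.sh" -O "/tmp/firedrake-install.sh"
    !bash "/tmp/firedrake-install.sh"
    from firedrake import *  # noqa: F401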


### Other

In [2]:
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation
from IPython.display import HTML

### Burgers

In [42]:
def burgers(h=2**-10, degree=0, nu=2**0, timestep=2**-5, end_time=1.0, gamma=2**-2, space="DG"):
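    '''
    Solve Burgers' equation on the unit interval with implicit Euler time
    stepping, and return a JavaScript animation of the solution.

    N.B. nu is non-dimensionalised: it multiplies h in the viscous terms
    and gamma in the jump penalty.
    '''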
    mesh = UnitIntervalMesh(round(1/h))
    n = FacetNormal(mesh)
    x, = SpatialCoordinate(mesh)

    if space.lower() == "dg":
        V = FunctionSpace(mesh, "DG", degree)
    elif space.lower() == "cg":
        V = FunctionSpace(mesh, "CG", degree)
    else:
        raise ValueError(f"Unknown space: {space}")
    u_ = Function(V, name="VelocityOld")
    u = Function(V, name="Velocity")
    v = TestFunction(V)

    ic = assemble(interpolate((
        conditional(le(abs(x - 0.25), 0.125), 1, 0)
      + conditional(le(abs(x - 0.75), 0.125), - 0.5, 0)
    ), V))

    u_.assign(ic)
    u.assign(ic)

    jump = lambda u: -2 * avg(u * n[0])

    F = inner((u - u_)/timestep, v) * dx
    if space.lower() == "dg":  # match the case-insensitive check used to build V
        F += (
            2/3 * inner(avg(u) * jump(u), avg(v)) * dS
          + 1/3 * inner(avg(u * v), jump(u)) * dS
          + nu * gamma * inner(jump(u), jump(v)) * dS
        )
    if degree != 0:
        # Volume terms; these vanish identically for piecewise constants.
        F += inner(u * u.dx(0), v) * dx
        F += nu * h * inner(u.dx(0), v.dx(0)) * dx
    if space.lower() == "dg" and degree != 0:
        F += nu * h * (
            inner(avg(u.dx(0)), jump(v))
          + inner(jump(u), avg(v.dx(0)))
        ) * dS

    sp = {
        # "snes_monitor": None,
        # "snes_converged_reason": None,
        "snes_max_it": 100,
    }

    fig, ax = plt.subplots(figsize=(10, 6))
    state = {'t': 0.0}

    def update(frame):
        if frame > 0:
            state['t'] += timestep
            print(GREEN % f"Solving for time t = {state['t']:.4f}:")
            solve(F == 0, u, solver_parameters=sp)
            u_.assign(u)

        ax.clear()
        plot(u, axes=ax, linewidth=3)
        ax.set_title(f"Burgers equation (t = {state['t']:.2f})")
        ax.set_xlabel("x")
        ax.set_ylabel("u")
        ax.set_ylim(-1, 1.5)
        ax.grid(True)

    num_frames = int(end_time / timestep) + 1
    anim = FuncAnimation(fig, update, frames=num_frames, interval=100)
    plt.close()
    return HTML(anim.to_jshtml())

---
---

## **2. THE IDEA**

How can we meaningfully discuss numerical solutions to ideal systems, like Euler, when we're left with a dichotomy:
- **very weak solutions**, which can have arbitrarily pathological energy behaviour and exhibit a continuum of solutions, or
- **weak solutions**, which don't necessarily exist for general boundary conditions.

***Viscosity solutions***

So...
I discussed this issue a bunch with some PDE friends, and one thing that was brought to my attention was *viscosity solutions*.
Letting $(u_\nu)$ be a sequence of weak solutions to a *non*-ideal system with viscosity/dissipation $\nu$ as $\nu \to 0_+$, a very physically meaningful set of solutions to the ideal system is given by those $u$ that are accumulation points of $(u_\nu)$.

If my understanding's correct, these viscosity solutions $u$ are (in general) *very weak* solutions to the ideal system, so we might expect some anomalous energy dissipation.
But this makes a lot of sense!
Each weak solution $u_\nu$ will have some energy dissipation, so we can reasonably expect $u$ to have it too.
In fact, anomalous dissipation is a very meaningful thing to expect in very weak solutions;
it's like an artifact of the neglected high-order terms that is *in general* negligible, but still manages to contribute to the dynamics through the roughness in your data.

Viscosity solutions then give us a way to restrict to the physically meaningful set of very weak solutions.
It's fair to say, they struck me as such a wonderfully well-formed (in some sense *canonical*) way to analyse ideal systems, so I was curious how we might transfer the ideas to the discrete level.

I had some ideas about how one might approach this, that I think are best shown through example.
Naturally, this whole area isn't something with which I'm familiar, so these ideas (and the way I'm thinking about them) may already be well established in the literature; please do let me know if so!


### ***2.1. Example (Burgers)***

Let's look at Burgers' equation,
$$
\dot{u}  =  - u u_x + \nu u_{xx}.
$$

We'd like to consider the ideal limit, $\nu \to 0_+$.
A simple dimensional analysis tells us that, as $\nu \to 0_+$, the length scale for variations in $u$ is $\sim \nu$.
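
To make the $\sim \nu$ scaling concrete, here is a minimal sketch based on the classical travelling-wave (tanh) shock profile of viscous Burgers. The function names, the left/right states and the "central 80% of the jump" definition of width are all illustrative assumptions, not something taken from the notebook.

```python
import math


def travelling_wave(xi, nu, u_left=1.0, u_right=0.0):
    """Viscous Burgers shock profile in the co-moving coordinate xi = x - s*t.

    Assumes u_left > u_right; the wave moves at speed s = (u_left + u_right) / 2.
    """
    mean = 0.5 * (u_left + u_right)
    half_jump = 0.5 * (u_left - u_right)
    return mean - half_jump * math.tanh(half_jump * xi / (2.0 * nu))


def shock_width(nu, u_left=1.0, u_right=0.0, fraction=0.8):
    """Width of the region covering the central `fraction` of the jump."""
    half_jump = 0.5 * (u_left - u_right)
    return 4.0 * nu * math.atanh(fraction) / half_jump


# The ratio width / nu is the same for every nu: the shock thins linearly in nu.
for nu in (2**-4, 2**-6, 2**-8):
    print(f"nu = {nu:.5f}, width / nu = {shock_width(nu) / nu:.4f}")
```

For a fixed jump, the width is proportional to $\nu$, which is the length scale the dimensional analysis predicts.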

Now, a typical weak formulation for Burgers would be:
> Find $u \in V$ such that
> $$
> \int_\Omega\dot{u}v = - \int_\Omega \left(u u_x v + \nu u_x v_x\right)
> $$
> for all $v \in V$.

Discretely, we could take some finite-dimensional continuous space $V^h$, and define:
> Find $u \in V^h$ such that
> $$
> \int_\Omega\dot{u}v = - \int_\Omega \left(u u_x v + \nu u_x v_x\right)
> $$
> for all $v \in V^h$.

**The problem:**
As $\nu \to 0_+$, the continuous solution wants to exhibit sharp gradients in $u$ over $\sim \nu$ length scales, which can't be resolved inside $V^h$.
This means instability and inaccurate solutions as we take $\nu \to 0_+$.

**The solution:**
Use a non-conforming method;
they can resolve arbitrarily steep gradients after all.
In particular, let's consider a DG method with a symmetric interior penalty.
Denote the vertices between elements as $\mathcal{V}^h$, with jumps $[\![\cdot]\!]$ and averages $\{\!\!\{\cdot\}\!\!\}$:

> Find $u \in V^h$ such that
> \begin{multline*}
> \int_\Omega\dot{u}v = - \int_\Omega u u_x v - \sum_{\mathcal{V}^h}\left(\frac{2}{3}\{\!\!\{u\}\!\!\}\{\!\!\{v\}\!\!\} + \frac{1}{3}\{\!\!\{uv\}\!\!\}\!\right)\![\![u]\!]  \\
> - \nu \int_\Omega u_x v_x - \nu\sum_{\mathcal{V}^h}\left(\frac{\sigma}{h}[\![u]\!][\![v]\!] + \{\!\!\{u_x\!\}\!\!\}[\![v]\!] + [\![u]\!]\{\!\!\{v_x\!\}\!\!\}\right)
> \end{multline*}
> for all $v \in V^h$.

That funny form of the convective term is just there to ensure skew-symmetry.
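
As a quick numerical check of that claim, the sketch below evaluates the vertex sum for piecewise-constant data, where the volume term $\int_\Omega u u_x v$ vanishes and the vertex sum is the whole convective term. It assumes a periodic mesh and the convention $[\![u]\!] = u_\text{right} - u_\text{left}$; both are simplifying assumptions, and `convective_vertex_sum` is a hypothetical helper name.

```python
import random


def convective_vertex_sum(u, v):
    """Sum over vertices of (2/3 {u}{v} + 1/3 {uv}) [u] for DG0 data on a periodic mesh."""
    n = len(u)
    total = 0.0
    for i in range(n):
        j = (i + 1) % n                      # cell to the right of vertex i (periodic)
        avg_u = 0.5 * (u[i] + u[j])
        avg_v = 0.5 * (v[i] + v[j])
        avg_uv = 0.5 * (u[i] * v[i] + u[j] * v[j])
        jump_u = u[j] - u[i]                 # assumed convention: right minus left
        total += (2 / 3 * avg_u * avg_v + 1 / 3 * avg_uv) * jump_u
    return total


random.seed(0)
u = [random.uniform(-1.0, 1.0) for _ in range(64)]
energy_rate = convective_vertex_sum(u, u)            # test with v = u
mass_rate = convective_vertex_sum(u, [1.0] * 64)     # test with v = 1
print(energy_rate, mass_rate)
```

Both sums telescope to zero (up to round-off): with $v = u$ each vertex contributes $\tfrac{1}{3}(u_\text{right}^3 - u_\text{left}^3)$, and with $v = 1$ it contributes $\tfrac{1}{2}(u_\text{right}^2 - u_\text{left}^2)$. So the convective term neither creates nor destroys energy or mass.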

Considering $\nu \to 0_+$, there are two equivalent ways to proceed here, with the same outcome:

---

***Idea 1***

One way to derive a non-conforming discretisation is via mollification.
Take a mollification operator $\mathcal{M}_{h/\sigma}$, which mollifies over a distance $h/\sigma$.
We consider our conforming weak form, then instead of taking $u = u^h$, $v = v^h$ for conforming $u^h$, $v^h$, we take $u = \mathcal{M}_{h/\sigma}[u^h]$, $v = \mathcal{M}_{h/\sigma}[v^h]$ for non-conforming $u^h$, $v^h$.
One then discards all terms that are $o(1)$ in $\sigma$, and gets the non-conforming discretisation above.
Easy peasy.

From our dimensional analysis on the continuous level, we know we should expect gradients over a length scale $\sim \nu$.
So to accurately reproduce the dynamics of the continuous level, it makes sense to mollify over a length scale $\sim \nu$.
So let's take $h / \sigma \sim \nu$.

***Idea 2***

We'd like to take $\nu \to 0_+$.
But if we do this without scaling $\sigma$, then the viscous term will vanish entirely, which (i) is not appropriate for viscosity solutions, and (ii) makes the problem lose its well-posedness.
So how do we scale $\sigma$ with $\nu$?

We'd like the leading vertex contributions of $\int_\Omega u u_x v$ and $\nu \int_\Omega u_x v_x$ to balance as $\nu \to 0_+$.
This necessitates $h / \sigma \sim \nu$.

---

Taking $h / \sigma = \nu / \gamma$ for some $\gamma$ then, we have the following:

> Find $u \in V^h$ such that
> \begin{multline*}
> \int_\Omega\dot{u}v = - \int_\Omega u u_x v - \sum_{\mathcal{V}^h}\left(\frac{2}{3}\{\!\!\{u\}\!\!\}\{\!\!\{v\}\!\!\} + \frac{1}{3}\{\!\!\{uv\}\!\!\}\!\right)\![\![u]\!]  \\
> - \nu \int_\Omega u_x v_x - \sum_{\mathcal{V}^h}\left(\gamma[\![u]\!][\![v]\!] + \nu\{\!\!\{u_x\!\}\!\!\}[\![v]\!] + \nu[\![u]\!]\{\!\!\{v_x\!\}\!\!\}\right)
> \end{multline*}
> for all $v \in V^h$.

We can now safely push $\nu \to 0_+$ and expect solutions not to become unstable, but to resemble viscosity solutions:

> Find $u \in V^h$ such that
> \begin{multline*}
> \int_\Omega\dot{u}v = - \int_\Omega u u_x v - \sum_{\mathcal{V}^h}\left(\frac{2}{3}\{\!\!\{u\}\!\!\}\{\!\!\{v\}\!\!\} + \frac{1}{3}\{\!\!\{uv\}\!\!\}\!\right)\![\![u]\!]  \\
> - \gamma\sum_{\mathcal{V}^h}[\![u]\!][\![v]\!]
> \end{multline*}
> for all $v \in V^h$.

For stability, it makes sense to ensure $u \mapsto \sum_{\mathcal{V}^h}[\![u]\!]^2$ defines a norm on $V^h$, which is the case when $V^h$ is $\text{DG}_0$.

N.B. Naturally, for $\text{DG}_0$ all the spatial derivative objects vanish anyway, so we didn't really need to do the step where we push $\nu \to 0_+$.
However, I think it's important to do it in this order, to show that (in some sense) we really are solving for viscosity solutions, i.e. this is really a discretisation with (in some sense) $\nu \to 0_+$.
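
To see what the $\text{DG}_0$ scheme looks like cell by cell, here is a minimal sketch without Firedrake. Testing against the indicator function of cell $i$ on a uniform mesh, and taking $[\![u]\!]$ as right minus left, the scheme appears to reduce to $h\dot{u}_i = -\tfrac{1}{6}(u_{i-1} + u_i + u_{i+1})(u_{i+1} - u_{i-1}) + \gamma(u_{i+1} - 2u_i + u_{i-1})$. The periodic boundary, the explicit Euler stepping (the notebook uses implicit Euler) and the step size are assumptions made to keep the example short.

```python
def dg0_burgers_rhs(u, h, gamma):
    """du_i/dt for the DG0 penalty scheme on a periodic uniform mesh."""
    n = len(u)
    rhs = []
    for i in range(n):
        left, mid, right = u[i - 1], u[i], u[(i + 1) % n]
        convection = (left + mid + right) * (right - left) / 6.0  # skew-symmetric form
        penalty = gamma * (right - 2.0 * mid + left)              # from the jump penalty
        rhs.append((penalty - convection) / h)
    return rhs


def step(u, h, gamma, dt):
    """One explicit Euler step (an assumption; not the notebook's time stepping)."""
    return [ui + dt * ri for ui, ri in zip(u, dg0_burgers_rhs(u, h, gamma))]


n = 128
h = 1.0 / n
centres = [(i + 0.5) * h for i in range(n)]
# Same two-bump initial condition as in `burgers`, sampled at cell centres.
u0 = [(1.0 if abs(x - 0.25) <= 0.125 else 0.0)
      + (-0.5 if abs(x - 0.75) <= 0.125 else 0.0) for x in centres]

u = list(u0)
for _ in range(n):  # n steps of size 0.1 * h, i.e. up to t = 0.1
    u = step(u, h, gamma=0.25, dt=0.1 * h)

print("mass drift:", sum(u) - sum(u0))
print("energy before/after:", h * sum(x * x for x in u0), h * sum(x * x for x in u))
```

Written this way, the penalty is a second difference scaled by $\gamma$, so it acts like a viscosity of size $\gamma h$, while the convective part conserves mass and energy on its own.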

In [37]:
burgers(h=2**-10, space="DG", degree=0, gamma=2**-2)

Viscosity solutions to Burgers' should have shocks whose speed satisfies the Rankine–Hugoniot condition,
$$
u_{\text{shock}} = \frac{1}{2}u_+ + \frac{1}{2}u_-,
$$
where $u_\pm$ are the states on either side of the shock.
We see that this is indeed satisfied by our numerical solutions!
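
For reference, the expected shock speeds for the initial condition used in `burgers` can be worked out directly. The sketch below is illustrative: the helper names are hypothetical, and the test $u_- > u_+$ is the standard Lax entropy condition for Burgers, which is an addition here rather than something stated above.

```python
def burgers_flux(u):
    """Flux f(u) = u^2 / 2 of inviscid Burgers."""
    return 0.5 * u * u


def shock_speed(u_minus, u_plus):
    """Rankine-Hugoniot speed (f(u+) - f(u-)) / (u+ - u-), which equals (u+ + u-) / 2."""
    return (burgers_flux(u_plus) - burgers_flux(u_minus)) / (u_plus - u_minus)


def is_shock(u_minus, u_plus):
    """Lax entropy condition for Burgers: the left state must be the larger one."""
    return u_minus > u_plus


# Discontinuities of the initial condition, as (position, u_minus, u_plus).
discontinuities = [
    (0.125, 0.0, 1.0),
    (0.375, 1.0, 0.0),
    (0.625, 0.0, -0.5),
    (0.875, -0.5, 0.0),
]
shocks = {x0: shock_speed(um, up) for x0, um, up in discontinuities if is_shock(um, up)}
print(shocks)
```

Only the discontinuities at $x = 0.375$ and $x = 0.625$ are shocks; the other two open into rarefaction fans.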

It's worth noting that the final system is going to be computationally equivalent to doing the conforming discretisation on $\text{CG}_1$ with $\nu = \gamma h$.
But again, this kind of interpretation doesn't make you think you're doing viscosity solutions.

In [51]:
burgers(h=2**-10, space="CG", degree=1, nu=2**0)  # N.B. nu is non-dimensionalised in this function; this "nu" is nu / (gamma * h)

### ***2.2. Example (Navier–Stokes/Euler)***

So, if we can do Burgers, we can do Navier–Stokes:
$$
\dot{\mathbf{u}}  =  - \, \mathbf{u} \cdot \nabla\mathbf{u} - \nabla p + 2\nu \, \mathrm{div} \, \nabla_\mathrm{s} \mathbf{u},  \qquad
0  =  \mathrm{div}\,\mathbf{u}.
$$

A typical weak formulation would be:
> Find $(\mathbf{u}, p) \in V \times Q$ such that
>
> \begin{align*}
> \int_\Omega\dot{\mathbf{u}}\cdot\mathbf{v}  &=  \frac{1}{2} \int_\Omega [(\mathbf{u}\cdot\nabla\mathbf{v})\cdot\mathbf{u} - (\mathbf{u}\cdot\nabla\mathbf{u})\cdot\mathbf{v}] + \int_\Omega p \, (\mathrm{div}\,\mathbf{v}) \\
> &\qquad\qquad\qquad\qquad\qquad\qquad\qquad- 2\nu\int_\Omega \nabla_\mathrm{s}\mathbf{u} : \nabla_\mathrm{s}\mathbf{v},  \\
> 0  &=  \int_\Omega (\mathrm{div}\,\mathbf{u}) \, q,
> \end{align*}
>
> for all $(\mathbf{v}, q) \in V \times Q$.

We go through all the same steps then:
1. Cast into a conforming discretisation.
2. Cast into a non-conforming discretisation, by mollifying over a distance $\sim \nu$.
3. Take $\nu \to 0_+$.

Denoting the facets by $\mathcal{F}^h$, this gives us a discretisation for Euler viscosity solutions:
> Find $(\mathbf{u}, p) \in V^h \times Q^h$ such that
>
> \begin{align*}
> \int_\Omega\dot{\mathbf{u}}\cdot\mathbf{v}  &=  \frac{1}{2} \int_\Omega [(\mathbf{u}\cdot\nabla\mathbf{v})\cdot\mathbf{u} - (\mathbf{u}\cdot\nabla\mathbf{u})\cdot\mathbf{v}] + \int_\Omega p \, (\mathrm{div}\,\mathbf{v}) \\
> &\qquad\qquad\qquad\qquad+ \frac{1}{2} \int_{\mathcal{F}^h} [(\{\!\!\{\mathbf{u}\}\!\!\}\cdot[\![\mathbf{n}\otimes\mathbf{v}]\!])\cdot\{\!\!\{\mathbf{u}\}\!\!\} - (\{\!\!\{\mathbf{u}\}\!\!\}\cdot[\![\mathbf{n}\otimes\mathbf{u}]\!])\cdot\{\!\!\{\mathbf{v}\}\!\!\}] \\
> &\qquad\qquad\qquad\qquad\qquad\qquad\qquad- \gamma\int_{\mathcal{F}^h} [\![\mathbf{u}]\!] \cdot [\![\mathbf{v}]\!],  \\
> 0  &=  \int_\Omega (\mathrm{div}\,\mathbf{u}) \, q,
> \end{align*}
>
> for all $(\mathbf{v}, q) \in V^h \times Q^h$.

Again, it's wise to ensure $\mathbf{u} \mapsto \int_{\mathcal{F}^h} \|[\![\mathbf{u}]\!]\|^2$ defines a norm on the divergence-free subspace $U^h$ of $V^h$, which is again the case when $U^h$ is of degree 0.
We can ensure this by taking $V^h = \text{RT}_1$, $Q^h = \text{DG}_0$, for which $U^h \subsetneq [\text{DG}_0]^d$.

I think it's super interesting how going through this discrete lens can offer a pretty intuitive understanding of some of the ideas behind why very weak Euler solutions can have anomalous dissipation, without having to go through so much of the intense analysis.
If it just so happened that $\mathbf{u}$ were continuous, then there wouldn't be any anomalous energy dissipation.
These I guess would correspond to *weak* (as opposed to *very weak*) solutions, where enough regularity implies no anomalous dissipation.
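
To spell this out, take $\mathbf{v} = \mathbf{u}$ and $q = p$ in the discretisation above, assuming boundary conditions for which no boundary terms appear. The volume and facet convective terms each cancel pairwise, and the pressure term vanishes by the divergence constraint, which should leave the energy balance

$$
\frac{\mathrm{d}}{\mathrm{d}t}\,\frac{1}{2}\int_\Omega \|\mathbf{u}\|^2  =  - \, \gamma\int_{\mathcal{F}^h} \|[\![\mathbf{u}]\!]\|^2  \le  0.
$$

The right-hand side is zero exactly when $\mathbf{u}$ has no jumps across facets, so all of the dissipation comes from the discontinuities.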

***Gradient jump penalisation***

Just an interesting thing I was thinking...
Suppose we started with Navier–Stokes with a small 4th-order term,
$$
\dot{\mathbf{u}}  =  - \, \mathbf{u} \cdot \nabla\mathbf{u} - \nabla p + 2\nu \, \mathrm{div} \, \nabla_\mathrm{s} \mathbf{u} + \varepsilon \, \mathrm{div} \, \nabla_\mathrm{s} \, \mathrm{div} \, \nabla_\mathrm{s} \mathbf{u},  \qquad
0  =  \mathrm{div}\,\mathbf{u}.
$$
We may then consider "viscosity solutions" as we take $\varepsilon \to 0_+$.
If we go through my steps but using a *continuous* discrete space $V^h$, I believe we're motivated to introduce a dissipation term like
$$
\gamma \int_{\mathcal{F}^h} [\![\nabla_\mathrm{s}\mathbf{u}]\!] : [\![\nabla_\mathrm{s}\mathbf{v}]\!],
$$
mimicking the gradient jump penalisation (GJP) method that's used to stabilise Navier–Stokes at high Reynolds numbers.

### ***2.3. Example (Allen–Cahn)***

As something different, we can consider stationary states for Allen–Cahn,
$$
W'(u)  =  \varepsilon \Delta u.
$$
"Viscosity solutions" as $\varepsilon \to 0_+$ here have some very nice connections with minimal surfaces; however, the ideal system $W'(u) = 0$ at $\varepsilon = 0$ is *super* ill-posed.

So, a typical weak formulation would be:
> Find $u \in V$ such that
>
> $$
> \int_\Omega W'(u)v  =  - \, \varepsilon \int_\Omega \nabla u \cdot \nabla v,
> $$
>
> for all $v \in V$.

The difference between Allen–Cahn and Burgers/Navier–Stokes is that the characteristic length scale for variations in $u$ is $\sqrt{\varepsilon}$, since there's no convective term to balance.
We may go through all the same steps though:
1. Cast into a conforming discretisation.
2. Cast into a non-conforming discretisation, by mollifying over a distance $\sim \sqrt{\varepsilon}$.
3. Take $\varepsilon \to 0_+$.

The funny thing is that the important facet term, the one we can't discard, is $\sim \varepsilon$.
So there's always some $\varepsilon$ dependence:
> Find $u \in V^h$ such that
>
> $$
> \int_\Omega W'(u)v = - \, \gamma\varepsilon \int_{\mathcal{F}^h} [\![u]\!][\![v]\!],
> $$
>
> for all $v \in V^h$.

Again, we can take $V^h = \mathrm{DG}_0$.
I guess one then just finds viscosity solutions by manually taking $\varepsilon \to 0_+$ until ill-conditioning dominates.
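
To get a feel for the $\varepsilon$ dependence, here is a minimal sketch of the $\text{DG}_0$ residual in 1D. It assumes a uniform mesh, interior vertices only, and the double well $W(u) = \tfrac{1}{4}(u^2 - 1)^2$; none of these choices is fixed by the discussion above, and the function names are hypothetical.

```python
def double_well_derivative(u):
    """W'(u) for the assumed double well W(u) = (u^2 - 1)^2 / 4."""
    return u * (u * u - 1.0)


def allen_cahn_residual(u, h, eps, gamma):
    """Per-cell residual h * W'(u_i) + gamma * eps * (jump terms at the cell's vertices)."""
    n = len(u)
    residual = []
    for i in range(n):
        jumps = 0.0
        if i > 0:                    # interior vertex on the left of cell i
            jumps += u[i] - u[i - 1]
        if i < n - 1:                # interior vertex on the right of cell i
            jumps += u[i] - u[i + 1]
        residual.append(h * double_well_derivative(u[i]) + gamma * eps * jumps)
    return residual


h, eps, gamma = 0.25, 1e-3, 1.0
constant_state = allen_cahn_residual([1.0, 1.0, 1.0, 1.0], h, eps, gamma)
sharp_interface = allen_cahn_residual([-1.0, -1.0, 1.0, 1.0], h, eps, gamma)
print(constant_state, sharp_interface)
```

Under these assumptions, a sharp interface between the two wells leaves a residual of size $2\gamma\varepsilon$ in the two neighbouring cells and zero elsewhere. So every arrangement of $\pm 1$ states becomes an approximate solution as $\varepsilon \to 0_+$, and only the $\varepsilon$-dependent penalty distinguishes between them.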

<!-- I'd have to run some experiments on this.
If we're including $\sim \varepsilon$ terms now, maybe we should introduce one on the LHS:
> Find $u \in V^h$ such that
>
> $$
> \int_\Omega W'(u)v + \varepsilon \int_{\mathcal{F}^h} W'(\{\!\!\{u\}\!\!\})\{\!\!\{v\}\!\!\} = - \, \gamma\varepsilon \int_{\mathcal{F}^h} [\![u]\!][\![v]\!],
> $$
>
> for all $v \in V^h$. -->

### ***2.4. Example (implicit Euler)***

These same ideas work for time discretisations.
It's seemingly less useful, but I think it's interesting, as we get implicit Euler out at the end of the day!

As a super simple example, say we're considering a gradient-descent system
$$
\dot{\mathbf{x}}  =  - \nabla E(\mathbf{x}).
$$
We want to know the long-term behaviour, say on the order of $t \sim \tfrac{1}{\tau}$ for some small $\tau \to 0_+$, so we scale the time dimension,
$$
\tau\dot{\mathbf{x}}  =  - \nabla E(\mathbf{x}),
$$
and seek to take a timestep $\sim 1$.
This then resembles all our other singularly perturbed systems, with the small coefficient $\tau$ multiplying the highest-order derivatives (here, those in time).
"Viscosity solutions" here are just steady states.

Variations in $\mathbf{x}$ in the above system will occur on a time scale $\sim \tau$, so we can apply the same ideas:
1. Cast into a conforming discretisation in time (say continuous Petrov–Galerkin in time, although this step is kind of redundant).
2. Cast into a non-conforming discretisation (say discontinuous Galerkin in time) by mollifying over a time $\sim \tau$;
up to quadrature in time, this ends up being pretty much equivalent to a $\theta$ method, where $\theta = 1 - \tau$.
3. Take $\tau \to 0_+$;
this naturally ends up just being implicit Euler, where we're solving for $\mathbf{x}(1)$ such that $\nabla E(\mathbf{x}(1)) = \mathbf{0}$.
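
The three steps above can be tried on a toy problem. The sketch below assumes the quadratic energy $E(x) = \tfrac{1}{2}kx^2$ with a hypothetical stiffness $k$, so that the $\theta$ method can be solved in closed form; $\theta = 1$ is taken to mean implicit Euler, matching the convention above.

```python
import math


def theta_step(x0, tau, k, theta, dt=1.0):
    """One theta-method step for tau * dx/dt = -k * x (theta = 1 is implicit Euler)."""
    # tau * (x1 - x0) / dt = -k * ((1 - theta) * x0 + theta * x1), solved for x1.
    return x0 * (tau - dt * k * (1.0 - theta)) / (tau + dt * k * theta)


k = 2.0  # hypothetical stiffness of the assumed energy E(x) = k * x**2 / 2
for tau in (1e-1, 1e-2, 1e-3):
    x_theta = theta_step(1.0, tau, k, theta=1.0 - tau)
    x_implicit = theta_step(1.0, tau, k, theta=1.0)
    x_exact = math.exp(-k / tau)
    print(f"tau = {tau:g}: theta method = {x_theta:.3e}, "
          f"implicit Euler = {x_implicit:.3e}, exact = {x_exact:.3e}")
```

With a unit timestep, both the $\theta = 1 - \tau$ method and implicit Euler land within $O(\tau)$ of the steady state $x = 0$, which is where the exact solution ends up as $\tau \to 0_+$.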

All this logic ends up saying is that we should use implicit Euler for stiff systems.
Not really informative in and of itself, but it is interesting that it reproduces this classical result, and it does, I guess, provide some motivation that what I'm suggesting makes some sense.

---
---