# Automatically identifying governing partial differential equations with reverse finite differencing

So far, most "higher level" physical knowledge has been derived from simpler first principles. However, this approach has not yet produced a full understanding of turbulent flow. It is therefore of interest to find a different quantity, governed by some conservation or transport equation, that is simpler to solve, e.g., one that does not have the scale resolution requirements of a direct numerical simulation.

As a first step towards this goal, we want to see if the process of deriving physical laws or theories can be automated, i.e., can we posit a generic homogeneous PDE, such as
$$
A \frac{\partial u}{\partial t} = B \frac{\partial u}{\partial x} + C \frac{\partial^2 u}{\partial x^2} + \dots
$$
then solve for the coefficients, eliminating the terms for which coefficients are small?
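
One way to make "solve for the coefficients" concrete is sketched below. It matches the matrix approach used in the code cells of this notebook; the symbols $N$ and $\mathbf{c}$ are notation introduced here for illustration. Evaluating the candidate equation at $N$ sample points $(x_k, t_k)$ and moving every term to the left-hand side gives one linear equation per point:
$$
A \, u_t(x_k, t_k) - B \, u_x(x_k, t_k) - C \, u_{xx}(x_k, t_k) - \dots = 0, \qquad k = 1, \dots, N
$$
Stacking these rows gives a homogeneous linear system
$$
K \mathbf{c} = \mathbf{0}, \qquad
K = \begin{bmatrix}
u_t(x_1, t_1) & -u_x(x_1, t_1) & -u_{xx}(x_1, t_1) & \cdots \\
\vdots & \vdots & \vdots & \\
u_t(x_N, t_N) & -u_x(x_N, t_N) & -u_{xx}(x_N, t_N) & \cdots
\end{bmatrix}, \qquad
\mathbf{c} = \begin{bmatrix} A \\ B \\ C \\ \vdots \end{bmatrix}
$$
so any nonzero coefficient vector $\mathbf{c}$ lies in the null space of $K$. Because the system is homogeneous, $\mathbf{c}$ is only determined up to an overall scale, which is why it is natural to normalize by one coefficient (for example, setting $A = 1$).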

For a first example, we analyze the heat (diffusion) equation, $u_t = \nu u_{xx}$, for which an analytical solution can be obtained.
We posit a candidate equation with two terms we know apply ($u_t$ and $u_{xx}$) and three we know do not ($u_x$, $uu_x$, and a constant):

$$
A u_t = B u_x + C u_{xx} + D uu_x + E
$$

with the initial condition
$$
u(x, 0) = \sin(4\pi x)
$$

There exists an analytical solution
$$
u(x,t) = \sin(4\pi x) e^{-16\pi^2 \nu t}
$$

Differentiating this solution gives each partial derivative that appears in the candidate equation:
$$
u_t = -16\pi^2 \nu \sin(4\pi x) e^{-16\pi^2 \nu t}
$$
$$
u_x = 4\pi \cos(4\pi x) e^{-16\pi^2 \nu t}
$$
$$
u_{xx} = -16 \pi^2 \sin(4\pi x) e^{-16\pi^2 \nu t}
$$
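
As a quick consistency check (a worked step added here, not part of the original derivation), comparing the expressions for $u_t$ and $u_{xx}$ shows that they differ only by the factor $\nu$:
$$
u_t = -16\pi^2 \nu \sin(4\pi x) e^{-16\pi^2 \nu t} = \nu \left[ -16 \pi^2 \sin(4\pi x) e^{-16\pi^2 \nu t} \right] = \nu u_{xx}
$$
So, up to an overall scale, the coefficients we expect to recover for the candidate equation are $A = 1$, $C = \nu$, and $B = D = E = 0$.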

In [3]:
import numpy as np

# Specify diffusion coefficient
nu = 0.1

# Initialize u, x, and t vectors
xmax = 1.0
dx = xmax/50
x = np.arange(0, xmax, dx)
tmax = 0.2
dt = tmax/100
t = np.arange(0, tmax, dt)
u = np.zeros((len(t), len(x))) # rows are timesteps

# Compute analytical solution
for n in range(len(t)):
    u[n, :] = np.sin(4*np.pi*x)*np.exp(-16*np.pi**2*nu*t[n])

In [4]:
# Create vectors for partial derivatives
u_t = np.zeros(u.shape)
u_x = np.zeros(u.shape)
u_xx = np.zeros(u.shape)

for n in range(len(t)):
    u_t[n, :] = -16*np.pi**2*nu*np.sin(4*np.pi*x)*np.exp(-16*np.pi**2*nu*t[n])
    u_x[n, :] = 4*np.pi*np.cos(4*np.pi*x)*np.exp(-16*np.pi**2*nu*t[n])
    u_xx[n, :] = -16*np.pi**2*np.sin(4*np.pi*x)*np.exp(-16*np.pi**2*nu*t[n])
    
# Compute a convective term (that we know should have no effect)
uu_x = u*u_x

# Check to make sure some random point satisfies the PDE
i,j = 15,21
print(u_t[i,j] - nu*u_xx[i,j])

0.0


In [6]:
"""Try a solution using a matrix"""

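def null(A, eps=1e-15):
    """Return a basis (as columns) for the null space of A."""
    # Rows of vh whose singular value is <= eps (an absolute threshold) span it
    _, s, vh = np.linalg.svd(A)
    null_space = np.compress(s <= eps, vh, axis=0)
    return null_space.T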

# Pick some indices out of the field -- somewhat random points in space and time
i, j = 1, 32
i1, j1 = 2, 40
i2, j2 = 7, 23
i3, j3 = 32, 12
i4, j4 = 13, 31

# Build K with one row per sample point and one column per term, signed so
# that K @ [A, B, C, D, E] = 0 for the coefficients of the candidate equation
K = np.array([[u_t[i,j], -u_x[i,j], -u_xx[i,j], -uu_x[i,j], -1.0],
              [u_t[i1,j1], -u_x[i1,j1], -u_xx[i1,j1], -uu_x[i1,j1], -1.0],
              [u_t[i2,j2], -u_x[i2,j2], -u_xx[i2,j2], -uu_x[i2,j2], -1.0],
              [u_t[i3,j3], -u_x[i3,j3], -u_xx[i3,j3], -uu_x[i3,j3], -1.0],
              [u_t[i4,j4], -u_x[i4,j4], -u_xx[i4,j4], -uu_x[i4,j4], -1.0]
             ])
M = null(K)
print(M.T/M[0])  # normalize so that A = 1

[[  1.00000000e+00   1.00165863e-17   1.00000000e-01   5.52825932e-18
   -2.12841927e-16]]


After normalizing by the first coefficient, the entries at round-off level (about $10^{-16}$ or smaller) can be treated as zero, so the data fit the equation

$$
u_t = 0u_x + 0.1u_{xx} + 0uu_x + 0, 
$$
or
$$
u_t = \nu u_{xx}
$$

which is the actual governing equation!

In [9]:
"""Now let's automate the matrix creation process a bit using random indices"""
nterms = 5 # total number of terms in the equation
ni, nj = u.shape
K = np.zeros((nterms, nterms))

for n in range(nterms):
    # Pick a random point in time (i) and space (j)
    i = np.random.randint(ni)
    j = np.random.randint(nj)
    K[n,0] = u_t[i,j]
    K[n,1] = -u_x[i,j]
    K[n,2] = -u_xx[i,j]
    K[n,3] = -uu_x[i,j]
    K[n,4] = -1.0
    
M = null(K)
print(M.T/M[0])

[[  1.00000000e+00   3.63389004e-16   1.00000000e-01   2.05776903e-15
    2.67294524e-16]]
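
The square 5x5 system above relies on an absolute singular-value threshold, and it can break down if two random rows coincide. A minimal sketch of one possible alternative follows, under stated assumptions: it samples many more points than there are terms and takes the right singular vector belonging to the smallest singular value. This is a swapped-in technique, not the `null` function used above. The helper name `fit_coefficients` and its parameters `n_samples` and `seed` are hypothetical, and the analytical fields are rebuilt so the block runs on its own.

```python
import numpy as np

def fit_coefficients(columns, n_samples=200, seed=0):
    """Illustrative helper: estimate coefficients from sampled term values.

    columns: list of equally shaped arrays, one per candidate term, signed so
             that sum(coefficient * column) is zero at every point.
    Returns the coefficient vector scaled so that its first entry is 1.
    """
    rng = np.random.default_rng(seed)
    flat = np.stack([np.ravel(c) for c in columns], axis=1)  # (npoints, nterms)
    rows = rng.choice(flat.shape[0], size=n_samples, replace=False)
    K = flat[rows]  # overdetermined: n_samples rows, nterms columns
    # Singular values are sorted in descending order, so the last row of vh
    # is the direction that K maps closest to zero.
    _, _, vh = np.linalg.svd(K, full_matrices=False)
    c = vh[-1]
    return c / c[0]

# Rebuild the analytical fields (rows are timesteps, as in the cells above)
nu = 0.1
x = np.linspace(0.0, 1.0, 50, endpoint=False)
t = np.linspace(0.0, 0.2, 100, endpoint=False)
decay = np.exp(-16*np.pi**2*nu*t)[:, None]
u = np.sin(4*np.pi*x)[None, :]*decay
u_x = 4*np.pi*np.cos(4*np.pi*x)[None, :]*decay
u_xx = -16*np.pi**2*u
u_t = nu*u_xx

terms = [u_t, -u_x, -u_xx, -u*u_x, -np.ones_like(u)]
coeffs = fit_coefficients(terms)
print(coeffs)  # order: A, B, C, D, E
```

Using more rows than unknowns avoids depending on any particular five points, at the cost of a slightly larger SVD.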


## Would this method work with experimental data?

Imagine we had various "probes" on a physical model of a heat-conducting system, which were sampled in time.

We can sample the analytical solution at specified points, but add some Gaussian noise to approximate a sensor in the real world. Can we still recover the heat equation?
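
One way this experiment could be set up is sketched below, under several assumptions that are not part of the original notebook: the "probes" cover the full space-time grid, the derivatives are estimated from the sampled data with `np.gradient` (central finite differences) rather than taken analytically, and the coefficients come from the right singular vector with the smallest singular value. The function name `recover_from_samples` and the parameters `noise_std` and `seed` are hypothetical.

```python
import numpy as np

def recover_from_samples(noise_std=0.0, nu=0.1, seed=0):
    """Illustrative sketch: fit A, B, C, D, E from (optionally noisy) samples."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, 50, endpoint=False)
    t = np.linspace(0.0, 0.2, 100, endpoint=False)
    dx, dt = x[1] - x[0], t[1] - t[0]

    # "Measured" field: analytical solution plus Gaussian sensor noise
    u = np.sin(4*np.pi*x)[None, :]*np.exp(-16*np.pi**2*nu*t)[:, None]
    u = u + noise_std*rng.standard_normal(u.shape)

    # Derivatives estimated from the data alone (rows are timesteps)
    u_t = np.gradient(u, dt, axis=0)
    u_x = np.gradient(u, dx, axis=1)
    u_xx = np.gradient(u_x, dx, axis=1)

    # Drop edge points, where np.gradient falls back to one-sided differences
    inner = (slice(1, -1), slice(2, -2))
    cols = [u_t[inner], -u_x[inner], -u_xx[inner],
            -(u*u_x)[inner], -np.ones_like(u[inner])]
    K = np.stack([c.ravel() for c in cols], axis=1)

    # Last row of vh belongs to the smallest singular value of K
    _, _, vh = np.linalg.svd(K, full_matrices=False)
    c = vh[-1]
    return c / c[0]  # scale so that A = 1

clean = recover_from_samples(noise_std=0.0)
noisy = recover_from_samples(noise_std=1e-3)
print(clean)
print(noisy)
```

Even without noise, the recovered diffusion coefficient should differ slightly from $\nu$ because of finite-difference truncation error (on the order of a couple of percent on this grid). With noise, differencing amplifies the error, roughly in proportion to `noise_std/dx**2` for the second derivative, so some smoothing or a more robust derivative estimate would likely be needed before the small coefficients can be trusted.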

## Future applications

The example presented here may be trivial, as we examined the known analytical solution to a linear PDE. It would of course be of interest to apply this to a real-world problem. For example, we might apply this technique to a direct numerical simulation of turbulent flow. After either Reynolds averaging or filtering (as in large eddy simulation) the "exact" numerical solution, we may be able to find new PDEs that describe the system in terms of these lower-resolution quantities.