In [None]:
import tqdm
import numpy as np
import scipy
import sympy
import matplotlib.pyplot as plt
from mpl_toolkits import mplot3d

## The Lorenz system

Because we practically have to.
Lorenz originally devised this system as a simplified model of atmospheric convection.
The unknowns don't actually represent physical space as such but that's how everyone writes it.
The Lorenz system is:
$$\begin{align}
\dot x & = \sigma(y - x) \\
\dot y & = x(\rho - z) - y \\
\dot z & = xy - \beta z
\end{align}$$
where $\sigma$, $\rho$, and $\beta$ are parameters proportional, respectively, to the Prandtl number, the Rayleigh number, and certain physical dimensions of the fluid layer.
Here we'll simulate the Lorenz system using several methods.

### Forming the problem

Make the right-hand side of the system using sympy.
Since this is a vector problem, it's worth thinking about how we account for this using the symbolic algebra system.
In principle, we could use a Python list of symbolic expressions, but some of the bookkeeping will then be up to us.
Here I've suggested that you put this inside a call to sympy.Matrix because it'll do the bookkeeping for us.
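
To see what that bookkeeping buys us, here's a minimal sketch on a made-up two-variable system (a damped oscillator, not the Lorenz system). The names `p`, `q`, `c`, `g_list`, and `g` are placeholders I've picked purely for illustration.

```python
import sympy

# Hypothetical toy system: p' = q, q' = -p - c * q (a damped oscillator).
p, q = sympy.symbols("p q", real=True)
c = sympy.symbols("c", real=True, positive=True)

g_list = [q, -p - c * q]     # plain Python list of symbolic expressions
g = sympy.Matrix(g_list)     # the same expressions as a 2 x 1 symbolic matrix

# The Matrix knows its own shape and substitutes into every entry at once.
print(g.shape)
print(g.subs({p: 1, q: 2, c: 3}))
```

With the plain list, each of these operations would need its own loop or comprehension.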

In [None]:
x, y, z = sympy.symbols("x y z", real=True)
σ, ρ, β = sympy.symbols("σ ρ β", real=True, positive=True)

f = sympy.Matrix(
    ...
)
f

Compute the derivative of the right-hand side of the system.
Remember that (1) the derivative of a vector-valued function is a matrix, and (2) derivatives with respect to the same variable all go in the same column of this matrix, while each row collects the derivatives of one component of $f$.
It's worth doing this by hand first and then checking with sympy after.
Hint: you can do this with one call to [f.jacobian](https://docs.sympy.org/latest/modules/matrices/matrices.html#sympy.matrices.matrices.MatrixCalculus.jacobian).
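
If you want to check the layout convention before trying it on the Lorenz system, here's a small sketch on a made-up function of two variables. The symbols `p`, `q` and the function `g` are placeholders of mine. Entry $(i, j)$ of the result is the derivative of component $i$ with respect to variable $j$.

```python
import sympy

p, q = sympy.symbols("p q", real=True)

# Hypothetical toy function with two components.
g = sympy.Matrix([p * q, p + q**2])

# Pass the variables in the order you want the columns to appear.
dg = g.jacobian([p, q])
print(dg)

# Row 0 holds the derivatives of the first component, p * q.
print(dg[0, 0], dg[0, 1])
```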

In [None]:
df = ...
df

Compute the equilibrium points of the Lorenz system.
Do it by hand, then use sympy.solve to check your work.
I used the first equation to eliminate $y$ and then the third equation to eliminate $z$, but it's up to you.
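
Here's a rough sketch of how sympy.solve behaves on a made-up two-variable system, so that the shape of its output isn't a surprise. The symbols `p`, `q` and the toy right-hand side are my own placeholders. Passing `dict=True` is optional, but I find the list of dictionaries easier to read.

```python
import sympy

p, q = sympy.symbols("p q", real=True)

# Hypothetical toy right-hand side; its equilibria are where both entries vanish.
g = sympy.Matrix([p - p * q, q - p * q])

# Solve g = 0 for (p, q); each solution comes back as a dictionary.
solutions = sympy.solve(list(g), [p, q], dict=True)
for solution in solutions:
    print(solution)
```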

In [None]:
equilibria = ...
equilibria

### Lambdifying it

What we'd like to do next is take these symbolic expressions, turn them into Python functions that we can cheaply call repeatedly, and then solve the ODE numerically using a few different methods.
We're going to run into a slight stumbling block that I'll try to illustrate below.
First, we'll create two arrays to represent the initial conditions of the system and the parameters that we'll use:
$$u_0 = \left[\begin{matrix} 2.0 \\ 1.0 \\ 1.0\end{matrix}\right], \qquad \left[\begin{matrix}\sigma \\ \rho \\ \beta\end{matrix}\right] = \left[\begin{matrix}10.0 \\ 28.0 \\ 8/3\end{matrix}\right]$$

In [None]:
u_0 = np.array([2.0, 1.0, 1.0])
params = np.array([10.0, 28, 8/3])

The code below will lambdify the symbolic expression.
It's worth unpacking what this does and thinking about how it might turn out differently if we had passed the arguments in a different order.
The sympy.lambdify function takes in the arguments first and the expression second.
We could have written this as `sympy.lambdify((x, y, z, σ, ρ, β), f)`, in which case we would have to pass in all of the arguments and parameters separately.
That involves some annoying extra work pulling out vector entries and putting them back together again.
The way I've written it below, we can pass in first a numpy array containing the coordinates, then another numpy array containing the parameters.
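
To make the difference between the two calling conventions concrete, here's a small sketch using a made-up scalar expression. The names `p`, `q`, `a`, `flat`, and `grouped` are placeholders, not part of the notebook.

```python
import numpy as np
import sympy

p, q, a = sympy.symbols("p q a", real=True)
expr = a * p + q**2   # hypothetical toy expression

# Flat signature: every symbol is its own positional argument.
flat = sympy.lambdify((p, q, a), expr)

# Grouped signature: one sequence of coordinates, one sequence of parameters.
grouped = sympy.lambdify([(p, q), (a,)], expr)

state = np.array([2.0, 3.0])
coeffs = np.array([10.0])

print(flat(state[0], state[1], coeffs[0]))   # entries pulled out by hand
print(grouped(state, coeffs))                # arrays passed in whole
```

Both calls evaluate the same expression; the grouped version just saves us the unpacking.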

In [None]:
F = sympy.lambdify([(x, y, z), (σ, ρ, β)], f)

Just to check that things are working correctly, let's evaluate it at the initial condition using the parameters defined above.

In [None]:
F(u_0, params)

This is sort of what we want, but it returns a numpy array of shape `(3, 1)`.
What happens if we add it to `u_0`, as if we were taking a step of the forward (explicit Euler) method?

In [None]:
u_0 + 0.001 * F(u_0, params)

Clearly something is off here.
The reason for this is that sympy's linear algebra features -- being able to define symbolic matrices -- don't include a separate type for vectors.
They can only be represented as 3 $\times$ 1 matrices.
In order to get everything to work the way we want, we can wrap the result of lambdify like so:

In [None]:
F_ = sympy.lambdify([(x, y, z), (σ, ρ, β)], f)
def F(*args, **kwargs):
    return F_(*args, **kwargs).flatten()

Now we should see that this extra step gives us an array of the right shape.

In [None]:
F(u_0, params)

In [None]:
assert F(u_0, params).shape == (3,)
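
The odd result we saw before flattening comes from numpy's broadcasting rules rather than from sympy. Here's a minimal sketch with made-up numbers (the arrays `column` and `vector` are placeholders) showing the same effect in isolation.

```python
import numpy as np

column = np.array([[1.0], [2.0], [3.0]])   # shape (3, 1), like the raw lambdify output
vector = np.array([10.0, 20.0, 30.0])      # shape (3,), like u_0

# Broadcasting pairs every entry of `vector` with every entry of `column`.
broadcast = vector + column
print(broadcast.shape)

# Flattening the column first gives the entrywise sum we actually wanted.
fixed = vector + column.flatten()
print(fixed.shape, fixed)
```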

If we lambdify the Jacobian of $f$, everything is fine -- it was always supposed to be a matrix.

In [None]:
dF = sympy.lambdify([(x, y, z), (σ, ρ, β)], df)

In [None]:
dF(u_0, params)

In [None]:
assert dF(u_0, params).shape == (3, 3)

### Numerical solution

Finally, let's solve the Lorenz system using a few different approaches.
First, we'll set up the parts that are the same regardless of which method you're using.

In [None]:
dt = 1e-2
T = 20.0
num_steps = int(T / dt)

Next, fill in the code below to solve the ODE using the forward (explicit Euler) method.

In [None]:
us = np.zeros((num_steps + 1, 3))
us[0] = u_0
for n in tqdm.trange(num_steps):
    ...

In [None]:
fig, ax = plt.subplots(subplot_kw={"projection": "3d"})
ax.plot(*us.T);

Next we'll do the midpoint method.
First, you should look up the documentation for [scipy.optimize.root](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.root.html).
In order to use it, we need to provide:
1. The function whose root we want to find
2. The initial guess for the root
3. Any additional arguments to pass to the function from 1., passed in the keyword argument `args`
4. The derivative of the function from 1., passed in the keyword argument `jac`
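
To see how those four pieces fit together, here's a rough sketch on a made-up two-variable problem that has nothing to do with the Lorenz system. The functions `h` and `dh` and the extra argument `c` are placeholders I've invented for the example.

```python
import numpy as np
import scipy.optimize

def h(w, c):
    # Hypothetical toy function: a circle of squared radius c meets the line w0 = w1.
    return np.array([w[0]**2 + w[1]**2 - c, w[0] - w[1]])

def dh(w, c):
    # Derivative of h with respect to w; it takes the same extra arguments as h.
    return np.array([[2 * w[0], 2 * w[1]], [1.0, -1.0]])

guess = np.array([2.0, 0.5])
result = scipy.optimize.root(h, guess, args=(2.0,), jac=dh)
print(result.success, result.x)
```

Note that `args` has to be a tuple even when there's only one extra argument, hence the trailing comma.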

We've already written some code above to form the right-hand side of the ODE (`F`).
In order to use this rootfinding procedure, we're going to need to define a few extra auxiliary Python functions -- we're not trying to find a root of `F` but rather a root $u_{n + 1}$ of the equation
$$\frac{u_{n + 1} - u_n}{\delta t} = F\left(\frac{u_n + u_{n + 1}}{2}, \sigma, \rho, \beta\right). \tag{*}$$
So we'll need to rewrite this into the form
$$G(u_{n + 1}, \text{other parameters}) = 0.$$
Crucially, we can make the value at the previous step into one of the additional arguments that we'll supply to the rootfinding function.
Fill in the body of this function below.
You can rearrange equation (*) however you see fit.

In [None]:
def G(u, u_n, params):
    ...

Next, fill in the derivative of $G$ below.

In [None]:
I = np.eye(3)
def dG(u, u_n, params):
    ...

Now write a procedure to approximate the solution of the Lorenz system by repeatedly finding a root of $G$.
It's worth noting that scipy.optimize.root doesn't return just the solution of the nonlinear system, because it can only calculate that solution approximately.
Instead it returns a data structure called an [OptimizeResult](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.OptimizeResult.html#scipy.optimize.OptimizeResult).
An OptimizeResult has a member called `.x` which contains the approximate solution, but it also has a field called `.success` to check whether the rootfinding procedure successfully terminated or not.
In your code below, get the result of the rootfinding procedure and add an assert inside the loop to make sure that it actually succeeded.
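
If the pattern is hard to picture, here's a sketch of the same idea on a made-up scalar problem, logistic growth $\dot w = r w (1 - w)$, stepped with the midpoint rule. Everything here (`rhs`, `drhs`, `residual`, `dresidual`, `ws`, and the values of `r`, `step`, `steps`) is a placeholder of mine; it's meant to show how `.x` and `.success` get used, not to be the answer for the Lorenz system.

```python
import numpy as np
import scipy.optimize

def rhs(w, r):
    return r * w * (1 - w)        # toy right-hand side

def drhs(w, r):
    return r * (1 - 2 * w)        # its derivative with respect to w

def residual(w, w_prev, r, step):
    mid = (w + w_prev) / 2
    return w - w_prev - step * rhs(mid, r)

def dresidual(w, w_prev, r, step):
    mid = (w + w_prev) / 2
    # Chain rule: d(mid)/dw = 1/2. The rootfinder expects a 2D array.
    return np.atleast_2d(1 - step * drhs(mid, r) / 2)

r, step, steps = 1.0, 0.1, 50
ws = np.zeros(steps + 1)
ws[0] = 0.1
for n in range(steps):
    guess = np.array([ws[n]])     # previous value as the starting guess
    result = scipy.optimize.root(
        residual, guess, args=(ws[n], r, step), jac=dresidual
    )
    assert result.success, result.message
    ws[n + 1] = result.x[0]

exact = 1 / (1 + 9 * np.exp(-r * step * steps))
print(ws[-1], exact)
```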

In [None]:
us = np.zeros((num_steps + 1, 3))
us[0] = u_0

for n in tqdm.trange(num_steps):
    ...

Does your solution differ appreciably from that obtained with the forward method?
When you're done with this whole notebook, go back and see what kind of timestep you need to take with the forward method in order to make it produce similar-looking results to the midpoint method.

In [None]:
fig, ax = plt.subplots(subplot_kw={"projection": "3d"})
ax.plot(*us.T);

That was more involved than the forward method.
Can we take some shortcuts?

### Linearly implicit schemes

Or, probably the greatest secret of numerical methods.
I've told you to use scipy.optimize.root in order to solve the nonlinear system in each timestep.
**How does scipy solve that nonlinear system?**

This is admittedly about to get *really confusing* because there are too many indices (i.e. more than 1).
I've written $u_n$ for the value of $u$ at the $n$th timestep.
Now we're going to need another index to specify each successive guess for solving that nonlinear system, at a single timestep.
I'll write these as an upper index $k$ in brackets (seriously if you have a better way I am all ears).
It's going to look awful but we'll get rid of it almost immediately.

The gold standard way to find the root of a function $G$ is *Newton's method*.
Say that $U$ is the true root of $G$, i.e. $G(U) = 0$.
Then we can expand $G$ in a 1st-order Taylor series about the current candidate solution $u^{[k]}$:
$$0 = G(U) = G(u^{[k]} + (U - u^{[k]})) \approx G(u^{[k]}) + dG(u^{[k]})(U - u^{[k]})$$
If we rearrange this equation, it suggests that we can take the next guess $u^{[k + 1]}$ by first solving a linear system for the *update* vector $v$:
$$dG(u^{[k]})v = -G(u^{[k]})$$
and then setting the next guess as
$$u^{[k + 1]} = u^{[k]} + v.$$
More succinctly, we can write this as
$$u^{[k + 1]} = u^{[k]} - dG(u^{[k]})^{-1}G(u^{[k]}).$$
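When you have a good enough initial guess, Newton's method converges really fast: close to the root, the error is roughly squared at every iteration.
It might not converge from every starting guess, but there are modifications of this algorithm, such as adding a line search, that do guarantee convergence under suitable assumptions.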
In short, **Newton's method reduces the problem of solving a *nonlinear* system to the repeated solution of many *linear* systems.**
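
Here's a bare-bones sketch of that iteration on a made-up two-variable problem, written with plain numpy so you can see each linear solve. The functions `h` and `dh`, the starting guess, and the stopping tolerance are all placeholders I've chosen for the example; this isn't what scipy does internally.

```python
import numpy as np

def h(w):
    # Hypothetical toy function with a root at (1, 1).
    return np.array([w[0]**2 + w[1]**2 - 2.0, w[0] - w[1]])

def dh(w):
    return np.array([[2 * w[0], 2 * w[1]], [1.0, -1.0]])

w = np.array([2.0, 0.5])                 # starting guess
for k in range(20):
    v = np.linalg.solve(dh(w), -h(w))    # linear system for the update vector
    w = w + v                            # next guess
    print(k, w, np.linalg.norm(v))
    if np.linalg.norm(v) < 1e-12:
        break
```

Watch the norm of the update: once the guess is close, the number of correct digits roughly doubles at each iteration.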

Why am I telling you all this?
Unless $\delta t$ is really large, it's probably a fair bet that $u_n$ is a pretty good starting guess for $u_{n + 1}$.
The idea of *linearly* implicit schemes is that, rather than execute Newton's method to convergence, **do only a single iteration of Newton's method** in each timestep.
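
As a sketch of what a single Newton iteration per timestep looks like, here's the idea applied to a made-up scalar problem, logistic growth $\dot w = r w (1 - w)$ discretized with the midpoint rule. The names `rhs`, `drhs`, `ws` and the values of `r`, `step`, `steps` are placeholders of mine, and the scalar case hides the linear solve that you'll need for the Lorenz system.

```python
import numpy as np

def rhs(w, r):
    return r * w * (1 - w)        # toy right-hand side

def drhs(w, r):
    return r * (1 - 2 * w)        # its derivative with respect to w

r, step, steps = 1.0, 0.1, 50
ws = np.zeros(steps + 1)
ws[0] = 0.1
for n in range(steps):
    w_prev = ws[n]
    # Midpoint residual and its derivative, both evaluated at the guess w = w_prev.
    g = -step * rhs(w_prev, r)
    dg = 1 - step * drhs(w_prev, r) / 2
    # One Newton update; in the scalar case the linear solve is just a division.
    ws[n + 1] = w_prev - g / dg

exact = 1 / (1 + 9 * np.exp(-r * step * steps))
print(ws[-1], exact)
```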

You've already defined $G$ and $dG$ above.
You have everything you need to implement a linearly implicit method.

In [None]:
us = np.zeros((num_steps + 1, 3))
us[0] = u_0
for n in tqdm.trange(num_steps):
    ...

In [None]:
fig, ax = plt.subplots(subplot_kw={"projection": "3d"})
ax.plot(*us.T);

How does its appearance compare to that of the forward solution and the midpoint solution?
If you want, try it with two steps of Newton instead of one.
How does the total execution time differ for each method?