Performance of different integrators #127

ma-sadeghi · 2021-09-20T20:37:33Z

Hi, I have a question regarding the performance graph in the documentation.

I tried cvode and rk5 on a small 3d finite difference discretization (11x11x11), cvode turned out to be much slower than rk5:

RK5: 22 ms
CVODE: 1.57 s

I think the performance graph in the docs is for a relatively small system (<100 equations), and as the number of equations increase, explicit integrators tend to outperform implicit ones, even though they use a much larger number of time steps (to maintain stability). What do you think?

The text was updated successfully, but these errors were encountered:

bmcage · 2021-09-21T09:04:38Z

Every problem is different, and all depends on the tolerances you put on the system. You probably have too strict tolerances in cvode, but it could be something else also, like a discontinuity which with implicit systems you should smooth, or a slow jacobian determination you provide, or .... . Perhaps you ask output at the explicit output times, but that must be computed by interpolation by cvode, so would give impression to be slow, while it is not related to the computation time to solve the system.

There are output procedures to look at number of calls, to try to understand where the problem arises.
It is only worthwhile to further investigate if, or you want to really understand it, or you will run simulations later that will run for hours/days, and spending time now to optimize it make sense.

In the end, with these solvers, for same quality of output, difference in computing time will never be two order of magnitude different. The outperforming implicit ones is only if quality of solution is not taken into account, which is ok for games, but not for physics.

ma-sadeghi · 2021-09-21T21:07:41Z

@bmcage Thanks for your quick reply. I really appreciate it.

I do ask for output times, but I only ask for 10 time points, so I think that should be relatively fast to interpolate. Regarding tolerance, I'm using rtol=1e-3 and atol=1e-3, which are fairly modest. Finally, regarding the physics of the problem I'm solving, it's the heat equation in 2D/3D, initially at u0 = 0 with Dirichlet boundary condition u = 1 imposed on all boundaries. So, the solution should be pretty smooth.

I plan to eventually use scikits.odes for a very large problem (>10^6 equations), which is why I'm trying to first optimize it on a relatively small problem.

Anyway, I just tried again, and I still get ~2 orders of magnitude difference between rk5 and cvode. I also compared it to scipy's solve_ivp using method="RK45, which is basically similar to scikits.odes's rk5 (I think), and it's in the right ballpark. Actually, scipy is about 3x slower. So, I think cvode is a bit off in terms of runtime. Here's the code that reproduces these numbers. I tried to make it easy to read.

Scikits.Odes runtime (rk5): 0.83 s
Scikits.Odes runtime (cvode): 50.11 s
Scipy runtime (rk45): 2.36 s

# %% Import libraries
import matplotlib.pyplot as plt
import numpy as np
from scipy.integrate import solve_ivp
from scikits.odes.odeint import odeint
from time import time

# %% Define du/dt
def func(t, v, du, params):
    h, k, Nx, Ny, Nz = params
    is3d = True if Nz > 1 else False
    u = v.reshape((Nx, Ny, Nz))
    idx = np.arange(1, Nx-1)
    idy = np.arange(1, Ny-1)
    idz = np.arange(1, Nz-1) if is3d else np.array([0])
    u_xx = (u[np.ix_(idx-1, idy, idz)] - 2*u[np.ix_(idx, idy, idz)] + u[np.ix_(idx+1, idy, idz)]) / h**2
    u_yy = (u[np.ix_(idx, idy-1, idz)] - 2*u[np.ix_(idx, idy, idz)] + u[np.ix_(idx, idy+1, idz)]) / h**2
    if is3d:
        u_zz = (u[np.ix_(idx, idy, idz-1)] - 2*u[np.ix_(idx, idy, idz)] + u[np.ix_(idx, idy, idz+1)]) / h**2
    else:
        u_zz = 0.0
    idx1d = np.ravel_multi_index(np.ix_(idx, idy, idz), (Nx, Ny, Nz)).flatten()
    du[idx1d] = k * (u_xx + u_yy + u_zz).flatten()
    return du

# %% Define model parameters
Nx, Ny, Nz = 100, 100, 1
Lx = 1.0
Ly = Lx * (Ny-1)/(Nx-1)
Lz = Lx * (Nz-1)/(Nx-1)
h = Lx/(Nx-1)
D = 1e-2
params = (h, D, Nx, Ny, Nz)

# Safety checks
assert Nx >= 3
assert Ny >= 3
if Nz != 1:
    assert Nz >= 3

# %% Scikits.Odes
def func_odes(t, u, du):
    func(t, u, du, params)

tspan = (0.0, 1.0)
tout = np.linspace(tspan[0], tspan[1], 10)
u0 = np.zeros((Nx, Ny, Nz))     # IC @ t = 0 ; u = 0.0
u0[0, ...] = 1.0                # BC @ x = 0 ; u = 1.0
u0[-1, ...] = 1.0               # BC @ x = L ; u = 1.0
u0[:, 0, :] = 1.0               # BC @ y = 0 ; u = 1.0
u0[:, -1, :] = 1.0              # BC @ y = L ; u = 1.0

t0 = time()
out = odeint(func_odes, tout, u0.flatten(), method="rk5", rtol=1e-3, atol=1e-3)
print(f"Scikits.Odes runtime (rk5): {time()-t0:.2f} s")

t0 = time()
out = odeint(func_odes, tout, u0.flatten(), method="cvode", rtol=1e-3, atol=1e-3)
print(f"Scikits.Odes runtime (cvode): {time()-t0:.2f} s")

u3d_odes = out.values.y.reshape((tout.size, Nx, Ny, Nz))

fig, ax = plt.subplots()
f = ax.imshow(u3d_odes[-1, ..., Nz//2], vmin=0, vmax=u3d_odes.max())
ax.set_title("Scikits.Odes")
plt.colorbar(f)

# %% Scipy
du = np.zeros(u0.size)
def func_scipy(t, u):
    return func(t, u, du, params)

t0 = time()
sol = solve_ivp(func_scipy, tspan, u0.flatten(), method="RK45", rtol=1e-3, atol=1e-3)
print(f"Scipy runtime (rk45): {time()-t0:.2f} s")

u3d_scipy = sol.y.reshape(Nx, Ny, Nz, sol.t.size)

fig, ax = plt.subplots()
f = ax.imshow(u3d_scipy[..., Nz//2, -1], vmin=0, vmax=u3d_scipy.max())
ax.set_title("Scipy")
plt.colorbar(f)
plt.show()

bmcage · 2021-12-11T13:48:15Z

Hi, you are doing a mistake here. You use cvode in the dense option, which for 100x100x100 problem is not realistic.
Eg, changing your line of cvode call to:

out = odeint(func_odes, tout, u0.flatten(), method="cvode", rtol=1e-6, atol=1e-6, linsolver='band', lband=1, uband=1)

you will have output:

Scikits.Odes runtime (rk5): 0.63 s
Scikits.Odes runtime (cvode): 2.22 s
Scipy runtime (rk45): 3.24 s

Obviously, with a band jacobian, you will have less optimal solution (Ok for 1D problem with method of lines, less so for 2 or 3D). However, as you want to do method of lines here, then you should with cvode provide the jacobian as a sparse matrix, and it will again outperform other solutions.
In other words, you need to look at set_options possibilities in https://github.com/bmcage/odes/blob/master/scikits/odes/sundials/cvode.pyx#L799 and change to non-stiff problem as needed, and make use of the sparse structure instead of it being dense.

bmcage · 2021-12-11T13:52:15Z

Before closing this, as an extra, odes is only a wrapper, for actual use, better have look at sundials C code examples. Eg, the cvode docs have an example with band matrix for 2D advection diffusion problem, see in the doc section:

2.2 A banded example: cvAdvDiff bnd

from: https://raw.githubusercontent.com/LLNL/sundials/master/doc/cvode/cv_examples.pdf

Almost all that is possible in C with sundials, you can reproduce with odes is python

ma-sadeghi · 2021-12-14T00:23:41Z

Thank you for your thorough response, really appreciate it. It all makes sense now.

ma-sadeghi mentioned this issue Sep 21, 2021

Re-importing scikits.odes seems to break it #128

Closed

bmcage closed this as completed Dec 11, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Performance of different integrators #127

Performance of different integrators #127

ma-sadeghi commented Sep 20, 2021

bmcage commented Sep 21, 2021

ma-sadeghi commented Sep 21, 2021

bmcage commented Dec 11, 2021

bmcage commented Dec 11, 2021

ma-sadeghi commented Dec 14, 2021

Performance of different integrators #127

Performance of different integrators #127

Comments

ma-sadeghi commented Sep 20, 2021

bmcage commented Sep 21, 2021

ma-sadeghi commented Sep 21, 2021

bmcage commented Dec 11, 2021

bmcage commented Dec 11, 2021

ma-sadeghi commented Dec 14, 2021