Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance of different integrators #127

Closed
ma-sadeghi opened this issue Sep 20, 2021 · 5 comments
Closed

Performance of different integrators #127

ma-sadeghi opened this issue Sep 20, 2021 · 5 comments

Comments

@ma-sadeghi
Copy link

Hi, I have a question regarding the performance graph in the documentation.

I tried cvode and rk5 on a small 3d finite difference discretization (11x11x11), cvode turned out to be much slower than rk5:

RK5: 22 ms
CVODE: 1.57 s

I think the performance graph in the docs is for a relatively small system (<100 equations), and as the number of equations increase, explicit integrators tend to outperform implicit ones, even though they use a much larger number of time steps (to maintain stability). What do you think?

@bmcage
Copy link
Owner

bmcage commented Sep 21, 2021

Every problem is different, and all depends on the tolerances you put on the system. You probably have too strict tolerances in cvode, but it could be something else also, like a discontinuity which with implicit systems you should smooth, or a slow jacobian determination you provide, or .... . Perhaps you ask output at the explicit output times, but that must be computed by interpolation by cvode, so would give impression to be slow, while it is not related to the computation time to solve the system.

There are output procedures to look at number of calls, to try to understand where the problem arises.
It is only worthwhile to further investigate if, or you want to really understand it, or you will run simulations later that will run for hours/days, and spending time now to optimize it make sense.

In the end, with these solvers, for same quality of output, difference in computing time will never be two order of magnitude different. The outperforming implicit ones is only if quality of solution is not taken into account, which is ok for games, but not for physics.

@ma-sadeghi
Copy link
Author

@bmcage Thanks for your quick reply. I really appreciate it.

I do ask for output times, but I only ask for 10 time points, so I think that should be relatively fast to interpolate. Regarding tolerance, I'm using rtol=1e-3 and atol=1e-3, which are fairly modest. Finally, regarding the physics of the problem I'm solving, it's the heat equation in 2D/3D, initially at u0 = 0 with Dirichlet boundary condition u = 1 imposed on all boundaries. So, the solution should be pretty smooth.

I plan to eventually use scikits.odes for a very large problem (>10^6 equations), which is why I'm trying to first optimize it on a relatively small problem.

Anyway, I just tried again, and I still get ~2 orders of magnitude difference between rk5 and cvode. I also compared it to scipy's solve_ivp using method="RK45, which is basically similar to scikits.odes's rk5 (I think), and it's in the right ballpark. Actually, scipy is about 3x slower. So, I think cvode is a bit off in terms of runtime. Here's the code that reproduces these numbers. I tried to make it easy to read.

Scikits.Odes runtime (rk5): 0.83 s
Scikits.Odes runtime (cvode): 50.11 s
Scipy runtime (rk45): 2.36 s
# %% Import libraries
import matplotlib.pyplot as plt
import numpy as np
from scipy.integrate import solve_ivp
from scikits.odes.odeint import odeint
from time import time

# %% Define du/dt
def func(t, v, du, params):
    h, k, Nx, Ny, Nz = params
    is3d = True if Nz > 1 else False
    u = v.reshape((Nx, Ny, Nz))
    idx = np.arange(1, Nx-1)
    idy = np.arange(1, Ny-1)
    idz = np.arange(1, Nz-1) if is3d else np.array([0])
    u_xx = (u[np.ix_(idx-1, idy, idz)] - 2*u[np.ix_(idx, idy, idz)] + u[np.ix_(idx+1, idy, idz)]) / h**2
    u_yy = (u[np.ix_(idx, idy-1, idz)] - 2*u[np.ix_(idx, idy, idz)] + u[np.ix_(idx, idy+1, idz)]) / h**2
    if is3d:
        u_zz = (u[np.ix_(idx, idy, idz-1)] - 2*u[np.ix_(idx, idy, idz)] + u[np.ix_(idx, idy, idz+1)]) / h**2
    else:
        u_zz = 0.0
    idx1d = np.ravel_multi_index(np.ix_(idx, idy, idz), (Nx, Ny, Nz)).flatten()
    du[idx1d] = k * (u_xx + u_yy + u_zz).flatten()
    return du

# %% Define model parameters
Nx, Ny, Nz = 100, 100, 1
Lx = 1.0
Ly = Lx * (Ny-1)/(Nx-1)
Lz = Lx * (Nz-1)/(Nx-1)
h = Lx/(Nx-1)
D = 1e-2
params = (h, D, Nx, Ny, Nz)

# Safety checks
assert Nx >= 3
assert Ny >= 3
if Nz != 1:
    assert Nz >= 3

# %% Scikits.Odes
def func_odes(t, u, du):
    func(t, u, du, params)

tspan = (0.0, 1.0)
tout = np.linspace(tspan[0], tspan[1], 10)
u0 = np.zeros((Nx, Ny, Nz))     # IC @ t = 0 ; u = 0.0
u0[0, ...] = 1.0                # BC @ x = 0 ; u = 1.0
u0[-1, ...] = 1.0               # BC @ x = L ; u = 1.0
u0[:, 0, :] = 1.0               # BC @ y = 0 ; u = 1.0
u0[:, -1, :] = 1.0              # BC @ y = L ; u = 1.0

t0 = time()
out = odeint(func_odes, tout, u0.flatten(), method="rk5", rtol=1e-3, atol=1e-3)
print(f"Scikits.Odes runtime (rk5): {time()-t0:.2f} s")

t0 = time()
out = odeint(func_odes, tout, u0.flatten(), method="cvode", rtol=1e-3, atol=1e-3)
print(f"Scikits.Odes runtime (cvode): {time()-t0:.2f} s")

u3d_odes = out.values.y.reshape((tout.size, Nx, Ny, Nz))

fig, ax = plt.subplots()
f = ax.imshow(u3d_odes[-1, ..., Nz//2], vmin=0, vmax=u3d_odes.max())
ax.set_title("Scikits.Odes")
plt.colorbar(f)

# %% Scipy
du = np.zeros(u0.size)
def func_scipy(t, u):
    return func(t, u, du, params)

t0 = time()
sol = solve_ivp(func_scipy, tspan, u0.flatten(), method="RK45", rtol=1e-3, atol=1e-3)
print(f"Scipy runtime (rk45): {time()-t0:.2f} s")

u3d_scipy = sol.y.reshape(Nx, Ny, Nz, sol.t.size)

fig, ax = plt.subplots()
f = ax.imshow(u3d_scipy[..., Nz//2, -1], vmin=0, vmax=u3d_scipy.max())
ax.set_title("Scipy")
plt.colorbar(f)
plt.show()

@bmcage
Copy link
Owner

bmcage commented Dec 11, 2021

Hi, you are doing a mistake here. You use cvode in the dense option, which for 100x100x100 problem is not realistic.
Eg, changing your line of cvode call to:

out = odeint(func_odes, tout, u0.flatten(), method="cvode", rtol=1e-6, atol=1e-6, linsolver='band', lband=1, uband=1)

you will have output:

Scikits.Odes runtime (rk5): 0.63 s
Scikits.Odes runtime (cvode): 2.22 s
Scipy runtime (rk45): 3.24 s

Obviously, with a band jacobian, you will have less optimal solution (Ok for 1D problem with method of lines, less so for 2 or 3D). However, as you want to do method of lines here, then you should with cvode provide the jacobian as a sparse matrix, and it will again outperform other solutions.
In other words, you need to look at set_options possibilities in https://github.com/bmcage/odes/blob/master/scikits/odes/sundials/cvode.pyx#L799 and change to non-stiff problem as needed, and make use of the sparse structure instead of it being dense.

@bmcage
Copy link
Owner

bmcage commented Dec 11, 2021

Before closing this, as an extra, odes is only a wrapper, for actual use, better have look at sundials C code examples. Eg, the cvode docs have an example with band matrix for 2D advection diffusion problem, see in the doc section:

2.2 A banded example: cvAdvDiff bnd 

from: https://raw.githubusercontent.com/LLNL/sundials/master/doc/cvode/cv_examples.pdf

Almost all that is possible in C with sundials, you can reproduce with odes is python

@bmcage bmcage closed this as completed Dec 11, 2021
@ma-sadeghi
Copy link
Author

Thank you for your thorough response, really appreciate it. It all makes sense now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants