Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: integrate: add functionality to Radau #13068

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

laurent90git
Copy link

@laurent90git laurent90git commented Nov 10, 2020

Reference issue

Closes #12762

What does this implement/fix?

This commit adds the ability to solve ODEs or DAEs with a mass matrix (potentially singular).

Additional information

The documentation has been completed accordingly.

Two tests have been added to the solve_ivp test suite:

  • simulation of a simple ODE with a mass matrix and an equivalent ODE without mass matrix, thus verifiying the mass matrix implementation.
  • simulation of DAEs of index 0 to 3 (pendulum with constant rod length in cartesian coordinates), and verification against the simple ODE obtained in polar coordinates.
    The build and tests run without issues (apart from unrelated errors in other modules).

These two test cases are actually compact versions of some scripts available in a personal repository:
https://github.com/laurent90git/DAE-Scipy

There the implementation and the test cases are explained in depth if more information is required. An example of what is possible with this new feature: computing the "free falling chain" problem, implemented as an index-3 DAE system. A mass is attached to a chain and falls freely, faster than it would without the chain:
https://www.youtube.com/watch?v=VESQ7IXPlQw

Note that I could not find the THANKS.txt file to add my name to the contributors list :/ Have I missed something ?

Thanks in advance for the review !

Laurent Francois

Ability to solve ODEs or DAEs with a mass matrix.
@laurent90git laurent90git changed the title ENH: add functionnality SciPy.integrate.Radau, see #12762 ENH: add functionnality to scipy.integrate._ivp.Radau, see #12762 Nov 11, 2020
(as part of the push request  scipy#13068)
if not (mass is None):
if issparse(mass):
self.mass_matrix = csc_matrix(mass)
self.index_algebraic_vars = np.where(
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could not find a sparse-specific method to find the rows that are nil. Therefore I just convert the matrix to a dense array, which is not optimal. This is however only done once, so I guess it should not affect the overall performance.

Copy link
Contributor

@Patol75 Patol75 Aug 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could use something like list(set(range(mass.shape[0])).difference(find(mass)[0])) where find is scipy.sparse.find. I am not sure if it is faster though. Additionally, why do you write self.mass_matrix = csc_matrix(mass) and not self.mass_matrix = mass? mass is already sparse in that case, and if you want to benefit from fast dot operations, Scipy recommends using the csr format, so you could have self.mass_matrix = mass.tocsr().

Copy link
Author

@laurent90git laurent90git Sep 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather leave this part as is, as I am not really knowledgable on sparse matrices. I chose to stick to the CSC format since it is the one used in the rest of the code, e.g. for the Jacobian. It should be easy to change this to CSR, but I'd rather keep that for a separate pull request maybe.

Comment on lines 352 to 355
self.mass_matrix = mass
self.index_algebraic_vars = np.where(np.all(
self.mass_matrix == 0, axis=1))[0]
self.nvars_algebraic = self.index_algebraic_vars.size
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are nil rows in the mass matrix, this means that the corresponding variables are purely algebraic. Identifying these variables allows to improve the behaviour of the error estimate on the algebraic components further in step_impl. Indeed the original error estimation method overestimates the error on algebraic variables (see Hairer, Lubich, Roche 1989: "The Numerical Solution of Differential-Algebraic Systems by Runge-Kutta Methods"), resulting in unnecessary low time step values. Therefore in this implementation, the error on explicitly identified algebraic variables is set to 0, and the norm of the error estimate is computed only by considering the other variables.

Comment on lines 476 to 481
if self.mass_matrix is None:
LU_real = self.lu(MU_REAL / h * self.I - J)
LU_complex = self.lu(MU_COMPLEX / h * self.I - J)
else:
LU_real = self.lu(MU_REAL / h * self.mass_matrix - J)
LU_complex = self.lu(MU_COMPLEX / h * self.mass_matrix - J)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have chosen to let the initial implementation in the code, using the if statement to switch to the formulation with a mass matrix if necessary. Another more compact solution would be to simply assume that the mass matrix is the identity if it is not supplied, and only use the mass matrix formulation.
The same can be said regarding the error estimation further down.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should define mass_matrix as the identity matrix as you mention here. This way, you can get rid of all the if conditions you have added throughout radau.py, which will help streamline the code without any loss of generality.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have corrected this. If it is not given by the user, the mass matrix is now replaced by a sparse identity matrix (which was already available as self.I in the Radau class).

Comment on lines 1069 to 1090
def dae_fun(t, X):
x, y, vx, vy, lbda = X[0], X[1], X[2], X[3], X[4]
if chosen_index == 3:
constraint = x ** 2 + y ** 2 - r0 ** 2
elif chosen_index == 2:
constraint = x * vx + y * vy
elif chosen_index == 1:
constraint = lbda * (x ** 2 + y ** 2) / m \
+ g * y - (vx ** 2 + vy ** 2)
elif chosen_index == 0:
rsq = x ** 2 + y ** 2
dvx = -lbda * x / m
dvy = -lbda * y / m - g
constraint = (1 / m) * (- g * vy / rsq
+ 2 * (vx * dvx + vy * dvy) / rsq
+ (vx ** 2 + vy ** 2 - g * y) *
(2 * x * vx + 2 * y * vy) / (rsq ** 2))
return np.array([vx,
vy,
-x*lbda/m,
-g-(y*lbda)/m,
constraint])
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sets up a DAE model of the pendulum with a given index. I explain this process in more details in the following file:
https://github.com/laurent90git/DAE-Scipy/blob/main/pendulum.py

Comment on lines 1095 to 1096
# test sparse mass matrix
mass = csc_matrix(mass)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just take the opportunity to test the support of a sparse mass matrix. All the other tests use a dense mass matrix.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea.

@@ -304,12 +317,14 @@ def __init__(self, fun, t0, y0, t_bound, max_step=np.inf,

self.jac_factor = None
self.jac, self.J = self._validate_jac(jac, jac_sparsity)
self.nlusove = 0
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added this counter for the number of forward-backward substitutions. During my development I used is to ensure the computational effort remained acceptable. I guess it could probably be used as an additional attribute to the OdeResult class in the future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is usually coined nlu within scipy.integrate.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nlu counts the number of LU-factorisations of the Jacobian matrix, while the here introduced nlusolve counts the number of linear system solutions (using the previous LU-factorisations). Therefore, nlusolve increases at for each Newton step performed, while nlu only increase upon when convergence of the Newton loop is not reached, i.e. if the jacobian of the ODE function needs to be updated or if the time step has changed too much (both these elements appear in the residual Jacobian used in the Newton loop).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the clarification. 👍🏻

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self.nlusove = 0
self._nlusove = 0

Consider the preceding underscore to indicate that it's supposed to be private. (Even if that's not already done in the existing code, it's good practice.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have performed that change. In the future, it would be interesting to make BDF and Radau return that value as part of the OdeSolution result, as it is a good indicator of the computational cost.

@rgommers rgommers added enhancement A new feature or improvement scipy.integrate labels Nov 15, 2020
@rgommers rgommers changed the title ENH: add functionnality to scipy.integrate._ivp.Radau, see #12762 ENH: add functionality to scipy.integrate._ivp.Radau, see #12762 Nov 15, 2020
@rgommers rgommers changed the title ENH: add functionality to scipy.integrate._ivp.Radau, see #12762 ENH: add functionality to scipy.integrate._ivp.Radau Nov 15, 2020
@laurent90git
Copy link
Author

Tagging @nmayorov who developed the original version of Scipy's Radau :)

Copy link
Contributor

@mdhaber mdhaber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a partial type-2 review.

Overall, this is great. Big improvement to functionality, fit cleanly into the existing code, good style, etc. Ping me when you've responded and I'll do another pass and test the code manually.

@@ -245,6 +254,10 @@ class Radau(OdeSolver):
is assumed to be dense.
vectorized : bool, optional
Whether `fun` is implemented in a vectorized fashion. Default is False.
mass : {None, array_like, sparse_matrix}, optional
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this and mass_matrix above have the same name?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, this may be more clear. I have updated this.

@@ -304,12 +317,14 @@ def __init__(self, fun, t0, y0, t_bound, max_step=np.inf,

self.jac_factor = None
self.jac, self.J = self._validate_jac(jac, jac_sparsity)
self.nlusove = 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self.nlusove = 0
self._nlusove = 0

Consider the preceding underscore to indicate that it's supposed to be private. (Even if that's not already done in the existing code, it's good practice.)

if self.mass_matrix is None:
error = self.solve_lu(LU_real, f + ZE)
error_norm = norm(error / scale)
else: # see [1], chapter IV.8, page 127
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the comments!

error = self.solve_lu(LU_real, self.fun(t, y + error) + ZE)
error_norm = norm(error / scale)

if self.mass_matrix is None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks pretty similar to the block above. Consider whether it's worth it to write a private function to reduce code duplication.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are actually two different error estimates. They were originally present in the code, and are quite necessary for a proper functioning of the error estimate. Basically, the second one (which involves on linear system resolution, unlike the first one which is explicit) is less affected by the stiffness of the system. It is used only when the first estimate is too high. This is also done in the original Fortran code from Hairer & Wanner. I have added a comment to clarify that.

Comment on lines 1013 to 1016
assert_(sol_without_mass.success,
msg=f'solver {method} failed without mass matrix')
assert_(sol_with_mass.success,
msg=f'solver {method} failed with mass matrix')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and throughout, I think we prefer plain assert over assert_ now. (Things have changed over time.) Note that this is just for assert_; assert_allclose is fine.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have replaced all my assert_ calls with plain assert statements.

Comment on lines 1095 to 1096
# test sparse mass matrix
mass = csc_matrix(mass)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea.

Defined the constant mass matrix of the system, with shape (n,n).
It may be singular, thus defining a problem of the differential-
algebraic type (DAE). The default value is None (equivalent to
an identity mass matrix).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add an example to the docstring. It's been a few years since I've worked with this sort of thing, so seeing the test reminded me of how this would work.

Comment on lines 1068 to 1090
for chosen_index in range(4):
def dae_fun(t, X):
x, y, vx, vy, lbda = X[0], X[1], X[2], X[3], X[4]
if chosen_index == 3:
constraint = x ** 2 + y ** 2 - r0 ** 2
elif chosen_index == 2:
constraint = x * vx + y * vy
elif chosen_index == 1:
constraint = lbda * (x ** 2 + y ** 2) / m \
+ g * y - (vx ** 2 + vy ** 2)
elif chosen_index == 0:
rsq = x ** 2 + y ** 2
dvx = -lbda * x / m
dvy = -lbda * y / m - g
constraint = (1 / m) * (- g * vy / rsq
+ 2 * (vx * dvx + vy * dvy) / rsq
+ (vx ** 2 + vy ** 2 - g * y) *
(2 * x * vx + 2 * y * vy) / (rsq ** 2))
return np.array([vx,
vy,
-x*lbda/m,
-g-(y*lbda)/m,
constraint])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great that you're testing with different index. (I remember back in the day Matlab's integrator complaining that it could only solve DAEs of a certain index, and that was frustrating.) I think it would be preferable to @parametrize the test with different dae_fun (e.g. before the test begins, include this sort of code in a generator that yields the dae_fun of a given index) rather than having this loop in the test.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have included the index as an additional argument in the @parametrize decorator.

f_real = F.T.dot(TI_REAL) - M_real * W[0]
f_complex = F.T.dot(TI_COMPLEX) - M_complex * (W[1] + 1j * W[2])
else:
f_real = F.T.dot(TI_REAL) - M_real * mass_matrix.dot(W[0])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might want to make sure mass_matrix is an array or sparse array first. I think this will fail with array_like.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, the Jacobian of the ODE/DAE system is transformed to a Numpy array if it is an array_like, via the _validate_jac function. I have added a similar _validate_mass_matrix function to handle all the required checks and transformations.

@@ -72,6 +72,11 @@ def solve_collocation_system(fun, t, y, h, Z0, scale, tol,
solve_lu : callable
Callable which solves a linear system given a LU decomposition. The
signature is ``solve_lu(LU, b)``.
mass_matrix : {None, array_like, sparse_matrix}, optional
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the new sparse array is supposed to replace sparse matrix eventually. Doesn't affect the code much, but probably should advertise that sparse array is accepted.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't feel comfortable yet adding this in the documentation, since this new format is not explicitly handled in solve_ivp for other matrices, e.g. the Jacobian. I would rather wait for this change to be made across all solvers by someone more knowledgeable than me on that particular subject !

@laurent90git
Copy link
Author

laurent90git commented Sep 7, 2022

@mdhaber @Patol75 @rgommers @nmayorov
Sorry for the late answer...

I have pushed a new version with the suggested improvements. The tests pass on my machine.
I am afraid I am quite far behind the latest Scipy release ! However the ivp module has only had a few modifications in the last years, so the merging should not be problematic.

Regarding the comment on the CSR format that might be more efficient that the CSC format for the sparse matrices, I have chosen not to change that, since this format choice is found in the original code, and also in the BDF solver. I feel like this evolution should be the subject of a future PR.

@JonasBreuling
Copy link

Thank you very much for this very nice piece of code. In my opinion, scipy will greatly benefit from the possiblity of solving DAE's up to index 3 using the Radau5 method. That said, I would be happy to assist you to get this merged. Find below some comments/ improvements to the current implementation.

Additional reference

I would recommend to add a reference to Laurent Jay - Convergence of Runge-Kutta methods for differential-algebraic systems of index 3 (https://doi.org/10.1016/0168-9274(95)00013-K). He proofs the convergence for Radau IIa for differential-algebraic equations of index 3 in Hessenberg form.

Application with analytical solution

In the same publication an index 3 DAE system with analytical solution is presented.

$$\begin{align} y'_1 &= 2 y_1 y_2 z_1 z_2 \\\ y'_2 &= - y_1 y_2 z_2^2 \\\ z'_1 &= (y_1 y_2 + z_1 z_2) u \\\ z'_2 &= -y_1 y_2^2 z_2^2 u \\\ 0 &= y_1 y_2^2 - 1 \end{align}$$

Consistent initial conditions are $y_1 = y_2 = z_1 = z_2 = u = 1$. The analytical solution is $y_1(x) = z_1(x) = e^{2x}$, $y_2(x) = z_2(x) = e^{-x}$ and $u(x) = e^x$. I've tried to solve it with the current version of the code but there is a strange behavior in the computation of the initial step. The first few step result in high outliers compared to the analytical solution.
Figure_1

The used implementation is

import numpy as np

from scipy.integrate import solve_ivp, Radau
from scipy.sparse import eye
from scipy.optimize._numdiff import approx_derivative


def generate_Jay1995(nonlinear_multiplier):
    def fun(t, y):
        """Index3 DAE found in Jay1995 Example 7.1 and 7.2.

        References:
        -----------
        Jay1995: https://doi.org/10.1016/0168-9274(95)00013-K
        """
        y1, y2, z1, z2, la = y

        if nonlinear_multiplier:
            dy3 = -y1 * y2**2 * z2**3 * la**2
        else:
            dy3 = -y1 * y2**2 * z2**2 * la

        dy = np.array(
            [
                2 * y1 * y2 * z1 * z2,
                -y1 * y2 * z2**2,
                (y1 * y2 + z1 * z2) * la,
                dy3,
                y1 * y2**2 - 1.0,
            ],
            dtype=y.dtype,
        )

        return dy

    # the jacobian is computed via finite-differences
    def jac(t, x):
        return approx_derivative(
            fun=lambda x: fun(t, x), x0=x, method="cs", rel_step=1e-50
        )

    def plot(t, y):
        import matplotlib.pyplot as plt

        fig, ax = plt.subplots()

        ax.plot(t, y[0], "--r", label="y1")
        ax.plot(t, y[1], "--g", label="y2")
        ax.plot(t, y[2], "-.r", label="z1")
        ax.plot(t, y[3], "-.g", label="z2")
        ax.plot(t, y[4], "--b", label="u")

        ax.plot(t, np.exp(2 * t), "-r", label="y1/z1 true")
        ax.plot(t, np.exp(-t), "-g", label="y2/z2 true")
        ax.plot(t, np.exp(t), "-b", label="u true")

        ax.grid()
        ax.legend()

        plt.show()

    # construct singular mass matrix
    mass_matrix = eye(5, format="csr")
    mass_matrix[-1, -1] = 0

    # initial conditions
    y0 = np.ones(5, dtype=float)

    # tolerances and t_span
    rtol = atol = 1.0e-6
    t_span = (0, 0.1)

    return y0, mass_matrix, fun, jac, rtol, atol, t_span, plot


if __name__ == "__main__":
    y0, mass_matrix, fun, jac, rtol, atol, t_span, plot = generate_Jay1995(
        nonlinear_multiplier=False
    )

    # # this fails
    # y0, mass_matrix, fun, jac, rtol, atol, t_span, plot = generate_Jay1995(
    #     nonlinear_multiplier=True
    # )

    sol = solve_ivp(
        fun=fun,
        t_span=t_span,
        y0=y0,
        rtol=rtol,
        atol=atol,
        jac=jac,
        method=Radau,
        mass_matrix=mass_matrix,
        # first_step=1.0e-4,
    )

    nfev = sol.nfev
    njev = sol.njev
    nlu = sol.nlu
    success = sol.success
    assert success

    print(f"nfev: {nfev}")
    print(f"njev: {njev}")
    print(f"nlu: {nlu}")

    plot(sol.t, sol.y)

The second example of the paper deals with DAE's with a nonlinear Lagrange multiplier. This can't be solved with this implementation. Possibly, this is related to the problem of the initial step.

Improved error estimate

The error estimate is crucial for solving DAE's of index 2 and 3. Let's have a look at the original fotran code of Ernst available at his personal webpage. I quoted the important part of the documentation below.

C       THE FOLLOWING 3 PARAMETERS ARE IMPORTANT FOR
C       DIFFERENTIAL-ALGEBRAIC SYSTEMS OF INDEX > 1.
C       THE FUNCTION-SUBROUTINE SHOULD BE WRITTEN SUCH THAT
C       THE INDEX 1,2,3 VARIABLES APPEAR IN THIS ORDER. 
C       IN ESTIMATING THE ERROR THE INDEX 2 VARIABLES ARE
C       MULTIPLIED BY H, THE INDEX 3 VARIABLES BY H**2.
C
C    IWORK(5)  DIMENSION OF THE INDEX 1 VARIABLES (MUST BE > 0). FOR 
C              ODE'S THIS EQUALS THE DIMENSION OF THE SYSTEM.
C              DEFAULT IWORK(5)=N.
C
C    IWORK(6)  DIMENSION OF THE INDEX 2 VARIABLES. DEFAULT IWORK(6)=0.
C
C    IWORK(7)  DIMENSION OF THE INDEX 3 VARIABLES. DEFAULT IWORK(7)=0.
C    IWORK(8)  SWITCH FOR STEP SIZE STRATEGY
C              IF IWORK(8).EQ.1  MOD. PREDICTIVE CONTROLLER (GUSTAFSSON)
C              IF IWORK(8).EQ.2  CLASSICAL STEP SIZE CONTROL
C              THE DEFAULT VALUE (FOR IWORK(8)=0) IS IWORK(8)=1.
C              THE CHOICE IWORK(8).EQ.1 SEEMS TO PRODUCE SAFER RESULTS;
C              FOR SIMPLE PROBLEMS, THE CHOICE IWORK(8).EQ.2 PRODUCES
C              OFTEN SLIGHTLY FASTER RUNS

On the same page there is also a C++ source code of Radau5, which is possibly much easier to read. The same point is mentioned there.

/*
The following 3 parameters are important for differential-algebraic systems of
index > 1. The function-subroutine should be written such that the index 1, 2, 3
variables appear in this order. In estimating the error the index 2 variables are
multiplied by h, the index 3 variables by h^2.

	nind1		Dimension of the index 1 variables (must be > 0). For ODE's this
				equals the dimension of the system. Default nind1 = n.

	nind2		Dimension of the index 2 variables. Default nind2 = 0.

	nind3		Dimension of the index 3 variables. Default nind3 = 0.

	npred		Switch for step size strategy
					If npred = 1  mod. predictive controller (Gustafsson)
					If npred = 2  classical step size control
				The default value (for npred = 0) is npred = 1. The choice
				npred = 1 seems to produce safer results. For simple problems,
				the choice npred = 2 produces often slightly faster runs.
*/

This brings me to my final point. I would recommend adding an additional option to the solver that defines for each equation its DAE index. 0 for differential equations and otherwise the corresponding DAE index. By that, the error estimate used by the original implementation can be used. A similar approach is used for the DASSL solver of Linda Petzold:

pow(r,MAX(0,root->index[i]-1))

We can replace line 518 and 533 by something like

error *= h**np.maximum(0, self.index_algebraic_vars - 1)

@laurent90git
Copy link
Author

laurent90git commented Dec 7, 2022

Thank you very much for this very nice piece of code. In my opinion, scipy will greatly benefit from the possiblity of solving DAE's up to index 3 using the Radau5 method. That said, I would be happy to assist you to get this merged. Find below some comments/ improvements to the current implementation.
...

Thank you for looking into this pull request ! I agree with some of the changes you suggested. The estimated error on the algebraic variables was previously zeroed out. I have done some more developments in a personal repository (https://github.com/laurent90git/DAE-Scipy). I have, among other changes, made it a requirement to specify the index associated with each component of the state vector. This is needed for the scaling of the residuals, of the norm of the Newton increment, and of the estimated error. These 3 scalings have helped quite a lot in marking the code more robust in my usual test cases. This personal version of Radau is also filled with optional printouts to investigate convergence of the Newton iterations, error estimation and other things.

I have been able to successfully integrate Jay's DAE model with the following trick. If the Newton loop has one bad iteration (i.e. convergence is found to be too slow, or a minor divergence occurs), the iteration is performed nonetheless instead of simply failing and asking for a Jacobian update of time step restriction. Indeed, I noticed that Jay's problem would fail most of the time at the first step, even with a reduced time step and an updated Jacobian. However, allowing for one such bad iteration to occur allows the Newton solver to "get back on his feet" and it usually converges within two additional iterations.
I had already used that trick successfully in other work with DIRK integrators, however I reckon that I have not found anything on this strategy in the literature, and I am not able to explain the profound reason why it can be sensible in some cases.
Note that this strategy is not present in the original Fortran code of Hairer. Thus, I will try to find more time to benchmark the original Fortran code against the current Python version in a variety of test cases. Also, I have to find out how the modifications impact the behaviour of the integrator in the classical ODE case.

Still, for low error tolerances (1e-10) on Jay's problem, the Newton cannot reach a low enough level of error, even with more iterations. The only way to make it work is to increase the Newton tolerance.

The code now may start to be substantially more complex compared to the original ODE-only implementation. Given the fact that solve_ivp's solvers are intended for ease of use, I wonder if the modified Radau should not be included as a separate method...

@kookma
Copy link

kookma commented Jan 6, 2024

Bump!

@lucascolley
Copy link
Member

Hi @laurent90git, are you still interested in including this in SciPy? If so, would you like the code that's here reviewed, or would you rather incorporate changes from your personal repository first?

@laurent90git
Copy link
Author

Hi,
I still haven't thought about which way the solver should be added to solve_ivp. I feel like it adds some complexity to the code. Robustness may for instance be greatly improved with some minor but rather technical tricks which I haven't yet thought much about giving access to the user.

I also wonder if a new function, e.g. solve_dae would not be interesting idea, event if it calls solve_ivp under the hood.

I'll try to take a look back at my branch and see how to wrap up this new functionality, before submitting either a new version, or simply coming back at you for a review of the current one.

@lucascolley lucascolley added the needs-work Items that are pending response from the author label Jan 10, 2024
@JonasBreuling
Copy link

I have some further comments to add. The features you have added in your personal repository would be a great enhancement for scipy. They outperform the original fortran code of Ernst Hairer and can even solve nonlinear index 3 DAE's. The crucial part is the Newton iteration as you already mentioned above. The settings should be given as optional parameters to the user insead of using fixed constants as the current implementation does.

Moreover, the idea of adding a solve_dae functionality sounds great for me. This allows so seperate all the Jacobian, mass matrix and DAE Index checks in a base class for systems of the form My' = f(x, y). Other DAE solvers like TRBDF2 and the BDF method could be added later. By that, scipy's solve_dae method would outplay matlab's DAE solvers, since they only work for index 1 DAE's due to the usage of standard error estimates. I think this would be very appealing for the community.

In my optinion, The last missing ingredient is to add a bunch of DAE problems as unit tests. I have experienced with lots of examples including Robertson problem, index 3 pendulum and Jay's nonlinear index 3 problem. At least these three examples should be added as test cases.

If I can assist further to get this PR done please let me know.

@kookma
Copy link

kookma commented Jan 11, 2024

I support having aolve_dae, and keep solve_ivp clean and simple. Iam ready to help with testing and proving examples mainly from chemical engineering field where many real examples can be found.

@kookma
Copy link

kookma commented Feb 26, 2024

As this feature: solving DAE with SciPy solve_ivp solver is crucial, is it possible to nominate it for Google summer code:
Program website: https://g.co/gsoc

I propose this as implementing this feature takes time and needs resources.

@lucascolley lucascolley changed the title ENH: add functionality to scipy.integrate._ivp.Radau ENH: add functionality to integrate._ivp.Radau Jun 2, 2024
@lucascolley lucascolley changed the title ENH: add functionality to integrate._ivp.Radau ENH: integrate: add functionality to Radau Jun 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement A new feature or improvement needs-work Items that are pending response from the author scipy.integrate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Radau - missing option "mass"
7 participants