Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Fix LSODA interpolation scheme #14552

Merged
merged 5 commits into from Jun 2, 2023
Merged

Conversation

Patol75
Copy link
Contributor

@Patol75 Patol75 commented Aug 6, 2021

Reference issue

Closes gh-14480

What does this implement/fix?

As explained in #14480, LsodaDenseOutput fails to reproduce y(t_old). It looks to me that this only occurs when LSODA plans on changing the method order (i.e. NQU != NQCUR, equivalent to self._lsoda_solver._integrator.iwork[13] != self._lsoda_solver._integrator.iwork[14]). To be clear, this is not exactly a bug in Scipy - Scipy perfectly implements Taylor's theorem and applies the scaling explained in DLSODA. Instead, it is likely a mistake either in the FORTRAN implementation or in its associated docstring.

Taking #14480 again as an example, the implemented formula in Scipy is equivalent to:

(1.71524262e-06
 + -1.24130739e-06 * (0.06342019780923039 - 0.06402588986309915) / 0.00019480871245159827
 + 1.88400737e-07 * ((0.06342019780923039 - 0.06402588986309915) / 0.00019480871245159827) ** 2)

which yields 7.395919849794794e-06.

The proposed formula reads:

(1.71524262e-06
 + -1.24130739e-06 * (0.06342019780923039 - 0.06402588986309915) / 0.00019480871245159827
 + 1.88400737e-07 * ((0.06342019780923039 - 0.06402588986309915) / 0.00019480871245159827) ** 2
 + -4.87455807e-07 * ((0.06342019780923039 - 0.06402588986309915) / 0.0006056920538687616) ** 3)

and yields 7.883375656794793e-06, which is the correct result.

In the formulas above, [1.71524262e-06, -1.24130739e-06, 1.88400737e-07, -4.87455807e-07] is the Nordsieck history array YH, obtained in that case through self._lsoda_solver._integrator.rwork[20:24]; 0.06342019780923039 is t_old; 0.06402588986309915 is t; 0.00019480871245159827 is self._lsoda_solver._integrator.rwork[11], HCUR; and 0.0006056920538687616 is self._lsoda_solver._integrator.rwork[10], HU.

Therefore, it looks like one term is missing when NQCUR < NQU. According to the docstring, the number of terms included in the formula should be controlled by NQCUR. However, I have found that it is really NQU that should be used. In the case where NQU == NQCUR, nothing changes. When NQCUR < NQU, as highlighted above, one term is added, and, interestingly, it must use HU instead of HCUR. Finally, when NQCUR > NQU, the proposed implementation discards one term compared to the current implementation. I have tested locally all the possible scenarios with both method order and step size changing; the proposed implementation yields the right result every time.

I do not have an explanation for the why behind all of this. Perhaps someone motivated can have a careful look through DSTODA - I feel that the reason lies there somewhere.

Additional information

ODEPACK is available here.

@Patol75 Patol75 closed this Aug 6, 2021
@Patol75 Patol75 reopened this Aug 7, 2021
@Patol75
Copy link
Contributor Author

Patol75 commented Aug 12, 2021

@tylerjereddy @tupui Any chance for some feedback on this PR? Could you potentially add the proper tags (scipy.integrate and defect I imagine) and request a review from someone familiar with solve_ivp within the SciPy community? Thank you for your help.

@tylerjereddy tylerjereddy added scipy.integrate defect A clear bug or issue that prevents SciPy from being installed or used as expected labels Aug 12, 2021
Copy link
Contributor

@tylerjereddy tylerjereddy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI is all green, which is a good sign, but this is way outside my wheelhouse.

One thing that makes it a bit harder for me as a non-expert is that your explanation, while detailed, does leave a bit of confusion about the nature of the "bug," i.e., you mention

I do not have an explanation for the why behind all of this. Perhaps someone motivated can have a careful look through DSTODA - I feel that the reason lies there somewhere.

That said, I suspect that this is pretty specialized stuff, so just needs the right reviewer, which can be tricky for us to do quickly sometimes.

One superficial comment is that if you believe codecov (sometimes it is a bit flaky), some of the code changed in the diff here isn't covered by a test case: https://app.codecov.io/gh/scipy/scipy/compare/14552/diff

image

@Patol75
Copy link
Contributor Author

Patol75 commented Aug 12, 2021

Thanks @tylerjereddy. I agree I would also like to provide the proper explanation as to why this behaviour is happening. Having a look again at the last update I posted on gh-14480, it looks to me like a proper bug in the FORTRAN implementation of either DLSODA, DSTODA or DINTDY. The reason I am saying this is that I can exactly reproduce the value given by SciPy using the lines of code within DINTDY and replacing the values of all the relevant variables with what SciPy gives.

I would not be surprised that LsodaDenseOutput is lacking some testing, because otherwise this bug would have been caught earlier. The tricky thing here is that it is not obvious to test because you need to rely on the fact that DLSODA will decide to change the order of the method used during integration, which is almost impossible to predict. One thing I can think of is to hardcode a few cases where the order of the method is indeed observed to change, but this is very fragile.

@Patol75
Copy link
Contributor Author

Patol75 commented Aug 31, 2021

@tylerjereddy I have just found the following comment, introduced within this commit by @nmayorov, which corresponds exactly to the bug I have reported. I will push a commit to remove this if condition as well as similar ones further down the script and see if CI stays green.

@Patol75
Copy link
Contributor Author

Patol75 commented Sep 1, 2021

It turns out that these tests were passing anyway even without the changes proposed in this PR. I have tried to further replicate the behaviour I have previously described, but I could not achieve any success without hard-coding values. I have also tried to use the tests already present in test_ivp.py but, here again, no success. Nevertheless, I think I have finally found something. The core of the issue I have reported is that interpolants for LSODA are inaccurate within the interval [t_old, t[. The only way to demonstrate it is to compare the interpolant calculated at t and evaluated at t_old with the algorithm solution at t_old. This is almost what is happening while events are handled because brentq evaluates the interpolant both at t_old and t and checks that obtained values bracket 0. If they do not, then it clearly means that the interpolant value for t_old is wrong as the one for t is, by construction, equal to the algorithm solution. However, once the integration is complete, such a comparison is not possible anymore as the interpolants are combined in such a way that evaluation at a given t value that was used by the algorithm during integration always results in calling the interpolant when that t value was equal to t, and not t_old. However, one can easily modify this behaviour, i.e. use the interpolant for when t was equal to t_old instead. I will push a commit that includes this modification and reverts my changes in lsoda.py. This should yield AssertionError in multiple tests within test_ivp.py. Once testing is done, I will re-apply my changes in lsoda.py while keeping the modified behaviour, and this should, hopefully, execute without any error.

@tupui
Copy link
Member

tupui commented Sep 1, 2021

FYI, you will need to merge master for the CI to run due to recent changes on master. For the PR itself, I am afraid I am not qualified as well.

@Patol75
Copy link
Contributor Author

Patol75 commented Sep 1, 2021

@tupui Thanks for the heads-up!

CI is back to green, so I can give more detail about what happened over the last few commits. Reverting the changes in lsoda.py and inverting the behaviour of the interpolant in common.py (combined with merging master into the fork) triggered four AssertionError within test_ivp.py, which we can separate in two groups: test_OdeSolution and the rest. test_OdeSolution failed because of the changes in common.py as test_OdeSolution makes use of solutions that are discontinuous. At the discontinuity, the value given by the interpolant depends on if the interpolant selects the segment before or after the discontinuity. As a result, inverting the behaviour of the interpolant will indeed change the value at the discontinuity and, thereby, lead to failing assertions there. For the second group of errors, let's look at the last commit. In there, I re-apply the changes proposed in this PR to lsoda.py and adjust values in test_OdeSolution to avoid the failure just described. In doing so, all four AssertionError disappear. In particular, for the second group of errors, this demonstrates that the failures described before the last commit were due to LSODA as a result of the change in interpolant behaviour. What is interesting here is that this change does not seem to affect other solvers, meaning that evaluating the solution at t_old from both the interpolants calculated at t and t_old yields the correct results for all solvers, except LSODA (that is, without this PR included). However, it does affect discontinuous solutions such as the ones used in test_OdeSolution and, therefore, it leads to a backwards-incompatible change. The good news here is that the changes in common.py and test_OdeSolution are not strictly speaking needed: I only committed them to perhaps more clearly highlight the origin of the bug at the root of this PR and also demonstrate that currently implemented tests in SciPy do fail if the bug in LSODA is exposed to them. A final concern is that codecov still indicates that the proposed change in lsoda.py to account for the method decreasing order is untested. Again, I cannot manage to think of an elegant way to have this condition tested other than hard-coding a case where it is used.

@Patol75
Copy link
Contributor Author

Patol75 commented Sep 10, 2021

Update regarding the last commits

Through a short discussion with the developer of the Julia wrapper around the C implementation of LSODA, I realised that the way I was describing the issue here at SciPy was not fully accurate. Indeed, I was presenting the bug as resulting from an error in the interpolation scheme, and the fix I had implemented clearly differentiated the case with a method order decrease from the other ones. However, this does not make sense as the scheme used for interpolation is standard and, therefore, must be the same for all cases of method order changes. As a result, I pushed the first of the previous four commits, which clearly demonstrates that the method order used to extract derivative values was wrong and that there is a scaling issue for the highest order derivative when the method order decreases. The resulting behaviour of lsoda.py does not change from the first proposed fix (CI stayed green); I simply put the emphasis on where the problem really is.

Furthermore, I realised that decreases in method order within LSODA occur when solution values become small. As a result, I was able to design a simple test case specifically for LSODA which I pushed in the second of the previous four commits. Also, I updated comments and docstrings in common.py to reflect the change in segment selection behaviour. Finally, I reverted the proposed fix in lsoda.py to highlight that, without it, CI becomes red, which it did. The great outcome of that commit is full coverage of _dense_output_impl within lsoda.py, as highlighted by codecov and which was missing before.

In the third commit, I re-applied the proposed changes to lsoda.py to demonstrate that CI would go back to green. It almost did, as only CircleCI reported an unrelated failure, which I thought I could fix by merging master into my fork (commit 4), but no luck. In any case, this fully establishes that the proposed changes to common.py allow revealing the current bug in SciPy associated with LSODA, which the newly implemented test in test_ivp.py does capture fully.

Summary of this PR

  • The current bug associated with the LSODA solver corresponds to the interpolant not being able to reproduce expected, correct values when the method order used by the solver changes across a step.

  • During a solver step, the method order used for the next step is determined. There are three possible cases:

    1. The method order stays the same. The number of terms considered when evaluating Taylor's theorem, which is equal to order + 1, can only be correct as there is only one order value to choose from. Hence, the interpolant produces correct values.
    2. The method order increases. The number of terms is still order + 1, but this time there are two possible values for order. SciPy uses the value of the next step, which is incorrect and leads to inaccuracies in the interpolant through the addition of an extra term into the formula.
    3. The method order decreases. Here again, there are two possible values for order. SciPy incorrectly uses the value for the next step and, therefore, misses one term, leading to interpolation errors. Additionally, this extra term is not properly scaled when the timestep is set to change across the next solver step.
  • Proposed changes in lsoda.py include using the correct method order value and re-scaling the highest-order-derivative term when the method order decreases (this could be further restricted to when the timestep changes as well, but that would not change the results).

  • There are at least two reasons why current tests implemented within SciPy do not capture this bug:

    1. Comparing solver solutions at solver-chosen steps to dense output interpolation of the same solver-chosen steps. The current behaviour implemented in SciPy is to assemble the full dense output functionality by selecting interpolant segments that correspond to segments calculated when the steps chosen by the solver were the current step within the solver, as opposed to the previous step. By construction, the LSODA interpolant is exact at the current step, and hence it is impossible to assess if the interpolant is doing its job properly.
    2. Assessing comparisons between analytical and interpolated solutions by checking that a computed error metric is small enough given solver tolerances. This is a pretty standard and valid approach. However, errors generated through the present bug by the LSODA interpolant are small enough so that they pass under the radar. Nevertheless, they are big enough to generate errors while handling events, as highlighted in the issue associated with this PR.
  • Proposed changes in this PR mainly address the first reason just stated through changes in common.py and test_ivp.py. Selected interpolant segments at a given solver-chosen step now correspond to calculated segments when the step was the previous step in the solver. An additional test allows capturing both method order increases and decreases; the latter was missing.

  • Proposed changes in this PR only deal with consequences, as opposed to causes. It is extremely likely that the root of this bug is an error in scaling the highest order derivative stored in the Nordsieck history array when the method order is set to decrease. It looks to me that this only influences the behaviour of the interpolant, and not the one of the solver, otherwise asserting equality between analytical and solver-produced solutions would fail. Nevertheless, it would be best to confirm this hypothesis, which may prove tedious.

@tylerjereddy @tupui Can you confirm that the CircleCI failure is unrelated to this PR? Additionally, considering the progress made on this PR, is there any chance for it to be included in the v1.8.0 release?

@tylerjereddy
Copy link
Contributor

Can you confirm that the CircleCI failure is unrelated to this PR?

Looks like it is unrelated from a quick look through the connected functions, and we have had a few timeout issues with CI benchmarks recently. I'm not an expert on the benchmarked code or the code adjusted here though.

Additionally, considering the progress made on this PR, is there any chance for it to be included in the v1.8.0 release?

1.8.0 is still a few months away (usually in December for the "winter release"), and for bug fixes we even consider backporting to i.e., 1.7.x. Main challenge remains an expert reviewer--perhaps we could ping the Julia maintainer you mention? From reading through some of the (detailed) descriptions above, it sounds like we do now have executable evidence/testing that something was fixed? So, that sounds a fair bit more promising as a non-expert at least.

assert_equal(res.t[0], t_span[0])
assert_equal(res.t[-1], t_span[-1])
assert_allclose(first_step, np.abs(res.t[1] - t_span[0]))
assert_(res.t_events is None)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for new testing code, we have a modest preference for plain assert

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it fine to replace all instances in the file with the plain assert? I have done it for consistency.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the future, please avoid mixing wholesale changes like this with another PR. The reason is just practical - GH shows all of the changes, and they distract from the point of the PR.

yc = res.sol(tc)

e = compute_error(yc, yc_true, rtol, atol)
assert_(np.all(e < 5))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could assert_array_less() work here to give us somewhat better feedback when an error arises?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is indeed a relevant suggestion. I have applied it to all similar instances within the file for consistency.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's tough to say what to do with the old code, 1) because we want consistency, 2) we don't want to invite large style-only PRs, and 3) we don't want to mix big style changes with other PRs. It's tough to get all three. I guess the idea is to change a little at a time so that eventually it becomes consistent? In this case, I would just work something out with the reviewer - make no unnecessary changes in the bug fix PR, and do a follow-up for style cleanup.

@Patol75
Copy link
Contributor Author

Patol75 commented Jan 19, 2022

@nmayorov Would you have some time to take a look at this PR? It would be nice to have this bug fixed.

@Patol75
Copy link
Contributor Author

Patol75 commented Sep 20, 2022

Just a small comment that #15860 seems to be fixed by the changes made to lsoda.py in this PR.

@Patol75
Copy link
Contributor Author

Patol75 commented Oct 11, 2022

@mdhaber I have rewritten the commit history of this PR and now have only two commits. The first one introduces changes in OdeSolution, re-activates testing of LSODA in some tests, and adds a new test targeting the accuracy of LSODA's interpolant. These changes basically expose the bug in the current main branch of SciPy. In the second commit, I patch this bug. Do you think there is any chance you could review these changes, or would we still want @nmayorov input?

@mdhaber
Copy link
Contributor

mdhaber commented Oct 13, 2022

I will try to decide this weekend whether I can review this.

@zachjweiner
Copy link

I had given the content of the first commit here---the choice of segment sides---some thought in #15667 (comment) (which I should've posted here originally, too), which I wanted to follow up on. (I haven't checked the LSODA internals myself to be able comment on the fix in the second commit.)

The addition of the alt_segment flag to OdeSolution seems like the right move to me per my previous comment---it's flexible to methods with polynomial interpolant nodes at the beginning of the timestep interval (e.g., RK methods) or the end (LSODA and BDF, I think). But I'm confused: it doesn't seem anything was changed to actually use alt_segment=True, such that c417b7d doesn't actually make any behavioral changes. Am I inferring correctly that the (now passing) tests that evaluate LSODA interpolants at the nodes/breakpoints are still "trivial" (in the sense that the polynomials are evaluated with argument 0)? My memory was that enabling a nontrivial test of the LSODA interpolants' evaluation at breakpoints was the motivation for such a change.

@Patol75
Copy link
Contributor Author

Patol75 commented Oct 16, 2022

Thanks, @zachjweiner, it is true that the discussion you linked is the one that made me think of this alt_segment argument. It took me forever to finally get to write it down, but here we are. Thank you also for pointing out again that BDF uses the same interpolant construction choice as LSODA; I had missed that. And indeed, thank you for the careful comment regarding the missing change: for some reason, I forgot to include it.

I have also taken the opportunity to update the additional test I contributed to test_ivp.py. The main purpose of this test was to cover the added if condition of _dense_output_impl, but it is also useful to highlight the failure that I had initially reported in gh-14480.

@nmayorov
Copy link
Contributor

I've been trying to understand the logic in LSODA with order and step size and how it affects "yh".

Theoretically when the order increases the highest derivative of the new model polynom should be zero, according to this paragraph:

Screenshot from 2022-10-18 17-45-59

But we see that it is not the case with LSODA. Maybe this last column in yh is in some "utility" state at this point and we indeed should not use it (and thus use the order in the last step).

As of the decreasing order, @Patol75 can you provide me with an example of the problem when the order decrease is triggered?

@Patol75
Copy link
Contributor Author

Patol75 commented Oct 23, 2022

Hi @nmayorov,
Thank you for looking into this PR. As for providing an example of the problem when order decreases, the new test implemented in this PR does exactly that. If you look at the CI run associated with the first commit above, you can see a failure being reported for this new test.

Also, not sure if that helps, but you can check the following comment block.

@Patol75
Copy link
Contributor Author

Patol75 commented Mar 2, 2023

Just checking in, any chance for an update on the review status of this PR?

@h-vetinari
Copy link
Member

I'm sorry this hasn't received further review yet. I'm not an expert on the code in question, but I may be able to take a look. However, it's unlikely that this makes it into 1.11, which is due to branch very soon, so I'm bumping the milestone for now. If things get resolved in time, we can of course change it back again.

@h-vetinari h-vetinari modified the milestones: 1.11.0, 1.12.0 May 20, 2023
@steppi
Copy link
Contributor

steppi commented May 26, 2023

@Patol75 I've looked carefully through the relevant Fortran code in stoda.f and lsoda.f, and now understand what the issue is.

Your solution appears correct. What follows is my understanding of what's going on:

This is not due to a bug in ODEPACK. For getting dense output, what we want is the approximating polynomial for the last successful step. As you noticed, SciPy's wrapper is currently taking the approximating polynomial to order nq, the order to be attempted next, rather than nqu, the last successful order. Replacing order = iwork[14] with order = iwork[13] is the correct fix here.

After each call to stoda, the Nordsieck history array, yh, will be in the state needed for attempting the next step, not in the state for the last successful attempt.

The columns of the Nordsieck array are of the form

$$\frac{h^j}{j!}P^{(j)}(t)$$

for j from 0 to the order q, and where $P(t)$ is a vector of length $n$ where $n$ is the number of equations. I read Nordsieck (1962) and found that t and P can not change when modifying the array for attempting the next step, what can change is the order q and the step size h. Change in the order is handled by adding or removing a term, and a change in h is handled by rescaling.

Looking at the dense output implementation code

    def _call_impl(self, t):
        if t.ndim == 0:
            x = ((t - self.t) / self.h) ** self.p
        else:
            x = ((t - self.t) / self.h) ** self.p[:, None]

        return np.dot(self.yh, x)

the particular value of h used doesn't actually matter. It's being scaled out. What matters is that all of the entries of the Nordsieck history array are using the same value of h. However, when the order is being reduced within stoda, because the Nordsieck array yh needs to end up in the state needed for the next step, what is the last column for the current order will be dropped from consideration, so stoda does not need to rescale it. All of the other columns are rescaled for the h to be attempted next. This is why when the order is decreasing, you need to rescale only the last column. Your fix is correct.

I'm sorry if this not clear yet. I've written it up hastily. Please let me know if you have any questions. I think there should be a comment added to _dense_output_impl explaining the changes you've made. I can suggest something in a review comment. I should have the full review completed today.

Copy link
Contributor

@steppi steppi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work identifying the fix, finding test cases which lead to order decrease, and identifying the alt_segment issue which made the tests for LSODA and BDF not actually work.

As I mentioned before, I dug through the Fortran and found the root cause. Only minor suggestions related to adding or updating comments. Impressed with this one.

scipy/integrate/_ivp/lsoda.py Outdated Show resolved Hide resolved
scipy/integrate/_ivp/common.py Outdated Show resolved Hide resolved
@steppi steppi mentioned this pull request May 27, 2023
36 tasks
@steppi
Copy link
Contributor

steppi commented Jun 1, 2023

@Patol75 Do you have any thoughts on the suggested comments? If you think they look fine, let's commit them, and then I think this is ready to merge, as long as there are no unexpected test failures.

Patol75 and others added 2 commits June 2, 2023 21:16
Co-authored-by: Albert Steppi <albert.steppi@gmail.com>
Co-authored-by: Albert Steppi <albert.steppi@gmail.com>
@Patol75
Copy link
Contributor Author

Patol75 commented Jun 2, 2023

@steppi I am very, very grateful for the reviewing work you have done here. What you uncovered makes a lot of sense to me and explains perfectly the behaviour that I had noticed and documented. The added comments are, I believe, extremely helpful for anyone who would try to understand how the interpolant is built. I think I owe you a massive thank you!

I agree that the PR should now be ready for merging provided the CI returns green. For now, I see that CircleCI reports a failure while attempting to build SciPy, but it seems to be related to some code in special.

@steppi
Copy link
Contributor

steppi commented Jun 2, 2023

@steppi I am very, very grateful for the reviewing work you have done here. What you uncovered makes a lot of sense to me and explains perfectly the behaviour that I had noticed and documented. The added comments are, I believe, extremely helpful for anyone who would try to understand how the interpolant is built. I think I owe you a massive thank you!

I agree that the PR should now be ready for merging provided the CI returns green. For now, I see that CircleCI reports a failure while attempting to build SciPy, but it seems to be related to some code in special.

No problem, happy to help! Thanks for identifying and finding a fix for this problem.

The CircleCI failure is due to that build using Python 3.8, while the type hints used are only available in what is now the minimum required version: 3.9.

I'm not sure what's causing the MannWhitneyU failure in linux_aarch64_test and macos_arm64_test, but it's definitely unrelated. I plan to merge this later today.

Feel free to ping me if you need a review for any PRs in the future.

Copy link
Contributor

@steppi steppi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

@steppi steppi merged commit c374ca7 into scipy:main Jun 2, 2023
16 of 19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
defect A clear bug or issue that prevents SciPy from being installed or used as expected scipy.integrate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

LSODA implementation of dense output yields incorrect result
9 participants