BUG: Fix LSODA interpolation scheme #14552
Conversation
@tylerjereddy @tupui Any chance for some feedback on this PR? Could you potentially add the proper tags?
CI is all green, which is a good sign, but this is way outside my wheelhouse.
One thing that makes it a bit harder for me as a non-expert is that your explanation, while detailed, does leave a bit of confusion about the nature of the "bug," i.e., you mention
I do not have an explanation for the why behind all of this. Perhaps someone motivated can have a careful look through DSTODA - I feel that the reason lies there somewhere.
That said, I suspect that this is pretty specialized stuff, so just needs the right reviewer, which can be tricky for us to do quickly sometimes.
One superficial comment is that if you believe codecov (sometimes it is a bit flaky), some of the code changed in the diff here isn't covered by a test case: https://app.codecov.io/gh/scipy/scipy/compare/14552/diff
Thanks @tylerjereddy. I agree; I would also like to provide a proper explanation of why this behaviour is happening. Having another look at the last update I posted on gh-14480, it looks to me like a proper bug in the FORTRAN implementation of either DLSODA, DSTODA or DINTDY. The reason I am saying this is that I can exactly reproduce the value given by SciPy using the lines of code within DINTDY and replacing the values of all the relevant variables with what SciPy gives. I would not be surprised if LsodaDenseOutput is lacking some testing, because otherwise this bug would have been caught earlier. The tricky thing here is that it is not obvious to test: you need to rely on the fact that DLSODA will decide to change the order of the method used during integration, which is almost impossible to predict. One thing I can think of is to hardcode a few cases where the order of the method is indeed observed to change, but this is very fragile.
@tylerjereddy I have just found the following comment, introduced within this commit by @nmayorov, which corresponds exactly to the bug I have reported. I will push a commit to remove this.
It turns out that these tests were passing anyway even without the changes proposed in this PR. I have tried to further replicate the behaviour I have previously described, but I could not achieve any success without hard-coding values. I have also tried to use the tests already present in …
FYI, you will need to merge master for the CI to run due to recent changes on master. For the PR itself, I am afraid I am not qualified either.
@tupui Thanks for the heads-up! CI is back to green, so I can give more detail about what happened over the last few commits. Reverting the changes in …
Update regarding the last commits

Through a short discussion with the developer of the Julia wrapper around the C implementation of LSODA, I realised that the way I was describing the issue here at SciPy was not fully accurate. Indeed, I was presenting the bug as resulting from an error in the interpolation scheme, and the fix I had implemented clearly differentiated the case with a method order decrease from the other ones. However, this does not make sense, as the scheme used for interpolation is standard and, therefore, must be the same for all cases of method order changes. As a result, I pushed the first of the previous four commits, which clearly demonstrates that the method order used to extract derivative values was wrong and that there is a scaling issue for the highest-order derivative when the method order decreases. The resulting behaviour of …

Furthermore, I realised that decreases in method order within LSODA occur when solution values become small. As a result, I was able to design a simple test case specifically for LSODA, which I pushed in the second of the previous four commits. Also, I updated comments and docstrings in …

In the third commit, I re-applied the proposed changes to …

Summary of this PR
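The observation that order decreases coincide with small solution values suggests a test along the following lines. This is a hypothetical illustration, not the exact test added in the PR; the decay rate, tolerances, and error bound are my own choices:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Rapid exponential decay drives the solution toward zero, which tends
# to make LSODA reduce its method order mid-integration.  The dense
# output should still track the analytic solution y(t) = exp(-20 t).
def rhs(t, y):
    return -20.0 * y

res = solve_ivp(rhs, (0.0, 2.0), [1.0], method="LSODA",
                dense_output=True, rtol=1e-8, atol=1e-10)
t_check = np.linspace(0.0, 2.0, 50)
err = np.abs(res.sol(t_check)[0] - np.exp(-20.0 * t_check))
assert err.max() < 1e-6
```

With the interpolation bug present, the dense output could deviate noticeably near points where the order drops; with the fix, the interpolant stays within integration tolerances everywhere.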
@tylerjereddy @tupui Can you confirm that the CircleCI failure is unrelated to this PR? Additionally, considering the progress made on this PR, is there any chance for it to be included in the v1.8.0 release?
Looks like it is unrelated from a quick look through the connected functions, and we have had a few timeout issues with CI benchmarks recently. I'm not an expert on the benchmarked code or the code adjusted here though.
```python
assert_equal(res.t[0], t_span[0])
assert_equal(res.t[-1], t_span[-1])
assert_allclose(first_step, np.abs(res.t[1] - t_span[0]))
assert_(res.t_events is None)
```
for new testing code, we have a modest preference for plain assert
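A small sketch of what this preference looks like in practice, using made-up stand-in values rather than an actual solver result:

```python
import numpy as np

# Hypothetical stand-ins for res.t, t_span, and res.t_events.
t = np.array([0.0, 0.25, 1.0])
t_span = (0.0, 1.0)
t_events = None

# Plain assert statements, preferred for new test code over
# numpy.testing.assert_ for simple truth checks.
assert t[0] == t_span[0]
assert t[-1] == t_span[-1]
assert t_events is None
```

Tolerance-based comparisons such as `assert_allclose` still use the `numpy.testing` helpers; the preference applies only to bare truth checks.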
Is it fine to replace all instances in the file with the plain assert? I have done it for consistency.
In the future, please avoid mixing wholesale changes like this with another PR. The reason is just practical - GH shows all of the changes, and they distract from the point of the PR.
```python
yc = res.sol(tc)
e = compute_error(yc, yc_true, rtol, atol)
assert_(np.all(e < 5))
```
could assert_array_less() work here to give us somewhat better feedback when an error arises?
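For illustration, a minimal sketch of the difference (the array values here are invented): a bare assert on `np.all(e < 5)` only reports False, whereas `assert_array_less` raises a message listing the offending elements.

```python
import numpy as np
from numpy.testing import assert_array_less

e = np.array([0.5, 1.2, 7.3])  # invented error values, one above the bound

try:
    assert_array_less(e, 5)
except AssertionError as exc:
    # The raised message identifies which elements violate e < 5.
    print(type(exc).__name__)  # prints AssertionError
```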
That is indeed a relevant suggestion. I have applied it to all similar instances within the file for consistency.
It's tough to say what to do with the old code, 1) because we want consistency, 2) we don't want to invite large style-only PRs, and 3) we don't want to mix big style changes with other PRs. It's tough to get all three. I guess the idea is to change a little at a time so that eventually it becomes consistent? In this case, I would just work something out with the reviewer - make no unnecessary changes in the bug fix PR, and do a follow-up for style cleanup.
@nmayorov Would you have some time to take a look at this PR? It would be nice to have this bug fixed.
Just a small comment that #15860 seems to be fixed by the changes made to …
@mdhaber I have rewritten the commit history of this PR and now have only two commits. The first one introduces changes in …
I will try to decide this weekend whether I can review this.
I had given the content of the first commit here---the choice of segment sides---some thought in #15667 (comment) (which I should've posted here originally, too), which I wanted to follow up on. (I haven't checked the LSODA internals myself to be able to comment on the fix in the second commit.) The addition of the …
Thanks, @zachjweiner, it is true that the discussion you linked is the one that made me think of this. I have also taken the opportunity to update the additional test I contributed to …
I've been trying to understand the logic in LSODA with order and step size and how it affects "yh". Theoretically, when the order increases, the highest derivative of the new model polynomial should be zero, according to this paragraph: … But we see that it is not the case with LSODA. Maybe this last column in …

As for the decreasing order, @Patol75 can you provide me with an example of the problem when the order decrease is triggered?
Hi @nmayorov, … Also, not sure if that helps, but you can check the following comment block.
Just checking in, any chance for an update on the review status of this PR?
I'm sorry this hasn't received further review yet. I'm not an expert on the code in question, but I may be able to take a look. However, it's unlikely that this makes it into 1.11, which is due to branch very soon, so I'm bumping the milestone for now. If things get resolved in time, we can of course change it back again.
@Patol75 I've looked carefully through the relevant Fortran code in stoda.f and lsoda.f, and now understand what the issue is. Your solution appears correct. What follows is my understanding of what's going on:

This is not due to a bug in ODEPACK. For getting dense output, what we want is the approximating polynomial for the last successful step. As you noticed, SciPy's wrapper is currently taking the approximating polynomial to order … After each call to … The columns of the Nordsieck array are of the form … for …

Looking at the dense output implementation code

```python
def _call_impl(self, t):
    if t.ndim == 0:
        x = ((t - self.t) / self.h) ** self.p
    else:
        x = ((t - self.t) / self.h) ** self.p[:, None]
    return np.dot(self.yh, x)
```

the particular value of …

I'm sorry if this is not clear yet. I've written it up hastily. Please let me know if you have any questions. I think there should be a comment added to …
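To make the Nordsieck description above concrete, here is a small self-contained sketch (my own example, not SciPy code): column j of `yh` stores h**j * y^(j)(t0) / j!, so the dot product computed by `_call_impl` is a Taylor expansion about t0, which is exact when y is a quadratic.

```python
import math
import numpy as np

def y(t):
    return 1.0 + 2.0 * t + 3.0 * t**2

t0, h = 0.5, 0.1
derivs = [y(t0), 2.0 + 6.0 * t0, 6.0]   # y, y', y'' evaluated at t0

# Nordsieck-style columns: h**j * y^(j)(t0) / j!  (one equation, so one row).
yh = np.array([[h**j * d / math.factorial(j) for j, d in enumerate(derivs)]])

t = 0.57
x = ((t - t0) / h) ** np.arange(3)      # same form as x in _call_impl
print(np.dot(yh, x)[0], y(t))           # both are 3.1147
```

The key point for the bug discussion: the result only comes out right if the number of columns used and the step size h match the ones the polynomial was actually built with.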
Nice work identifying the fix, finding test cases which lead to order decrease, and identifying the alt_segment issue which made the tests for LSODA and BDF not actually work. As I mentioned before, I dug through the Fortran and found the root cause. Only minor suggestions related to adding or updating comments. Impressed with this one.
@Patol75 Do you have any thoughts on the suggested comments? If you think they look fine, let's commit them, and then I think this is ready to merge, as long as there are no unexpected test failures.
Co-authored-by: Albert Steppi <albert.steppi@gmail.com>
@steppi I am very, very grateful for the reviewing work you have done here. What you uncovered makes a lot of sense to me and explains perfectly the behaviour that I had noticed and documented. The added comments are, I believe, extremely helpful for anyone who would try to understand how the interpolant is built. I think I owe you a massive thank you! I agree that the PR should now be ready for merging provided the CI returns green. For now, I see that CircleCI reports a failure while attempting to build SciPy, but it seems to be related to some code in …
No problem, happy to help! Thanks for identifying and finding a fix for this problem. The CircleCI failure is due to that build using Python 3.8, while the type hints used are only available in what is now the minimum required version: 3.9. I'm not sure what's causing the … Feel free to ping me if you need a review for any PRs in the future.
Looks good to me.
Reference issue
Closes gh-14480
What does this implement/fix?
As explained in #14480, `LsodaDenseOutput` fails to reproduce `y(t_old)`. It looks to me that this only occurs when LSODA plans on changing the method order (i.e. `NQU != NQCUR`, equivalent to `self._lsoda_solver._integrator.iwork[13] != self._lsoda_solver._integrator.iwork[14]`). To be clear, this is not exactly a bug in Scipy - Scipy perfectly implements Taylor's theorem and applies the scaling explained in DLSODA. Instead, it is likely a mistake either in the FORTRAN implementation or in its associated docstring.

Taking #14480 again as an example, the implemented formula in Scipy is equivalent to: …

which yields `7.395919849794794e-06`. The proposed formula reads: …

and yields `7.883375656794793e-06`, which is the correct result.

In the formulas above, `[1.71524262e-06, -1.24130739e-06, 1.88400737e-07, -4.87455807e-07]` is the Nordsieck history array `YH`, obtained in that case through `self._lsoda_solver._integrator.rwork[20:24]`; `0.06342019780923039` is `t_old`; `0.06402588986309915` is `t`; `0.00019480871245159827` is `self._lsoda_solver._integrator.rwork[11]`, `HCUR`; and `0.0006056920538687616` is `self._lsoda_solver._integrator.rwork[10]`, `HU`.

Therefore, it looks like one term is missing when `NQCUR < NQU`. According to the docstring, the number of terms included in the formula should be controlled by `NQCUR`. However, I have found that it is really `NQU` that should be used. In the case where `NQU == NQCUR`, nothing changes. When `NQCUR < NQU`, as highlighted above, one term is added, and, interestingly, it must use `HU` instead of `HCUR`. Finally, when `NQCUR > NQU`, the proposed implementation discards one term compared to the current implementation. I have tested locally all the possible scenarios with both method order and step size changing; the proposed implementation yields the right result every time.

I do not have an explanation for the why behind all of this. Perhaps someone motivated can have a careful look through DSTODA - I feel that the reason lies there somewhere.
Additional information
ODEPACK is available here.
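For readers who want to inspect the quantities named in the description, the sketch below shows one way to read them from a running solver. It relies on private SciPy internals (`_lsoda_solver`, `_integrator`) that are not public API and may change; the indices mirror ODEPACK's documented IWORK/RWORK optional-output layout.

```python
from scipy.integrate import LSODA

def rhs(t, y):
    return -y

solver = LSODA(rhs, 0.0, [1.0], t_bound=10.0)
while solver.status == 'running':
    solver.step()

# Private internals; layout mirrors ODEPACK's documented work arrays.
iwork = solver._lsoda_solver._integrator.iwork
rwork = solver._lsoda_solver._integrator.rwork
print("NQU   =", iwork[13])   # order last used        (Fortran IWORK(14))
print("NQCUR =", iwork[14])   # order to be attempted  (Fortran IWORK(15))
print("HU    =", rwork[10])   # step size last used    (Fortran RWORK(11))
print("HCUR  =", rwork[11])   # step size to attempt   (Fortran RWORK(12))
```

A mismatch between the two order slots is exactly the `NQU != NQCUR` condition under which the dense-output bug manifested.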