
Fix coefficients in expm helper function #9705

Merged
merged 1 commit into scipy:master on Apr 19, 2019

Conversation

@SuperFluffy (Contributor) commented Jan 21, 2019

According to [Higham 2005], the leading term of the backward error function of the [m/m] Padé approximant of the exponential function is x^{2m+1}. Its corresponding coefficient is C_{2m+1} = (m!)^2 / ((2m)! * (2m+1)!), cf. Equations (2.2) and (2.6) in that reference.

So far, the code did not compute C_{2m+1} but instead substituted m <- 2m + 1 into the formula, which is probably due to the literature's somewhat surprising notation (cf. the index i = 2m + 1 in Equation (2.6)).

[Al-Mohy and Higham 2009] provide a [Matlab reference implementation] in which they hard-code these coefficients, reproduced verbatim below. Using m rather than the m <- 2m + 1 substitution that the code currently makes exactly reproduces them:

```matlab
% Coefficients of leading terms in the backward error functions h_{2m+1}.
Coeff = [1/100800, 1/10059033600, 1/4487938430976000,...
         1/5914384781877411840000, 1/113250775606021113483283660800000000];
```

[Higham 2005]: https://doi.org/10.1137/04061101X
[Al-Mohy and Higham 2009]: https://doi.org/10.1137/09074721X
[Matlab reference implementation]: http://eprints.ma.man.ac.uk/1442/03/expm_new.zip
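
A minimal sketch of this cross-check in Python, using exact rational arithmetic; the Padé orders m = 3, 5, 7, 9, 13 assumed below are the ones the scaling-and-squaring algorithm selects from, and the listed fractions are the hard-coded Matlab constants above:

```python
# Cross-check: the corrected formula C_{2m+1} = (m!)^2 / ((2m)! * (2m+1)!)
# reproduces the constants hard-coded in the Matlab reference implementation.
from fractions import Fraction
from math import factorial

def leading_coefficient(m):
    """Leading backward-error coefficient C_{2m+1} of the [m/m] Pade approximant."""
    return Fraction(factorial(m) ** 2, factorial(2 * m) * factorial(2 * m + 1))

# The five hard-coded Matlab constants, written as exact fractions.
matlab_coeffs = [
    Fraction(1, 100800),
    Fraction(1, 10059033600),
    Fraction(1, 4487938430976000),
    Fraction(1, 5914384781877411840000),
    Fraction(1, 113250775606021113483283660800000000),
]

for m, reference in zip([3, 5, 7, 9, 13], matlab_coeffs):
    assert leading_coefficient(m) == reference  # exact match for all five orders
```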
@SuperFluffy (Contributor, Author)

Any way to move this forward? I cannot assess how bad this incorrect factor is, but given that a probably non-trivial number of people are using matrix exponentials in scipy, I would love to get this fixed (or to have the severity of this issue assessed, if somebody has that ability).

@tylerjereddy (Contributor) left a comment

A unit test that fails before & succeeds after the fix is usually helpful.

Also, there may be concerns about the license of the reference code you are citing? Maybe clarify: can that code definitely be used to fix issues in our code base?

@SuperFluffy (Contributor, Author) commented Feb 17, 2019

Re the unit tests:

As I wrote, I cannot assess the severity of this error. I can only state that a) scipy's expm claims to be an implementation of Al-Mohy and Higham 2009, and b) it is an incorrect implementation according to their paper and the ones cited. Given the approximate nature of this algorithm, we would need to establish an error bound that is not fulfilled by the current implementation but should be fulfilled by the reference implementation. I do not have a sufficient understanding of what is going on here to provide this expertise myself.

Right now, the unit tests available for scipy's expm itself are only testing trivial cases. Furthermore, I do not see how unit tests are useful here. If you want, I can write unit tests with the mathematically correct values, but that seems to be circular.

Re licensing:

I do not quite understand your point. In what way is their code used to fix scipy's code? Scipy implements the actual equations in the cited papers, and my fix calculates the same numbers that the Matlab script hard-codes. There is no sharing of code, no reimplementation, not even a derivative work. I derived the correct terms in the error function of the Padé approximants a) by hand, because I was using scipy's implementation to learn the algorithm and could not obtain the same factors, and b) by implementing them in a script. I have cross-checked the correctness of my factors by comparing them with the Matlab code above. The Matlab code itself was published alongside their paper; see the preprint server of Manchester University: http://eprints.maths.manchester.ac.uk/1442/

Steps forward:

The code is mathematically incorrect. You can follow the derivation of the coefficients in the papers cited; a brief sketch of the relevant relation is given at the end of this comment. Gautschi 2012 is also useful to read here.

Scipy should either not mention that it implements Al-Mohy and Higham 2009, or alternatively state that the implementation is probably not correct and warn people against using it.
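
For the index bookkeeping, here is a sketch of the relation the coefficient comes from, as I read Higham 2005 (the sign of the leading term is suppressed, since only its magnitude enters the bound): write the [m/m] Padé approximant as r_m(x) = e^{x + h_{2m+1}(x)}, so that

```latex
h_{2m+1}(x) = \log\!\bigl(e^{-x}\, r_m(x)\bigr)
            = \pm\, C_{2m+1}\, x^{2m+1} + O\!\bigl(x^{2m+2}\bigr),
\qquad
C_{2m+1} = \frac{(m!)^2}{(2m)!\,(2m+1)!}.
```

The subscript 2m + 1 only labels the power of x at which the series starts; feeding 2m + 1 into the factorials instead of m is what produced the wrong constants.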

@rgommers added this to the 1.3.0 milestone on Feb 20, 2019
@rgommers (Member)

If you want, I can write unit tests with the mathematically correct values, but that seems to be circular.

agreed

There is no sharing of code, no reimplementation, not even a derivative work. [...] I have cross-checked the correctness of my factors by comparing them with the Matlab code above. The Matlab code itself was published alongside their paper

this is all fine, thanks for clarifying

The code is mathematically incorrect.

yes, this PR should be okay as is. While this is a bugfix and we can therefore change the code, we should put a note in the 1.3.0 release notes when this PR is merged. Other than that, this just needs review. Unfortunately the expm expert we had went MIA. This should not be too hard to review though, since it's only a couple of lines of code.

I'll mark it for the next release; if no one has reviewed in 1-2 weeks I'll get to it at some point (no time now, sorry).

@tylerjereddy (Contributor)

we would need to establish an error bound that is not fulfilled by the current implementation but should be fulfilled by the reference implementation

I don't get how this wouldn't be useful. Maybe it is hard to do, and maybe the PR is fine as is, but it is certainly not ideal to have no regression guard for something that is mathematically incorrect.

@SuperFluffy (Contributor, Author) commented Feb 21, 2019

we would need to establish an error bound that is not fulfilled by the current implementation but should be fulfilled by the reference implementation

I don't get how this wouldn't be useful. Maybe it is hard to do, and maybe the PR is fine as is, but it is certainly not ideal to have no regression guard for something that is mathematically incorrect.

Oh, better unit tests would definitely be useful! I just don't know how to approach this, because I would need a much deeper understanding of the methods involved.

Maybe you can open an issue separate from this PR and take the lead on this? It would definitely be nice to see a unit test on expm that is directly affected by the wrong calculation of the Padé error coefficients. I would be very interested in that! I am happy to share anything I have found out about this.
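
For illustration, a minimal sketch of the kind of end-to-end regression check being discussed, comparing scipy.sparse.linalg.expm against a matrix with a closed-form exponential; whether the coefficient fix is actually visible at such a tolerance is exactly the open question here:

```python
# Hypothetical regression-style check: expm against a closed-form exponential.
# For A = [[0, theta], [-theta, 0]] the exact result is a rotation by theta.
import numpy as np
from scipy.sparse.linalg import expm

theta = 1.3
A = np.array([[0.0, theta],
              [-theta, 0.0]])
expected = np.array([[np.cos(theta), np.sin(theta)],
                     [-np.sin(theta), np.cos(theta)]])

np.testing.assert_allclose(expm(A), expected, rtol=1e-13, atol=1e-15)
```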

@tylerjereddy (Contributor)

Makes sense -- I'll think about this a bit!

@tylerjereddy (Contributor) left a comment

Ok, I'll go with Ralf on this one:

yes, this PR should be okay as is

Unit tests would have been nice, but we need a resident expert!

@tylerjereddy merged commit 4a497cc into scipy:master on Apr 19, 2019

@SuperFluffy (Contributor, Author) commented Apr 19, 2019

@tylerjereddy Awesome that this got merged, makes me very happy!

It looks like I cannot assign labels/milestones to issues. I guess I don't have the rights. :) Can you do that for me? Here is the associated issue: #10074

I hope that's what you meant for me to do. You linked to the issue tracker filtered by milestone 1.3.0.

@rgommers added the maintenance label on Apr 19, 2019
@rgommers (Member)

It looks like I cannot assign labels/milestones to issues. I guess I don't have the rights. :) Can you do that for me? Here is the associated issue: #10074

done

I hope that that's what you meant me to do.

No, the link was wrong. Tyler's question was to add a note to https://github.com/scipy/scipy/wiki/Release-note-entries-for-SciPy-1.3.0#scipysparse-improvements. I'm not sure that this is relevant for users though, since it's not clear how this is relevant to an end user of expm in a practical sense. So I suggest to leave it.

@SuperFluffy (Contributor, Author)

I'm not sure that this is relevant for users though, since it's not clear how this is relevant to an end user of expm in a practical sense. So I suggest to leave it.

It would indeed be interesting to ask users who rely heavily on expm to check if they observe any differences in their calculations before and after the fix.

@rgommers (Member)

That's a good idea - more a question for the mailing list than the release notes, though. If you want to write a message to scipy-user, that would be helpful.
