ENH: special: add the Wright Omega function #6653

Merged (20 commits, Oct 13, 2016)

Conversation

@person142 (Member)

This adds @pierslawrence's implementation of the Wright Omega function to special, as discussed in #2851. Since we don't support C99 complex arithmetic, the original C code is ported to C++. The tests generally look good, though there appear to be some overflow issues for large arguments. They might be caused by this line:

https://github.com/person142/scipy/blob/48f02f1ed7a2af6041fc4eb45726f23ab73faa41/scipy/special/wright.cc#L259

(In particular the division by z*z*z; the product could overflow.) Or perhaps I introduced an error while porting; I haven't investigated much, since at this point it seems best to keep the code as close to the original as possible.
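To make the overflow concern concrete, here is a hypothetical reduction with made-up magnitudes (not the actual wright.cc series term): the three-way product overflows to infinity before the division happens, while dividing term by term keeps every intermediate finite.

// Hypothetical reduction of the overflow concern (not the wright.cc code).
#include <cstdio>

int main()
{
    double num = 1e300, z = 1e150;
    double bad  = num / (z * z * z);  // z*z*z overflows to inf, so bad == 0
    double good = num / z / z / z;    // intermediates stay finite: 1e-150
    std::printf("bad = %g, good = %g\n", bad, good);
    return 0;
}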

I am unsure whether we should also add the unwinding number as a special function, since it's a one-liner, but I don't have strong opinions either way.

@person142 (Member, Author)

OK, the build is failing because Travis is using gcc 4.6, which only has experimental support for the C++11 <cfenv>. Since fenv.h is C99, is there a portable way to get functions like fesetround, short of adding our own implementations? (NumPy doesn't have them, AFAICT.)

@person142 (Member, Author) commented Oct 5, 2016

Can probably use FreeBSD's implementation:

https://github.com/freebsd/freebsd/blob/af3e10e5a78d3af8cef6088748978c6c612757f0/lib/msun/mips/fenv.h

though it seems to imply that changing the rounding mode won't work on ARM anyway.

@person142 force-pushed the wrightomega branch 2 times, most recently from ee2fd52 to 5df360e on October 6, 2016
@person142 (Member, Author)

Whew, finally a happy Travis. The use of fenv is eliminated: the code was using it to change the rounding mode before an addition, so standalone functions that add with a desired rounding mode were added in special/_round.h. They rely on the fact that the error e = a + b - fl(a + b) is exactly representable (and computable) in double precision; the sign of the error tells us whether round-to-nearest rounded up or down.

Here's a gist containing the code I was using to test the addition functions:

https://gist.github.com/person142/2eec2572fed3fea7ec22155f204c0aff
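For reference, here is a minimal sketch of the idea, assuming round-to-nearest double arithmetic with no extended precision (the actual special/_round.h code may differ in details):

// Minimal sketch of rounding an addition upward via the exact error term;
// not the actual special/_round.h code.
#include <cmath>
#include <limits>

double add_round_up(double a, double b)
{
    volatile double s, v;
    double e;

    s = a + b;  // fl(a + b), rounded to nearest
    // Two-Sum: recover e = (a + b) - fl(a + b), exact in double precision.
    v = s - a;
    e = (a - (s - v)) + (b - v);
    if (e > 0) {
        // Round-to-nearest rounded down; step up to the next double.
        s = std::nextafter(s, std::numeric_limits<double>::infinity());
    }
    return s;
}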

@person142 added the needs-work, enhancement, and scipy.special labels on Oct 6, 2016
@person142 (Member, Author)

Tweaked the asymptotic series to avoid overflow, and also skipped the iterative scheme in some regimes where it can overflow or try to take a log after underflow. (In those regimes the asymptotic expansions are accurate enough.) The systematic tests now pass on the entire complex plane even with nan_ok=False; I added tests near the branch cuts, which also look good.

Think I'm done fiddling with things.

@person142 removed the needs-work label on Oct 6, 2016


int
sc_signbit(double x)
Member

Could we instead use the math.h + npy_math.h solution?
The portability issue in gh-5689 apparently concerned only #include <cmath> in combination with npy_math.h.
Using #include <math.h> should work.

Member

Probably not, because #include <complex> includes <cmath>.
It's really a PITA that we can't use C99 or C++11 --- the former won't work on MSVC, and the latter apparently requires passing compiler-dependent flags in.

Member (Author)

Yeah I used npy_signbit at first but then g++ undef'd it on Travis. :-(

Member

Hmm, what do you mean by g++ undef'd npy_signbit?

Member (Author)

I didn't look into it too carefully, but I assumed it was this issue: #5689 (comment). (The CI build complained about not being able to find signbit.)

Member (Author)

So my understanding (maybe wrong) was: the CI machine has C99, so it hits

#define npy_signbit(x) signbit((x))

in npy_math.h. Then we hit #undef signbit in cmath. But then there's no C++11, so we end up having no signbit. Maybe this could also be fixed by shuffling the includes, though that sounds kind of brittle.
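If that reading is right, a reduced reproduction would look something like this (hypothetical; the real include chain goes through more headers, and the point is precisely that it fails to compile):

// Hypothetical reduction of the failure mode. With C99 available,
// npy_math.h defines npy_signbit as a macro expanding to signbit, and
// <cmath> then does #undef signbit. On a pre-C++11 toolchain nothing
// re-declares signbit, so the expansion below fails to compile.
#include <numpy/npy_math.h>  // #define npy_signbit(x) signbit((x))
#include <cmath>             // #undef signbit

int f(double x)
{
    return npy_signbit(x);  // expands to signbit(x): undeclared identifier
}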

@person142 (Member, Author)

I didn't need to be using signbit since the code didn't need to detect the sign of 0 or NaN. Wasn't thinking; it's gone now.

@pv (Member) left a comment

Some questions on _round.h + unnecessary assignments.

Note also a merge conflict.


double add_round_up(double a, double b)
{
double s, e, v;
@pv (Member), Oct 8, 2016

May need to be volatile double --- is the compiler allowed to keep stuff in fp extended precision registers, so you don't get the truncation as expected? Does this break with -O3 etc? How do you test that this works?

Member (Author)

This would very much break with -O3, since (IIRC) -O3 enables unsafe floating-point optimizations. The tests I was using to check that it works are mentioned here: #6653 (comment); it would be possible to put them in SciPy itself if we only run them when we detect that fenv is present. Shall I?

@pv (Member), Oct 8, 2016

It may also break without unsafe optimizations on x87, and I'm not sure the C (or C99) standard specifies the precision used for intermediate results. OTOH, storing relevant intermediate results in volatile double variables could force appropriate truncation.

Member (Author)

Yes, will do that. From the gcc wiki it looks like it could also break on older x86 processors without SSE2, since they don't guarantee correctly rounded arithmetic. Loath as I am to do it, it might be safer to rewrite the routines to do the bit twiddling ourselves, so that we avoid the floating-point issues entirely.
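For illustration, the bit-twiddling version of "step to the next double up" might look like the following sketch (next_up is a hypothetical helper, not code from this PR; NaN and infinity handling are omitted):

// Sketch of the bit-twiddling approach: step a double to the next
// representable value toward +infinity without relying on FP rounding.
// NaN and infinity handling are omitted for brevity.
#include <cstdint>
#include <cstring>

double next_up(double x)
{
    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);         // reinterpret the bits safely
    if (bits == UINT64_C(0x8000000000000000)) {  // -0.0 -> smallest positive
        bits = 1;
    } else if (bits >> 63) {                     // negative: move toward zero
        bits -= 1;
    } else {                                     // positive or +0.0: move away
        bits += 1;
    }
    std::memcpy(&x, &bits, sizeof x);
    return x;
}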

Member

C99 standard states: "The values of operations with floating operands and values subject to the usual arithmetic conversions and of floating constants are evaluated to a format whose range and precision may be greater than required by the type. The use of evaluation formats is characterized by the implementation-defined value of FLT_EVAL_METHOD" --- which probably means the code here is not portable...
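(For what it's worth, a quick way to see what a given compiler and platform do is to inspect that macro:)

// Quick probe: FLT_EVAL_METHOD reports how floating intermediates are
// evaluated. 0 = to the operand type, 1 = to double, 2 = to long double
// (typical of x87), -1 = indeterminate.
#include <cfloat>
#include <cstdio>

int main()
{
    std::printf("FLT_EVAL_METHOD = %d\n", FLT_EVAL_METHOD);
    return 0;
}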

Member

If you need just controlled truncation, I think assignment to a volatile double variable is OK.

sf_error("wrightomega", SF_ERROR_UNDERFLOW, "underflow in exponential series");
/* Skip the iterative scheme because it computes log(*w) */
e = NULL;
r = NULL;
Member

I think these assignments don't do anything?

@person142 (Member, Author), Oct 8, 2016

Since we've already modified the source a fair bit anyway, should we just remove the wrightomega_ext function and work only with wrightomega? (i.e. don't return e and r.)

/* */
/* r -- double complex* */
/* Pointer to penultimate residual r_k = z - w_k - log(w_k) */
/* NULL if the iterative scheme is not used. */
Member

Cannot return NULL since it's not a pointer-to-pointer?

/* Series is accurate and the iterative scheme could overflow */
{
e = NULL;
r = NULL;
Member

Unnecessary assignments.

/* Series is accurate and the iterative scheme could overflow */
{
e = NULL;
r = NULL;
Member

As above.

/* Series is accurate and the iterative scheme could overflow */
{
e = NULL;
r = NULL;
Member

As above.

{
pz=log(z);
*w = z - pz;
fac = pz/z;
Member

I'm assuming these branches were tested.

Member (Author)

They should be covered by the TestSystematic tests: if you go back to 347d7d3 but flip the nan_ok flag in TestSystematic.test_wrightomega to False, you'll get around 1,200 failures, which these changes are trying to address.

@person142 force-pushed the wrightomega branch 2 times, most recently from f7a72ff to ecd1bba on October 9, 2016
@person142 (Member, Author)

Added tests to SciPy for add_round_up and add_round_down (and marked everything as volatile); they pass locally and fail on Travis, so the answer to "is it portable" seems to be "definitely not". Back to square one...

@pv (Member) commented Oct 9, 2016

Maybe you can try removing the first assert_ from the test and adding more precision ({:.21g}), so that it shows which points failed. I.e., is it even clear that fesetround/fegetround work with the Travis gcc?

@person142 (Member, Author)

Indeed, it might be an issue with fenv on the Travis machines. For example, one failure is:

a = -5.41562869078764134668e+306
b = -9.46574644932701959158e+306
a + b in round up mode = -1.48813751401146621857e+307
a + b with add_round_up = -1.48813751401146596909e+307

But locally I get:

a + b in round up mode = -1.48813751401146596909e+307

Also tested on Ubuntu 16.04 (gcc 5.3.1, glibc 2.23), and the tests passed there. (Other local tests were on OS X 10.11.6.)

- Add more comprehensive `fenv` check
- Try to sample a wider range of double precision numbers
- Improve NaN handling: in particular, the regions only lightly covered by the `TestSystematic` tests are more carefully covered
@person142 added the needs-work label on Oct 11, 2016
@person142 removed the needs-work label on Oct 11, 2016
@person142 (Member, Author)

Now:

  • The Travis build is succeeding because the add_round_* tests are being skipped: I modified have_fenv (which controls whether the tests run) to also check whether fesetround works (before, it only tested fegetround), and apparently it does not, i.e. it returns an error status when you try to change the rounding mode. (See the sketch of such a probe after this list.)
  • After fixing a bug I introduced when modifying the asymptotic series, the relative error on the tests is as good as the original (and better in the asymptotic regime, where there were overflows). Those interested in checking this can look at: https://github.com/person142/scipy/tree/wrightomega-original-code. (The only change in that branch is the port to C++; i.e. it still uses fenv and there are no modifications to the asymptotic regime.)
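A sketch of a runtime probe along those lines (an illustration, not the exact have_fenv code):

// Sketch of a runtime fenv probe in the spirit of the have_fenv check
// described above (not the exact SciPy code).
#include <cfenv>

int have_working_fenv()
{
    int old_mode = std::fegetround();
    if (old_mode < 0) {
        return 0;                   // fegetround itself failed
    }
    if (std::fesetround(FE_UPWARD) != 0) {
        return 0;                   // cannot change the rounding mode
    }
    std::fesetround(old_mode);      // restore the original mode
    return 1;
}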

@pv (Member) commented Oct 12, 2016

I get deprecation warnings from test_round:

/home/pauli/prj/scipy/scipy/scipy/special/tests/test_round.py:12: DeprecationWarning: This function is deprecated. Please call randint(-9223372036854775808, 9223372036854775807 + 1) instead
  _test_round.test_add_round(10**5, 'up')
/home/pauli/prj/scipy/scipy/scipy/special/tests/test_round.py:12: DeprecationWarning: This function is deprecated. Please call randint(-9223372036854775808, 9223372036854775807 + 1) instead
  _test_round.test_add_round(10**5, 'up')
/home/pauli/prj/scipy/scipy/scipy/special/tests/test_round.py:18: DeprecationWarning: This function is deprecated. Please call randint(-9223372036854775808, 9223372036854775807 + 1) instead
  _test_round.test_add_round(10**5, 'down')
/home/pauli/prj/scipy/scipy/scipy/special/tests/test_round.py:18: DeprecationWarning: This function is deprecated. Please call randint(-9223372036854775808, 9223372036854775807 + 1) instead
  _test_round.test_add_round(10**5, 'down')

Otherwise LGTM.

The corner-case branches before /* Choose approximation based on region */ in wright.cc are not tested (rm -rf build; python runtests.py -g -s special -m full --gcov; python runtests.py --lcov-html; firefox build/lcov/index.html), but that probably doesn't block the merge.

@person142 (Member, Author) commented Oct 13, 2016

Hopefully this fixes the above issues. Indeed, the corner cases were not handling -0.0 correctly. (Warning: I did not run gcov myself; the llvm builds appear to no longer be compatible, and the gcc build keeps failing.)

Edit: gcov issues sorted out; it looks like all lines are hit now except for the condition number computations (which aren't used in any way in the public special.wrightomega function).

@pv merged commit ef3d6b6 into scipy:master on Oct 13, 2016
@pv (Member) commented Oct 13, 2016

Thanks, in it goes.

@pierslawrence

Hello,

I really appreciate all of the effort by everyone to integrate this function into scipy. I do have one question about the usual practice regarding real/complex functions within this module.

For real-valued z, wrightomega(z) is also real (the same can be said for lambertw(z, k=0) on (-1/e, inf) and lambertw(z, k=-1) on (-1/e, 0)).

What is the normal practice for dealing with these cases? Should we expect to get back a real value, or is it usual to return a complex type with Im(w) = 0?

Kind regards,

Piers

@person142 (Member, Author)

> Should we expect to get back a real value, or is it usual to return a complex type with Im(w) = 0?

Ideally get back a real value, but we haven't always been very consistent about that. See e.g. #7983 where we fixed this issue for loggamma.

@pierslawrence

From #7983 I did not quite see the fix. Is this a multiple-dispatch solution?

The current logic of the original algorithm, applied to the real case, would restrict the solution to three regions: (-inf, -2], (-2, 1], and (1, inf). The rest of the algorithm would then be identical.

If I remember correctly, the series about infinity might actually be accurate enough in the region (-2, 1], as I think the main need for the "mushroom"-shaped region was to get a good approximation near the starts of the branch cuts. (Although this would need testing.)

Kind regards,

Piers

@pierslawrence

Sorry it took me so long to follow up on this real extension to the Wright omega function.

As mentioned in the first post of this thread, we really do need to avoid the overflow introduced by the z*z*z term in the series about infinity. As it turns out, we actually only need the first two terms (x - log(x) + log(x)/x) of the series about infinity on the real axis, and then we can avoid the overflow problem. The same could probably be done for the complex code, but the accuracy would need testing.

Similarly, we need only the first term (exp(x)) of the series about negative infinity, and it is good enough on (-inf, -2] with two iterations of the Fritsch, Shafer, Crowley (FSC) iteration.

For the remaining region (-2, 1] we only need a low-accuracy approximation, and I suggest exp(2/3*(x-1)), which is continuous with the other series (though not in its derivatives, obviously) and is good enough to obtain double precision accuracy in two FSC iterations.
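A sketch of that dispatch, for concreteness (my illustration, not the attached wright_real_dispatch.txt; fsc_step implements one common form of the FSC iteration):

// Sketch of the real-axis dispatch described above; an illustration, not
// the attached wright_real_dispatch.txt.
#include <cmath>

// One Fritsch-Shafer-Crowley step for solving w + log(w) = x.
static double fsc_step(double x, double w)
{
    double r = x - w - std::log(w);  // residual
    double wp1 = w + 1.0;
    double q = 2.0 * wp1 * (wp1 + 2.0 * r / 3.0);
    return w * (1.0 + (r / wp1) * (q - r) / (q - 2.0 * r));
}

double wrightomega_real(double x)
{
    // Underflow/overflow edge cases are not handled in this sketch.
    double w;
    if (x > 1.0) {
        w = x - std::log(x) + std::log(x) / x;  // two terms about +infinity
    } else if (x <= -2.0) {
        w = std::exp(x);                        // first term about -infinity
    } else {
        w = std::exp(2.0 / 3.0 * (x - 1.0));    // low-accuracy seed on (-2, 1]
    }
    // Two FSC iterations suffice for double precision, per the above.
    w = fsc_step(x, w);
    w = fsc_step(x, w);
    return w;
}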

I attach my implementation of the real Wright omega function; if someone with more knowledge of integrating code into scipy can work it in, I would appreciate it (especially since the complex implementation seemed to have many issues with the build system).

Regards,

Piers Lawrence

wright_real_dispatch.txt

@person142 (Member, Author)

Hey @pierslawrence, could you open a new issue for the real implementation? It will have a lot more visibility that way, and it could be a good first project for someone looking to get familiar with the special internals.
