Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: implemented Owen's T function #7120

Merged
merged 1 commit into from
Feb 24, 2018
Merged

ENH: implemented Owen's T function #7120

merged 1 commit into from
Feb 24, 2018

Conversation

evgenyzhurko
Copy link
Contributor

@evgenyzhurko evgenyzhurko commented Mar 2, 2017

@@ -32,7 +32,7 @@
'keip_zeros', 'kelvin_zeros', 'ker_zeros', 'kerp_zeros', 'kv',
'kvp', 'lmbda', 'lpmn', 'lpn', 'lqmn', 'lqn', 'mathieu_a',
'mathieu_b', 'mathieu_even_coef', 'mathieu_odd_coef', 'ndtri',
'obl_cv_seq', 'pbdn_seq', 'pbdv_seq', 'pbvv_seq', 'perm',
'obl_cv_seq', 'owens_t','pbdn_seq', 'pbdv_seq', 'pbvv_seq', 'perm',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pep8, space after comma


Parameters
----------
h: int or float
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

int or float --> scalar

(same 2 lines lower)

----------
.. [1] Mike Patefield and David Tandy. "Fast and accurate calculation
of Owen's T-function", 2000.
https://www.jstatsoft.org/article/view/v005i05/t.pdf
Copy link
Member

@rgommers rgommers Mar 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No link please, but a complete citation:

M. Patefield and D. Tandy, "Fast and accurate calculation of Owen’st function",
Statistical Software vol. 5, pp. 1-25, 2000.

elif a > 1:
ncdf_h = ndtr(h)
ncdf_ah = ndtr(a * h)
return 0.5 * (ncdf_h + ncdf_ah) - ncdf_h * ncdf_ah -\
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you get rid of the \ (replace by brackets)

owens_t(a * h, 1 / a)

if a == 0:
return 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blank line here

if a == 0:
return 0
if h == 0:
return 1 * np.arctan(a) / (2 * np.pi)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blank line here

a = 1.5
assert_almost_equal(special.owens_t(h, a), special.owens_t(h, np.inf))

delta = 0.5 * (cephes.ndtr(h) + cephes.ndtr(a * h)) -\
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, remove \

result += ((-1) ** (i - 1)) * z_curr
z_prev = z_curr
z_curr = ((2 * i - 1) * z_prev - pi_sq *
np.power(a, 2 * i - 1) * np.exp(ah_sq)) / (h * h)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blank line here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and style nitpick: lines like these should line up with the opening bracket above, not use 4-space indents

return 0
if h == 0:
return 1 * np.arctan(a) / (2 * np.pi)
if a >= 0.99:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's no test or comment on the PR for the accuracy when approaching 0.99 from either side. did you verify this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and same question around 1.0 (other if clause above)

.. [1] Mike Patefield and David Tandy. "Fast and accurate calculation
of Owen's T-function", 2000.
https://www.jstatsoft.org/article/view/v005i05/t.pdf

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe add an example which plots the function?


"""
if not (np.isscalar(h) and np.isscalar(a)):
raise ValueError("arguments must be scalars.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blank lines after an if-block ends (here and below)

@rgommers
Copy link
Member

rgommers commented Mar 2, 2017

Thanks @evgenyzhurko. Would be good to mention that you want to add this function on the scipy-dev mailing list.

@rgommers rgommers added enhancement A new feature or improvement scipy.special labels Mar 2, 2017
@pv
Copy link
Member

pv commented Mar 2, 2017 via email

@evgenyzhurko
Copy link
Contributor Author

@rgommers I made a mistake wheni use if a > 0.99 because this property can be used only for a == 1
Main part of algorithm can evaluate T(h, a) when 0 <= a <= 1 and h >= 0 and for this we use property: T(h, a) = T(-h, a), T(h, -a) = -T(h, a), T(h, a) + T(h, 1/a) = 0.5 (F(h) + F(ah)) - F(h) * F(ah).

@pv I'll try to do it, but i have never use Cython before.

@ev-br
Copy link
Member

ev-br commented Mar 2, 2017

@evgenyzhurko it's likely easiest to start from IPython's magic %%cython -a```, then move the code over to the scipy/special/generate_ufuncs.py` framework.

@ewmoore
Copy link
Member

ewmoore commented Mar 2, 2017

You should import the tests for this function from Boost as well. They have a table of values in https://github.com/boostorg/math/blob/develop/test/owens_t.ipp that can be added to the similar tests as well as a bunch of other tests that can be ported over. No reason not to make use of work already done to construct some good tests.

@person142
Copy link
Member

You reference M. Patefield and D. Tandy, but it seems that the algorithm used in the paper is much more complex. Is that true, and if so, how did you achieve the simplifications?

@pv
Copy link
Member

pv commented Mar 2, 2017 via email

@person142
Copy link
Member

Just to second @pv, to make this production-ready it should be rewritten in Cython (or C, or C++, or Fortran) and hooked up to the ufuncs machinery.

Copy link
Member

@person142 person142 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good--the build is failing because you need to add a line to test_cython_special.py.

@@ -6369,3 +6369,8 @@ def add_newdoc(place, name, doc):
"""
Internal function, do not use.
""")

add_newdoc("scipy.special", "owens_t",
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of putting the documentation in owens_t.c put it here.

0.60419009528470238773E-02, 0.38862217010742057883E-02,
0.16793031084546090448E-02};

static double PI = 3.1415926535897932384626433832795;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use NPY_PI.

return SELECT_METHOD[iaint* 15 + ihint];
}

int get_ord(int index) {
Copy link
Member

@person142 person142 Mar 7, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this helper function is necessary.

}

double owens_t(double h, double a) {
if (cephes_isnan(h) || cephes_isnan(a))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Braces.

}

double owensT3(double h, double a, double m) {
double a_sq = a * a;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you name them something like a2 or aa?

double yi = 1;
double result = 0;

while (1) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not for (i = 1; i < maxi; i += 2)?

int i = 0;

for (i = 0; i < 14; i++) {
if (h <= HRANGE[i]) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spacing isn't consistent with the rest of the code here.

@@ -3416,5 +3416,412 @@ def xfunc(delta, r):
assert_func_equal(special.pseudo_huber, w, z, rtol=1e-13, atol=1e-13)


owenst_dt = [[(1.9508080482482910156250000000000000000000000000000000000000000000000000000000000000000000000000000000e+00), (9.7540378570556640625000000000000000000000000000000000000000000000000000000000000000000000000000000000e-02), (2.2942365980155351117754771526378292708740231264960539316842063696200884829628237039895484898529192823e-03)],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is the test data from boost, it should be put in a file under data/boost/owenst_ipp and added to the TESTS list in test_data.py:test_boost. (Also if it's some other test data, long data set like this should be put in a data file.)

After adding the text data, the npz data file needs to be regenerated using utils/makenpz.py


for (i = 0; i < 7; i++) {
if (h <= HRANGE[i]) {
ihint = i;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

iahint?

@evgenyzhurko
Copy link
Contributor Author

I add (special.owens_t, cython_special.owens_t, ('dd',), None) to test_cython_special.py but it's not solve the previous fail. I try to run test using command python runtest.py -v and tests stoped without error.
End of logs:
test_cython_special.test_cython_api((<ufunc 'zetac'>, <built-in function zetac>, ('d',), None),) ... ok
test_cython_special.test_cython_api((<ufunc 'owens_t'>, <built-in function owens_t>, ('dd',), None),) ...

@person142
Copy link
Member

Now there are different failures--looks like the docstring has some issues and at least one of the builds is getting a segfault. You can go look at the build logs for details.

@evgenyzhurko
Copy link
Contributor Author

How i can correct issue with docstring?
Failed example:
special.owens_t(h, a)
Expected:
0.10877216734852269
Got:
0.10877216734852269

@ev-br
Copy link
Member

ev-br commented Mar 7, 2017

Expected:
0.10877216734852269
Got:
0.10877216734852269

Something about locale / non-ascii symbols?

@evgenyzhurko
Copy link
Contributor Author

There is no message about locale or ascii. I think it's a result of others error or segmentation foult, because in previous commit all is good with docstring.

@person142
Copy link
Member

in previous commit all is good with docstring

It wasn't really checked in the previous commit because most of the documentation was in the C source; the refguide checker only looks at the Python docstrings.

@person142
Copy link
Member

Unless the Boost function is actually correctly rounded on all that test data, which would be impressive.

@evgenyzhurko
Copy link
Contributor Author

Relative error for all test cases = from 1e-17 to 7e-18

@evgenyzhurko
Copy link
Contributor Author

When i wrote aboute absolute error, i made a mistake, 1.7194768829262296883086946763796731829643249511719e-19 is maximal absolute error, minimal absolute error is about 7e-106 for data ~1e-89

@evgenyzhurko
Copy link
Contributor Author

Relative tolerance for this data is from 1e-17 to 7e-19

@person142
Copy link
Member

Is Boost using extended precision internally for the double precision routine? That would explain this. (Would just check myself but am away from computer.)

@evgenyzhurko
Copy link
Contributor Author

evgenyzhurko commented Mar 18, 2017

I tested owens_t (from this pull request) with different tolerance and i got error only for 2 test case(rtol=5e-14), rdiff in this 2 test case ~1e-8, others test case have rdiff about 5e-14 and less. 2 test case with problem use T6 algorithm for computing.

If I use T5 algorithm for 2 problem test case, tests will pass with rtol = 5e-14.

@evgenyzhurko
Copy link
Contributor Author

I found a bug in my code. Current owens_t function passes tests with rtol=5e-14

@person142
Copy link
Member

Very nice!

@evgenyzhurko
Copy link
Contributor Author

@person142 Do you expect more accuracy in computing for this pull request or some other changes?

@person142
Copy link
Member

person142 commented Mar 20, 2017

5e-14 is certainly sufficient. I still need to review the mathematics in detail, but that's on me.

Probably the most important thing now is using gcov to make sure all of the different code branches are getting tested and adding additional tests if some of them aren't. (Since the selection routine for the function is so complicated it will be difficult to feel confident that all the cases are being tested otherwise.)

@evgenyzhurko
Copy link
Contributor Author

Thanks, i'll try to generate test case for all code branches.

@evgenyzhurko
Copy link
Contributor Author

Boost test cases covers T3, T5, T6, T4_mp algorithms.
I added test cases for T1, T2, T1 accelerated and T2 accelerated, but i didn't find test case for T4.
I found the whole range of values a and h for T4 method and tested what is method used to compute for this value, as a result all this values use T4_mp algorithm. I used step 0.0001 and can't more(it takes very long time).
I have no test case fot T4 algorithm, but i think it's not necessary delete T4 algorithm, because T4_mp isn't stable in some cases(but i didn't find this cases for T4 algorithm).

@pv pv removed the needs-work Items that are pending response from the author label Mar 25, 2017
@evgenyzhurko
Copy link
Contributor Author

evgenyzhurko commented Apr 9, 2017

@person142 gcov shows 92.5% coverage for lines and 93.3% for functions. As I said, only OwensT4 function not coveraged yet, because OwenstT4_mp covers range of OwensT4, but OwenstT4_mp might be unstable in some case.
image
image

@ev-br
Copy link
Member

ev-br commented Jul 14, 2017

Josh, @person142, any chance you can take a look?

@person142
Copy link
Member

Yes, sorry @evgenyzhurko for sitting on this for so long... will try to take a final look; probably won't happen until next week though.

@rgommers rgommers added this to the 1.1.0 milestone Sep 16, 2017
@ghost
Copy link

ghost commented Nov 6, 2017

@evgenyzhurko Can you rebase on master?

@evgenyzhurko
Copy link
Contributor Author

I rebased this PR on master, fixed all build problem and merge conflicts. Is anyone can review this PR?

@person142
Copy link
Member

@evgenyzhurko I rebased again (recent changes in cephes conflicted) and made some fixes. Still need to investigate the test coverage a little more.

Probably most importantly: from reading the code it looks like the Boost implementation was followed closely, so I added the Boost license information to the top of the file. Am I correct?

@evgenyzhurko
Copy link
Contributor Author

@person142 You are right, in many case i follow to boost implementation.

I was trying to find values a and h that can be calculated using only T4(that not covered yet) function but it's no use. I calculated that for 0.06 < a < 0.25 and 0.6 < h < 3.5 should be used T4 function. I iterated through this interval with step=0.00001 and T4 wasn't used.

@person142
Copy link
Member

I was trying to find values a and h that can be calculated using only T4(that not covered yet) function but it's no use

It appears you were using Boost's arbitrary precision dispatch function instead of the double precision version. I switched to using the double precision version (which covers T4), fixed a bug this revealed in T1, and then removed the no-longer-needed arbitrary precision helper functions.

@evgenyzhurko
Copy link
Contributor Author

@person142 Thanks for your help and my mistakes explanation. Is something else need from me in this PR?

@person142
Copy link
Member

Is something else need from me in this PR?

I made a few more fixes, and I think this is good to go now.

We will see what the CIs say, and we should probably ping @pv to do a sanity check since I feel a little too involved in the code editing process to be an objective third party. Pauli, I'm seeing 100% test coverage on owens_t.c with the exception of the default case on line 308, which should in theory never be reached (it's just a sentinel).

@person142
Copy link
Member

Ah shoot, looks like cephes/protos.h and some submodule updates got picked up in the rebase--will have to fix that.

@person142
Copy link
Member

Ok, cleaned up the history. Now there's one clean commit.

@person142 person142 merged commit 895a774 into scipy:master Feb 24, 2018
@person142
Copy link
Member

Ok, got this in at long last. Thanks for your patience @evgenyzhurko.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement A new feature or improvement scipy.special
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants