ENH: implemented Owen's T function #7120

evgenyzhurko · 2017-03-02T08:00:58Z

rgommers · 2017-03-02T08:52:30Z

scipy/special/basic.py

@@ -32,7 +32,7 @@
           'keip_zeros', 'kelvin_zeros', 'ker_zeros', 'kerp_zeros', 'kv',
           'kvp', 'lmbda', 'lpmn', 'lpn', 'lqmn', 'lqn', 'mathieu_a',
           'mathieu_b', 'mathieu_even_coef', 'mathieu_odd_coef', 'ndtri',
-           'obl_cv_seq', 'pbdn_seq', 'pbdv_seq', 'pbvv_seq', 'perm',
+           'obl_cv_seq', 'owens_t','pbdn_seq', 'pbdv_seq', 'pbvv_seq', 'perm',


pep8, space after comma

rgommers · 2017-03-02T08:53:09Z

scipy/special/basic.py

+
+    Parameters
+    ----------
+    h: int or float


int or float --> scalar

(same 2 lines lower)

rgommers · 2017-03-02T09:01:50Z

scipy/special/basic.py

+    ----------
+    .. [1] Mike Patefield and David Tandy. "Fast and accurate calculation
+           of Owen's T-function", 2000.
+           https://www.jstatsoft.org/article/view/v005i05/t.pdf


No link please, but a complete citation:

M. Patefield and D. Tandy, "Fast and accurate calculation of Owen’st function", Statistical Software vol. 5, pp. 1-25, 2000.

rgommers · 2017-03-02T09:02:32Z

scipy/special/basic.py

+    elif a > 1:
+        ncdf_h = ndtr(h)
+        ncdf_ah = ndtr(a * h)
+        return 0.5 * (ncdf_h + ncdf_ah) - ncdf_h * ncdf_ah -\


can you get rid of the \ (replace by brackets)

rgommers · 2017-03-02T09:02:53Z

scipy/special/basic.py

+            owens_t(a * h, 1 / a)
+
+    if a == 0:
+        return 0


blank line here

rgommers · 2017-03-02T09:02:58Z

scipy/special/basic.py

+    if a == 0:
+        return 0
+    if h == 0:
+        return 1 * np.arctan(a) / (2 * np.pi)


blank line here

rgommers · 2017-03-02T09:03:29Z

scipy/special/tests/test_basic.py

+    a = 1.5
+    assert_almost_equal(special.owens_t(h, a), special.owens_t(h, np.inf))
+
+    delta = 0.5 * (cephes.ndtr(h) + cephes.ndtr(a * h)) -\


same here, remove \

rgommers · 2017-03-02T09:03:45Z

scipy/special/basic.py

+        result += ((-1) ** (i - 1)) * z_curr
+        z_prev = z_curr
+        z_curr = ((2 * i - 1) * z_prev - pi_sq *
+            np.power(a, 2 * i - 1) * np.exp(ah_sq)) / (h * h)


blank line here

and style nitpick: lines like these should line up with the opening bracket above, not use 4-space indents

rgommers · 2017-03-02T09:05:22Z

scipy/special/basic.py

+        return 0
+    if h == 0:
+        return 1 * np.arctan(a) / (2 * np.pi)
+    if a >= 0.99:


there's no test or comment on the PR for the accuracy when approaching 0.99 from either side. did you verify this?

and same question around 1.0 (other if clause above)

rgommers · 2017-03-02T09:08:44Z

scipy/special/basic.py

+    .. [1] Mike Patefield and David Tandy. "Fast and accurate calculation
+           of Owen's T-function", 2000.
+           https://www.jstatsoft.org/article/view/v005i05/t.pdf
+


maybe add an example which plots the function?

rgommers · 2017-03-02T09:09:12Z

scipy/special/basic.py

+
+    """
+    if not (np.isscalar(h) and np.isscalar(a)):
+        raise ValueError("arguments must be scalars.")


blank lines after an if-block ends (here and below)

rgommers · 2017-03-02T09:10:13Z

Thanks @evgenyzhurko. Would be good to mention that you want to add this function on the scipy-dev mailing list.

pv · 2017-03-02T09:23:50Z

Would make sense to implement this in Cython, to get a proper ufunc without restriction to scalar arguments.

evgenyzhurko · 2017-03-02T11:51:08Z

@rgommers I made a mistake wheni use if a > 0.99 because this property can be used only for a == 1
Main part of algorithm can evaluate T(h, a) when 0 <= a <= 1 and h >= 0 and for this we use property: T(h, a) = T(-h, a), T(h, -a) = -T(h, a), T(h, a) + T(h, 1/a) = 0.5 (F(h) + F(ah)) - F(h) * F(ah).

@pv I'll try to do it, but i have never use Cython before.

ev-br · 2017-03-02T12:36:54Z

@evgenyzhurko it's likely easiest to start from IPython's magic %%cython -a```, then move the code over to the scipy/special/generate_ufuncs.py` framework.

ewmoore · 2017-03-02T13:00:10Z

You should import the tests for this function from Boost as well. They have a table of values in https://github.com/boostorg/math/blob/develop/test/owens_t.ipp that can be added to the similar tests as well as a bunch of other tests that can be ported over. No reason not to make use of work already done to construct some good tests.

person142 · 2017-03-02T14:42:01Z

You reference M. Patefield and D. Tandy, but it seems that the algorithm used in the paper is much more complex. Is that true, and if so, how did you achieve the simplifications?

pv · 2017-03-02T15:20:06Z

It's the method T2 from Patefield&Tandy. Probably covers only part of the parameter region, and if so more work is necessary here.

person142 · 2017-03-03T23:07:37Z

Just to second @pv, to make this production-ready it should be rewritten in Cython (or C, or C++, or Fortran) and hooked up to the ufuncs machinery.

person142

Looking good--the build is failing because you need to add a line to test_cython_special.py.

person142 · 2017-03-07T02:04:22Z

scipy/special/add_newdocs.py

@@ -6369,3 +6369,8 @@ def add_newdoc(place, name, doc):
    """
    Internal function, do not use.
    """)
+
+add_newdoc("scipy.special", "owens_t",
+    """


Instead of putting the documentation in owens_t.c put it here.

person142 · 2017-03-07T02:05:17Z

scipy/special/cephes/owens_t.c

+    0.60419009528470238773E-02, 0.38862217010742057883E-02,
+    0.16793031084546090448E-02};
+
+static double PI = 3.1415926535897932384626433832795;


Use NPY_PI.

person142 · 2017-03-07T02:10:08Z

scipy/special/cephes/owens_t.c

+	return SELECT_METHOD[iaint* 15 + ihint];
+}
+
+int get_ord(int index) {


I don't think this helper function is necessary.

person142 · 2017-03-07T02:11:52Z

scipy/special/cephes/owens_t.c

+}
+
+double owens_t(double h, double a) {
+	if (cephes_isnan(h) || cephes_isnan(a))


person142 · 2017-03-07T02:15:41Z

scipy/special/cephes/owens_t.c

+}
+
+double owensT3(double h, double a, double m) {
+    double a_sq = a * a;


Can you name them something like a2 or aa?

person142 · 2017-03-07T02:18:11Z

scipy/special/cephes/owens_t.c

+    double yi = 1;
+    double result = 0;
+
+    while (1) {


Why not for (i = 1; i < maxi; i += 2)?

person142 · 2017-03-07T02:19:12Z

scipy/special/cephes/owens_t.c

+	int i = 0;
+
+	for (i = 0; i < 14; i++) {
+		if (h <= HRANGE[i]) {


Spacing isn't consistent with the rest of the code here.

pv · 2017-03-07T08:23:00Z

scipy/special/tests/test_basic.py

@@ -3416,5 +3416,412 @@ def xfunc(delta, r):
    assert_func_equal(special.pseudo_huber, w, z, rtol=1e-13, atol=1e-13)


+owenst_dt = [[(1.9508080482482910156250000000000000000000000000000000000000000000000000000000000000000000000000000000e+00), (9.7540378570556640625000000000000000000000000000000000000000000000000000000000000000000000000000000000e-02), (2.2942365980155351117754771526378292708740231264960539316842063696200884829628237039895484898529192823e-03)],


If this is the test data from boost, it should be put in a file under data/boost/owenst_ipp and added to the TESTS list in test_data.py:test_boost. (Also if it's some other test data, long data set like this should be put in a data file.)

After adding the text data, the npz data file needs to be regenerated using utils/makenpz.py

pv · 2017-03-07T08:24:03Z

scipy/special/cephes/owens_t.c

+
+	for (i = 0; i < 7; i++) {
+		if (h <= HRANGE[i]) {
+			ihint = i;


evgenyzhurko · 2017-03-07T21:10:53Z

I add (special.owens_t, cython_special.owens_t, ('dd',), None) to test_cython_special.py but it's not solve the previous fail. I try to run test using command python runtest.py -v and tests stoped without error.
End of logs:
test_cython_special.test_cython_api((<ufunc 'zetac'>, <built-in function zetac>, ('d',), None),) ... ok
test_cython_special.test_cython_api((<ufunc 'owens_t'>, <built-in function owens_t>, ('dd',), None),) ...

person142 · 2017-03-07T21:15:44Z

Now there are different failures--looks like the docstring has some issues and at least one of the builds is getting a segfault. You can go look at the build logs for details.

evgenyzhurko · 2017-03-07T21:19:32Z

How i can correct issue with docstring?
Failed example:
special.owens_t(h, a)
Expected:
0.10877216734852269
Got:
0.10877216734852269

ev-br · 2017-03-07T21:33:14Z

Expected:
0.10877216734852269
Got:
0.10877216734852269

Something about locale / non-ascii symbols?

evgenyzhurko · 2017-03-07T21:43:47Z

There is no message about locale or ascii. I think it's a result of others error or segmentation foult, because in previous commit all is good with docstring.

person142 · 2017-03-07T21:48:52Z

in previous commit all is good with docstring

It wasn't really checked in the previous commit because most of the documentation was in the C source; the refguide checker only looks at the Python docstrings.

person142 · 2017-03-18T17:22:46Z

Unless the Boost function is actually correctly rounded on all that test data, which would be impressive.

evgenyzhurko · 2017-03-18T17:23:33Z

Relative error for all test cases = from 1e-17 to 7e-18

evgenyzhurko · 2017-03-18T17:32:14Z

When i wrote aboute absolute error, i made a mistake, 1.7194768829262296883086946763796731829643249511719e-19 is maximal absolute error, minimal absolute error is about 7e-106 for data ~1e-89

evgenyzhurko · 2017-03-18T17:37:12Z

Relative tolerance for this data is from 1e-17 to 7e-19

person142 · 2017-03-18T17:44:23Z

Is Boost using extended precision internally for the double precision routine? That would explain this. (Would just check myself but am away from computer.)

evgenyzhurko · 2017-03-18T19:13:36Z

I tested owens_t (from this pull request) with different tolerance and i got error only for 2 test case(rtol=5e-14), rdiff in this 2 test case ~1e-8, others test case have rdiff about 5e-14 and less. 2 test case with problem use T6 algorithm for computing.

If I use T5 algorithm for 2 problem test case, tests will pass with rtol = 5e-14.

evgenyzhurko · 2017-03-18T19:48:18Z

I found a bug in my code. Current owens_t function passes tests with rtol=5e-14

person142 · 2017-03-18T23:43:32Z

Very nice!

evgenyzhurko · 2017-03-20T19:47:19Z

@person142 Do you expect more accuracy in computing for this pull request or some other changes?

person142 · 2017-03-20T19:59:24Z

5e-14 is certainly sufficient. I still need to review the mathematics in detail, but that's on me.

Probably the most important thing now is using gcov to make sure all of the different code branches are getting tested and adding additional tests if some of them aren't. (Since the selection routine for the function is so complicated it will be difficult to feel confident that all the cases are being tested otherwise.)

evgenyzhurko · 2017-03-20T20:06:35Z

Thanks, i'll try to generate test case for all code branches.

evgenyzhurko · 2017-03-21T21:07:04Z

Boost test cases covers T3, T5, T6, T4_mp algorithms.
I added test cases for T1, T2, T1 accelerated and T2 accelerated, but i didn't find test case for T4.
I found the whole range of values a and h for T4 method and tested what is method used to compute for this value, as a result all this values use T4_mp algorithm. I used step 0.0001 and can't more(it takes very long time).
I have no test case fot T4 algorithm, but i think it's not necessary delete T4 algorithm, because T4_mp isn't stable in some cases(but i didn't find this cases for T4 algorithm).

evgenyzhurko · 2017-04-09T10:01:24Z

@person142 gcov shows 92.5% coverage for lines and 93.3% for functions. As I said, only OwensT4 function not coveraged yet, because OwenstT4_mp covers range of OwensT4, but OwenstT4_mp might be unstable in some case.

ev-br · 2017-07-14T10:34:07Z

Josh, @person142, any chance you can take a look?

person142 · 2017-07-14T12:41:42Z

Yes, sorry @evgenyzhurko for sitting on this for so long... will try to take a final look; probably won't happen until next week though.

ghost · 2017-11-06T05:42:55Z

@evgenyzhurko Can you rebase on master?

evgenyzhurko · 2018-02-12T20:08:59Z

I rebased this PR on master, fixed all build problem and merge conflicts. Is anyone can review this PR?

person142 · 2018-02-18T20:17:08Z

@evgenyzhurko I rebased again (recent changes in cephes conflicted) and made some fixes. Still need to investigate the test coverage a little more.

Probably most importantly: from reading the code it looks like the Boost implementation was followed closely, so I added the Boost license information to the top of the file. Am I correct?

evgenyzhurko · 2018-02-18T22:26:27Z

@person142 You are right, in many case i follow to boost implementation.

I was trying to find values a and h that can be calculated using only T4(that not covered yet) function but it's no use. I calculated that for 0.06 < a < 0.25 and 0.6 < h < 3.5 should be used T4 function. I iterated through this interval with step=0.00001 and T4 wasn't used.

person142 · 2018-02-19T06:59:19Z

I was trying to find values a and h that can be calculated using only T4(that not covered yet) function but it's no use

It appears you were using Boost's arbitrary precision dispatch function instead of the double precision version. I switched to using the double precision version (which covers T4), fixed a bug this revealed in T1, and then removed the no-longer-needed arbitrary precision helper functions.

evgenyzhurko · 2018-02-19T17:02:45Z

@person142 Thanks for your help and my mistakes explanation. Is something else need from me in this PR?

person142 · 2018-02-19T18:38:21Z

Is something else need from me in this PR?

I made a few more fixes, and I think this is good to go now.

We will see what the CIs say, and we should probably ping @pv to do a sanity check since I feel a little too involved in the code editing process to be an objective third party. Pauli, I'm seeing 100% test coverage on owens_t.c with the exception of the default case on line 308, which should in theory never be reached (it's just a sentinel).

person142 · 2018-02-19T18:40:41Z

Ah shoot, looks like cephes/protos.h and some submodule updates got picked up in the rebase--will have to fix that.

person142 · 2018-02-19T21:03:24Z

Ok, cleaned up the history. Now there's one clean commit.

person142 · 2018-02-24T17:14:20Z

Ok, got this in at long last. Thanks for your patience @evgenyzhurko.

rgommers reviewed Mar 2, 2017

View reviewed changes

scipy/special/basic.py Outdated

Parameters

----------

h: int or float

Copy link

Member

rgommers Mar 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

int or float --> scalar

(same 2 lines lower)

rgommers reviewed Mar 2, 2017

View reviewed changes

scipy/special/basic.py Outdated

owens_t(a * h, 1 / a)

if a == 0:

return 0

Copy link

Member

rgommers Mar 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blank line here

rgommers reviewed Mar 2, 2017

View reviewed changes

scipy/special/basic.py Outdated

if a == 0:

return 0

if h == 0:

return 1 * np.arctan(a) / (2 * np.pi)

Copy link

Member

rgommers Mar 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blank line here

rgommers reviewed Mar 2, 2017

View reviewed changes

rgommers added enhancement A new feature or improvement scipy.special labels Mar 2, 2017

person142 reviewed Mar 7, 2017

View reviewed changes

pv reviewed Mar 7, 2017

View reviewed changes

scipy/special/cephes/owens_t.c Outdated

for (i = 0; i < 7; i++) {

if (h <= HRANGE[i]) {

ihint = i;

Copy link

Member

pv Mar 7, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

iahint?

pv removed the needs-work Items that are pending response from the author label Mar 25, 2017

ev-br mentioned this pull request Aug 18, 2017

False CDF values for skew normal distribution #7746

Closed

rgommers added this to the 1.1.0 milestone Sep 16, 2017

ENH: special: implement Owen's T function

b5705e8

person142 merged commit 895a774 into scipy:master Feb 24, 2018

		@@ -3416,5 +3416,412 @@ def xfunc(delta, r):
		assert_func_equal(special.pseudo_huber, w, z, rtol=1e-13, atol=1e-13)


		owenst_dt = [[(1.9508080482482910156250000000000000000000000000000000000000000000000000000000000000000000000000000000e+00), (9.7540378570556640625000000000000000000000000000000000000000000000000000000000000000000000000000000000e-02), (2.2942365980155351117754771526378292708740231264960539316842063696200884829628237039895484898529192823e-03)],

ENH: implemented Owen's T function #7120

ENH: implemented Owen's T function #7120

Conversation

evgenyzhurko commented Mar 2, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rgommers Mar 2, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rgommers commented Mar 2, 2017

pv commented Mar 2, 2017 via email

evgenyzhurko commented Mar 2, 2017

ev-br commented Mar 2, 2017

ewmoore commented Mar 2, 2017

person142 commented Mar 2, 2017

pv commented Mar 2, 2017 via email

person142 commented Mar 3, 2017

person142 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

person142 Mar 7, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

evgenyzhurko commented Mar 7, 2017

person142 commented Mar 7, 2017

evgenyzhurko commented Mar 7, 2017

ev-br commented Mar 7, 2017

evgenyzhurko commented Mar 7, 2017

person142 commented Mar 7, 2017

person142 commented Mar 18, 2017

evgenyzhurko commented Mar 18, 2017

evgenyzhurko commented Mar 18, 2017

evgenyzhurko commented Mar 18, 2017

person142 commented Mar 18, 2017

evgenyzhurko commented Mar 18, 2017 • edited Loading

evgenyzhurko commented Mar 18, 2017

person142 commented Mar 18, 2017

evgenyzhurko commented Mar 20, 2017

person142 commented Mar 20, 2017 • edited Loading

evgenyzhurko commented Mar 20, 2017

evgenyzhurko commented Mar 21, 2017

evgenyzhurko commented Apr 9, 2017 • edited Loading

ev-br commented Jul 14, 2017

person142 commented Jul 14, 2017

ghost commented Nov 6, 2017

evgenyzhurko commented Feb 12, 2018

person142 commented Feb 18, 2018

evgenyzhurko commented Feb 18, 2018

person142 commented Feb 19, 2018

evgenyzhurko commented Feb 19, 2018

person142 commented Feb 19, 2018

person142 commented Feb 19, 2018

person142 commented Feb 19, 2018

person142 commented Feb 24, 2018

evgenyzhurko commented Mar 2, 2017 •

edited

Loading

rgommers Mar 2, 2017 •

edited

Loading

person142 Mar 7, 2017 •

edited

Loading

evgenyzhurko commented Mar 18, 2017 •

edited

Loading

person142 commented Mar 20, 2017 •

edited

Loading

evgenyzhurko commented Apr 9, 2017 •

edited

Loading