Overhaul `factorial{,2,k}`: API coherence, bug fixes & consistent array-support #15841

h-vetinari · 2022-03-22T09:22:57Z

This is some of the work towards #15600. It's still missing several things (mainly extend='complex'), but it's very big as-is, so I wanted to post this as just a something to further the discussion.

Currently it contains several small behaviour changes:

Make factorial2 / factorialk return 0 for small negative integers
- See 016919a and this comment.
- Currently no mitigation/deprecation for the behaviour change implemented, though this is mainly because of the WIP - we can adapt however we'd like.
Make both factorial / factorial2 consistently return the same type as the input. This changes factorial2 to not return 0-dim arrays for scalar inputs, and - inversely - changes factorial to not turn 0-dim arrays into scalars.
- See 58abcb6 and this comment.
- As above - no deprecation for the behaviour change yet, but happy to discuss (though BUG: special: fix factorial return type for 0-d inputs #10954 was treated as a bugfix as well...).
Switch the output type of factorialk to object for k >= 10 (I think this is a reasonable trade-off, for more details see 99d088a)
~~Switch the float approximation for factorial2 to the one from ENH: extend factorial2 to complex domain #15349~~ deferred due to ambiguity of the extension.

In addition, it adds array-support for factorialk, exact=True array-support for factorial2 (and many related fixes), and also fixes #13122; Finally, it cleans up and parametrizes a lot of tests, and adds a whole bunch of new ones.

Note: the individual commits should be much easier to review than the entire diff at once.

h-vetinari · 2022-03-27T01:49:41Z

@tupui @ilayn @tirthasheshpatel @j-bowhay

This should be ready for a first look. I realize it's pretty big, and I'll be happy to cut it up into more digestible pieces (though as the OP notes, I've tried to make the individual commits make sense on their own). I just wanted to ask for some confirmation about the direction (slight behaviour changes, array-extensions, testing approach) before doing that.

I think cleaning up the current behavioural mess should be the first goal (TBD if we want to explicitly deprecate certain things, or if they're inconsequential enough to just change like 0-dim arrays to scalars), and further unifications/extensions should come on top of that.

h-vetinari · 2022-03-27T02:03:15Z

Regarding test coverage, this adds about 500 tests (though highly parametrized) - compared to the CI for 9ce6bb0:

=== 36914 passed, 2640 skipped, 140 xfailed, 9 xpassed in 428.93s (0:07:08) ====

we now have

=== 37430 passed, 2640 skipped, 140 xfailed, 9 xpassed in 362.98s (0:06:02) ====

They're all pretty-to-very fast though. 🙃

ilayn

Had a brief look to the parts that I could follow.

scipy/special/_basic.py

j-bowhay · 2022-11-20T22:20:17Z

A friendly ping @h-vetinari

h-vetinari · 2022-11-21T00:15:06Z

A friendly ping @h-vetinari

Thanks. Just a rebase so far, still needs some work on some open review comments (I had still been hoping to get some feedback about the double factorial for even integers, but since extend='complex' is not part of this PR, I guess we don't have to solve that now).

h-vetinari · 2023-04-13T03:28:35Z

@ilayn @j-bowhay
I've finally picked this up again. I've now disabled the float-case for factorial2 that needed the ambiguous extension into the complex plane. I'll leave that for a follow-up if/once we implement extend='complex' or something like that.

The commit history is a complete mess (sorry), but before I spend a bunch of time cleaning it up, I wanted to ask:

could you please have a look if you see anything egregious that needs fixing?
would you object to landing this not as a single squashed commit, but as something like ~5 well-delineated commits (which could be rebased & fast-forward merged by the github interface)?

I think this PR does quite a number of things (though all closely related) and I wouldn't want that all to be bunched up into a giant commit that makes the git history almost useless. My goal would be to always pair code-changes with the respective test changes/additions per commit (in a way that's passing CI, obviously), but I could also imagine different ways (e.g. commit tests first, but with some skips, and remove those skips as the fixes get committed to the code). Please let me know what you think.

I've rebased the PR (no conflicts), new commits since the last discussions start from 45a3177 (but I was thrashing around a fair bit, so not sure if single-commit reviewing is worth your time on this).

j-bowhay · 2023-04-13T07:25:06Z

@ilayn @j-bowhay I've finally picked this up again. I've now disabled the float-case for factorial2 that needed the ambiguous extension into the complex plane. I'll leave that for a follow-up if/once we implement extend='complex' or something like that.

I am fine with this, better to have a solid implementation first before worrying about extensions.

The commit history is a complete mess (sorry), but before I spend a bunch of time cleaning it up, I wanted to ask:

could you please have a look if you see anything egregious that needs fixing?

I'll try to have a look over this in the coming weeks

would you object to landing this not as a single squashed commit, but as something like ~5 well-delineated commits (which could be rebased & fast-forward merged by the github interface)?
I think this PR does quite a number of things (though all closely related) and I wouldn't want that all to be bunched up into a giant commit that makes the git history almost useless. My goal would be to always pair code-changes with the respective test changes/additions per commit (in a way that's passing CI, obviously), but I could also imagine different ways (e.g. commit tests first, but with some skips, and remove those skips as the fixes get committed to the code). Please let me know what you think.

I'm happy with what you're proposing

j-bowhay · 2023-04-13T09:02:01Z

Can you post the benchmark results vs main?

h-vetinari · 2023-04-13T13:44:58Z

Can you post the benchmark results vs main?

Unfortunately no. python dev.py bench -t special --compare main has failed 3 times already in my WSL. I'm not worried about performance though; the hot path for factorial is unchanged, and accelerated for factorial{2,k}. But even if the few extra checks were to have a little overhead, they're worth it for correctness / consistency.

If you can check out this PR and run the benchmark, that would be nice though.

j-bowhay · 2023-04-13T16:50:55Z

before           after         ratio
     [c62be9c6]       [c71cc6e2]
     <main>           <factorial>
!      17.6±0.7μs           failed      n/a  special.Factorial.time_factorial_exact_true_array_negative_float
+         374±6ns       11.9±0.7μs    31.79  special.Factorial.time_factorial_exact_false_array_negative_float
+         420±8ns       13.2±0.5μs    31.33  special.Factorial.time_factorial_exact_false_array_positive_float
+        93.3±1ns      2.16±0.06μs    23.13  special.FactorialK.time_factorialk_exact_false_scalar_negative_int
+        87.5±2ns      1.90±0.02μs    21.72  special.Factorial2.time_factorial2_exact_true_scalar_negative_int
+         679±8ns       11.4±0.6μs    16.72  special.Factorial.time_factorial_exact_false_array_negative_int
+        799±20ns       11.7±0.7μs    14.62  special.Factorial.time_factorial_exact_false_array_positive_int
+        614±30ns      4.13±0.05μs     6.73  special.FactorialK.time_factorialk_exact_true_scalar_positive_int(100, 9)
+         826±8ns       5.44±0.4μs     6.60  special.FactorialK.time_factorialk_exact_true_scalar_positive_int(100, 5)
+        618±10ns      4.07±0.07μs     6.58  special.FactorialK.time_factorialk_exact_true_scalar_positive_int(100, 8)
+         759±6ns       4.83±0.4μs     6.36  special.FactorialK.time_factorialk_exact_true_scalar_positive_int(100, 6)
+         974±3ns       6.13±0.4μs     6.29  special.FactorialK.time_factorialk_exact_true_scalar_positive_int(100, 4)
+         689±3ns       4.10±0.1μs     5.96  special.FactorialK.time_factorialk_exact_true_scalar_positive_int(100, 7)
+     1.76±0.04μs       9.25±0.2μs     5.25  special.FactorialK.time_factorialk_exact_true_scalar_positive_int(100, 2)
+     1.25±0.01μs      6.35±0.08μs     5.07  special.FactorialK.time_factorialk_exact_true_scalar_positive_int(100, 3)
+     1.85±0.03μs       8.95±0.1μs     4.83  special.Factorial2.time_factorial2_exact_true_scalar_positive_int(100)
+     2.21±0.01μs       10.3±0.1μs     4.67  special.FactorialK.time_factorialk_exact_true_scalar_positive_int(500, 9)
+     3.33±0.05μs       15.4±0.2μs     4.63  special.FactorialK.time_factorialk_exact_true_scalar_positive_int(500, 6)
+     4.04±0.04μs       18.2±0.4μs     4.51  special.FactorialK.time_factorialk_exact_true_scalar_positive_int(500, 5)
+     2.86±0.02μs      12.7±0.08μs     4.43  special.FactorialK.time_factorialk_exact_true_scalar_positive_int(500, 7)
+     3.84±0.02μs       16.2±0.4μs     4.22  special.Factorial2.time_factorial2_exact_true_scalar_positive_int(200)
+     2.52±0.03μs       10.5±0.1μs     4.15  special.FactorialK.time_factorialk_exact_true_scalar_positive_int(500, 8)
+         551±8ns       2.17±0.1μs     3.93  special.Factorial.time_factorial_exact_false_scalar_positive_float(10000.5)
+      3.64±0.1μs       14.3±0.3μs     3.92  special.FactorialK.time_factorialk_exact_true_scalar_positive_int(100, 1)
+        555±20ns       2.16±0.1μs     3.89  special.Factorial.time_factorial_exact_false_scalar_positive_float(1000.3)
+     7.21±0.01μs       28.0±0.2μs     3.88  special.FactorialK.time_factorialk_exact_true_scalar_positive_int(500, 3)
+         573±9ns       2.21±0.1μs     3.85  special.Factorial.time_factorial_exact_false_scalar_positive_float(100.8)
+     5.17±0.02μs       18.7±0.3μs     3.62  special.FactorialK.time_factorialk_exact_true_scalar_positive_int(500, 4)
+      9.71±0.3μs         33.2±1μs     3.42  special.Factorial2.time_factorial2_exact_true_scalar_positive_int(400)
+         498±7ns      1.62±0.06μs     3.24  special.Factorial.time_factorial_exact_false_scalar_negative_float
+      12.1±0.1μs       35.0±0.4μs     2.90  special.FactorialK.time_factorialk_exact_true_scalar_positive_int(500, 2)
+        841±20ns       2.37±0.2μs     2.82  special.Factorial.time_factorial_exact_false_scalar_positive_int(10000)
+        851±10ns       2.39±0.2μs     2.80  special.Factorial.time_factorial_exact_false_scalar_positive_int(1000)
+         870±4ns       2.37±0.1μs     2.73  special.Factorial.time_factorial_exact_false_scalar_positive_int(100)
+        774±30ns       1.55±0.1μs     2.01  special.Factorial.time_factorial_exact_false_scalar_negative_int

I am getting quite a significant slow down in certain cases

h-vetinari · 2023-04-13T21:40:17Z

I am getting quite a significant slow down in certain cases

Thanks for testing. Even though we're testing very small arrays (where the checks' constant impact is mitigated the least by the size of the array), I'm surprised by the magnitude. Looking at the added checks, I don't see something that would explain this, but perhaps _ufuncs._factorial(n) is much faster than gamma.

It's was also an API design decision to return all-nan arrays of any dtype unchanged, but that does add the overhead of having to check whether a given array is actually all-nan. Given the slow-downs, I think I'll remove this bit again and just throw if the dtype is not what it should be.

I'll have a look at other places where we might be leaving some performance on the table.

h-vetinari · 2023-04-14T00:57:05Z

@j-bowhay

Could you do another run perhaps?

For factorialk, I'm quite willing to eat the performance drop though - it currently has 0 checks and corner-case handling, and adding that is worth some overhead. Also, the _range_prod thing has a much better complexity than the existing implementation (O(log(n)) rather than O(n) multiplications); I guess with the recursion overhead it's possible that it's slower for small integers, but I don't think it's worth implementing a cut-off for this TBH.

j-bowhay · 2023-04-14T08:16:26Z

       before           after         ratio
     [49d46084]       [ae4a7795]
     <main>           <factorial>
!      16.5±0.1μs           failed      n/a  special.Factorial.time_factorial_exact_true_array_negative_float
+      84.2±0.5ns      1.81±0.04μs    21.50  special.Factorial2.time_factorial2_exact_true_scalar_negative_int
+        131±40ns      2.14±0.01μs    16.38  special.FactorialK.time_factorialk_exact_false_scalar_negative_int
+         400±3ns       2.50±0.8μs     6.26  special.Factorial.time_factorial_exact_false_array_positive_float
+     1.77±0.01μs      8.73±0.04μs     4.94  special.Factorial2.time_factorial2_exact_true_scalar_positive_int(100)
+         602±8ns       2.96±0.9μs     4.91  special.Factorial.time_factorial_exact_false_scalar_positive_float(10000.5)
+         600±9ns       2.95±0.9μs     4.91  special.Factorial.time_factorial_exact_false_scalar_positive_float(1000.3)
+         622±7ns       3.01±0.9μs     4.84  special.Factorial.time_factorial_exact_false_scalar_positive_float(100.8)
+       851±300ns      4.01±0.09μs     4.71  special.FactorialK.time_factorialk_exact_true_scalar_positive_int(100, 9)
+        4.65±1μs         21.9±7μs     4.70  special.FactorialK.time_factorialk_exact_true_scalar_positive_int(500, 6)
+       530±200ns       2.46±0.8μs     4.64  special.Factorial.time_factorial_exact_false_array_negative_float
+       889±300ns       4.12±0.2μs     4.63  special.FactorialK.time_factorialk_exact_true_scalar_positive_int(100, 8)
+        5.53±2μs         25.4±8μs     4.60  special.FactorialK.time_factorialk_exact_true_scalar_positive_int(500, 5)
+        3.93±1μs         18.0±6μs     4.59  special.FactorialK.time_factorialk_exact_true_scalar_positive_int(500, 7)
+        3.26±1μs         14.7±5μs     4.50  special.FactorialK.time_factorialk_exact_true_scalar_positive_int(500, 9)
+     3.61±0.01μs       16.1±0.2μs     4.47  special.Factorial2.time_factorial2_exact_true_scalar_positive_int(200)
+      1.15±0.4μs       5.09±0.1μs     4.44  special.FactorialK.time_factorialk_exact_true_scalar_positive_int(100, 5)
+       956±300ns       4.15±0.2μs     4.34  special.FactorialK.time_factorialk_exact_true_scalar_positive_int(100, 7)
+        3.47±1μs         14.8±5μs     4.26  special.FactorialK.time_factorialk_exact_true_scalar_positive_int(500, 8)
+      1.36±0.4μs       5.77±0.2μs     4.24  special.FactorialK.time_factorialk_exact_true_scalar_positive_int(100, 4)
+      1.06±0.3μs      4.47±0.03μs     4.21  special.FactorialK.time_factorialk_exact_true_scalar_positive_int(100, 6)
+        541±10ns       2.16±0.7μs     4.00  special.Factorial.time_factorial_exact_false_scalar_negative_float
+        835±30ns         3.25±1μs     3.89  special.Factorial.time_factorial_exact_false_scalar_positive_int(1000)
+        838±30ns         3.25±1μs     3.89  special.Factorial.time_factorial_exact_false_scalar_positive_int(10000)
+        861±30ns         3.26±1μs     3.79  special.Factorial.time_factorial_exact_false_scalar_positive_int(100)
+     8.66±0.01μs         32.7±1μs     3.77  special.Factorial2.time_factorial2_exact_true_scalar_positive_int(400)
+      2.49±0.8μs         9.16±4μs     3.67  special.FactorialK.time_factorialk_exact_true_scalar_positive_int(100, 2)
+         596±3ns       2.17±0.7μs     3.65  special.Factorial.time_factorial_exact_false_array_negative_int
+      1.76±0.6μs       6.33±0.3μs     3.60  special.FactorialK.time_factorialk_exact_true_scalar_positive_int(100, 3)
+         702±3ns       2.35±0.7μs     3.35  special.Factorial.time_factorial_exact_false_array_positive_int
+        755±20ns       2.11±0.7μs     2.80  special.Factorial.time_factorial_exact_false_scalar_negative_int
+        10.0±3μs       27.7±0.6μs     2.76  special.FactorialK.time_factorialk_exact_true_scalar_positive_int(500, 3)
+        5.11±2μs      14.1±0.07μs     2.75  special.FactorialK.time_factorialk_exact_true_scalar_positive_int(100, 1)
+        7.14±2μs       18.4±0.3μs     2.58  special.FactorialK.time_factorialk_exact_true_scalar_positive_int(500, 4)
+        17.0±5μs       34.7±0.5μs     2.04  special.FactorialK.time_factorialk_exact_true_scalar_positive_int(500, 2)

That change looks like it improved performance quite a bit

Also avoids math.factorial for these; which errors in 3.10+. Fixes scipyGH-13122

No functional changes, only added the docstring.

Also fix NaN-handling in factorial{2,k}.

For factorial2, the array-path for exact=False existed already.

No functional changes.

Remove benchmarks for floats with exact=True (not supported anymore) and extend the tested range with some larger inputs.

also improve errors for complex inputs in special.factorial

h-vetinari · 2023-05-03T00:57:48Z

OK, it's time :)

After chasing green CI for ~3 weeks, the only blemish is now #18407, but that's manageable.

h-vetinari mentioned this pull request Mar 22, 2022

BUG: handle inconsistencies in factorial functions and their extensions #15600

Closed

h-vetinari force-pushed the factorial branch 6 times, most recently from dfcc136 to abfd462 Compare March 22, 2022 12:54

tupui added enhancement A new feature or improvement scipy.special labels Mar 22, 2022

h-vetinari force-pushed the factorial branch 4 times, most recently from 4cea1e4 to 2c055e8 Compare March 27, 2022 01:46

h-vetinari marked this pull request as ready for review March 27, 2022 01:47

h-vetinari requested a review from person142 as a code owner March 27, 2022 01:47

ilayn reviewed Mar 27, 2022

View reviewed changes

h-vetinari force-pushed the factorial branch from 2c055e8 to 51a62d2 Compare November 21, 2022 00:12

h-vetinari force-pushed the factorial branch from 51a62d2 to 8b2b929 Compare April 12, 2023 01:48

h-vetinari added 18 commits May 3, 2023 10:29

MAINT: accelerate exact-path of factorial{2,k}

9a63e48

TST: add reference tests for factorial{,2,k}

d050d63

API/BUG: consistently handle float arguments in factorial

7b4ae10

Also avoids math.factorial for these; which errors in 3.10+. Fixes scipyGH-13122

TST: break out accuracy tests for factorial; add them for factorial2

5a1f9f4

MAINT: functionalize exact array computation for factorial

ff866eb

No functional changes, only added the docstring.

API: ensure factorial{,2} maintain dimensionality

5e91097

Also fix NaN-handling in factorial{2,k}.

MAINT: simplify factorial2 approximation

4a9b320

API: fix handling of NaNs and empty arrays in factorial{,2}

a7b5669

API: full array-support in factorial{2,k}

ca87b15

For factorial2, the array-path for exact=False existed already.

API: correctly deal with None in factorial{,2,k}

ea6aa8f

ENH: check type/value of k in factorialk

836ef42

DEP: deprecate mixing integers & NaNs in factorial for exact=True

82ab15b

ENH: handle dtype hierarchy for k>1 in factorialk

9e442c8

ENH: add hint to factorialk errors to try factorial{,2}

64a8e85

TST: add test for array-likes

b7d786b

MAINT: improve clarity in _exact_factorialx_array

4b240ad

No functional changes.

BENCH: update factorial{,2,k} benchmarks

83a4a24

Remove benchmarks for floats with exact=True (not supported anymore) and extend the tested range with some larger inputs.

TST: add tests for scalar corner cases for factorial{,2,k}

7f26f05

also improve errors for complex inputs in special.factorial

h-vetinari force-pushed the factorial branch from a5fb9f2 to 7f26f05 Compare May 2, 2023 23:29

h-vetinari merged commit 3519bb7 into scipy:main May 3, 2023
19 of 20 checks passed

h-vetinari deleted the factorial branch May 3, 2023 00:58

This was referenced May 3, 2023

ENH: extensions of factorial{,2,k} for complex domains, recurrences, ... #18409

Open

META: deprecation tracker #15765

Open

h-vetinari mentioned this pull request May 16, 2023

ENH: Improve _range_prod implementation behind exact factorial{,2,k} functions #18474

Open

This was referenced Jun 26, 2023

BUG: factorial return type inconsistent for 0-dim arrays #18753

Closed

BUG: fix handling for factorial(..., exact=False) for 0-dim array inputs #18754

Merged

BUG: scipy.special.factorial2(-1) returns 0 instead of 1 #18813

Closed

This was referenced Dec 18, 2023

DEP: non-integers in factorial(..., exact=True): deprecate for scalars and execute deprecation for arrays #19703

Merged

ENH: enable approximation for factorialk #19716

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Overhaul `factorial{,2,k}`: API coherence, bug fixes & consistent array-support #15841

Overhaul `factorial{,2,k}`: API coherence, bug fixes & consistent array-support #15841

h-vetinari commented Mar 22, 2022 •

edited

Loading

h-vetinari commented Mar 27, 2022 •

edited

Loading

h-vetinari commented Mar 27, 2022

ilayn left a comment

j-bowhay commented Nov 20, 2022

h-vetinari commented Nov 21, 2022

h-vetinari commented Apr 13, 2023 •

edited

Loading

j-bowhay commented Apr 13, 2023

j-bowhay commented Apr 13, 2023

h-vetinari commented Apr 13, 2023

j-bowhay commented Apr 13, 2023

h-vetinari commented Apr 13, 2023

h-vetinari commented Apr 14, 2023

j-bowhay commented Apr 14, 2023

h-vetinari commented May 3, 2023

Overhaul factorial{,2,k}: API coherence, bug fixes & consistent array-support #15841

Overhaul factorial{,2,k}: API coherence, bug fixes & consistent array-support #15841

Conversation

h-vetinari commented Mar 22, 2022 • edited Loading

Note: the individual commits should be much easier to review than the entire diff at once.

h-vetinari commented Mar 27, 2022 • edited Loading

h-vetinari commented Mar 27, 2022

ilayn left a comment

Choose a reason for hiding this comment

j-bowhay commented Nov 20, 2022

h-vetinari commented Nov 21, 2022

h-vetinari commented Apr 13, 2023 • edited Loading

j-bowhay commented Apr 13, 2023

j-bowhay commented Apr 13, 2023

h-vetinari commented Apr 13, 2023

j-bowhay commented Apr 13, 2023

h-vetinari commented Apr 13, 2023

h-vetinari commented Apr 14, 2023

j-bowhay commented Apr 14, 2023

h-vetinari commented May 3, 2023

Overhaul `factorial{,2,k}`: API coherence, bug fixes & consistent array-support #15841

Overhaul `factorial{,2,k}`: API coherence, bug fixes & consistent array-support #15841

h-vetinari commented Mar 22, 2022 •

edited

Loading

h-vetinari commented Mar 27, 2022 •

edited

Loading

h-vetinari commented Apr 13, 2023 •

edited

Loading