
MNT: disable the coercion cache for the nogil build #26294

Merged — 3 commits merged into numpy:main from disable-coercion-cache on Apr 19, 2024

Conversation

@ngoldbaum (Member)

This disables the coercion cache on the nogil build, but I'd like to find a benchmark that shows the benefit of this cache better than what's already in bench_array_coercion.py and bench_core.py.

I don't see any significant performance impact from this change, although note that I can't actually run asv in comparison mode yet on the nogil build.

When I disable this cache for the single-threaded build (by setting COERCION_CACHE_CACHE_SIZE to 0 in both branches of the ifdef), where I can run asv in comparison mode, I get the following results:

| Change   | Before [3c09f166] <main>   | After [c981c067] <disable-coercion-cache>   |   Ratio | Benchmark (Parameter)                                               |
|----------|----------------------------|---------------------------------------------|---------|---------------------------------------------------------------------|
| +        | 219±0.6ns                  | 233±6ns                                     |    1.06 | bench_array_coercion.ArrayCoercionSmall.time_array_no_copy([1])     |
| +        | 196±0.7ns                  | 208±1ns                                     |    1.06 | bench_array_coercion.ArrayCoercionSmall.time_ascontiguousarray([1]) |
| +        | 249±3ns                    | 262±6ns                                     |    1.05 | bench_array_coercion.ArrayCoercionSmall.time_asarray_dtype([1])     |

| Change   | Before [3c09f166] <main>   | After [c981c067] <disable-coercion-cache>   |   Ratio | Benchmark (Parameter)                                                            |
|----------|----------------------------|---------------------------------------------|---------|----------------------------------------------------------------------------------|
| +        | 687±7ns                    | 732±2ns                                     |    1.07 | bench_core.StatsMethods.time_sum('bool_', 100)                                   |
| -        | 10.9±2μs                   | 10.4±0.01μs                                 |    0.95 | bench_core.CountNonzero.time_count_nonzero_axis(3, 10000, <class 'numpy.int64'>) |
| -        | 22.7±0.1μs                 | 21.6±0.4μs                                  |    0.95 | bench_core.StatsMethods.time_var('bool_', 10000)                                 |
| -        | 640±20ns                   | 602±10ns                                    |    0.94 | bench_core.StatsMethods.time_prod('uint64', 100)                                 |
| -        | 665±40ns                   | 610±4ns                                     |    0.92 | bench_core.StatsMethods.time_max('int64', 100)                                   |
| -        | 201±2ns                    | 183±0.9ns                                   |    0.91 | bench_core.Core.time_array_l1                                                    |
| -        | 7.15±2ms                   | 6.34±0.3ms                                  |    0.89 | bench_core.CountNonzero.time_count_nonzero_axis(3, 1000000, <class 'str'>)       |

So the cache does give a small, ~5% benefit for a few of the array coercion tests, but there is a somewhat confusing mix of improvements and regressions for the reduction operations, plus two other ~10% swings for creating an array from a single-element list and for count_nonzero with an axis set.

Maybe this is all just noise?
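The asv benchmarks above boil down to timing small-list coercions; a standalone reproduction of the same operations with `timeit` (my sketch, not the actual asv benchmark code) looks like:

```python
import timeit

import numpy as np

# Time the same small-list coercions exercised by
# bench_array_coercion.ArrayCoercionSmall, without asv.
statements = (
    "np.array([1])",
    "np.ascontiguousarray([1])",
    "np.asarray([1], dtype=np.int64)",
)
for stmt in statements:
    n = 100_000
    t = timeit.timeit(stmt, globals={"np": np}, number=n)
    print(f"{stmt:35s} {t / n * 1e9:8.1f} ns/call")
```

Per-call times in the low hundreds of nanoseconds are dominated by coercion overhead, which is why even a small cache win or loss shows up as a 5–10% ratio change.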

@seberg (Member) left a comment

> Maybe this is all just noise?

The reduction changes must be noise. The array([1]) case is exactly the type of operation that would benefit, so that change must be real (also, if it were noise you would expect other small-array benchmarks to fluctuate more strongly).

If you used array([[1]]) or so (i.e. deeper nesting, maybe even containing further arrays, like array([[small_arr], [small_arr]])), the benefit should be bigger.
I might have timed the benefit of the discovery step on its own when adding the cache, so the benefit in percent looked bigger than it is for full coercion.
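The nesting effect is easy to check directly; a quick sketch (mine, not from the PR) that times progressively deeper inputs:

```python
import timeit

import numpy as np

# Deeper nesting means more dtype-discovery work per coercion,
# which is the part the coercion cache can skip.
small = np.array([1])
cases = {
    "flat list:    array([1])": "np.array([1])",
    "nested list:  array([[1]])": "np.array([[1]])",
    "nested arrays": "np.array([[small], [small]])",
}
g = {"np": np, "small": small}
for name, stmt in cases.items():
    n = 50_000
    t = timeit.timeit(stmt, globals=g, number=n)
    print(f"{name:28s} {t / n * 1e9:8.1f} ns/call")
```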


Anyway, I think we can definitely just disable it in the nogil build. It may be a nice boost for some code, but overall it will have less impact than dropping the small-array cache.

@ngoldbaum ngoldbaum added the 39 - no-GIL PRs and issues related to support for free-threaded CPython (a.k.a. no-GIL, PEP 703) label Apr 17, 2024
@ngoldbaum (Member, Author)

I tried testing scikit-learn yesterday and found that parallel forest training hit a similar segfault, which I was able to reduce to the test I just added. I decided it's worth having a test_multithreading.py module specifically for tests of NumPy in a multithreaded context. I suspect we'll have more of these and will be able to cut a lot of boilerplate by putting them all in the same file.

@ngoldbaum (Member, Author)

Pulling this in, if someone has an issue with the new test file I’ll do a followup.

@ngoldbaum ngoldbaum merged commit da95f8e into numpy:main Apr 19, 2024
65 checks passed
@rgommers rgommers added this to the 2.1.0 release milestone Apr 22, 2024
Labels: 03 - Maintenance; 39 - no-GIL (PRs and issues related to support for free-threaded CPython, a.k.a. no-GIL, PEP 703)

3 participants