Feature Request: Create a Global Setting for Enabling numba engine #33966

jtelleriar · 2020-05-04T11:25:42Z

Would it be possible to add a global pandas setting that enables the numba engine by default wherever it is supported, e.g. in pandas.DataFrame.apply, .transform, etc.?

Thanks!

TomAugspurger · 2020-05-04T13:04:00Z

cc @mroeschke. Seems reasonable.

jorisvandenbossche · 2020-05-04T13:51:30Z

We currently have the pd.options.compute.use_numexpr and pd.options.compute.use_bottleneck options, so the interface could resemble those.
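For reference, a short sketch of how the two existing options mentioned above are read and toggled. It only touches compute.use_bottleneck and compute.use_numexpr; the variable names are illustrative, and the numba option requested in this issue is not used here.

```python
import pandas as pd

# Read the current values of the two existing compute options.
bottleneck_before = pd.get_option("compute.use_bottleneck")
numexpr_before = pd.options.compute.use_numexpr  # attribute-style access

# Temporarily disable bottleneck; the previous value is restored on exit.
with pd.option_context("compute.use_bottleneck", False):
    bottleneck_inside = pd.get_option("compute.use_bottleneck")

bottleneck_after = pd.get_option("compute.use_bottleneck")
```

A numba switch modelled on these would presumably live under the same compute namespace and be toggled the same way.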

mroeschke · 2020-05-04T19:48:19Z

I am +0 on the idea in theory, but there are some considerations that may make this less than convenient for the user:

The numba behavior today doesn't have any "fall back" behavior, unlike numexpr and bottleneck (which I believe both fall back). This was an intentional decision to make things simpler on our end.
The numba and cython engines are not totally interchangeable usage-wise. For example, for groupby.transform the UDF signature needs to be exactly def f(values, index, ...) for engine='numba', but can be anything for engine='cython'. So functions cannot easily be reused between engines.
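To illustrate the signature constraint described above, here is a minimal sketch of a check that a UDF starts with the parameters (values, index). The helper name check_numba_udf_signature and both example functions are hypothetical; this is not the validation pandas performs internally, only an illustration of why one function cannot serve both engines.

```python
import inspect


def check_numba_udf_signature(func):
    """Raise ValueError unless func's first two parameters are (values, index)."""
    params = list(inspect.signature(func).parameters)
    if params[:2] != ["values", "index"]:
        raise ValueError(
            f"expected the first two parameters to be 'values' and 'index', "
            f"got {params[:2]}"
        )


# Shaped for engine='numba': receives the group's values and index separately.
def demean(values, index):
    return values - values.mean()


# Shaped for engine='cython': any signature is accepted.
def demean_generic(group):
    return group - group.mean()


check_numba_udf_signature(demean)  # passes silently
```

Calling check_numba_udf_signature(demean_generic) raises ValueError, which is the reuse problem in a nutshell: a UDF written for the default engine would start failing as soon as a global switch routed it to numba.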

mroeschke · 2020-06-01T03:31:12Z

Just to be clear about the behavior for this feature:

groupby.transform(..., engine='cython') with compute.use_numba = True would use the numba engine.
Internally we won't "fall back" to the cython behavior, mainly to reduce code complexity (groupby.transform(..., engine='numba') already doesn't fall back either).
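The two rules above can be sketched in plain Python. The functions resolve_engine and run, and the numba_path/cython_path callables, are hypothetical stand-ins for illustration; they are not pandas internals.

```python
def resolve_engine(engine="cython", use_numba=False):
    """Pick the engine as described above: the global flag wins over 'cython'."""
    if engine not in ("cython", "numba"):
        raise ValueError(f"unknown engine: {engine!r}")
    if engine == "numba" or use_numba:
        return "numba"
    return "cython"


def run(engine, use_numba, numba_path, cython_path):
    """Call exactly one code path; errors propagate, there is no fallback."""
    chosen = resolve_engine(engine, use_numba)
    path = numba_path if chosen == "numba" else cython_path
    return path()


# With the global flag on, engine='cython' still resolves to numba.
chosen_with_flag = resolve_engine("cython", use_numba=True)
chosen_without_flag = resolve_engine("cython", use_numba=False)
```

Under this rule a failure on the numba path (for example, a UDF with the wrong signature) surfaces directly to the caller rather than being retried with cython.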

jorisvandenbossche · 2020-06-01T07:03:23Z

> groupby.transform(..., engine='cython') & compute.use_numba = True would use the numba engine

If the user explicitly specifies engine="cython", can't that override the global config? Why would that be complex? It isn't a fallback.

mroeschke · 2020-06-01T16:41:38Z

engine='cython' is already the default keyword argument for the operations that have a numba option available.
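The difficulty here is that a default of 'cython' makes an explicit engine='cython' indistinguishable from an omitted argument. One conventional way around that, shown purely as an assumption and not as something decided in this thread, is a None default that defers to the global flag. The function name below is hypothetical.

```python
def resolve_engine_with_sentinel(engine=None, use_numba=False):
    """Resolve the engine when None means 'not specified by the caller'."""
    if engine is None:
        # Nothing was passed explicitly, so the global flag decides.
        return "numba" if use_numba else "cython"
    if engine not in ("cython", "numba"):
        raise ValueError(f"unknown engine: {engine!r}")
    # An explicit choice always overrides the global flag.
    return engine


explicit_cython = resolve_engine_with_sentinel("cython", use_numba=True)
unspecified = resolve_engine_with_sentinel(use_numba=True)
```

With this shape, an explicit engine="cython" overrides the global config, as asked above, while callers who pass nothing pick up the global setting.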

jtelleriar added the Enhancement and Needs Triage labels May 4, 2020

TomAugspurger added the API Design label and removed the Needs Triage label May 4, 2020

DiegoAlbertoTorres mentioned this issue May 28, 2020: TRACKER: milestones twosigma/pandas#44 (Open, 32 tasks)

mroeschke added the numba label Jul 8, 2020

mroeschke mentioned this issue Jul 8, 2020: ENH: Add compute.use_numba configuration for automatically using numba #35182 (Merged, 5 tasks)

jreback added this to the 1.1 milestone Jul 13, 2020

jreback closed this as completed in #35182 Jul 15, 2020
