API/DISC: Mock pandas DataFrame/Series/Index in numba agg/apply/transform #53933

## Overview

I was recently working on numba support for apply, and found it hard to support apply
the same way as agg and transform (which take in ``values`` and ``index`` and return the new ``values``),
because apply is far more flexible than agg or transform:
you can change the index as well as the values, and also change the shape of the values by adding/deleting columns.

I ended up writing a ton of code to handle this (basically reimplementing the concat logic that groupby apply uses), so I decided to look at wrapping the DataFrame object using the numba extension API instead.
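
To make the difference concrete, here is a rough pure-Python sketch of the two contracts. It does not use numba or pandas: plain lists stand in for arrays, and every function name is hypothetical. The agg-style kernel returns one value per group, while the apply-style function can return its own labels and shape, which is what forces the caller to stitch results together.

```python
# Hypothetical illustration only: plain lists stand in for NumPy arrays,
# and nothing here is pandas' or numba's real implementation.

def agg_kernel(values, index):
    """agg-style contract: one group's values and index in, one scalar out."""
    return sum(values) / len(values)

def apply_func(values, index):
    """apply-style contract: may change both the labels and the shape."""
    # Return a new (index, values) pair with labels chosen by the function.
    return ["min", "max"], [min(values), max(values)]

groups = {
    "a": ([1, 2, 3], [0, 1, 2]),  # group key -> (values, index)
    "b": ([10, 20], [3, 4]),
}

# agg: the result index is always the group keys, so assembly is trivial.
agg_result = {key: agg_kernel(vals, idx) for key, (vals, idx) in groups.items()}

# apply: each group returns its own index, so the caller has to combine
# the per-group pieces itself (the concat logic mentioned above).
apply_index, apply_values = [], []
for key, (vals, idx) in groups.items():
    new_index, new_values = apply_func(vals, idx)
    apply_index.extend((key, label) for label in new_index)
    apply_values.extend(new_values)

print(agg_result)
print(list(zip(apply_index, apply_values)))
```

With a wrapped DataFrame object, the combining step could in principle be left to the existing pandas code path, since the function would hand back a regular DF/Series.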

## Pros
- Same API between the numba and regular engines: you can reuse functions, since you would use the DF object for both
   - Small nit: We wouldn't support stuff that numba doesn't support in nopython mode, like non-string names
   - It's also more convenient as a user to not have to keep track of the index/columns/values of the DataFrame separately.
- Reuse internal logic (what goes in and out of the numba function is a DF/Series, because we do the boxing/unboxing)
   - Should be able to de-dup a lot, and save a lot of misery in keeping the numba/regular paths consistent

## Cons
- Annoying to develop
  - Hard to wrap classes
     - You are basically writing C code in Python via numba that tells numba how to box/unbox the Python DF to a C representation
  - Adding new methods is pretty easy though; those are just like regular numba functions
  - Easy to make mistakes
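
For readers unfamiliar with boxing/unboxing, the idea can be sketched conceptually in plain Python. This is not numba's extension API (there, the unbox/box steps are registered through `numba.extending` and emit low-level code); `MockFrame`, `unbox`, `box`, and `double_kernel` are hypothetical names, and the int-only values mirror the int64 restriction of the prototype.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class MockFrame:
    """Hypothetical stand-in for a DataFrame: labels plus a 2D block of ints."""
    index: List[int]
    columns: List[str]
    values: List[List[int]]  # row-major, one inner list per row

def unbox(frame):
    """Flatten the object into plain fields, roughly what an unboxer hands to native code."""
    flat = [v for row in frame.values for v in row]
    return list(frame.index), list(frame.columns), flat

def box(index, columns, flat):
    """Rebuild the Python-level object from the flat representation."""
    ncols = len(columns)
    rows = [flat[i * ncols:(i + 1) * ncols] for i in range(len(index))]
    return MockFrame(index, columns, rows)

def double_kernel(index, columns, flat):
    """Stands in for the compiled function: it only ever sees the flat fields."""
    return index, columns, [2 * v for v in flat]

frame = MockFrame(index=[0, 1], columns=["x", "y"], values=[[1, 2], [3, 4]])
result = box(*double_kernel(*unbox(frame)))
print(result)
```

The user-facing function receives and returns a frame-like object, while the conversion on either side is where most of the development cost and the room for mistakes would sit.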

We should decide whether we want to go down this road, and if so, how much of the pandas API to implement.
(e.g. what methods should exist, and what to wrap)


Some sample code (only works with int64 arrays ATM, which is hardcoded) wrapping a subset of DataFrame attributes/methods.

cc @mroeschke @jbrockmendel @rhshadrach 
