ufunc limited to 2 inputs #10512

Closed
2 tasks done
deanm0000 opened this issue Aug 15, 2023 · 2 comments · Fixed by #14328
Labels
bug Something isn't working python Related to Python Polars

Comments

@deanm0000
Collaborator

Checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

## Setup
import numba
import polars as pl
df=pl.DataFrame(
    [
        pl.Series("a", [1.1, 2.3, 3.3], dtype=pl.Float64),
        pl.Series("b", [8.1, 7.3, 6.64], dtype=pl.Float64),
        pl.Series("c", [8.4, 7.34, 6.4], dtype=pl.Float64),
    ]
)

@numba.guvectorize([(numba.float64[:], numba.float64[:],  numba.float64[:])], '(n),(n)->(n)', nopython=True)
def g(x,y,  res):
    for i in range(x.shape[0]):
        res[i]=x[i]+y[i]

# Demonstrate that 2 inputs work
df.with_columns(zz=g(pl.col('a'), pl.col('b')))

Now try three inputs:

@numba.guvectorize([(numba.float64[:], numba.float64[:], numba.float64[:],  numba.float64[:])], '(n),(n),(n)->(n)', nopython=True)
def g(x,y,z, res):
    for i in range(x.shape[0]):
        res[i]=x[i]+y[i]+z[i]

## Try for 3 inputs
df.with_columns(zz=g(pl.col('a'), pl.col('b'), pl.col('c')))
## Fails with ComputeError
## ComputeError: custom python function failed: g() takes from 3 to 4 positional arguments but 2 were given

Issue description

The docs for ufunc support don't mention any limit of 2 inputs. I'm not sure whether this is a bug or just an undocumented but known limitation.

Expected behavior

I know I can do this, which works well enough.

df.with_columns(zz=pl.struct('a','b','c').map(lambda x: g(x.struct['a'],x.struct['b'], x.struct['c'])))

Of course, I'd like it if the direct call just did the same thing as going through a struct/map/lambda, but if there's an underlying reason it can't, then the ufunc docs should probably be updated.

Installed versions

--------Version info---------
Polars:              0.18.15
Index type:          UInt32
Platform:            Linux-5.10.102.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
Python:              3.10.6 (main, Mar 10 2023, 10:55:28) [GCC 11.3.0]

----Optional dependencies----
adbc_driver_sqlite:  <not installed>
cloudpickle:         2.2.1
connectorx:          0.3.1
deltalake:           <not installed>
fsspec:              2023.5.0
matplotlib:          3.7.1
numpy:               1.24.3
pandas:              2.0.1
pyarrow:             12.0.0
pydantic:            <not installed>
sqlalchemy:          2.0.13
xlsx2csv:            0.8.1
xlsxwriter:          <not installed>
@deanm0000 deanm0000 added bug Something isn't working python Related to Python Polars labels Aug 15, 2023
@zundertj
Collaborator

Indeed, it only works for 2 expression inputs, because it ends up calling a reduce function, i.e. it works as an accumulator. With more than 2 expressions, it takes the first two and tries to compute a result (which fails if the function expects three inputs), then takes that result and the 3rd item, reduces again, and so on.
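
To make the accumulator behaviour concrete, here is a minimal pure-Python sketch. It uses `functools.reduce` and plain lists as stand-ins for the Rust reduction and the Polars columns, so it is an analogy under those assumptions rather than the actual Polars code path. The names `add2` and `add3` are hypothetical.

```python
from functools import reduce

# Plain lists stand in for the columns a, b and c (values are illustrative).
a = [1.0, 2.0, 3.0]
b = [8.0, 7.0, 6.0]
c = [0.5, 0.5, 0.5]


def add2(x, y):
    """Two-input function: fits a pairwise reduction."""
    return [xi + yi for xi, yi in zip(x, y)]


def add3(x, y, z):
    """Three-input function: cannot be called with only two arguments."""
    return [xi + yi + zi for xi, yi, zi in zip(x, y, z)]


# A reduction calls f(a, b), then f(result, c): always exactly two arguments.
pairwise_sum = reduce(add2, [a, b, c])

# The same reduction with a three-input function fails on the very first step,
# because it is handed two arguments instead of three.
try:
    reduce(add3, [a, b, c])
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(pairwise_sum)
print(error_message)
```

The two-input version only yields the right total here because addition can be chained pairwise; a function that genuinely needs all three inputs at once has no equivalent pairwise form.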

If any Rust expert is willing to pick this up: the code ends up calling reduce_exprs in horizontal.rs. It doesn't seem too hard to create a similar function that calls the user function with all arguments at once.
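
As a rough illustration of the "all arguments at once" idea, the following pure-Python sketch passes every input to the function in a single call instead of folding over pairs. The helper `apply_all_at_once` is hypothetical and is not the Polars implementation, which would live on the Rust side; plain lists stand in for columns.

```python
def apply_all_at_once(func, columns):
    """Call func a single time with every column as a positional argument."""
    return func(*columns)


def add3(x, y, z):
    """Three-input function, analogous to the three-input g in the report."""
    return [xi + yi + zi for xi, yi, zi in zip(x, y, z)]


# Illustrative values standing in for columns a, b and c.
a = [1.0, 2.0, 3.0]
b = [8.0, 7.0, 6.0]
c = [0.5, 0.5, 0.5]

result = apply_all_at_once(add3, [a, b, c])
print(result)
```

Unlike a reduction, this places no constraint on the number of inputs: the function's own signature decides how many columns it accepts.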

@deanm0000
Collaborator Author

This is just how reduce works; the workaround is to build a struct and use map_batches.
