pandas has an extension API that lets us extend pandas very flexibly. As an example, I developed pyjanitor, which uses pandas-flavor under the hood to register functions as if they were native to a pandas dataframe. I'm hoping to see the pandas extension API implemented in cuDF, so that other custom functions can be attached to cuDF dataframes.
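To make the request concrete, here is a minimal sketch of the kind of registration I mean. It uses pandas' built-in `pandas.api.extensions.register_dataframe_accessor` rather than pandas-flavor itself, and the accessor name `cleaning` and the method `snake_case_columns` are hypothetical names chosen only for illustration.

```python
import pandas as pd


# Register a custom namespace on every pandas DataFrame.
# "cleaning" is a made-up accessor name for this example.
@pd.api.extensions.register_dataframe_accessor("cleaning")
class CleaningAccessor:
    def __init__(self, pandas_obj):
        # pandas passes in the DataFrame the accessor is attached to.
        self._obj = pandas_obj

    def snake_case_columns(self):
        """Return a copy of the frame with snake_case column names."""
        renamed = {
            col: str(col).strip().lower().replace(" ", "_")
            for col in self._obj.columns
        }
        return self._obj.rename(columns=renamed)


df = pd.DataFrame({"First Name": ["Ada"], "Last Name": ["Lovelace"]})
cleaned = df.cleaning.snake_case_columns()
print(list(cleaned.columns))  # ['first_name', 'last_name']
```

The custom method is reached through `df.cleaning`, so it reads almost like a native dataframe method, and the original frame is left unchanged.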
Having met with some of the Boston NVIDIA folks (Raghav, Rory, and Jennifer visiting from NYC), I wanted to put this up on the cuDF issue tracker for the record. No rush on this, totally understand that there are other priorities.
Half of this request was implemented in #6302, which resolved #6216, a request for a subset of this issue: accessors. We now support accessors. Due to the differences between CPU and GPU execution, I don't know if there's any way that we'll ever be able to support ExtensionArray in the generic way that pandas does. Even if we could, it would likely be painfully slow. I'm inclined to close this in favor of focusing on more specific requests, such as our somewhat recent additions of first-class list/struct dtypes that can be represented and operated on efficiently on the GPU.
@shwina what do you think? @ericmjl did the addition of accessors cover the primary use case that you were hoping to address? (Also hi Eric!)
I'm going to close this since I think that we have realistically accomplished what we can here. Unlike pandas, there is no way for us to implement all our operations in terms of a few primitives the way that pandas ExtensionArray does. If we find more specific dtypes of use we can always add direct support for them, but unfortunately I don't see the ExtensionArray concept as being feasible to implement in cuDF.