Currently the contents of `cudautils.py` can be grouped into three sets of functionality:
- The first set of 7 functions all support `find_first` and `find_last`, which in turn are used by the `find_first_value` and `find_last_value` methods of Column objects, respectively.
- The second set of 4 functions, which deal with window sizes, is used by the rolling calculations in `rolling.py`.
- The third set of 2 functions is used for UDF compilation.
The functions in group 3 are necessary but should probably be moved to `core/udf/utils.py`. The remaining functions should be possible to remove altogether, but doing so may require adding some functionality to libcudf. For instance, `find_first_value` may be possible to implement using libcudf's `lower_bound` followed by an equality check, and similarly `find_last_value` could be replaced with `upper_bound`, accounting for all of group 1. For the group 2 functions I am not certain whether the necessary functionality exists in libcudf's rolling aggregations, so we may need to do a little more engineering there first.
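To make the group 1 idea concrete, here is a minimal host-side sketch of "bound search followed by an equality check". It uses Python's standard `bisect` module as a stand-in for libcudf's `lower_bound` and `upper_bound`; it does not call libcudf. The helper names `find_first_value_sorted` and `find_last_value_sorted` are hypothetical, the input is assumed to be sorted in ascending order, and returning `-1` for a missing value is a choice made for this sketch only, not a statement about the existing Column methods.

```python
from bisect import bisect_left, bisect_right


def find_first_value_sorted(values, target):
    """Index of the first occurrence of target in ascending-sorted values, else -1."""
    # lower_bound analogue: first position whose element is >= target.
    idx = bisect_left(values, target)
    # Equality check: the bound only says where target would be inserted,
    # not that it is actually present.
    if idx < len(values) and values[idx] == target:
        return idx
    return -1  # assumption for this sketch: -1 signals "not found"


def find_last_value_sorted(values, target):
    """Index of the last occurrence of target in ascending-sorted values, else -1."""
    # upper_bound analogue: first position whose element is > target,
    # so the candidate for the last match sits one slot before it.
    idx = bisect_right(values, target) - 1
    if idx >= 0 and values[idx] == target:
        return idx
    return -1  # assumption for this sketch: -1 signals "not found"


data = [1, 3, 3, 3, 7, 9]
print(find_first_value_sorted(data, 3))  # 1
print(find_last_value_sorted(data, 3))   # 3
print(find_first_value_sorted(data, 4))  # -1
```

The sketch only covers sorted input, so whether it can stand in for every current use of `find_first_value` and `find_last_value` depends on whether callers can guarantee sortedness.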
Status
In Progress