[FEA] Check dtype requirements on multiindex codes #14472

Open
wence- opened this issue Nov 22, 2023 · 0 comments
Labels
feature request (New feature or request), no-oom (Reducing memory footprint of cudf algorithms)

Contributor

wence- commented Nov 22, 2023

Is your feature request related to a problem? Please describe.

(Seen as part of a review of #14470).

Multiindex codes and levels are effectively a categorical encoding of the columns of the multiindex entries: the codes index into the levels. As such, the codes should probably have a dtype equivalent to cudf::size_type (a 32-bit signed integer). Currently, however, they are stored as int64. This is a larger memory footprint than necessary. Moreover, in some constructor code paths it forces more copies than necessary.
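To make the encoding and the footprint difference concrete, here is a minimal pure-Python sketch. It does not use cudf; the `factorize` helper and the sample column are hypothetical, and the standard-library `array` module stands in for device columns so that the per-element sizes are visible.

```python
from array import array


def factorize(values):
    """Hypothetical helper: return (codes, levels) for one index column.

    levels holds the sorted unique values; codes holds, for every entry,
    the position of that entry's value in levels.
    """
    levels = sorted(set(values))
    position = {value: i for i, value in enumerate(levels)}
    return [position[value] for value in values], levels


# One column of a hypothetical multiindex.
first = ["a", "a", "b", "b", "c"]
codes, levels = factorize(first)

# The codes index the levels, so decoding recovers the original column.
decoded = [levels[c] for c in codes]

# The same codes stored as 64-bit and as narrower signed integers.
codes_i64 = array("q", codes)  # 'q': signed 64-bit integer
codes_i32 = array("i", codes)  # 'i': signed int, 32 bits on common platforms

bytes_i64 = codes_i64.itemsize * len(codes_i64)
bytes_i32 = codes_i32.itemsize * len(codes_i32)
print(levels, codes, bytes_i64, bytes_i32)
```

On platforms where a C `int` is 32 bits, the narrower array takes half the bytes of the 64-bit one for the same codes.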

Describe the solution you'd like

Use the correct dtype for the codes. Since the public codes and levels properties already wrap their results in pandas FrozenList objects to mimic the pandas API, it may be possible to store each codes/levels pair internally as a CategoricalColumn, rather than using the current structure.
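One way to picture the proposed layout is the rough sketch below. It is not cudf code: `CategoricalPair` and `SketchMultiIndex` are hypothetical names, plain tuples stand in for the FrozenList wrappers, and `array("i", ...)` stands in for a 32-bit codes column. The point is only that the public `codes` and `levels` properties can be derived from per-column categorical pairs held internally.

```python
from array import array
from dataclasses import dataclass


@dataclass(frozen=True)
class CategoricalPair:
    """Hypothetical stand-in for a categorical column: codes index levels."""

    codes: array   # narrow integer codes, one per index entry
    levels: tuple  # sorted unique values of the column

    @classmethod
    def from_values(cls, values):
        levels = tuple(sorted(set(values)))
        lookup = {value: i for i, value in enumerate(levels)}
        return cls(array("i", (lookup[v] for v in values)), levels)


class SketchMultiIndex:
    """Hypothetical container holding one CategoricalPair per index column."""

    def __init__(self, columns):
        self._pairs = [CategoricalPair.from_values(col) for col in columns]

    @property
    def codes(self):
        # A tuple of lists stands in for the FrozenList wrapper.
        return tuple(list(pair.codes) for pair in self._pairs)

    @property
    def levels(self):
        return tuple(pair.levels for pair in self._pairs)


mi = SketchMultiIndex([["a", "a", "b"], [2, 1, 2]])
print(mi.codes)
print(mi.levels)
```

Because the wrapping happens only in the properties, the internal storage could change without affecting what callers of `codes` and `levels` see.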

Describe alternatives you've considered

n/a

Additional context

n/a

wence- added the feature request, Needs Triage, and no-oom labels and removed the Needs Triage label on Nov 22, 2023
Projects
Status: In Progress