[FEA] Handle setting numpy array to GDF #8674

Closed
esnvidia opened this issue Jul 7, 2021 · 5 comments
Labels: feature request (New feature or request)

Comments

esnvidia commented Jul 7, 2021

Is your feature request related to a problem? Please describe.
Either the error message should be clearer, or cuDF should handle assigning values from a NumPy array to a GDF (GPU DataFrame).


Describe the solution you'd like
Allow the following assignment to work, where sk_std_scaler is a scikit-learn StandardScaler:
sk_gdf.loc[:, numer_cols] = sk_std_scaler.fit_transform(sk_gdf[numer_cols].to_pandas())

or a more helpful error message:
Cannot assign numpy array to Device DataFrame. Please convert to cupy.array object first with cupy.array(numpy_array)
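
To illustrate the second option, here is a minimal sketch of what such a check might look like. It is an assumption for discussion only: `check_assignable` is a hypothetical helper, not part of cuDF, and it recognizes NumPy values by the module name of their type so the snippet runs without NumPy or a GPU installed.

```python
def check_assignable(value):
    """Return `value` unchanged, or raise a descriptive TypeError for NumPy input.

    Hypothetical helper (not a cuDF API): a library's setitem path could run a
    check like this before attempting the assignment.
    """
    # type(np.ndarray(...)).__module__ is "numpy"; checking the module name
    # avoids importing NumPy just to produce an error message.
    module = type(value).__module__ or ""
    if module == "numpy" or module.startswith("numpy."):
        raise TypeError(
            "Cannot assign numpy array to Device DataFrame. "
            "Please convert to cupy.array object first with "
            "cupy.array(numpy_array)"
        )
    return value
```

With a check like this, the failing assignment would surface the suggested message instead of a less specific error, while non-NumPy values pass through untouched.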

Describe alternatives you've considered
Convert to cupy array: sk_gdf.loc[:, numer_cols] = cp.array(sk_std_scaler.fit_transform(sk_gdf[numer_cols].to_pandas()))

Additional context
Untitled.zip
See also these issues for additional context:
#8672
rapidsai/cuml#4034

esnvidia added the "Needs Triage" (Need team to review and classify) and "feature request" (New feature or request) labels Jul 7, 2021

beckernick (Member) commented

@esnvidia would you be able to provide a minimal reproducer in Python code (outside of a Jupyter notebook)? Thanks!

esnvidia (Author) commented

@beckernick I cleaned up the code into a minimal .py file (attached as example.zip), which I ran with a conda install of RAPIDS 21.06.

beckernick (Member) commented Jul 19, 2021

@esnvidia is this the same issue as #8672?

esnvidia (Author) commented Jul 24, 2021

@beckernick Related, but not the same. This issue is about assigning a NumPy matrix to a set of indices/columns, which is why it is a feature request. In #8672 the output is a CuPy array, IIRC, and the problem there was that assigning the CuPy array to the DataFrame behaved incorrectly.

beckernick (Member) commented

It looks like the Scaler issue was fixed in cuML based on rapidsai/cuml#4034, and the setitem issue is covered by #8672.

As a result, I'm going to close this issue. Please feel free to reopen it if I have misunderstood the request here.

bdice removed the "Needs Triage" (Need team to review and classify) label Mar 4, 2024