REF: Remove instances of pd.core #14421
LGTM. Let's target new PRs to 24.02 since we're in burndown, unless you think this is essential for 23.12.
Ah OK. Not essential for 23.12, so I will retarget for 24.02.
python/cudf/cudf/core/dataframe.py
Outdated
+        result = self.to_pandas().to_dict(orient=orient, into=into)
         if orient == "series":
-            # Special case needed to avoid converting
-            # cudf.Series objects into pd.Series
-            into_c = pd.core.common.standardize_mapping(into)
-            return into_c((k, v) for k, v in self.items())
-
-        return self.to_pandas().to_dict(orient=orient, into=into)
+            # Ensure values are cudf.Series
+            converted = ((k, Series(v)) for k, v in result.items())
+            if isinstance(into, defaultdict):
+                return type(result)(into.default_factory, converted)
+            return type(result)(converted)
+        return result
issue: This change means that the "special-case" orient="series", which previously did not induce a copy, now produces both a DtoH and an HtoD copy.
We should just handle the case correctly:
    if orient == "series":
        if not inspect.isclass(into):
            cons = type(into)
            if isinstance(into, defaultdict):
                cons = partial(cons, into.default_factory)
        elif issubclass(into, Mapping):
            cons = into
            if issubclass(into, defaultdict):
                raise TypeError("Must provide initialised defaultdict")
        else:
            raise TypeError(...)
        return cons(self.items())
Aside: this is a mad interface. One should just be on the hook for providing something that has the same __init__ signature as dict (i.e. *args, **kwargs -> Mapping).
(I note the implementation in pandas has a bug, because it doesn't preserve the type of the output if the input is a subclass of defaultdict.)
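The branching proposed in the review comment above can be tried out without cudf. The following is a minimal standalone sketch, not the code that was merged: the helper name `resolve_mapping_constructor` is hypothetical, the extra `isinstance(into, Mapping)` check on instances is an assumption added for safety, and a plain list of pairs stands in for `self.items()`.

```python
import inspect
from collections import OrderedDict, defaultdict
from collections.abc import Mapping
from functools import partial


def resolve_mapping_constructor(into):
    """Return a callable that builds a mapping of the same kind as `into`.

    Hypothetical helper; mirrors the branching in the review comment.
    """
    if not inspect.isclass(into):
        # `into` is an instance, so reuse its concrete type.
        # (Assumption: reject instances that are not mappings.)
        if not isinstance(into, Mapping):
            raise TypeError(f"unsupported type: {type(into)}")
        cons = type(into)
        if isinstance(into, defaultdict):
            # defaultdict takes its default_factory as the first argument.
            cons = partial(cons, into.default_factory)
        return cons
    if issubclass(into, Mapping):
        if issubclass(into, defaultdict):
            # A bare defaultdict class carries no default_factory.
            raise TypeError("Must provide initialised defaultdict")
        return into
    raise TypeError(f"unsupported type: {into}")


# Stand-in for DataFrame.items(): (column name, column) pairs.
items = [("a", 1), ("b", 2)]
as_dict = resolve_mapping_constructor(dict)(items)
as_ordered = resolve_mapping_constructor(OrderedDict())(items)
as_default = resolve_mapping_constructor(defaultdict(list))(items)
```

Because the constructor is derived from `type(into)` for instances, a subclass of defaultdict keeps its type in the result, which is the behaviour the reviewer notes pandas does not preserve.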
"now produces both a DtoH and HtoD copy."
Oof, thanks for the catch! Any tips on generally knowing when a DtoH or HtoD occurs?
A DtoH copy occurs any time you call to_pandas or to_arrow or .values_host or similar to get a host-side version of the object. In the other direction, an HtoD copy occurs any time you have a host-side object and turn it into a device-side one.
/merge
Description
pandas.core is technically private, and its methods could be moved at any time. This PR avoids using it in the places in the codebase where it can be avoided.