Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLN: avoid values_from_object in Series #32426

Merged
merged 10 commits into from
Mar 11, 2020

Conversation

jbrockmendel
Copy link
Member

xref #32422, #32419

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, some comments

@@ -7803,11 +7804,11 @@ def _reduce(
self, op, name, axis=0, skipna=True, numeric_only=None, filter_type=None, **kwds
):

dtype_is_dt = self.dtypes.apply(lambda x: x.kind == "M")
dtype_is_dt = self.dtypes.apply(lambda x: x.kind == "M" or is_period_dtype(x))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you apply needs_i8_conversion instead here? (like you do below), this is a non-standard usage

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trouble is we don't want to include td64

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok how about using is_datetime64_any_dtype then; this is a non-obvious pattern

@jreback jreback added the Clean label Mar 8, 2020
@jreback jreback added this to the 1.1 milestone Mar 8, 2020
@@ -7803,11 +7804,11 @@ def _reduce(
self, op, name, axis=0, skipna=True, numeric_only=None, filter_type=None, **kwds
):

dtype_is_dt = self.dtypes.apply(lambda x: x.kind == "M")
dtype_is_dt = self.dtypes.apply(lambda x: x.kind == "M" or is_period_dtype(x))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok how about using is_datetime64_any_dtype then; this is a non-obvious pattern

pandas/core/frame.py Show resolved Hide resolved
# changing timedelta64/datetime64 to int64 needs to happen after
# finding `mask` above
values = getattr(values, "asi8", values)
values = values.view(np.int64)
if isinstance(values, np.ndarray):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this case still here? when is this actually an ndarray? (or conversely, when is this a DTI/TDI). it is non-obvious how we get to this point.

pandas/core/nanops.py Show resolved Hide resolved
@@ -875,11 +875,6 @@ def test_mean_datetimelike(self):
expected = pd.Series({"A": 1.0, "C": df.loc[1, "C"]})
tm.assert_series_equal(result, expected)

@pytest.mark.xfail(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you should technically have a whatsnew note as this 'bug' is fixed (do in followon)

@jreback jreback merged commit 8c38283 into pandas-dev:master Mar 11, 2020
@jreback
Copy link
Contributor

jreback commented Mar 11, 2020

thanks

@jbrockmendel jbrockmendel deleted the series-values_from_object branch March 11, 2020 02:51
@@ -2055,7 +2055,7 @@ def idxmax(self, axis=0, skipna=True, *args, **kwargs):
nan
"""
skipna = nv.validate_argmax_with_skipna(skipna, args, kwargs)
i = nanops.nanargmax(com.values_from_object(self), skipna=skipna)
i = nanops.nanargmax(self._values, skipna=skipna)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nanargmax/nanargmin expect to get an ndarray. Due to this change, it is no longer guaranteed to be an ndarray. Reported this as #32749

So those lines should either be reverted, or another "convert to ndarray" function should be used (or nanargmax/nanargmin could be rewritten to support EAs, but personally I think it is much cleaner to keep those algos based on numpy arrays)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jbrockmendel can you respond to this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(or nanargmax/nanargmin could be rewritten to support EAs, but personally I think it is much cleaner to keep those algos based on numpy arrays)

I'd be fine with either of these options. Probably prefer both actually: a EA-supporting public method and an ndarray-only private method for each of the relevant nanops funcs.

SeeminSyed pushed a commit to CSCD01-team01/pandas that referenced this pull request Mar 22, 2020
@jorisvandenbossche
Copy link
Member

@jbrockmendel another problem with this PR is that you enabled "mean" for Period dtype (but only for DataFrames), while we had long discussions before (when initially adding mean support for datetimelikes) that ended in not supporting mean for period (-> #24757)

@jorisvandenbossche
Copy link
Member

Opened a PR for that in the mean time: #33758

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants