Implement `numeric_only` for `skew` and `kurtosis` #10258

phofl · 2023-05-04T09:56:12Z

Closes #xxxx
Tests added / passed
Passes pre-commit run --all-files

Not sure what the policy is, but this changes behaviour for pandas 2.0. With this PR,

from datetime import datetime

import dask.dataframe as dd
import pandas as pd

df = pd.DataFrame(
    {
        "int": [1, 2, 3, 4, 5, 6, 7, 8],
        "dt": [pd.NaT] + [datetime(2010, i, 1) for i in range(1, 8)],
    }
)
ddf = dd.from_pandas(df, npartitions=2).skew()

raises like pandas does, but without this PR it drops the dt column. I guess we would prefer not to raise here without deprecating first, since we did not show any deprecation warnings?

Nothing changes with pandas < 2.0.
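
For reference, here is a pandas-only sketch of the behaviour dask is matching. It reuses the frame from above and passes `numeric_only` explicitly so the result does not depend on the default; the variable names `numeric_result` and `raised` are just for illustration. It assumes pandas 2.0 or later is installed.

```python
from datetime import datetime

import pandas as pd

df = pd.DataFrame(
    {
        "int": [1, 2, 3, 4, 5, 6, 7, 8],
        "dt": [pd.NaT] + [datetime(2010, i, 1) for i in range(1, 8)],
    }
)

# numeric_only=True keeps only the numeric columns, so "dt" is dropped.
numeric_result = df.skew(numeric_only=True)
print(list(numeric_result.index))

# numeric_only=False tries to reduce every column. skew is not defined
# for datetime64 data, so pandas raises a TypeError.
try:
    df.skew(numeric_only=False)
    raised = False
except TypeError as exc:
    raised = True
    print(f"TypeError: {exc}")
```

The same pattern should apply to `kurtosis`, since both reductions are covered by this PR.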

j-bennet

Nice. Make sure to test upstream as well.

jrbourbeau

Thanks @phofl. Just one small comment

jrbourbeau · 2023-05-05T20:22:34Z

dask/dataframe/tests/test_arithmetics_reduction.py

@@ -1426,6 +1426,36 @@ def test_reductions_frame_dtypes_numeric_only(func):
    )


+@pytest.mark.parametrize("func", ["skew", "kurtosis"])
+def test_skew_kurt_numeric_only_false(func):


Can we add an assert_eq to make sure the result from dask matches pandas?

Also, is numeric_only=True tested elsewhere?

phofl · 2023-05-08

Both of them are tested elsewhere, which is why I added only the False case here.

j-bennet · 2023-05-08T20:00:15Z

I guess we would prefer not to raise here without deprecating first, since we did not show any deprecation warnings?

I would rather raise. We should have added a deprecation earlier, that's true, but at least right now we have the chance to catch up with pandas' behavior. Supporting diverging behaviors and following one step behind feels like extra work for not much benefit. Just my 2c.

phofl · 2023-05-16T16:56:58Z

That's what we are doing, so all good

Implement numeric_only for skew and kurtosis

fef9588

github-actions bot added the dataframe label May 4, 2023

Fix

b2aca73

j-bennet added the upstream label May 5, 2023

phofl closed this May 5, 2023

phofl reopened this May 5, 2023

j-bennet approved these changes May 5, 2023

jrbourbeau reviewed May 5, 2023

jrbourbeau approved these changes May 17, 2023

jrbourbeau merged commit bcee469 into dask:main May 17, 2023
53 checks passed

phofl deleted the numeric_only_skew branch May 17, 2023 07:52
