BUG: Change numeric_only default to True #46096

Merged
merged 23 commits into from Mar 18, 2022

Conversation

NumberPiOso
Contributor

@NumberPiOso NumberPiOso commented Feb 21, 2022

I know I have to correct the other tests that fail due to this change. But I want confirmation that this PR is headed in the right direction before doing so.

Contributor

@jreback jreback left a comment

this is not going to be possible to change before 2.0

pandas/core/frame.py Outdated Show resolved Hide resolved
@NumberPiOso NumberPiOso marked this pull request as draft February 23, 2022 14:17
@NumberPiOso
Contributor Author

NumberPiOso commented Feb 23, 2022

Hello @phofl, do you think this would be the way to go?

I am still having some errors in the tests, when I run

pytest -v pandas/tests/frame/methods/test_quantile.py
...
======== 73 passed, 8 xfailed, 6 warnings in 2.46s =========

I still get all the warnings produced by my modification, even though I added the following decorator to those tests:
@pytest.mark.filterwarnings("ignore:In future versions of pandas, numeric_only")
Full output below:

=================== test session starts ====================
platform linux -- Python 3.8.12, pytest-6.2.5, py-1.11.0, pluggy-1.0.0 -- /home/pi/anaconda3/envs/pandas-dev/bin/python
cachedir: .pytest_cache
hypothesis profile 'ci' -> deadline=None, suppress_health_check=[HealthCheck.too_slow], database=DirectoryBasedExampleDatabase('/home/pi/git/mypandas/.hypothesis/examples')
rootdir: /home/pi/git/mypandas, configfile: pyproject.toml
plugins: asyncio-0.17.2, cython-0.1.1.post0, cov-3.0.0, hypothesis-6.36.0, instafail-0.4.1, forked-1.4.0, xdist-2.5.0
asyncio: mode=legacy
collected 81 items

pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_numeric_only_default_false_warning PASSED
pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_quantile_sparse[df0-expected0] PASSED
pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_quantile_sparse[df1-expected1] PASSED
pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_quantile PASSED
pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_quantile_date_range PASSED
pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_quantile_axis_mixed PASSED
pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_quantile_axis_parameter PASSED
pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_quantile_interpolation PASSED
pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_quantile_interpolation_datetime PASSED
pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_quantile_interpolation_int PASSED
pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_quantile_multi PASSED
pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_quantile_datetime PASSED
pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_quantile_dt64_empty[datetime64[ns]] PASSED
pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_quantile_dt64_empty[datetime64[ns, US/Pacific]] PASSED
pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_quantile_dt64_empty[timedelta64[ns]] PASSED
pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_quantile_dt64_empty[Period[D]] PASSED
pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_quantile_invalid PASSED
pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_quantile_box PASSED
pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_quantile_nan PASSED
pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_quantile_nat PASSED
pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_quantile_empty_no_rows_floats PASSED
pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_quantile_empty_no_rows_ints PASSED
pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_quantile_empty_no_rows_dt64 PASSED
pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_quantile_empty_no_columns PASSED
pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_quantile_item_cache PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea[interval[int64, right]-DataFrame] XFAIL
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea[interval[int64, right]-Series] XFAIL
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea[period[D]-DataFrame] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea[period[D]-Series] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea[datetime64[ns, US/Pacific]-DataFrame] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea[datetime64[ns, US/Pacific]-Series] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea[timedelta64[ns]-DataFrame] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea[timedelta64[ns]-Series] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea[Int64-DataFrame] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea[Int64-Series] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea[Float64-DataFrame] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea[Float64-Series] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_with_na[interval[int64, right]-DataFrame] XFAIL
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_with_na[interval[int64, right]-Series] XFAIL
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_with_na[period[D]-DataFrame] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_with_na[period[D]-Series] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_with_na[datetime64[ns, US/Pacific]-DataFrame] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_with_na[datetime64[ns, US/Pacific]-Series] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_with_na[timedelta64[ns]-DataFrame] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_with_na[timedelta64[ns]-Series] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_with_na[Int64-DataFrame] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_with_na[Int64-Series] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_with_na[Float64-DataFrame] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_with_na[Float64-Series] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_all_na[interval[int64, right]-DataFrame] XFAIL
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_all_na[interval[int64, right]-Series] XFAIL
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_all_na[period[D]-DataFrame] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_all_na[period[D]-Series] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_all_na[datetime64[ns, US/Pacific]-DataFrame] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_all_na[datetime64[ns, US/Pacific]-Series] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_all_na[timedelta64[ns]-DataFrame] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_all_na[timedelta64[ns]-Series] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_all_na[Int64-DataFrame] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_all_na[Int64-Series] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_all_na[Float64-DataFrame] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_all_na[Float64-Series] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_scalar[interval[int64, right]-DataFrame] XFAIL
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_scalar[interval[int64, right]-Series] XFAIL
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_scalar[period[D]-DataFrame] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_scalar[period[D]-Series] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_scalar[datetime64[ns, US/Pacific]-DataFrame] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_scalar[datetime64[ns, US/Pacific]-Series] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_scalar[timedelta64[ns]-DataFrame] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_scalar[timedelta64[ns]-Series] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_scalar[Int64-DataFrame] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_scalar[Int64-Series] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_scalar[Float64-DataFrame] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_quantile_ea_scalar[Float64-Series] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_empty_numeric[float64-expected_data0-expected_index0-1] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_empty_numeric[int64-expected_data1-expected_index1-1] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_empty_numeric[float64-expected_data2-expected_index2-0] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_empty_numeric[int64-expected_data3-expected_index3-0] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_empty_datelike[datetime64[ns]-expected_data0-expected_index0-1-datetime64[ns]] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_empty_datelike[datetime64[ns]-expected_data1-expected_index1-0-datetime64[ns]] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_datelike_numeric_only[expected_data0-expected_index0-1] PASSED
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_datelike_numeric_only[expected_data1-expected_index1-0] PASSED

===================== warnings summary =====================
../../anaconda3/envs/pandas-dev/lib/python3.8/site-packages/pytest_asyncio/plugin.py:191
/home/pi/anaconda3/envs/pandas-dev/lib/python3.8/site-packages/pytest_asyncio/plugin.py:191: DeprecationWarning: The 'asyncio_mode' default value will change to 'strict' in future, please explicitly use 'asyncio_mode=strict' or 'asyncio_mode=auto' in pytest configuration file.
config.issue_config_time_warning(LEGACY_MODE, stacklevel=2)

pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_quantile_axis_mixed
/home/pi/git/mypandas/pandas/tests/frame/methods/test_quantile.py:124: FutureWarning: In future versions of pandas, numeric_only will be set to False by default, and the datetime/timedelta columns will be considered in the results. To not consider these columnsspecify numeric_only=True and ignore this warning.
result = df.quantile(0.5, axis=1)

pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_datelike_numeric_only[expected_data0-expected_index0-1]
pandas/tests/frame/methods/test_quantile.py::TestQuantileExtensionDtype::test_datelike_numeric_only[expected_data1-expected_index1-0]
/home/pi/git/mypandas/pandas/tests/frame/methods/test_quantile.py:786: FutureWarning: In future versions of pandas, numeric_only will be set to False by default, and the datetime/timedelta columns will be considered in the results. To not consider these columnsspecify numeric_only=True and ignore this warning.
result = df[["a", "c"]].quantile(0.5, axis=axis)

../../anaconda3/envs/pandas-dev/lib/python3.8/site-packages/_pytest/cacheprovider.py:428
/home/pi/anaconda3/envs/pandas-dev/lib/python3.8/site-packages/_pytest/cacheprovider.py:428: PytestCacheWarning: cache could not write path /home/pi/git/mypandas/.pytest_cache/v/cache/nodeids
config.cache.set("cache/nodeids", sorted(self.cached_nodeids))

../../anaconda3/envs/pandas-dev/lib/python3.8/site-packages/_pytest/stepwise.py:49
/home/pi/anaconda3/envs/pandas-dev/lib/python3.8/site-packages/_pytest/stepwise.py:49: PytestCacheWarning: cache could not write path /home/pi/git/mypandas/.pytest_cache/v/cache/stepwise
session.config.cache.set(STEPWISE_CACHE_DIR, [])

-- Docs: https://docs.pytest.org/en/stable/warnings.html

- generated xml file: /home/pi/git/mypandas/test-data.xml --
=================== slowest 30 durations ===================
0.07s call pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_numeric_only_default_false_warning
0.02s call pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_quantile_datetime
0.01s call pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_quantile_box
0.01s call pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_quantile
0.01s call pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_quantile_nan
0.01s call pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_quantile_dt64_empty[Period[D]]
0.01s call pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_quantile_empty_no_rows_dt64
0.01s call pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_quantile_dt64_empty[datetime64[ns]]
0.01s call pandas/tests/frame/methods/test_quantile.py::TestDataFrameQuantile::test_quantile_nat

(21 durations < 0.005s hidden. Use -vv to show these durations.)
======== 73 passed, 8 xfailed, 6 warnings in 2.46s =========
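For reference, here is a minimal, pandas-free sketch of how a message-based warning filter like the decorator above behaves. It uses the standard-library `warnings` module rather than pytest itself. The `message` argument is a regular expression matched against the start of the warning text, and `warning_prone_call` and `count_warnings` are hypothetical stand-ins, not part of pandas or this PR. Whether the decorator reached every test that warns in the run above is not something this sketch can confirm.

```python
import warnings

# Prefix of the FutureWarning text shown in the pytest summary above.
MESSAGE_PREFIX = "In future versions of pandas, numeric_only"


def warning_prone_call():
    # Hypothetical stand-in for a DataFrame.quantile call that warns.
    warnings.warn(
        MESSAGE_PREFIX + " will be set to False by default",
        FutureWarning,
        stacklevel=2,
    )
    return 0.5


def count_warnings(apply_filter):
    """Return how many warnings escape, with or without the ignore filter."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        if apply_filter:
            # The message regex must match the start of the warning text.
            warnings.filterwarnings(
                "ignore", message=MESSAGE_PREFIX, category=FutureWarning
            )
        warning_prone_call()
    return len(caught)
```

Only code that runs while the filter is installed is silenced, so a test without the decorator would still report the warning in the summary.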

@NumberPiOso NumberPiOso marked this pull request as ready for review February 23, 2022 21:23
pandas/core/frame.py Outdated Show resolved Hide resolved
@NumberPiOso NumberPiOso requested a review from phofl March 2, 2022 00:05
@jreback
Contributor

jreback commented Mar 2, 2022

also pls rebase

pandas/core/frame.py Outdated Show resolved Hide resolved
@jreback jreback added the Algos (Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff), Deprecate (Functionality to remove in pandas), and quantile (quantile method) labels Mar 2, 2022
@jreback jreback added this to the 1.5 milestone Mar 2, 2022
NumberPiOso added a commit to NumberPiOso/pandas that referenced this pull request Mar 2, 2022
@NumberPiOso NumberPiOso force-pushed the ench-quantile-numeric-only branch 2 times, most recently from 3cffa0f to 2a3ae84 Compare March 2, 2022 20:22
@NumberPiOso
Contributor Author

Should we modify EXAMPLES in documentation?

pandas/pandas/core/frame.py

Lines 10573 to 10580 in 0ccad38

>>> df.quantile(.1)
a    1.3
b    3.7
Name: 0.1, dtype: float64
>>> df.quantile([.1, .5])
       a     b
0.1  1.3   3.7
0.5  2.5  55.0

Contributor

@jreback jreback left a comment

yes pls update the docs to avoid showing this warning

doc/source/whatsnew/v1.5.0.rst Outdated Show resolved Hide resolved
pandas/tests/frame/methods/test_quantile.py Outdated Show resolved Hide resolved
@@ -10570,11 +10570,11 @@ def quantile(
 --------
 >>> df = pd.DataFrame(np.array([[1, 1], [2, 10], [3, 100], [4, 100]]),
 ... columns=['a', 'b'])
->>> df.quantile(.1)
+>>> df.quantile(.1, numeric_only=True)
Contributor

these shouldn't be necessary, only add the option where it actually matters

Contributor Author

Done
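As a rough illustration of where the option "actually matters", the sketch below contrasts an all-numeric frame (the one from the docstring) with a frame that also holds a datetime column. The frame `mixed` and its column name `when` are made up for this example. The option is passed explicitly for the mixed frame so the result does not depend on which default a given pandas version uses.

```python
import pandas as pd

# All-numeric frame from the docstring: no option is needed here.
all_numeric = pd.DataFrame(
    [[1, 1], [2, 10], [3, 100], [4, 100]], columns=["a", "b"]
)
tenth = all_numeric.quantile(0.1)

# Hypothetical mixed frame: one numeric column and one datetime column.
mixed = pd.DataFrame(
    {"a": [1, 2, 3, 4], "when": pd.date_range("2014-01-01", periods=4)}
)
# Passing numeric_only=True explicitly drops the datetime column.
numeric_median = mixed.quantile(0.5, numeric_only=True)
```

Here `numeric_median` contains only the `a` column, while `tenth` reproduces the values shown in the docstring example.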

pandas/core/frame.py Outdated Show resolved Hide resolved
Member

@phofl phofl left a comment

small comments

pandas/tests/frame/methods/test_quantile.py Outdated Show resolved Hide resolved
pandas/tests/frame/methods/test_quantile.py Outdated Show resolved Hide resolved
pandas/tests/frame/methods/test_quantile.py Show resolved Hide resolved
[
pd.date_range("2014-01-01", periods=3, freq="m"),
["a", "b", "c"],
[DataFrame, Series, Timestamp],
Member

What are you testing with this?

Contributor Author

The quantile method must ignore the non-numeric columns and emit a warning. I am testing three different non-numeric columns: a date column, a string column, and a column of objects.
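The behaviour described in this reply (skip non-numeric columns and warn when `numeric_only` was not passed) could be modelled roughly as below. This is a simplified stand-alone sketch, not the pandas implementation. The names `_NO_DEFAULT` and `select_columns` and the dict-of-lists "frame" are all hypothetical, and warning only when non-numeric columns are present is an assumption.

```python
import warnings
from datetime import date

# Hypothetical sentinel meaning "the caller did not pass numeric_only".
_NO_DEFAULT = object()


def _is_numeric(values):
    # Treat ints and floats (but not bools) as numeric in this toy model.
    return all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )


def select_columns(columns, numeric_only=_NO_DEFAULT):
    """Return the names of the columns a quantile would be computed on.

    `columns` maps a column name to a list of values.
    """
    has_non_numeric = any(not _is_numeric(v) for v in columns.values())
    if numeric_only is _NO_DEFAULT:
        numeric_only = True  # keep the existing default for now
        if has_non_numeric:  # assumption: warn only when it changes the result
            warnings.warn(
                "In future versions, numeric_only will be set to False by "
                "default; pass numeric_only=True to keep the current result.",
                FutureWarning,
                stacklevel=2,
            )
    if numeric_only:
        return [name for name, vals in columns.items() if _is_numeric(vals)]
    return list(columns)
```

Using a sentinel instead of `True` as the default lets the function tell "not passed" apart from an explicit `numeric_only=True`, so callers who state their choice see no warning.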

@NumberPiOso NumberPiOso requested a review from phofl March 10, 2022 19:04
@jreback
Contributor

jreback commented Mar 16, 2022

@NumberPiOso can you merge master and ping on green

@jreback jreback merged commit aa3e420 into pandas-dev:main Mar 18, 2022
@jreback
Contributor

jreback commented Mar 18, 2022

thanks @NumberPiOso

yehoshuadimarsky pushed a commit to yehoshuadimarsky/pandas that referenced this pull request Jul 13, 2022
Labels
Algos (Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff), Deprecate (Functionality to remove in pandas), quantile (quantile method)
Development

Successfully merging this pull request may close these issues.

BUG: DataFrame quantile with only datetime dtypes should provide better error message
3 participants