groupby.rank 'na_option="bottom"' Usage Clarification #22124

peterpanmj · 2018-07-30T05:10:55Z

Code Sample, a copy-pastable example if possible

In [1]: import pandas as pd
   ...: import numpy as np
   ...: df = pd.DataFrame({'val': [2, np.nan, 2, 8, 2, np.nan, 6]})
   ...: df["key"] = pd.Series(["foo"]*7)

In [2]: df
Out[2]:
   val  key
0  2.0  foo
1  NaN  foo
2  2.0  foo
3  8.0  foo
4  2.0  foo
5  NaN  foo
6  6.0  foo

In [5]: df.groupby("key").rank(na_option="not bottom")
Out[5]:
   val
0  2.0
1  6.5
2  2.0
3  5.0
4  2.0
5  6.5
6  4.0

Problem description

When an invalid value is passed as the na_option argument to groupby.rank, no ValueError is raised; the call silently returns ranks, as shown above. The same invalid value raises ValueError("na_option must be one of 'keep', 'top', or 'bottom'") in DataFrame.rank and Series.rank.
The expected output is derived from #19499.
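
For illustration, the kind of up-front check being asked for could look like the sketch below. It mirrors the guard visible in the DataFrame.rank traceback under Expected Output. The names `VALID_NA_OPTIONS` and `validate_na_option` are hypothetical and are not part of pandas; this is only a sketch of the idea, not the actual fix.

```python
# Hypothetical helper, not pandas code: the accepted values and the error
# message are taken from the DataFrame.rank traceback in this issue.
VALID_NA_OPTIONS = frozenset({"keep", "top", "bottom"})


def validate_na_option(na_option):
    """Return na_option unchanged, or raise ValueError if it is not accepted."""
    if na_option not in VALID_NA_OPTIONS:
        raise ValueError("na_option must be one of 'keep', 'top', or 'bottom'")
    return na_option


# A valid value passes through; an invalid one is rejected before any ranking.
validate_na_option("bottom")
try:
    validate_na_option("not bottom")
except ValueError as err:
    print(err)
```

Running a check like this before any ranking work starts would make groupby.rank fail the same way DataFrame.rank and Series.rank already do.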

Expected Output

In [1]: import pandas as pd
   ...: import numpy as np
   ...: df = pd.DataFrame({'val': [2, np.nan, 2, 8, 2, np.nan, 6]})
   ...: df["key"] = pd.Series(["foo"]*7)
   ...:

In [2]: df.rank(na_option="no bottom")
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-2-ec2afc565a7d> in <module>()
----> 1 df.rank(na_option="no bottom")

C:\Users\Public\pandas-peter\pandas\core\generic.py in rank(self, axis, method, numeric_only, na_option, ascending, pct)
   7523         if na_option not in {'keep', 'top', 'bottom'}:
   7524             msg = "na_option must be one of 'keep', 'top', or 'bottom'"
-> 7525             raise ValueError(msg)
   7526
   7527         def ranker(data):

ValueError: na_option must be one of 'keep', 'top', or 'bottom'
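
A small script along these lines could be used to compare the two code paths on a given pandas build. It is a sketch only: `raises_value_error` is a hypothetical helper, and the groupby result depends on whether the installed version already includes a fix for this issue.

```python
import numpy as np
import pandas as pd

# Same data as in the code sample above.
df = pd.DataFrame({"val": [2, np.nan, 2, 8, 2, np.nan, 6], "key": ["foo"] * 7})


def raises_value_error(func):
    """Hypothetical helper: True if calling func() raises ValueError."""
    try:
        func()
    except ValueError:
        return True
    return False


# Series.rank is reported in this issue to reject the invalid value.
series_raises = raises_value_error(
    lambda: df["val"].rank(na_option="not bottom")
)
# groupby.rank is the path under discussion; the outcome depends on the version.
groupby_raises = raises_value_error(
    lambda: df.groupby("key").rank(na_option="not bottom")
)

print("Series.rank raises ValueError: ", series_raises)
print("groupby.rank raises ValueError:", groupby_raises)
```

On the build described in this report, the second line would be expected to print False, which is the inconsistency being reported.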

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit: d30c4a0
python: 3.6.4.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: zh_CN.UTF-8
LOCALE: None.None

pandas: 0.24.0.dev0+377.gd30c4a069
pytest: 3.3.2
pip: 9.0.1
setuptools: 38.4.0
Cython: 0.28.4
numpy: 1.14.0
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.6.7
patsy: 0.5.0
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.4
feather: None
matplotlib: 2.1.2
openpyxl: 2.4.10
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.2
lxml: 4.1.1
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.2.1
pymysql: 0.7.11.None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None


peterpanmj added a commit to peterpanmj/pandas that referenced this issue Jul 30, 2018

BUG: Better handling of invalid na_option argument for groupby.rank(p…

ca106c3

…andas-dev#22124)

peterpanmj mentioned this issue Jul 30, 2018

BUG: Better handling of invalid na_option argument for groupby.rank #22125

Merged

4 tasks

jreback added the Reshaping and Error Reporting labels Jul 30, 2018

jreback added this to the 0.24.0 milestone Jul 30, 2018

peterpanmj added a commit to peterpanmj/pandas that referenced this issue Jul 31, 2018

BUG: Better handling of invalid na_option argument for groupby.rank(p…

9e83338

…andas-dev#22124)

WillAyd added the Groupby label Jul 31, 2018

peterpanmj added a commit to peterpanmj/pandas that referenced this issue Aug 1, 2018

BUG: Better handling of invalid na_option argument for groupby.rank(p…

d0d3e73

…andas-dev#22124)

peterpanmj added a commit to peterpanmj/pandas that referenced this issue Aug 1, 2018

BUG: Better handling of invalid na_option argument for groupby.rank(p…

8615d25

…andas-dev#22124)

jreback closed this as completed in #22125 Aug 1, 2018

jreback pushed a commit that referenced this issue Aug 1, 2018

BUG: Better handling of invalid na_option argument for groupby.rank(#…

a8836f3

…22124) (#22125)

dberenbaum pushed a commit to dberenbaum/pandas that referenced this issue Aug 3, 2018

BUG: Better handling of invalid na_option argument for groupby.rank(p…

4da257b

…andas-dev#22124) (pandas-dev#22125)

Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this issue Oct 1, 2018

BUG: Better handling of invalid na_option argument for groupby.rank(p…

9d20423

…andas-dev#22124) (pandas-dev#22125)
