Skip to content

Conversation

tdpetrou
Copy link
Contributor

@tdpetrou tdpetrou commented Dec 6, 2017

I made a minor change to nlargest/nsmallest to keep all the values of the last n when keep=False

@gfyoung gfyoung added Algos Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff API Design labels Dec 6, 2017
@gfyoung gfyoung added this to the No action milestone Dec 6, 2017
@gfyoung
Copy link
Member

gfyoung commented Dec 6, 2017

Closed due to decision in #18559.

@jreback
Copy link
Contributor

jreback commented Dec 6, 2017

actually this is prob reasonable.

@jreback jreback reopened this Dec 6, 2017
@jreback jreback removed this from the No action milestone Dec 6, 2017
@codecov
Copy link

codecov bot commented Dec 6, 2017

Codecov Report

Merging #18656 into master will decrease coverage by 0.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #18656      +/-   ##
==========================================
- Coverage   91.59%   91.57%   -0.02%     
==========================================
  Files         153      153              
  Lines       51221    51223       +2     
==========================================
- Hits        46917    46910       -7     
- Misses       4304     4313       +9
Flag Coverage Δ
#multiple 89.44% <100%> (ø) ⬆️
#single 40.67% <0%> (-0.11%) ⬇️
Impacted Files Coverage Δ
pandas/core/algorithms.py 94.16% <100%> (+0.01%) ⬆️
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/core/frame.py 97.81% <0%> (-0.1%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 13f6267...fcfc563. Read the comment docs.

@codecov
Copy link

codecov bot commented Dec 6, 2017

Codecov Report

Merging #18656 into master will decrease coverage by 0.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #18656      +/-   ##
==========================================
- Coverage   91.61%    91.6%   -0.02%     
==========================================
  Files         153      153              
  Lines       51367    51369       +2     
==========================================
- Hits        47061    47056       -5     
- Misses       4306     4313       +7
Flag Coverage Δ
#multiple 89.46% <100%> (ø) ⬆️
#single 40.76% <0%> (-0.12%) ⬇️
Impacted Files Coverage Δ
pandas/core/frame.py 97.81% <ø> (-0.1%) ⬇️
pandas/core/algorithms.py 94.16% <100%> (+0.01%) ⬆️
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/util/testing.py 82.52% <0%> (+0.19%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 265e327...56954b4. Read the comment docs.

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change the whatsnew note which remove the keep=False (but keep the actual issue number attached there)

tm.assert_frame_equal(result, expected)

def test_keep_false(self):
df = pd.DataFrame({'a': [5, 4, 4, 2, 3, 3, 3, 3],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add the issue number here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make a more informative name for the test

assert_series_equal(result, expected)

def test_keep_false(self):
s = Series([10, 9, 8, 7, 7, 7, 7, 6])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

self.n = n
self.keep = keep

if self.keep not in ('first', 'last'):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need to add back to the docs-string (don't need a version added as it was tehre before)

@gfyoung
Copy link
Member

gfyoung commented Dec 6, 2017

@jreback : What do you think of making keep=False be the option? Besides API consistency, semantically, it doesn't make as much sense to me (or @tdpetrou ) to use because it would imply that we drop ties.

@tdpetrou
Copy link
Contributor Author

tdpetrou commented Dec 11, 2017

@jreback As @gfyoung mentioned, can we choose the parameter value first before making more changes. Should we do 'all', False, or something completely different?

edit: I am highly in favor of 'all'

@jreback
Copy link
Contributor

jreback commented Dec 13, 2017

yeah False doesn't seem intuitive, all I think would be ok. Not sure if we have this option anywhere else (e.g. it doesn't exist in .drop_duplicates nor should it as that would be silly:>)

@tdpetrou
Copy link
Contributor Author

@jreback Changes complete. Also, I didn't see anything in whatsnew on the other issues.

@jreback
Copy link
Contributor

jreback commented Feb 10, 2018

pls rebase

@jreback
Copy link
Contributor

jreback commented Jun 26, 2018

closing as stale

@jreback jreback closed this Jun 26, 2018
gfyoung added a commit to forking-repos/pandas that referenced this pull request Jun 27, 2018
@gfyoung
Copy link
Member

gfyoung commented Jun 27, 2018

@jreback : Revived in #21650.

gfyoung added a commit to forking-repos/pandas that referenced this pull request Jun 27, 2018
gfyoung added a commit to forking-repos/pandas that referenced this pull request Jun 28, 2018
gfyoung added a commit to forking-repos/pandas that referenced this pull request Jun 28, 2018
gfyoung added a commit to forking-repos/pandas that referenced this pull request Jun 28, 2018
Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this pull request Oct 1, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Algos Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff API Design

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Make parameter keep=False keep duplicates for nlargest/nsmallest

3 participants