Add 'threshold' parameter to pd.DataFrame.dropna #4625

Merged
merged 2 commits into from Apr 15, 2019

@nmatare
Contributor

commented Mar 22, 2019

dask.dataframe.DataFrame.dropna is missing the "thresh" parameter.

The dask documentation includes it in the explanation but it doesn't seem to have been implemented.

As an aside, the docs also mention the "subset" parameter. However, I believe passing it through may cause unexpected behavior with the underlying metadata, so we may want to remove it from the documentation.
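
For reference, here is a minimal sketch of what "thresh" and "subset" do in plain pandas, which is the behavior the dask method is expected to mirror. The frame and its column names are made up for illustration; nothing here is taken from the patch itself.

```python
import numpy as np
import pandas as pd

# Illustrative data: row 0 has 3 non-NA values, row 1 has 2, row 2 has 0.
df = pd.DataFrame(
    {
        "a": [1.0, np.nan, np.nan],
        "b": [2.0, 5.0, np.nan],
        "c": [3.0, 6.0, np.nan],
    }
)

# thresh=N keeps only the rows that have at least N non-NA values.
kept_2 = df.dropna(thresh=2)
kept_3 = df.dropna(thresh=3)

# subset restricts the NA check to the listed columns.
kept_subset = df.dropna(subset=["a"])

print(list(kept_2.index))       # [0, 1]
print(list(kept_3.index))       # [0]
print(list(kept_subset.index))  # [0]
```

Both parameters only look at values within a single row, so the result for a row does not depend on any other row.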

@mrocklin

Member

commented Mar 28, 2019

Thanks for the contribution, @nmatare. Looks simple enough. Would you be willing to add a small test, or perhaps modify the existing test_dropna test in dask/dataframe/tests/test_dataframe.py to exercise this new parameter?

Also, sorry for the delay in responding to this. It's a busy month!
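
As a rough idea of what such a test could check, the sketch below uses only pandas: it splits a frame into row-wise chunks as a stand-in for partitions, applies dropna to each chunk, and compares the concatenated result with dropna on the whole frame. The helper `dropna_by_partition` and the sample data are hypothetical and are not the code that was added to test_dataframe.py; a real dask test would build a dask DataFrame and use the project's own comparison utilities.

```python
import numpy as np
import pandas as pd


def dropna_by_partition(frame, npartitions, **kwargs):
    """Hypothetical helper: apply pandas dropna chunk by chunk.

    This mimics a partition-wise operation without depending on dask.
    """
    size = -(-len(frame) // npartitions)  # ceiling division
    parts = [frame.iloc[i:i + size] for i in range(0, len(frame), size)]
    return pd.concat([part.dropna(**kwargs) for part in parts])


# Illustrative data with a different NA pattern in each row.
df = pd.DataFrame(
    {
        "x": [1, 2, np.nan, 4, 5, 6],
        "y": [1, 2, np.nan, 4, np.nan, 6],
        "z": [1, np.nan, np.nan, 4, 5, 6],
    }
)

# The chunked result should match the whole-frame result for every
# combination of keyword arguments exercised here.
for kwargs in [{}, {"thresh": 2}, {"thresh": 3}, {"subset": ["y"]}]:
    expected = df.dropna(**kwargs)
    result = dropna_by_partition(df, npartitions=3, **kwargs)
    pd.testing.assert_frame_equal(result, expected)
```

The comparison holds because dropna with the default row axis decides each row independently, which is why forwarding "thresh" to each partition is expected to be safe.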

@mrocklin

Member

commented Apr 12, 2019

Checking in, @nmatare: are you still planning to work on this?

@nmatare

Contributor Author

commented Apr 15, 2019

Added to the existing tests. Apologies for the extended delay in getting back to this.

@codecov

commented Apr 15, 2019

Codecov Report

❗️ No coverage uploaded for pull request base (master@eef368f).
The diff coverage is n/a.


@@            Coverage Diff            @@
##             master    #4625   +/-   ##
=========================================
  Coverage          ?   91.24%           
=========================================
  Files             ?       92           
  Lines             ?    17237           
  Branches          ?        0           
=========================================
  Hits              ?    15728           
  Misses            ?     1509           
  Partials          ?        0

Impacted Files            Coverage Δ
dask/dataframe/core.py    95.77% <ø> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update eef368f...ed3f886.

@mrocklin mrocklin merged commit 625118f into dask:master Apr 15, 2019

4 checks passed

codecov/patch: Coverage not affected.
codecov/project: No report found to compare against
continuous-integration/appveyor/pr: AppVeyor build succeeded
continuous-integration/travis-ci/pr: The Travis CI build passed

@mrocklin

Member

commented Apr 15, 2019

Thanks @nmatare! This is in.

asmith26 added a commit to asmith26/dask that referenced this pull request Apr 22, 2019

jorge-pessoa pushed a commit to jorge-pessoa/dask that referenced this pull request May 14, 2019
