Add 'threshold' parameter to pd.DataFrame.dropna #4625

Merged
merged 2 commits into from Apr 15, 2019

Contributor

@nmatare nmatare commented Mar 22, 2019

dask.DataFrame.dropna is missing the "thresh" parameter.

The dask documentation includes it in the explanation but it doesn't seem to have been implemented.

As an aside, the docs also mention the "subset" parameter. However, I believe passing this through may cause unexpected behavior with the underlying metadata, so we may want to remove it from the documentation.
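
For readers unfamiliar with the parameter: in pandas, `thresh` is the minimum number of non-NA values a row must contain to be kept. The sketch below is not the code from this PR. It uses plain pandas and assumes hand-made row slices as stand-ins for dask partitions, to illustrate why a row-wise `dropna(thresh=...)` gives the same result whether it runs on the whole frame or on each partition separately.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "x": [1.0, np.nan, 3.0, np.nan, 5.0, np.nan],
        "y": [np.nan, np.nan, 3.0, 4.0, 5.0, np.nan],
        "z": [1.0, np.nan, np.nan, 4.0, 5.0, 6.0],
    }
)

# thresh=2 keeps only the rows that have at least two non-missing values.
expected = df.dropna(thresh=2)

# Stand-ins for partitions (an assumption for illustration): three row slices.
partitions = [df.iloc[0:2], df.iloc[2:4], df.iloc[4:6]]

# Each row is judged on its own, so dropping per slice and concatenating
# matches dropping on the full frame.
result = pd.concat([part.dropna(thresh=2) for part in partitions])

print(result.equals(expected))
```

Because the decision depends only on the values within a single row, no information has to cross partition boundaries.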

Member

@mrocklin mrocklin commented Mar 28, 2019

Thanks for the contribution, @nmatare. Looks simple enough. Would you be willing to add a small test, or perhaps modify the existing test_dropna test in dask/dataframe/tests/test_dataframe.py to exercise this new parameter?

Also, sorry for the delay in responding to this. It's a busy month!

Member

@mrocklin mrocklin commented Apr 12, 2019

Checking in, @nmatare: are you still planning to work on this?

Contributor Author

@nmatare nmatare commented Apr 15, 2019

Added to the existing tests. Apologies for the extended delay in getting back to this.

@codecov codecov bot commented Apr 15, 2019

Codecov Report

No coverage uploaded for pull request base (master@eef368f).
The diff coverage is n/a.

@@            Coverage Diff            @@
##             master    #4625   +/-   ##
=========================================
  Coverage          ?   91.24%           
=========================================
  Files             ?       92           
  Lines             ?    17237           
  Branches          ?        0           
=========================================
  Hits              ?    15728           
  Misses            ?     1509           
  Partials          ?        0

| Impacted Files | Coverage Δ |
|---|---|
| dask/dataframe/core.py | 95.77% <ø> (ø) |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update eef368f...ed3f886.

@mrocklin mrocklin merged commit 625118f into dask:master Apr 15, 2019
4 checks passed
Member

@mrocklin mrocklin commented Apr 15, 2019

Thanks, @nmatare! This is in.

jorge-pessoa pushed a commit to jorge-pessoa/dask that referenced this issue May 14, 2019