I believe it should be addressed to avoid implicit behaviour.
Feature Description
The simplest way to address it would be to change the default parameter of Series.nunique to dropna=False.
Analogously, the same default would apply to DataFrame.nunique.
This would be consistent with the current summary of the method:
Count number of distinct elements in specified axis.
Return Series with number of distinct elements. Can ignore NaN values.
"Can ignore NaN values." hints that ignoring NaN should be an optional parameter, not one enabled by default.
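The proposed default can already be requested explicitly today. A minimal sketch, assuming illustrative data that does not come from the original report, of what Series.nunique and DataFrame.nunique return with dropna=False:

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption, not taken from the issue).
s = pd.Series([1.0, 2.0, np.nan, 2.0])
df = pd.DataFrame({"a": [1.0, np.nan, 1.0], "b": [np.nan, np.nan, 5.0]})

# Passing dropna=False counts NaN as a distinct value.
series_count = s.nunique(dropna=False)
frame_counts = df.nunique(dropna=False)

# With NaN counted, the result lines up with len(s.unique()).
matches_unique = series_count == len(s.unique())

print(series_count)
print(frame_counts)
print(matches_unique)
```

Under the proposal, these would be the results of calling nunique() with no arguments.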
Alternative Solutions
Another approach to force consistent NaN handling by default would be to adapt Series.unique to accept dropna and set it to True by default.
Although possible, this is a more laborious and more impactful change to the pandas API.
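To make the alternative concrete, here is a rough sketch of the behaviour it describes. The helper name unique_with_dropna is hypothetical and is not part of pandas; it only wraps existing calls (Series.dropna and Series.unique) to imitate a dropna parameter.

```python
import numpy as np
import pandas as pd


def unique_with_dropna(series: pd.Series, dropna: bool = True):
    """Hypothetical helper (not a pandas API): unique values, optionally without NaN."""
    # Drop missing values first when requested, then defer to Series.unique.
    values = series.dropna() if dropna else series
    return values.unique()


# Illustrative data (an assumption, not taken from the issue).
s = pd.Series([1.0, 2.0, np.nan, 2.0])

without_nan = unique_with_dropna(s)
with_nan = unique_with_dropna(s, dropna=False)

print(len(without_nan), s.nunique())
print(len(with_nan), s.nunique(dropna=False))
```

With dropna=True as the default, the length of the result agrees with s.nunique() as it behaves today.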
Additional Context
No response
EDIT: Typos
I think it should be dropna=True by default, so your alternative solution, i.e. add dropna to Series.unique (with default set to True) makes more sense to me. cc: @rhshadrach
Feature Type
Adding new functionality to pandas
Changing existing functionality in pandas
Removing existing functionality in pandas
Problem Description
Currently, Series.nunique has a default parameter dropna=True. However, Series.unique does not accept the dropna parameter.
This can cause unexpected behaviour: s.nunique() is not necessarily equal to len(s.unique()).
See example below:
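A minimal sketch of the mismatch, assuming a Series that contains a single NaN (the values are illustrative and are not the issue's original example):

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption): two distinct numbers plus one NaN.
s = pd.Series([1.0, 2.0, np.nan, 2.0])

# nunique() uses dropna=True by default, so NaN is not counted.
n_distinct = s.nunique()

# unique() has no dropna parameter and keeps NaN as its own value.
n_unique = len(s.unique())

print(n_distinct, n_unique)
```

The two counts differ by one whenever the Series contains at least one missing value.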