ENH: add missing/infinite values counts in .describe  #54076

@lcrmorin

Description

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

Since scikit-learn estimators often fail on missing or infinite values, I'd like a built-in way to count them. Currently we need to call describe and something like .isna().sum() separately. It would be nice if the describe method could also report counts of missing and infinite values. This could even be extended to count user-defined 'sentinel values'.

Feature Description

Add new rows to the output of the .describe method with the count of missing values and the count of infinite values.

Some parameters can be added to the describe function:

  • a list of 'sentinel' values, so that the describe method also provides counts for those
  • an option to provide frequency (proportion of total) instead of counts
  • an option to enable / disable those counts entirely
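
To make the request concrete, here is a minimal sketch of the proposed behavior written as a standalone helper. Everything in it is an assumption for illustration: the function name `describe_with_counts` and the parameters `sentinels`, `normalize` and `extra_counts` are hypothetical and not part of pandas. It only handles numeric columns with NumPy-backed dtypes and expects at least one such column.

```python
import numpy as np
import pandas as pd


def describe_with_counts(df, sentinels=(), normalize=False, extra_counts=True):
    """Hypothetical helper (not a pandas API): describe() plus extra count rows."""
    # Restrict to numeric columns so np.isinf is well defined.
    numeric = df.select_dtypes(include="number")
    summary = numeric.describe()
    if not extra_counts:
        return summary

    # One Series per extra row, indexed by column name.
    rows = {
        "missing": numeric.isna().sum(),
        "infinite": np.isinf(numeric).sum(),
    }
    for value in sentinels:
        rows[f"sentinel_{value}"] = (numeric == value).sum()

    # Transpose so the extra statistics become rows, like describe() output.
    extra = pd.DataFrame(rows).T
    if normalize:
        # Report proportions of the total number of rows instead of counts.
        extra = extra / len(numeric)
    return pd.concat([summary, extra])


df = pd.DataFrame({"a": [1.0, np.nan, np.inf, -999.0]})
result = describe_with_counts(df, sentinels=[-999])
print(result)
```

The extra rows are appended below the usual describe output, so existing rows such as count and mean keep their position; whether a real implementation should append rows or return a separate object is an open design question.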

Alternative Solutions

The current solution is to do the counts outside of the describe function.
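
For reference, the workaround looks roughly like the following. The example data is made up for illustration; the calls themselves (`describe`, `isna`, `select_dtypes`, `np.isinf`) are existing pandas and NumPy functions.

```python
import numpy as np
import pandas as pd

# Made-up example data containing NaN and +/- infinity.
df = pd.DataFrame(
    {
        "a": [1.0, np.nan, np.inf, 4.0],
        "b": [0.5, 2.0, -np.inf, np.nan],
    }
)

# Step 1: the usual summary statistics.
summary = df.describe()

# Step 2: missing values, counted separately per column.
missing = df.isna().sum()

# Step 3: infinite values, counted separately on the numeric columns.
infinite = np.isinf(df.select_dtypes(include="number")).sum()

print(summary)
print(missing)
print(infinite)
```

Note that the count row of describe reports non-missing values, so infinite values are included in it and have to be checked for separately.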

Additional Context

No response

Metadata

Labels

Enhancement, Needs Info (Clarification about behavior needed to assess issue)
