ENH: Add convenience API to summarize null counts grouped by dtype (e.g. df.dtype_nulls.summary()) #62833

@Princu1999

Description


Feature Type

- [x] Adding new functionality to pandas

- [ ] Changing existing functionality in pandas

- [ ] Removing existing functionality in pandas

Problem Description

Add a small convenience API to provide a quick, per-dtype view of missing values in a DataFrame. The utility should list columns grouped by dtype with null counts and optional null percentages, and return both a one-row-per-dtype summary and a per-dtype detail table (columns + null counts).

This is a diagnostic convenience (similar in spirit to df.info(show_counts=True) but grouped by dtype and returning programmatic output).
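For reference, the closest ad-hoc equivalent today is to group per-column null counts by each column's dtype string (a minimal sketch, not an existing pandas API; the variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "a": [1.0, None, 3.0],
    "b": ["x", None, "z"],
})

# Per-column null counts, grouped by each column's dtype string.
# df.isna().sum() is indexed by column name, so it aligns with df.dtypes.
nulls_by_dtype = df.isna().sum().groupby(df.dtypes.astype(str)).sum()
```

This yields the per-dtype totals but not the per-dtype detail tables or percentages that the proposed accessor would return.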

Feature Description

Add a DataFrame accessor that provides a compact, programmatic summary of missing values grouped by column dtype.

```python
@pd.api.extensions.register_dataframe_accessor("dtype_nulls")
class DtypeNullsAccessor:
    def __init__(self, df):
        self._df = df

    def summary(self, include_pct: bool = True, sort_desc: bool = True):
        """
        Return (summary_df, detail_dict).

        Parameters
        ----------
        include_pct : bool, default True
            Include null_pct columns (percentage of nulls relative to len(df)).
        sort_desc : bool, default True
            Sort per-dtype detail tables by null_count descending when True.

        Returns
        -------
        summary_df : pd.DataFrame
            One row per dtype with columns:
              - dtype : str (dtype string, e.g., 'float64', 'object')
              - n_columns : int
              - cols_with_nulls : int
              - total_nulls : int
              - avg_null_pct : float (if include_pct)
        detail_dict : dict[str, pd.DataFrame]
            Mapping dtype string -> DataFrame listing columns of that dtype
            with columns ['column', 'null_count', 'null_pct'?] (null_pct
            present if include_pct).
        """
```

Implementation sketch / pseudocode:

```python
# Body of summary():
df = self._df
nrows = len(df)
per_col = pd.DataFrame({
    "column": df.columns,
    "dtype": df.dtypes.astype(str),
    "null_count": df.isna().sum().values,
})
if include_pct:
    per_col["null_pct"] = per_col["null_count"] / (nrows if nrows else 1) * 100

detail = {
    dtype: g.sort_values("null_count", ascending=not sort_desc).reset_index(drop=True)
    for dtype, g in per_col.groupby("dtype")
}

agg = per_col.groupby("dtype").agg(
    n_columns=("column", "count"),
    cols_with_nulls=("null_count", lambda s: (s > 0).sum()),
    total_nulls=("null_count", "sum"),
).reset_index()

if include_pct:
    agg["avg_null_pct"] = per_col.groupby("dtype")["null_pct"].mean().values

return agg, detail
```

Expected behaviour / examples:

```python
df = pd.DataFrame({
    "a": [1, None, 3],
    "b": [None, None, 2.0],
    "c": ["x", "y", None],
    "d": [True, False, True],
})
summary, detail = df.dtype_nulls.summary()
```

- `summary`: rows for `'float64'`, `'object'`, and `'bool'` with counts and percentages
- `detail['float64']` lists columns `'b'` and `'a'` with `null_count` and `null_pct`
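For completeness, the signature, sketch, and example above can be combined into one runnable script (illustrative only; the `dtype_nulls` accessor name and output columns are the proposal, not an existing pandas API):

```python
import pandas as pd


@pd.api.extensions.register_dataframe_accessor("dtype_nulls")
class DtypeNullsAccessor:
    def __init__(self, df):
        self._df = df

    def summary(self, include_pct=True, sort_desc=True):
        df = self._df
        nrows = len(df)
        # One row per column: name, dtype string, null count.
        per_col = pd.DataFrame({
            "column": df.columns,
            "dtype": df.dtypes.astype(str),
            "null_count": df.isna().sum().values,
        })
        if include_pct:
            per_col["null_pct"] = per_col["null_count"] / (nrows or 1) * 100
        # Per-dtype detail tables, sorted by null_count.
        detail = {
            dtype: g.sort_values("null_count", ascending=not sort_desc)
                    .reset_index(drop=True)
            for dtype, g in per_col.groupby("dtype")
        }
        # One row per dtype.
        agg = per_col.groupby("dtype").agg(
            n_columns=("column", "count"),
            cols_with_nulls=("null_count", lambda s: int((s > 0).sum())),
            total_nulls=("null_count", "sum"),
        ).reset_index()
        if include_pct:
            # groupby keys are sorted identically in both aggregations.
            agg["avg_null_pct"] = per_col.groupby("dtype")["null_pct"].mean().values
        return agg, detail


df = pd.DataFrame({
    "a": [1, None, 3],
    "b": [None, None, 2.0],
    "c": ["x", "y", None],
    "d": [True, False, True],
})
summary, detail = df.dtype_nulls.summary()
```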

Alternative Solutions

One-liner / ad-hoc: Users can already compute this with a short snippet:

```python
(pd.DataFrame({"dtype": df.dtypes.astype(str), "nulls": df.isna().sum()})
 .reset_index()
 .groupby("dtype")[["index", "nulls"]])
```
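Carried through to an aggregated result, this ad-hoc approach might look like the following (a self-contained sketch; `per_col` and `agg` are illustrative names, not part of any proposed API):

```python
import pandas as pd

df = pd.DataFrame({
    "a": [1, None, 3],
    "b": [None, None, 2.0],
    "c": ["x", "y", None],
})

# Per-column null counts alongside each column's dtype string.
per_col = pd.DataFrame({
    "dtype": df.dtypes.astype(str),
    "nulls": df.isna().sum(),
})

# Aggregate to one row per dtype: column count and total nulls.
agg = per_col.groupby("dtype").agg(
    n_columns=("nulls", "size"),
    total_nulls=("nulls", "sum"),
)
```

This shows the per-dtype summary is already achievable, but users must rebuild it (and the detail tables) by hand each time, which is what the proposed accessor would package up.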

Additional Context

Related design rationale:

This feature is a convenience diagnostic that complements df.info() and profiling packages; it returns programmatic data structures (DataFrames and dict) so downstream tooling and tests can consume results.
