
ENH: Supporting a mapper function as the 1st argument in DataFrame.set_axis #61493

Closed
@aallahyar

Description

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

At the moment, DataFrame.set_axis() only accepts labels:

import pandas as pd

df = (
    pd.DataFrame({'A': range(3), 'B': range(10, 13)})
    .set_axis(['a', 'b'], axis=1)
)
print(df)

#    a   b
# 0  0  10
# 1  1  11
# 2  2  12

This makes .set_axis awkward to use in method chains, where the available columns may be dynamic and unknown at the beginning of the chain: it can only be used through a more verbose workaround.

Feature Description

I suggest allowing .set_axis to accept a "mapper" (a function, dict, or Series) that converts the current axis into the preferred one.
The enhancement could take inspiration from how .rename_axis works. For example, .set_axis could accept a function, apply it to the current axis of the DataFrame (its index or columns, depending on the axis argument), and set the axis to the labels the function returns (see the example below).

Example:

df = (
    pd.DataFrame({'A': range(3), 'B': range(10, 13)})
    .set_axis(lambda df: 'col' + df.columns, axis=1)

    # or, as an alternative signature to support:
    # .set_axis({'A': 'colA', 'B': 'colB'}, axis='columns')
)

#    colA  colB
# 0     0    10
# 1     1    11
# 2     2    12
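
Until something like this exists, the proposed behavior can be approximated with a small wrapper around the current API. The following is only a sketch: `set_axis_with` is a hypothetical helper name, not part of pandas, and it assumes that a callable receives the whole DataFrame (as in the example above) and that labels missing from a dict or Series mapper are left unchanged.

```python
import pandas as pd


def set_axis_with(df, mapper, axis=0):
    """Hypothetical helper (not a pandas API) emulating the proposed mapper support."""
    # Pick the axis that is about to be replaced.
    current = df.columns if axis in (1, "columns") else df.index

    if callable(mapper):
        # Assumption: the function receives the DataFrame, as in the example above.
        labels = mapper(df)
    elif isinstance(mapper, (dict, pd.Series)):
        lookup = mapper.to_dict() if isinstance(mapper, pd.Series) else mapper
        # Assumption: labels without an entry in the mapper are kept as they are.
        labels = [lookup.get(label, label) for label in current]
    else:
        # Plain list-like of labels: same behavior as today's set_axis.
        labels = mapper

    return df.set_axis(labels, axis=axis)


df = (
    pd.DataFrame({"A": range(3), "B": range(10, 13)})
    .pipe(set_axis_with, lambda d: "col" + d.columns, axis=1)
)
```

The dict form would be called the same way, for instance `.pipe(set_axis_with, {'A': 'colA', 'B': 'colB'}, axis='columns')`.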

Alternative Solutions

There is, of course, a workaround, but it is slightly verbose in a method chain:

df = (
    pd.DataFrame({'A': range(3), 'B': range(10, 13)})
    .pipe(lambda df: df.set_axis('col' + df.columns, axis=1))
)
print(df)

#    colA  colB
# 0     0    10
# 1     1    11
# 2     2    12
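
For this particular prefixing case, existing pandas methods may already cover the need without `.pipe`. The sketch below uses `DataFrame.rename` and `DataFrame.add_prefix`; the variable names are only illustrative. Note that these map each label individually, whereas `.set_axis` replaces the whole axis at once, so they are not a full substitute for the proposal.

```python
import pandas as pd

df = pd.DataFrame({"A": range(3), "B": range(10, 13)})

# rename accepts a function that is applied to each column label.
via_function = df.rename(columns=lambda label: "col" + label)

# rename also accepts a dict; labels not listed in it are left unchanged.
via_dict = df.rename(columns={"A": "colA", "B": "colB"})

# add_prefix prepends a string to every column label of a DataFrame.
via_prefix = df.add_prefix("col")
```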

Additional Context

I think .set_axis in general needs an API consistency update.
For example, DataFrame.rename_axis accepts its arguments in two ways:

df.rename_axis(index=index_mapper, columns=columns_mapper)
df.rename_axis(mapper, axis='index')

However, .set_axis does not support these calling signatures. I propose additionally supporting index= and columns= keyword arguments to clarify intent and improve readability.
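
To illustrate what the keyword form could look like, here is a minimal sketch built on today's `.set_axis`. `set_axes` is a hypothetical helper name and the keyword-only signature is an assumption about how such an API might be shaped, not something pandas provides.

```python
import pandas as pd


def set_axes(df, *, index=None, columns=None):
    """Hypothetical helper (not a pandas API): set one or both axes by keyword."""
    result = df
    if index is not None:
        result = result.set_axis(index, axis="index")
    if columns is not None:
        result = result.set_axis(columns, axis="columns")
    return result


df = (
    pd.DataFrame({"A": range(3), "B": range(10, 13)})
    .pipe(set_axes, index=["x", "y", "z"], columns=["a", "b"])
)
```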

Metadata

Assignees

No one assigned

    Labels

    Closing Candidate (May be closeable, needs more eyeballs)
    Enhancement
    Needs Triage (Issue that has not been reviewed by a pandas team member)