PERF: wide_to_long taking a long time, but I found why ! #49174

@Simon-Free

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this issue exists on the latest version of pandas.

  • I have confirmed this issue exists on the main branch of pandas.

Reproducible Example

When wide_to_long is passed a large number of columns as i (my dataframe has 400,000 rows and 50 index columns of mixed types), the function can take a very long time. I found that the following workaround avoids the slowdown:

  1. Split the dataframe into i_dataframe (the i columns) and stubname_dataframe (the stub columns), keeping the same index on both.
  2. Reset the index on stubname_dataframe so that it becomes a regular column.
  3. Run wide_to_long on stubname_dataframe with i=index to obtain long_dataframe.
  4. Merge i_dataframe onto long_dataframe using the index.
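
The steps above can be sketched roughly as follows. This is a minimal illustration, not a patch to pandas: the helper name wide_to_long_split, the temporary column name row_id, and the toy column names (id1, id2, A1970, ...) are all assumptions made for the example, and it assumes the dataframe has no existing column called row_id.

```python
import pandas as pd


def wide_to_long_split(df, stubnames, id_cols, j, sep="", suffix=r"\d+"):
    """Hypothetical helper: reshape only the stub columns, then re-attach the ids."""
    # 1. Split into id columns and stub columns that share one unique row index.
    df = df.reset_index(drop=True)
    i_df = df[id_cols]
    stub_df = df.drop(columns=id_cols)

    # 2. Expose the row index as an ordinary column ("row_id" is an assumed name).
    stub_df = stub_df.rename_axis("row_id").reset_index()

    # 3. Reshape using a single integer id instead of many id columns.
    long_df = pd.wide_to_long(
        stub_df, stubnames=stubnames, i="row_id", j=j, sep=sep, suffix=suffix
    ).reset_index()

    # 4. Merge the id columns back using the shared row index.
    merged = long_df.merge(i_df, left_on="row_id", right_index=True, how="left")
    return merged.drop(columns="row_id").reset_index(drop=True)


# Toy data standing in for the real 400,000-row frame.
df = pd.DataFrame(
    {
        "id1": ["a", "b", "c"],
        "id2": [1, 2, 3],
        "A1970": [1.0, 2.0, 3.0],
        "A1980": [4.0, 5.0, 6.0],
        "B1970": [7.0, 8.0, 9.0],
        "B1980": [10.0, 11.0, 12.0],
    }
)
fast = wide_to_long_split(df, ["A", "B"], ["id1", "id2"], j="year")
direct = pd.wide_to_long(df, ["A", "B"], i=["id1", "id2"], j="year").reset_index()
```

Unlike a direct call, this sketch returns a flat frame rather than one indexed by the i columns and j; setting that index afterwards is possible but may give back some of the time saved.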

Installed Versions

pd.show_versions() raises an AssertionError, but I am using pandas 1.3.5. I checked the changelogs, and no performance changes were made to pd.wide_to_long between 1.3.5 and 1.5.0.

Prior Performance

No response

    Labels

    Needs Info (Clarification about behavior needed to assess issue); Performance (Memory or execution speed performance); Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)