High memory usage pivot_table  #10554

@cangermueller

Hi all,

I have now observed several times that pivot_table() requires a lot of memory when reshaping a DataFrame from table format (id, sample, value) into matrix format (index=id, columns=sample, values=value):

print(d)
# id  sample  value
    1       A      1
    2       A      2
    1       B     10
    2       B     20
..... (many rows)
e = pd.pivot_table(d, index='id', columns='sample', values='value')
print(e)
# id     A     B
    1     1    10
    2     2    20
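
A self-contained version of the example above may make the report easier to reproduce. This is only a sketch: the four rows stand in for the real (much larger) data, and the names `long_bytes` and `wide_bytes` are illustrative. It builds the table-format frame, pivots it, and reports the footprint of both layouts with `DataFrame.memory_usage`. Note that this measures the size of the result, not the peak memory used while pivoting.

```python
import pandas as pd

# Table ("long") format: one row per (id, sample) pair.
d = pd.DataFrame({
    "id": [1, 2, 1, 2],
    "sample": ["A", "A", "B", "B"],
    "value": [1, 2, 10, 20],
})

# Matrix ("wide") format: one row per id, one column per sample.
e = pd.pivot_table(d, index="id", columns="sample", values="value")
print(e)

# Size in bytes of each layout (deep=True also counts the string data).
long_bytes = int(d.memory_usage(deep=True).sum())
wide_bytes = int(e.memory_usage(deep=True).sum())
print(long_bytes, wide_bytes)
```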

Is this a known issue? Is there a workaround?
For data maintenance, it is often better to store large data in table format and convert it to matrix format on demand. R's reshape and gather functions (http://goo.gl/isx1bv), the latter from the tidyr package that accompanies dplyr, seem to be faster and to require less memory than pivot_table and melt.
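
One candidate for converting on demand, sketched here under the assumption that every (id, sample) pair occurs at most once so that no aggregation is needed, is `set_index` followed by `unstack`. The helper name `long_to_wide` and the synthetic data are hypothetical, and whether this route actually uses less memory than pivot_table on a given pandas version has not been measured here.

```python
import numpy as np
import pandas as pd

def long_to_wide(df, index="id", columns="sample", values="value"):
    """Reshape table format to matrix format without aggregating.

    Assumes each (index, columns) pair is unique; raises otherwise.
    """
    if df.duplicated([index, columns]).any():
        raise ValueError("duplicate (index, columns) pairs need aggregation")
    return df.set_index([index, columns])[values].unstack(columns)

# Synthetic table-format data: 1000 ids x 50 samples, one value per pair.
rng = np.random.default_rng(0)
ids = np.repeat(np.arange(1000), 50)
samples = np.tile([f"s{i}" for i in range(50)], 1000)
d = pd.DataFrame({"id": ids, "sample": samples, "value": rng.random(ids.size)})

wide = long_to_wide(d)

# pivot_table (default aggfunc is the mean) gives the same matrix here,
# because each cell is backed by exactly one row.
expected = pd.pivot_table(d, index="id", columns="sample", values="value")
print(wide.shape, np.allclose(wide.to_numpy(), expected.to_numpy()))
```

The duplicate check is the main design choice: pivot_table silently averages repeated pairs, whereas `unstack` would fail on them, so the sketch raises an explicit error instead.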

Cheers,
Christof

Labels: Performance (memory or execution speed performance), Reshaping (concat, merge/join, stack/unstack, explode)
