Closed
Labels
Performance (memory or execution speed performance), Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)
Description
Hi all,
I have observed several times that pivot_table() requires a lot of memory when reshaping a DataFrame from table (long) format (id, sample, value) into matrix (wide) format (index=id, columns=sample, values=value):
    print(d)
    #  id  sample  value
    #   1       A      1
    #   2       A      2
    #   1       B     10
    #   2       B     20
    #  ..... (many rows)
    e = pd.pivot_table(d, index='id', columns='sample', values='value')
    print(e)
    #      A   B
    #  1   1  10
    #  2   2  20

Is this a known issue? Is there a workaround?
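For anyone who wants to reproduce the reshape, here is a minimal self-contained sketch. The toy frame is only a stand-in for the real (much larger) data, and the `set_index`/`unstack` variant assumes that every (id, sample) pair occurs at most once; it is shown for comparison only, not as a confirmed fix for the memory usage.

```python
import pandas as pd

# Toy stand-in for the large long-format frame `d` (illustrative values only).
d = pd.DataFrame({
    "id": [1, 2, 1, 2],
    "sample": ["A", "A", "B", "B"],
    "value": [1, 2, 10, 20],
})

# The reshape from the report. pivot_table aggregates duplicate
# (id, sample) pairs, using the mean by default.
e = pd.pivot_table(d, index="id", columns="sample", values="value")

# Assumption: (id, sample) pairs are unique, so no aggregation is needed
# and set_index/unstack produces the same wide layout.
e_unstack = d.set_index(["id", "sample"])["value"].unstack("sample")

# Rough memory footprint (bytes) of the long input and the wide result.
mem_long = d.memory_usage(deep=True).sum()
mem_wide = e.memory_usage(deep=True).sum()

print(e)
print(e_unstack)
print(mem_long, mem_wide)
```

Comparing `memory_usage` of the input and the result only shows the size of the final objects; peak memory during the reshape would need an external profiler.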
For data maintenance, it is often better to store large data in table format and convert it to matrix format on demand. The R reshape and gather functions (http://goo.gl/isx1bv) from the dplyr package seem to be faster and require less memory than pivot_table and melt.
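To make the "convert on demand" workflow concrete, here is a small sketch of the round trip in pandas: table format to matrix format and back. The frame and variable names are illustrative, and `pivot` is used here on the assumption that (id, sample) pairs are unique.

```python
import pandas as pd

# Illustrative long-format (table) data.
long_df = pd.DataFrame({
    "id": [1, 2, 1, 2],
    "sample": ["A", "A", "B", "B"],
    "value": [1, 2, 10, 20],
})

# Table -> matrix: one row per id, one column per sample.
# pivot raises an error if an (id, sample) pair is duplicated.
wide = long_df.pivot(index="id", columns="sample", values="value")

# Matrix -> table: melt the sample columns back into rows.
back = wide.reset_index().melt(
    id_vars="id", var_name="sample", value_name="value"
)

print(wide)
print(back)
```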
Cheers,
Christof