spread causes system to run out of memory #13

jtlowell · 2014-07-18T13:49:38Z

How long or wide can a data frame be when going from gathered to spread form?

I have a 200,000 row data frame I'm spreading to create 200,000 columns and I'm running out of memory.

Have you tested the limits of these operations on various machines?
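
For a sense of scale, here is a rough back-of-envelope estimate in base R. It assumes the worst case, where each of the 200,000 input rows keeps its own output row as well as its own column, and that every cell is stored as an 8-byte double. Both are assumptions for illustration, not details stated above.

```r
# Worst-case size of the spread result: 200,000 rows by 200,000 columns.
# Assumption: every cell is an 8-byte double (integer or logical cells
# would take 4 bytes each instead).
n_rows <- 200000
n_cols <- 200000
bytes_per_cell <- 8

total_bytes <- n_rows * n_cols * bytes_per_cell
total_gb <- total_bytes / 1e9   # decimal gigabytes
total_gb
```

Under those assumptions the result alone would need about 320 GB before counting any intermediate copies, so running out of memory is expected regardless of how the operation is implemented.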

hadley · 2014-07-21T14:08:41Z

I have not. It might be possible to replace the vectorised R code with optimised C++ code that would need less memory.

JamesOwers · 2015-12-11T22:57:00Z

Just a quick note that I'm having memory issues with spread(..., drop=FALSE). If I use spread(..., drop=TRUE), everything works fine: the process takes just a few seconds and the result is 0.2 MB.

My input dataset is 0.4 MB, with 6000 rows and 11 variables. It is the result of a filter on a 200 MB dataset. When running with spread(..., drop=FALSE), the rsession memory grows to over 20 GB.

Unfortunately I can't provide the exact dataset, but if there is anything I can provide to help, I'll be happy to do so.

hadley · 2015-12-11T23:56:34Z

How many unique values are there in the variables that you are spreading? It is easy to create very very large data frames with spread.
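
One way to answer that before spreading is to count the distinct keys and distinct row identifiers, then multiply. The sketch below uses base R only; the data frame `long` and its column names `id`, `key`, and `value` are hypothetical stand-ins for the real data.

```r
# Hypothetical gathered data: `id` identifies an output row, `key` values
# become column names, and `value` fills the cells.
long <- data.frame(
  id    = c(1, 1, 2, 3),
  key   = c("a", "b", "a", "c"),
  value = c(10, 20, 30, 40)
)

n_out_cols <- length(unique(long$key))  # one output column per distinct key
n_out_rows <- nrow(unique(long["id"]))  # one output row per distinct id combination
n_cells    <- n_out_rows * n_out_cols   # cells in the result, including filled NAs

c(rows = n_out_rows, cols = n_out_cols, cells = n_cells)
```

Here four input values would become a 3 by 3 block of nine cells. The gap between input values and output cells grows quickly as the number of distinct keys and identifiers rises.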

JamesOwers · 2015-12-12T00:08:58Z

There are some numeric variables with a few thousand unique values, but isn't spread just going to make a variable for each key? Also, by virtue of spread(..., drop=TRUE) working fine, the only variables remaining to spread only have one value: NA.
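
One possible explanation, offered as an assumption rather than anything confirmed in this thread: drop=FALSE is documented as keeping unused factor levels and filling in missing combinations. If that completion applies across all of the remaining identifying columns, the number of output rows would be the product of their unique-value counts rather than the number of combinations actually observed. The base R sketch below only illustrates that arithmetic; the data frame `ids` and its columns are hypothetical.

```r
# Hypothetical identifying columns (everything except the key and value).
ids <- data.frame(
  site  = c("x", "y", "z"),
  depth = c(1.5, 2.5, 3.5),
  year  = c(2013, 2014, 2015)
)

# Rows if only the observed combinations are kept.
n_observed <- nrow(unique(ids))

# Rows if every combination of unique values is generated instead.
unique_counts <- vapply(ids, function(col) length(unique(col)), integer(1))
n_crossed <- prod(unique_counts)

c(observed = n_observed, crossed = n_crossed)
```

Three observed rows become 27 once every combination is generated. With several columns that each hold a few thousand unique values, the same product would be large enough to exhaust memory even though the observed data is small.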

hadley closed this as completed Aug 22, 2014
