Description
When running R code from a JVM project with GraalVM, I ran into a java.lang.OutOfMemoryError: GC overhead limit exceeded.
Using the VisualVM program that comes with Graal, I noticed that the heap would quickly grow to its maximum size, after which the garbage collector would desperately try to reduce the heap size while the running processes slowed down more and more. Eventually the program would simply crash with the OutOfMemoryError.
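For anyone who wants to see the same heap growth without attaching VisualVM, here is a minimal sketch that logs heap usage and garbage collection counts from inside the JVM using the standard java.lang.management API. The class name HeapSampler and its methods are hypothetical helpers made up for illustration; they are not part of the sample project.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper (not part of the sample project): samples heap and GC statistics.
final class HeapSampler {

    /** Returns the number of heap bytes currently in use. */
    static long usedHeapBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }

    /** Sums the collection counts of all collectors; collectors reporting -1 (undefined) are skipped. */
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Formats one sample line with used heap, maximum heap, and the total GC count. */
    static String sample() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long mib = 1024L * 1024L;
        // getMax() returns -1 when the maximum heap size is undefined.
        String max = heap.getMax() < 0 ? "undefined" : (heap.getMax() / mib) + " MiB";
        return String.format("heap used=%d MiB, max=%s, gc count=%d",
                heap.getUsed() / mib, max, totalGcCount());
    }

    public static void main(String[] args) throws InterruptedException {
        // Print a few samples; in a real run this would be called periodically
        // from the loop that executes the R code.
        for (int i = 0; i < 3; i++) {
            System.out.println(sample());
            Thread.sleep(100);
        }
    }
}
```

Calling HeapSampler.sample() once per iteration of the loop that runs the R code should show the pattern described above, assuming the leak reproduces: used heap climbing towards the -Xmx limit while the GC count keeps rising.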
I've spent some time isolating the problem code and I found that the unnest function in tidyr appears to be the source. I have created a small sample Java+R project: https://github.com/NRBPerdijk/fastRBug/tree/memoryLeakUnnest
The problematic code can be found on the memoryLeakUnnest branch and can be run by executing this command from the project folder:
mvn clean install && cd target && {PATH_TO_GRAALVM}/bin/java -Xmx1G -cp fastRBug-1.0-SNAPSHOT.jar:../lib/graal-sdk-19.1.0.jar Main
Alternatively, similar behaviour can be triggered using just FastR, by running the following R snippet:
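library(dplyr)  # provides tibble() and %>%
library(tidyr)  # provides unnest()

runOutOfMemory <- function() {
  # The condition is always TRUE, so this loops until memory runs out.
  while ("a" == "a") {
    df <- tibble(
      x = 1:3,
      y = c("a", "d,e,f", "g,h")
    )
    df %>% unnest(y = strsplit(y, ","))
    print("Whoah!")
  }
}
runOutOfMemory()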
You'll need more patience with this method; the Java project runs out of memory a bit quicker.
Edit:
Tried versions
- GraalVM RC 16 and GraalVM 19.1.0
- tidyr package 0.8.2 and 0.8.3
- dplyr package 0.7.8
This memory leak may also contribute to the poor performance indicated in issue #71