Memory leak / infinite recursion #181

Open

code-tree opened this issue Sep 14, 2015 · 22 comments

@code-tree
Contributor

Since upgrading to 2.11 I'm getting a memory leak when trying to push files. After disabling the GC overhead limit check with -XX:-UseGCOverheadLimit (to get past the exception below), the process went over 1GB before I killed it. The files I'm sending are tiny, though.

Would appreciate any help, thanks

Picked up JAVA_TOOL_OPTIONS: -javaagent:/usr/share/java/jayatanaag.jar 
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
    at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:252)
    at scala.collection.mutable.ArrayOps$ofRef.flatMap(ArrayOps.scala:186)
    at s3.website.model.Files$.recursiveListFiles(push.scala:125)
    at s3.website.model.Files$$anonfun$recursiveListFiles$2.apply(push.scala:125)
    at s3.website.model.Files$$anonfun$recursiveListFiles$2.apply(push.scala:125)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:252)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:252)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
    at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:252)
    at scala.collection.mutable.ArrayOps$ofRef.flatMap(ArrayOps.scala:186)
    at s3.website.model.Files$.recursiveListFiles(push.scala:125)
    at s3.website.model.Files$$anonfun$recursiveListFiles$2.apply(push.scala:125)
    at s3.website.model.Files$$anonfun$recursiveListFiles$2.apply(push.scala:125)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:252)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:252)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
    at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:252)
    at scala.collection.mutable.ArrayOps$ofRef.flatMap(ArrayOps.scala:186)
    at s3.website.model.Files$.recursiveListFiles(push.scala:125)
    at s3.website.model.Files$$anonfun$recursiveListFiles$2.apply(push.scala:125)
    at s3.website.model.Files$$anonfun$recursiveListFiles$2.apply(push.scala:125)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:252)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:252)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
    at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:252)
    at scala.collection.mutable.ArrayOps$ofRef.flatMap(ArrayOps.scala:186)
    at s3.website.model.Files$.recursiveListFiles(push.scala:125)
    at s3.website.model.Files$$anonfun$recursiveListFiles$2.apply(push.scala:125)
    at s3.website.model.Files$$anonfun$recursiveListFiles$2.apply(push.scala:125)
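
The frames repeat in the same recursiveListFiles → flatMap → foreach cycle, which points at unbounded recursion rather than a plain leak. For illustration, a pattern like the following (a hypothetical reconstruction, not the actual push.scala code) can blow up in this way when it is started from a very large or cyclic directory tree:

  import java.io.File

  // Hypothetical reconstruction of the pattern implied by the stack trace:
  // list each directory's children and recurse into subdirectories via flatMap.
  // Started from a huge tree (e.g. "/") or a symlink cycle, the intermediate
  // collections keep growing until the GC overhead limit is exceeded.
  def recursiveListFiles(dir: File): Seq[File] = {
    val children = Option(dir.listFiles).map(_.toSeq).getOrElse(Nil)
    children ++ children.filter(_.isDirectory).flatMap(recursiveListFiles)
  }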
@code-tree
Contributor Author

It seems like it isn't due to 2.11 after all, as I just downgraded to 2.10 (which previously worked for me) and am still getting a memory leak. Any ideas? Thanks

@code-tree changed the title from "Memory leak" to "Memory leak / infinite recursion" on Sep 14, 2015
@laurilehmijoki
Owner

The recursive file listing routine seems to run away sometimes. Maybe we should replace it with [Apache Commons FileUtils](https://commons.apache.org/proper/commons-io/apidocs/org/apache/commons/io/FileUtils.html#listFiles%28java.io.File, java.lang.String[], boolean%29) and see if the problem disappears. Would you like to try?
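
For reference, a minimal sketch of that replacement, assuming commons-io is on the classpath (FileUtils.listFiles(File, String[], boolean) returns a java.util.Collection of files, and a null extensions array means "all files"):

  import java.io.File
  import org.apache.commons.io.FileUtils
  import scala.collection.JavaConverters._

  // Sketch only: delegate the recursive walk to commons-io instead of the
  // hand-rolled flatMap recursion in push.scala.
  def recursiveListFiles(root: File): Seq[File] =
    FileUtils.listFiles(root, null: Array[String], true).asScala.toSeq

Note that this still walks the entire tree under root, so it would mainly rule out a bug in the hand-rolled recursion rather than make listing a huge directory cheap.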

@code-tree
Contributor Author

Sorry, I'm not very familiar with Scala or Java.

Any ideas what might be triggering it, or something I can do to work around it? I'm not able to push any projects at the moment.

Thanks for your work

@laurilehmijoki
Owner

@code-tree try another version of Java? See the official download page at http://www.oracle.com/technetwork/java/javase/downloads/index.html.

@laurilehmijoki
Owner

@code-tree will this comment help you?

@code-tree
Contributor Author

I was using openjdk7 but downgrading to openjdk6 solved this. So I assume it is a bug with openjdk then? Hope it'll be fixed sometime as v6 probably won't be available next time I upgrade OS.

Thanks for your help

@code-tree
Contributor Author

Sorry, that was actually a misdiagnosis. The Java version doesn't matter; it works under OpenJDK 6, 7 and 8. Instead, the problem occurs when there are too many files to list, and it depends on the working directory from which s3_website is executed.

When executed from dir:

  • / -- process over 1GB and fails
  • /home -- process over 1GB and fails
  • /home/user -- process over 1GB and fails
  • /home/user/folder -- process over 300MB but succeeds
  • etc.

So I believe s3_website is actually listing every file from the dir it is executed from (possibly in here), even though I have the --site option specified.

@laurilehmijoki
Owner

Thanks for reporting your valuable discovery!

@laurilehmijoki
Owner

@code-tree please try out the new version 2.11.1. It contains a fix for this problem.

@code-tree
Contributor Author

Thanks for the fix, though unfortunately the problem still remains. Process went to 500MB before I stopped it (using 2.11.1), running from root (/) with --site set to my project.

In addition to limiting the recursion, it would be nice to stop it from trying to list files from the working dir when --site is specified, as I think that might be the root cause of the issue. (I'm assuming this is the case, given that it works when run from a file tree with little depth.)

laurilehmijoki added a commit that referenced this issue Sep 18, 2015
@laurilehmijoki
Owner

@code-tree try 2.11.2, it contains a new fix.

@code-tree
Contributor Author

Thanks Lauri, almost there. It appears to work for site in the yaml config, but not --site from the command line?

That said, even when testing with site in the yaml config, the process still went up to 300MB, but succeeded, unlike when using --site. I wonder if there is another recursion happening somewhere else as well.

@laurilehmijoki
Owner

@code-tree thanks for the feedback. According to the implementation the --site setting does not recursively search for files. I wonder what could explain the behaviour you are experiencing. There are only two places from which the recursiveListFiles function is called:

  1. recursiveListFiles(workingDirectory).find { file =>
  2. recursiveListFiles(site.rootDirectory)

@code-tree
Contributor Author

Sorry, what I meant was: the fix works when site is specified in the yaml config (s3_website does not recurse through working dir) but does not seem to work when site is given as a CLI arg (s3_website still recurses through working dir).

It seems like config is only filled with the yaml options and not the CLI options? In that case the condition in the second line of the fix (config.site.isEmpty) would evaluate to true even when the site is specified on the CLI.

  def resolveSiteDir(implicit yamlConfig: S3_website_yml, config: Config, cliArgs: CliArgs, workingDirectory: File): Either[ErrorReport, File] = {
    val siteFromAutoDetect = if (config.site.isEmpty) { autodetectSiteDir(workingDirectory) } else { None }
    val errOrSiteFromCliArgs: Either[ErrorReport, Option[File]] = Option(cliArgs.site) match {
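
If that hypothesis is right, then perhaps something along these lines would fix it — the Config and CliArgs definitions below are illustrative stubs, not the project's actual types:

  import java.io.File

  // Illustrative stubs only, standing in for the real s3_website types.
  case class Config(site: Option[String])
  trait CliArgs { def site: String } // assumed to return null when --site is absent
  def autodetectSiteDir(workingDirectory: File): Option[File] = None // stub

  // Fall back to scanning the working directory only when the site dir is
  // missing from BOTH the YAML config and the CLI arguments.
  def siteFromAutoDetect(config: Config, cliArgs: CliArgs, workingDirectory: File): Option[File] =
    if (config.site.isEmpty && Option(cliArgs.site).isEmpty) autodetectSiteDir(workingDirectory)
    else None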

@laurilehmijoki
Owner

You are right in your reasoning.

However, the code seems not to recurse in the case where one defines the site via the CLI arg. Hence I wonder what could possibly cause the out-of-memory error in that situation.

@laurilehmijoki
Owner

@code-tree you can also try to work around the problem like this:

First, add the following line to s3_website.yml:

site: <%= ENV['SITE_DIR'] %>

Then invoke SITE_DIR=/path/to/your/site s3_website push.

If there is a bug in the way s3_website handles the --site CLI argument, then the above trick should circumvent that problem.

@code-tree
Contributor Author

Yes, using ENV in the yaml config has fixed it for the time being, thanks

@stroupaloop

Hey there, I'm experiencing the same issue, but setting the site: <%= ENV['SITE_DIR'] %> parameter in s3_website.yml and invoking SITE_DIR=_site s3_website push still returns the Java out-of-memory error noted above.

Any additional insight into the problem? I'm currently using v2.12.2.

@laurilehmijoki
Owner

As a workaround, you can try the pure Ruby implementation at https://github.com/laurilehmijoki/s3_website/tree/1.x

@fagiani

fagiani commented Oct 26, 2017

I'm experiencing the very same behavior even with the ENV parameter set. This all started on a brand new setup; it used to work fine on the previous computer. I guess that some dependency gem may actually be causing it. Is anyone else experiencing this currently? Any clues on how to fix it, as it has been quite a while?

I am using version 3.4.0.

Thanks!

@Nihahs

Nihahs commented Feb 28, 2018

Hi,
I had been facing the same issue. Not sure why, but clearing the tmp folder fixed it. Hope this helps.

@fagiani

fagiani commented Mar 1, 2018

@Nihahs do you mean the root /tmp/ of the filesystem or some other tmp folder? I was unable to reproduce that and still get the same error :(

@laurilehmijoki are there any new clues on what may be causing this? In my case, the only cause I can see is that one of my buckets has a big chunk of logs, and although I have ignore_on_server: logs set, it takes a long while, raises CPU usage and then throws an OutOfMemoryError exception.

Thanks all!
