
Bug - crawl job hangs #478

Closed
danizen opened this issue Mar 30, 2018 · 2 comments
Labels: stale (from automation, when inactive for too long)

Comments

@danizen

danizen commented Mar 30, 2018

In #477, I diagnosed a chain of problems: my crawl job hit a fatal OutOfMemoryError, and a later attempt to stop the collector failed because the JVM being stopped would not exit.

It seems likely that the crawler job had already reached a terminal state, but the shutdown code kept waiting for it to stop cleanly even though it had actually failed.

The exception that produced this state was:

FATAL [JobSuite] Fatal error occured in job: monitor_lessdepth_crawler
INFO  [JobSuite] Running monitor_lessdepth_crawler: END (Tue Feb 06 17:13:41 EST 2018)
Exception in thread "monitor_lessdepth_crawler" java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:3236)
        at java.lang.StringCoding.safeTrim(StringCoding.java:79)
        at java.lang.StringCoding.encode(StringCoding.java:365)
        at java.lang.String.getBytes(String.java:941)
        at org.apache.http.entity.StringEntity.<init>(StringEntity.java:70)
        at com.norconex.committer.elasticsearch.ElasticsearchCommitter.commitBatch(ElasticsearchCommitter.java:589)
        at com.norconex.committer.core.AbstractBatchCommitter.commitAndCleanBatch(AbstractBatchCommitter.java:179)
        at com.norconex.committer.core.AbstractBatchCommitter.commitComplete(AbstractBatchCommitter.java:159)
        at com.norconex.committer.core.AbstractFileQueueCommitter.commit(AbstractFileQueueCommitter.java:233)
        at com.norconex.committer.elasticsearch.ElasticsearchCommitter.commit(ElasticsearchCommitter.java:537)
        at com.norconex.collector.core.crawler.AbstractCrawler.execute(AbstractCrawler.java:274)
        at com.norconex.collector.core.crawler.AbstractCrawler.doExecute(AbstractCrawler.java:228)
        at com.norconex.collector.core.crawler.AbstractCrawler.resumeExecution(AbstractCrawler.java:190)
        at com.norconex.jef4.job.AbstractResumableJob.execute(AbstractResumableJob.java:51)
        at com.norconex.jef4.suite.JobSuite.runJob(JobSuite.java:355)
        at com.norconex.jef4.job.group.AsyncJobGroup.runJob(AsyncJobGroup.java:119)
        at com.norconex.jef4.job.group.AsyncJobGroup.access$000(AsyncJobGroup.java:44)
        at com.norconex.jef4.job.group.AsyncJobGroup$1.run(AsyncJobGroup.java:86)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

For me, reducing the Elasticsearch committer's commitSize resolved the problem, but it is still worth preventing crawl jobs from hanging.

@essiembre
Contributor

OOM errors often can't be recovered from and as such cannot be handled reliably. The JVM application state is already compromised the moment you get one, so killing the process and restarting with more memory is usually the best approach.

Still, if you want to prevent hangs, the best option is likely the JVM's ability to run a kill command (or equivalent) when an OutOfMemoryError occurs, from the Oracle JVM documentation:

-XX:OnOutOfMemoryError="<cmd args>; <cmd args>"
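For example (a sketch only: the kill command, heap size, classpath, and main class below are illustrative rather than taken from the actual launch script; %p is replaced by the JVM with its own process id):

java -Xmx2g -XX:OnOutOfMemoryError="kill -9 %p" -cp "./lib/*:./classes" com.norconex.collector.http.HttpCollector "$@"

This forcibly terminates the JVM as soon as the error is raised, so a hung stop request never comes into play.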

As of Java 8u92, you can also use these JVM arguments (described here):

-XX:+ExitOnOutOfMemoryError
-XX:+CrashOnOutOfMemoryError

The next major release will require Java 8, so the launch scripts shipped with the collector may be modified to include one of the Java 8 arguments.
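For example, adding the Java 8 flag to the java invocation in a launch script might look like this (a sketch; only the -XX flag itself comes from the JVM documentation, the rest of the command line is a placeholder):

java -XX:+ExitOnOutOfMemoryError -cp "./lib/*:./classes" com.norconex.collector.http.HttpCollector "$@"

With -XX:+CrashOnOutOfMemoryError instead, the JVM also produces its usual crash artifacts (hs_err log and, where enabled, a core dump) before terminating, which can help with post-mortem analysis.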

@stale

stale bot commented Aug 1, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the stale label on Aug 1, 2021
stale bot closed this as completed on Aug 8, 2021