In #477, I diagnosed a chain of problems: my crawl job hit a fatal OutOfMemoryError, and a later attempt to stop the collector failed because the JVM being stopped would not exit. It seems likely that the crawler job had already reached a terminal state, but the stop logic kept waiting for it to finish cleanly instead of recognizing that it had failed.
The exception that produced this state was:
FATAL [JobSuite] Fatal error occured in job: monitor_lessdepth_crawler
INFO [JobSuite] Running monitor_lessdepth_crawler: END (Tue Feb 06 17:13:41 EST 2018)
Exception in thread "monitor_lessdepth_crawler" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3236)
at java.lang.StringCoding.safeTrim(StringCoding.java:79)
at java.lang.StringCoding.encode(StringCoding.java:365)
at java.lang.String.getBytes(String.java:941)
at org.apache.http.entity.StringEntity.<init>(StringEntity.java:70)
at com.norconex.committer.elasticsearch.ElasticsearchCommitter.commitBatch(ElasticsearchCommitter.java:589)
at com.norconex.committer.core.AbstractBatchCommitter.commitAndCleanBatch(AbstractBatchCommitter.java:179)
at com.norconex.committer.core.AbstractBatchCommitter.commitComplete(AbstractBatchCommitter.java:159)
at com.norconex.committer.core.AbstractFileQueueCommitter.commit(AbstractFileQueueCommitter.java:233)
at com.norconex.committer.elasticsearch.ElasticsearchCommitter.commit(ElasticsearchCommitter.java:537)
at com.norconex.collector.core.crawler.AbstractCrawler.execute(AbstractCrawler.java:274)
at com.norconex.collector.core.crawler.AbstractCrawler.doExecute(AbstractCrawler.java:228)
at com.norconex.collector.core.crawler.AbstractCrawler.resumeExecution(AbstractCrawler.java:190)
at com.norconex.jef4.job.AbstractResumableJob.execute(AbstractResumableJob.java:51)
at com.norconex.jef4.suite.JobSuite.runJob(JobSuite.java:355)
at com.norconex.jef4.job.group.AsyncJobGroup.runJob(AsyncJobGroup.java:119)
at com.norconex.jef4.job.group.AsyncJobGroup.access$000(AsyncJobGroup.java:44)
at com.norconex.jef4.job.group.AsyncJobGroup$1.run(AsyncJobGroup.java:86)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
For me, reducing the Elasticsearch committer's commitSize resolved the problem (see the configuration sketch below), but it would still be worth preventing crawl jobs from hanging after a fatal error.
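For reference, here is a minimal sketch of the committer section of the crawler's XML configuration with a smaller batch size. The element names (commitBatchSize, queueSize), the node URL, and the index/type names are assumptions taken from the Elasticsearch Committer documentation and are illustrative only; check them against the committer version you are running:

<committer class="com.norconex.committer.elasticsearch.ElasticsearchCommitter">
  <nodes>http://localhost:9200</nodes>
  <indexName>my-index</indexName>
  <typeName>my-type</typeName>
  <!-- A smaller batch keeps each bulk request, and the JSON string built for it in memory, small. -->
  <commitBatchSize>100</commitBatchSize>
  <queueSize>1000</queueSize>
</committer>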
OOM errors often cannot be recovered from and as such cannot be handled reliably. The application state of the JVM is already compromised the moment you get one, and killing the process and restarting it with more memory is usually the best approach.
Still, if you want to prevent hangs, the best option is likely the JVM trick of running a kill command (or equivalent) when the error occurs, using this option from the Oracle JVM documentation:
-XX:OnOutOfMemoryError="<cmd args>; <cmd args>"
As of Java 8u92, you can also use the JVM arguments -XX:+ExitOnOutOfMemoryError (exit the JVM on the first OutOfMemoryError) or -XX:+CrashOnOutOfMemoryError (produce a crash dump, then exit), described in the Oracle JVM options documentation.
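For example, assuming you launch the HTTP Collector directly with java rather than through the provided launch script, the command could look something like the following; the heap size, classpath, and configuration file name are placeholders, and you would pick whichever flag matches your JVM version:

# Restart with more heap and make sure the JVM dies on OutOfMemoryError
# rather than lingering in a compromised state.
java -Xmx4g \
     -XX:+ExitOnOutOfMemoryError \
     -cp "./lib/*:./classes" \
     com.norconex.collector.http.HttpCollector -a start -c my-crawler-config.xml

On JVMs older than 8u92, the -XX:OnOutOfMemoryError="kill -9 %p" form shown above serves the same purpose.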
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.