This repository has been archived by the owner on Aug 30, 2022. It is now read-only.

Investigate Heap Requirements for Smarti #147

Closed
westei opened this issue Nov 20, 2017 · 4 comments


westei commented Nov 20, 2017

When running with -Xmx4g a java.lang.OutOfMemoryError: Java heap space was encountered.

As this is unexpected, we need to further investigate the memory consumption of Smarti. This includes:

  • implementing a stress test utility
  • testing with high loads of conversations of different lengths and random messages
  • testing the memory footprint of different analysis configurations (especially NLP processing components and models)
  • investigating possible memory leaks

NOTE: marking this as an enhancement with the intention to create additional issues based on the investigation results.
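The load generation for such a stress test could be sketched roughly as follows (a minimal sketch only; the function names and all parameters here are hypothetical illustrations, not Smarti's actual test code — each generated conversation would then be sent to a running Smarti instance):

```python
import random
import string


def random_message(max_tokens=60):
    """A pseudo-random message; the token count varies so that long
    sentences (the suspected OOM trigger) are also generated."""
    tokens = random.randint(1, max_tokens)
    return " ".join(
        "".join(random.choices(string.ascii_lowercase, k=random.randint(2, 10)))
        for _ in range(tokens)
    )


def random_conversation(max_messages=50):
    """A conversation consisting of a random number of random messages."""
    return [random_message() for _ in range(random.randint(1, max_messages))]


def generate_load(num_conversations=1000):
    """Lazily yield conversations so the generator itself adds no heap
    pressure on the client side."""
    for _ in range(num_conversations):
        yield random_conversation()
```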

@westei westei added this to the v0.6.1 milestone Nov 20, 2017
@westei westei self-assigned this Nov 20, 2017
@ghost ghost assigned ja-fra Nov 20, 2017
westei added a commit that referenced this issue Nov 22, 2017
* added a configuration that allows configuring the executor service pool size for processing. The default is set to 2, as requested by #145

solves #147

* changed the configuration for the Stanford NLP processing to use the Shift Reduce Parser, as it has a lower memory footprint
* added nlp.stanfordnlp.de.parseMaxlen=40 to prevent parse tree generation for long sentences that could lead to OOM situations

NOTE: These changes depend on bug fixes in redlink-nlp and an update to Stanford NLP 3.8.0
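Taken together, the two properties named in this commit would land in Smarti's configuration roughly like this (a hedged sketch: the shift-reduce model path is assumed from the standard Stanford CoreNLP model distribution and is not stated in this issue):

```properties
# limit parse tree generation to sentences of at most 40 tokens (commit default)
nlp.stanfordnlp.de.parseMaxlen=40
# assumed path of the German Shift Reduce Parser model shipped with Stanford NLP
nlp.stanfordnlp.de.parseModel=edu/stanford/nlp/models/srparser/germanSR.ser.gz
```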

westei commented Nov 22, 2017

The investigation concluded that there are no memory leaks present. Even long processing runs of >1000 conversations with >5000 messages showed no increase in base memory.

The OOM errors could be traced back to messages containing long sentences (or other strings, e.g. ASCII art) that cause the Stanford NLP Parser to require huge amounts of memory.

Several solutions were tested, with the following results:

  1. First and foremost, it is necessary to limit the maximum number of tokens a sentence may have for the parser to process it. This can now be done via the nlp.stanfordnlp.de.parseMaxlen property. The default of 30 is good for a 4g Java heap and the current analysis configuration.
  2. Do NOT use the Factored Parser. While this is the default of Stanford NLP, it is by far the slowest and needs the most memory. For Smarti the default was set to the PCFG Parser (nlp.stanfordnlp.de.parseModel=edu/stanford/nlp/models/lexparser/germanPCFG.ser.gz). The Shift Reduce Parser is an alternative that is even faster and uses less memory.
  3. The number of processing threads is now configurable (as requested by #145, "Smarti should run with only 2 processing threads") and the default is now set to 2 (was 8). For every thread one should reserve about 500m of additional heap (so on an 8-core machine it is recommended to run Smarti with an 8g heap if 8 processing threads are configured).
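The heap sizing rule from point 3 can be sketched as a simple calculation. The 4g baseline for the default of 2 threads comes from this issue, but the exact split between base heap and per-thread share is an assumption made here for illustration:

```python
def recommended_heap_gb(processing_threads, base_gb=4.0,
                        per_thread_gb=0.5, default_threads=2):
    """Heuristic from this issue: 4g suffices for the default of 2
    processing threads; reserve roughly 500m of extra heap for every
    additional processing thread. The base/per-thread split is assumed."""
    extra_threads = max(0, processing_threads - default_threads)
    return base_gb + extra_threads * per_thread_gb
```

For 2 threads this yields 4g, matching the tested -Xmx4g; for 8 threads it yields 7g, which the recommendation above rounds up to 8g, presumably to leave some headroom.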

With this configuration no OOM errors were encountered. Only when processing ASCII art did the PCFG Parser drive the system to its limits, causing a lot of GC overhead. After processing, the system recovered without problems and continued normally.

NOTE: These changes require the newest SNAPSHOT version of redlink-nlp. As this also updates to Stanford NLP 3.8.0 (was 3.6.0), Smarti users will need to update the Stanford NLP jars in the ext folder accordingly (see also the corresponding changes to dist/src/main/resources/plugin-info.txt).


ruKurz commented Dec 8, 2017

At the very least it is cool that we now have knowledge at this depth.


ruKurz commented Dec 15, 2017

@westei Can you please write a short doc on how you ran the stress test for the resource consumption behavior of Smarti?


westei commented Dec 28, 2017

created #179 for the documentation

@westei westei closed this as completed Dec 28, 2017
@ghost ghost removed the ready label Dec 28, 2017