
json.facet memory problem #42

Open
esameto opened this issue Apr 7, 2015 · 3 comments

Comments


esameto commented Apr 7, 2015

I use json.facet to build nested facet statistics. However, during a load test with such queries I found that the memory consumed by Solr grows at a huge rate, up to about 24 gigabytes. Here is a sample query:

http://10.68.20.139:5080/solr/reports_core2/select?q={!cache=false}_2110_EXACT_PARTS:[1 TO *]&rows=0&json.facet=
{MAN_NAMEANDSRATING:
{terms:{field:MAN_NAME,limit:-1,mincount:1,facet:
{SRATING:{terms:{field:SRATING,limit:-1,mincount:1,facet:{sum:'sum(field(_2110_EXACT_PARTS))'}}}}
}}}&
json.facet={MAN_NAMEANDRS:{terms:{field:MAN_NAME,limit:-1,mincount:1,
facet:{RS:{terms:{field:RS,limit:-1,mincount:1,
facet:{sum:'sum(field(_2110_EXACT_PARTS))'}}}}}}}&
json.facet={MAN_NAMEANDRGRADE:{terms:{field:MAN_NAME,limit:-1,mincount:1,
facet:{RGRADE:{terms:{field:RGRADE,limit:-1,mincount:1,facet:{sum:'sum(field(_2110_EXACT_PARTS))'}}}}}}}
&json.facet={MAN_NAMEANDLC_STATE:{terms:{field:MAN_NAME,limit:-1,mincount:1,facet:{LC_STATE:{terms:{field:LC_STATE,limit:-1,mincount:1,
facet:{sum:'sum(field(_2110_EXACT_PARTS))'}}}}}}}
&json.facet={MAN_NAMEANDP_RANGE:{terms:{field:MAN_NAME,limit:-1,mincount:1,
facet:{P_RANGE:{terms:{field:P_RANGE,limit:-1,mincount:1,facet:{sum:'sum(field(_2110_EXACT_PARTS))'}}}}}}}
&json.facet={MAN_NAMEANDcm_STATUS:{terms:{field:MAN_NAME,limit:-1,mincount:1,facet:{cm_STATUS:
{terms:{field:cm_STATUS,limit:-1,mincount:1,facet:{sum:'sum(field(_2110_EXACT_PARTS))'}}}}}}}&
json.facet={MAN_NAMEANDYEL_RANGE:{terms:{field:MAN_NAME,limit:-1,mincount:1,facet:{YEL_RANGE:
{terms:{field:YEL_RANGE,limit:-1,mincount:1,facet:{sum:'sum(field(_2110_EXACT_PARTS))'}}}}}}}&facet=true

@elyograg
Contributor

elyograg commented Apr 7, 2015

Just to be clear ... what are you looking at to determine that this is a problem? I just want to be sure that you are taking Java's garbage collection memory model into account. It is completely normal for a Java program to consume all allocated heap memory, at which point it will do a garbage collection to free up memory from objects that are no longer in use.

Extensive faceting on an index with a large number of documents will allocate very large amounts of heap memory, especially on Solr 4.x if docValues are not used and facet.method is left at its default. You can reduce memory requirements by using facet.method=enum, or by turning on docValues for all fields you will facet on and then doing a complete reindex.
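For reference, enabling docValues for the facet fields would look something like this in schema.xml (the field and type names here are assumptions based on the query above, not the actual schema):

```xml
<!-- schema.xml: docValues stores facet data in a memory-mapped,
     off-heap-friendly structure instead of the on-heap FieldCache.
     A complete reindex is required after this change. -->
<field name="MAN_NAME" type="string" indexed="true" stored="true" docValues="true"/>
<field name="SRATING"  type="string" indexed="true" stored="true" docValues="true"/>
<field name="RS"       type="string" indexed="true" stored="true" docValues="true"/>
```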

It is always possible that there is a true memory leak, but at the moment, we are not aware of any.

@esameto
Author

esameto commented Apr 9, 2015

I noticed something that may be helpful. I run Heliosearch on Linux under the Tomcat web server, so I checked the memory consumed by Tomcat with the top command, which shows Tomcat consuming about 24 GB. However, when I opened the Solr home page, it reported JVM memory of only 4 GB, while physical memory was about 30 GB. That means the memory consumed by Solr is taken from direct system memory, not from the JVM heap, so I think this memory would not be available to the garbage collector.
I also read about the off-heap cache feature of Heliosearch, which moves its caching off the JVM heap into explicitly managed memory. Off-heap memory is invisible to the garbage collector.
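A quick way to see this split for yourself (a sketch; the shell's own PID stands in for the real Tomcat PID) is to compare the resident memory the OS reports with the heap the JVM reports:

```shell
# Substitute the Tomcat PID here; $$ (this shell) is only a stand-in.
PID=$$

# Resident set size (RSS) from the OS, in kilobytes
ps -o rss= -p "$PID"

# For a JVM process, jstat shows heap capacities and usage in kilobytes.
# A large gap between RSS and total heap size points at off-heap
# (direct/native) allocations, which the garbage collector never reclaims.
# jstat -gc "$PID"
```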

@argakon

argakon commented Apr 10, 2015

First, check the field cache sizes in solrconfig.xml.
Read about the Xms, Xmx, XX:MaxPermSize, and Xss Java options, for example here: http://www.mkyong.com/java/find-out-your-java-heap-memory-size/

You need to work out suitable values for these options for your server. For Tomcat (on CentOS) you can set them in JAVA_OPTS in /etc/sysconfig/tomcat.
I also use the G1 garbage collector (Java 8u40).

My settings are:
-XX:+UseG1GC -XX:+ParallelRefProcEnabled -XX:G1HeapRegionSize=8m -XX:MaxGCPauseMillis=200 -XX:+UseLargePages -XX:+AggressiveOpts -Xms1024M -Xmx8192M -XX:MaxPermSize=256M -Xss512K
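For example, on CentOS those flags could go into /etc/sysconfig/tomcat roughly like this (a sketch; the heap sizes are my values and should be tuned for your server):

```shell
# /etc/sysconfig/tomcat
JAVA_OPTS="-XX:+UseG1GC -XX:+ParallelRefProcEnabled -XX:G1HeapRegionSize=8m \
 -XX:MaxGCPauseMillis=200 -XX:+UseLargePages -XX:+AggressiveOpts \
 -Xms1024M -Xmx8192M -XX:MaxPermSize=256M -Xss512K"
```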

My configuration is:
2 Solr cores, 6,921,061 + 3,835,948 docs in total
Memory on server: 64 GB
Solr RES in top stable at 20.2 GB
