RAM consumption much higher than expected #128

Open
jakubklimek opened this Issue Jan 31, 2014 · 17 comments


@jakubklimek

VOS7 RAM consumption is much higher than expected given the NumberOfBuffers and MaxDirtyBuffers values recommended in virtuoso.ini for the amount of free RAM.
[screenshot: memory consumption]

Virtuoso configured for 8 GB of RAM consumes 32 GB of RAM (or even more; the growth seems unbounded), so it swaps heavily and crashes once swap is depleted. This is with 680000 buffers reported by status();
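For context, the sizing table in the Virtuoso performance-tuning guide works out to roughly 85,000 buffers per GB of RAM dedicated to Virtuoso (each buffer caches one 8 KB page plus bookkeeping overhead), with MaxDirtyBuffers at about three quarters of NumberOfBuffers. A small sketch of that rule of thumb (the function name is mine; the published table rounds some values slightly differently):

```python
# Rule-of-thumb sizing from the Virtuoso performance-tuning guide:
# ~85,000 buffers per GB of RAM given to Virtuoso, and MaxDirtyBuffers
# at roughly 3/4 of NumberOfBuffers. Illustrative sketch only.
def suggested_buffers(ram_gb: int) -> tuple[int, int]:
    number_of_buffers = ram_gb * 85_000
    max_dirty_buffers = number_of_buffers * 3 // 4
    return number_of_buffers, max_dirty_buffers

print(suggested_buffers(8))   # -> (680000, 510000)
print(suggested_buffers(16))  # -> (1360000, 1020000)
```

So a server tuned for 8 GB should sit near 680,000 buffers, which matches the status() figure above; resident memory several times that size is what this issue is about.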

@indeyets

Same here. This is pretty much the reason why we stopped using 7.x and went back to 6.1 for now.

@HughWilliams
Collaborator

Hi Jakub,

What version are you using (virtuoso-t -?)? I would suggest testing against the latest develop/7 archive, which was last updated over the weekend, as it contains a number of memory-related fixes over the stable/7 branch and earlier develop/7 branches.

At what point is the 32 GB of RAM consumed: on startup, or over time as the server runs?

What error messages, if any, related to Virtuoso are reported in the virtuoso.log file and/or /var/log/messages?

@jakubklimek

Hi Hugh,
This was 3b047ec, which is Version 7.0.1-dev.3207-pthreads; I will try again with the newest version.

The setup is as follows:

  • 600M triples in the DB; virtuoso.db is about 30 GB. Basically, it is lots of entities of approximately 10 types (RDF classes) in total. Maximum consumption is reached when querying the dataset in a way that accesses all the entities, i.e. fetching all entities of the first type, then the second, and so on. Virtuoso never seems to free the memory it gets.

The log contains messages like
11:17:20 * Monitor: Low query memory limit, try to increase MaxQueryMem
MaxQueryMem is set to 1G.

When it crashes (out of memory, I suppose), the log looks like this:

23:10:20 mmap failed with 12
23:10:20 mmap failed with 12
23:10:20 mmap failed with 12
23:10:20 mmap failed with 12
23:10:20 mmap failed with 12
23:10:20 /usr/local/bin/virtuoso-t() [0x9078aa]
23:10:20 /usr/local/bin/virtuoso-t() [0x907916]
23:10:20 /usr/local/bin/virtuoso-t() [0x905ce2]
23:10:20 /usr/local/bin/virtuoso-t() [0x905da0]
23:10:20 /usr/local/bin/virtuoso-t() [0x905ec8]
23:10:20 /usr/local/bin/virtuoso-t() [0x906091]
23:10:20 /usr/local/bin/virtuoso-t() [0x90781d]
23:10:20 /usr/local/bin/virtuoso-t() [0x8677c4]
23:10:20 /usr/local/bin/virtuoso-t() [0x664ea8]
23:10:20 /usr/local/bin/virtuoso-t() [0x66549a]
23:10:20 /usr/local/bin/virtuoso-t() [0x60d935]
23:10:20 GPF: Dkpool.c:1594 could not allocate memory with mmap
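Aside: the "mmap failed with 12" lines refer to errno 12. A quick way to decode the code (Python used here purely for illustration):

```python
import errno
import os

# Decode errno 12, the code reported by the failing mmap calls above.
code = 12
print(errno.errorcode[code])  # -> 'ENOMEM' on Linux
print(os.strerror(code))      # human-readable message, e.g. "Cannot allocate memory"
```

ENOMEM means the process simply could not get more memory from the kernel, consistent with the GPF in Dkpool.c.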

@jakubklimek

Retested with 58a8450 and it still happens.
Now on another instance with 16 GB RAM:
NumberOfBuffers = 1360000
MaxDirtyBuffers = 1000000

After issuing a simple query:

Virtuoso 22023 Error SR...: The result vector is too large

SPARQL query:
prefix ruian: <http://ruian.linked.opendata.cz/ontology/>
construct {?s ?p ?o} where {?s a ruian:AdresniMisto; ?p ?o.}

[screenshot: memory consumption after the first query]

After issuing another simple query:


prefix ruian: <http://ruian.linked.opendata.cz/ontology/>

construct {?s ?p ?o} where {?s a ruian:Parcela; ?p ?o.}

Memory consumption went even higher.
[screenshot: memory consumption after the second query]

@openlink openlink assigned imitko and iv-an-ru and unassigned imitko and iv-an-ru Feb 12, 2014
@imitko
Collaborator

Did you try changing the vector size in the INI file? For example:
[Parameters]
MaxQueryMem = 2G
VectorSize = 1000

@jakubklimek

I tried:
MaxQueryMem = 1G
MaxVectorSize = 4000000
VectorSize = 1000

@jakubklimek

Still happening with 32bc49a.
[screenshot: memory consumption]
This happened by simply issuing a SPARQL CLEAR GRAPH query on the one main graph in the database, which contains 624M triples...

@HughWilliams
Collaborator

Did you set log_enable(3) to turn off transaction logging, which would otherwise consume large amounts of system memory, as detailed at:

http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VirtTipsAndTricksGuideDeleteLargeGraphs
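For the record, the approach described on that page boils down to something like the following isql session (a sketch based on the linked guide; the graph IRI is a placeholder):

```sql
-- Disable transaction logging and enable row-by-row autocommit, so the
-- delete is not accumulated in one huge in-memory transaction; the second
-- argument (1) means "do not signal an error if the mode is already set".
log_enable(3,1);

SPARQL CLEAR GRAPH <http://example.org/your-graph>;

-- Persist the changes and truncate the transaction log.
checkpoint;
```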

@andyjenkinson

Just wanted to provide another data point here: we also see this issue. For example, one of our instances is currently using 31 GB of resident memory despite being configured for 1,000,000 buffers. It holds 174,156,700 triples, and the DB on disk is only 4.8 GB. It is a completely read-only server; there are never any write operations on it. Here's the status() output:

OpenLink Virtuoso Server
Version 07.10.3207-pthreads for Linux as of Apr 8 2014
Started on: 2015-02-01 23:15 GMT+0

Database Status:
File size 0, 626432 pages, 178296 free.
1000000 buffers, 319350 used, 2 dirty 0 wired down, repl age 9265449 0 w. io 0 w/crsr.
Disk Usage: 334148 reads avg 0 msec, 0% r 0% w last 317612 s, 13394 writes flush 12.64 MB,
1338 read ahead, batch = 247. Autocompact 439 in 353 out, 19% saved.
Gate: 3776 2nd in reads, 0 gate write waits, 0 in while read 0 busy scrap.
Log = /indexed/virtuoso-opensource/biomodels/prod/28/virtuoso.trx, 2561 bytes
448085 pages have been changed since last backup (in checkpoint state)
Current backup timestamp: 0x0000-0x00-0x00
Last backup date: unknown
Clients: 41 connects, max 37 concurrent
RPC: 15261 calls, -5297 pending, 1 max until now, 0 queued, 6 burst reads (0%), 1 second 0M large, 344M max
Checkpoint Remap 0 pages, 0 mapped back. 21 s atomic time.
DB master 626432 total 178296 free 0 remap 0 mapped back
temp 491776 total 491770 free

Lock Status: 0 deadlocks of which 0 2r1w, 124 waits,
Currently 1 threads running 0 threads waiting 0 threads in vdb.

@HughWilliams
Collaborator

@andyjenkinson: Your build is a 07.10.3207 build from Apr 8 2014. There have been a number of memory-fragmentation fixes and a new memory allocator in the latest git develop/7 branch, which is 07.20.3212 and about to become the stable/7 branch, so I would suggest testing with that ...

@jakubklimek

@HughWilliams Actually, the latest develop/7 is still:
Version 7.2.0-rc2.3211-pthreads as of Feb 3 2015

@HughWilliams
Collaborator

@jakubklimek: Yes, but it was 7.1.0-rc2.3211 last week ...

@andyjenkinson

An update to this: we have upgraded to 7.2.0.1 and the issue remains:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                                                            
24358 rdf_adm   20   0 54.1g  51g 3876 S  0.0 40.7 366:19.45 virtuoso-t
OpenLink Virtuoso  Server
Version 07.20.3212-pthreads for Linux as of Mar 11 2015 
Started on: 2015-04-16 09:01 GMT+1

Database Status:
  File size 0, 1018368 pages, 370428 free.
  1000000 buffers, 80957 used, 2 dirty 0 wired down, repl age 272918 0 w. io 0 w/crsr.
  Disk Usage: 43091634 reads avg 0 msec, 0% r 0% w last  438758 s, 3016620 writes flush      4.988 MB,
    18673 read ahead, batch = 39.  Autocompact 0 in 0 out, 0% saved.
Gate:  14046 2nd in reads, 0 gate write waits, 0 in while read 0 busy scrap. 
Log = /indexed/virtuoso-opensource/biosamples/prod/v20150403/virtuoso.trx, 6803 bytes
647880 pages have been changed since last backup (in checkpoint state)
Current backup timestamp: 0x0000-0x00-0x00
Last backup date: unknown
Clients: 3027 connects, max 3 concurrent
RPC: 21743 calls, -10330 pending, 1 max until now, 0 queued, 2 burst reads (0%), 8 second 0M large, 14M max
Checkpoint Remap 0 pages, 0 mapped back. 17 s atomic time.
    DB master 1018368 total 370428 free 0 remap 0 mapped back
   temp  1633280 total 1633274 free

Lock Status: 0 deadlocks of which 0 2r1w, 401 waits,
   Currently 1 threads running 0 threads waiting 0 threads in vdb.
@HughWilliams
Collaborator

@andyjenkinson: Can you please use the following script to monitor the server's memory consumption, so we can see the rate of memory growth:
http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VirtMonitorMemoryConsumption
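If the linked script is not to hand, a minimal Linux-only sketch that polls a process's resident set size from /proc (the function names here are mine, not OpenLink's):

```python
import os
import time
from pathlib import Path

def rss_kb(pid: int) -> int:
    """Return the VmRSS (resident set size, in kB) of a process,
    read from /proc/<pid>/status (Linux only)."""
    for line in Path(f"/proc/{pid}/status").read_text().splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return 0

def watch(pid: int, interval: float = 60.0, samples: int = 60) -> None:
    """Print one timestamped RSS sample per interval, e.g. for graphing."""
    for _ in range(samples):
        print(f"{time.strftime('%Y-%m-%d %H:%M:%S')} rss_kb={rss_kb(pid)}")
        time.sleep(interval)

# Example: sample this process once (a Virtuoso PID would go here instead).
print(rss_kb(os.getpid()))
```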

I assume it is the ChEMBL datasets you have loaded; if so, are they available for public download so that we can load them locally? Then, assuming your queries mostly target the SPARQL endpoint, you can create an HTTP recording of the activity being performed on the server and provide it to us; we should be able to replay it locally to recreate the issue:

http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VirtTipsAndTricksRecording#HTTP%20Recording

@andyjenkinson

Hi @HughWilliams, thanks; I set up monitoring yesterday.

ChEMBL is one of the datasets; the one that caused problems most recently was the BioSamples dataset, but from memory I don't recall any single dataset being consistently to blame. They are all in separate instances, and yes, they are all available for download.

We don't connect to Virtuoso via HTTP; we use the JDBC and Jena providers. However, the vast majority of queries originate from HTTP requests and are already logged by the front end. All queries are read operations.

@HughWilliams
Collaborator

@andyjenkinson: OK. If you choose one of the instances exhibiting this behaviour, please provide a copy of it (ideally) or the datasets to load, and we can set up a local instance. Since you are querying via the Jena provider, an HTTP recording would not be of use; instead, please create a Virtuoso query log from the start of the Virtuoso instance to the point of failure, as detailed at:

http://docs.openlinksw.com/virtuoso/databaseadmsrv.html#querylogging

Then we should be able to simulate this behaviour against our local copy of your database to force the condition to occur in-house, in which case it can hopefully be resolved by development ...
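Per the linked documentation, query logging is enabled in virtuoso.ini (the log file name below is illustrative):

```ini
[Parameters]
; Record executed queries for later analysis/replay
QueryLog = virtuoso.qrl
```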

@varh1i

We are also experiencing a memory leak with "Version 07.20.3212-pthreads for Linux as of Mar 25 2015".

I found two parameters, ThreadCleanupInterval and ResourcesCleanupInterval, with the description "Set both to 1 in order to reduce memory leaking.", but the leak still happens: http://docs.openlinksw.com/virtuoso/databaseadmsrv.html#ex_threadcleanupinterval
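For anyone trying the same settings, they belong in the [Parameters] section of virtuoso.ini:

```ini
[Parameters]
; Per the linked documentation, set both to 1 to reduce memory leaking
ThreadCleanupInterval    = 1
ResourcesCleanupInterval = 1
```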

Have you found any solution for this, or is there any other configuration parameter we can try?
