Ram usage insanity. #10

Closed
linas opened this issue Mar 6, 2021 · 4 comments

linas commented Mar 6, 2021

While learning a tiny grammar, RAM usage by RocksDB exploded to 90GBytes. This is insane: it should be no more than a few GBytes for this workload. That is about 40x more RAM than expected. The 40x figure matches the one in issue #9, so it might be curable in the same way.

linas commented Apr 12, 2021

See this comment: facebook/rocksdb#3216 (comment). It appears that RAM usage goes as 3x disk usage, and that almost all disk usage is in sst files, which get compacted down when the DB is closed and reopened. The large sst files appear to contain logs of changes to the incoming set; these changes cause vast numbers of log writes, blowing up disk usage. Redesigning the incoming-set storage appears to avoid this issue (the redesign is in eedfbc7).

Limiting the number of files that rocks keeps open, by setting max_open_files to 300, seems to cap the number of sst files at 172, the number of file descriptors at 172, and RAM usage at 20GB.

linas commented Apr 12, 2021

With the all-new, all-improved code base, after a run of shapes+disjuncts classification: a total of 1108125 atoms; guile has a 4.8GB heap, of which all but 20MB is free. Total RAM use is 42GB.

Rocks stats -- expt-17/gram-3.rdb
6.9GB disk, before compaction (du -s gram-3.rdb)
100 sst files (ls -la gram-3.rdb/*sst | wc)
106 open DB files (lsof -p pid | grep sst | wc)
=== Then close the database (but not guile)
37 open DB file descriptors!!
100 sst files, as before
6.8GB disk use
Still 42GB RAM use
=== Re-open the database
Still 42GB RAM use
519 MBytes DB disk usage (compaction runs on re-open, not close)
13 sst files total (after compaction)
46 open file descriptors
=== Close DB again
37 open DB file descriptors, as before - these have been leaked.
13 sst files, unchanged
515 MBytes disk use, unchanged

So it seems that 37 sst file descriptors have been leaked. All of them point to sst files that have been deleted: lsof -p pid |grep sst shows file names that aren't there any more.
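The lsof check above can also be scripted. What follows is only a minimal sketch, assuming a Linux /proc filesystem, where the link target of an unlinked but still-open file ends in " (deleted)". The helper names `count_sst_fds` and `fd_targets` are made up for illustration and are not part of any opencog or RocksDB tool.

```python
import os
from typing import Iterable, List, Tuple

DELETED_SUFFIX = " (deleted)"


def count_sst_fds(targets: Iterable[str]) -> Tuple[int, int]:
    """Return (open_sst, deleted_sst) for a list of fd link targets."""
    open_sst = 0
    deleted_sst = 0
    for target in targets:
        # Linux marks an unlinked-but-open file by appending " (deleted)".
        deleted = target.endswith(DELETED_SUFFIX)
        path = target[: -len(DELETED_SUFFIX)] if deleted else target
        if path.endswith(".sst"):
            open_sst += 1
            if deleted:
                deleted_sst += 1
    return open_sst, deleted_sst


def fd_targets(pid: int) -> List[str]:
    """Read the symlink targets under /proc/<pid>/fd (Linux only)."""
    fd_dir = f"/proc/{pid}/fd"
    targets = []
    for name in os.listdir(fd_dir):
        try:
            targets.append(os.readlink(os.path.join(fd_dir, name)))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return targets
```

Calling `count_sst_fds(fd_targets(pid))` for the guile process should give the same two numbers as `lsof -p pid | grep sst | wc` and `lsof -p pid | grep sst | grep deleted | wc`. A nonzero second number after the database is closed is the symptom described here.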

So 42GB and 37 leaked file descriptors is a vast improvement over the 180GB plus corruption seen before, but it is still not acceptable. This is with Ubuntu focal 20.04 and librocksdb-dev/focal,now 5.17.2-3 amd64 [installed]

Exiting and restarting shows 5GB RAM use, so that is probably mostly atomspace RAM, and not rocks RAM. That implies that, of the 42GB above, 32GB was leaked by rocks. Yow!

linas commented Apr 13, 2021

Retest, using rocksdb version 6.19.0 compiled from github source bb75092574532c5629c27dcd99fe55f5514af48c

It appears that even the latest version is leaking file descs:

RAM usage after computing

(define csc (direct-sum psa wss))  ; pseudo-cset plus shapes.
(define btc (batch-transpose csc))
(btc 'mmt-marginals)

ls -la shape.rdb/*sst | wc # 10
lsof -p 18889 | grep sst |wc # 10
du -s shape.rdb # 801 MB
ps aux |grep guile # 7.36GB virt 4.85 GB rss

(cog-close storage-node)
no change to RAM
du -s shape.rdb # 719 MB so some decrease
ls -la shape.rdb/*sst | wc # 10 no change
lsof -p 18889 | grep sst |wc # 3 so it's still leaking file descriptors

(cog-open storage-node)
No change to RAM
du -s shape.rdb # 514 MB
ls -la shape.rdb/*sst | wc # 8
lsof -p 18889 | grep sst |wc # 11 of which 3 are marked "deleted"

stop guile, start guile, stop guile:
no change to storage
ls -la shape.rdb/*sst | wc # 9

======= Now do classification

(gram-classify-greedy-discrim csc 0.25 4)
ps aux |grep guile # 37.8GB virt 35.2GB rss
lsof -p 362 | grep sst |wc # 81
ls -la gram-1-junk.rdb/*sst |wc # 81
du -s gram-1-junk.rdb # 5.8 GB

(cog-close storage-node)
du -s gram-1-junk.rdb # 5.7 GB
lsof -p 362 | grep sst |wc # 74 Yow! that's a big leak!

(cog-open storage-node)
du -s gram-1-junk.rdb # 606 MB - big shrink
lsof -p 362 | grep sst |wc # 91
ls -la gram-1-junk.rdb/*sst |wc # 17
lsof -p 362 | grep sst |grep deleted |wc # 67

Yikes! this time it leaks 74 file descriptors! Ouch!

linas commented Apr 13, 2021

Fixed. The code was leaking iterators. Fixed in commit 10fb460.

Log:
df: 708513296 before starting gram-classify.
lsof -p 25951 | grep sst |wc # 9
ls -la gram-2-junk.rdb/*sst |wc # 9
du -s gram-2-junk.rdb # 514 MB

after finishing:
du -s gram-2-junk.rdb # 1.01 GB
ps aux |grep guile # 8.5 GB virt 5.6GB rss
ls -la gram-2-junk.rdb/*sst |wc # 16
lsof -p 25951 | grep sst |wc # 16

after cog-close:
lsof -p 25951 | grep sst |wc # zero!
ls -la gram-2-junk.rdb/*sst |wc # 16
du -s gram-2-junk.rdb # 927 MB

after cog-open:
du -s gram-2-junk.rdb # 1.24 GB
lsof -p 25951 | grep sst |wc # 17
ls -la gram-2-junk.rdb/*sst |wc # 17

after cog-close:
du -s gram-2-junk.rdb # 602 MB
df # 708601988

after exiting guile:
df # 708601988 -- no change
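
Why a leaked iterator shows up as a leaked file descriptor: as far as I understand it, a RocksDB iterator pins the set of sst files that was live when the iterator was created. Compaction can then unlink those files, but they stay open until the iterator is destroyed. The toy model below is not RocksDB code and not the fix in 10fb460; `ToyDB`, `ToyIterator` and all of their methods are hypothetical names, and the model only illustrates the pinning behaviour under that assumption.

```python
class ToyIterator:
    """Pins the set of files that was live when the iterator was created."""

    def __init__(self, db):
        self._db = db
        self.pinned = frozenset(db.live)
        db.iterators.add(self)

    def close(self):
        # Releasing the iterator is the step that leaky code skips.
        self._db.iterators.discard(self)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False


class ToyDB:
    def __init__(self, files):
        self.live = set(files)   # files in the current version of the DB
        self.iterators = set()   # iterators that have not been released

    def compact(self, merged_name):
        """Replace all live files with a single merged file."""
        self.live = {merged_name}

    def open_files(self):
        """Files still held open: live ones plus any pinned by an iterator."""
        held = set(self.live)
        for it in self.iterators:
            held |= it.pinned
        return held

    def leaked(self):
        """Held-open files that are no longer part of the DB ("deleted")."""
        return self.open_files() - self.live


if __name__ == "__main__":
    db = ToyDB(["1.sst", "2.sst", "3.sst"])
    it = ToyIterator(db)         # never closed: this is the leak
    db.compact("4.sst")
    print(sorted(db.leaked()))   # the three old files are still held open
    it.close()
    print(sorted(db.leaked()))   # nothing is pinned any more
```

In this model, wrapping every iterator in a `with` block (or an equivalent scope guard in C++) makes the release automatic, which matches the log above: after cog-close, lsof reports zero sst descriptors.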

linas closed this as completed Apr 13, 2021