Reducing memory footprint in StreamIndexedIO #228

ldmoser · 2014-02-26T01:43:16Z

As scenes get heavier and deeper, the memory overhead of the data structures holding the hierarchy of the internal nodes in the StreamIndexedIO becomes significant. The structure is a tree of StreamIndexedIO::Node classes (104 bytes* each + contents of children map ) with DataNodes (56 bytes*) where data is saved.

With the following commits I've maintained the file format as is, and also binary compatibility (please review my third commit on that). And from my tests, performance was not affected. The memory consumption reduction varied in my tests from %20 to %70:

Heavy asset (airBeam) 30% memory reduction
Heavy asset with lots of tags 20% reduction
Animated asset (droid) 48% reduction
Linked scene (crowd) 70% reduction

The idea behind these changes were to have more Node classes that are specialized on particular cases and therefore don't need as many bytes. In order to do that, my first step was to change the StreamIndexedIO::Node class to behave as "private MemberData" class (kept the name for binary compatibility), that holds the pointer to the underlying Directory node and another pointer to the shared Index.

Then I've created this new class hierarchy:
NodeBase (16 bytes_)
|- DataNode ( 40 bytes_ )
|- SmallDataNode ( 24 bytes* )
|- DirectoryNode (56 bytes* )
|- SubIndexNode ( 24 bytes* )

sizes measures in 64bit machine.

…nter to the global Index and to the low level DirectoryNode. Adding a specific class for DirectoryNode that holds the information loaded from the index. Adding a member m_nodeType in NodeBase to distinguish derived classes without adding virtual methods.

…ield. Using char instead of the enums in the structs to compact members in one word. Original struct sizes were StreamIndexedIO::Node 104 bytes and DataNode was 56. Current struct sizes are DirectoryNode 56 and DataNode 40 bytes.

This does not affect the class size and should not break binary compatibility. As a consequence the Node class reduced from 32 bytes to 16 bytes (not derived from RefCounted anymore).

… data is small enough and avoid the big DataNode struct (40 bytes).

… SubIndexNode (24 bytes). This helps on the memory requirements for opening a file with a large amount of stored Objects. In particular, large LinkedScenes.

… for the purposes of filtering child entries by type.

johnhaddon · 2014-02-26T16:11:54Z

Hey Lucio - impressive reductions!

Unfortunately I'm seeing a performance regression when testing some multithreaded access in Gaffer. Here's the script I'm using to test, along with some measurements I took with it in the comments at the bottom. Could you see if you can reproduce that there?

import time

import IECore

import Gaffer
import GafferScene
import GafferSceneTest

r = GafferScene.SceneReader()
r["fileName"].setValue( "/Users/john/Downloads/bigBuckBunny_001.scc" )

t = time.time()
ti = IECore.Timer()

GafferSceneTest.traverseScene( r["out"], Gaffer.Context() )

print time.time() - t, ti.stop()

#########################
# master
#########################
#
#   0.92    1.64
#   0.83    1.54
#   0.84    1.55
#   0.82    1.53
#
#########################
# streamIndexedIOrawNodes
#########################
#
#   0.89    1.86
#   0.85    1.81
#   0.90    1.89
#   0.89    1.87
#   0.867   1.83
#
#########################

I've only taken a quick look at the actual code so far - perhaps you could look into this first and then I'll take a more detailed look?

johnhaddon · 2014-02-26T16:13:25Z

You can get the bunny here if you don't already have him :

http://imageengine.github.io/gaffer/resources/caches/bigBuckBunny_001.scc.gz

It'd be interesting to try the same with some production assets...

…he hierarchy of Nodes. Added a boolean in each Directory node that will be true only if there's a child that is SubIndex and only uses mutex on these locations. Also, avoids keeping the mutex locked while reading the subindex or during deallocation of duplicated Directories.

ldmoser · 2014-02-28T00:02:24Z

Good one John!
My numbers were even worse here:
streamIndexedIOrawNodes
0.865702152252 3.58

With the latest change they are in pair with the current trunk
0.79429101944 1.85

Could you double check on your end?

ldmoser · 2014-02-28T02:58:08Z

And here some production examples:

hugeLayout: LSCC 7Mb (plus sub-layouts)
[ new ]
18.2563018799 128.21
17.7411010265 130.04
17.9823520184 129.19
[ old ]
18.4420411587 133.44
19.8850400448 136.13
17.6744019985 129.06

animated asset: SCC 8Gb
[ new ]
1.1167318821 2.06
0.952558040619 2.04
0.965826034546 2.02
[ old ]
1.0604159832 2.19
1.05946683884 2.21
1.05524492264 2.19

huge geo: SCC 104Gb
[new]
0.0319299697876 0.05
0.0220229625702 0.04
0.0236361026764 0.05
[old]
0.02747797966 0.04
0.025918006897 0.04
0.0246629714966 0.04

johnhaddon · 2014-02-28T15:41:33Z

That's an improvement for me too - still not quite as quick as master but within a % or two. Are your [old] measurements from master or from this branch before your latest fix? Any idea how the huge geo is so quick in comparison to the others?

johnhaddon · 2014-02-28T15:58:48Z

include/IECore/StreamIndexedIO.h

I think we should definitely do this before merging - the whole Node vs NodeBase discrepancy is quite confusing.

johnhaddon · 2014-02-28T16:34:04Z

I might be wrong as I haven't looked at everywhere they're used, but generally I feel that all the Node classes could do with encapsulating their data a bit more. I don't really understand the codebase fully, but it seems that a lot of the members are hanging out in public as non-const, whereas my guess in many cases is that they aren't intended to change after construction, or only in specific circumstances. Do you think it'd be worth tightening this up a little to reassure the naive onlooker that they aren't going to change and to have the compiler do a bit more checking for us?

johnhaddon · 2014-02-28T16:37:23Z

This is of no use to us now, but it's good to know that C++11 allows the size of an enum to be controlled - that'd be handy for those spots where we've been forced to lose type safety and store them in a char :

http://www.cprogramming.com/c++11/c++11-nullptr-strongly-typed-enum-class.html

I wonder when our 3rd party dependencies will allow us to move to a modern enough compiler to use this stuff?

Adding some comments to explain better the rationale behind them.

ldmoser · 2014-03-06T01:21:50Z

I believe I've addressed most of your notes John. I think it's way clearer for the naive onlooker.
Decided to not rename the Node class because it affects the virtual function duplicate and the constructor.
About your questions, my [old] times are relative to the master. And the "huge geo" file has a very small index structure that points to massive animated geometry. So it makes sense being quick to open and traverse.

johnhaddon · 2014-03-06T17:20:20Z

Thanks Lucio! Leaving the Node class rename till later seems like the right thing to do for now - sorry I hadn't noticed that it was exposed via methods in the API.

Reducing memory footprint in StreamIndexedIO

Lucio Moser added 8 commits February 25, 2014 15:35

Changing StreamIndexedIO member from intrusive pointer to raw pointer.

f96ed26

This does not affect the class size and should not break binary compatibility. As a consequence the Node class reduced from 32 bytes to 16 bytes (not derived from RefCounted anymore).

Fixing bug by checking if pointer is valid. Now all tests passing.

0ceb10f

Adding SmallDataNode class (24 bytes long on 64bit machines) when the…

52a57b3

… data is small enough and avoid the big DataNode struct (40 bytes).

Updating comment.

e955831

Adding a compact Node class for representing non-expanded subIndexes:…

9303ae5

… SubIndexNode (24 bytes). This helps on the memory requirements for opening a file with a large amount of stored Objects. In particular, large LinkedScenes.

Bug fix: considering the new SubIndexNode a IndexedIO::Directory node…

d6edca6

… for the purposes of filtering child entries by type.

Removing conditional on delete.

96dae62

johnhaddon reviewed Feb 28, 2014
View reviewed changes

Protecing member variables from the Node classes.

46a19c6

Adding some comments to explain better the rationale behind them.

johnhaddon added a commit that referenced this pull request Mar 6, 2014

Merge pull request #228 from ldmoser/streamIndexedIOrawNodes

0e109f4

Reducing memory footprint in StreamIndexedIO

johnhaddon merged commit 0e109f4 into ImageEngine:master Mar 6, 2014

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Reducing memory footprint in StreamIndexedIO #228

Reducing memory footprint in StreamIndexedIO #228

Uh oh!

ldmoser commented Feb 26, 2014

Uh oh!

johnhaddon commented Feb 26, 2014

Uh oh!

johnhaddon commented Feb 26, 2014

Uh oh!

ldmoser commented Feb 28, 2014

Uh oh!

ldmoser commented Feb 28, 2014

Uh oh!

johnhaddon commented Feb 28, 2014

Uh oh!

johnhaddon Feb 28, 2014

Uh oh!

johnhaddon commented Feb 28, 2014

Uh oh!

johnhaddon commented Feb 28, 2014

Uh oh!

ldmoser commented Mar 6, 2014

Uh oh!

johnhaddon commented Mar 6, 2014

Uh oh!

Uh oh!

Reducing memory footprint in StreamIndexedIO #228

Reducing memory footprint in StreamIndexedIO #228

Uh oh!

Conversation

ldmoser commented Feb 26, 2014

Uh oh!

johnhaddon commented Feb 26, 2014

Uh oh!

johnhaddon commented Feb 26, 2014

Uh oh!

ldmoser commented Feb 28, 2014

Uh oh!

ldmoser commented Feb 28, 2014

Uh oh!

johnhaddon commented Feb 28, 2014

Uh oh!

johnhaddon Feb 28, 2014

Choose a reason for hiding this comment

Uh oh!

johnhaddon commented Feb 28, 2014

Uh oh!

johnhaddon commented Feb 28, 2014

Uh oh!

ldmoser commented Mar 6, 2014

Uh oh!

johnhaddon commented Mar 6, 2014

Uh oh!

Uh oh!