Skip to content

Conversation

ldmoser
Copy link

@ldmoser ldmoser commented Feb 26, 2014

As scenes get heavier and deeper, the memory overhead of the data structures holding the hierarchy of the internal nodes in the StreamIndexedIO becomes significant. The structure is a tree of StreamIndexedIO::Node classes (104 bytes* each + contents of children map ) with DataNodes (56 bytes*) where data is saved.

With the following commits I've maintained the file format as is, and also binary compatibility (please review my third commit on that). And from my tests, performance was not affected. The memory consumption reduction varied in my tests from %20 to %70:

  • Heavy asset (airBeam) 30% memory reduction
  • Heavy asset with lots of tags 20% reduction
  • Animated asset (droid) 48% reduction
  • Linked scene (crowd) 70% reduction

The idea behind these changes were to have more Node classes that are specialized on particular cases and therefore don't need as many bytes. In order to do that, my first step was to change the StreamIndexedIO::Node class to behave as "private MemberData" class (kept the name for binary compatibility), that holds the pointer to the underlying Directory node and another pointer to the shared Index.

Then I've created this new class hierarchy:
NodeBase (16 bytes_)
|- DataNode ( 40 bytes_ )
|- SmallDataNode ( 24 bytes* )
|- DirectoryNode (56 bytes* )
|- SubIndexNode ( 24 bytes* )

  • sizes measures in 64bit machine.

Lucio Moser added 8 commits February 25, 2014 15:35
…nter to the global Index and to the low level DirectoryNode.

Adding a specific class for DirectoryNode that holds the information loaded from the index.
Adding a member m_nodeType in NodeBase to distinguish derived classes without adding virtual methods.
…ield.

Using char instead of the enums in the structs to compact members in one word.
Original struct sizes were StreamIndexedIO::Node 104 bytes and DataNode was 56.
Current struct sizes are DirectoryNode 56 and DataNode 40 bytes.
This does not affect the class size and should not break binary compatibility.
As a consequence the Node class reduced from 32 bytes to 16 bytes (not derived from RefCounted anymore).
… data is small enough and avoid the big DataNode struct (40 bytes).
… SubIndexNode (24 bytes).

This helps on the memory requirements for opening a file with a large amount of stored Objects. In particular, large LinkedScenes.
… for the purposes of filtering child entries by type.
@johnhaddon
Copy link
Member

Hey Lucio - impressive reductions!

Unfortunately I'm seeing a performance regression when testing some multithreaded access in Gaffer. Here's the script I'm using to test, along with some measurements I took with it in the comments at the bottom. Could you see if you can reproduce that there?

import time

import IECore

import Gaffer
import GafferScene
import GafferSceneTest

r = GafferScene.SceneReader()
r["fileName"].setValue( "/Users/john/Downloads/bigBuckBunny_001.scc" )

t = time.time()
ti = IECore.Timer()

GafferSceneTest.traverseScene( r["out"], Gaffer.Context() )

print time.time() - t, ti.stop()

#########################
# master
#########################
#
#   0.92    1.64
#   0.83    1.54
#   0.84    1.55
#   0.82    1.53
#
#########################
# streamIndexedIOrawNodes
#########################
#
#   0.89    1.86
#   0.85    1.81
#   0.90    1.89
#   0.89    1.87
#   0.867   1.83
#
#########################

I've only taken a quick look at the actual code so far - perhaps you could look into this first and then I'll take a more detailed look?

@johnhaddon
Copy link
Member

You can get the bunny here if you don't already have him :

http://imageengine.github.io/gaffer/resources/caches/bigBuckBunny_001.scc.gz

It'd be interesting to try the same with some production assets...

…he hierarchy of Nodes.

Added a boolean in each Directory node that will be true only if there's a child that is SubIndex and only uses mutex on these locations.
Also, avoids keeping the mutex locked while reading the subindex or during deallocation of duplicated Directories.
@ldmoser
Copy link
Author

ldmoser commented Feb 28, 2014

Good one John!
My numbers were even worse here:
streamIndexedIOrawNodes
0.865702152252 3.58

With the latest change they are in pair with the current trunk
0.79429101944 1.85

Could you double check on your end?

@ldmoser
Copy link
Author

ldmoser commented Feb 28, 2014

And here some production examples:

hugeLayout: LSCC 7Mb (plus sub-layouts)
[ new ]
18.2563018799 128.21
17.7411010265 130.04
17.9823520184 129.19
[ old ]
18.4420411587 133.44
19.8850400448 136.13
17.6744019985 129.06

animated asset: SCC 8Gb
[ new ]
1.1167318821 2.06
0.952558040619 2.04
0.965826034546 2.02
[ old ]
1.0604159832 2.19
1.05946683884 2.21
1.05524492264 2.19

huge geo: SCC 104Gb
[new]
0.0319299697876 0.05
0.0220229625702 0.04
0.0236361026764 0.05
[old]
0.02747797966 0.04
0.025918006897 0.04
0.0246629714966 0.04

@johnhaddon
Copy link
Member

That's an improvement for me too - still not quite as quick as master but within a % or two. Are your [old] measurements from master or from this branch before your latest fix? Any idea how the huge geo is so quick in comparison to the others?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should definitely do this before merging - the whole Node vs NodeBase discrepancy is quite confusing.

@johnhaddon
Copy link
Member

I might be wrong as I haven't looked at everywhere they're used, but generally I feel that all the Node classes could do with encapsulating their data a bit more. I don't really understand the codebase fully, but it seems that a lot of the members are hanging out in public as non-const, whereas my guess in many cases is that they aren't intended to change after construction, or only in specific circumstances. Do you think it'd be worth tightening this up a little to reassure the naive onlooker that they aren't going to change and to have the compiler do a bit more checking for us?

@johnhaddon
Copy link
Member

This is of no use to us now, but it's good to know that C++11 allows the size of an enum to be controlled - that'd be handy for those spots where we've been forced to lose type safety and store them in a char :

http://www.cprogramming.com/c++11/c++11-nullptr-strongly-typed-enum-class.html

I wonder when our 3rd party dependencies will allow us to move to a modern enough compiler to use this stuff?

Adding some comments to explain better the rationale behind them.
@ldmoser
Copy link
Author

ldmoser commented Mar 6, 2014

I believe I've addressed most of your notes John. I think it's way clearer for the naive onlooker.
Decided to not rename the Node class because it affects the virtual function duplicate and the constructor.
About your questions, my [old] times are relative to the master. And the "huge geo" file has a very small index structure that points to massive animated geometry. So it makes sense being quick to open and traverse.

@johnhaddon
Copy link
Member

Thanks Lucio! Leaving the Node class rename till later seems like the right thing to do for now - sorry I hadn't noticed that it was exposed via methods in the API.

johnhaddon added a commit that referenced this pull request Mar 6, 2014
Reducing memory footprint in StreamIndexedIO
@johnhaddon johnhaddon merged commit 0e109f4 into ImageEngine:master Mar 6, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants