Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multithreaded, selective attribute updates #1156

Merged
merged 13 commits into from
Mar 11, 2015

Conversation

davidsminor
Copy link
Contributor

Initial scene output also goes directly to the renderer, rather than via procedurals, so we can block the main thread

@johnhaddon
Copy link
Member

Have you run the GafferRenderManTest.InteractiveRenderManTest unit tests? I get the following crash when I do :

testAddLight (GafferRenderManTest.InteractiveRenderManRenderTest.InteractiveRenderManRenderTest) ... 3DL ERROR S2074: cannot assign variable of type 'uniform vector' to parameter 'uniform point from' (shader: 'pointlight')
terminate called after throwing an instance of 'tbb::captured_exception'
  what():  Unable to find mapping for output path

I also notice that you've started using type& name and type* name here and there rather than following our established type &name and type *name convention. Not a biggy, but I would like to keep the codebase consistent so if you could stick with the convention that would be great.

I'll dive into a proper review once we know what's going on with the test failure...

@andrewkaufman andrewkaufman added the pr-revision PRs that should not be merged because a Reviewer has requested changes. label Jan 15, 2015
@davidsminor
Copy link
Contributor Author

Gah - yeah, I was just running GafferSceneTest wasn't I - what was I thinking. I'll take a look

@davidsminor
Copy link
Contributor Author

Right - so it looks like that was an uncaught exception problem. I've added basic exception handling to the tbb::pipeline functions too stop the crash - the problem now being that it'll spam you with errors if there's something wrong with the graph, rather than just terminating. I'll see if I can find a better way of doing that... Also, I'm getting a test failure in testMoveCoordinateSystem, which miiiiiight just be numerical precision in the shader or something, although I haven't fully investigated it yet.

Anyway, there you go

@davidsminor
Copy link
Contributor Author

eh? I wrote out all the "myLovelyPlane" images that were getting tested, and they all seemed fine. The 'eck?

@davidsminor
Copy link
Contributor Author

So it looks like the back plane wasn't covering the entire render in the test, leaving a margin of pixels with unexpected values. If I scale the plane by a factor of 10, that fixes the failures, although it's a bit of a mystery to me why that margin was being read in the first place, as it's reading at uv coordinates 0.1,0.1, 0.5,0.5 and 0.9,0.9, which don't sound like they'd be anywhere near the edge.

@andrewkaufman andrewkaufman removed the pr-revision PRs that should not be merged because a Reviewer has requested changes. label Jan 16, 2015
@davidsminor
Copy link
Contributor Author

Just got some new performance stats for the robot from this morning, for doing a big shader edit hitting 2600 objects. This is from 3delight 11.0.133, where they've sped up the RiEdit stuff quite a bit:

setup:                                          startup      edit
--------------------------------------------------------------------
3delight 11.0.133:                              23.6 sec     13 sec
no RiShader/Attribute calls:                    13 sec       12 sec
no RiEditBegin/End/Shader/Attribute calls:      13 sec       8 sec
all Ri calls commented out:                     13 sec       8 sec
gaffer traversal only:                          5 sec        4 sec

3delight 11.0.126:                              23.6 sec     30 sec

So 3delight 11.0.133 is considerably better than 3delight 11.0.126. It's interesting how commenting out the RiShader calls cuts the startup time in half.... it looks like most of the 3delight startup time is going into instantiating shaders? Anyway, it kind of looks like the IECoreRi code's costing us 8 seconds doesn't it? Traversal takes 4 seconds, but that happens in the background, so I reckon blasting through an equivalent precompiled list of IECoreRi calls in a single thread would probably take 8 seconds. On top of that, it looks like the RiEdit blocks cost us an extra 4 seconds

@davidsminor
Copy link
Contributor Author

So, after playing around with IECoreRI a bit, I managed to get that 8 seconds down to 5.5. Here's what I've done so far:

  • replaced boost::format and string concatenation in ParameterList::appendParameter with something more manual
  • reordered string comparisons in RendererImplementation::shader(), as "surface" and "shader" are the most common
  • replaced type == "shader" with !strcmp( type.c_str(), "shader" ) in RendererImplementation::shader() to avoid constructing strings over and over again
  • Avoided making that parametersCopy variable in RendererImplementation::shader(), and rejecting "__handle" inside ParameterList::ParameterList() instead

It also looks like you can knock another half second off by commenting out the bit in RendererImplementation::shader() that builds the type hints - apparently it doesn't like making all those map insertions.

@johnhaddon
Copy link
Member

Nice!

Did you actually measure the type == "shader" change to strcmp() on its own? It sounds a bit fishy, because direct string <-> const char * comparisons are defined (and shouldn't make temporaries). If we really want to speed up the comparisons though, we should think about taking InternedString arguments in Renderer methods rather than std::string.

Now we have GeometricTypedData, we should be able to ditch the typeHints hack entirely. It would take a little bit of work in both IECoreRI and Gaffer to check that we were tracking the geometric type appropriately, but I think it'd be well worth doing - cleaner and faster.

m_childIndices.push_back( 0 );
}

void next()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm right in saying that this is an implementation detail of SceneGraphIteratorFilter? If that's the case, can you make it private to make it clear that it's not something that tbb will use directly.

@johnhaddon
Copy link
Member

I've made a bunch of inline comments - a few style things that should be easy to fix, a couple of suggestions which should help with performance, and a couple of questions to aid my understanding. Thanks for the clear comments on the filter tasks - they made the code very easy to dive into.

There's one broader issue here as well though. It's not allowed to use "scene:path" values that aren't predetermined as being valid - where validity means "every name in the path is present in the childNames at that level" (SceneAlgo provides a check for this with exists()). Really, when we're doing the update traversal, we also need to be computing child names, and culling the traversal when we hit locations which are no longer valid. Do you think that's easy to fit in with the logic as it stands?

@johnhaddon
Copy link
Member

Do you know what'll happen with the "suspend rendering during edits" change with 3delight prior to 11.0.138? I suspect it might be a problem, and I really want to keep Gaffer compatible with the free download of 3delight as far as possible.

@davidsminor
Copy link
Contributor Author

Just rebased from master and removed "fixed hang when switching from 'paused' to 'stopped'" commit, as that was in the master already

@davidsminor
Copy link
Contributor Author

Hmm - I tried it on 11.0.134 actually, and surprisingly the code in commit 87b819a wasn't a problem. Also, I just checked on the 3delight website, and the free download is version 11.0.142

@davidsminor
Copy link
Contributor Author

Just made a little addition so inactive InteractiveRenders don't do the childNames check when the childNames plug gets dirtied

@davidsminor
Copy link
Contributor Author

Hey, are we ready to get this rolling? People are complaining about the IPR crashing when they fiddle with the graph while it's starting up and stuff like that. You reckon I should add a simple stderr startup progress indicator first actually?

@johnhaddon
Copy link
Member

Yeah, I think we're pretty much there aren't we? I'd just like to take it for a bit of a spin before merging (I've only looked at the code so far). Do you know when we're planning to make the next release? And do you think it'd be good to do some user testing before merging or are you happy to get it straight into the wild?

@davidsminor
Copy link
Contributor Author

I think the old one's a bit temperamental anyway, probably because of the asynchronous startup, so I'd be fairly happy just slapping it down and seeing what happens. What do you reckon about the progress indicator? It'd make people less annoyed at the new UI freezes

@johnhaddon
Copy link
Member

Hi David,
Testing with 3delight 11.0.143 in the IE environment and gaffer test -repeat 10 GafferRenderManTest.InteractiveRenderManRenderTest.testDeleteWhilePaused I periodically get this error :

terminate called after throwing an instance of 'tbb::captured_exception'
  what():  Unable to find mapping for output path
Abort (core dumped)

I've been unable to get the same to happen on master. Since testDeleteWhilePaused() simulates a common process in Caribou, I reckon we need to get this fixed - can you see if you can reproduce it at your end?

@davidsminor
Copy link
Contributor Author

Actually it's possible that's because it hasn't been rebased onto master yet? I'll give that a go

@johnhaddon
Copy link
Member

I was testing with a merge of your branch onto master, if that helps at all - using the command line that GitHub gives at the bottom of this pull request.

Initial scene output also goes directly to the renderer, rather than via procedurals, so we can block the main thread
Uncaught exceptions were previously causing hangs
In this commit I'm checking for changing child names when the childNames plug gets dirtied (I'm reluctant to do it during a regular update because it means extra hash/plug evaluations, sorting lists etc during something I want to be fast). If a location's child names change, the lists get compared, and locations in the scene graph that are no longer present get an m_locationPresent flag that I've added set to false. This stops the SceneGraphIteratorFilter traversing into them so they don't get updated.
@davidsminor
Copy link
Contributor Author

ah there we go - it's because I wasn't catching exceptions in plugDirtied(). Updated

@davidsminor
Copy link
Contributor Author

Hello. Is this ready to go?

@johnhaddon
Copy link
Member

Yep, thanks. I do have a couple of notes on it still, but I think it's more important that we get this into people's hands first...

johnhaddon added a commit that referenced this pull request Mar 11, 2015
Multithreaded, selective attribute updates
@johnhaddon johnhaddon merged commit a3e4e84 into GafferHQ:master Mar 11, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr-revision PRs that should not be merged because a Reviewer has requested changes.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants