
SceneWriter needs to execute all in one go #870

Closed
andrewkaufman opened this issue Jun 18, 2014 · 10 comments
Labels: core (Issues with core Gaffer functionality)

@andrewkaufman (Contributor)

The SceneWriter is an executable which needs to process a group of Tasks at once, regardless of any single requested Context, since it's writing a single file for the entire frame range.

@andrewkaufman andrewkaufman added this to the Despatching milestone Jun 18, 2014
@andrewkaufman (Contributor, Author)

Actually, re-reading our various offline discussions, this is poorly titled and conflates two issues... I'll change this one to be about SceneWriter execution, and add a new issue for task grouping.

@andrewkaufman andrewkaufman changed the title Mechanism for grouped Task execution SceneWriter needs to execute all in one go Jun 18, 2014
@andrewkaufman (Contributor, Author)

SceneWriter is actually a single task that depends on its inputs at multiple frames.

  • executionHash() is constant across time - only one file is ever produced. The Despatcher will query it once per frame, realise that it's always the same, and so make only a single task for it.
  • executionHash() should actually be implemented by munging together its inputs' executionHash() for every frame in the range.
</gr-replace>
  • executionRequirements() will request all input frames at once, telling the Despatcher that it needs all the input frames to be generated before that single task can be called at all.
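The bullets above could be sketched roughly as follows. This is a minimal illustration only: `Hash`, `hashAppend`, and `inputSceneHash` are hypothetical stand-ins, not Gaffer's real MurmurHash or plug API.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical hash type and combiner (boost-style mix);
// Gaffer really uses IECore::MurmurHash.
using Hash = std::size_t;

inline void hashAppend( Hash &h, Hash v )
{
    h ^= v + 0x9e3779b97f4a7c15ULL + ( h << 6 ) + ( h >> 2 );
}

// Stand-in for the per-frame hash of the input scene.
Hash inputSceneHash( float frame )
{
    return std::hash<float>()( frame );
}

// An executionHash() that is constant across time: rather than hashing
// the frame from the current context, it munges together the input
// hashes for every frame in the range, so the dispatcher computes the
// same hash at every frame and emits only a single task.
Hash sceneWriterExecutionHash( const std::string &fileName,
                               float startFrame, float endFrame )
{
    Hash h = std::hash<std::string>()( fileName );
    for( float f = startFrame; f <= endFrame; f += 1.0f )
    {
        hashAppend( h, inputSceneHash( f ) );
    }
    return h;
}
```

Because the hash depends only on the file name and the whole frame range, querying it at frame 1 and frame 5 yields the same value, which is exactly what lets the Despatcher de-duplicate down to one task.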

@johnhaddon (Member)

I can see how bullet point 2 arose from looking back at our offline discussion, but it's not quite accurate...

I don't think there's any requirement for an ExecutableNode to include the executionHash() of its input requirements if they don't actually affect the results of the execute() - in fact if they don't affect the result then they shouldn't be included. And since the requirementsPlug() provides arbitrary "just do this first" requirements that don't actually affect anything about how SceneWriter (or anything else for that matter) operates, they shouldn't be included in the executionHash().

What we do need to put in SceneWriter::executionHash() is the filename that we're writing to, and the combined hashes of the entire input scene (which is a ComputeNode rather than a requirement) across the relevant frame range. As we discussed, that might be a bit expensive without a hierarchy hash, so a quick alternative for now might be to hash the address of the input ScenePlug and the entire Context - that should hold up until we introduce more complex despatchers with a memory of what they've done in the past.
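The "quick alternative" above might look something like this - hash the address of the input ScenePlug plus every variable in the Context. Everything here (the `Context` as a string map, the function name) is an illustrative assumption, not Gaffer's actual API.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>

// Minimal stand-in for Gaffer's Context: variable name -> value.
using Context = std::map<std::string, std::string>;

// Cheap executionHash() substitute: the identity of the input plug
// plus the entire context, instead of a full per-location scene hash.
std::size_t quickExecutionHash( const void *inputScenePlug, const Context &context )
{
    std::size_t h = std::hash<const void *>()( inputScenePlug );
    for( const auto &variable : context )
    {
        h ^= std::hash<std::string>()( variable.first + "=" + variable.second )
             + 0x9e3779b97f4a7c15ULL + ( h << 6 ) + ( h >> 2 );
    }
    return h;
}
```

As noted, this only holds up while dispatchers have no memory of past runs - two sessions with the same plug address and context would collide even if the upstream scene changed.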

@johnhaddon (Member)

Also, we need to make sure that as a matter of course, all executionHash() implementations start by hashing in the TypeId of the ExecutableNode derived class, to avoid clashes between different types of Executable with otherwise identical inputs. For ComputeNodes this is done automatically by the base class (and all derived class implementations are required to call the base class implementation first). I suggest we adopt that methodology here.
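The convention could be sketched like this, with derived implementations calling the base class first so the TypeId is always mixed in. The class names, TypeId values, and the plug-hash placeholder are all made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

struct ExecutableNode
{
    virtual ~ExecutableNode() {}
    virtual int typeId() const = 0;

    // Base implementation hashes in the TypeId, so two different
    // Executable types with otherwise identical inputs never clash.
    virtual std::size_t executionHash() const
    {
        return std::hash<int>()( typeId() );
    }
};

struct SceneWriter : ExecutableNode
{
    int typeId() const override { return 110001; } // hypothetical TypeId
    std::size_t executionHash() const override
    {
        std::size_t h = ExecutableNode::executionHash(); // base class first
        h ^= std::hash<std::string>()( "fileNamePlugHash" ); // identical "inputs"
        return h;
    }
};

struct ImageWriter : ExecutableNode
{
    int typeId() const override { return 110002; } // hypothetical TypeId
    std::size_t executionHash() const override
    {
        std::size_t h = ExecutableNode::executionHash();
        h ^= std::hash<std::string>()( "fileNamePlugHash" ); // same inputs...
        return h;                                            // ...different hash
    }
};
```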

@johnhaddon (Member)

And just one more thing - I think the "execute all in one go" concept is slightly more complex than what we've discussed. So far we've said that the SceneWriter needs to execute for all frames at once, which is only kind of true. It needs to execute for all frames at once, where the Contexts are otherwise identical.

We don't have them yet, but just as we have a SceneContextVariables node to propagate new variables upstream during scene assembly, I intend to have ExecutableContextVariables nodes that propagate variables upstream during execution (they would ask for their requirements in a modified context). So you could branch two such nodes off the bottom of an Executable chain to perform the same sequence of operations in slightly different contexts (so maybe a ribgen, render, quicktime, publish chain would be executed for $pass=="foreground" and $pass=="background"). In such situations we need to batch together all the frames for the foreground pass separately from all the frames for the background pass.
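The grouping rule described above - batch contexts that are identical apart from their frame - might be sketched like so. The string-map `Context` and `batchByFrame` are assumptions for illustration; they are not proposed Gaffer names.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Minimal stand-in for Gaffer's Context: variable name -> value.
using Context = std::map<std::string, std::string>;

// Group contexts that are identical apart from their "frame" entry,
// so e.g. all $pass=="foreground" frames form one batch and all
// $pass=="background" frames form another.
std::vector<std::vector<Context>> batchByFrame( const std::vector<Context> &contexts )
{
    std::map<Context, std::vector<Context>> groups;
    for( const Context &c : contexts )
    {
        Context key = c;
        key.erase( "frame" ); // everything except the frame must match
        groups[key].push_back( c );
    }

    std::vector<std::vector<Context>> batches;
    for( auto &group : groups )
    {
        batches.push_back( group.second );
    }
    return batches;
}
```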

Let's make sure we discuss this further to make sure we come up with an API which allows this to happen fairly naturally.

@andrewkaufman (Contributor, Author)

So, time to discuss this further. We currently have SceneWriter::executionHash() hashing the address of the input ScenePlug and all non-ui values in the Context. I guess we need to modify that to ignore the frame so it's constant across time. And we need to add the fileName (which I imagine was left out accidentally).

Are you thinking SceneWriter will have its own plugs for frame range, regardless of the requested Context? Or should it have some way to collect all the relevant Contexts together? Would that be via a plug the Dispatchers recognize?

Is it going to need to reimplement ExecutableNode::executionRequirements() to get the requirements for the full frame range? Or will the Dispatcher take care of that, when it de-duplicates the Tasks by executionHash?

Will this require modifying ExecutableNode::Task to store a list of Contexts? Or alternatively Dispatcher::TaskDescription to store a list of Tasks?

@johnhaddon (Member)

I'm not 100% clear on any of this, and you have plenty of experience with the farm dispatching stuff that I don't, so please call me on any of this if it sounds suspect.

I'd like to defer the question of whether or not SceneWriter has its own frame range - I believe that is orthogonal to the question of how we implement executing-all-in-one-go, and the answer will come naturally from user requests (or the lack of them). I think ExecutableNodes would force their own frame range onto the Dispatcher by implementing ExecutableNode::executionRequirements() to request their own output at the previous frame, until they hit the start frame. That's what we discussed for simulation nodes, right? I don't think that's relevant to how those tasks then get grouped for executing-all-at-once-ness.
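That simulation idea - a node requiring its own output at the previous frame, recursing back to the start frame - could be sketched like this. `Context`, `Task`, and `SimulationNode` here are minimal stand-ins, not Gaffer's real types.

```cpp
#include <cassert>
#include <vector>

struct Context { float frame; };

struct Task
{
    const void *node;
    Context context;
};

struct SimulationNode
{
    float startFrame;

    // The node forces a sequential frame range onto the dispatcher by
    // requiring itself at the previous frame; at the start frame the
    // chain bottoms out with no requirements.
    std::vector<Task> executionRequirements( const Context &c ) const
    {
        std::vector<Task> requirements;
        if( c.frame > startFrame )
        {
            Context previous = c;
            previous.frame -= 1.0f;
            requirements.push_back( Task{ this, previous } );
        }
        return requirements;
    }
};
```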

I think ExecutableNode::Task should stay as it is - it's a nice simple concept, and encapsulates everything needed by the public API. Dispatcher::TaskDescription is the internal data structure with the additional information that Dispatchers need to do their private work, so that's the one that should be changed to include a vector of tasks, or whatever it needs to be to support this work.

As for how ExecutableNodes should communicate their all-in-one-go requirements, I guess we could go with either a simple API or a more flexible one :

  1. ExecutableNode::mustExecuteAllFramesAtOnce() (or a better name saying the same thing).
  2. ExecutableNode::batchContexts( const vector<Context *> &contexts, vector<vector<Context *>> &batches )

Either way, for our current purposes we need to sort the contexts into groups where each context is identical to the others except for the frame. If we do 1) then SceneWriter is very simple and the dispatcher must do the batching. If we do 2) then the dispatcher is simpler, and we gain some flexibility in the constraints an ExecutableNode can place on the dispatcher - we could batch across things other than frame. I'm struggling to think of a good use case for that right now though so maybe pushing the complexity out of the way and into Dispatcher is best? Also, the dispatcher is going to have to do its own batching to implement clumping anyway, so 1) would keep all the batching code in one place. I think I vote for 1), unless anyone's got a use case for 2).

One other thing that might be nice to consider - I find it annoying that most ExecutableNodes only ever do one frame at a time, but all are forced to loop over a list of contexts in execute(). It's also quite annoying to have to build a vector and pass it when, as an ExecutableNode client, you only want to execute one frame. I'd welcome any API changes that would simplify that, so there was a form of execute() that took a single Context *. No need to worry about that immediately though.

@andrewkaufman (Contributor, Author)

So in trying (and failing) to give mustExecuteAllFramesAtOnce() a better name, Dan mentioned the fact that if it's this difficult to name, maybe it's not the best interface. Since you're also talking about providing an execute() which takes a single Context *, what would you think of this change:

virtual void execute( const Context *context ) const = 0;
virtual void execute( const Context *context, std::vector<float> &frames ) const;

The default implementation of the second form would just call the first form several times, modifying the context each time. SceneWriter (and other mustExecuteAllFramesAtOnce nodes) would re-implement the second form as needed.
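The proposal could be sketched like this, with the frames form defaulting to a loop over the single-context form. `Context` is a minimal stand-in for Gaffer's Context class, and `FrameRecorder` is a made-up node used only to show the default behaviour.

```cpp
#include <cassert>
#include <vector>

struct Context { float frame; };

struct ExecutableNode
{
    virtual ~ExecutableNode() {}

    // Single-context form; every node must implement this.
    virtual void execute( const Context *context ) const = 0;

    // Frames form; the default implementation just calls the single
    // form once per frame, modifying the context each time. Nodes like
    // SceneWriter would override this to write everything in one go.
    virtual void execute( const Context *context, const std::vector<float> &frames ) const
    {
        for( float f : frames )
        {
            Context modified = *context;
            modified.frame = f;
            execute( &modified );
        }
    }
};

// A one-frame-at-a-time node gets sequence execution for free.
struct FrameRecorder : ExecutableNode
{
    mutable std::vector<float> executedFrames;

    void execute( const Context *context ) const override
    {
        executedFrames.push_back( context->frame );
    }
    using ExecutableNode::execute; // keep the two-argument overload visible
};
```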

I don't think this actually removes the need for mustExecuteAllFramesAtOnce (the Dispatcher still needs to know when to batch Tasks together), but maybe it makes it easier to name, since batching by frame becomes an explicit part of the ExecutableNode API? After writing all this, I still haven't thought of a good name though...

@johnhaddon (Member)

I like it. What do you think of this as far as naming goes?

virtual void execute( const Context *context ) const = 0;
virtual void executeSequence( const Context *context, const std::vector<float> &frames ) const;
virtual bool requiresSequenceExecution() const;
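With that naming, a dispatcher's side of the contract might look something like this sketch: requiresSequenceExecution() defaults to false, SceneWriter overrides it, and the dispatcher collapses a run of per-frame contexts into a single sequence task when it returns true. Apart from the method names quoted above, everything here is an illustrative assumption.

```cpp
#include <cassert>
#include <cstddef>

struct ExecutableNode
{
    virtual ~ExecutableNode() {}

    // Most nodes execute one frame at a time.
    virtual bool requiresSequenceExecution() const { return false; }
};

struct SceneWriter : ExecutableNode
{
    // SceneWriter writes one file for the whole range, so it must be
    // handed all its frames in a single executeSequence() call.
    bool requiresSequenceExecution() const override { return true; }
};

// Number of tasks a dispatcher would emit for a run of contexts that
// are identical apart from their frame.
std::size_t taskCountForFrames( const ExecutableNode &node, std::size_t frameCount )
{
    return node.requiresSequenceExecution() ? 1 : frameCount;
}
```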

One other thing - elsewhere in the API, the public methods don't require a Context to be passed in at all, they just use the one that's current. I think I'd like that to be the case for the ExecutableNode as well - does that sound reasonable?

Oh and while we're discussing naming, I think I'd like executionRequirements() to be simply requirements() and executionHash() to be simply hash() - the prefix was in there because at one point Executable was multiply inherited, and we wanted to distinguish its methods from those in ComputeNode etc. That's no longer an issue now…

@andrewkaufman (Contributor, Author)

That all sounds fine to me. I'll see how it goes.
