SceneTree

djewsbury edited this page Jan 26, 2015 · 3 revisions

#Scene "parser" and scene tree rationale

Many engines are based around a scene tree. The scene tree provides a structure for organising all of the elements of a scene, as well as performing some naturally hierarchical operations (such as quad-tree culling).

But there is no scene tree in XLE. What gives?

##Scene tree typical usage

How does a scene tree work in most engines? Here's a chart showing a typical scene tree.

[Figure: TypicalSceneTree]

There are a few things that I find a bit awkward about this scheme.

###Overlapping purposes for the hierarchy

Commonly, nodes are organised in the hierarchy according to the following rules:

  • to take advantage of the local-to-parent inheritance
      • (eg, one object has been placed inside the local space of another object)
  • because of locality
      • for hierarchical culling, we group objects together because they are close to each other (for example, in a quad-tree or oct-tree pattern)
      • this will often produce nodes that don't need a local-to-parent transform
      • sometimes objects that want to inherit a local space also want to inherit the parent's locality information (but not always)
  • because they came from the same export
      • eg, a "model root node" will contain nodes that were exported from Max/Maya together in the same export operation

Sometimes these purposes overlap, but sometimes they don't. Consider the case of attaching a particle effect inside a hierarchy, such as smoke coming from a gun barrel. We could attach an effect node in the hierarchy of the weapon model. This would allow us to inherit the local space of the weapon. But we don't really want to inherit the bounding volume of the weapon model (because the smoke will probably extend outside of that bounding volume, in some procedural, unpredictable way). So this is a case where the hierarchy purposes don't perfectly overlap.

Ok, so we could get around this by creating two architecturally separate objects: a particle emitter, and a particle system. In this design, the particle emitter can exist inside of the hierarchy, but doesn't itself have any bounding volume or rendering component. It would work, but it means we have to start adapting our designs to meet the nature of the scene tree.

It feels like a scene tree doesn't fit perfectly with the ideal of a single clear purpose for everything. There are multiple, slightly muddy, purposes for a scene tree.

###Other issues

There are some other things I find a bit awkward.

| Thing | Description |
| --- | --- |
| Generally prefer linear structures | Tree based structures are more likely to be unevenly distributed in memory, and more likely to require more separate heap blocks. By comparison, linear structures are more likely to be contiguous in memory. For today's hardware, it's often advantageous to prefer the second case. |
| We lose type information | Because the scene tree is sorted by locality, we lose information about "type". For example, we might want to have a special "tree manager" that applies some special tree-related technology when we render trees. But the tree objects are scattered about the scene tree, in no particular order. How do we know what is a tree and what isn't? How do we get a list of all of the tree objects that will be inside a given frame? |
| Excessive generality | Sometimes structures that are too general can end up hurting rather than helping. Good designs come from finding the conceptual area that overlaps between two separate things. If we're a little outside of that area, or if our design doesn't quite fit in place exactly right, it can make future things more difficult. Sometimes we benefit from erring on the more "specific" side, and avoid being overly general. |
| Not well suited to character skinning | It doesn't make sense for character bones to sit in the same hierarchy that we use for other things. Particularly as characters have hundreds of bones, we want to make sure they are lightweight and efficient. So maybe bones should not be the same thing as a scene tree node. But adding things to the local space of a bone was one of the goals of a scene tree. If we can no longer do that, why do we need it? |
| No good interface for INode | If we want nodes to be polymorphic, we need some virtual interface that all nodes should implement. But there just isn't a single good interface for this. We often need to fall back to some downcasting mechanism. It's just no fun. |

###Implementing a scene tree in XLE

While there are some problems with scene trees, they can still be useful. It's quite possible to implement a scene tree within the XLE system; nothing prevents it. But the important thing is that it's possible to use XLE without having a scene tree.

XLE splits the scene rendering process into 2 basic concepts:

  • Lighting parsing
  • Scene parsing

The lighting parsing step is responsible for executing all of the lighting steps required, including things like point light sources, shadows and reflections. It handles the "physics of light" part.

The scene parser is responsible for defining what the world contains. It will fill our scene up with objects and things. The key interface for the scene parser is called SceneEngine::ISceneParser. It is called a "parser" because (conceptually) it should take some data structure defining the world and produce a linear set of rendering commands.

For more information about the lighting parser, see lighting parser diagram.

The scene parser interface has only a few methods. There are roughly 2 types of methods:

| Type | Description |
| --- | --- |
| "Get" type methods | GetLightCount() / GetLightDesc() or GetCameraDesc() or GetTimeValue(). These query the scene for certain properties. |
| "Execute" type methods | ExecuteScene() or ExecuteShadowScene(). These instruct the parser to run over the scene and execute the requested objects. Note that the lighting parser must have some control over what things are rendered at what time. So the scene parser must follow the SceneParseSettings passed through as a parameter. The scene parser must also do any locality based culling during this execute phase. |

There are no assumptions made about how the scene parser organises its data. A scene parser that uses a scene tree data structure would traverse the tree in the "Execute" steps.
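As a rough illustration of the two groups of methods, here is a minimal sketch of a parser interface together with an implementation that keeps its objects in a flat array instead of a tree. The method names echo the ones mentioned above, but every signature and type here (`LightDesc`, `CameraDesc`, the `opaqueOnly` field, `DrawCommand`, `FlatListParser`) is a simplified assumption made for this example, not the actual `SceneEngine::ISceneParser` declaration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace Sketch {

// All types below are hypothetical stand-ins, not XLE's real declarations.
struct LightDesc  { float position[3]; float radius; };
struct CameraDesc { float position[3]; float farClip; };
struct SceneParseSettings { bool opaqueOnly; };   // assumed field
struct DrawCommand { std::string objectName; };

class ISceneParser {
public:
    // "Get" type methods: query the scene for properties.
    virtual std::size_t GetLightCount() const = 0;
    virtual const LightDesc& GetLightDesc(std::size_t index) const = 0;
    virtual CameraDesc GetCameraDesc() const = 0;
    virtual float GetTimeValue() const = 0;

    // "Execute" type method: append a linear list of rendering commands.
    virtual void ExecuteScene(const SceneParseSettings& settings,
                              std::vector<DrawCommand>& output) const = 0;
    virtual ~ISceneParser() = default;
};

// A parser that stores its objects in a flat array rather than a tree.
class FlatListParser : public ISceneParser {
public:
    struct Object { std::string name; float position[3]; bool opaque; };

    std::vector<Object> objects;
    std::vector<LightDesc> lights;
    CameraDesc camera{{0.f, 0.f, 0.f}, 100.f};
    float time = 0.f;

    std::size_t GetLightCount() const override { return lights.size(); }
    const LightDesc& GetLightDesc(std::size_t index) const override {
        assert(index < lights.size());
        return lights[index];
    }
    CameraDesc GetCameraDesc() const override { return camera; }
    float GetTimeValue() const override { return time; }

    void ExecuteScene(const SceneParseSettings& settings,
                      std::vector<DrawCommand>& output) const override {
        for (const Object& obj : objects) {
            // Follow the settings handed down by the caller.
            if (settings.opaqueOnly && !obj.opaque) continue;
            // Locality culling happens here, during the execute phase.
            float distSq = 0.f;
            for (int i = 0; i < 3; ++i) {
                const float d = obj.position[i] - camera.position[i];
                distSq += d * d;
            }
            if (distSq > camera.farClip * camera.farClip) continue;
            output.push_back(DrawCommand{obj.name});
        }
    }
};

} // namespace Sketch
```

The culling test here is a plain distance check against the far clip, chosen only to keep the sketch short. A tree-based parser would implement the same interface and walk its hierarchy inside `ExecuteScene` instead.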

###Using a non-tree based scene tree in XLE

XLE provides helper objects for implementing efficient scene parsers without using a scene tree. Let's consider the SceneEngine::PlacementsManager.

In XLE "placements" refers to a database of objects and positions. These are intended to be mostly static (but with room for certain types of animation/changes). Normally a world will be full of objects that have been placed by artists and designers -- these objects might be natural things like trees and rocks, or they could be buildings and furniture.

The placements manager separates the world into many cells. Each cell contains a number of placements that are grouped together by some shared properties.

For example, a basic system might split the world into a grid of cells, where every cell is 512x512 meters in the XY plane.
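To make the grid example concrete, the following sketch maps a world-space XY position to the grid cell that contains it. `CellCoord`, `WorldToCell`, and the choice of floor rounding for negative coordinates are assumptions made for illustration; only the 512 meter cell size comes from the text.

```cpp
#include <cassert>
#include <cmath>

namespace Sketch {

constexpr float kCellSize = 512.0f;   // meters, in both X and Y

// Hypothetical integer grid coordinate of a cell.
struct CellCoord { int x; int y; };

// Map a world-space XY position to the grid cell containing it.
// std::floor keeps negative coordinates in the correct cell
// (for example, x = -1 belongs to cell -1, not cell 0).
inline CellCoord WorldToCell(float worldX, float worldY) {
    assert(std::isfinite(worldX) && std::isfinite(worldY));
    return CellCoord{
        static_cast<int>(std::floor(worldX / kCellSize)),
        static_cast<int>(std::floor(worldY / kCellSize))};
}

} // namespace Sketch
```

With this mapping, a position of (100, 600) falls in cell (0, 1). A simple lookup like this only covers the regular-grid case.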

However, placement cells can actually overlap, or be enabled and disabled by events. There could be separate cells for indoors and outdoors. Or possibly different cells for different levels of detail at extreme ranges.

Each cell contains a list of objects and model-to-world transforms. The data associated with each object is intentionally kept as small as possible. But each object has a unique guid for identification (and this could be used for attaching extra information).

Each cell has some static locality culling data structure (like a balanced quad tree). Because placements don't change, we can precompile an efficient structure. We can also use different culling structures for different cells (indoor cells might benefit from different culling methods than outdoor cells, for example).

By design, both the cell object list and the culling data are stored in linear arrays in memory. In other words, they are stored in memory in a "serialised" form. One of the great advantages of this is that it makes streaming easier. If our data is in a single heap block, we can allocate the space and load it from disk more conveniently. This makes error handling easier and generally makes the system more efficient and less error prone. (But the disadvantage is that it's best suited to static data.)
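The idea of a cell living in one "serialised" heap block can be sketched as follows. The layout is hypothetical (a small header followed by a packed array of objects) and the names `PlacementObject`, `CellHeader`, `SerialiseCell`, `ObjectCount` and `ReadObject` are invented for this example; XLE's real placement format is not described here.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

namespace Sketch {

// Hypothetical per-object record, kept deliberately small.
struct PlacementObject {
    float localToWorld[12];   // 3x4 model-to-world transform
    std::uint64_t guid;       // unique id, usable as a key for extra data
};
static_assert(std::is_trivially_copyable<PlacementObject>::value,
              "records must be safe to copy as raw bytes");

struct CellHeader { std::uint32_t objectCount; std::uint32_t reserved; };

// Pack a cell into one contiguous block: header, then the object array.
inline std::vector<std::uint8_t> SerialiseCell(
        const std::vector<PlacementObject>& objects) {
    const CellHeader header{static_cast<std::uint32_t>(objects.size()), 0u};
    std::vector<std::uint8_t> block(
        sizeof(header) + objects.size() * sizeof(PlacementObject));
    std::memcpy(block.data(), &header, sizeof(header));
    if (!objects.empty()) {
        std::memcpy(block.data() + sizeof(header), objects.data(),
                    objects.size() * sizeof(PlacementObject));
    }
    return block;
}

// Read the object count straight from the block.
inline std::uint32_t ObjectCount(const std::vector<std::uint8_t>& block) {
    assert(block.size() >= sizeof(CellHeader));
    CellHeader header;
    std::memcpy(&header, block.data(), sizeof(header));
    return header.objectCount;
}

// Copy one object out of the block by index.
// memcpy is used so the sketch does not depend on buffer alignment.
inline PlacementObject ReadObject(const std::vector<std::uint8_t>& block,
                                  std::uint32_t index) {
    assert(index < ObjectCount(block));
    PlacementObject result;
    std::memcpy(&result,
                block.data() + sizeof(CellHeader)
                    + index * sizeof(PlacementObject),
                sizeof(result));
    return result;
}

} // namespace Sketch
```

Because the block has no internal pointers, the bytes produced by `SerialiseCell` could be written to disk and later read back into a single allocation with one read call, which is the streaming benefit described above.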

###A Scene parser of many parts

The ideal scene parser is one that pulls together many different technologies to get the behaviour best suited to some particular goals. This means using more than just one solution, and using specialised solutions when needed.

Placements are designed to be mostly static. This is because the majority of objects in the world are simple and we want to make sure we get the best performance for those objects. But the scene can be made up of more than just placements. In some cases, we might want objects that are more dynamic and interactive. In these cases, we can add further objects to the scene parse, after the static placements have been executed.
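A scene parser built from several parts might look something like the sketch below, where static placements are executed first and dynamic objects afterwards. All of the types (`StaticPlacements`, `DynamicObjects`, `CompositeSceneParser`, `DrawCommand`) are hypothetical stand-ins invented for this example, and the real systems would do culling and state setup that is omitted here.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace Sketch {

// Hypothetical command record; the "source" tag only makes the ordering visible.
struct DrawCommand { std::string source; std::string objectName; };

// Stand-in for the static placements database.
struct StaticPlacements {
    std::vector<std::string> names;
    void Execute(std::vector<DrawCommand>& out) const {
        for (const std::string& n : names)
            out.push_back(DrawCommand{"placement", n});
    }
};

// Stand-in for objects that move or change from frame to frame.
struct DynamicObjects {
    std::vector<std::string> names;
    void Add(const std::string& name) {
        assert(!name.empty());
        names.push_back(name);
    }
    void Execute(std::vector<DrawCommand>& out) const {
        for (const std::string& n : names)
            out.push_back(DrawCommand{"dynamic", n});
    }
};

// The parser simply runs each specialised part in turn.
struct CompositeSceneParser {
    StaticPlacements placements;
    DynamicObjects dynamics;

    void ExecuteScene(std::vector<DrawCommand>& out) const {
        placements.Execute(out);   // static placements first
        dynamics.Execute(out);     // then the dynamic objects
    }
};

} // namespace Sketch
```

Each part keeps its own objects in its own list, so the parser always knows which system an object belongs to. That sidesteps the "lost type information" problem described earlier.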

Sometimes we might need very special behaviour for some special result. Let's consider a tree rendering example. Let's imagine we had a very large landscape and we want it to look like a forest, all the way to the horizon.

Ideally, our solution would have these properties:

  • near the gameplay area, trees are manually placed by artists
      • but outside of the gameplay area, in the distance we want to just fill the world with procedurally generated trees
  • When the camera is close to trees, they should appear as models
      • but, distant trees should appear as imposters
  • The density of tree placements should fall off with distance from the camera
  • There are only a few different types of trees
      • most tree models will tend to be repeated hundreds of times in the same scene

This might seem like a common case. But there are many special properties to it. We could imagine sitting down and writing some code that did this style of tree rendering, and only this style of tree rendering. And actually, it might not take too long to do that. And if we specialised our solution (from top to bottom) to do exactly this, we'd probably get the most efficient result.

So why not write a specialised solution? In this case, we don't want to fiddle with placements or quad trees, or any of that. We just want a _treeManager->ExecuteTreeScene() call in the scene parser.
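A heavily simplified sketch of such a specialised manager is shown below. It covers only the model/imposter switch and a maximum draw distance; procedural generation and density falloff are left out. The `TreeManager` class, its fields, and the `ExecuteTreeScene(cameraX, cameraY)` signature are assumptions for this example (the text only names the call, not its parameters).

```cpp
#include <cassert>
#include <cmath>
#include <vector>

namespace Sketch {

// One tree in the world. There are only a few tree types, so each
// instance just stores an index into a small shared table of models.
struct TreeInstance { float x; float y; unsigned typeIndex; };

enum class TreeLod { Model, Imposter };

struct TreeDraw { unsigned typeIndex; TreeLod lod; };

class TreeManager {
public:
    float imposterDistance = 100.f;   // assumed switch-over distance
    float maxDistance = 1000.f;       // assumed draw limit
    std::vector<TreeInstance> instances;

    // Emit one draw per visible tree: full models near the camera,
    // imposters further away, nothing beyond maxDistance.
    std::vector<TreeDraw> ExecuteTreeScene(float cameraX, float cameraY) const {
        assert(imposterDistance <= maxDistance);
        std::vector<TreeDraw> draws;
        for (const TreeInstance& tree : instances) {
            const float dist = std::hypot(tree.x - cameraX, tree.y - cameraY);
            if (dist > maxDistance) continue;
            const TreeLod lod = (dist < imposterDistance)
                ? TreeLod::Model : TreeLod::Imposter;
            draws.push_back(TreeDraw{tree.typeIndex, lod});
        }
        return draws;
    }
};

} // namespace Sketch
```

Because the manager owns every tree in one flat list, it could go further and sort the draws by `typeIndex` to batch the heavily repeated models, which would be awkward if the trees were scattered through a general scene tree.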

That's the goal of the scene parser -- to provide a framework that can be adapted to many different styles and goals.