
Automatic Geometry instancing #10093

Closed
wants to merge 6 commits

Conversation

Benjamin-Dobell
Copy link
Contributor

@Benjamin-Dobell Benjamin-Dobell commented Nov 10, 2016

When I first started using three.js I read that it supported instancing. It wasn't until I'd been using it for a while that I realised that you were supposed to use InstancedBufferGeometry to enable instancing. However, the issue with that is that your instances aren't actually part of the normal scene workflow; it's pretty complicated for most users.

This, on the other hand, is turned on by default. Geometry will be batched and submitted to the GPU in one go as long as the browser supports the necessary extensions. An "instancing texture" (with world and normal matrices encoded) is used when floating-point, vertex-shader-accessible textures are available. On systems that don't support this, the number of available uniforms is detected and the necessary matrices are pushed as uniform arrays.

I've tried to be careful not to break backwards compatibility. The only slight issue will be for those who are using their own shaders. They'll want to add:

#include <common_vertex>

to the top of their vertex shader's main function. This has already been done for all built-in shaders.

EDIT: ... unless they're using dynamic uniforms (object render callbacks), in which case the implementation will detect that and won't turn on instancing for that object, so the user will be unaffected.

Instancing is turned on by default for opaque objects, but off for transparent ones. This is because, in order to batch geometry effectively, you don't want to worry about z-sorting. Batching can be enabled for transparent objects if desired.

Performance can be tweaked by increasing the instancing texture size via setInstancingTextureSize( size ). By default it's 64x64, which gives you batches of 585 instances (64 × 64 = 4096 texels, at 7 texels per instance). Bumping it to 128x128 will use ever so slightly more VRAM, but allows for 2340 instances per batch.

The increased VRAM is pretty negligible, but pushing a larger texture can itself be less performant than a smaller one if most of your geometry only has a few instances. For this reason, there's minInstancingBatchSize, which will cause geometry to be rendered without batching (no texture pushing) if there are fewer than minInstancingBatchSize instances visible in a given frame.

These properties can be changed safely between rendering frames if desired.

Sensible defaults have been chosen.

Of course, instancing can be disabled altogether by setting autoInstancing to false.
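
For example, the knobs described above could be used like this (a sketch only; the names are the ones proposed in this PR, assumed here to live on the WebGLRenderer, and are not part of the released three.js API):

var renderer = new THREE.WebGLRenderer();

renderer.autoInstancing = true;           // default; set to false to disable batching entirely
renderer.setInstancingTextureSize( 128 ); // 128 * 128 = 16384 texels / 7 per instance = 2340 instances per batch
renderer.minInstancingBatchSize = 4;      // geometries with fewer visible instances use the normal render path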

@Benjamin-Dobell
Copy link
Contributor Author

As I mentioned in #9738, if an object has callbacks associated with it then we don't try to instance those objects. This is because we have to assume the worst, i.e. that those callbacks alter uniforms, in which case the objects can't be instanced.

@Benjamin-Dobell
Copy link
Contributor Author

Benjamin-Dobell commented Nov 10, 2016

Oh, drawMode has been moved from Object to Geometry for the same reason... it didn't really make too much sense per object anyway.

EDIT: Sorry, not "moved" entirely. Just marked as deprecated in Object, it'll still work as expected, those objects just won't be instanced.

@pailhead
Copy link
Contributor

What are the benefits of using a texture and doing a texture fetch for each instance over using an attribute?

@pailhead
Copy link
Contributor

pailhead commented Nov 10, 2016

I think here the uniform should be computed as an inverse float, so as to avoid the division?

uniform int instancingTextureSize;

mat4 getModelMatrix() {

    //float dx = 1.0 / float( instancingTextureSize );
    //float dy = 1.0 / float( instancingTextureSize );
    float dx = instancingTexture_INVERSE_Size;
    float dy = instancingTexture_INVERSE_Size;

But it is still 7 texture fetches, from a floating point texture, and a few operations to unpack it, index into it etc. I don't know much about the mechanics, but it feels like it would all be cheaper with a couple of attributes instead of a texture?

This may also be limiting. I'm not sure if this is a valid way to do it (could it be faster to update a uniform than an attribute?), but with attributes you could have a large number of instances, like 2^16 I think, while you can't have that many uniforms?

#else

    //uniform mat4 modelMatrices[ MAX_INSTANCES ];
    //uniform mat3 normalMatrices[ MAX_INSTANCES ];
    attribute vec4 a_MatVec1; //i think three cant figure out a mat4 out of this on its own
    ....
    attribute vec4 a_MatVec4;

    mat4 getModelMatrix() {

           //return modelMatrices[ int( instanceIndex ) ];
           return mat4( a_MatVec1, a_MatVec2, a_MatVec3, a_MatVec4 ); //pack a matrix with 4 vec4 attributes

@Benjamin-Dobell
Copy link
Contributor Author

Benjamin-Dobell commented Nov 10, 2016

Note: Force push just modified comments, nothing functional.

@pailhead Yeah, I didn't try to optimise it too much. I originally planned to mimic the skinning shader (bone texture), but I couldn't help myself optimising it at least a little bit ;)

Attributes are typically designed to be one-per-vertex. Of course the ANGLE instanced rendering functionality lets you set a divisor. I've kind of assumed that texture lookups will be more performant, but that may well be complete nonsense. There are both texture and uniform options; I guess there's nothing stopping someone adding an attribute option as well.

I actually implemented something very similar to this (albeit more advanced) years ago for Xbox 360; attributes weren't an option then. But I guess I started in a "texture mindset" for that reason. However, the plan for the texture was actually that in the future it could easily be extended for fully instanced, unlimited-bone, skinned models (which is what I'd implemented in the past).

@pailhead
Copy link
Contributor

rather than textures->uniforms->attributes i think it should go in reverse :)

@Benjamin-Dobell
Copy link
Contributor Author

Benjamin-Dobell commented Nov 10, 2016

Well, textures make sense over uniforms due to constraints on how many uniforms you're going to have available. On my system you'd get about 140 instances per batch vs. 585 (well, technically >500,000).

However, you're welcome to mess around with this and give attributes a shot. If you don't beat me to it, then I'll take a crack at it when I've got a bit more time.

@Benjamin-Dobell
Copy link
Contributor Author

Oh yeah, the other limitation with attributes is how many you can push per-vertex (typically 8-16). We'd need 7 (6 with clever packing); however, that's possibly going to interfere with custom shaders/materials etc. (if it works at all on systems with lower limits). But yeah, that certainly doesn't totally preclude the idea of using attributes.

@pailhead
Copy link
Contributor

I didn't have a chance to go through the entire code, and I don't know how it figures out what to batch, but it would definitely be worth doing some benchmarks.
I tried to approach this slightly differently, but I'm on an older version of three. I basically made a class that creates an InstancedBufferGeometry out of a regular BufferGeometry and a "positioning" callback. The regular geometry populates the mesh attributes, while the positioning callback runs some logic and fills the instancing attributes.

This is a different approach though; I have to know ahead of time what I'm instancing and how I'm doing it.
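
Roughly, it looked something like the sketch below (illustrative names only, using the addAttribute API of the time; not my actual class):

function createInstancedGeometry( source, count, positionCallback ) {

    var geometry = new THREE.InstancedBufferGeometry();
    geometry.index = source.index;
    for ( var name in source.attributes ) {
        geometry.addAttribute( name, source.attributes[ name ] ); // shared per-vertex data
    }

    var offsets = new Float32Array( count * 3 );
    var v = new THREE.Vector3();
    for ( var i = 0; i < count; i ++ ) {
        positionCallback( i, v );    // user logic decides where instance i goes
        v.toArray( offsets, i * 3 );
    }
    geometry.addAttribute( 'instanceOffset', new THREE.InstancedBufferAttribute( offsets, 3 ) );

    return geometry; // the material's vertex shader must read the instanceOffset attribute

}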

I'm a bit confused about where three.js stands now. It's not really a game engine, so I'm not sure if it should be handling optimizations like this. I.e. first we'd have to make sure that saving a few draw calls is worth having each mesh now do 7 texture fetches.

On the other hand, if you know that you want to draw something in a specific way, you'd set this up manually, i.e. by making something with InstancedBufferGeometry in the first place and populating its attributes based on some logic, or even choosing a texture or a uniform over that.

For my use case I didn't actually hack much; the convenience class to obtain the instancing data is something you can build with existing classes. The only thing I did have to hack in was the TRS matrix. The assumption made there was that, at a bare minimum, something needs to have position, rotation and scale, and this should be exposed to all the shaders. I basically added an ".instanceTransform" property to the base material class which triggers a define in the shader.

@pailhead
Copy link
Contributor

pailhead commented Nov 10, 2016

We'd need 7 (6 with clever packing),

Are you 100% sure on this? I think you only need TRS for this, not the normal matrix. I need to look at my code and see if something has changed, but I'm pretty sure that you can transform the normals just once, in object space.

What I tried in r78 with vertex shader chunks:

#ifdef INSTANCE_TRANSFORM
    #define INSTANCE_TRANSFORM_DEFINED // not the cleanest way, but i had trouble injecting this mat because of branching i think
    mat4 aTRS = mat4( aTRS0, aTRS1, aTRS2, aTRS3 );
#endif

vec3 objectNormal = ( aTRS * vec4( normal, 0. ) ).xyz;

About running out of attributes, this may be a solution: it seems these can be addressed as a single mat4 attribute, but I haven't tried it yet:

#9916

http://stackoverflow.com/questions/38853096/webgl-how-to-bind-values-to-a-mat4-attribute
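
For reference, the raw WebGL 1 mechanics behind a mat4 attribute look roughly like this (a sketch, not code copied from that answer; variable names are illustrative and ext is the ANGLE_instanced_arrays extension object). A mat4 attribute occupies four consecutive locations, one vec4 column each:

var loc = gl.getAttribLocation( program, 'instanceMatrix' );

gl.bindBuffer( gl.ARRAY_BUFFER, instanceMatrixBuffer ); // 16 floats per instance

for ( var i = 0; i < 4; i ++ ) {
    gl.enableVertexAttribArray( loc + i );
    gl.vertexAttribPointer( loc + i, 4, gl.FLOAT, false, 64, i * 16 ); // 64-byte stride, 16-byte column offset
    ext.vertexAttribDivisorANGLE( loc + i, 1 );                        // advance once per instance, not per vertex
}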

edit

I just realized: "oh right, scale". This works while they're uniformly scaled; non-uniform scale will break the normals.

@Benjamin-Dobell
Copy link
Contributor Author

Benjamin-Dobell commented Nov 10, 2016

The biggest issue for me is that InstancedBufferGeometry circumvents the scene graph; for me that's a deal breaker.

Also I don't really see why anyone should be expected to handle this manually when it can trivially be performed for them automatically.

I've benchmarked this for my own needs. My geometry was already heavily optimised for mobile, but on average I'm seeing frame rendering time reduced by 25%. Gains will be proportionally much larger if you're not auto updating transforms (which is presently responsible for around 50% of my render loop). Also if you're using larger pieces of geometry you'll have bigger gains. My scenes are unfortunately heavily broken up as it aided collision detection.

@Benjamin-Dobell
Copy link
Contributor Author

The normal matrix requires an inverse operation, which is not included in the current WebGL GLSL spec and is extremely expensive to implement manually.
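
For background (not code from this PR): the normal matrix is transpose( inverse( mat3( modelMatrix ) ) ), and GLSL ES 1.0 has no built-in inverse(), so a per-vertex fallback would have to be expanded by hand, e.g. via the cofactor matrix:

// cofactor( m ) == det( m ) * transpose( inverse( m ) ), so after normalize()
// the determinant washes out (mirrored transforms will still flip the normal)
mat3 cofactor( mat3 m ) {
    mat3 cof;
    cof[ 0 ] = cross( m[ 1 ], m[ 2 ] );
    cof[ 1 ] = cross( m[ 2 ], m[ 0 ] );
    cof[ 2 ] = cross( m[ 0 ], m[ 1 ] );
    return cof;
}

// usage: vec3 n = normalize( cofactor( mat3( instanceModelMatrix ) ) * objectNormal );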

@Benjamin-Dobell
Copy link
Contributor Author

Oh, yeah, you don't need a normal matrix; I didn't use one when I implemented this last time. But having one makes for simpler calculations in the shader. Also, I wanted these changes to be as transparent as possible, i.e. not rewriting all the lighting shaders.

@pailhead
Copy link
Contributor

what do you mean by circumventing the scene graph?

@pailhead
Copy link
Contributor

Oh, yeah you don't need a normal matrix,

I didn't mean it like that; I forgot that it's just the rotation extracted. I mostly render real-world objects, so they're only uniformly scaled :)

Was thinking about the questions one might ask before implementing something like this. Like "do my objects move, or are they static", "will I add more of them", "does the geometry need to be queried in any way", etc.

InstancedBufferGeometry shouldn't have anything to do with the scene graph in the first place, pretty much the same way BufferGeometry shouldn't, but I may be missing something?

@Benjamin-Dobell
Copy link
Contributor Author

Well, this implementation is seamless; it shouldn't impact a single example, and if it does, that's a bug.

With InstancedBufferGeometry you only add one object to the scene; the instances are defined by some other means (custom shader code, attributes, etc.).

This implementation just goes "Well, several objects already share one Geometry. Might as well share a draw call too." It's not really any more advanced than that. For particle effects etc. you'd still use something like InstancedBufferGeometry.

@pailhead
Copy link
Contributor

I think the question/conclusion doesn't stop there - "For every instance of this geometry drawn, I will do 7 texture fetches for each of its vertices". I think this shouldn't really be hardcoded in the library.

With InstancedBufferGeometry you only add one object to the scene,

What's preventing you from adding other objects to the scene? You can make an arbitrary number of meshes point to the InstancedBufferGeometry, or even keep an instance of BufferGeometry referring to the same geometry attributes. If you compute something like a bounding box or bounding sphere, you can copy those to other objects too. There really shouldn't be a conceptual difference between these two; it just branches the WebGLRenderer and does slightly different calls. There is no reason why you would only use one object when using either of the geometries.

@Benjamin-Dobell
Copy link
Contributor Author

Benjamin-Dobell commented Nov 11, 2016

I think the question/conclusion doesn't stop there - "For every instance of this geometry drawn, I will do 7 texture fetches for each of its vertices". I think this shouldn't really be hardcoded in the library.

I think you're getting caught up on the implementation details.

Why are texture fetches any different than anything else three.js automatically does for the user? It automatically:

  • computes matrices from quaternions, vectors and scales.
  • does seamless optimisations e.g. converting to BufferGeometry behind the scenes in the WebGL renderer.
  • per object z-sorting (when enabled).
  • per object culling (when enabled).
  • provides pre-written shaders for common use-cases (of which reusing geometry surely is one).
  • _seamlessly does texture fetches, in the vertex shader when it makes sense_.

Yes, that's right, three.js already does texture fetches automatically for you, and I don't mean in the pixel shader. Pretty much precisely the same thing is done in the skinning shader.

Furthermore, as mentioned above, it doesn't do 7 texture queries for every piece of geometry blindly (technically it's 7-per instance). If for example, the number of objects to be instanced is below the user-configurable minInstancingBatchSize then the existing, non-instanced, rendering path is used. The behaviour can also be completely disabled by setting autoInstancing to false.

Nonetheless, yes, we should all profile this and chime in on how it affects our software, that's the beauty of open-source software.

However, realistically there's absolutely no reason whatsoever your average user should care. If this implementation is bug-free, then a user needs to know nothing about it; it's an implementation detail. If they pay attention to release notes, then all they need to know is that rendering performance has been improved for scenes that take advantage of proper geometry re-use between objects. Everything else is irrelevant. This pull request doesn't by any means prevent anyone doing their own more specific instancing; it just provides a performance gain for the normal use-case, that's it.

@Benjamin-Dobell
Copy link
Contributor Author

Force push removed some morph target/attribute changes that shouldn't have been included.

@Benjamin-Dobell
Copy link
Contributor Author

Benjamin-Dobell commented Nov 11, 2016

@mrdoob
Copy link
Owner

mrdoob commented Nov 13, 2016

This is great! But I don't think I would do this "automagically".

How about turning this into a similar API to #5816? Something like...

var mesh = new THREE.Mesh( geometry, material );
var instances = new THREE.Instances( mesh, 1000 ); // This produces an internal array of 1000 identity matrices

var vector = new THREE.Vector3();

for ( var i = 0; i < 1000; i ++  ) {

    vector.x = i;
    instances.setPositionOf( i, vector ); // This sets the position in the internal matrices array

}

scene.add( instances );

This way the user has full control on what gets instanced and there is less logic/magic in WebGLRenderer.

@Benjamin-Dobell
Copy link
Contributor Author

Benjamin-Dobell commented Nov 13, 2016

@mrdoob I guess that's similar to the current instancing functionality. I'm not really a fan of that. We've got this nice scene graph handling all this functionality for us, but we have to reimplement its behaviour (poorly) just to save on draw calls. Draw calls just seem like a much lower level thing, something we shouldn't have to think about when building a scene.

Admittedly it was silly of me not to check, but when I first started using three.js there'd been the recent refactor that allowed Geometry to properly be shared (across contexts etc.). I'd actually assumed that by using the same Geometry I was going to save on draw calls by having the geometry pushed to the graphics card just once (or at least not once per object).

"Instancing" is definitely a cool thing that people should be able to control themselves (particle effects etc.) but I guess despite me calling this instancing, I don't really think of this patch set as instancing, just an across the board speed boost - totally hidden for the user. I can't really think of a situation in which a user would not want this behaviour.

Certainly when you've got skinning/dynamic uniforms then you wouldn't want this (although if you're clever you can still instance), but this implementation detects those situations and disables instancing accordingly. Nonetheless, if you really don't want instancing, then it seems to me that cloning the Geometry would actually make sense - you're specifically saying don't reuse the same geometry, treat each object as something different.

The real beauty of this implementation is the simplicity for the end user. With those two links I referred to above, I've loaded the exact same scene; the only difference between the two is that one has this patch set applied and the other doesn't. It's great that I can just load the exact same scene, not think about it, and get a 25% performance boost.

@mrdoob
Copy link
Owner

mrdoob commented Nov 13, 2016

@mrdoob I guess that's similar to the current instancing functionality.

Not really. At the moment, there is no way to use instancing unless you know GLSL.

The real beauty of this implementation is the simplicity for the end user.

I would prefer to implement this in a predictable way. A way where the user can predict the resulting drawcalls. It's very tempting to try to implement some magic in the WebGLRenderer which suddenly makes everything render much faster, but in the last 6 years I've learnt that those clever speed-ups always bring side effects and end up getting in the way of the user and the maintainer 😞

@pailhead
Copy link
Contributor

pailhead commented Nov 13, 2016

Hi,

I hope I didn't come off as too adversarial. This is all a learning experience for me, and I'm trying to consolidate bits and pieces of the knowledge I've acquired so far.

I think you may give too much credit to / have a different understanding of the scene graph. I'm not entirely sure draw call batching has much to do with it. The simplest example coming to mind is that there can be empty nodes in that graph, not really rendering but just handling transformation. I think its primary purpose is that: linking nodes together, managing transformations.

I actually don't know how much optimization happens under the hood of three.js, especially since I missed a few versions.
Draw calls can be optimized without actually reducing them, right? For example, if the same geometry is being rendered with different materials it's good to have those calls happen consecutively? Also, if it's the same material binding different textures, the fewer state changes between the calls the better?

Another thing that might be considered is overdraw, although I'm a bit fuzzy on this topic. Say you want to cull nodes in some other way without actually drawing them; you then don't want to have them tied up in this batch?

Does the size of the mesh matter? Do the nodes transform, and if so, is there a trade-off - does the shader need to do the texture lookups and unpacking if the meshes are static?

I'm concerned mostly because you mention dynamic uniforms, which I thought got nuked, and you only mention particle effects when it comes to instancing, while the box example up there is a perfectly legitimate use case.

If I understand correctly, you want to treat the scene as if it contains these 9144 boxes as nodes, but have the renderer automagically batch them. But why? If you are procedurally generating these, it's trivial to use proper instancing (with attributes); if you load it baked as one mesh, it's also going to be relatively optimized, at the cost of consuming more memory and a longer load time. In this case I'd also say it's an optimization that you can take care of yourself, i.e. reduce memory and load time. If you load it from something describing the graph (9144 nodes, 1 geometry), then yes, it would be nice not to have this tank the renderer and for some optimization to happen under the hood, but it would still be very easy to optimize this yourself with the existing tools. I'd still favor extending the tools over hard-coding this.

So as opposed to @mrdoob's solution above, where you "build" this instanced struct:

var mySceneGraph = Loader.load( 'collada.dae' ); //obtain something somehow with a bunch of nodes, sharing geometry
var myOptimizedStruct = new THREE.MeshGroup( mySceneGraph ); //analyze it and come up with something thats easier to render

Certainly when you've got skinning/dynamic uniforms then you wouldn't want this (although if you're clever you can still instance), but this implementation detects those situations and disables instancing accordingly. Nonetheless, if you really don't want instancing, then it seems to me that cloning the Geometry would actually make sense - you're specifically saying don't reuse the same geometry, treat each object as something different.

This is something that should never happen, I think. Cloning geometry makes sense if you want to alter that geometry. Geometry is just geometry; it sits there and gets called with different states for different draw calls.

@pailhead
Copy link
Contributor

pailhead commented Nov 13, 2016

This article is worth reading through for this topic:

http://blog.tojicode.com/2013/07/webgl-instancing-with.html

I also give a lot of weight to everything I read from pyalot :) (first comment)

The important bit being that this is done through attributes (thus saving all those divisions and instructions it takes to decode your texture) and it doesn't do a texture lookup.

@Benjamin-Dobell
Copy link
Contributor Author

Benjamin-Dobell commented Nov 13, 2016

@mrdoob @pailhead

Draw calls can be optimized without actually reducing them, right? For example, if the same geometry is being rendered with different materials it's good to have those calls happen consecutively? Also, if it's the same material binding different textures, the fewer state changes between the calls the better?

This is correct, and is my understanding of what a renderer should be responsible for.

However, the existing three.js renderers do not implement any optimisations like that.

This patch-set, however, literally implements the very optimisations you mentioned, i.e. ensuring we're not swapping materials and causing unnecessary uniform pushing. In my testing prior to this patch-set, pushing uniforms was the biggest bottleneck in the engine. By grouping the geometry together I've eliminated the program/uniform swapping. If you were to set minInstancingBatchSize to a large value (say Number.MAX_SAFE_INTEGER) then this is the behaviour you'd get - no instancing, but sorted rendering.

Another thing that might be considered is overdraw, although I'm a bit fuzzy on this topic. Say you want to cull nodes in some other way without actually drawing them; you then don't want to have them tied up in this batch?

Admittedly I'm not quite sure what you mean with regards to this. This patch-set does not circumvent any of three.js's in-built culling behaviour.

Does the size of the mesh matter?

Yes, larger meshes will benefit more. There is no down-side when the meshes are small.

Do the nodes transform

No, they're already being pushed to the GPU at the moment, they're just pushed one at a time, rendered, then the next object does the same thing. This is inefficient.

does the shader need to do the texture lookups and unpacking if the meshes are static?

Yes, the GPU still needs to know where to render objects regardless of whether they're static. This information needs to be provided to the GPU somehow.

If I understand correctly, you want to treat the scene as if it contains these 9144 boxes as nodes, but have the renderer automagically batch them. But why?

Why not?

Why should you be expected to implement instancing yourself?

The renderer is going to have a much better understanding than you about what can and cannot be efficiently optimised. Particularly when you consider that instancing only makes sense when you have detailed knowledge of the rendering pipeline.

Who's to say that instancing is even going to be a thing for every renderer three.js introduces in the future?

Sure it's a concept in WebGL right now, but in the future there might be something else more efficient a renderer should be doing for you, and not correctly structuring your scene as a real scene (in the scene graph) is counter to this.

Locking yourself into using an Instances class tightly couples the usage of three.js with the way WebGL currently works. Why a user should be expected to have a detailed understanding of how current GPUs work most efficiently when constructing a scene is beyond me.

If you are procedurally generating these it's trivial to use proper instancing (with attributes)

It's even more trivial to just insert the object into the scene graph and not care.

if you load it baked as one mesh it's also going to be relatively optimized, at the cost of consuming more memory, and a longer load time.

I'm not sure what you mean here.

This is something that should never happen, I think. Cloning geometry makes sense if you want to alter that geometry. Geometry is just geometry; it sits there and gets called with different states for different draw calls.

Agreed. However, I think we disagree on when geometry should be instanced. I see no reason why all geometry shouldn't be considered for instancing. Why should a user need to decide? What if you're building scenes dynamically and sometimes you have a small number of objects and sometimes large? What about if it's a game and objects are constantly being created?

@Benjamin-Dobell
Copy link
Contributor Author

Benjamin-Dobell commented Nov 13, 2016

@mrdoob Have you given any thought to introducing "renderer middleware"?

For the reasons I've listed above I really dislike the idea of having to manually assemble instances of geometry within a scene (or rather not within a scene). A quick summary of why:

  • Instancing is inherently an OpenGL/WebGL feature. I don't like the idea that the use of an Instances class might not work (or be meaningless) for another renderer.
  • In the future WebGL (or other renderers) may have a better solution than "instancing", by keeping this behind the scenes users don't lock themselves into using instances when it may no longer make sense. The renderer is in general going to be better equipped than a programmer to know what optimisations are best given its internal state.
  • I strongly dislike that instances are not part of the scene graph. By not including them as real objects in a scene:
    • The instances won't be frustum culled.
    • The instances won't be subject to any other scene graph rendering optimisations.
    • We can't take advantage of automatic world matrix updating from decomposed components.
    • We can't utilise three.js' collision detection (ray casting) functionality - this is critical for me as my scenes are interactive.
    • We can't easily add and remove instances from the scene - we'd need to keep track of an Instances object instead for every piece of Geometry in the scene.
    • We can't easily swap between instancing and not-instancing a piece of geometry if the number of instances fluctuates - we'd be inserting the objects in scene, then suddenly when they reached a critical point (where instancing is more efficient) we'd have to remove them from the scene and create an Instances object. Then perform the reverse if the number of instances drops.
    • We can't load a scene with instances - unless I or someone else wants to essentially reimplement object loading in all the different loaders specifically for instances.
    • Based on the proposed Instances interface we can't easily animate (or adjust) the transforms of the instances; the example API just sets up the instances' positions (and presumably orientation/scale) and they can't be changed. Certainly we could add an object-like abstraction... but we already have Object.
    • If we want to turn transparency on for an instance (i.e. make an instance see-through whilst selected by the user - which is something that'd be hard to keep track of with the proposed API), it needs to be removed from Instances, injected into the scene, then when we're done we need to remove it from the scene and inject it into Instances again.
    • We can't change the material of an instanced object. Again we'd need to explicitly remove it from our Instances and either add it to the scene or some other Instances we're manually keeping track of.
  • I can't imagine any situation in which users would explicitly want more draw calls. Turning it on by default makes sense - although, this patch-set does include a means to disable it and tweak as desired.

Please correct me if I'm wrong, but I think your only concern is cluttering the WebGLRenderer. I genuinely don't believe it is clutter at all; by taking this approach I've inherently reused all the behaviour already available to objects (which saves a huge amount of feature duplication). Instancing (or at least the way it's implemented with GLSL) is inherently a WebGL feature; it shouldn't really be part of the "core".

Nonetheless, if my assumptions are correct (again, please let me know if there's something else you're not fond of), then perhaps a reasonable solution would be to make the renderers capable of supporting middleware. Then I can try to pull the instancing functionality out into a middleware. I don't know if it'd actually work; given this is inherently a WebGL feature, the low-level coupling may be too much to come up with a reasonable middleware API.

But then people could enable this with something like:

var renderer = new WebGLRenderer();
var instancingMiddleware = new WebGLInstancingMiddleware();

// Configure as desired e.g.
instancingMiddleware.setTextureSize(128);
instancingMiddleware.setMinBatchSize(64);

renderer.addMiddleware(instancingMiddleware);

To me that seems like the best of both worlds. Drastically simplified usage; we get to take advantage of all of three.js' awesome scene graph functionality, and WebGLRenderer doesn't get (much?) larger.

Although, honestly I have no idea if this will work. A middleware API might expose too much of WebGLRenderer's internal logic, or just in general require a heap of code to be added to support the middleware. I'm not sure.

Sorry, I just realised I never addressed your comment:

I would prefer to implement this in a predictable way. A way where the user can predict the resulting drawcalls.

I'm a little unclear on what you mean with regards to that. The implementation is predictable. When an object can safely be instanced (by default non-transparent, but configurable) and doesn't have dynamic uniform-like behaviour, then when the number of instances (Object/Mesh) of a piece of Geometry in a scene exceeds minInstancingBatchSize, instancing will take place; otherwise it won't.
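
In pseudocode (the property names are this patch-set's; the callback and material checks below are a paraphrase, not the actual implementation):

function willInstance( renderer, objectsSharingGeometry ) {

    if ( renderer.autoInstancing === false ) return false;

    var eligible = objectsSharingGeometry.filter( function ( object ) {
        return ! object.material.transparent          // opaque only by default; configurable
            && ! hasDynamicUniformCallbacks( object ); // hypothetical check for render callbacks
    } );

    return eligible.length >= renderer.minInstancingBatchSize;

}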

Though, out of curiosity, why do you expect users to care about the draw calls? I would assume that by using a rendering library they specifically don't care, or at least would rather not deal with it unless absolutely necessary.

@Benjamin-Dobell
Copy link
Contributor Author

Benjamin-Dobell commented Nov 13, 2016

Oh, another thing... MultiMaterial. This patch-set handles this situation perfectly. An Instances object presumably would be constrained to entirely opaque objects.

My instancing occurs at the "render item" level, meaning that if a piece of Geometry has several materials, some transparent, some not, the opaque materials (geometry groups) will be instanced appropriately, but the transparent parts will not (as they need to be z-sorted). I imagine that would not be the case for Instances, at least not easily.

NOTE: This is my intention at least, I haven't thoroughly tested this behaviour just yet.

One other thing worth noting. In my code-base, I have furniture pieces. These are themselves entire scenes, loaded from other files. However, these are effectively my "instances". It's important to note these are made up of a node/object hierarchy, not just one Object with one piece of Geometry. Instead, my furniture is regularly made up of multiple pieces of Geometry; I'd have to add a lot of complicated code to try to keep track of Instances for all the Geometry within each of these furniture instances.

I'm not sure how different people use three.js, however, I imagine games would have a very similar concept with regards to loading "models". The proposed Instances API would be incredibly difficult to work with for this (common?) use case.

@pailhead
Copy link
Contributor

pailhead commented Jan 4, 2018

In your SceneInstancer example, it seems like anything that isn't instanced is not drawn; I only see a single render call with the instanced scene, but the _instanceScene only contains instances, not anything else that might've been in the original scene that didn't need to be instanced.

Totally right:

instance.addMesh( object ) 
+ object.visible = false //dont render in main scene
+ //object.material.visible = false //maybe this to not hide children?
myRenderer.render( this.sceneBatcher.getBatchedScene(), myCamera)
+ myRenderer.render( myScene , myCamera )

@trusktr
Copy link
Contributor

trusktr commented Jan 4, 2018

  • object.visible = false //dont render in main scene

Now you're modifying the scene graph. You're changing the semantic meaning of the visibility of the objects. Maybe the end user wants some objects to be visible for periods of time and not at others, and now the SceneInstancer is corrupting the user's visibility state.

@pailhead
Copy link
Contributor

pailhead commented Jan 4, 2018

Okay, I lied, PR is bigger, but the functionality of this PR is perfect (as opt-in), the SceneInstancer idea currently has a bunch more caveats and is more difficult for the end user.

Don't be so harsh on yourself, maybe you missed this option:

Large diffs are not rendered by default.

@pailhead
Copy link
Contributor

pailhead commented Jan 4, 2018

Now you're modifying the scene graph.

I may have a different use case scenario than John Doe, while Jane Smith may have one of her own. All of our use cases may differ from yours; that's why I wrote this code as an example of one use case.

My use case specifically was not to make the library larger by adding a feature I would never use, one that is so deeply embedded in the WebGLRenderer (which has been getting much more modular lately). @mrdoob is the devil here 🤣 I'm just trying to help you out; maybe there is something you can do with the library as-is that suits your needs :)

@pailhead
Copy link
Contributor

pailhead commented Jan 4, 2018

So in the case of visibility, what is your use case? Do you want the buffers to be recalculated?

@trusktr
Copy link
Contributor

trusktr commented Jan 4, 2018

I don't have a specific use case; I'm just thinking generically. If I'm making a scene today, a new scene for the next app, and a new scene for the app after that, I want to rely on the scene graph state rather than thinking "okay, so .visible is going to be automatically modified by the render loop, so I need to make my own visibility state to avoid having a conflict".

More specifically, in infamous, I'd rather not have to create extra state outside of the Three scene graph, because the underlying Three scene graph is not reliable as a source of state if it is being modified by something like the SceneInstancer.

@pailhead
Copy link
Contributor

pailhead commented Jan 4, 2018

You're out of luck it seems, looks like it may be a while until something like this makes it in.

@titansoftime
Copy link
Contributor

You're out of luck it seems, looks like it may be a while until something like this makes it in.

Which is a bummer, something like this would be incredibly useful.

@pailhead
Copy link
Contributor

pailhead commented Jan 4, 2018

It'd be nice if one of you could put these requests into a wiki or something that's easy to digest.

In the meantime you can take a look at this WIP. Not the most optimal, but I used most of the code above:

http://dusanbosnjak.com/test/webGL/three-auto-instancing/

The crazy scale is because I've got buggy animation logic.

update (benchmark)

(generic function to update the scene graph, anyone can use it)

let d = 1

function updateGraph(delta){
	for ( let i = 0 ; i < dynamicMeshes.length ; i ++ ){

		dynamicMeshes[i].rotation.x += delta 
		let scale = dynamicMeshes[i].scale.x
		
		if( scale < 1 ) 
			d = 1
		else if ( scale > 3 )
			d = -1 

		scale = ( scale * (1 + d * delta) )
		dynamicMeshes[i].scale.set(scale,scale,scale)

		dynamicMeshes[i].updateMatrix()
		dynamicMeshes[i].updateMatrixWorld(true)
	}

        //magic
	if(mode.batched){
		sceneInstancer.updateDynamic()
	}
}

scene graph setup (benchmark)

(generic three.js scene graph, anyone should be able to create it)

const geometryPool = []
geometryPool.push( new THREE.SphereBufferGeometry(1, 16, 16) )
geometryPool.push( new THREE.BoxBufferGeometry(1, 1, 1) )
geometryPool.push( new THREE.DodecahedronBufferGeometry(1,0) )
geometryPool.push( new THREE.CylinderBufferGeometry(0.5,0.5,2,12,1) )
geometryPool.push( new THREE.CylinderBufferGeometry(1,0,2,8,1) )
geometryPool.push( new THREE.IcosahedronBufferGeometry(1,0) )
geometryPool.push( new THREE.TorusKnotBufferGeometry(1,0.1,32,8,2,3) )

const materialTypes = []
materialTypes.push( new THREE.MeshPhongMaterial({color:'red'}))
materialTypes.push( new THREE.MeshLambertMaterial({color:'green'}))
materialTypes.push( new THREE.MeshStandardMaterial({color:'blue'}))

const staticMeshes = []
const dynamicMeshes = []
const num = 20 

const v3 = new THREE.Vector3()
const numInv = 1 / num
const numInvSq = numInv * numInv
const cubeStepSize = 5

function cube(index, size){
	const sizeHalf = size * 0.5 
	const x = index % 20 - sizeHalf
	const y = Math.floor(index * numInv) % 20 - sizeHalf
	const z = Math.floor(index * numInvSq ) % 20 - sizeHalf

	return v3.set(x,y,z).multiplyScalar(cubeStepSize)
}

const half = num * num * num / 2


for ( let i = 0 ; i < num * num * num ; i ++ ){
	const parent = new THREE.Group()
	parent.rotation.set(
		Math.random() * Math.PI * 2,
		Math.random() * Math.PI * 2,
		Math.random() * Math.PI * 2,
	)
	parent.scale.set(
		Math.random() * 2 + 1,
		Math.random() * 2 + 1,
		Math.random() * 2 + 1,
	)
	const m = new THREE.Mesh( 
		geometryPool[Math.floor(Math.random() * geometryPool.length)],
		materialTypes[Math.floor(Math.random() * materialTypes.length)] 
	)
	parent.position.copy(cube(i,num))
	staticMeshes.push(m)
	scene.add(parent)
	parent.add(m)
	
	m.matrixAutoUpdate = false
	parent.updateMatrixWorld(true)

	if(i >= half){
		m.userData.dynamic = true
		dynamicMeshes.push(m)
	}
}

const l = new THREE.DirectionalLight()
l.position.set(1,1,1)
scene.add(l)
scene.matrixAutoUpdate = false

renderer.sortObjects = false

magic?

const sceneInstancer = new SceneInstancer()

sceneInstancer.setScene( controller.scene )

results?

renderer.info //in console

If you zoom all the way out you should see 8000 draw calls in the non-batched version. With the batching it should be around 40, because the stuff is randomly generated. If you zoom in, however, you will render fewer vertices and three will do some culling. Where's the sweet spot here? To cull or not to cull, what's missing, etc.

I get 15 fps non-batched and 60 fps batched on a 1080ti.

@trusktr
Copy link
Contributor

trusktr commented Jan 4, 2018

Hmmm, if we had some hooks in the renderer like mentioned above, it might be possible to try making this happen with the InstancedMesh approach. I'm not sure how much performance impact creating another scene would have, though. Maybe only the first time the scene is created, and any time the user's scene is modified there'd need to be an update of the actually-rendered scene.

It's possible to implement something like Object.observe, but lighter weight, only observing certain properties (like with Custom Elements' observedAttributes for performance), that checks properties every frame. With this, we might be able to make a tree that is mapped from the user's tree. The only thing the user would need to do is

import AutoInstancer from 'three-plugin-auto-instancer'
renderer.hook('scene', AutoInstancer)

The plugin hook "scene" gives plugins an opportunity to do something to the scene or return a new one, right at the beginning of render(). Then three-plugin-auto-instancer can try something like your SceneInstancer idea @pailhead. It would intelligently observe the user's tree using some observation mechanism, and return a modified tree.

It would cache the generated tree, only returning an updated tree when changes were detected in the user's tree; otherwise it just no-ops. In many cases, work will only be done on the first render. For example, for scenes where materials and geometries aren't changing and only things like rotation and position are changing, there's nothing to update; the upcoming InstancedMesh that @mrdoob mentioned would continue to handle the position/rotation/etc. updates.

@trusktr
Copy link
Contributor

trusktr commented Jan 4, 2018

I think I'll give the observation mechanism a shot at some point. It doesn't necessarily need a hook, since the scene is passed into render() anyways:

import AutoInstancer from 'three-auto-instancer'
renderer.render( AutoInstancer( scene ), camera ) // runs the tool every frame too

though hooks in general might be nice for other cases that otherwise involve forking WebGLRenderer to tweak internals.

@pailhead
Copy link
Contributor

pailhead commented Jan 4, 2018

Aside from the syntax, what do you want it to do? Can you describe what's in your scene?

@mrdoob
Copy link
Owner

mrdoob commented Jan 4, 2018

is the upcoming InstancedMesh

The one from last year's GDC? 🤣

Yes! 🤣

This stuff takes time...

There are a lot of things that one has to take into account. Say... What if the extension is not available? Are there any related driver bugs? You don't want to hide all that logic in the renderer from the get-go. It's better to have it outside and, when things feel robust, then move it to the renderer if it makes sense. As an example, it took a while to implement scene.background in a robust way. And it's still not done; it doesn't support tone mapping, for instance, and there seem to be some bugs on Android/WebVR.

Also, anything that lands in the renderer is stuff you need to support for a very long time, otherwise users get mad that things break. So yeah, if I'm not confident it's something I can see myself maintaining for a long time and being responsible for (because contributors rarely maintain a feature after it lands), then it stays open until I feel like it's the right time, or it never gets merged.

@mrdoob
Copy link
Owner

mrdoob commented Jan 4, 2018

@Benjamin-Dobell

To be fair, some of these changes are actually fixing bugs in three.js, that are totally unrelated to batching:

36d2992
87607d8

Admittedly, they should probably be independently submitted as PRs.

Happy to cherry-pick these commits. Do you mind detailing a little bit what they are fixing?

@pailhead
Copy link
Contributor

pailhead commented Jan 4, 2018

I wouldn't mind maintaining it, but it's also working relatively fine outside of the renderer itself (the code looks a bit clunkier though)

@pailhead
Copy link
Contributor

I want to start some kind of auto-instancing project that wouldn't touch three.js code; collaborators welcome.

It would be nice if this whole thread resulted in some real world use cases, rather than these pretty simple benchmarks. It would have been nice if all this discussion resulted in some sort of a spec.

@Benjamin-Dobell
I'd love to see a version of the warehouse you posted. For something like that I think InstancedMesh should suffice and not be too complicated to set up. For that specific example, it feels like you should have the flexibility to maybe apply additional optimizations (how you group instances, how you cull them, how they get created, etc.)

@trusktr @lojjic
You describe a pretty complex dynamic scene graph. If something could be mocked, it would be great for testing.

@pailhead
Copy link
Contributor

Reading this for example:
https://docs.unity3d.com/Manual/DrawCallBatching.html

I'd say:

AutoInstancing

  • should manage batching / instancing automatically (the whole point of the feature)
  • should have criteria (seems like it can't be efficient or beneficial without constraints)
  • should batch meshes (guess)
  • should batch according to material (no other way? plus it's complex, they should have the same texture for example)
  • should autobatch only static objects (not moving). ( batching dynamic objects seems complicated )

etc.

@pailhead
Copy link
Contributor

pailhead commented Jan 11, 2018

Presence of this also seems relevant:
https://docs.unity3d.com/ScriptReference/Mesh.CombineMeshes.html

The equivalent of this in three.js would be if there was some kind of mesh container that does the geometry merge on first render and discards the sub meshes.

But then, I probably wouldn't want to defer something like this to the first render, when I could maybe download them in order and process them as they arrive or something.

So maybe rather than doing something that tackles the entire graph, it could do this for starters:

const material = new Material({map:wood})
const table = new MeshCombined(
  [ 
    Mesh('leg',material ), 
    Mesh('leg',material ), 
    Mesh('leg',material ), 
    Mesh('leg',material ), 
    Mesh('board', material)
  ], 
  mergeOrInstance //either merges everything together, or returns an autoinstanced mesh of legs, and regular mesh board
)
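
For the "merge" branch, one possible implementation (a sketch only; it assumes a single shared material and the BufferGeometryUtils.mergeBufferGeometries helper from the three.js examples of later releases; applyMatrix was the method name at the time, since renamed applyMatrix4):

function combineMeshes( meshes ) {
    const geometries = meshes.map( mesh => {
        const geometry = mesh.geometry.clone()
        mesh.updateMatrixWorld( true )
        geometry.applyMatrix( mesh.matrixWorld ) // bake each mesh's world transform into its vertices
        return geometry
    } )
    // all geometries must share the same attribute layout for the merge to succeed
    const merged = THREE.BufferGeometryUtils.mergeBufferGeometries( geometries )
    return new THREE.Mesh( merged, meshes[ 0 ].material )
}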

@titansoftime
Copy link
Contributor

I really hope something gets figured out, because this is a feature that is greatly needed.

@pailhead
Copy link
Contributor

pailhead commented Jan 11, 2018

@titansoftime Any chance you could help define what that “something” is? Any thoughts on the super crude list a couple of posts above?

@titansoftime
Copy link
Contributor

titansoftime commented Jan 11, 2018

A relatively easy way to batch meshes in order to eliminate excessive draw calls.

Automatic would be cool, but I understand that Mr. Doob prefers a more transparent API which is totally understandable.

I mean, if it were up to me I would prefer something like:

var group = new THREE.MeshBatch([mesh1,mesh2,mesh3,etc...]);
scene.add(group);

@pailhead
Copy link
Contributor

In other words "Make my 3d on the web faster"? While I totally agree with this in spirit, it doesn't really help set any kind of direction or anything to make this actually happen... :(

You can take a look at this stub and pitch in:

https://github.com/pailhead/three-instanced-mesh/blob/master/future/MeshCombine.js

If InstancedMesh is a relatively easy way to batch meshes, this should be an even easier way, but still very ambiguous.

@titansoftime
Copy link
Contributor

titansoftime commented Jan 11, 2018

const instance = new THREE.InstancedMesh(geometry,material,numInstances)

This is interesting, though iterating through a materialmap to set positions/rotations seems kinda strange (to me).

Not sure if this makes sense but:

var instance = new THREE.InstancedMesh(geometry,material,numInstances);

for( var i=0, len=instance.instances.length; i<len; i++ ){

  instance.instances[i].setPosition(new THREE.Vector3(x,y,z));
  // rotation as well

}

Meh ?

Edit* And yes please make my 3d webz faster =]

@titansoftime
Copy link
Contributor

Also, some thoughts on this:

should manage batching / instancing automatically (the whole point of the feature)
should have criteria (seems like it can't be efficient or beneficial without constraints)
should batch meshes (guess)
should batch according to material (no other way? plus it's complex, they should have the same texture for example)
should autobatch only static objects (not moving). ( batching dynamic objects seems complicated )

Auto instancing, honestly, is not SUPER important. I think I'd rather control that myself. Though it does sound kinda cool.

Criteria. Not really sure what to say about this.

Batching Meshes. This would be a cool feature, although the proposed method in the previous post (basically the same idea as yours) is more than fine as well.

Batching per material. I assumed an instanced geo group would have to have the same material. If it's possible to have multiple materials, well, that's great but not super important imo.

Batching static objects. Yes, I assumed this would be the case.

@pailhead
Copy link
Contributor

pailhead commented Jan 11, 2018

Off the top of my head, my guess would be that you can actually share a lot of materials under special conditions. Let's say you have 10 different shades of "plastic" (MeshStandardMaterial); as long as they're all opaque, have the same maps, etc., they will compile into the same program. Because they're plastic, their metalness and shininess may be the same, and only the color differs.

Some autobatcher could figure this out and make another attribute holding the per-material color (this is different from the per-instance faux vertex color that's in my repo).

If metalness, shininess and color are all different, those can also be stored in an attribute. Maps, however, would interfere with this. One material having a shininessMap and the other not wouldn't work. Two materials having two different maps also wouldn't work together. However, two materials using the same map with different offsets (atlas) could work.
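
A sketch of that idea (meshes and instancedGeometry are assumed to already exist; the attribute name is illustrative): move the per-material color into a per-instance attribute that the shared shader reads instead of a uniform.

const colors = new Float32Array( meshes.length * 3 )
meshes.forEach( ( mesh, i ) => mesh.material.color.toArray( colors, i * 3 ) )
instancedGeometry.addAttribute( 'instanceColor', new THREE.InstancedBufferAttribute( colors, 3 ) )

// vertex shader:   attribute vec3 instanceColor; varying vec3 vInstanceColor;
//                  vInstanceColor = instanceColor;
// fragment shader: use vInstanceColor in place of the material's diffuse color uniform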

@pailhead
Copy link
Contributor

Would it be helpful if you could run something like MeshCombined on arrays of meshes and compare the results? [leg,leg,leg,leg,board] is 5 drawcalls to begin with, but could be instanced to two, or merged into one. [foo,bar,baz,qux] is 4 drawcalls, but maybe both instancing and merging return 4 so you gain nothing. Through trial and error you could find some sweet spot for what kind of meshes to even attempt this on?

@titansoftime
Copy link
Contributor

I'd be more than happy to help benchmark the MeshCombined method.

Without a doubt that would add great utility. I'm thinking, however, the instance = new THREE.InstancedMesh(geometry,material,numInstances); method is clearer about what is happening. The MeshCombined method has, as you noted, somewhat unpredictable behavior. Along with its complexity, I fear that the PR may never be merged.

Either way, I am more than happy to assist to the best of my ability (I am not a 3d programming genius unfortunately).

@pailhead
Copy link
Contributor

Doesn't have to be merged; I'd make it its own package, depending either on the InstancedMesh package or only on three if that class makes it in.
Worst case scenario you get no gain but everything is the same; best case it helps. I'll fiddle with it a bit. Handling materials is going to be tricky.

@wmcmurray
Copy link
Contributor

I just stumbled upon this HUGE thread 😱 It focused a lot on adding the functionality to the core, but nothing prevents anyone from creating a wrapper around the Scene that will batch everything automatically BEFORE adding stuff to the actual ThreeJS Scene. This way everyone can implement their own "magic strategy" inside their app, while the ThreeJS core remains clean and predictable 👌🏼

I wrote more details about this here (it might help you get started):
https://woodenraft.games/blog/separation-of-view-and-state-in-threejs
