
Working version using swaps #44

Merged: 19 commits, Dec 20, 2018
Conversation

@mqp (Contributor) commented Dec 13, 2018

I worked toward an implementation that reorders the list of tris, as suggested in #42 and as we discussed earlier.

The code as written does this by leaving the original geometry object alone, and by reordering and keeping a reference to the origTris array (now BVHConstructionContext.tris). This seems slightly faster during construction, but in this implementation it actually uses more memory in the resulting tree -- each node holds a reference to the one big array plus an offset and count, whereas in the old version each node referred to a small slice of the big array. So really this is just a start.

The obvious next step is to reorder the actual geometry object per the reordered tris array, so that we don't have to keep this "array of pointers" representation around next to the final tree. That will really be a nice improvement over the old version.

Other changes in this implementation:

  • I went back to recursing directly in createNode instead of using a queue of nodes. I like doing this because it means we're doing the tree construction left-to-right, depth-first, making it more comprehensible to e.g. reorder geometry incrementally when we have an edge node to construct, because we'll never have to look at the triangles in that edge node again later in tree construction. I'm not concerned about a stack overflow because we control the maximum tree depth anyway. It's also nice not to construct garbage closures in each createNode invocation.

  • I split some MeshBVH construction functionality into a BVHConstructionContext object, and moved some of the GeometryUtilities stuff which required precomputed pieces of construction data (e.g. centroids, bounds) into class methods on that. I'm not sure I can really say this is better; it just made it easier for me to keep everything organized while I was making this change. I'm open to changing it back or further reorganizing.

Open question: How serious are we about supporting non-BufferGeometry geometries? It will make this reordering idea rather more complicated if we have to care about multiple geometry representations. I personally would not bother supporting them, at least as an output. (In the case of outputting an indexed BufferGeometry, we have it especially easy, because we can just change the index and leave all the attributes alone; otherwise, we need to reorder all the attribute data as well.)
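To illustrate the easy indexed case, roughly this (a sketch only -- the helper name and shape are illustrative, not code from this PR):

// Build a new index buffer in BVH order; the position/normal/uv
// attributes are left untouched because only the index changes.
function buildReorderedIndex( geometry, reorderedTris ) {

	const srcIndex = geometry.index.array;
	const dstIndex = new Uint32Array( srcIndex.length );
	for ( let i = 0; i < reorderedTris.length; i ++ ) {

		const tri = reorderedTris[ i ];
		dstIndex[ 3 * i + 0 ] = srcIndex[ 3 * tri + 0 ];
		dstIndex[ 3 * i + 1 ] = srcIndex[ 3 * tri + 1 ];
		dstIndex[ 3 * i + 2 ] = srcIndex[ 3 * tri + 2 ];

	}

	geometry.setIndex( new THREE.BufferAttribute( dstIndex, 1 ) );

}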

@gkjohnson (Owner)

This pull request introduces 2 alerts when merging b47fcfd into 96a159b - view on LGTM.com

new alerts:

  • 2 for Unused variable, import, function or class

Comment posted by LGTM.com

@gkjohnson (Owner)

This pull request introduces 2 alerts when merging e8a7622 into 96a159b - view on LGTM.com

new alerts:

  • 2 for Unused variable, import, function or class

Comment posted by LGTM.com

@gkjohnson (Owner)

Hey @mquander!

This is great. I know it's still a work in progress but I'll go through the code in a bit and add a few comments.

To answer a few of your questions:

... but actually results in using more memory in the resulting tree in this implementation...

I think this is okay for now -- enabling the re-ordering of the geometry indices directly is a good long-run plan.

I split some MeshBVH construction functionality into a BVHConstructionContext object... I'm open to changing it back or further reorganizing.

Separating the build logic out into a separate file is a good change, in my opinion. The tree constructor was getting pretty unwieldy. I do find keeping the construction operations as functional as possible to be a bit more traceable, though, especially if there's no internal caching going on -- getCentroids(tris, geo, getValFunc) vs BVHConstructionContext.computeCentroids(), for example. It's definitely a matter of personal preference, though.

How serious are we about supporting non-BufferGeometry geometries?

I'm comfortable removing support for regular THREE Geometry. There's been talk of deprecating THREE.Geometry and I think it's generally a fine assumption that BufferGeometry should be used for anything moderately complicated.

... we need to reorder all the attribute data as well.

I would think generating a triangle index if one doesn't exist rather than re-ordering the attributes is an okay plan, for now. That sounds like a lot of complexity that would be great to avoid.
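Something along these lines, perhaps (just a sketch; the helper name is illustrative):

// Generate the trivial sequential index for non-indexed geometry so
// the BVH can re-order triangles by rewriting the index alone.
function ensureIndex( geo ) {

	if ( ! geo.index ) {

		const vertexCount = geo.attributes.position.count;
		const ArrayType = vertexCount > 65535 ? Uint32Array : Uint16Array;
		const index = new ArrayType( vertexCount );
		for ( let i = 0; i < vertexCount; i ++ ) index[ i ] = i;
		geo.setIndex( new THREE.BufferAttribute( index, 1 ) );

	}

}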

lib/MeshBVH.js Outdated

  // create the two new child nodes
- if ( ! left.length || ! right.length ) {
+ if ( index === offset || index === offset + count ) {
@gkjohnson (Owner):

This is basically the case where all of the triangles wind up on one side of the partition, right? Shouldn't it be possible to know this ahead of time and avoid partitioning? Granted I'm not sure if there's a lot to gain from checking that.

@mqp (Contributor, Author):

I agree, it feels like there is some duplication of concerns between this and the splitting code (which also can choose to return no-split-required.) Perhaps I will try to clean it up.

@mqp (Contributor, Author) commented Dec 14, 2018:

On reflection, I think this is actually not just awkward, but wrong. Consider the AVERAGE and CENTER strategies. If it happens to be that all of the triangle centroids wind up on one side of the average of the longest edge -- which seems strange but possible given that the longest edge computation is using the full bounding box, but partitioning is using the centroid -- this code will terminate tree construction, even though maybe if you did a split again you would continue to make useful splits. So something should be different here.
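Concretely, the two split positions come from different data (a sketch; it assumes the precomputed centroids live in a flat xyz-per-triangle array, and boundsMin, boundsMax, axis, offset, count, and tris are the obvious node-local values):

// CENTER: midpoint of the node's bounding box along the longest axis.
// The box extent comes from triangle vertices, so every centroid can
// legitimately land on one side of this position.
const splitCenter = ( boundsMin[ axis ] + boundsMax[ axis ] ) / 2;

// AVERAGE: mean of the triangle centroids along that axis. Unless all
// centroids are equal, at least one must land on each side of the mean.
let splitAverage = 0;
for ( let i = offset; i < offset + count; i ++ ) {

	splitAverage += centroids[ tris[ i ] * 3 + axis ];

}
splitAverage /= count;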

@gkjohnson (Owner):

Would this be a problem in the AVERAGE split? In that case the average of all centroids is taken, which would have to mean that some centroids land on either side of the split. In the case of CENTER it would seem that this is just an artifact of choosing a less smart split strategy.

Ultimately I think this is okay.

@mqp (Contributor, Author) commented Dec 14, 2018

This is getting closer to something I endorse. I made some other small fixes that were easy and hopefully uncontroversial.

The failing test looks like it's because our test data has an "ambiguous" result where there are two equally close triangles that intersect the ray? I will look at it further.

This version now soundly cleans up on all the benchmarks while also using less memory for the tree structure (especially if the geometry was already indexed.) For whatever reason, it speeds up the in-browser example bounds tree creation a lot more than it speeds up the benchmark bounds tree creation -- in the example in Firefox it seems to go from 110 ms to 60 ms, whereas on the benchmark it goes from about 30 ms to 25 ms.

@gkjohnson (Owner)

This pull request introduces 3 alerts when merging ff1abdf into 96a159b - view on LGTM.com

new alerts:

  • 3 for Unused variable, import, function or class

Comment posted by LGTM.com

@gkjohnson (Owner)

This pull request introduces 1 alert when merging 305013c into 96a159b - view on LGTM.com

new alerts:

  • 1 for Unused variable, import, function or class

Comment posted by LGTM.com

@mqp force-pushed the reorder-tris branch 2 times, most recently from 008d3de to 1d2e3dd (December 14, 2018 09:44)
@gkjohnson (Owner)

I'm gonna go through and see how everything looks but here are the benchmarks I ran for master against this branch so far:

Benchmark Before

Compute Bounds Tree      : 28.12 ms
Default Raycast          : 1.452785 ms
BVH Raycast              : 0.16852 ms
First Hit Raycast        : 0.08895 ms

Total Memory Usage       : 2633.052 kb
Geometry Memory Usage    : 1134.216 kb
Bounds Tree Memory Usage : 1498.836 kb

Benchmark After

Compute Bounds Tree      : 24.84 ms
Default Raycast          : 1.525407 ms
BVH Raycast              : 0.099526 ms
First Hit Raycast        : 0.10233 ms

Index Memory Usage       : 300.184 kb
BVH Memory Usage         : 758.626 kb

The memory improvements look great but I'm a little confused by the discrepancy between BVH Raycast and First Hit Raycast -- "BVH Raycast" seems to consistently be faster than "First Hit", which makes me feel like something is wrong. "First Hit" also seems to have gotten slower and "BVH" significantly faster, which I wouldn't have expected. Are you seeing similar results?


- node.tris = tris;
+ ctx.writeReorderedIndices( offset, count, indices );
+ node.offset = offset;
@gkjohnson (Owner):

This is probably minimal savings but is it necessary to store the offset and count fields for anything other than leaf nodes?

@mqp (Contributor, Author) commented Dec 19, 2018:

No, but this branch is making a leaf node, so it seems to me it should have them. (Although as I mentioned above, I'm not a fan of this branch.)

@gkjohnson (Owner):

Got it 👍

I see now that the branch below is the internal node case and does not have these fields added.

this.bounds = computeBounds( geo );

// a list of every available triangle index
this.tris = new Array( getTriangleCount( geo ) );
@gkjohnson (Owner):
The reason will be displayed to describe this comment to others. Learn more.

What's the value of the tris array? I don't see it being changed anywhere so it seems to always hold the current index for that position in the array.

@gkjohnson (Owner):
The reason will be displayed to describe this comment to others. Learn more.

Nevermind I'm seeing that partition() updates the tris array

@gkjohnson (Owner):
The reason will be displayed to describe this comment to others. Learn more.

Next related question: Why not just operate directly on the index array rather than creating a new tris array and then writing the new order to the index buffer?

@mqp (Contributor, Author):
The reason will be displayed to describe this comment to others. Learn more.

We need each invocation of createNode to not only refer to the correct vertices, but also the correct precomputed bounding boxes and precomputed centroids. tris is an index into all three of those things at once. If we didn't have tris, each swap operation would have to not only swap three elements of the index (one per vertex), but six elements in the list of triangle bounding boxes and three elements in the list of centroids. I didn't measure it, but I suspected this was likely to be slower and was sure to be uglier than keeping this extra level of indirection.
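In code, the per-swap difference looks roughly like this (a sketch, assuming the flat layouts: 3 index entries, 6 bounds floats, and 3 centroid floats per triangle):

function swap( arr, i, j ) {

	const tmp = arr[ i ];
	arr[ i ] = arr[ j ];
	arr[ j ] = tmp;

}

// With the tris indirection: one two-element swap.
function swapTris( tris, a, b ) {

	swap( tris, a, b );

}

// Without it: every partition swap has to touch three arrays.
function swapDirect( index, bounds, centroids, a, b ) {

	for ( let i = 0; i < 3; i ++ ) swap( index, 3 * a + i, 3 * b + i );
	for ( let i = 0; i < 6; i ++ ) swap( bounds, 6 * a + i, 6 * b + i );
	for ( let i = 0; i < 3; i ++ ) swap( centroids, 3 * a + i, 3 * b + i );

}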

@gkjohnson (Owner):
The reason will be displayed to describe this comment to others. Learn more.

Yeah I see, now. This is more of an index into the index array.

@mqp (Contributor, Author):
The reason will be displayed to describe this comment to others. Learn more.

Yes, it's tricky to think about.

lib/MeshBVH.js Outdated
- const n = createNode( origTris, ( new THREE.Box3() ).copy( geo.boundingBox ), this );
- while ( queue.length ) queue.pop()();
+ const n = createNode( boundsToArray( geo.boundingBox ), this, 0, ctx.tris.length );
+ n.index = new THREE.BufferAttribute( indices, 1 );
@gkjohnson (Owner):
The reason will be displayed to describe this comment to others. Learn more.

Why not just take over the existing index array or generate one if it doesn't exist and use that?

@mqp (Contributor, Author):
The reason will be displayed to describe this comment to others. Learn more.

Well, the way we construct the new index array is by incrementally filling it up from left to right as we bottom out at leaf nodes. If we modified the old array instead, we might be overwriting indices that we still need to read from for future tree construction (because the old array isn't in sorted-triangle-order -- see the above explanation of how we're using tris to keep track of the sort order instead of modifying the index directly.)

It's a good point that we wouldn't have to allocate a new index if we swapped the index data instead of using tris.

- this.tris = null;
- this.children = null;
- this.splitAxis = - 1;
+ // internal nodes have boundingData, children, and splitAxis
@gkjohnson (Owner):

I'm wondering if there's a performance trade-off here by not defining all the fields on every node. Here's a benchmark that looks into the performance implications of using differently shaped objects:
https://github.com/davidmarkclements/v8-perf#polymorphic-vs-monomorphic-code
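In miniature, the concern is something like this (illustrative only, not our actual node construction):

// Monomorphic: every node has the same field set, so the engine can
// use a single hidden class and keep property accesses fast.
function makeNodeMono() {

	return { boundingData: null, offset: - 1, count: - 1, children: null, splitAxis: - 1 };

}

// Polymorphic: leaf and internal nodes get different hidden classes,
// which can make call sites that handle both kinds slower.
function makeLeaf() {

	return { boundingData: null, offset: 0, count: 0 };

}

function makeInternal() {

	return { boundingData: null, children: null, splitAxis: - 1 };

}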

@mqp (Contributor, Author) commented Dec 19, 2018:

Yeah, having an extra type in the inline caches must cost something, but changing this is basically cutting the tree size in half, which is so much that I suspect it must be faster (i.e. due to more nodes fitting in a page.) I didn't carefully measure it, though, because even if it were a little slower I would still want to do it to cut the memory usage.

@gkjohnson (Owner):

I did a little bit of testing and didn't see any significant changes so this seems okay.

If you really wanted to save more memory you could store count and offset in a more optimally sized array (Uint16Array, Uint32Array, etc.) chosen based on the length of the index array.
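e.g. something like this (a sketch; nodeCount stands in for however many nodes the tree ends up with):

// Size the storage to the index length: 16-bit offsets suffice when
// the index is short enough, halving the memory for these fields.
const indexCount = geo.index.count;
const OffsetArray = indexCount > 65535 ? Uint32Array : Uint16Array;
const offsets = new OffsetArray( nodeCount );
const counts = new OffsetArray( nodeCount );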

@mqp (Contributor, Author):

True. But the long-term approach for making things more efficient from here is to serialize the whole output tree into one big typed array, so this intermediate step isn't super appealing to me.

lib/GeometryUtilities.js Outdated (thread resolved)
@gkjohnson (Owner)

I was digging a bit more into the performance discrepancy between raycastFirst and raycast and it seems that this line may be incorrect:

const leftToRight = ray.direction[ xyzFields[ this.splitAxis ] ] <= 0;

If the "left" side is assumed to be negative and the "right" positive then a positive value should be considered moving left the right. Changing the line to use >= makes raycastFirst faster.

It's a little concerning that the tests pass in either case here, but it makes sense: if the farther child is checked first, the closer bounds will always be checked next, and because that surface is closer, its intersection will be returned.
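For reference, the traversal this feeds into is roughly the following (a sketch, not the literal code; nearChild/farChild are illustrative names):

// With >=, a ray pointing in the +axis direction visits the child on
// the negative ("left") side first, i.e. the side it enters first.
const leftToRight = ray.direction[ xyzFields[ this.splitAxis ] ] >= 0;
const nearChild = leftToRight ? this.children[ 0 ] : this.children[ 1 ];
const farChild = leftToRight ? this.children[ 1 ] : this.children[ 0 ];

// raycastFirst can return as soon as the near child yields a hit; the
// far child only needs checking when the near one misses (the real
// code must also guard the case where the far child could still hold
// a closer hit, since sibling bounds can overlap).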

For whatever reason, it speeds up the in-browser example bounds tree creation a lot more than it speeds up the benchmark bounds tree creation

The knot geometry used in the benchmark is different, so it's possible that the structure of the geometry causes it to hit max depth or the max-triangles-per-leaf limit sooner, in a way that doesn't show the speed improvements as much. I wouldn't be against changing the geometry in the benchmark.

@mqp (Contributor, Author) commented Dec 19, 2018

Wow, thanks for finding that! Talk about defeating the whole purpose of raycastFirst.

@mqp (Contributor, Author) commented Dec 19, 2018

Now I get benchmark results that make more sense, with regard to the proportion between full raycast and first-hit-only:

Compute Bounds Tree      : 22.94 ms
Default Raycast          : 1.635422 ms
BVH Raycast              : 0.126518 ms
First Hit Raycast        : 0.063944 ms

Index Memory Usage       : 300.184 kb
BVH Memory Usage         : 758.626 kb

(Remember when this benchmark looked like this? 😮)

@mqp (Contributor, Author) commented Dec 19, 2018

So, now that this is approaching its final form, do you still like the distinction between MeshBVH.createNode and the separate BVHConstructionContext class?

In my judgment, I'm tempted to make BVHConstructionContext a value type that just hangs onto tris, centroids, bounds, and sahplanes (the big things that we will throw out after tree construction), move functions like partition, split, and shrinkBoundsTo back onto MeshBVH, and have them take the context as an argument. I like that structure because I care about clarifying the distinction between runtime-tree-data and construction-time-tree-data, but it seems weird to have createNode be in a different module than all the subroutines it relies on.

@mqp (Contributor, Author) commented Dec 20, 2018

OK, this branch seems like a Pareto improvement over the existing code now, so I'm happy to merge whenever you are satisfied with it. Once it's merged I might actually put this version in Hubs because I think it's probably fast and memory-efficient enough to be better than nothing for typical medium-sized models. (Of course, it will still be a very appealing improvement to web-workerize it, and to serialize the whole tree. I think the benchmark drastically underestimates the actual size of the tree in memory because it's not taking into account the kind of overhead that the JS interpreter actually has per object, like prototype chains and stuff -- for example, you can check out the Chrome memory profiler's opinion on the example page.)

@gkjohnson (Owner)

This looks great! The benchmarks have come a long way.

In my judgment, I'm tempted to make BVHConstructionContext a value type that just hangs onto tris

This sounds like a fine change to me. I think it might make things easier to follow, too. It's a little unclear atm when a function belongs on the context and when it belongs on the MeshBVH class. We can re-org in a separate effort, though.

I might actually put this version in Hubs because I think it's probably fast and memory-efficient enough to be better than nothing for typical medium-sized models

I'd be excited to see this. I'd like to get this up on NPM as well -- there are a few things that have to happen before it's ready for that, I think.

Of course, it will still be a very appealing improvement to web-workerize it, and to serialize the whole tree.

I was thinking about this a bit more and it occurred to me that there might be some complexity in sending the position and index buffers over to the worker. We'd have to either have both indices stored as SharedArrayBuffers or copy an ArrayBuffer to send it over, right? It's workable but the copy is a bit of an annoyance.
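Roughly the two options, with a hypothetical worker (a sketch):

// Option 1: copy, then transfer the copies. The main thread keeps its
// buffers; the worker takes ownership of the clones without a second
// structured-clone copy.
const posBuffer = geometry.attributes.position.array.buffer.slice( 0 );
const indexBuffer = geometry.index.array.buffer.slice( 0 );

worker.postMessage(
	{ position: posBuffer, index: indexBuffer },
	[ posBuffer, indexBuffer ] // transfer list: buffers are moved, not cloned
);

// Option 2: allocate the attribute storage on SharedArrayBuffers up
// front, so both threads can see the data with no copy at all.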

I think the benchmark drastically underestimates the actual size of the tree in memory because it's not taking into account the kind of overhead that the JS interpreter actually has per object...

I'm wondering if we can do anything to improve the estimate.

@mqp (Contributor, Author) commented Dec 20, 2018

I was thinking about this a bit more and it occurred to me that there might be some complexity in sending the position and index buffers over to the worker. We'd have to either have both indices stored as SharedArrayBuffers or copy an ArrayBuffer to send it over, right? It's workable but the copy is a bit of an annoyance.

That may be accurate. The dream would be to put the model loader also into a worker together with the BVH generation, in which case we can just transfer it all over when that worker is done. I see no reason why that shouldn't be possible. Right now, though, our model loading isn't done in a worker because the three.js mechanism to workerize that is super confusing to me.
