
Features in geometry tree tailored for clash detection. See #4278. #4282

Closed · wants to merge 42 commits

Conversation

@Moult Moult commented Feb 1, 2024

Definitely do not merge :)

First attempt at implementation of intersection clash checks. See #4278.

This considers only protruding clashes. It does not consider touching nor encroaching clashes and does not consider protrusion direction in the protrusion distance.

Still a lot to move around and tweak, so it's also not yet ready for a detailed review. You'll see lots of random prints and probably horrifically incorrect usage of pointers and structs and whatnot, but at least something is committed to a branch for now...

@Moult Moult requested a review from aothms February 1, 2024 07:33
@Moult Moult marked this pull request as draft February 1, 2024 07:34
…ults.

Add cache for points that have already been checked.

Separate out a new add_triangulated function with a new kwarg, add_element(should_triangulate=True), so that you can still have the old geom tree using AABB only if you want, also because add() is used in boolean_utils. Tolerance is now a kwarg.

I misunderstood the OBB dimension and should multiply by 2, not divide by 2, which now makes it much slower since there are more triangles to check.
… optimal OBBs, check OBB first for points_in_b prior to doing raycast (much faster now), and return the surface point of the protrusion too for convenience.
…rotrusion)

This should be equivalent to select(element) but faster and with an allow_touching toggle.
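One of the commit messages above mentions checking the OBB first for points_in_b before doing the raycast. As a rough illustration only (not the PR's C++), a point-in-OBB containment test projects the point's offset from the box center onto the box's local axes. The representation here (center, orthonormal axes, half-extents) and the function name are hypothetical:

```python
import numpy as np

def point_in_obb(p, center, axes, half_extents, tol=0.0):
    # Project the point's offset from the OBB center onto each local axis;
    # the point is inside iff every projection is within the half-extent.
    d = np.asarray(p, dtype=float) - center
    for axis, extent in zip(axes, half_extents):
        if abs(np.dot(d, axis)) > extent + tol:
            return False
    return True

# Axis-aligned unit cube centered at the origin, expressed as an OBB.
center = np.zeros(3)
axes = np.eye(3)
half = np.array([0.5, 0.5, 0.5])
print(point_in_obb([0.1, 0.2, 0.3], center, axes, half))  # True
print(point_in_obb([0.9, 0.0, 0.0], center, axes, half))  # False
```

Because this is a handful of dot products per point, it is far cheaper than a raycast, which is presumably why doing it first pays off.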
Moult commented Feb 4, 2024

I think this is now ready for a discussion (but not for merging) :)

You'll notice this PR is mostly lines added, not removed because I didn't want to touch existing functionality.

Things not yet implemented:

  - [ ] Many-many filtered clash sets
  - [ ] Self clashing
  - [ ] Considering piercing clashes in "intersection" clash mode
  - [ ] Nicer returns rather than independent protrusion_distances etc
  - [ ] Clash grouping
  - [ ] Duplication clashing
  - [ ] Box check for clearance distance
  - [ ] No-return-early option for clearance distance
  - [ ] Contained-within for collision checks
  - [ ] Multithreading!
  - [ ] UBTree to BVH tree upgrade
  - [ ] Early return for protrusion threshold
  - [ ] Tree loading / saving
  - [ ] Piping clash tests
  - [ ] Ray intersection

aothms commented Feb 7, 2024

Super awesome!

I see a couple of follow up tasks:

  - [ ] move the PhysX code to a separate file (.cpp); maybe just use the whole lib as a proper dependency
  - [ ] separate product geometries into solids/shells so that we can rely on a BVH-based point containment test instead of non-deterministic'ish ray hit counting
  - [ ] most unordered maps should probably be vectors because the key is simply a contiguous range of ints
  - [ ] factor out the common bits of the clash functions
  - [ ] I think the distinction between add/add_triangulated shouldn't be made at insertion time but rather at tree initialization
  - [ ] filter out invalid tris before tree insertion instead of maintaining a map of bools
  - [ ] as you said, replace triangleintersects.hpp with the PhysX equivalent
  - [ ] tag open shells so that we know we can't rely on/do containment tests
  - [ ] I would probably store the points in a vector and index into them for every triangle, as opposed to storing all three points duplicated, to save some mem

I don't think I'm going to touch any of the geometric predicates. It sounds rather thought through. In some cases I'd have different preferences (like I conceptually don't really like how the distinction between piercing and protruding happens based on loose edges/verts as opposed to topology, but don't see a quick way out of resolving that without a more proper triangle mesh datastructure)

I can work on these follow-up tasks if you want, but then we need to agree on some "handover time" so that we aren't losing time on conflicts.

Moult commented Feb 8, 2024

> filter out invalid tris before tree insertion instead of maintaining a map of bools

I'm not sure how to do this. The BVH tree is created from a triangle_set, and a triangle_set is created from a shape list, and a shape list is a list of TopoDS_Face, not triangles. I'm guessing somehow under the hood it reuses the triangulation data from incremental mesh, but I don't know how to get at the triangles until after the BVH is actually created.

Moult commented Feb 8, 2024

A small benchmark regarding the vertex index indirection, on a 40MB data set, clashing all 6,400 elements against each other using an intersection check (without early returns). I'm measuring this portion of the code:

    for element in sorted(all_els, key=lambda e: e.Name):
        clashes = tree.clash_intersection(element, tolerance=0.002, check_all=True)
| Test | Mem (excl. 682MB for file.open()) | Time |
| --- | --- | --- |
| Before indirection | 1,473MB | 10.2s |
| After indirection | 1,112MB | 12.7s |

Note: negligible time difference in the tree adding portion of the code.
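As a rough Python illustration of the tradeoff measured above (not the PR's C++): storing each triangle's three points inline duplicates shared vertices, while indirection keeps one shared vertex buffer plus an index per corner. The toy mesh here is made up:

```python
import numpy as np

# A toy mesh: 4 vertices, 2 triangles sharing an edge.
verts = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]], dtype=np.float64)
faces = np.array([[0, 1, 2], [0, 2, 3]], dtype=np.int32)

# "Before indirection": every triangle carries its own three points.
duplicated = verts[faces]          # shape (2, 3, 3), points copied per triangle
# "After indirection": a shared vertex buffer plus an index per corner.
indexed_bytes = verts.nbytes + faces.nbytes

print(duplicated.nbytes, indexed_bytes)  # 144 120
```

The saving grows as more triangles share vertices, while each point access now costs an extra index lookup, which lines up with the measured memory drop and the slightly slower clash time.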

Moult commented Feb 8, 2024

OK I think I've done what I can. The remaining unchecked boxes I either don't know how to do properly or probably need a bigger architectural decision (also probably worth thinking what signatures to use for many-many clashes). I'll be pens down for the rest of today if you want to write code :)

Moult commented Feb 8, 2024

Current benchmarks :) Notice the growing majority of time in opening, building the tree, and the crazy high memory use.

All times in seconds. Memory in MB. All elements are clashed against all other elements.

  • Open = time to ifcopenshell.open(...)
  • Tree = time to create tree, either (B) baseline or (T) with triangulation / elem BVH
  • Col = collision, allowing touching
  • Int = intersection, 2mm tolerance with (R) return early or (A) check all
  • Clr = clearance check of 100mm
  • Sel = tree.select(e), which is functionally equivalent to Col or Int (if you need protrusion distances - except that Sel distances aren't correct atm)
  • Mem = memory, where (B) represents a "baseline" of only ifcopenshell.open() and creating the current box-based UBTree, and (A) represents the peak memory usage after everything including triangulated tree building and clashing. Measured using print(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1000)

Note that tree.select(e, extend=x) is not measured (equivalent to Clr) because it's simply too slow (gave up measuring after 5 minutes).

| Dataset | # Objs | Open | Tree(B) | Tree(T) | Col | Int(R) | Int(A) | Clr(R) | Sel | Mem(B) | Mem(A) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 40MB of Elec / Fire | 6,412 | 1.65 | 1.85 | 3.3 | 0.12 | 0.35 | 0.76 | 0.65 | 9.5 | 995 | 1,760 |
| 200MB of Elec / Fire / Hyd / Mech | 18,513 | 8.8 | 34.21 | 39.6 | 0.46 | 1.8 | 3.9 | 20.5 | 130 | 8,410 | 10,820 |

Moult commented Feb 11, 2024

The latest commit introduces many-many variants of the clash functions: clash_intersection_many, clash_collision_many, and clash_clearance_many. I don't think there's any reason to keep the old 1:N clash functions (e.g. clash_intersection) because the N:N versions are always significantly faster. See the edit I've made to the benchmarks post for the huge improvement in numbers :) (especially when the same objects are in both set A and B).

Do you see any reason to keep the 1:N functions or can / should I delete them?
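A sketch of one reason the N:N variants can win, with a hypothetical check_pair function and toy interval "elements": an N:N pass tests each unordered pair exactly once, whereas calling a 1:N function once per element tests every symmetric pair twice.

```python
from itertools import combinations

def clash_one_to_n(query, others, check_pair):
    # 1:N style: test one query element against every other element.
    return [(query, o) for o in others if check_pair(query, o)]

def clash_many_many(elements, check_pair):
    # N:N style: each unordered pair is tested exactly once, so clashing a
    # set against itself does roughly half the pair checks of N separate
    # 1:N calls.
    return [(a, b) for a, b in combinations(elements, 2) if check_pair(a, b)]

# Toy "elements" as 1D intervals; a pair clashes when the intervals overlap.
intervals = {"a": (0, 2), "b": (1, 3), "c": (5, 6)}
overlap = lambda x, y: intervals[x][0] < intervals[y][1] and intervals[y][0] < intervals[x][1]
print(clash_many_many(list(intervals), overlap))  # [('a', 'b')]
```

The real functions presumably also amortise broad-phase tree traversal across the whole set, which a per-element loop cannot.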

Moult commented Feb 12, 2024

I had a shot at implementing multithreading. I attempted something similar to

    for (auto& rep : tasks_) {
        MAKE_TYPE_NAME(Kernel)* K = nullptr;
        if (threadpool.size() < kernel_pool.size()) {
            K = kernel_pool[threadpool.size()];
        }
        while (threadpool.size() == conc_threads) {
            for (int i = 0; i < (int)threadpool.size(); i++) {
                auto& fu = threadpool[i];
                std::future_status status;
                status = fu.wait_for(std::chrono::seconds(0));
                if (status == std::future_status::ready) {
                    process_finished_rep(fu.get());
                    std::swap(threadpool[i], threadpool.back());
                    threadpool.pop_back();
                    std::swap(kernel_pool[i], kernel_pool.back());
                    K = kernel_pool.back();
                    break;
                } // if
            } // for
        } // while
        std::future<geometry_conversion_task*> fu = std::async(
            std::launch::async,
            [this](
                IfcGeom::MAKE_TYPE_NAME(Kernel)* kernel,
                const IfcGeom::IteratorSettings& settings,
                geometry_conversion_task* rep) {
                this->create_element_(kernel, settings, rep);
                return rep;
            },
            K,
            std::ref(settings),
            &rep);
        threadpool.emplace_back(std::move(fu));
    }
but failed horribly. I ended up using a mutex with a very naive approach:

  1. After the box BVH clash, create a task queue of clashes
  2. Divide that queue by num_threads
  3. Use threads to do the work, use a mutex to lock, and merge the results into a final results vector to return

I've got no idea if there is a better way to do it, but I've just updated the results table again and I'm very impressed with the results. I think I've run out of tricks I can think of to further optimise the clash portion of the code. I reckon it's now time to move on to optimising opening / tree creation.
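The three steps above, rendered as a rough Python sketch (the real implementation is C++; the check function and numbers here are made up). Note that in CPython the GIL limits speedup for pure-Python work, so this only illustrates the structure:

```python
import threading

def clash_tasks_parallel(tasks, check, num_threads=4):
    # 1. The broad-phase box BVH clash has already produced a queue of
    #    candidate pairs (tasks).
    # 2. Split the queue across num_threads slices.
    # 3. Each thread narrow-phase checks its slice, then merges its hits
    #    into a shared results vector under a mutex.
    results, lock = [], threading.Lock()

    def worker(chunk):
        local = [pair for pair in chunk if check(*pair)]
        with lock:  # merge once per thread, not once per pair
            results.extend(local)

    chunks = [tasks[i::num_threads] for i in range(num_threads)]
    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks if c]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

pairs = [(i, i + 1) for i in range(10)]
hits = clash_tasks_parallel(pairs, lambda a, b: (a + b) % 3 == 0)
print(sorted(hits))  # [(1, 2), (4, 5), (7, 8)]
```

Merging per thread rather than per pair keeps lock contention negligible, which is probably why even this naive scheme scales well.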

aothms commented Feb 20, 2024

> Experiment with multithreading

Multi-threading on IO bound tasks like this likely has very little effect. It may even make things worse because of all the locking that has to be put in place.

Moult commented Feb 20, 2024

Hmm, I'm looking to crush the 140 seconds of save/load time mentioned here. I did a couple of measurements:

108 seconds to convert to H5 via Python (89 seconds for geom iteration + ~19 seconds for H5 processing).

29 seconds to load a chunked H5 into Blender. (26 seconds of H5 loading / processing, and ~3 seconds of creating Blender objects)

I was hoping that merely using C++ / multithreading would be enough to crush either the 19 or 26 seconds.

Moult commented Feb 21, 2024

The good news is that with my attempt at porting the loading code to C++, the 29 seconds it took to load a H5 into Blender has now dropped to 6.6 seconds. Woohoo! (half the time loading the H5, and the other half creating Blender objects). No multithreading was used.

I found a crazy behaviour in H5 where getting a subgroup name was very, very slow. Maybe that explains why the Python code `for shape_id, shape in model["shapes"].items()` is so slow.

I also found that the casting from SWIG wrapped vector to Python list was very slow too. I got around this by implementing numpy.i.

The bad news is that I've gone past the point of knowing what I'm doing and I have absolutely no idea how this now compiles (I manually copied over numpy.i and the numpy include .h directory, and manually included something in CMakeLists that obviously only works on my machine). So there's a huge amount of cleanup to do... but hey, I'm still really excited about the numbers! :)

Moult commented Feb 21, 2024

Given there's a week left before the release, here's the coordination use case wishlist I'd ideally like covered by then.

  1. I want to just open a model to inspect it visually as fast as possible
    • chunked loading (direct from geom iterator)
    • save sqlite
    • query sqlite
    • save to blend for future loading (no h5 necessary)
  2. I want to open many large models to inspect it visually with basic properties as fast as possible
    • Blender ui / operators to trigger preprocessing steps
    • save h5
    • save sqlite
    • load h5
    • query sqlite
  3. I want to run clash detection on a model I'm authoring
    • updated ifcclash
    • updated blender clash ui
    • build tree
    • run n:n clash functions
  4. I want to run clash detection on many large models
    • updated ifcclash
    • updated blender clash ui
    • save h5
    • load h5
    • (ideally not build) load tree
    • run n:n clash functions

Moult commented Feb 22, 2024

I tested with saving to H5 via C++ and it seems as though the 108 seconds have also dropped down to 89 seconds for saving a H5. I guess all the overhead in the past was in passing big lists to Python and handling those lists there.

(Note I originally measured 108 seconds, but when remeasuring with my own compiled version of IfcOpenShell it went down to 101 seconds. I wonder if there is some -march=native optimisation compared to IfcOpenBot, or if the latest OCCT 7.7 is faster somehow.)

I did a few measurements:

  • Given %template(FloatVector) vector<float>;, h5_shape.verts is returned as a <ifcopenshell.ifcopenshell_wrapper.FloatVector; proxy of <Swig Object of type 'std::vector< float > *' at 0x7f0f871a8ff0> >. If I use this directly in Blender code (e.g. to create meshes with), for a random file it takes 14.7 seconds. Using the SWIG objects directly is slow, it seems.
  • If I cast to list list(e.verts) and then create meshes using the Python list for all subsequent code it drops to 8.9 seconds.
  • If I use numpy.i and provide a numpy.ndarray directly from C++ then it drops to 6.5 seconds.

I wonder if this means that there could be a benefit in serving TriangulationElement geometry verts/edges/faces as numpy arrays (such as for general geometry iteration that everybody uses). I didn't look in detail as to how that's managed in SWIG but type(shape.geometry.verts) says tuple so maybe it's a different story.
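For reference, a flat verts tuple like the standard iterator returns can be viewed as an (N, 3) array on the Python side; a minimal sketch, assuming the usual x,y,z-interleaved layout:

```python
import numpy as np

# A verts payload comes back as a flat sequence (x0, y0, z0, x1, y1, z1, ...);
# one reshape gives a row per vertex without any per-element Python looping.
flat_verts = (0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0)
verts = np.array(flat_verts, dtype=np.float64).reshape(-1, 3)
print(verts.shape)  # (3, 3)
```

Serving an ndarray directly from C++ (via numpy typemaps) would skip even the tuple construction, which is the saving the 8.9s → 6.5s measurement above points at.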

Moult commented Feb 22, 2024

It's now possible to chunk directly from an IFC (instead of first having to save out a H5). This means that users can press a button in Blender and immediately load and see an IFC at a fast FPS. Something that would previously take almost 5 minutes and leave them browsing around at 3 FPS now takes 83 seconds and runs at 30 FPS.

And this should work for multiple models too! (Once I build the operator for it) Users would also be able to headlessly run it in the background and save out a Blend file and auto-link that blend file to the scene (and then memory used for ifcopenshell.open() should be freed I think). So in a single session they can load in many models and federate them conveniently.

(BTW, you've probably noticed the code getting worse and worse. I'd love to clean it up, but I need your guidance, and there's a lot of magic around SWIG which escapes me. I've definitely gone overscope in this and started working on #4279, which is related but perhaps should be in a separate PR.)

Moult commented Feb 23, 2024

OK, now when loading in a chunked model, you can activate a tool where you can "click" on objects. It'll "highlight" the object you clicked on (it's not a true editable object after all) and it'll query the SQLite db for object properties (see the right hand side of the screenshot).

[Screenshot: the clicked object highlighted in the viewport, with its properties queried from SQLite shown on the right]

… balance RAM (instancing) vs speed (chunking)
 - Only affects non-chunked instances (where verts are higher so the impact will be greater)
 - Things far away or not in view of the camera will be rendered as bounds
Moult commented Feb 27, 2024

Next steps:

  1. Clash code to be reviewed by @aothms to be ready to merge. @Moult to rewrite the ifcclash Python frontend to work with it, and unbundle the hppfcl deps (excluding any H5 functionality related to IfcClash).
  2. @aothms to investigate an alternative to numpy.i so the iterator can return efficiently to Python. @Moult to rewrite chunking in Python, reading from this "buffer" and using numpy as efficiently as possible to do chunking, and measure how this compares to the C++ process_chunk() / get_chunk(). Goal is to 1) reduce the complexity of numpy returns and 2) avoid maintaining extra code branching in the iterator.
  3. @aothms to start separating H5 into its own distinct serialiser, resolving hacks like the 2 chunk definitions, the vector vs double
  4. Adding support for AABB/OBB in H5
  5. Read/write BVH tree tri swap indices
  6. Future: memory research on pure viewer

@aothms aothms mentioned this pull request Feb 27, 2024
Moult commented Feb 28, 2024

Thanks to your awesome work @aothms I think we can close #4369, and tomorrow I'll do a quick commit to delete all the chunking C++ code, and do a bit of clean up to finish integrating it into Blender, and then I'd say action 2 is done!

@Moult Moult closed this Feb 29, 2024
@Moult Moult mentioned this pull request Feb 29, 2024