
Faster multi ray trace retry #1323

Merged
merged 12 commits into pyvista:master from Keou0007:faster_multi_ray_trace_retry on May 22, 2021
Conversation

Keou0007 (Contributor)

Overview

The retry kwarg in the multi_ray_trace function triggers a retry of ray traces that returned no intersection points. Currently this is done by looping over all of the rays and checking whether each one returned a result, which can be intolerably slow when the number of rays is very large and defeats the purpose of using embree for speedup in the first place. The new method uses np.setdiff1d to determine which rays need to be retried and then loops over only those, significantly improving the performance of the function.
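A minimal sketch of the selection step (illustrative names and toy data, not the actual PR diff): rather than checking every ray for a hit, diff the full index range against the rays embree already resolved and loop only over the misses.

    import numpy as np

    index_ray = np.array([0, 2, 3])  # rays that returned an intersection point
    all_rays = np.arange(6)          # indices of every ray that was cast
    retry_rays = np.setdiff1d(all_rays, index_ray)
    print(retry_rays)                # -> [1 4 5]; only these get re-traced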

codecov bot commented May 18, 2021

Codecov Report

Merging #1323 (32f9c32) into master (f9bbebc) will decrease coverage by 0.03%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##           master    #1323      +/-   ##
==========================================
- Coverage   90.05%   90.01%   -0.04%     
==========================================
  Files          56       56              
  Lines       12183    12188       +5     
==========================================
  Hits        10971    10971              
- Misses       1212     1217       +5     

adeak self-requested a review May 18, 2021 09:18
adeak (Member) commented May 18, 2021

At first glance this looks good, thanks. Unfortunately I'm not using conda, so installing pyembree is a bit of a pain. I'll try to get around to doing that, because I have a strong suspicion that we could make the function even more efficient.

In the meantime, do you have a small complete example that we can use as a benchmark to compare runtimes?

akaszynski (Member)

> In the meantime, do you have a small complete example that we can use as a benchmark to compare runtimes?

That would be great as well. My feeling is that this fix looks good, but I don't usually use conda and can't test this locally without some setup.

adeak (Member) commented May 18, 2021

> I don't usually use conda and can't test this locally without some setup.

Same here. I tried installing pyembree without it, but only embree 3 is available in the Debian testing repos and pyembree only supports embree 2 (its last commit was in 2019, which makes me a bit nervous about it as a soft dependency).

I've just finished setting up conda for this so I think I'll be able to play around with it.

adeak (Member) commented May 18, 2021

I'm all set. We'll want to fix https://github.com/pyvista/pyvista/blob/master/tests/test_polydata.py#L39-L42:

try:
    CONDA_ENV = os.environ['CONDA_ALWAYS_YES'] == "1"
except KeyError:
    CONDA_ENV = False

Despite using conda, I had to run the test with

CONDA_ALWAYS_YES=1 pytest -v -k 'test_polydata'

The only place where this is used is the multi_ray_trace test, which checks whether rtree, pyembree and trimesh are installed. The conda check seems to be needed for pyembree.

So how about removing the whole conda filter and just letting pytest.importorskip do its thing?
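For reference, a sketch of what that could look like (hypothetical test body, not the actual diff):

    import pytest

    def test_multi_ray_trace():
        # skip cleanly when an optional dependency is missing,
        # instead of gating on a conda environment variable
        pytest.importorskip('rtree')
        pytest.importorskip('pyembree')
        pytest.importorskip('trimesh')
        # ... the existing ray trace assertions would follow here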

akaszynski (Member)

> So how about removing the whole conda filter and just letting pytest.importorskip do its thing?

That's a better approach. Want to make another PR?

adeak (Member) commented May 18, 2021

I think we can put it in this one along with other potential improvements.

@Keou0007 let me know if you feel like removing this yourself, otherwise I believe we can push to your branch to include this ourselves :)

Keou0007 (Contributor, author)

I think I've done what you want with removing the CONDA_ENV stuff. Please check that.

Here's a simple test for the performance improvement based on the distance between two surfaces example, which I modified to use the multi_ray_trace function in the PR. I've increased the resolution of the hill so there are a million rays being traced. In this particular example, embree will fail to find an intersection for about 50k of them. The idea is that the retry option will redo all of those misses using the OBBTree ray trace function, which is more reliable but much slower.

Code
import time
import pyvista as pv
import numpy as np

# A helper to make a random surface
def hill(seed):
    mesh = pv.ParametricRandomHills(randomseed=seed, u_res=1000, v_res=1000,
                                    hillamplitude=0.5)
    mesh.rotate_y(-10)  # give the surfaces some tilt

    return mesh

h0 = hill(1).elevation()
h1 = hill(10)
# Shift one surface
h1.points[:, -1] += 5
h1 = h1.elevation()

h0["distances"] = np.empty(h0.n_points)
h0["distances"][:] = np.nan
vec = [[0, 0, 1]] * h0.n_points

start = time.time()
points, id_r, id_t = h1.multi_ray_trace(h0.points, vec, first_point=True)
print(f"multi_ray_trace method takes {time.time()-start:.3f} s")

h0["distances"][id_r] = np.sqrt(np.sum((points - h0.points[id_r])**2, axis=-1))

p = pv.Plotter()
p.add_mesh(h0, scalars="distances", smooth_shading=True)
p.add_mesh(h1, color=True, opacity=0.75, smooth_shading=True)
p.show()


start = time.time()
points, id_r, id_t = h1.multi_ray_trace(h0.points, vec, first_point=True, retry=True)
print(f"multi_ray_trace method with retry takes {time.time()-start:.3f} s")

h0["distances"][id_r] = np.sqrt(np.sum((points - h0.points[id_r])**2, axis=-1))

p = pv.Plotter()
p.add_mesh(h0, scalars="distances", smooth_shading=True)
p.add_mesh(h1, color=True, opacity=0.75, smooth_shading=True)
p.show()

Running this code with the existing method, the function has to loop over all 1 million rays and check whether each returned an intersection before retrying. This is slooooooow.

multi_ray_trace method takes 9.883 s
multi_ray_trace method with retry takes 992.013 s

Using the new method from this PR, retrying the 50k misses one at a time in a loop still takes a few minutes, but we're no longer wasting time checking every ray.

multi_ray_trace method takes 9.616 s
multi_ray_trace method with retry takes 282.953 s

This disparity in performance only gets worse as the number of rays increases.

adeak (Member) commented May 20, 2021

Yes, your change is what I had in mind, thank you.

Also thanks for the example, that's perfect for me to start playing with this.

akaszynski (Member) left a comment

Based on the speedup, I'm more than happy with the changes. Tests pass:

tests/test_polydata.py::test_multi_ray_trace PASSED                      [ 73%]

adeak (Member) commented May 21, 2021

I'm still working on this; I'm just a bit under the weather. Unfortunately, due to memory and runtime constraints I can only time a 500x500 problem. @Keou0007 can you time your test with the following version, and check that the results are the same?

    def multi_ray_trace(poly_data, origins, directions, first_point=False, retry=False):
        # docstring snipped for github comment
        if not poly_data.is_all_triangles():
            raise NotAllTrianglesError

        try:
            import trimesh, rtree, pyembree
        except (ModuleNotFoundError, ImportError):
            raise ImportError(
                "To use multi_ray_trace please install trimesh, rtree and pyembree with:\n"
                "\tconda install trimesh rtree pyembree"
            )

        origins = np.asarray(origins)
        directions = np.asarray(directions)

        faces_as_array = poly_data.faces.reshape((poly_data.n_faces, 4))[:, 1:]
        tmesh = trimesh.Trimesh(poly_data.points, faces_as_array)
        locations, index_ray, index_tri = tmesh.ray.intersects_location(
            origins, directions, multiple_hits=not first_point
        )
        if retry:
            ray_tuples = [(id_r, l, id_t) for id_r, l, id_t in zip(index_ray, locations, index_tri)]
            all_ray_indices = np.arange(len(origins))
            retry_ray_indices = np.setdiff1d(all_ray_indices, index_ray, assume_unique=True)
            for id_r in retry_ray_indices:
                origin = origins[id_r]
                vector = directions[id_r]
                unit_vector = vector / np.sqrt(np.sum(np.power(vector, 2)))
                second_point = origin + (unit_vector * poly_data.length)
                locs, indexes = poly_data.ray_trace(origin, second_point, first_point=first_point)
                if locs.any():
                    if first_point:
                        locs = locs.reshape([1, 3])
                    for loc, id_t in zip(locs, indexes):
                        ray_tuples.append((id_r, loc, id_t))
            sorted_results = sorted(ray_tuples, key=lambda x: x[0])
            locations = np.array([loc for id_r, loc, id_t in sorted_results])
            index_ray = np.array([id_r for id_r, loc, id_t in sorted_results])
            index_tri = np.array([id_t for id_r, loc, id_t in sorted_results])
        return locations, index_ray, index_tri

The two changes so far are:

  1. assume_unique in setdiff1d,
  2. coercing the inputs to arrays once at the start.

The latter especially leads to a really significant speedup.

I still want to refactor the native Python tuple sorting and redundant list comprehensions, and check the inside of the loop. These might also decrease the runtime a bit.
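As a rough illustration of why the up-front conversion matters (an illustrative micro-benchmark, not code from the PR): per-ray calls like np.linalg.norm re-convert a Python list to an array on every iteration, while np.asarray pays that cost once and allows vectorized math.

    import timeit
    import numpy as np

    directions = [[0.0, 0.0, 1.0]] * 100_000  # list of lists, as a caller might pass
    directions_arr = np.asarray(directions)   # one up-front conversion

    # per-row norms on the raw list rebuild a small array on every call
    t_list = timeit.timeit(lambda: [np.linalg.norm(v) for v in directions], number=1)
    # a single vectorized norm on the pre-converted array
    t_arr = timeit.timeit(lambda: np.linalg.norm(directions_arr, axis=1), number=1)
    print(f"per-row on list: {t_list:.3f} s, vectorized on array: {t_arr:.3f} s")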

adeak (Member) commented May 21, 2021

Here's the best I've got:

    def multi_ray_trace(poly_data, origins, directions, first_point=False, retry=False):
        # docstring snipped for github comment
        if not poly_data.is_all_triangles():
            raise NotAllTrianglesError

        try:
            import trimesh, rtree, pyembree
        except (ModuleNotFoundError, ImportError):
            raise ImportError(
                "To use multi_ray_trace please install trimesh, rtree and pyembree with:\n"
                "\tconda install trimesh rtree pyembree"
            )

        origins = np.asarray(origins)
        directions = np.asarray(directions)

        faces_as_array = poly_data.faces.reshape((poly_data.n_faces, 4))[:, 1:]
        tmesh = trimesh.Trimesh(poly_data.points, faces_as_array)
        locations, index_ray, index_tri = tmesh.ray.intersects_location(
            origins, directions, multiple_hits=not first_point
        )
        if retry:
            # gather intersecting rays in lists
            loc_lst, ray_lst, tri_lst = [arr.tolist() for arr in [locations, index_ray, index_tri]]

            # find indices that trimesh failed on
            all_ray_indices = np.arange(len(origins))
            retry_ray_indices = np.setdiff1d(all_ray_indices, index_ray, assume_unique=True)

            # compute ray points for all failed rays at once
            origins_retry = origins[retry_ray_indices, :]  # shape (n_retry, 3)
            directions_retry = directions[retry_ray_indices, :]
            unit_directions = directions_retry / np.linalg.norm(directions_retry,
                                                                axis=1, keepdims=True)
            second_points = origins_retry + unit_directions * poly_data.length  # shape (n_retry, 3)

            for id_r, origin, second_point in zip(retry_ray_indices, origins_retry, second_points):
                locs, indices = poly_data.ray_trace(origin, second_point, first_point=first_point)
                if locs.any():
                    if first_point:
                        locs = locs.reshape([1, 3])
                    ray_lst.extend([id_r] * indices.size)
                    tri_lst.extend(indices)
                    loc_lst.extend(locs)
            # sort result arrays by ray index
            index_ray = np.array(ray_lst)
            sorting_inds = index_ray.argsort()
            index_ray = index_ray[sorting_inds]
            index_tri = np.array(tri_lst)[sorting_inds]
            locations = np.array(loc_lst)[sorting_inds]

        return locations, index_ray, index_tri

With this, the 500x500 case goes down from 19 seconds to 16.3 seconds on my laptop. I'm curious how it performs on the large benchmark.

Edit: I realized memory is not that much of an issue after all, so I could run the 1000x1000 case. The improvement is pretty meh: I get 205 seconds (PR version) -> 196 seconds (above version). But converting the inputs to arrays helps the non-retry case as well.

So yeah, the main speedup comes from not looping over rays we've already found.

Keou0007 (Contributor, author)

Yes, there are definitely diminishing returns there, and most of the time is now taken up by the OBBTree trace. I did try to make that faster a while back, but it seemed the OBBTree is not thread-safe, so I got some weird results.

I've pushed your changes, thanks for doing that. It looks pretty nice now, to my eyes at least.

adeak (Member) left a comment

Thanks. I've pushed two small whitespace fixes, but this should be good to go if CI is green. Last night I checked on your example that we get the same result as before, for both values of first_point.

adeak merged commit 977f3ef into pyvista:master on May 22, 2021
Keou0007 deleted the faster_multi_ray_trace_retry branch on May 24, 2021