Eliminate legacy splats #1893

teh-cmc · 2023-04-18T06:57:41Z

Splats used to require a dedicated row with a special InstanceKey and associated machinery.

With the new store, anything that doesn't match the explicit number of instances given by the row should be treated as a splat:

- within a row with `num_instances == 0`, a component of unit length is a splat
- within a row with `num_instances == 1`, a component of unit length is not a splat
- within a row with `num_instances == N` (`N > 1`), a component of unit length is a splat
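
As a minimal sketch of the rule above: the decision depends only on the row's instance count and the component's length. The helper name `is_splat` is hypothetical and not part of any Rerun API.

```python
def is_splat(num_instances: int, component_len: int) -> bool:
    """Return True if a component batch should be treated as a splat.

    Hypothetical helper: a unit-length component is a splat whenever its
    length does not match the row's explicit instance count.
    """
    return component_len == 1 and num_instances != 1


# The three cases listed above.
assert is_splat(num_instances=0, component_len=1)
assert not is_splat(num_instances=1, component_len=1)
assert is_splat(num_instances=5, component_len=1)
```

Note that the `num_instances == 1` case is the ambiguous one: the rule has to pick a side, and it picks "not a splat".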

Note: the renderer uses the old InstanceKey::SPLAT constant to drive some related (but different) logic in the selection/hovering subsystems.

Fixing this will fix #1014 in the process.


Wumpf · 2023-04-18T08:21:21Z

Expanding on the InstanceKey::SPLAT part:
Right now this is the instance id used to indicate that there is no particular instance id and the entire object is selected. Ideally, we would also use zero there since this is the default instance key for all picking layer renderings whenever no instance key was specified directly.
Afaik the only reason we can't do that yet is that we resolve "context dependent selection" when setting up the picking layer rather than when reading it back. We should always render the same picking layer no matter what is currently selected, but right now we may or may not write out SPLAT for instance keys depending on whether an object or an instance is selectable. This ofc means that the picking layer has to carry this semantic so that it can be picked up again when resolving a retrieved picking layer rectangle: 0 would mean instance 0, SPLAT means all instances. Instead, we should put this logic in the code that reads the picking layer back.
Fixing it this way should give a small performance & simplicity boost to SceneParts!
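
A rough sketch of what resolving at readback could look like, assuming the picking layer always contains the raw instance id (0 when none was specified). The function `resolve_picked` and its `instance_selectable` flag are hypothetical names for illustration, not the viewer's actual code.

```python
from typing import Optional, Tuple


def resolve_picked(
    entity: str, instance_id: int, instance_selectable: bool
) -> Tuple[str, Optional[int]]:
    """Interpret a value read back from the picking layer.

    The picking layer itself is context-free; the selection context is only
    applied here. `None` stands for "the entire object is selected".
    """
    if instance_selectable:
        return (entity, instance_id)
    return (entity, None)


# Same picking layer content, two different selection contexts.
assert resolve_picked("points", 0, instance_selectable=True) == ("points", 0)
assert resolve_picked("points", 0, instance_selectable=False) == ("points", None)
```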

Wumpf · 2023-04-18T08:26:08Z

Created separate ticket for that ⬆️

jleibs · 2023-04-18T10:31:57Z

> within a row with num_instances == 1, a component of unit length is not a splat

I've mentioned it elsewhere, but I want to note that the legacy explicit-splat design did allow us to differentiate between these two calls:

rr.log_points("points", positions = [p1], colors = csplat)
rr.log_points("points", positions = [p1], colors = [c1])

which have different behavior if later followed by the call:

# Just log new positions; rely on previous splat for colors.
rr.log_points("points", positions = [p1, p2])

We can work around this by, again, logging a separate row, but then we're back to having multiple rows again. Alternatively we can only log the extra row if the primary component is length-1 but that special-casing feels worse to me because it's going to create a host of new special edge-case handling for the 1-element path, which starts to feel a lot like going back to having mono-records.
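
To make the difference concrete, here is a small model of the legacy semantics, assuming that a non-splat batch only covers the instances it was logged for and that missing instances fall back to a default. `ColorBatch` and `colors_for` are hypothetical names for this sketch, not SDK types.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ColorBatch:
    values: List[str]
    explicit_splat: bool  # legacy design: the caller states the intent


def colors_for(num_points: int, batch: ColorBatch, default: str = "default") -> List[str]:
    """Join a previously logged color batch onto `num_points` positions."""
    if batch.explicit_splat:
        # A splat applies to every instance, however many there are later on.
        return [batch.values[0]] * num_points
    # Not a splat: per-instance colors, missing instances use the default.
    return [batch.values[i] if i < len(batch.values) else default for i in range(num_points)]


# The first call logged one position, with either `csplat` or `[c1]`.
# The later call logs two positions and relies on the previous colors.
assert colors_for(2, ColorBatch(["c"], explicit_splat=True)) == ["c", "c"]
assert colors_for(2, ColorBatch(["c"], explicit_splat=False)) == ["c", "default"]
```

With the length-based rule, both first calls look identical at log time, so this distinction is lost.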

teh-cmc · 2023-10-10T11:12:50Z

Related: Optimize gathering of point cloud colors #3730 (comment)

That's where it becomes funky...

If you do this:
rr.log("random", rr.Points3D(positions, colors=colors, radii=radii))
rr.log_components("random", [rr.components.ColorBatch([255, 0, 0])])
Then you're going to end up with the original colors being discarded, a single red point and the rest of the points using the default color for this entity path (because that ColorBatch is not a splat).

Now, there is a trick at your disposal... you could do this:
rr.log("random", rr.Points3D(positions, colors=colors, radii=radii))
rr.log_components("random", [rr.components.ColorBatch([255, 0, 0])], num_instances=2)
And now you'll end up with only red points, because you explicitly said that the data was 2 instances wide, and so the log function considers the ColorBatch to be a splat...

Of course we could change things so that logging 1 thing is always considered a splat, but then you have the opposite problem, which might or might not be better depending on the situation 🤷.

And this is why I don't like that splats are a logging-time rather than a query-time concern: the view should get to decide what to do with the data that it has at its disposal, and that behavior should be configurable through blueprints and through the UI.
This instance key business is pretty similar to e.g. configurable texture clamping modes in gfx APIs after all.
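
To illustrate the texture-clamping analogy, here is a sketch of a query-time join where the view picks the policy. The function `join_optional` and the mode names (`"clamp"`, `"default"`, `"repeat"`) are made up for this example; they only mirror the kind of modes gfx APIs expose.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def join_optional(num_instances: int, values: Sequence[T], default: T, mode: str = "clamp") -> List[T]:
    """Stretch an optional component batch to `num_instances` at query time."""
    if not values:
        return [default] * num_instances
    if mode == "clamp":
        # Repeat the last value, like clamp-to-edge texture addressing.
        return [values[min(i, len(values) - 1)] for i in range(num_instances)]
    if mode == "default":
        # Instances without data fall back to the default value.
        return [values[i] if i < len(values) else default for i in range(num_instances)]
    if mode == "repeat":
        # Wrap around, like repeat texture addressing.
        return [values[i % len(values)] for i in range(num_instances)]
    raise ValueError(f"unknown mode: {mode}")


assert join_optional(3, ["red"], "white", mode="clamp") == ["red", "red", "red"]
assert join_optional(3, ["red"], "white", mode="default") == ["red", "white", "white"]
assert join_optional(3, ["red", "green"], "white", mode="repeat") == ["red", "green", "red"]
```

Under such a scheme the same logged `ColorBatch([255, 0, 0])` could yield either of the two behaviors described above, depending on what the view asks for.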

teh-cmc · 2023-10-12T09:28:44Z

Here's another interesting case:

import rerun as rr
import numpy as np

rr.init("rerun_example_arrows_splat", spawn=True)

vertex_positions = np.array([[-1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], dtype=np.float32)

rr.log(
    "triangle",
    rr.Mesh3D(
        vertex_positions=vertex_positions,
        vertex_normals=[0.0, 0.0, 1.0],
        vertex_colors=[[255, 0, 0], [0, 255, 0], [0, 0, 255]],
    ),
    [rr.Arrows3D.indicator()],
)

Look at how the Arrow is rendered.

The arrow was logged as a splat, so it has a single entry whose instance-key is u64::MAX.

Since we also logged an arrow indicator, the viewer will query an Arrows3D archetype using that same mesh data, except this time the arrow is our primary/required component, not an optional one.
The archetype query will try to join our one single arrow with the available positions and colors, but since position and color are not splatted, there isn't a single instance key in common and so the join doesn't yield anything.
The result is that we draw the arrow on its own, with the default origin and default color.
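
The failed join can be modeled in a few lines. This is a deliberately simplified sketch of a key-based join, not the actual `re_query` implementation; `join_on_instance_keys` is a hypothetical helper.

```python
SPLAT_KEY = 2**64 - 1  # u64::MAX, the legacy splat instance key


def join_on_instance_keys(primary: dict, optional: dict) -> dict:
    """Join optional values onto primary values by exact instance key match."""
    return {key: (value, optional.get(key)) for key, value in primary.items()}


# The single normal/vector was logged as a splat...
vectors = {SPLAT_KEY: (0.0, 0.0, 1.0)}
# ...while the positions were logged as a regular 3-instance batch.
positions = {0: (-1.0, 0.0, 0.0), 1: (1.0, 0.0, 0.0), 2: (0.0, 1.0, 0.0)}

joined = join_on_instance_keys(vectors, positions)
# No key in common: the arrow ends up without a position.
assert joined == {SPLAT_KEY: ((0.0, 0.0, 1.0), None)}
```

The `None` here is what the viewer then replaces with the default origin.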

The "fix" is to log the normal multiple times so that it doesn't get marked as a splat at log-time:

-        vertex_normals=[0.0, 0.0, 1.0],
+        vertex_normals=[[0.0, 0.0, 1.0], [0.0, 0.0, 1.0], [0.0, 0.0, 1.0]],

This would not be an issue if splatting was a query-time concern.

Another somewhat related issue here is how our query model works in and of itself: since we join optional components onto primary components, even if the above issue was fixed, we would still only show a single arrow (properly joined with the first position and first color, but still, something to think about...).

This implements the new uncached latest-at APIs, and introduces some basic types for promises.

Tests and benchmarks have been backported from `re_query`.

We already get a pretty decent improvement because the join process (clamped-zip) is cheaper and we don't need to query for instance keys at all:
```
group                        re_query                                  re_query2
-----                        --------                                  ---------
arrow_batch_points2/query    1.39     2.5±0.03µs 379.7 MElem/sec       1.00  1810.6±23.62ns 526.7 MElem/sec
arrow_mono_points2/query     1.44  1082.7±8.66µs 902.0 KElem/sec       1.00    753.6±9.28µs 1295.9 KElem/sec
```

- Fixes #3379
- Part of #1893

Here's an example/guide of using the new API:
```rust
// First, get the raw results for this query.
//
// Raw here means that these results are neither deserialized, nor resolved/converted.
// I.e. this corresponds to the raw `DataCell`s, straight from our datastore.
let results: LatestAtResults = re_query2::latest_at(
    &store,
    &query,
    &entity_path.into(),
    MyPoints::all_components().iter().cloned(), // no generics!
);

// Then, grab the raw results for each individual component.
//
// This is still raw data, but now a choice has been made regarding the nullability of the
// _component batch_ itself (that says nothing about its _instances_!).
//
// * `get_required` returns an error if the component batch is missing
// * `get_optional` returns an empty set of results if the component is missing
// * `get` returns an option
let points: &LatestAtComponentResults = results.get_required::<MyPoint>()?;
let colors: &LatestAtComponentResults = results.get_optional::<MyColor>();
let labels: &LatestAtComponentResults = results.get_optional::<MyLabel>();

// Then comes the time to resolve/convert and deserialize the data.
// These steps have to be done together for efficiency reasons.
//
// Both the resolution and deserialization steps might fail, which is why this returns a `Result<Result<T>>`.
// Use `PromiseResult::flatten` to simplify it down to a single result.
//
// A choice now has to be made regarding the nullability of the _component batch's instances_.
// Our IDL doesn't support nullable instances at the moment -- so for the foreseeable future you probably
// shouldn't be using anything but `iter_dense`.

let points = match points.iter_dense::<MyPoint>(&mut resolver).flatten() {
    PromiseResult::Pending => {
        // Handle the fact that the data isn't ready appropriately.
        return Ok(());
    }
    PromiseResult::Ready(data) => data,
    PromiseResult::Error(err) => return Err(err.into()),
};

let colors = match colors.iter_dense::<MyColor>(&mut resolver).flatten() {
    PromiseResult::Pending => {
        // Handle the fact that the data isn't ready appropriately.
        return Ok(());
    }
    PromiseResult::Ready(data) => data,
    PromiseResult::Error(err) => return Err(err.into()),
};

let labels = match labels.iter_sparse::<MyLabel>(&mut resolver).flatten() {
    PromiseResult::Pending => {
        // Handle the fact that the data isn't ready appropriately.
        return Ok(());
    }
    PromiseResult::Ready(data) => data,
    PromiseResult::Error(err) => return Err(err.into()),
};

// With the data now fully resolved/converted and deserialized, the joining logic can be
// applied.
//
// In most cases this will be either a clamped zip, or no joining at all.

let color_default_fn = || MyColor::from(0xFF00FFFF);
let label_default_fn = || None;

let results = clamped_zip_1x2(points, colors, color_default_fn, labels, label_default_fn).collect_vec();
```

---

Part of a PR series to completely revamp the data APIs in preparation for the removal of instance keys and the introduction of promises:
- #5573
- #5574
- #5581
- #5605
- #5606
- #5633
- #5673
- #5679
- #5687
- #5755
- TODO
- TODO

Builds on top of the static data PR series:
- #5534
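
For readers more at home in the Python SDK, the clamped zip used at the end of the latest-at guide can be approximated as follows. This is a sketch of the semantics as I understand them (one output per required element; optional batches repeat their last value when too short, and fall back to a default when empty). `clamped_zip_1x2_sketch` is a hypothetical Python stand-in, not a binding to the Rust function.

```python
from typing import Callable, Iterable, Iterator, Tuple, TypeVar

R = TypeVar("R")
A = TypeVar("A")
B = TypeVar("B")


def clamped_zip_1x2_sketch(
    required: Iterable[R],
    opt0: Iterable[A],
    opt0_default: Callable[[], A],
    opt1: Iterable[B],
    opt1_default: Callable[[], B],
) -> Iterator[Tuple[R, A, B]]:
    """Zip one required batch with two optional ones, clamping the optionals."""
    opt0, opt1 = list(opt0), list(opt1)
    for i, r in enumerate(required):
        # Too short: repeat the last value. Empty: use the default.
        o0 = opt0[min(i, len(opt0) - 1)] if opt0 else opt0_default()
        o1 = opt1[min(i, len(opt1) - 1)] if opt1 else opt1_default()
        yield (r, o0, o1)


points = ["p1", "p2", "p3"]
joined = list(clamped_zip_1x2_sketch(points, ["red"], lambda: "pink", [], lambda: None))
assert joined == [("p1", "red", None), ("p2", "red", None), ("p3", "red", None)]
```

Because the join is positional, no instance keys need to be queried, which is where the benchmark improvement above is said to come from.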

Static-aware, key-less, component-based, cached latest-at APIs.

The overall structure of this new cache is very similar to what we had before. Effectively it is just an extremely simplified version of `re_query_cache`.

This introduces a new temporary `re_query_cache2` crate, which won't ever be published.
It will replace the existing `re_query_cache` crate once all the necessary features have been backported.

- Fixes #3232
- Fixes #4733
- Fixes #4734
- Part of #3379
- Part of #1893

Example:
```rust
let caches = re_query_cache2::Caches::new(&store);

// First, get the results for this query.
//
// They might or might not already be cached. We won't know for sure until we try to access
// each individual component's data below.
let results: CachedLatestAtResults = caches.latest_at(
    &store,
    &query,
    &entity_path.into(),
    MyPoints::all_components().iter().cloned(), // no generics!
);

// Then, grab the results for each individual component.
// * `get_required` returns an error if the component batch is missing
// * `get_optional` returns an empty set of results if the component is missing
// * `get` returns an option
//
// At this point we still don't know whether they are cached or not. That's the next step.
let points: &CachedLatestAtComponentResults = results.get_required::<MyPoint>()?;
let colors: &CachedLatestAtComponentResults = results.get_optional::<MyColor>();
let labels: &CachedLatestAtComponentResults = results.get_optional::<MyLabel>();

// Then comes the time to resolve/convert and deserialize the data.
// These steps have to be done together for efficiency reasons.
//
// Both the resolution and deserialization steps might fail, which is why this returns a `Result<Result<T>>`.
// Use `PromiseResult::flatten` to simplify it down to a single result.
//
// A choice now has to be made regarding the nullability of the _component batch's instances_.
// Our IDL doesn't support nullable instances at the moment -- so for the foreseeable future you probably
// shouldn't be using anything but `iter_dense`.
//
// This is the step at which caching comes into play.
//
// If the data has already been accessed with the same nullability characteristics in the
// past, then this will just grab the pre-deserialized, pre-resolved/pre-converted result from
// the cache.
//
// Otherwise, this will trigger a deserialization and cache the result for next time.

let points = match points.iter_dense::<MyPoint>(&mut resolver).flatten() {
    PromiseResult::Pending => {
        // Handle the fact that the data isn't ready appropriately.
        return Ok(());
    }
    PromiseResult::Ready(data) => data,
    PromiseResult::Error(err) => return Err(err.into()),
};

let colors = match colors.iter_dense::<MyColor>(&mut resolver).flatten() {
    PromiseResult::Pending => {
        // Handle the fact that the data isn't ready appropriately.
        return Ok(());
    }
    PromiseResult::Ready(data) => data,
    PromiseResult::Error(err) => return Err(err.into()),
};

let labels = match labels.iter_sparse::<MyLabel>(&mut resolver).flatten() {
    PromiseResult::Pending => {
        // Handle the fact that the data isn't ready appropriately.
        return Ok(());
    }
    PromiseResult::Ready(data) => data,
    PromiseResult::Error(err) => return Err(err.into()),
};

// With the data now fully resolved/converted and deserialized, the joining logic can be
// applied.
//
// In most cases this will be either a clamped zip, or no joining at all.

let color_default_fn = || {
    static DEFAULT: MyColor = MyColor(0xFF00FFFF);
    &DEFAULT
};
let label_default_fn = || None;

let results = clamped_zip_1x2(points, colors, color_default_fn, labels, label_default_fn).collect_vec();
```

---

Part of a PR series to completely revamp the data APIs in preparation for the removal of instance keys and the introduction of promises:
- #5573
- #5574
- #5581
- #5605
- #5606
- #5633
- #5673
- #5679
- #5687
- #5755
- TODO
- TODO

Builds on top of the static data PR series:
- #5534

This implements the new uncached range APIs. Latest-at & range queries are now much more similar than before and share a lot of nice traits.

Tests have been backported from `re_query`.

Here's an example/guide of using the new API:
```rust
// First, get the raw results for this query.
//
// Raw here means that these results are neither deserialized, nor resolved/converted.
// I.e. this corresponds to the raw `DataCell`s, straight from our datastore.
let results: RangeResults = re_query2::range(
    &store,
    &query,
    &entity_path.into(),
    MyPoints::all_components().iter().cloned(), // no generics!
);

// Then, grab the raw results for each individual component.
//
// This is still raw data, but now a choice has been made regarding the nullability of the
// _component batch_ itself (that says nothing about its _instances_!).
//
// * `get_required` returns an error if the component batch is missing
// * `get_optional` returns an empty set of results if the component is missing
// * `get` returns an option
let all_points: &RangeComponentResults = results.get_required(MyPoint::name())?;
let all_colors: &RangeComponentResults = results.get_optional(MyColor::name());
let all_labels: &RangeComponentResults = results.get_optional(MyLabel::name());

let all_indexed_points = izip!(
    all_points.iter_indices(),
    all_points.iter_dense::<MyPoint>(&resolver)
);
let all_indexed_colors = izip!(
    all_colors.iter_indices(),
    all_colors.iter_sparse::<MyColor>(&resolver)
);
let all_indexed_labels = izip!(
    all_labels.iter_indices(),
    all_labels.iter_sparse::<MyLabel>(&resolver)
);

let all_frames = range_zip_1x2(all_indexed_points, all_indexed_colors, all_indexed_labels);

// Then comes the time to resolve/convert and deserialize the data, _for each timestamp_.
// These steps have to be done together for efficiency reasons.
//
// Both the resolution and deserialization steps might fail, which is why this returns a `Result<Result<T>>`.
// Use `PromiseResult::flatten` to simplify it down to a single result.
//
// A choice now has to be made regarding the nullability of the _component batch's instances_.
// Our IDL doesn't support nullable instances at the moment -- so for the foreseeable future you probably
// shouldn't be using anything but `iter_dense`.
eprintln!("results:");
for ((data_time, row_id), points, colors, labels) in all_frames {
    let points = match points.flatten() {
        PromiseResult::Pending => {
            // Handle the fact that the data isn't ready appropriately.
            continue;
        }
        PromiseResult::Ready(data) => data,
        PromiseResult::Error(err) => return Err(err.into()),
    };

    let colors = if let Some(colors) = colors {
        match colors.flatten() {
            PromiseResult::Pending => {
                // Handle the fact that the data isn't ready appropriately.
                continue;
            }
            PromiseResult::Ready(data) => data,
            PromiseResult::Error(err) => return Err(err.into()),
        }
    } else {
        vec![]
    };
    let color_default_fn = || Some(MyColor::from(0xFF00FFFF));

    let labels = if let Some(labels) = labels {
        match labels.flatten() {
            PromiseResult::Pending => {
                // Handle the fact that the data isn't ready appropriately.
                continue;
            }
            PromiseResult::Ready(data) => data,
            PromiseResult::Error(err) => return Err(err.into()),
        }
    } else {
        vec![]
    };
    let label_default_fn = || None;

    // With the data now fully resolved/converted and deserialized, the joining logic can be
    // applied.
    //
    // In most cases this will be either a clamped zip, or no joining at all.
    let results = clamped_zip_1x2(points, colors, color_default_fn, labels, label_default_fn)
        .collect_vec();

    eprintln!("{data_time:?} @ {row_id}:\n {results:?}");
}
```

- Fixes #3379
- Part of #1893

---

Part of a PR series to completely revamp the data APIs in preparation for the removal of instance keys and the introduction of promises:
- #5573
- #5574
- #5581
- #5605
- #5606
- #5633
- #5673
- #5679
- #5687
- #5755
- TODO
- TODO

Builds on top of the static data PR series:
- #5534
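
The range guide above relies on a range zip to line up the optional components with each required entry. A rough Python model, assuming each stream is sorted by index and that optionals yield their most recent value at or before the required entry's index (or `None` if there is none yet). `range_zip_1x2_sketch` is a hypothetical stand-in, and plain integers replace the `(data_time, row_id)` indices for brevity.

```python
from typing import Any, Iterable, Iterator, Optional, Tuple

Indexed = Tuple[int, Any]  # (index, data); the real index is (data_time, row_id)


def range_zip_1x2_sketch(
    required: Iterable[Indexed], opt0: Iterable[Indexed], opt1: Iterable[Indexed]
) -> Iterator[Tuple[int, Any, Optional[Any], Optional[Any]]]:
    """Yield one frame per required entry, with the latest optional data so far."""
    opt0, opt1 = list(opt0), list(opt1)
    i0 = i1 = 0
    latest0 = latest1 = None
    for index, data in required:
        # Advance each optional stream up to (and including) the current index.
        while i0 < len(opt0) and opt0[i0][0] <= index:
            latest0 = opt0[i0][1]
            i0 += 1
        while i1 < len(opt1) and opt1[i1][0] <= index:
            latest1 = opt1[i1][1]
            i1 += 1
        yield (index, data, latest0, latest1)


points = [(1, "P1"), (3, "P3"), (5, "P5")]
colors = [(2, "C2"), (4, "C4")]
frames = list(range_zip_1x2_sketch(points, colors, []))
assert frames == [(1, "P1", None, None), (3, "P3", "C2", None), (5, "P5", "C4", None)]
```

Each resulting frame is then joined instance-wise with a clamped zip, as in the loop body of the guide.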

teh-cmc mentioned this issue Apr 18, 2023

Tracking issue: end-to-end batching #1619

Closed

18 tasks

teh-cmc changed the title ~~Fix splats all around (rs sdk, py sdk, re_query...)~~ Eliminate legacy splats Apr 18, 2023

teh-cmc added the ⛃ re_datastore (affects the datastore itself) and 🔍 re_query (affects re_query itself) labels Apr 18, 2023

Wumpf mentioned this issue Apr 18, 2023

Resolve context depending picking on picking readback, not during scene build up #1895

Closed

teh-cmc mentioned this issue Apr 18, 2023

Batch selections should not preview single-element non-splats #1014

Closed

teh-cmc mentioned this issue Apr 27, 2023

A single call to rerun.log should never produce more than 1 "event" on the timeline #1995

Open

jleibs mentioned this issue Oct 9, 2023

Fix bug when joining cleared optional components #3726

Merged

3 tasks

teh-cmc mentioned this issue Oct 10, 2023

Optimize gathering of point cloud colors #3730

Merged

4 tasks

teh-cmc self-assigned this Mar 14, 2024

This was referenced Mar 19, 2024

New data APIs 1: uncached latest-at queries #5574

Merged

New data APIs 2: cached latest-at queries #5581

Merged

teh-cmc mentioned this issue Mar 26, 2024

New data APIs 8: uncached range queries #5687

Merged

5 tasks

teh-cmc mentioned this issue Apr 24, 2024

Remove instance keys and explicit splatting everywhere #6104

Merged

16 tasks

teh-cmc closed this as completed in #6104 Apr 26, 2024

teh-cmc closed this as completed in 893de2f Apr 26, 2024
