Global symbol query #6497

ChrisLoer · 2018-04-11T01:04:32Z

This is an alternative to PR #6461 that goes further and also fixes issue #5475 and #6298 (switches to global instead of per-tile symbol querying, which fixes edge cases around tile boundaries). It also addresses the original hover flicker from issues #5887 and #5506.

I split it out into a separate PR because it's even more invasive and we might want to choose the smaller fix. However, at this point I'm in favor of going with this version because it has some nice simplifications:

No round-tripping of viewport query coordinates through tile space
No more need to merge duplicate results from the same symbol showing up in multiple tiles
All querying-related data can now be indexed with a bucket instance id and a feature index.
CrossTileSymbolIndex manages lifetime of data needed for querying symbols, Tile is only responsible for "latest" data
CollisionBoxArray no longer involved in querying at all

I've just realized a subtle problem with this PR -- the order of symbol results can now depend on the (arbitrary) instance ID of the bucket they belong to. I'll need to figure out a fix for that. Do we have to go as far as fully reproducing the previous order based on tileID? (We should be able to do that, since we can work back from bucket to tile).

briefly describe the changes in this PR
write tests for all new functionality
document any changes to public APIs
post benchmark scores
manually test the debug page

/cc @ansis @malwoodsantoro @ryanbaumann @lilykaiser @jfirebaugh

ChrisLoer · 2018-04-11T22:29:00Z

I just pushed a change to make the symbol results sort first by tileID, the same way they did before. There is still a small difference:

Previously, within a tile/layer, we sorted symbol results by their index into the CollisionBoxArray (basically the order in which they got created during performSymbolLayout)
Now, since we're not using the CollisionBoxArray, we sort them by their feature index in the original vector tile.

As far as I know, these should be pretty much equivalent since the layout order is derived relatively directly from the order of features in the tile, but I'm not sure I could guarantee it.

ChrisLoer · 2018-04-11T23:49:34Z

Benchmarks look fine except for QueryPoint and QueryBox... which look like they need work!

ansis

Overall, I like this approach a lot better than #6461 and I think it makes sense to do it now. It's a bit more involved change but I agree that the result is simpler code.

I left a couple comments related to small things. The only bigger question I have is the one about whether we even need to retain buckets.

ansis · 2018-04-12T16:32:00Z

src/symbol/cross_tile_symbol_index.js

+                symbolBucket.sourceLayerIndex,
+                symbolBucket.index,
+                tile.tileID
+            );


I think this creates a new RetainedBucket object every frame. If doing that is fine, could we get rid of lastUsedBuckets and just reset retainedBuckets on every frame?

ansis · 2018-04-12T16:45:39Z

src/symbol/cross_tile_symbol_index.js

+                symbolBucket.index,
+                tile.tileID
+            );
+


This retains the buckets used in the previous frame, not the buckets used in the last committed placement, right? The comment is making me doubt if I'm understanding this correctly

Half-baked thought: My understanding is that this retaining is necessary so that any tile loads or updates that arrived between frames don't eliminate data rendered during the previous frame. Do we need this exact data, or can we load "the same symbol" by loading features based on their crossTileID instead of their vectortile layer and feature index?

Or, if we need to have the previously rendered view of the data, would it be possible to (eventually, not now) refactor SourceCache and Tile so that it's possible to hold onto a immutable(ish) reference to all of the previous frame's render data instead of mutating everything in place?

This logic is more about holding onto data during incremental placement than it is about holding onto data between frames. lastUsedBuckets is only cleared during the prune operation that happens on committing a placement. That allows us to hold onto RetainedBuckets that weren't used in the previous/current frame, but will still be part of the next placement that gets committed. Does that make sense?

RetainedBuckets is probably a misleading name, because it makes it sound like we're holding on to the buckets themselves (and thus makes it sound like something related to rendering). What we're actually holding onto is all the bucket-specific data necessary for symbol querying (basically, the raw vector tile data plus some metadata). Maybe RetainedBucketQueryData?

A further refactoring that might make sense is to stop holding on to the FeatureIndex entirely here. We're using it for shared code that loads features from the underlying vector data, but we don't need to hold onto all the data it has in its non-symbol GridIndex, since we always query the latest version for non-symbol features.

🤔 Not sure if it would have a noticeable performance impact, but the current implementation of lastUsedBuckets holds on to more than it needs to, since it will hold onto any bucket that gets loaded during any frame of an incremental placement, even if that bucket didn't actually end up used in the placement (because of where it was in the layer order). I'll look at changing that behavior so the intent is clearer.

Thanks for explaining, I think I had confused prune with removeStaleBuckets. This makes sense.

Renaming to RetainedBucketQueryData sounds good to me.

What is the advantage of loading the exact feature used for placement instead of loading features with matching crossTileIDs?

Hrm, that's a good question, I haven't thought through what it would be like to implement that. So instead of storing bucket/feature indices in the CollisionIndex, we'd just store a crossTileID and maintain a reverse-lookup index to go from a crossTileID to the latest matching bucket/symbolInstance?

Initial reactions:

There can be multiple symbol instances per feature, and our querying logic is feature-based, so we'd have to be careful to handle duplicates/merging correctly there.

We'd have to figure out something reasonable to do with crossTileIDs that were in the latest placement, but point to a tile removed from the pyramid since the placement happened.

Using the latest data instead of the data that was included in the latest placement opens up some room for some subtle mismatches that could in some cases be hard to debug (e.g. your query shows a symbol as being within your query geometry, but the returned GeoJSONFeature appears to be outside your query geometry).

I think the honest answer is "I didn't think of doing it that way". What would be the major advantage of switching to that approach? Do you think it might be a simplicity win, or are you more thinking about being able to discard data earlier?

In most cases, I don't think the current implementation leads to much extra data being retained, but the worst case is probably when there's something like a setData-style GeoJSON animation causing tiles with symbols to update really frequently (say, every 20ms). In that case, we'd be holding on to ~15 copies of the tile data over the course of a 300ms placement.... which sounds less-than-optimal, although I still don't think it'd be that bad (not much data, the data would have to wait for GC either way, etc.).

I was thinking about it from a simplicity point of view (not having to deal with retaining logic), but I think those are some good points. Especially the third one. I think your current approach is 👍

I got rid of the pruning entirely by moving RetainedQueryData into Placement -- it's a much better fit since the lifetime of the data needs to be the same as the lifetime of the Placement. This also avoids retaining data for buckets that never get included in a placement, which means the setData-animation worst case should go from ~15x tile data storage to ~2x storage.

ansis · 2018-04-12T16:51:11Z

src/symbol/grid_index.js

+                    x1: this.bboxes[boxUid * 4],
+                    y1: this.bboxes[boxUid * 4 + 1],
+                    x2: this.bboxes[boxUid * 4 + 2],
+                    y2: this.bboxes[boxUid * 4 + 3]


I'm not sure about this, but would it be cleaner to move polygon queries into GridIndex rather than exposing box geometries outside of the index?

🤔 Like give GridIndex#query an optional (bbox) => boolean argument that it applies to filter query results? That sounds like it could make sense. It might further hide the kind-of-weird internal circle->bbox conversion, but I guess it's already the case that GridIndex is not really that general-purpose.

I was thinking query(geometry: Array<Point>) but a filter function could also work. How hot is this part of querying? Does creating extra objects add noticeable overhead? The current approach seems fine to me and I don't think it necessarily needs to change.

For rendering, the 'hitTest' part of the logic is hot. For querying... I need to do more profiling anyway to figure out why the "Query" benchmarks have slowed down so much. I like this idea, but I think it seems fine to defer to later for now.

ansis · 2018-04-12T16:57:50Z

src/symbol/placement.js

@@ -81,7 +81,10 @@ class Placement {

    placeLayerTile(styleLayer: StyleLayer, tile: Tile, showCollisionBoxes: boolean, seenCrossTileIDs: { [string | number]: boolean }) {
        const symbolBucket = ((tile.getBucket(styleLayer): any): SymbolBucket);
-        if (!symbolBucket) return;
+        if (!symbolBucket || !tile.latestFeatureIndex)


In what cases is !tile.latestFeatureIndex true?

None that I know of, that was for flow.

Ah, but it's also vestigial -- in an earlier implementation, I had the collision box data in the feature index. I'll get rid of that.

ansis · 2018-04-12T17:02:50Z

src/symbol/collision_index.js

+                new Point(feature.x1, feature.y1),
+                new Point(feature.x2, feature.y1),
+                new Point(feature.x2, feature.y2),
+                new Point(feature.x1, feature.y2)


Does the bbox polygon need five points (with the last point being the same as the first) in order for the intersection test to work properly?

I don't think so? I actually haven't looked closely at how intersection_tests.js implements this, but it's the same behavior as before, right?

Yep, it's the same behaviour as before.

mapbox-gl-js/src/util/intersection_tests.js

Lines 112 to 124 in 13ea16d

function lineIntersectsLine(lineA: Line, lineB: Line) {

if (lineA.length === 0 || lineB.length === 0) return false;

for (let i = 0; i < lineA.length - 1; i++) {

const a0 = lineA[i];

const a1 = lineA[i + 1];

for (let j = 0; j < lineB.length - 1; j++) {

const b0 = lineB[j];

const b1 = lineB[j + 1];

if (lineSegmentIntersectsLineSegment(a0, a1, b0, b1)) return true;

}

}

return false;

}

lineIntersectsLine, which is used by polygonIntersectsPolygon, seems to expect a repeated vertex when used this way (the last segment is never tested). BUT I think it's still (accidentally?) correct, because it's impossible for this missing edge to be crossed without either crossing a different edge or having a point within the polygon. So, I'm not sure if this a bug or not

🙈 Do you have any ideas for tests or assertions we could put in to make it clear what types of geometry are supported? I haven't wrapped my head around it and for now my assumption is it's OK.

ansis · 2018-04-12T17:04:36Z

src/source/tile.js

+            } else if (this.latestRawTileData) {
+                // If rawTileData hasn't updated, hold onto a pointer to the last
+                // one we received
+                this.latestFeatureIndex.rawTileData = this.latestRawTileData;


In what cases does a the new data have a featureIndex but not a rawTileData?

loadTile in VectorTileWorkerSource returns a rawTileData, but reloadTile doesn't (we use a new FeatureIndex but with the previous data since it hasn't changed).

ansis · 2018-04-12T17:10:25Z

src/data/bucket/symbol_bucket.js

@@ -593,7 +598,8 @@ class SymbolBucket implements Bucket {
            // An icon can only have one box now, so this indexing is a bit vestigial...
            const box: CollisionBox = (collisionBoxArray.get(k): any);
            if (box.radius === 0) {
-                collisionArrays.iconBox = { x1: box.x1, y1: box.y1, x2: box.x2, y2: box.y2, anchorPointX: box.anchorPointX, anchorPointY: box.anchorPointY };
+                collisionArrays.iconBox = { x1: box.x1, y1: box.y1, x2: box.x2, y2: box.y2, anchorPointX: box.anchorPointX, anchorPointY: box.anchorPointY, featureIndex: box.featureIndex };


Is the featureIndex stored on iconBox used anywhere?

Oops! That's vestigial from an earlier implementation.

ansis · 2018-04-12T18:24:59Z

I have one high-level question about this approach: The results are wrong (and already were wrong before this pr) if the query happens after the map moves but before placement runs. This is generally ok because this window is very small. Do you think it will be important to fix this at some point? And if we do fix it, would a global approach still be useful or would it make sense to switch back to tiled indexing at that point?

(I don't think we have to think too much about this right now)

ChrisLoer · 2018-04-12T19:00:55Z

The results are wrong (and already were wrong before this pr) if the query happens after the map moves but before placement runs. This is generally ok because this window is very small. Do you think it will be important to fix this at some point?

Yeah, good point. I think we should ticket it and fix it, although it should be mostly separate from this PR. I can imagine it cropping up in something like an interactive article that uses animations to give a tour of the map and also implements hover functionality in the map -- it would be easy to get kind of bizarre hover behavior.

And if we do fix it, would a global approach still be useful or would it make sense to switch back to tiled indexing at that point?

My first thought on how to fix it would be to convert the query geometry based on the current transform to match the transform used in the placement. Wouldn't that solve most of the problem? The current screen coordinates might map to something that was entirely outside the scope of the last placement, but in that case most of the symbols wouldn't be placed on screen either (there are exceptions like text-allow-overlap: true, but it's already the case that those can show up on screen before they show up in results).

ansis · 2018-04-12T19:15:09Z

My first thought on how to fix it would be to convert the query geometry based on the current transform to match the transform used in the placement. Wouldn't that solve most of the problem?

This would be closer but would also not handle labels that change size or position (relative to the map plane) when zooming, rotating or when panning in perspective view. I think that completely fixing it would involve not using Placement's grid index for querying and instead using an index we can create per-tile in the work or something. We'd need to be able to apply the zoom and perspective transforms to the labels at query time. But making that fast enough would probably be hard and maybe not worth the effort right now.

ChrisLoer · 2018-04-12T19:29:52Z

Yeah, if we wanted symbol querying to always perfectly match what was on screen, I think we'd need a totally different approach... re-calculating everything at query time could work but also be expensive. I guess the "dream" is still that we somehow get the shaders to generate query information for us on every frame.

But I'm less concerned about general problems of the form "what we're rendering at this lat/lon has changed since the last placement" than I am concerned about the specific problem where the effective lat/lon of the query changes.

mb12 · 2018-04-12T19:30:14Z

@ansis If the caller re-runs the query upon receiving event REGION_DID_CHANGE/REGION_DID_CHANGE_ANIMATED, would that fix the problem? If the map is moving/animating when the query is made, it is okay for the results to be inconsistent, but they should be consistent when the client receives the change done events.

ansis · 2018-04-12T21:09:31Z

@ChrisLoer Moving the bucket retaining to placement (dd20158) looks great. Thanks for clearing up all these questions for me. This looks ready to me if the performance regression can be fixed

@mb12 thanks for your input. If I'm understanding you correctly, that would involve introducing new events? that would probably be nice to avoid.

mb12 · 2018-04-12T21:14:16Z

@ansis These events are already available on both iOS and Android. We should add them to js as well. I copy pasted those names from the Android version (that's why they were all caps).

ChrisLoer · 2018-04-12T22:26:27Z

src/data/feature_index.js


-            // don't check the same feature more than once


🤦‍♂️ This turned out to be the cause of the performance regression -- we definitely still need to filter out duplicate entries. I misread FeatureIndex#insert, which can actually make very many entries in the grid for a single feature. I spent half a day messing around trying to reliably profile QueryBox before I finally realized this...

ChrisLoer · 2018-04-12T22:32:53Z

Fixed the perf regression (take these screenshots with a grain of salt -- I've had a lot of trouble getting QueryBox especially to reproduce results reliably... but the >10x slowdown is definitely fixed):

ChrisLoer · 2018-04-13T00:19:47Z

As far as I know, these should be pretty much equivalent since the layout order is derived relatively directly from the order of features in the tile, but I'm not sure I could guarantee it.

I figured out why this PR introduced small changes in sort order even though feature indices and collision box indices should be in the same order. We were sorting with the default Array.prototype.sort, which uses string comparison:

https://github.com/mapbox/mapbox-gl-js/blob/master/src/data/feature_index.js#L133

I switched to using the "topDownFeatureComparator" that we already use for non-symbols (I'm not sure why we do it that way, but it seemed reasonable to be consistent). One nice thing about this change is that the "expected.json" for tests that use 121points.geojson is somewhat more intelligible now -- you can actually tell that within a tile the features maintain the (reverse of the) order they had in the geojson.

ChrisLoer · 2018-04-13T02:12:32Z

I couldn't resist pushing a fix for issue #6184, since it's closely related and I think we should take the opportunity to fix things in queryRenderedSymbols when we've got all of its complexity loaded up in our heads. 🙂

@mollymerp, is there any chance you have the reproduction case hanging around from #6184? I bootstrapped the test for getting the rotation sort order right, but I keep getting tripped up when I stare at all those coordinates and try to think what order they're supposed to be in.

anandthakker

😍

anandthakker · 2018-04-13T12:00:19Z

src/data/feature_index.js


        return result;
    }

-    filterMatching(
+    generateGeoJSONFeature(


Naming nit: feels to me like the most important thing this method is doing is still deciding whether or not to include the feature, moreso than generating the geojson result if the answer is 'yes'. Bad suggested rename: maybeAddFeatureToResult()

🤔You're right, when I first named this I thought the two pathways would end up sharing a smaller chunk of shared code. Maybe I should just go back to filterMatching? Or maybe loadMatchingFeatures?

Went with loadMatchingFeature

ansis · 2018-04-13T14:52:41Z

src/data/bucket/symbol_bucket.js

        for (const i of symbolInstanceIndexes) {
            const symbolInstance = this.symbolInstances[i];
+            this.featureSortOrder.push(symbolInstance.featureIndex);


Would line symbol instances that share the same feature have the same featureIndex? Would all features be sorted based on the y of the first instance?

Yup -- and this could be a little weird in "hover effect" type interactions, where you're hovering over a particular symbol instance that looks topmost, but you're getting the feature behind it first. But since we return unique features, not symbol instances, it would also be kind of weird to sort based on symbol instance: the same query results would get sorted differently just based on how you generated them.

ansis · 2018-04-13T14:57:15Z

src/source/query_features.js

+                    // Actually there can be multiple symbol instances per feature, so
+                    // we sort each feature based on the first matching symbol instance.
+                    const sortedA = featureSortOrder.indexOf(a.featureIndex);
+                    const sortedB = featureSortOrder.indexOf(b.featureIndex);


Would it be better to use an object for the lookup instead of an array with a linear indexOf here? The feature numbers could be low enough that this doesn't matter at all though

I thought a little about this and then decided it would be premature optimization to do that without this showing up as a hot spot. Worst case here is when we sort results from a really dense layer of potentially overlapping results, but even with a thousand features I don't think it gets that bad.

ansis · 2018-04-13T15:04:32Z

src/symbol/placement.js

@@ -393,6 +394,9 @@ export class Placement {
        }

        bucket.sortFeatures(this.transform.angle);
+        if (bucket.featureSortOrder) {
+            this.retainedQueryData[bucket.bucketInstanceId].featureSortOrder = bucket.featureSortOrder.slice();
+        }


Is the copy of the array necessary?
Would passing RetainedQueryData to bucket.sortFeatures(...) avoid the need to store the array on the bucket at all?

Yeah, having the sort array always live with RetainedQueryData is a good idea, and we can avoid the copy.

There is a subtle oddity here -- the sort order will be based on the last time we update opacities, instead of being based on the last time placement happened. It's a little weird that it's not all synced to the same time point, but at least right now, I can't think of why it would be a problem to use a somewhat newer sort order for the same data.

Actually, I realized this change was wrong in some last minute manual testing! Moving to RetainedQueryData is actually awkward because sortFeatures doesn't necessarily run on every placement (it will early-exit if the bearing hasn't changed). We'd have to make sure to copy featureSortOrder from previous placements, and it's actually much simpler to just use the last featureSortOrder from the bucket itself. We still get rid of the array copy.

ansis

Looks good!

Addresses hover flicker from issues #5887 and #5506. Also fixes issue #5475/#6298, so that symbols that bleed over tile boundaries don't get missed. Under the hood, there are some good simplifications: - No round-tripping of viewport query coordinates through tile space - No more need to merge duplicate results from the same symbol showing up in multiple tiles - All querying-related data can now be indexed with a bucket instance id and a feature index - `Placement` now manages lifetime of any data needed to query against its CollisionIndex - CollisionBoxArray no longer involved in querying at all

Tests issue #5475.

Matches topDownComparator of FeatureIndex. Previous behavior was string sorting, which could provide unexpected results with multi-digit feature indices. Update query tests to match new order.

…g symbols. Fixes issue #6184.

ChrisLoer · 2018-04-13T19:41:33Z

@mollymerp I ended up using the new hover-bus behavior in debug/debug.html for manually sanity-checking the results, so no need to worry about finding the original repro case from #6184. My test was just to make two of the icons overlap, check that queryRenderedSymbols put the bottom-y (top-most rendered) first in the list, rotate the map 180 degrees, and make sure the sort order is flipped.

Add marker-change-on-hover functionality to debug.html.

d0635d8

ChrisLoer requested a review from ansis April 11, 2018 01:05

ChrisLoer force-pushed the global-symbol-query branch from b9f6b47 to 7ee20e7 Compare April 11, 2018 22:05

ChrisLoer force-pushed the global-symbol-query branch from 7ee20e7 to 4efc663 Compare April 11, 2018 23:20

ansis reviewed Apr 12, 2018

View reviewed changes

ChrisLoer commented Apr 12, 2018

View reviewed changes

ChrisLoer force-pushed the global-symbol-query branch from dd20158 to 468c78d Compare April 12, 2018 22:30

ChrisLoer force-pushed the global-symbol-query branch 2 times, most recently from 54d464c to ab3ffe9 Compare April 13, 2018 00:05

This was referenced Apr 13, 2018

Fix race conditions in symbol querying #6461

Closed

queryRenderedSymbols does not sort features #6184

Closed

Map move invalidates queryRenderedSymbols coordinates until next placement #6510

Open

anandthakker reviewed Apr 13, 2018

View reviewed changes

ansis reviewed Apr 13, 2018

View reviewed changes

ansis approved these changes Apr 13, 2018

View reviewed changes

ChrisLoer added 3 commits April 13, 2018 12:07

Test that symbols in adjacent tiles are included in query results.

f4b1943

Tests issue #5475.

Use numeric sort comparator for sorting symbol query results.

2d8d7e7

Matches topDownComparator of FeatureIndex. Previous behavior was string sorting, which could provide unexpected results with multi-digit feature indices. Update query tests to match new order.

Make symbol query result sort order match render order for overlappin…

fc1a8a4

…g symbols. Fixes issue #6184.

ChrisLoer force-pushed the global-symbol-query branch from 6566b85 to fc1a8a4 Compare April 13, 2018 19:35

ChrisLoer merged commit 6997943 into master Apr 13, 2018

ChrisLoer deleted the global-symbol-query branch April 13, 2018 19:48

ChrisLoer mentioned this pull request Apr 30, 2018

When hovering a symbol, the transition is too long (v0.44.2) #6572

Closed

	function lineIntersectsLine(lineA: Line, lineB: Line) {
	if (lineA.length === 0 \|\| lineB.length === 0) return false;
	for (let i = 0; i < lineA.length - 1; i++) {
	const a0 = lineA[i];
	const a1 = lineA[i + 1];
	for (let j = 0; j < lineB.length - 1; j++) {
	const b0 = lineB[j];
	const b1 = lineB[j + 1];
	if (lineSegmentIntersectsLineSegment(a0, a1, b0, b1)) return true;
	}
	}
	return false;
	}

Global symbol query #6497

Global symbol query #6497

Conversation

ChrisLoer commented Apr 11, 2018 • edited

ChrisLoer commented Apr 11, 2018

ChrisLoer commented Apr 11, 2018

ansis left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ansis commented Apr 12, 2018

ChrisLoer commented Apr 12, 2018

ansis commented Apr 12, 2018

ChrisLoer commented Apr 12, 2018

mb12 commented Apr 12, 2018

ansis commented Apr 12, 2018

mb12 commented Apr 12, 2018

Choose a reason for hiding this comment

ChrisLoer commented Apr 12, 2018

ChrisLoer commented Apr 13, 2018

ChrisLoer commented Apr 13, 2018

anandthakker left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ansis left a comment

Choose a reason for hiding this comment

ChrisLoer commented Apr 13, 2018

ChrisLoer commented Apr 11, 2018 •

edited