Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Maps] blended ES document source that shows clusters and documents #48459

Closed
wants to merge 3 commits into from

Conversation

nreese
Copy link
Contributor

@nreese nreese commented Oct 16, 2019

POC for blended ES documents source that shows cluster when the grid contains more then 10% of ES_SIZE_LIMIT

Screen Shot 2019-10-16 at 2 00 47 PM

The sample data sets included with Kibana do not that many documents so testing this PR with those is a little difficult. If you want to use sample data sets, set ES_SIZE_LIMIT to 1000

@nreese nreese added release_note:enhancement WIP Work in progress [Deprecated-Use Team:Presentation]Team:Geo Former Team Label for Geo Team. Now use Team:Presentation v8.0.0 v7.6.0 labels Oct 16, 2019
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-gis (Team:Geo)

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

Copy link
Contributor

@thomasneirynck thomasneirynck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all, apologies that this took so long to return a proper review.

I think what this PR is looking to solve, mapping data beyond the 10k doc limit is critical. The techniques used here to solve this works really well in the context of Maps imho. It is a purely client-side introduction, that takes advantage of some metadata of the request-responses (ie. Total-counts) to determine whether a given view can be refined.

  • it removes the need for a manual scale-dependent configuration across multiple layers
  • it makes fair assumptions about how geo-data can be displayed and is usable. Basically, when the count exceeds the 10k limit, the individual data becomes illegible anyway due to the cluttered data. A tiled-rollup is a fair approach that imho will work for many real world datasets.
  • it reuses existing concepts of Maps, which just can be “cut&paste” together and leveraged in the right place.
    • scale dependency of layers
    • geo_tile grid views

Because of this, imho this could be a good “client-side-only” approach to push maps beyond the 10k limit

There are two caveats I think we should consider:

  • I am not 100% sure if the “blending” should occur within a single level-of-detail/zoom-level. It makes sometimes for an odd-visual effect. I think this "blended" approach would work equally well on a per zoom-level basis. Just flip between individual-doc or clustered view when zooming in/out based on count-of-docs in that view.
  • Essentially, this PR is introducing a tiled-method to retrieve larger collections of data. Individual docs are retrieved on a per-tile basis. This can create many request for data within the single view. (In the order of dozens per viewport). When using a tiled approach to retrieve geo-data, this is mostly used in combination with client-side caching of tiles, so previously requested tiles are not re-requested when panning/zooming. It is this client-side caching that creates the performance-advantage over the “request all within view” approach currently in Maps. This client-side caching is already implemented in mapbox-gl for vector-tile sources. But in order to leverage that, we need to create a REST-api point that can retrieve tiles for any given tile defined by XYZ params.

Apart from those caveats, this blended approach would also introduce complexities in the styling-UX and tooltip-UX. Presumably, users would want to modify these individually, for the “clustered”-view and for the “individual-doc” view. Right now, these have a one-to-one relation to the layer-type. A “blended” approach like this would necessarily complicate the layer-editor.

Another (minor) complication is how we would represent this in Inspector.

imho, this would be a good approach to develop further. It fits in the context of having "better defaults". At its core, this PR automates the selection of zoom-dependent layers, which is in-line with the current approach we recommend to users on how to deal with large data. It simplifies it by collapsing these two views of data (grid-agg vs individual-agg) in two layers) into a single layer.

return this._getGeoJsonFromHits(layerName, searchFilters, registerCancelCallback, ES_SIZE_LIMIT);
}

const clusterLimit = ES_SIZE_LIMIT / 10;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

somewhat arbitrary (?)

@bcamper
Copy link

bcamper commented Nov 15, 2019

Great summary @thomasneirynck. I agree this is great progress towards addressing the 10k limit and hoping there's some iteration of it we can move ahead with.

Also agree that the "mixed" aggregation and documents in a single is kind of awkward, as currently visualized. But if we keep the agg/doc "mode" consistent within a single view (e.g. "flip between individual-doc or clustered view when zooming in/out based on count-of-docs in that view" as you suggested), won't we then have a different but potentially still odd transition where just panning the view could cause the whole view to switch between aggs/docs? This might still be better though, it could be worth trying for comparison.

Another option would be to do this transition at a global zoom level, by querying all potential documents in the world at each zoom. Might have its own issues though :) One thing that is nice about it being more dynamic is that it allows for local areas of high detail (common for some kinds of data, e.g. urban areas with high density and rural with low), so maybe we wouldn't want to sacrifice that either.

Related, we could consider more traditional "cluster"-styling, which could communicate the doc/agg distinction in a more familiar manner. Prior art in:

https://docs.mapbox.com/mapbox-gl-js/example/cluster/
https://github.com/Leaflet/Leaflet.markercluster

On the other hand, we just switched to using the grid view as the default because the way we do point-symbolized aggs can be confusing #49976, so 🤷‍♂️

@nreese nreese added v7.7.0 and removed v7.6.0 labels Jan 15, 2020
@nreese
Copy link
Contributor Author

nreese commented Feb 18, 2020

closing, replacing with #57879

@nreese nreese closed this Feb 18, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Deprecated-Use Team:Presentation]Team:Geo Former Team Label for Geo Team. Now use Team:Presentation discuss release_note:enhancement v7.7.0 v8.0.0 WIP Work in progress
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants