[Lens] Field stats endpoint does not need to use sampler aggregation #74595

wylieconlon · 2020-08-06T20:25:30Z

Originally, we thought that the sampler aggregation would behave like a random sampling query, with improved performance across large datasets. This is not what the sampler aggregation actually does, which means that we are doing more work instead of less. This aggregation can be removed entirely.
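For illustration, here is a minimal sketch of the two request shapes under discussion. It is not the actual Lens endpoint code. The helper names (`topValuesAggs`, `withSampler`), the field name, and the `shard_size` value are assumptions; only the `sampler`, `shard_size`, and `terms` keys come from the Elasticsearch aggregation DSL.

```typescript
// Hypothetical helpers for illustration; not taken from the Kibana code base.
type Aggs = Record<string, unknown>;

// Stand-in for the field stats aggregations: a top-values terms aggregation.
function topValuesAggs(field: string): Aggs {
  return {
    top_values: { terms: { field, size: 10 } },
  };
}

// Wraps the stats in a sampler aggregation, which limits sub-aggregations
// to the top-scoring documents on each shard (not a random sample).
function withSampler(inner: Aggs, shardSize: number): Aggs {
  return {
    sample: {
      sampler: { shard_size: shardSize },
      aggs: inner,
    },
  };
}

const field = "host.keyword"; // assumed field name

// Request body with the sampler wrapper.
const sampledBody = { size: 0, aggs: withSampler(topValuesAggs(field), 5000) };

// Request body with the sampler removed, as this issue proposes.
const plainBody = { size: 0, aggs: topValuesAggs(field) };

console.log(JSON.stringify(sampledBody, null, 2));
console.log(JSON.stringify(plainBody, null, 2));
```

Removing the sampler amounts to sending the inner aggregations directly, so any code that reads the response would also need to stop unwrapping the `sample` bucket.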

elasticmachine · 2020-08-06T20:25:32Z

Pinging @elastic/kibana-app (Team:KibanaApp)

flash1293 · 2020-08-10T08:43:08Z

@wylieconlon Are you sure not using sampler is better than using it in cases where it really matters (super large data sets)?

From the documentation you linked:

Example use cases

Reducing the running cost of aggregations that can produce useful results using only samples e.g. significant_terms

Probably missing a nuance here.

wylieconlon · 2020-08-10T15:01:15Z

@flash1293 Because we don't use any of those aggregations when calculating the samples, that part is not relevant. The other example use case is potentially relevant, with caveats:

Tightening the focus of analytics to high-relevance matches rather than the potentially very long tail of low-quality matches

This part actually might be relevant if the user has added a rank-affecting query to the Lens editor before clicking on the preview. Exact match queries wouldn't have any effect, but a query with OR would affect the rank, as would a wildcard query.

Because it's such a narrow subset of queries that affect the results, I would say that the sampler is not useful.
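The point about rank-affecting queries can be sketched with a simplified model of what the sampler does on a single shard: keep the highest-scoring hits. This is an assumption-laden toy, not Elasticsearch's implementation. `ScoredDoc`, `sampleTopScoring`, and the scores are hypothetical, and real tie-breaking between equal scores is not modeled.

```typescript
// Hypothetical model of one shard's hits; scores are made up for illustration.
interface ScoredDoc {
  id: string;
  score: number;
}

// Simplified sampler: keep the `shardSize` highest-scoring hits.
function sampleTopScoring(hits: ScoredDoc[], shardSize: number): ScoredDoc[] {
  // Copy before sorting so the input array is left untouched.
  return [...hits].sort((a, b) => b.score - a.score).slice(0, shardSize);
}

// Rank-affecting query: scores differ, so the sample favors the best matches.
const ranked: ScoredDoc[] = [
  { id: "a", score: 0.2 },
  { id: "b", score: 1.7 },
  { id: "c", score: 0.9 },
  { id: "d", score: 1.1 },
];
const rankedSample = sampleTopScoring(ranked, 2).map((d) => d.id);

// Every hit has the same score: all hits tie, so in this model the sample is
// simply the first hits in input order, which is neither random nor relevance-driven.
const constant: ScoredDoc[] = ranked.map((d) => ({ ...d, score: 1 }));
const constantSample = sampleTopScoring(constant, 2).map((d) => d.id);

console.log(rankedSample, constantSample);
```

In this model the sampler only changes which documents are analyzed when scores differ, which matches the narrow subset of queries described above.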

flash1293 · 2020-08-11T07:11:04Z

Ah, I forgot we aren't using aggregations for gathering the stats - makes total sense in that case, thanks.

stratoula · 2023-09-06T11:39:06Z

We have changed the implementation, so this issue is no longer valid.

wylieconlon added the technical debt (Improvement of the software architecture and operational architecture), Team:Visualizations (Visualization editors, elastic-charts and infrastructure), and Feature:Lens labels Aug 6, 2020

wylieconlon added this to Long-term goals in Lens via automation Aug 6, 2020

wylieconlon moved this from Long-term goals to Tech Debt in Lens Aug 6, 2020

flash1293 mentioned this issue Aug 14, 2020

[Meta][Lens] Technical debt #75030

Closed

43 tasks

wylieconlon mentioned this issue Oct 26, 2020

[Lens] Misleading percentages shown in field stats when the field is only on some docs #81677

Closed

jughosta mentioned this issue Sep 22, 2022

[Discover][Lens] Meta - Unified field list #137779

Closed

31 tasks

jughosta added the Feature:UnifiedFieldList (The unified field list component used by Lens & Discover) label Jan 24, 2023

stratoula closed this as completed Sep 6, 2023
