Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better "unmapped" handling for aggregator factories #58139

Open
not-napoleon opened this issue Jun 15, 2020 · 4 comments · May be fixed by #65573
Open

Better "unmapped" handling for aggregator factories #58139

not-napoleon opened this issue Jun 15, 2020 · 4 comments · May be fixed by #65573
Labels
:Analytics/Aggregations Aggregations >refactoring Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) >tech debt

Comments

@not-napoleon
Copy link
Member

When constructing aggregators, ValuesSourceAggregationBuilder has a separate code path for cases where we know there are no values to aggregate (called "unmapped" for historical reasons). Some aggregations, like Geo Tile Grid, take advantage of that to return simpler aggregators for this case, but not all do. Aggregations that don't do this typically create a normal aggregator with a null values source, and have special null handling in the values source.

Going forward, the correct thing to do is to write aggregators only for the case where the values source has values, and to have createUnmapped handle the empty case. Ideally, I'd like to have a default implementation of createUnmapped in ValuesSourceAggregatorFactory and only override it if we need to do something strange.

Also as part of this, we should get rid of any null handling in the aggregators themselves. ValuesSourceConfig no longer produces nulls, and if we handle the unmapped cases correctly, we should never need to worry about nulls in aggregators.

@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-analytics-geo (:Analytics/Aggregations)

@elasticmachine elasticmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Jun 15, 2020
@Thlamz
Copy link
Contributor

Thlamz commented Nov 26, 2020

I'm going to try to solve this one, I will open a PR soon.

@Thlamz Thlamz linked a pull request Nov 27, 2020 that will close this issue
@not-napoleon
Copy link
Member Author

@Thlamz Cool, let me know if you need any help. Also, this doesn't have to happen as one big push; we can clean up factories one by one as needed.

@Thlamz
Copy link
Contributor

Thlamz commented Nov 30, 2020

@not-napoleon I opened a PR with what I could figure out. Most factories and aggregators did the exact same thing on createUnmapped and had the exact same null checks. I was able to clean them out (I think). Any factory that did anything special on createUnmapped I left alone. It would be helpful if you could take a look at my PR and tell me if I had the right idea of if I got everything completely wrong.

Also I couldn't figure out exactly what the default implementation of createUnmapped should do, I used a class named NonCollectingAggregator which I found lying around that seemed be the right fit for this and I created a class named InternalNonMapped which basically doesn't do anything so that I could use it as the InternalAggregator. I am not very confident with this solution and would greatly appreciate your help. I went a bit more into my issues in the description of the PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/Aggregations Aggregations >refactoring Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) >tech debt
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants