Added syntactic sugar over agg definition #5721
Some aggregations tend to be requested together, but expressing that with the normal (formal) DSL can make the request too verbose. The agg factory/parsing infrastructure was changed so that a factory can automatically add more factories, which enables "syntactic sugar" that reduces verbosity. One common use case for this is the `missing` aggregation. Often, when running values source aggs (e.g. terms, avg, etc.), one also wants the count of the documents that are not associated with values (and were therefore not aggregated). With the new infrastructure, one can define a `track_missing` setting on any values source based aggregation, and a `missing` aggregation will automatically be injected into the request, based on the same value source. Closes #5324
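To make the idea concrete, here is a minimal sketch of the expansion the description talks about, written in Python over plain dicts rather than in the Java of the real factories. It rests on assumptions the text does not confirm: that the value of `track_missing` is simply the name to give the injected aggregation, and that the value source is a `field`. The function name `expand_track_missing` and the sample names `states`, `no_state`, and `avg_price` are hypothetical.

```python
import copy


def expand_track_missing(aggs):
    """Return a copy of `aggs` in which every `track_missing` setting is
    replaced by an explicit sibling `missing` aggregation.

    Assumption: `track_missing` holds the name of the sibling to inject,
    and the values source is given as `field`.
    """
    expanded = {}
    for name, body in aggs.items():
        body = copy.deepcopy(body)          # leave the caller's request untouched
        sub_aggs = body.pop("aggs", None)   # handle nested aggs separately
        for settings in body.values():
            if isinstance(settings, dict) and "track_missing" in settings:
                missing_name = settings.pop("track_missing")
                # The injected sibling uses the same value source; sub aggs
                # of the original aggregation are not replicated under it.
                expanded[missing_name] = {"missing": {"field": settings["field"]}}
        if sub_aggs is not None:
            body["aggs"] = expand_track_missing(sub_aggs)
        expanded[name] = body
    return expanded


request = {
    "states": {
        "terms": {"field": "state", "track_missing": "no_state"},
        "aggs": {"avg_price": {"avg": {"field": "price"}}},
    }
}
expanded = expand_track_missing(request)
```

In this sketch `expanded` contains both `states` (without the `track_missing` setting) and a sibling `no_state` that is an ordinary `missing` aggregation on the `state` field, which is the verbose request the sugar is meant to save typing.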
@uboness I could not play with it, and the docs do not make it clear whether it replicates sub aggs defined within an agg that is marked to track missing. In other words, if I agg by state and then city, and mark the state agg to track missing, will there be city buckets within the missing bucket, along with any accumulator aggs such as sum, avg, etc. defined within city? Also, I feel it would be much more convenient to have missing as part of the normal bucket array, with a key specified when enabling missing tracking for the agg, rather than as a sibling node. After all, missing is just another value (granted, a somewhat special one) of that field. It is just like how databases let you handle null values almost like normal ones, with great ease and convenience.
@roytmana adding
So, for the missing, we decided this is as far as we'll go in that regard, as adding the missing count to the aggs themselves also adds complexity we'd like to avoid. That said, we do plan very soon (this week perhaps) to let you define default values for missing docs, which will enable you to treat those like you'd treat
Thanks for the info @uboness. Then, as you said, it truly is syntactic sugar, adding little value if it does not replicate sub aggs defined within the agg with tracked null values. I would like to re-open my request, if you do not mind, since this solves none of the problems I outlined in it.
@roytmana well, it's there to solve the verbosity of the request, which seemed to be your main concern outlined above. As for the response, the difference in how you'd fetch the data is negligible I'd say:
vs
In any case, you can also wait for the default values feature I mentioned above and use that if it better fits your needs.
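The two snippets compared in this comment did not survive on this page, so the following is only a hypothetical illustration of the kind of difference being discussed, not a reconstruction of them. It contrasts reading the missing count from a sibling `missing` aggregation with reading it from a bucket inside the parent aggregation, as requested earlier in the thread. The response shapes, the helper names, and the sample names `states`, `no_state`, and the `N/A` key are all assumptions.

```python
def missing_from_sibling(aggregations, missing_name):
    """Sibling layout: the missing count sits next to the parent agg."""
    return aggregations[missing_name]["doc_count"]


def missing_from_bucket(aggregations, agg_name, missing_key):
    """Bucket layout (hypothetical): missing is one more bucket with a chosen key."""
    for bucket in aggregations[agg_name]["buckets"]:
        if bucket["key"] == missing_key:
            return bucket["doc_count"]
    return 0  # no bucket with that key


# Made-up sample responses for the two layouts.
sibling_layout = {
    "states": {"buckets": [{"key": "NY", "doc_count": 12}]},
    "no_state": {"doc_count": 3},
}
bucket_layout = {
    "states": {
        "buckets": [
            {"key": "NY", "doc_count": 12},
            {"key": "N/A", "doc_count": 3},
        ]
    }
}
```

For a plain count, both layouts take one short lookup. The layouts differ more when sub aggs are expected under the missing entry, which is the point raised about city buckets above.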
Thank you @uboness, I will wait for default values. It would have been very nice to have an optional "other" bucket in aggs as well :-)
public static ValueCountBuilder count(String name) {
    return valueCount(name);
}
Can you do it in a separate change?
(I'm all for it, but would just like to make sure we have a changes entry for it)
sure
It doesn't feel right to me to make
Although I think it is nice to users to have documentation of
Regarding the syntax, maybe we should make it future-proof in case we would also like to be able to replicate the sub-tree of aggregations under the
"track_missing": {
    "term_missing": "counts"
}
where
I'm not sure I agree with that, the whole idea of syntactic sugar is that it's just applied on the syntax, which means this syntax represents a more elaborated syntax. So this behaviour should not (almost by definition) propagate to the underlying implementation of the factories or the aggregators... it should be done on the highest level possible.
I tend to agree, there's just no place for common settings in the docs now. I did try to put it on the "main" aggs and share it between related ones (e.g. range, date_range, ip_range). Maybe we should consider adding a section for common configurations (also put the values source configuration there?), but I'd do that on a different PR
I don't think it's not future-proof right now. If needed, we could always introduce support for objects as well in the future (nothing prevents us from doing that).
I'm good with that, but can we keep concerns separated and not do both validation and return the siblings in the same method?
OK, can you open an issue already and mark it as a blocker for 1.2 so that we don't forget?
OK.
This has been stale for almost 1 year. I am closing it for now; we can still revisit it.