Skip to content
This repository has been archived by the owner on Jun 20, 2023. It is now read-only.

Support for aggregation framework #146

Closed
hlassiege opened this issue Sep 18, 2014 · 12 comments
Closed

Support for aggregation framework #146

hlassiege opened this issue Sep 18, 2014 · 12 comments

Comments

@hlassiege
Copy link

Hi,

I'm wondering if the support of the aggregation framework is under work ?

Thanks

@kramer
Copy link
Member

kramer commented Sep 18, 2014

Nobody is working on it currently as far as I know; feel free to take a shot :)

@hlassiege
Copy link
Author

Ok. I'll do it.

@jtjeferreira
Copy link

Hi

whats the status of this issue? I checked #148 was not accepted, but i would like to contribute some aggregations support

@kramer
Copy link
Member

kramer commented Oct 30, 2014

This issue is blocked by #160 - which is also in the todo list.

@kramer
Copy link
Member

kramer commented Nov 4, 2014

Blockage cleared, issue is up for grabs!

@jtjeferreira Feel free to work on this and please mind our Contribution Guidelines.

@jtjeferreira
Copy link

Hi

I solved my issue by parsing the json object to get my aggregations. I will try to improve it by making it more generic and contribute back

@cfstout
Copy link
Contributor

cfstout commented Feb 11, 2015

Is this still an open issue? Looks like there was some work done previously, but I don't see aggregation support anywhere in the code or in open PRs. I'd be happy to take a crack at it if it's still an open issue.

@kramer
Copy link
Member

kramer commented Feb 11, 2015

Hi @cfstout,

Yes this is still an open issue, feel free to give it a go and we'll be more than happy to accept it.

Pinging @ferhatsb who mentioned he might have an idea on the matter some time ago 😄 !

@cfstout
Copy link
Contributor

cfstout commented Feb 11, 2015

Aggregations introduce the notion of nesting. As this is difficult to handle from the Jest side, I'm going to add documentation to the aggregation class saying that parsing for nested aggregations must happen on the user's side. If not we would have to know the specific name of each aggregation while parsing out into the various Aggregation models used greatly complicating the code. Let me know if that sounds ok.

That being said I've gone ahead and created aggregation models for all the other aggregations except the ones elasticsearch lists as "experimental", but that is my next step. Progress is in my fork: https://github.com/cfstout/Jest/tree/Aggregation_support

cfstout pushed a commit to cfstout/Jest that referenced this issue Feb 11, 2015
I have added new models in the io.searchbox.core.search.aggregation
package which represent all the base aggregations. I have not added
any classes for 'nested' aggregations as these are simply ways of crafting
subaggregations on the request side, but should be handled as one of
the base models when dealing with search results.

TODO:
* implement getAggs() in the SearchResults class
* Add tests around these aggregations models
cfstout pushed a commit to cfstout/Jest that referenced this issue Feb 12, 2015
I have added new models in the io.searchbox.core.search.aggregation
package which represent all the base aggregations. I have not added
any classes for 'nested' aggregations as these are simply ways of crafting
subaggregations on the request side, but should be handled as one of
the base models when dealing with search results.

TODO:
* implement getAggs() in the SearchResults class
* Add tests around these aggregations models
@cfstout
Copy link
Contributor

cfstout commented Feb 12, 2015

Seems elasticsearch also removed the _type field in aggregations, so we can't follow the same pattern used by facets. I am going to try a different approach by having a single aggregation model, with all of the possible aggregation fields, and a list of sub aggregations. Basically, my plan is to create a java pojo representation of the json object adding some back-end logic to make the aggregations easy to deal with.

@cfstout
Copy link
Contributor

cfstout commented Feb 12, 2015

Just pushed a 'hacky' version to my branch that returns aggregations with fields without knowing about type. My plan is to now override the getAggregations method in the SearchResult class to also take a name to type map, which would allow the user to provide the specific type of aggregation requested and the associated name. That way when we encounter the name in the json document, we simply perform a lookup in the table as opposed to looking at the _type metadata field present before. Then we can recursively pass down this map to the aggregations themselves to handle the nested use case also.

cfstout pushed a commit to cfstout/Jest that referenced this issue Feb 13, 2015
I have create two ways to get aggregation data from a search response.

1) End user can provide a map of names to the aggregation types
expected to be returned. This takes the place of the _type metadata
returned in faceting. This map does not need to be in any particular
order, but each aggregation will search through and see if it has
any matching sub-aggregations matching the names present and populate
them as needed. It is important to note that using this method we
will ONLY return aggregations specified by the user in this map.

2) Type invariant aggregations can be requested. This method will simply return a
list of TypeInvariantAggregations. This is recommended if the user is unsure of
the Aggregations supplied, or simply wants a representation of all data returned
in the aggregation. This module introduces some danger and requires responsibility
on the user's part. The intended use is to call the getAggregationFields() method
to get a list of all AggregationFields contained. The user is only assured
that those particular methods will return non-null results.

**TODO**

* Polish off these methods-- this is meant to be a rough draft of the Aggregations module
in order to get rapid feedback

* Add testing over the Aggregation module
@cfstout
Copy link
Contributor

cfstout commented Feb 13, 2015

I just pushed a draft of this aggregation module. See commit ^. @kramer I'd love some feedback just whenever you have some free time before I continue with polishing and writing tests for this code.

cfstout pushed a commit to cfstout/Jest that referenced this issue Feb 18, 2015
Changes to reflect suggestions by kramer
cfstout pushed a commit to cfstout/Jest that referenced this issue Feb 18, 2015
Changes to reflect suggestions by kramer
cfstout pushed a commit to cfstout/Jest that referenced this issue Feb 18, 2015
cfstout pushed a commit to cfstout/Jest that referenced this issue Feb 24, 2015
I have create three ways to get aggregation data from a search response.

1) End user can call `get<X>Aggregation(String aggName)` where X is the type of
aggregation expected and agg name is the name of the aggregation.

2) User can call `getAggregation(String aggName, Class<T> aggType)` where
aggName is the name of the aggregation and aggType is the type expected for
that particular aggregation.

3) User can call `getAggregations(Map<String, Class<T>> nameToTypeMap)` which
provides a map of names to expected types for this level of aggregation.
A list of aggregations of the given types will be returned as a result.

In order to handle nested aggregations, the root `Aggregation` class contains all three methods.
The `getAggregations()` call in `SearchResult` returns the root level aggregation.
Thus to get an average aggregation named 'avg1' the end user would make a call like:
`searchResult.getAggregations().getAvgAggregation('avg1')` which would then have the
`getAvg` method available. Nested aggregations could then be achieved by nesting these calls to
the aggregations returned at each call:
`searchResult.getAggregations.getTermsAggregation('term1').getAvgAggregation('avg1')`

Tests were added for each of these Aggregation objects with the objective of testing
all methods of a particular aggregation object in the case where a user provides
a proper name, and in the case where they provide a bad name by which to get the
aggregation.

Finally, many of these methods have the potential to create null values, as a bad
name will lead to no data being returned. I have commented in JavaDoc style these
methods with the potential to return null so that end users will be aware of the
possibility.
cfstout pushed a commit to cfstout/Jest that referenced this issue Feb 24, 2015
I have create three ways to get aggregation data from a search response.

1) End user can call `get<X>Aggregation(String aggName)` where X is the type of
aggregation expected and agg name is the name of the aggregation.

2) User can call `getAggregation(String aggName, Class<T> aggType)` where
aggName is the name of the aggregation and aggType is the type expected for
that particular aggregation.

3) User can call `getAggregations(Map<String, Class<T>> nameToTypeMap)` which
provides a map of names to expected types for this level of aggregation.
A list of aggregations of the given types will be returned as a result.

In order to handle nested aggregations, the root `Aggregation` class contains all three methods.
The `getAggregations()` call in `SearchResult` returns the root level aggregation.
Thus to get an average aggregation named 'avg1' the end user would make a call like:
`searchResult.getAggregations().getAvgAggregation('avg1')` which would then have the
`getAvg` method available. Nested aggregations could then be achieved by nesting these calls to
the aggregations returned at each call:
`searchResult.getAggregations.getTermsAggregation('term1').getAvgAggregation('avg1')`

Tests were added for each of these Aggregation objects with the objective of testing
all methods of a particular aggregation object in the case where a user provides
a proper name, and in the case where they provide a bad name by which to get the
aggregation.

Finally, many of these methods have the potential to create null values, as a bad
name will lead to no data being returned. I have commented in JavaDoc style these
methods with the potential to return null so that end users will be aware of the
possibility.
@kramer kramer closed this as completed Mar 1, 2015
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants