Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement stats for geo_point and geo_shape field #22391

Merged
merged 1 commit into from Jan 4, 2017

Conversation

Projects
None yet
2 participants
@jimczi
Copy link
Member

commented Dec 30, 2016

Currently geo_point and geo_shape field are treated as text field by the field stats API and we
try to extract the min/max values with MultiFields.getTerms.
This is ok in master because a geo_point field is always a Point field but it can cause problem in 5.x (and 2.x) because the legacy
geo_point are indexed as terms.
As a result the min and max are extracted and then printed in the FieldStats output using BytesRef.utf8ToString
which can throw an IndexOutOfBoundException since it's not valid UTF8 strings.
This change ensure that we never try to extract min/max information from a geo_point field.
It does not add a new type for geo points in the fieldstats API so we'll continue to use text for this kind of field.
This PR is targeted to master even though we could only commit this change to 5.x. I think it's cleaner to have it in master too before we make any decision on
#21947.

Fixes #22384

Implement stats for geo_point and geo_shape field
Currently `geo_point` and `geo_shape` field are treated as `text` field by the field stats API and we
try to extract the min/max values with MultiFields.getTerms.
This is ok in master because a `geo_point` field is always a Point field but it can cause problem in 5.x (and 2.x) because the legacy
 `geo_point` are indexed as terms.
 As a result the min and max are extracted and then printed in the FieldStats output using BytesRef.utf8ToString
 which can throw an IndexOutOfBoundException since it's not valid UTF8 strings.
 This change ensure that we never try to extract min/max information from a `geo_point` field.
 It does not add a new type for geo points in the fieldstats API so we'll continue to use `text` for this kind of field.
 This PR is targeted to master even though we could only commit this change to 5.x. I think it's cleaner to have it in master too before we make any decision on
  #21947.

Fixes #22384
@jpountz

jpountz approved these changes Jan 4, 2017

Copy link
Contributor

left a comment

LGTM

@jimczi jimczi merged commit 360ce53 into elastic:master Jan 4, 2017

1 of 2 checks passed

elasticsearch-ci Build started sha1 is merged.
Details
CLA Commit author is a member of Elasticsearch
Details

@jimczi jimczi deleted the jimczi:geo_field_stats branch Jan 4, 2017

jimczi added a commit that referenced this pull request Jan 4, 2017

Implement stats for geo_point and geo_shape field (#22391)
Currently `geo_point` and `geo_shape` field are treated as `text` field by the field stats API and we
try to extract the min/max values with MultiFields.getTerms.
This is ok in master because a `geo_point` field is always a Point field but it can cause problem in 5.x (and 2.x) because the legacy
 `geo_point` are indexed as terms.
 As a result the min and max are extracted and then printed in the FieldStats output using BytesRef.utf8ToString
 which can throw an IndexOutOfBoundException since it's not valid UTF8 strings.
 This change ensure that we never try to extract min/max information from a `geo_point` field.
 It does not add a new type for geo points in the fieldstats API so we'll continue to use `text` for this kind of field.
 This PR is targeted to master even though we could only commit this change to 5.x. I think it's cleaner to have it in master too before we make any decision on
  #21947.

Fixes #22384
@jimczi

This comment has been minimized.

Copy link
Member Author

commented Jan 4, 2017

Thanks @jpountz
I pushed to master and 5.x

@jpountz

This comment has been minimized.

Copy link
Contributor

commented Jan 4, 2017

Should we merge to the 5.1 branch too?

@jimczi

This comment has been minimized.

Copy link
Member Author

commented Jan 4, 2017

Sure I'll merge to 5.1 too.

@jimczi jimczi added the v5.1.2 label Jan 4, 2017

jimczi added a commit that referenced this pull request Jan 4, 2017

Implement stats for geo_point and geo_shape field (#22391)
Currently `geo_point` and `geo_shape` field are treated as `text` field by the field stats API and we
try to extract the min/max values with MultiFields.getTerms.
This is ok in master because a `geo_point` field is always a Point field but it can cause problem in 5.x (and 2.x) because the legacy
 `geo_point` are indexed as terms.
 As a result the min and max are extracted and then printed in the FieldStats output using BytesRef.utf8ToString
 which can throw an IndexOutOfBoundException since it's not valid UTF8 strings.
 This change ensure that we never try to extract min/max information from a `geo_point` field.
 It does not add a new type for geo points in the fieldstats API so we'll continue to use `text` for this kind of field.
 This PR is targeted to master even though we could only commit this change to 5.x. I think it's cleaner to have it in master too before we make any decision on
  #21947.

Fixes #22384
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.