Elasticsearch doesn't like Antarctica #3

Closed
chrisflatley opened this issue Oct 20, 2015 · 8 comments

@chrisflatley
Contributor

If you process a document with 'Antarctica' in it, Elasticsearch throws an exception:

Caused by: org.elasticsearch.index.mapper.MapperParsingException: failed to parse [entities.geoJson
Caused by: com.spatial4j.core.exception.InvalidShapeException: Self-intersection at or near point (-7.409738314942461, -71.63108011089658, NaN)

@jbaker-dstl
Contributor

The GeoJSON we're using for Antarctica appears valid (validated with http://geojsonlint.com/), so this sounds like it's an Elasticsearch error as opposed to a Baleen one?

@chrisflatley
Contributor Author

I think it is valid, but it must be more complex than ES likes to deal with. A similar problem, with nice pictures, is shown in pelias-deprecated/quattroshapes#16.

I wonder if there's some self-intersection/complexity, or whether it's a 'special case' in ES/GeoJSON around the pole(s).
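
One rough way to test the self-intersection theory outside Elasticsearch is a brute-force check on each polygon ring. The sketch below is only an illustration: the function names are made up for this example, it treats longitude/latitude as plain planar coordinates (so it knows nothing about the date line or the poles), it only detects segments that properly cross, and it is not the check that spatial4j/JTS performs.

```python
def _orient(a, b, c):
    """Signed area test: > 0 if a->b->c turns left, < 0 if right, 0 if collinear."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def _segments_cross(p1, p2, p3, p4):
    """True if segment p1-p2 and segment p3-p4 cross at a single interior point."""
    d1 = _orient(p3, p4, p1)
    d2 = _orient(p3, p4, p2)
    d3 = _orient(p1, p2, p3)
    d4 = _orient(p1, p2, p4)
    return d1 * d2 < 0 and d3 * d4 < 0


def ring_self_intersects(ring):
    """Brute-force O(n^2) check of a closed GeoJSON ring (last point == first point)."""
    n = len(ring) - 1  # number of segments in the closed ring
    segments = [(ring[i], ring[i + 1]) for i in range(n)]
    for i in range(n):
        for j in range(i + 2, n):  # skip the neighbouring segment i + 1
            if i == 0 and j == n - 1:
                continue  # first and last segments share the closing vertex
            if _segments_cross(*segments[i], *segments[j]):
                return True
    return False


# A "bow-tie" ring crosses itself; a plain square does not.
bow_tie = [[0, 0], [1, 1], [1, 0], [0, 1], [0, 0]]
square = [[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]
print(ring_self_intersects(bow_tie), ring_self_intersects(square))
```

If a check like this reports nothing for the Antarctica rings, that would point towards the pole/date line handling or precision rather than a genuinely invalid shape.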

@jbaker-dstl
Contributor

This may have been fixed in Elasticsearch 2, as a whole load of geo-related bugs have been fixed there. I'll try to have a look at it in the next week or two.

@jbaker-dstl jbaker-dstl added the bug label Dec 3, 2015
@jbaker-dstl
Contributor

Finally got round to trying Elasticsearch 2, and it's giving the same error.

@jbaker-dstl
Contributor

This is still an issue - perhaps look at updating the GeoJSON data in Baleen to the latest version.

https://github.com/datasets/geo-countries/blob/master/data/countries.geojson - NB: data comes from a different repository, so will need to recheck license and update READMEs.

@jbaker-dstl
Contributor

The updated data doesn't resolve the issue. It looks like this will need to be resolved in Elasticsearch: elastic/elasticsearch#17407

@jamesfry
Contributor

jamesfry commented Feb 3, 2017

I've just checked and this is still an issue in Elasticsearch 5.2.0. (I don't have a submittable PR for this yet, as the API has changed: NodeBuilder has been removed, and the TransportClient is now the preferred client, with embedded nodes unsupported.)

Digging in a bit further, it sounds like precision/rounding errors in ES may be to blame (see elastic/elasticsearch#7372), or that either the date line or (possibly more likely) the poles cause problems for the validity checks or the polygon mapping code. The date line was handled a while ago and the poles were to come later, but I couldn't find a patch/commit. The precision explanation seems plausible given the precision in the Antarctica data.

There is also some concern about the JTS validation code used (elastic/elasticsearch#13397), and it seems there is a plan to move away from JTS, which may therefore fix the issue.

Anyway, as a test I replaced the Antarctica entry in countries.geojson with updated geometry exported from QGIS with a COORDINATE_PRECISION of 10 (rather than the default of 15) and the problem has seemingly gone away, at least with ES 5.2 and 2.0.
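
For anyone without QGIS to hand, a similar precision reduction can be approximated in a few lines of Python. This is a minimal sketch rather than what QGIS does on export: `round_coordinates` and `reduce_precision` are names invented for this example, only features with a direct `coordinates` array are handled (no GeometryCollection), and rounding could in principle collapse neighbouring points, so the output still needs to be tested against Elasticsearch.

```python
import json


def round_coordinates(coords, precision=10):
    """Recursively round every number in a nested GeoJSON coordinate array."""
    if isinstance(coords, (int, float)):
        return round(coords, precision)
    return [round_coordinates(c, precision) for c in coords]


def reduce_precision(feature_collection, precision=10):
    """Return a copy of a FeatureCollection with coordinates rounded to `precision` decimal places."""
    result = json.loads(json.dumps(feature_collection))  # cheap deep copy of plain JSON data
    for feature in result.get("features", []):
        geometry = feature.get("geometry")
        if geometry and "coordinates" in geometry:
            geometry["coordinates"] = round_coordinates(geometry["coordinates"], precision)
    return result


# Tiny illustrative polygon; the first point is the one from the exception message.
example = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "properties": {"name": "example"},
        "geometry": {
            "type": "Polygon",
            "coordinates": [[
                [-7.409738314942461, -71.63108011089658],
                [0.0, -70.0],
                [1.0, -72.0],
                [-7.409738314942461, -71.63108011089658],
            ]],
        },
    }],
}
print(json.dumps(reduce_precision(example, 10)["features"][0]["geometry"]))
```

In practice the input would be loaded from countries.geojson with `json.load` and the result written back with `json.dump`.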

jamesfry added a commit to jamesfry/baleen that referenced this issue Feb 3, 2017
jbaker-dstl added a commit that referenced this issue Feb 9, 2017
Simplify Antarctica geometry to work around Baleen/ES issue #3
@jbaker-dstl
Contributor

jbaker-dstl commented Feb 9, 2017

For info, in 2.4.0-SNAPSHOT I have reduced the precision of all coordinates to better match the original dataset, and added a test that checks the storage of all country GeoJSONs in Elasticsearch.
