Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CRS Axis Order issues with 1.8.9 #816

Closed
snowman2 opened this issue Oct 29, 2019 · 27 comments
Closed

CRS Axis Order issues with 1.8.9 #816

snowman2 opened this issue Oct 29, 2019 · 27 comments

Comments

@snowman2
Copy link
Contributor

Problem Description

When reading in: https://raw.githubusercontent.com/geopandas/geopandas/master/geopandas/tests/data/overlay/overlap/df1_df2_overlap-union.geojson

With previous versions of Fiona, the WKT string was read in as:

GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Latitude",NORTH],AXIS["Longitude",EAST],AUTHORITY["EPSG","4326"]]

However, currently (1.8.9) it is read in as:

GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Longitude",EAST],AXIS["Latitude",NORTH]]

Everything is mostly the same except for the axis order at the end.

I believe this is due to this line:

osr_set_traditional_axis_mapping_strategy(cogr_srs)

This is problematic for a couple of reasons:

  1. It does not preserve the original WKT string/projection and mangles it
  2. What if the user actually modified their data to be in the axis order mentioned in the WKT string? How would they tell Fiona that they did so?

Recommended Solution

Add an always_xy flag to crs_to_wkt() and only use it when creating a CRS to be used in the transformation. It would also be a good idea to add an always_xy flag to the transformation operations for the case where the user modified the underlying data to match the axis order specified in the CRS. For Fiona 1.8.x, the always_xy flag could be omitted from the transformation operations and just default to always_xy=True in crs_to_wkt()

Thoughts?

@jorisvandenbossche
Copy link
Member

@snowman2 I don't seem to be able to replicate this with just a fiona example:

In [1]: import fiona

In [2]: import geopandas

In [3]: import pprint 

In [4]: with fiona.open("https://raw.githubusercontent.com/geopandas/geopandas/master/geopandas/tests/data/overlay/overlap/df1_df2_overlap-union.geojson") as c: 
   ...:     pprint.pprint(c.meta) 
   ...:   
{'crs': {'init': 'epsg:4326'},
 'crs_wkt': 'GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS '
            '84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Longitude",EAST],AXIS["Latitude",NORTH],AUTHORITY["EPSG","4326"]]',
 'driver': 'GeoJSON',
 'schema': {'geometry': 'Polygon',
            'properties': OrderedDict([('col1', 'int'), ('col2', 'int')])}}

with the latest fiona:

In [11]: geopandas.show_versions()   

SYSTEM INFO
-----------
python     : 3.7.3 | packaged by conda-forge | (default, Jul  1 2019, 21:52:21)  [GCC 7.3.0]
executable : /home/joris/miniconda3/envs/test-fiona-189/bin/python
machine    : Linux-4.15.0-1059-oem-x86_64-with-debian-buster-sid

GEOS, GDAL, PROJ INFO
---------------------
GEOS       : 3.7.2
GEOS lib   : /home/joris/miniconda3/envs/test-fiona-189/lib/libgeos_c.so
GDAL       : 3.0.1
GDAL data dir: None
PROJ       : 6.2.0
PROJ data dir: /home/joris/miniconda3/envs/test-fiona-189/share/proj

PYTHON DEPENDENCIES
-------------------
geopandas  : 0.6.1
pandas     : 0.25.3
fiona      : 1.8.9.post2
numpy      : 1.17.3
shapely    : 1.6.4.post2
rtree      : 0.8.3
pyproj     : 2.4.0
matplotlib : None
mapclassify: None
pysal      : None
geopy      : None
psycopg2   : None

@snowman2
Copy link
Contributor Author

snowman2 commented Nov 4, 2019

@jorisvandenbossche the Longitude and Latitude are swapped like the example I have above for the latest fiona in your WKT string. So, that seems consistent with this issue.

@jorisvandenbossche
Copy link
Member

Ah, indeed, I was only looking at the presence/absence of the AUTHORITY["EPSG","4326"].
But so this is also strange: it returns an invalid WKT then, as it has the authority code while the axis order is not according to that authority.
(in your example, for the WKT where the long/lat order is swapped, also the authority code 4326 is not present)

@snowman2
Copy link
Contributor Author

snowman2 commented Nov 4, 2019

Ah, indeed, I was only looking at the presence/absence of the AUTHORITY["EPSG","4326"].
But so this is also strange: it returns an invalid WKT then, as it has the authority code while the axis order is not according to that authority.
(in your example, for the WKT where the long/lat order is swapped, also the authority code 4326 is not present)

Yes, that is indeed troubling. The WKT is mangled and misleading.

@jorisvandenbossche
Copy link
Member

So but do you have a reproducible example for the output you showed above? (where the authority was removed)

@snowman2
Copy link
Contributor Author

snowman2 commented Nov 4, 2019

So but do you have a reproducible example for the output you showed above? (where the authority was removed)

Not sure. I may have deleted the environment.

@jorisvandenbossche
Copy link
Member

Ah, and now I see why I thought I couldn't replicate it. Because for another file, I actually get the "correct" axis order (the one that matches the EPSG code):

In [31]: path = geopandas.datasets.get_path("naturalearth_lowres")  

In [32]: with fiona.open(path) as c: 
    ...:     meta = c.meta 

In [33]: meta['crs']  
Out[33]: {'init': 'epsg:4326'}

In [34]: meta['crs_wkt']  
Out[34]: 'GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Latitude",NORTH],AXIS["Longitude",EAST],AUTHORITY["EPSG","4326"]]'

In [35]: fiona.__version__
Out[35]: '1.8.9.post2'

This still gives latitude/longitude order in the CRS.

So it seems to depend on the actual source representation of the CRS. In case of the above example, it is a WKT in a .prj file of the shapefile. While in the other example it was a ogc:urn string in geojson.

@snowman2
Copy link
Contributor Author

snowman2 commented Nov 4, 2019

@jorisvandenbossche, you are correct. Inside the geojson file is the ogc URN which has a different axis order than EPSG 4326 (geopandas/geopandas#1027 (comment)). It sounds like older versions of GDAL read it in as EPSG:4326 when that was not the case. So, it sounds like this issue is not what I originally thought and can probably be closed after verification.

@sgillies
Copy link
Member

sgillies commented Nov 4, 2019

@snowman2 as far as I know we must, for EPSG:4326 and some other CRS, expect the WKT reported by Fiona 1.8.9 & GDAL 3.0 to report lat/long order at the same time the Fiona API returns long/lat coordinates. GDALSetAxisMappingStrategy is what Fiona uses to preserve the existing coordinate order but I don't think there was ever any intent for that function to modify the WKT representation of the CRS.

@snowman2
Copy link
Contributor Author

snowman2 commented Nov 4, 2019

I don't think there was ever any intent for that function to modify the WKT representation of the CRS

That is great news! I guess I assumed it did since it was in the crs_to_wkt() function. But, I am glad it does not.

I think this is resolved then. Thanks for the clarification!

@snowman2 snowman2 closed this as completed Nov 4, 2019
@jorisvandenbossche
Copy link
Member

I still have a bunch of questions, though, related to this.

  • So it seems roundtrip to shapely and geopackage indeed preserve the CRS, so the original reported issue is specific to the GeoJSON driver and because of its use of "urn:ogc:def:crs:OGC:1.3:CRS84" as the CRS name in the file.
    However, because it seems you always get this "urn:ogc:def:crs:OGC:1.3:CRS84" crs name (if the input crs is EPSG:4326), this means you cannot do a proper roundtrip including the CRS:

    In [1]: import fiona
    
    In [2]: import geopandas
    
    In [3]: schema = geopandas.io.file.infer_schema(geopandas.read_file(geopandas.datasets.get_path('nybb'))) 
    
    In [4]: import pyproj
    
    In [5]: crs = pyproj.CRS("EPSG:4326") 
    
    In [6]: with fiona.open("test.geojson", driver="GeoJSON", mode='w', schema=schema, crs_wkt=crs.to_wkt()): 
       ...:     pass 
    
    In [7]: with fiona.open("test.geojson") as c: 
       ...:     meta = c.meta 
    
    In [8]: meta['crs']
    Out[8]: {'init': 'epsg:4326'}
    
    In [9]: meta['crs_wkt'] 
    Out[9]: 'GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Longitude",EAST],AXIS["Latitude",NORTH],AUTHORITY["EPSG","4326"]]'
    
    In [10]: crs_new = pyproj.CRS(meta['crs_wkt']) 
    
    In [11]: crs_new  
    Out[11]: 
    <Geographic 2D CRS: GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84" ...>
    Name: WGS 84
    Axis Info [ellipsoidal]:
    - lon[east]: Longitude (degree)
    - lat[north]: Latitude (degree)
    Area of Use:
    - undefined
    Datum: World Geodetic System 1984
    - Ellipsoid: WGS 84
    - Prime Meridian: Greenwich
    
    In [12]: crs == crs_new 
    Out[12]: False
    

    This is actually what triggered @snowman2 to open this issue, I think. So but then if fiona considers the above the correct behaviour, I am wondering if we should override this in GeoPandas: if the source is geojson and if the resulting CRS is equal to "CRS84", replace the crs with EPSG:4326. As this is what most people will expect .. ?

  • Is there a way to retrieve the "original source" of the CRS information? Currently you have the "crs" (which is a proj string) and "crs_wkt" keys in the meta. But for the geojson, the original source would be "urn:ogc:def:crs:OGC:1.3:CRS84". For a shapefile, the original source would be the text in the .prj file (which, even if it is WKT, can be different than the return value of "crs_wkt" meta key).

  • What is the WKT format version of the returned string of meta['crs_wkt'] ?
    I suppose it is some form of the WKT1_GDAL version (if you ask for that version in pyproj, you get something not exactly the same but rather close). Is there a way to ask for WKT2 ?

@snowman2
Copy link
Contributor Author

snowman2 commented Nov 6, 2019

And the plot thickens:

>>> import fiona
>>> import geopandas
>>> schema = geopandas.io.file.infer_schema(geopandas.read_file(geopandas.datasets.get_path('nybb'))) 
>>> _CRS = (
...     'GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,'
...     'AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],'
...     'PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],'
...     'UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],'
...     'AXIS["Longitude",EAST],AXIS["Latitude",NORTH]]'
... )
>>> fds = fiona.open("test.geojson", driver="GeoJSON", mode='w', schema=schema, crs_wkt=_CRS)
>>> fds.close()
>>> with fiona.open("test.geojson") as c: 
...     print(c.crs_wkt)
... 
GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Latitude",NORTH],AXIS["Longitude",EAST],AUTHORITY["EPSG","4326"]]
>>> fds = fiona.open("test.geojson", driver="GeoJSON", mode='w', schema=schema, crs="EPSG:4326")
>>> fds.close()
>>> with fiona.open("test.geojson") as c: 
...     print(c.crs_wkt)
... 
GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Longitude",EAST],AXIS["Latitude",NORTH]]
>>> fds = fiona.open("test.geojson", driver="GeoJSON", mode='w', schema=schema, crs_wkt="EPSG:4326")
>>> fds.close()
>>> with fiona.open("test.geojson") as c: 
...     print(c.crs_wkt)
... 
GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Longitude",EAST],AXIS["Latitude",NORTH]]

So, it seems the WKT representation with the axis order swapped is not honored and always produces the lat/lon order while the EPSG code always produces the lon/lat order when reading back in.

@snowman2
Copy link
Contributor Author

snowman2 commented Nov 6, 2019

This is really funny, with the GeoJSON driver:

  1. If you pass in "urn:ogc:def:crs:OGC:1.3:CRS84" to write the CRS, you get the WKT equivalent to "EPSG:4326"
  2. If you pass in "EPSG:4326" to write the CRS you get the WKT equivalent to "urn:ogc:def:crs:OGC:1.3:CRS84"

snowman2 added a commit to snowman2/geopandas that referenced this issue Nov 6, 2019
@jorisvandenbossche
Copy link
Member

@snowman2 if you look at the geojson file for the first case, you will see that there was no crs information written (so in the case you passed a WKT to crs_wkt, and I suppose this is because it didn't recognize that specific WKT string? Although I would have expected an error). And if there is no crs information in the file, it seems you always get back EPSG:4326.
While in the other cases there is crs information written ("urn:ogc:def:crs:OGC:1.3:CRS84"), and so then you get CRS84 as the resulting crs.

For passing "urn:ogc:def:crs:OGC:1.3:CRS84", it's the same. The resulting geojson file does not include crs information. So it seems that such urn string is also not recognized as input crs description.

So as an additional issue, it seems that the input to crs_wkt is not validated (or at least, an error is not raised to the user, but the argument is just silently ignored):

In [42]: fds = fiona.open("test.geojson", driver="GeoJSON", mode='w',
    ...:                  schema=schema, wkt_crs="blabla") 
    ...: fds.close()  

also generates a geojson file with no crs information without any error or warning.

@jorisvandenbossche
Copy link
Member

I opened #823 as a specific issue for the invalid crs_wkt being silently ignored.

The strange thing is though, that "urn:ogc:def:crs:OGC:1.3:CRS84" doesn't seem to work for GeoJSON, but when using this as the wkt_crs argument while writing to a Shapefile, it does correctly produce a .prj file with:

GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",SPHEROID["WGS_1984",6378137.0,298.257223563]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]]

@rbuffat
Copy link
Contributor

rbuffat commented Nov 6, 2019

@snowman2 if you look at the geojson file for the first case, you will see that there was no crs information written (so in the case you passed a WKT to crs_wkt, and I suppose this is because it didn't recognize that specific WKT string? Although I would have expected an error). And if there is no crs information in the file, it seems you always get back EPSG:4326.
While in the other cases there is crs information written ("urn:ogc:def:crs:OGC:1.3:CRS84"), and so then you get CRS84 as the resulting crs.

For passing "urn:ogc:def:crs:OGC:1.3:CRS84", it's the same. The resulting geojson file does not include crs information. So it seems that such urn string is also not recognized as input crs description.

According to the GeoJSON standard (RFC7946 (August 2016)) all coordinates are in the urn:ogc:def:crs:OGC::CRS84 coordinate system.

As previous draft versions of the GeoJSON standard supported coordinate systems, it could be that previous versions of the gdal GeoJSON driver supported custom coordinate systems.

@jorisvandenbossche
Copy link
Member

@rbuffat yes, I know that the latest spec only allows CRS84, while the older spec (http://geojson.org/geojson-spec.html) includes a specification for a CRS description.
But, GDAL still defaults to the older spec: https://gdal.org/drivers/vector/geojson.html#rfc-7946-write-support

As an example, with current fiona, I can perfectly write a different CRS (using the variables from above)

In [17]: fds = fiona.open("test.geojson", driver="GeoJSON", mode='w', 
    ...:                  schema=schema, crs="EPSG:31370")  
    ...:    fds.close()   

and then the resulting geojson file includes this:

...
"crs": { "type": "name", "properties": { "name": "urn:ogc:def:crs:EPSG::31370" } },
...

Now, regardless of which is spec is being used:

@jorisvandenbossche
Copy link
Member

Now, regardless of which is spec is being used ... there still is the problem of roundtripping to GeoJSON changing EPSG:4326 into CRS84

Correcting myself here: in fact, when forcing the new spec (passing RFC7946="YES" to fiona.open), this roundtrip is fine. Because with RFC7946, no crs information is written in the geojson file. And then, when reading in (as was also noticed above), you actually get back EPSG:4326.

So it is only when "urn:ogc:def:crs:OGC:1.3:CRS84" is written into the file, you get back a WKT that is not EPSG:4326 (but the proj4 string still says it is!), and you have this roundtrip problem.
But, with the default of GDAL to use the old spec, you always get into that situation if your original data had EPSG:4326 properly set.

@snowman2
Copy link
Contributor Author

snowman2 commented Nov 6, 2019

I think this is an issue with GDAL then. I would think with the new emphasis on axis order, the GeoJSON should differentiate between CRS84 and EPSG:4326 when reading and writing the CRS without the new spec activated.

@jorisvandenbossche
Copy link
Member

I think this is an issue with GDAL then

I am not sure it is up to GDAL to change CRS84 into EPSG:4326. While it is fiona that knows that it is always using x/y (lon/lat) order and thus already ignoring the axis order (using the traditional GIS axis order).

What do you think about Fiona returning EPSG:4326 instead of CRS84 for geojson format?

@snowman2
Copy link
Contributor Author

What do you think about Fiona returning EPSG:4326 instead of CRS84 for geojson format?

Overall, this is my take on it: geopandas/geopandas#1101 (comment)

But, I also think bringing this question to the GDAL devs would be a good idea as well.

I think it would be good for Fiona 2+ to take axis order into consideration. Possibly add the option for users to toggle on/off the axis order stuff?

@snowman2
Copy link
Contributor Author

Opened issue: OSGeo/gdal#2035

@sgillies
Copy link
Member

@snowman2 the GeoJSON spec, RFC 7946, says the CRS is the OGC's CRS84 (which is explicitly long/lat order). FIona shouldn't go against that.

@snowman2
Copy link
Contributor Author

@snowman2 the GeoJSON spec, RFC 7946, says the CRS is the OGC's CRS84 (which is explicitly long/lat order). FIona shouldn't go against that.

I agree 💯

@jorisvandenbossche
Copy link
Member

jorisvandenbossche commented Nov 20, 2019

the GeoJSON spec, RFC 7946, says the CRS is the OGC's CRS84 (which is explicitly long/lat order). FIona shouldn't go against that.

Well, Fiona (or GDAL) does go against it.
When you use RFC7946="YES", you actually get back EPSG:4326 ..
It's only when not using the new spec that you get back CRS84 (well, you get back what is written into the "crs" field of the geojson file, but it is impossible to write EPSG:4326 in that field, as it always get converted to CRS84)

@jorisvandenbossche
Copy link
Member

@snowman2 thanks for opening that issue!

@snowman2
Copy link
Contributor Author

@snowman2 thanks for opening that issue!

Not a problem! Looks like EPSG:4326 is the winner in this scenario and the fix is in GDAL, so no change needed in Fiona.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants