Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: include crs in to_json (#1774) #2151

Merged
merged 17 commits into from
May 1, 2023

Conversation

simberaj
Copy link
Contributor

@simberaj simberaj commented Oct 1, 2021

Add an OGC CRS definition to the GeoJSON dictionary interface in _to_geo(), which propagates to __geo_interface__ and to_dict().
Close #1774.

@simberaj simberaj changed the title ENH: include crs in to_json, __geo_interface__ etc. (#1744) ENH: include crs in to_json, __geo_interface__ etc. (#1774) Oct 1, 2021
@martinfleis
Copy link
Member

Thanks!

I am only worried about one thing. __geo_interface__ spec does not include crs. If we now include it, following the GeoJSON model, I am not sure that it doesn't cause some trouble in other libraries. They may not expect it and hence not know how to handle it.

I would be more comfortable if we included it only in to_json where it is expected and not in __geo_interface__ where it is not.

@simberaj
Copy link
Contributor Author

@martinfleis Updated according to your idea.

@simberaj
Copy link
Contributor Author

simberaj commented Feb 2, 2022

@martinfleis docstring amended, thanks for the suggestion

@martinfleis
Copy link
Member

I just realised that this fails when a CRS cannot be represented by an (authority, code) tuple. Consider this example, using Gall-Peters Orthographic Projection defined via proj string. The CRS is perfectly valid, just pyproj cannot determine authority and code.

import geopandas

world = geopandas.read_file(geopandas.datasets.get_path('nybb'))
world = world.set_geometry(world.centroid)
world = world.set_crs('+proj=cea +lon_0=0 +lat_ts=45 +x_0=0 +y_0=0 +ellps=WGS84 +units=m +no_defs', allow_override=True)
world.to_json()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/var/folders/2f/fhks6w_d0k556plcv3rfmshw0000gn/T/ipykernel_24983/3825038059.py in <module>
----> 1 world.to_json()

~/Git/geopandas/geopandas/geodataframe.py in to_json(self, na, show_bbox, drop_id, **kwargs)
    749 
    750         if self.crs is not None:
--> 751             authority, code = self.crs.to_authority()
    752             ogc_crs = f"urn:ogc:def:crs:{authority}::{code}"
    753             geo["crs"] = {"type": "name", "properties": {"name": ogc_crs}}

TypeError: cannot unpack non-iterable NoneType object

We should catch this, raise a warning that we were unable to create a compliant CRS string and return a JSON without CRS.

@martinfleis martinfleis changed the title ENH: include crs in to_json, __geo_interface__ etc. (#1774) ENH: include crs in to_json (#1774) Feb 21, 2022
@martinfleis martinfleis added this to the 0.11 milestone Feb 21, 2022
@martinfleis
Copy link
Member

Thanks for the change! I have pushed two commits in, as I initially just wanted to lint the code but then realised that we should remove the print of the CRS from the error message as it gets quite verbose, so I went ahead. We don't print the whole thing anywhere and I think we can easily leave it out here.

Copy link
Member

@m-richards m-richards left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @simberaj, I'm pretty happy with this.

I just have one minor question for @martinfleis to weigh in on.

  • Should inclusion of CRS in to_json be opt in? - e.g. as keyword argument include_crs, since it is not part of the current geojson spec? I don't use this method, but if you were say using to_json to write geojson to a file for a system not supporting crs, this would be breaking, and there's no good way to opt out (loads the string, drop crs as a key and dumps again) unless I'm missing something.

geopandas/geodataframe.py Outdated Show resolved Hide resolved
@martinfleis
Copy link
Member

martinfleis commented Feb 23, 2022

Should inclusion of CRS in to_json be opt in?

That a good point I didn't think about. I don't think it is a big issue though. If you use to_file, you always have crs included with no way of turning it off. But it surely wouldn't hurt to have it optional in case someone needs it, defaulting to True.

@simberaj Could you add the keyword?

@jorisvandenbossche
Copy link
Member

I don't know how important this is, but one potential additional concern: according to https://www.ogc.org/ogcUrnPolicy, there is a specific set of authorities that are currently recognized by the OGC spec (EDCS, EPSG, OGC, SI, UCUM, so eg not IGNF or ESRI).
(might be worth checking what GDAL does in those cases)

@jorisvandenbossche
Copy link
Member

Quickly testing:

df = geopandas.read_file(geopandas.datasets.get_path("naturalearth_lowres"))
df2 = df.to_crs("IGNF:WGS84")
df2.to_file("test_ignf.geojson")

this then doesn't seem to include the CRS information in the resulting file (while it does for df.to_file("test.geojson")).

@martinfleis martinfleis requested a review from ljwolf June 20, 2022 12:39
@martinfleis martinfleis modified the milestones: 0.11, 0.12 Jun 21, 2022
@martinfleis martinfleis modified the milestones: 0.12, 0.13 Oct 24, 2022
martinfleis and others added 2 commits April 6, 2023 13:15
Co-authored-by: Matt Richards <45483497+m-richards@users.noreply.github.com>
@martinfleis martinfleis modified the milestones: 0.14, 0.13 Apr 6, 2023
@martinfleis
Copy link
Member

I have updated this PR to closely match the 2016 GeoJSON spec. That means that when we save a gdf with a CRS equal to WGS84 or use to_wgs84=True, no CRS is saved and the JSON matches the spec.

When we use different CRS, a crs field is included matching the older spec and GDAL behaviour.

We also check for the set of allowed authorities and include CRS only if the authority is listed in the OGC URN Policy. This deviates from GDAL which saves CRS using the authority::code even for authorities like IGNF or ESRI. But it follows OGC. The question here is whether we want to do this - the resulting GeoJSON is not valid per the specification anyway (different CRS used and no CRS stored), so we may want to follow GDAL in saving the CRS anytime we are able to obtain authority:code tuple from pyproj.

Removing the check for allowed_authorities, the behaviour should match the one of GDAL, based on my testing.

@martinfleis martinfleis linked an issue Apr 6, 2023 that may be closed by this pull request
@jorisvandenbossche jorisvandenbossche merged commit 35f7004 into geopandas:main May 1, 2023
15 of 17 checks passed
@jorisvandenbossche
Copy link
Member

Thanks @simberaj for the original PR, and everyone for getting this over the finish line.

@jparta
Copy link

jparta commented Aug 28, 2023

Why does this change only apply to GeoDataFrames and not GeoSeries?

@martinfleis
Copy link
Member

@jparta because no one reailsed that we need to port that over to GeoSeries.to_json as well. Would you be willing to open a PR implementing the same on the GeoSeries?

JohnMoutafis pushed a commit to JohnMoutafis/geopandas that referenced this pull request Nov 16, 2023
Co-authored-by: Jan Šimbera <jan.simbera@nanoenergies.cz>
Co-authored-by: Martin Fleischmann <martin@martinfleischmann.net>
Co-authored-by: Matt Richards <45483497+m-richards@users.noreply.github.com>
Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ENH: include CRS in "to_json" to_json should either include CRS to convert to wgs84
5 participants