Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CRS JSON export (refs #1545) #1547

Merged
merged 16 commits into from Aug 20, 2019

Conversation

@rouault
Copy link
Member

rouault commented Jul 6, 2019

(output updated with latest developments)
$ src/projinfo EPSG:32631 -o JSON -q

{
  "type": "ProjectedCRS",
  "name": "WGS 84 / UTM zone 31N",
  "base_crs": {
    "name": "WGS 84",
    "datum": {
      "type": "GeodeticReferenceFrame",
      "name": "World Geodetic System 1984",
      "ellipsoid": {
        "name": "WGS 84",
        "semi_major_axis": 6378137,
        "inverse_flattening": 298.257223563
      }
    },
    "coordinate_system": {
      "subtype": "ellipsoidal",
      "axis": [
        {
          "name": "Geodetic latitude",
          "abbreviation": "Lat",
          "direction": "north",
          "unit": "degree"
        },
        {
          "name": "Geodetic longitude",
          "abbreviation": "Lon",
          "direction": "east",
          "unit": "degree"
        }
      ]
    },
    "id": {
      "authority": "EPSG",
      "code": 4326
    }
  },
  "conversion": {
    "name": "UTM zone 31N",
    "method": {
      "name": "Transverse Mercator",
      "id": {
        "authority": "EPSG",
        "code": 9807
      }
    },
    "parameters": [
      {
        "name": "Latitude of natural origin",
        "value": 0,
        "unit": "degree",
        "id": {
          "authority": "EPSG",
          "code": 8801
        }
      },
      {
        "name": "Longitude of natural origin",
        "value": 3,
        "unit": "degree",
        "id": {
          "authority": "EPSG",
          "code": 8802
        }
      },
      {
        "name": "Scale factor at natural origin",
        "value": 0.9996,
        "unit": "unity",
        "id": {
          "authority": "EPSG",
          "code": 8805
        }
      },
      {
        "name": "False easting",
        "value": 500000,
        "unit": "metre",
        "id": {
          "authority": "EPSG",
          "code": 8806
        }
      },
      {
        "name": "False northing",
        "value": 0,
        "unit": "metre",
        "id": {
          "authority": "EPSG",
          "code": 8807
        }
      }
    ]
  },
  "coordinate_system": {
    "subtype": "Cartesian",
    "axis": [
      {
        "name": "Easting",
        "abbreviation": "E",
        "direction": "east",
        "unit": "metre"
      },
      {
        "name": "Northing",
        "abbreviation": "N",
        "direction": "north",
        "unit": "metre"
      }
    ]
  },
  "area": "World - N hemisphere - 0°E to 6°E - by country",
  "bbox": {
    "south_latitude": 0,
    "west_longitude": 0,
    "north_latitude": 84,
    "east_longitude": 6
  },
  "id": {
    "authority": "EPSG",
    "code": 32631
  }
}
@rouault rouault mentioned this pull request Jul 6, 2019
@rouault rouault force-pushed the rouault:json_export branch 6 times, most recently from 7123662 to d824832 Jul 6, 2019
@rouault

This comment has been minimized.

Copy link
Member Author

rouault commented Jul 6, 2019

I've more or less implemented the same ellision rules as WKT2, that is if an object has an ID, then omit the IDs of its children, except for the base CRS of the projected CRS, the method and parameter name IDs. Similarly the "usage" (area & bbox) are only reported on the top object.

A few potential ideas to make it more compact:

  • for interior objects, most of the time the "type" attribute is useless and directly inferred by the parent context. For example "ellipsoid": { "type": "Ellipsoid", ... }. The cases where I'd keep the type would be for units (to know if it is linear, angular, etc..).
  • For { "type": "GeographicCRS", ... } as the value of "base_crs", perhaps keep it. Or replace the generic "base_crs" by "base_geographic_crs" ? Similarly for "datum": { "type": "GeodeticReferenceFrame" } --> "geodetic_reference_frame": { ... } ?
  • for units, we could potentially recognize a few common ones under a short form: "unit": "metre", "unit": "degree"
  • for Ellipsoid, we could potentially decide to conventionnaly omit unit completely, if the semi major axis is metre. Similarly for the longitude of PrimeMeridian if it is degree ? (WKT2 didn't go that far though)
  • omit PrimeMeridian if it is Greenwich ?
@rouault rouault force-pushed the rouault:json_export branch 5 times, most recently from 0e30264 to 0fecb4a Jul 6, 2019
rouault added 3 commits Jul 6, 2019
…ing to Python3
@rouault rouault force-pushed the rouault:json_export branch from 0fecb4a to d44e1e2 Jul 7, 2019
@kbevers

This comment has been minimized.

Copy link
Member

kbevers commented Jul 8, 2019

I think this looks quite good already. Impressive work, Even!

for interior objects, most of the time the "type" attribute is useless and directly inferred by the parent context. For example "ellipsoid": { "type": "Ellipsoid", ... }. The cases where I'd keep the type would be for units (to know if it is linear, angular, etc..).

Sounds good to me. Especially if combined with the short form unit descriptor described below.

For { "type": "GeographicCRS", ... } as the value of "base_crs", perhaps keep it. Or replace the generic "base_crs" by "base_geographic_crs" ? Similarly for "datum": { "type": "GeodeticReferenceFrame" } --> "geodetic_reference_frame": { ... } ?

I am indifferent to this one. Do you want to to this because snake_case is more "JSONic" than CamelCase?

for units, we could potentially recognize a few common ones under a short form: "unit": "metre", "unit": "degree"

Sounds good.

for Ellipsoid, we could potentially decide to conventionnaly omit unit completely, if the semi major axis is metre. Similarly for the longitude of PrimeMeridian if it is degree ? (WKT2 didn't go that far though)

Sounds very reasonable to me.

omit PrimeMeridian if it is Greenwich ?

Yes.

In general, I think that a short form approach to the JSON output is preferred when properties of CRS or transformation is "standard". That is, SI units, Greenwich prime meridian and what else is intuitively understood by people familiar with the topic.

Is the JSON output from projinfo enabled by default or do you have to toggle it with -o JSON? I prefer the latter.

Before this is merged I would like to have the docs for projinfo updated and preferably also a description or specification of the JSON format (especially if the ideas above is implemented).

@rouault

This comment has been minimized.

Copy link
Member Author

rouault commented Jul 8, 2019

I am indifferent to this one. Do you want to to this because snake_case is more "JSONic" than CamelCase?

yes, for key values, I use snake_case . For "type" values, I've imitated GeoJSON where you have the CamelCase convention ( "type", "Feature", "type": "GeometryCollection", etc). But my comment here wasn't about the naming convention, but more about how to specify the type of an objet. Currently with "datum": { "type": "GeodeticRefererenceFrame" }, this is generic, and if we implement "type": "DynamicGeodeticReferenceFrame" this is easily possible. I we omit the "type", we need to move its indication one level up in some way.

Is the JSON output from projinfo enabled by default or do you have to toggle it with -o JSON

Not enabled by default. Have to toggle it with -o JSON

I would like to have the docs for projinfo updated

Sure, but we aren't there yet. And I guess the reading side should be there too. Which will bring the issue of having a JSON parser. I've in GDAL a hand-made streaming parser (made to be able to parse gigabytes lengthy GeoJSON files), mixed with libjson-c, but a pure steaming parser is not convenient to use here, and a standard "DOM" oriented if I might say parser would be better. I guess I could possibly have this DOM layer on top of the streaming parser, or use an external library.

preferably also a description or specification of the JSON format

Like https://github.com/rouault/proj.4/blob/json_export/data/crsjson.schema.json possibly augmented with a few comments ?

@kbevers

This comment has been minimized.

Copy link
Member

kbevers commented Jul 8, 2019

my comment here wasn't about the naming convention, but more about how to specify the type of an objet. Currently with "datum": { "type": "GeodeticRefererenceFrame" }, this is generic, and if we implement "type": "DynamicGeodeticReferenceFrame" this is easily possible. I we omit the "type", we need to move its indication one level up in some way.

Sorry, I misinterpreted that. I understand now. I think the current approach is fine - it is in line with the WKT2 standard and is more flexible in case new types are introduced.

Like rouault/proj.4:data/crsjson.schema.json@json_export possibly augmented with a few comments ?

Yes, I think that will be enough. Something stating that this is based on WKT2:2018 and how it deviates from the standard. I have a feeling that once we provide this type of output (and eventually input) people is going to rely on it quickly so some form of governance of the format is required. Good documentation helps with that.

@hobu

This comment has been minimized.

Copy link
Contributor

hobu commented Jul 9, 2019

I think the canonical name for this should be projson or something more descriptive than simply json.

@snowman2

This comment has been minimized.

Copy link
Contributor

snowman2 commented Jul 9, 2019

Or confuse people and do WKTJSON_2019 😛

@mwtoews

This comment has been minimized.

Copy link
Member

mwtoews commented Jul 9, 2019

I'd prefer keeping the output format as JSON, since this is a valid format for the output, and not some special format that is inspired by JSON.

Copy link
Member

mwtoews left a comment

Amend docs/source/apps/projinfo.rst lines 57, 72, 93 and 226. (Also, I don't mind doing this, if you prefer)

@m-mohr m-mohr mentioned this pull request Jul 23, 2019
3 of 3 tasks complete
rouault added 2 commits Aug 8, 2019
@rouault rouault force-pushed the rouault:json_export branch 2 times, most recently from e11ca45 to c5dd688 Aug 9, 2019
@rouault rouault force-pushed the rouault:json_export branch from c5cddfe to 124eb22 Aug 9, 2019
rouault added 2 commits Aug 9, 2019
@rouault rouault force-pushed the rouault:json_export branch from ec8fe5f to 5e8a44e Aug 9, 2019
@rouault

This comment has been minimized.

Copy link
Member Author

rouault commented Aug 9, 2019

Folks, I believe this work is feature-complete now and ready for review. I've opted for PROJJSON as suggested by @hobu . I've done changes on export so that it is a bit less verbose . All CRS, Datum and CoordinateOperation objects can be exported and imported. C function proj_as_projjson() for export added (import is done through existing proj_create())

For parsing JSON, I use the single file header https://github.com/nlohmann/json/blob/develop/single_include/nlohmann/json.hpp . I've integrated it in the source tree. I can imagine that some distributions might not like that (it is sometimes available as a package). It should be easy to patch PROJ source to include the system version if present (actually it should probably be just a matter of removing the include/proj/internal/nlohmann directory). I didn't go as far as tweaking the build systemS to make that a configurable option.

@hobu

This comment has been minimized.

Copy link
Contributor

hobu commented Aug 9, 2019

We are using nlohmann in PDAL as well to replace jsoncpp which has been terrible with interface changes and deployment issues. It has worked well for us, but our upcoming 2.0 release is the first public release with it.

@hobu

This comment has been minimized.

Copy link
Contributor

hobu commented Aug 9, 2019

We are using nlohmann in PDAL

I forgot to add we buried it in a namespace so it wouldn't clash, and we are careful about not exposing it in public APIs.

@rouault

This comment has been minimized.

Copy link
Member Author

rouault commented Aug 9, 2019

I forgot to add we buried it in a namespace so it wouldn't clash, and we are careful about not exposing it in public APIs.

The same here with this little #define trick : https://github.com/rouault/proj.4/blob/json_export/include/proj/internal/include_nlohmann_json.hpp#L5

rouault added 2 commits Aug 9, 2019
…]ReferenceFrame
@rouault rouault force-pushed the rouault:json_export branch from 00ca55b to 63981af Aug 10, 2019
…ParametricCRS and TemporalCRS
@snowman2

This comment has been minimized.

Copy link
Contributor

snowman2 commented Aug 12, 2019

First of all, this is fantastic!

One thing I randomly bumped into is the difference in the area of use in the datum representations of WKT versus PROJ JSON. Also, noticed this difference between the direct Datum PROJ JSON versus the datum information in the CRS PROJ JSON.

>>> from pyproj.crs import CRS, Datum
>>> crs_utm = CRS.from_epsg(26915)
>>> crs_utm.to_json_dict()["base_crs"]["datum"]
{'type': 'GeodeticReferenceFrame', 'name': 'North American Datum 1983', 'ellipsoid': {'name': 'GRS 1980', 'semi_major_axis': 6378137, 'inverse_flattening': 298.257222101}}
>>> dd = Datum.from_json_dict(crs_utm.to_json_dict()["base_crs"]["datum"])
>>> dd
DATUM["North American Datum 1983",
    ELLIPSOID["GRS 1980",6378137,298.257222101,
        LENGTHUNIT["metre",1,
            ID["EPSG",9001]]]]
>>> dd.to_json_dict()
{'type': 'GeodeticReferenceFrame', 'name': 'North American Datum 1983', 'ellipsoid': {'name': 'GRS 1980', 'semi_major_axis': 6378137, 'inverse_flattening': 298.257222101}}
>>> crs_utm.datum.to_json_dict()
{'type': 'GeodeticReferenceFrame', 'name': 'North American Datum 1983', 'ellipsoid': {'name': 'GRS 1980', 'semi_major_axis': 6378137, 'inverse_flattening': 298.257222101}, 'area': 'North America - NAD83', 'bbox': {'south_latitude': 14.92, 'west_longitude': 167.65, 'north_latitude': 86.46, 'east_longitude': -47.74}, 'id': {'authority': 'EPSG', 'code': 6269}}
>>> dd2 = Datum.from_json_dict(crs_utm.datum.to_json_dict())
>>> dd2
DATUM["North American Datum 1983",
    ELLIPSOID["GRS 1980",6378137,298.257222101,
        LENGTHUNIT["metre",1]],
    ID["EPSG",6269]]
>>> dd2.to_json_dict()
{'type': 'GeodeticReferenceFrame', 'name': 'North American Datum 1983', 'ellipsoid': {'name': 'GRS 1980', 'semi_major_axis': 6378137, 'inverse_flattening': 298.257222101}, 'area': 'North America - NAD83', 'bbox': {'south_latitude': 14.92, 'west_longitude': 167.65, 'north_latitude': 86.46, 'east_longitude': -47.74}, 'id': {'authority': 'EPSG', 'code': 6269}}

Not sure if this matters or not, but figured it was noteworthy.

@rouault

This comment has been minimized.

Copy link
Member Author

rouault commented Aug 12, 2019

Not sure if this matters or not, but figured it was noteworthy.

This is expected. The export follows the same rules as WKT (normally you should get similar behaviour, but I didn't try to check to that level of detail...), which includes omitting "details" of embedded objects when you export a top-level object. When you export a CRS, its datum id and datum area of uses are omitted for compactness. So when you instanciate a CRS back from that, those information are no longer there (they should be queried from the database if needed). Whereas when you instanciate a CRS from its code and export its datum right away, the datum object of this CRS has id, area of use etc, which can be exported, if it is exported as a standalone object.

Copy link
Member

kbevers left a comment

Sorry for the late review. All in all I think this is a great piece of work. Good job, Even.

I have a few remarks regarding the JSON schema. Or rather the filename and URL as you can see in the inlined comments. In addition to those I am wondering how updates to the schema is dealt with? This version of the schema is obviously tied to PROJ 6.2.0, but what happens in the case we need to fix a bug in the schema for version 6.3.0? As far as I can see no versioning info is present in the schema.

@@ -0,0 +1,935 @@
{
"$id": "https://proj.org/crsjson.schema.json",

This comment has been minimized.

Copy link
@kbevers

kbevers Aug 15, 2019

Member

Now that you have decided to go with the name PROJJSON, shouldn't it then be projjson.schema.json?

@@ -44,6 +44,7 @@ Synopsis
(*added in 6.2*)
- a OGC URN combining references for concatenated operations
(e.g. "urn:ogc:def:coordinateOperation,coordinateOperation:EPSG::3895,coordinateOperation:EPSG::1618")
- a PROJJSON string. The jsonschema is at https://github.com/OSGeo/proj/blob/master/data/crsjson.schema.json (*added in 6.2*)

This comment has been minimized.

Copy link
@kbevers

kbevers Aug 15, 2019

Member

In the schema file this is said to be located at https://proj.org/crsjson.schema.json. I would prefer to use the same location here too. Having a direct link to master on github is a potential risk if the schema is updated before a new release is published.

{
"$id": "https://proj.org/crsjson.schema.json",
"$schema": "http://json-schema.org/draft-07/schema#",
"description": "Schema for CRS JSON",

This comment has been minimized.

Copy link
@kbevers

kbevers Aug 15, 2019

Member

Should also be PROJJSON instead of CRS JSON.

@rouault

This comment has been minimized.

Copy link
Member Author

rouault commented Aug 15, 2019

In addition to those I am wondering how updates to the schema is dealt with? This version of the schema is obviously tied to PROJ 6.2.0, but what happens in the case we need to fix a bug in the schema for version 6.3.0? As far as I can see no versioning info is present in the schema.

Hum I thought this was perfect work, peer reviewed by hundreds of people, so there is no bug, right ;-) ? More seriously, AFAICS there is no dedicated field in https://json-schema.org/specification.html to put a version number for a schema, so I think this should be captured in the $id property if we wanted to, like https://proj.org/schemas/v0.1/projjson.schema.json ? (not sure we want to advertize a 1.0 version number at this point)
Versioning the schema would be one thing, but I'm not sure how useful this would be given that PROJJSON content itself doesn't contain a link to it. Should we include a "$schema": "https://proj.org/schemas/v0.1/projjson.schema.json" link in the top element of a PROJJSON export ?
Note: GeoJSON doesn't point to a schema (but it had no json schema developed officially for it, so that's propably not a good reference).

@kbevers

This comment has been minimized.

Copy link
Member

kbevers commented Aug 16, 2019

so I think this should be captured in the $id property if we wanted to, like proj.org/schemas/v0.1/projjson.schema.json ?

Yes, it was something like that I was thinking about.

Should we include a "$schema": "proj.org/schemas/v0.1/projjson.schema.json" link in the top element of a PROJJSON export ?
Note: GeoJSON doesn't point to a schema (but it had no json schema developed officially for it, so that's propably not a good reference).

I think that would be a smart thing to do. As soon as this is released it is going to be (ab)used by lots of people and changes in the schema will affect them in some way or other. If there's no standardised way of pointing towards the schema in use (similar to XML) this is as good as it gets, I guess.

Keeping the schema as version 0.1 is fine with me. We can use the 6.2.x releases up until 7.0 as a preview phase and change it to 1.0 with PROJ 7. Hopefully any potential kinks will be ironed out by then.

… it and to exported PROJJSON strings
@kbevers

This comment has been minimized.

Copy link
Member

kbevers commented Aug 20, 2019

@rouault As far as I can tell you have addressed my comments so if you are happy with the PR as it is now I think it is time to merge the code.

@rouault rouault merged commit 2c9c015 into OSGeo:master Aug 20, 2019
4 checks passed
4 checks passed
FreeBSD Task Summary
Details
Travis CI - Pull Request Build Passed
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
coverage/coveralls Coverage decreased (-0.03%) to 85.893%
Details
@rouault

This comment has been minimized.

Copy link
Member Author

rouault commented Aug 20, 2019

Merged!

@kbevers

This comment has been minimized.

Copy link
Member

kbevers commented Aug 20, 2019

Awesome. Impressive work on this!

@snowman2

This comment has been minimized.

Copy link
Contributor

snowman2 commented Aug 21, 2019

Thanks for adding this. Is this planned to be added to the 6.2.0 release?

@snowman2 snowman2 mentioned this pull request Aug 21, 2019
@rouault rouault added this to the 6.2.0 milestone Aug 21, 2019
@rouault

This comment has been minimized.

Copy link
Member Author

rouault commented Aug 21, 2019

Is this planned to be added to the 6.2.0 release?

Yes, this is in master, and master will be release as 6.2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.