Convert layers to YAML #947

Merged
merged 10 commits into from Sep 16, 2014

Conversation

Projects
None yet
4 participants
@pnorman
Collaborator

pnorman commented Sep 13, 2014

Closes #711

Although requiring a script is not ideal, the JSON MML file couldn't be sanely edited anyways. The advantages are

  • Sane merge conflict resolution
  • Not always a merge conflict for edits to the same SQL
  • Reasonable code reviews of SQL changes. Currently I don't review most SQL changes since they're very hard to get a handle on when escaped into a single line of JSON
  • Easier changing of database settings. With YAML aliases, they only need to be specified once

I think it'd be easiest to merge this then rebase other merges on top of it - the one-time conversion from JSON to YAML is not trivial, as a great deal of work had to be done by hand to clean up the YAML.

pnorman added some commits Sep 9, 2014

Add a script that takes a yaml file and converts it to a JSON file
Requires PyYAML. If this dependency is an issue on a particular
platform, pretty much any language allows you to do this.

A ruby example is

    ruby -ryaml -rjson -e 'puts JSON.pretty_generate(YAML.load(ARGF))' < project.yml > project.json
Convert the MML file from JSON to YAML
This one-time conversion was done with the json2yaml node module,
and then substantially reformatted by hand, using mappings and alias
to avoid repeating projection and extents information for each layer,
and to get consistent ordering of the properties of each layer. The
latter does not matter semantically, but makes editing with a text
editor cleaner.

By using the scripts/yaml2mml.py script, the YAML can be converted
back into JSON, which is still what TileMill uses.

A JSON-aware diff of the original project.mml and the new generated
one shows that the only differences are the new "_parts" property,
used for aliases and ignored by TileMill, and layer extents.

The script can be invoked with

    scripts/yaml2mml.py < project.yaml > project.mml

The project.mml file remains checked in to git, as TileMill requires
it and this allows someone to clone the repo, and immediately have
it work. Merge conflicts in project.mml are trivial to solve, as
the entire file is just regenerated by running the script again.

This does mean that the .MML file should not be edited directly,
but this is fine, as editing the JSON by hand is extremely difficult
anyways! The HOT OSM style has a similar workflow, making use of cartocc.
Use alias to avoid duplicating SQL connection information
By using aliases, we can avoid having to specify the same information
repeatedly, saving lines, avoiding mistakes, and making it easier to
use a differently named database.

The last is evident in that the landcover extents changed slightly
in this, because they were different than all the other layers.
Conver SQL from in-line escaped scalars to block scalars
Block scalars allow you to use newlines, and are far more readable

This commit also catches some extents that were missed previously.
Clean up SQL in the YAML file
A lot of the SQL was badly formatted, with random tabs, caps, and
lines of 5000 characters. This commit cleans up about half of them
but is not complete.
+* Add indentation if necessary for complex function calls, WHERE parenthesis, and CASE statements
+* One space before and after = etc
+* Name SQL subqueries after the layer name (but use underscores)
+

This comment has been minimized.

@matthijsmelissen

matthijsmelissen Sep 13, 2014

Collaborator

What about spaces after commas? We are currently inconsistent in that.

@matthijsmelissen

matthijsmelissen Sep 13, 2014

Collaborator

What about spaces after commas? We are currently inconsistent in that.

This comment has been minimized.

@pnorman

pnorman Sep 14, 2014

Collaborator

I've been making that consistent

@pnorman

pnorman Sep 14, 2014

Collaborator

I've been making that consistent

+* Two indents for columns, to bring them to the same indent level as later clause contents
+* Add indentation after ``SELECT``s until the end of the sub-select.
+* Add indentation for contents of ``FROM``, ``WHERE``, ``ORDER BY`` and other clauses
+* Put content with WHERE, etc if it's short

This comment has been minimized.

@matthijsmelissen

matthijsmelissen Sep 13, 2014

Collaborator

I'm not sure what you mean with this line?

@matthijsmelissen

matthijsmelissen Sep 13, 2014

Collaborator

I'm not sure what you mean with this line?

@matthijsmelissen

This comment has been minimized.

Show comment
Hide comment
@matthijsmelissen

matthijsmelissen Sep 13, 2014

Collaborator

Great work, this will certainly make all our work much easier.

Did you make any changes to the SQL apart from whitespace? Does this need a review?

On coverting the file I get:
Traceback (most recent call last):
File "./scripts/yaml2mml.py", line 2, in
import sys, yaml, json
ImportError: No module named yaml

I think we need to add the package python-yaml to the dependencies.

Collaborator

matthijsmelissen commented Sep 13, 2014

Great work, this will certainly make all our work much easier.

Did you make any changes to the SQL apart from whitespace? Does this need a review?

On coverting the file I get:
Traceback (most recent call last):
File "./scripts/yaml2mml.py", line 2, in
import sys, yaml, json
ImportError: No module named yaml

I think we need to add the package python-yaml to the dependencies.

project.yaml
+ (SELECT
+ way
+ FROM planet_osm_line
+ WHERE man_made='cutline'

This comment has been minimized.

@matthijsmelissen

matthijsmelissen Sep 14, 2014

Collaborator

Here (and at some other places) are spaces missing around the equals sign. Not really important, but I guess we'd better fix it now.

@matthijsmelissen

matthijsmelissen Sep 14, 2014

Collaborator

Here (and at some other places) are spaces missing around the equals sign. Not really important, but I guess we'd better fix it now.

project.yaml
+ (SELECT
+ way
+ FROM planet_osm_polygon
+ WHERE "historic"='castle_walls'

This comment has been minimized.

@matthijsmelissen

matthijsmelissen Sep 14, 2014

Collaborator

As you dropped the quotes in the citywalls layer, you can remove them elsewhere as well.

@matthijsmelissen

matthijsmelissen Sep 14, 2014

Collaborator

As you dropped the quotes in the citywalls layer, you can remove them elsewhere as well.

project.yaml
+ END AS access,
+ construction,
+ CASE WHEN service in ('parking_aisle','drive-through','driveway') THEN 'INT-minor'::text else 'INT-normal'::text end as service,
+ CASE WHEN oneway in ('yes', '-1') and highway in ('motorway', 'motorway_link', 'trunk', 'trunk_link', 'primary', 'primary_link', 'secondary', 'secondary_link', 'tertiary', 'tertiary_link', 'residential', 'unclassified', 'road', 'service', 'pedestrian', 'raceway', 'living_street',' construction') THEN oneway ELSE NULL END as oneway,

This comment has been minimized.

@matthijsmelissen

matthijsmelissen Sep 14, 2014

Collaborator

Typo: space before 'construction'!

@matthijsmelissen

matthijsmelissen Sep 14, 2014

Collaborator

Typo: space before 'construction'!

project.yaml
+ (SELECT way, highway, name
+ FROM planet_osm_line
+ WHERE highway in ('bridleway', 'footway', 'cycleway', 'path', 'track', 'steps')
+ and name is not null

This comment has been minimized.

@pnorman

pnorman Sep 14, 2014

Collaborator

I haven't gotten this far, just some guided search and replaces

@pnorman

pnorman Sep 14, 2014

Collaborator

I haven't gotten this far, just some guided search and replaces

@pnorman

This comment has been minimized.

Show comment
Hide comment
@pnorman

pnorman Sep 14, 2014

Collaborator

I think the fact that we can now add comments and do a proper code review is being well demonstrated with the fact that @math1985 has been able to do one :)

Collaborator

pnorman commented Sep 14, 2014

I think the fact that we can now add comments and do a proper code review is being well demonstrated with the fact that @math1985 has been able to do one :)

@matthijsmelissen

This comment has been minimized.

Show comment
Hide comment
@matthijsmelissen

matthijsmelissen Sep 14, 2014

Collaborator

I checked the code, and I think the yaml code is equivalent with the old mml code - with the exception of one typo (space in front of 'construction') and a code block called '_parts'. What does this do?

Collaborator

matthijsmelissen commented Sep 14, 2014

I checked the code, and I think the yaml code is equivalent with the old mml code - with the exception of one typo (space in front of 'construction') and a code block called '_parts'. What does this do?

@pnorman

This comment has been minimized.

Show comment
Hide comment
@pnorman

pnorman Sep 14, 2014

Collaborator

The parts within it are used for YAML aliasing.
Pulling an example from the file, we get this YAML

# Various parts to be included later on
_parts:
  # Extents are used for tilemill, and don't actually make it to the generated XML
  extents: &extents
    extent: *world
    srs-name: "900913"
    srs: "+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0.0 +k=1.0 +units=m +nadgrids=@null +wktext +no_defs +over"
  osm2pgsql: &osm2pgsql
    type: "postgis"
    dbname: "gis"
    key_field: ""
    geometry_field: "way"
    extent: "-20037508,-20037508,20037508,20037508"
Layer:
  - id: "citywalls"
    name: "citywalls"
    class: ""
    geometry: "linestring"
    <<: *extents
    Datasource: 
      <<: *osm2pgsql
      table: |-
        (SELECT 
            way 
          FROM planet_osm_line 
          WHERE historic = 'citywalls')
        AS citywalls
    advanced: {}

The <<: *extents key merges the keys from the &extents mapping into that location, avoiding having to specify them again, and same for <<: *osm2pgsql.
The idea is taken from the MapProxy documentation.

Fortunately, an understanding of YAML isn't needed to use them when adding a layer - you just copy from existing layers. It's probably possible to use the first occurrence of an osm2pgsql layer as an anchor and copy mappings from that, but this way is more robust and clearer what's being included.

The YAML Reference Card has the YAML syntax (and is itself YAML), not that I see any need to use more YAML features.

The _parts in JSON is ignored by carto and doesn't impact the output XML, and TileMill also appears to ignore it.

Collaborator

pnorman commented Sep 14, 2014

The parts within it are used for YAML aliasing.
Pulling an example from the file, we get this YAML

# Various parts to be included later on
_parts:
  # Extents are used for tilemill, and don't actually make it to the generated XML
  extents: &extents
    extent: *world
    srs-name: "900913"
    srs: "+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0.0 +k=1.0 +units=m +nadgrids=@null +wktext +no_defs +over"
  osm2pgsql: &osm2pgsql
    type: "postgis"
    dbname: "gis"
    key_field: ""
    geometry_field: "way"
    extent: "-20037508,-20037508,20037508,20037508"
Layer:
  - id: "citywalls"
    name: "citywalls"
    class: ""
    geometry: "linestring"
    <<: *extents
    Datasource: 
      <<: *osm2pgsql
      table: |-
        (SELECT 
            way 
          FROM planet_osm_line 
          WHERE historic = 'citywalls')
        AS citywalls
    advanced: {}

The <<: *extents key merges the keys from the &extents mapping into that location, avoiding having to specify them again, and same for <<: *osm2pgsql.
The idea is taken from the MapProxy documentation.

Fortunately, an understanding of YAML isn't needed to use them when adding a layer - you just copy from existing layers. It's probably possible to use the first occurrence of an osm2pgsql layer as an anchor and copy mappings from that, but this way is more robust and clearer what's being included.

The YAML Reference Card has the YAML syntax (and is itself YAML), not that I see any need to use more YAML features.

The _parts in JSON is ignored by carto and doesn't impact the output XML, and TileMill also appears to ignore it.

@matthijsmelissen

This comment has been minimized.

Show comment
Hide comment
@matthijsmelissen

matthijsmelissen Sep 14, 2014

Collaborator

Thanks for the corrections.

We're still not consistent in the use of spaces after commas: personally I would prefer to add spaces after all commas. What do you think?

There are also still a couple of keywords that need capitzalizing, by the way.

Collaborator

matthijsmelissen commented Sep 14, 2014

Thanks for the corrections.

We're still not consistent in the use of spaces after commas: personally I would prefer to add spaces after all commas. What do you think?

There are also still a couple of keywords that need capitzalizing, by the way.

@pnorman

This comment has been minimized.

Show comment
Hide comment
@pnorman

pnorman Sep 14, 2014

Collaborator

We're still not consistent in the use of spaces after commas

Well, that's largely because I wanted to get this PR in quicker than I could clean up all of the SQL.

Collaborator

pnorman commented Sep 14, 2014

We're still not consistent in the use of spaces after commas

Well, that's largely because I wanted to get this PR in quicker than I could clean up all of the SQL.

@gravitystorm gravitystorm merged commit 56cd8e0 into gravitystorm:master Sep 16, 2014

1 check passed

continuous-integration/travis-ci The Travis CI build passed
Details

@pnorman pnorman deleted the pnorman:yaml branch Sep 23, 2014

@pnorman pnorman referenced this pull request in tangrams/tangram Aug 11, 2016

Open

Use 2sp indents in sample YAML files #379

@Ircama

This comment has been minimized.

Show comment
Hide comment
@Ircama

Ircama Nov 6, 2016

Contributor

project.yaml includes this comment:

# Extents are used for tilemill, and don't actually make it to the generated XML

Is the _parts section only needed by Tilemill or also to the OSM rendering through Mapnik?

Contributor

Ircama commented Nov 6, 2016

project.yaml includes this comment:

# Extents are used for tilemill, and don't actually make it to the generated XML

Is the _parts section only needed by Tilemill or also to the OSM rendering through Mapnik?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment