
Import implementation from OSMesa project #60

Merged
jpolchlo merged 41 commits into geotrellis:master from jpolchlo:refactor/process-osm
Mar 22, 2019

@jpolchlo
Contributor

@jpolchlo jpolchlo commented Mar 1, 2019

The existing vectorpipe implementation is a bit of a Frankenstein creation of two projects (OSMesa and the original, RDD-based VP implementation). The last importation of OSMesa code was somewhat shoehorned into the structure of the old VP, and it doesn't necessarily suit the philosophy of OSMesa. We'd like to more forcefully adopt the DataFrame-centric approach of OSMesa, and bring in the extra facilities provided by the current OSMesa implementation (streaming data sources and the like).

This PR essentially wipes out VP and imports code from osmesa.common into the vectorpipe package.

This code is in a usable state now, but the intention is also to improve the documentation of this package as much as is practicable to make this codebase more accessible than it has been.

  • Document core usage patterns (ProcessOSM)
  • Document DataSources

At a later date, we will improve the vectortile export functionality of this library.

Comment thread src/main/scala/ProcessOSM.scala Outdated
Comment thread src/main/scala/raster/SparseIntTile.scala Outdated
Comment thread README.md
which will produce a frame consisting of "top-level" entities, which is to say
nodes that don't participate in a way, ways that don't participate in
relations, and a subset of the relations from the OSM data. The resulting
Contributor Author

It would be great to get an explicit list of the entities that will be produced by constructGeometries. @mojodna ?

Collaborator

  • points (from tagged nodes, including tags that really ought to be dropped, e.g. source=*)
  • polygons (from ways with tags that cause them to be considered areas)
  • lines (from ways without area tags)
  • multipolygons (from multipolygon or boundary relations)
  • multilinestrings (from route relations)

This currently does include ways that participate in relations (as long as they have tags? I would need to check that; a way whose tagging is distinct from the relation's suggests that it fulfills a separate, distinct role) and tagged nodes that participate in ways and relations (same deal as above; a bollard may contribute a vertex to a path).
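
As a rough illustration of the list above, the mapping from element type and tags to output geometry kind could be sketched as follows. This is not the actual `constructGeometries` logic: the `Element` types, `areaKeys`, `isArea`, and `geometryKind` are all hypothetical names, the set of area-implying keys is a deliberately tiny stand-in, and untagged ways are treated as lines even though that point is still unconfirmed above.

```scala
object GeometryKindSketch {
  // Hypothetical, simplified OSM element model: only tags matter here.
  sealed trait Element { def tags: Map[String, String] }
  final case class Node(tags: Map[String, String]) extends Element
  final case class Way(tags: Map[String, String]) extends Element
  final case class Relation(tags: Map[String, String]) extends Element

  // Assumption: an illustrative subset of keys that imply an area.
  val areaKeys: Set[String] = Set("building", "landuse")

  // A way is treated as an area if area=yes, or if it carries an
  // area-implying key and is not explicitly marked area=no.
  def isArea(tags: Map[String, String]): Boolean =
    tags.get("area").contains("yes") ||
      (!tags.get("area").contains("no") && tags.keySet.exists(areaKeys))

  // Returns the geometry kind an element would contribute, if any.
  def geometryKind(e: Element): Option[String] = e match {
    case Node(tags) if tags.nonEmpty => Some("point")   // tagged nodes only
    case Node(_)                     => None            // untagged nodes are just vertices
    case Way(tags) if isArea(tags)   => Some("polygon")
    case Way(_)                      => Some("line")
    case Relation(tags) =>
      tags.get("type") match {
        case Some("multipolygon") | Some("boundary") => Some("multipolygon")
        case Some("route")                           => Some("multilinestring")
        case _                                       => None // other relation types are skipped
      }
  }
}
```

For example, `GeometryKindSketch.geometryKind(GeometryKindSketch.Way(Map("building" -> "yes")))` yields a polygon, while a way tagged only `highway=residential` yields a line.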

Collaborator

(I want to revisit the relation assembly process (for non-multipolygons) so that it no longer creates new geometries and instead just propagates relation tags onto the constituent lines / polygons, letting them fulfill the same role without duplication. It does complicate things a bit, resulting in more minor versions of ways, as changes to relations that they're part of will change their metadata (similar to how changes to nodes change their geometry).)
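
The tag-propagation idea described above might look roughly like the sketch below. It is only an illustration under stated assumptions, not anything implemented in this PR: `Way`, `Relation`, and `propagate` are hypothetical names, the relation's `type` tag is dropped before copying, and a way's own tags are assumed to win on conflict.

```scala
object RelationTagSketch {
  // Hypothetical, simplified models; real OSM elements carry much more metadata.
  final case class Way(id: Long, tags: Map[String, String])
  final case class Relation(id: Long, tags: Map[String, String], memberWayIds: Seq[Long])

  // Copy each relation's tags (minus "type") onto its member ways instead of
  // building a new geometry for the relation.
  // Assumption: on a key conflict, the way's own tag takes precedence.
  def propagate(ways: Seq[Way], relations: Seq[Relation]): Seq[Way] = {
    val inherited: Map[Long, Map[String, String]] =
      relations
        .flatMap(r => r.memberWayIds.map(id => id -> (r.tags - "type")))
        .groupBy(_._1)
        .map { case (id, pairs) =>
          id -> pairs.map(_._2).foldLeft(Map.empty[String, String])(_ ++ _)
        }

    ways.map(w => w.copy(tags = inherited.getOrElse(w.id, Map.empty[String, String]) ++ w.tags))
  }
}
```

Under this scheme any edit to a relation's tags changes the tags of every member way, which is where the extra minor versions of ways would come from.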

Comment thread src/main/scala/vectorpipe/ProcessOSM.scala Outdated
Comment thread src/main/scala/vectorpipe/ProcessOSM.scala Outdated
Comment thread src/main/scala/vectorpipe/ProcessOSM.scala Outdated
Comment thread src/main/scala/vectorpipe/ProcessOSM.scala
Comment thread src/main/scala/vectorpipe/ProcessOSM.scala Outdated
Comment thread src/main/scala/vectorpipe/ProcessOSM.scala Outdated
Comment thread src/main/scala/vectorpipe/ProcessOSM.scala Outdated
Comment thread src/main/scala/vectorpipe/functions/osm/package.scala Outdated
Comment thread src/main/scala/vectorpipe/util/Caching.scala Outdated
@sethfitz
Collaborator

Tutorial bits (in src/main/tut/) should be updated or removed.

@sethfitz
Collaborator

data/ should also be cleaned up. (Though maybe these are for subsequent PRs.)

@jpolchlo
Contributor Author

jpolchlo commented Mar 14, 2019

I wanted to leave the tut armature in place so that I can go and fill it in a bit later. Thanks for the reminder that it's there. Going to create an issue on it. (See #61)

Comment thread src/main/scala/vectorpipe/ProcessOSM.scala Outdated
jpolchlo and others added 26 commits March 22, 2019 10:40
Provides a path to developing within IntelliJ while specifying spark-sql
as a provided dependency (to avoid including Spark in assemblies where
they will conflict with the Spark runtime).

Also switches to the native ORC data source to facilitate dropping
spark-hive as a dependency.

The main change here is the removal of a pathological piece of geometry
from the tests.  Previous iterations of this test may have "gotten
lucky" in passing at any point in the past; the logic in the relation
reconstruction code was not up to the challenge.
@jpolchlo jpolchlo force-pushed the refactor/process-osm branch from 8115f08 to 12e2da2 Compare March 22, 2019 14:41
@jpolchlo jpolchlo merged commit cd4b6e7 into geotrellis:master Mar 22, 2019