Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cost rasters and LTS from shapefiles #782

Merged
merged 86 commits into from
Apr 1, 2022
Merged

Cost rasters and LTS from shapefiles #782

merged 86 commits into from
Apr 1, 2022

Conversation

abyrd
Copy link
Member

@abyrd abyrd commented Jan 24, 2022

This combines preliminary work from Q3 2021 and rebases it onto the current development branch. This branch will add the following features:

  • Map matching a shapefile to OSM, using specified attribute of the shapefile as biking LTS
  • Draping the road network over a raster and assigning costs from the resulting profiles along edges

If we store these to files, then this will require a change to the Network Version of the serialized TransportNetworks.

To facilitate testing and experimentation, these new types of data may be added with the following custom modification types. The source files may be uploaded via the Data Sources tab, and their IDs (copied from the URL) substituted in to the dataSourceId field.

{
  "name": "LTS from Shapefile",
  "r5type": "shapefile-lts",
  "dataSourceId": "61f277140e46af79fd6d5367",
  "ltsAttribute": "lts_ov"
}
{
  "name": "Elevation from GeoTIFF Raster",
  "r5type": "raster-cost",
  "costFunction": "TOBLER",
  "dataSourceId": "61f277140e46af79fd6d5367"
}
{
  "name": "Sun cost from GeoTIFF Raster",
  "r5type": "raster-cost",
  "costFunction": "SUN",
  "dataSourceId": "61f277140e46af79fd6d5367"
}

ansoncfit and others added 30 commits September 8, 2021 19:26
showing bike LTS, filtering to edges that allow cars
list all attributes assignable to a certain class;
find attribute by name, validating type
shapefile will be sought in resources file storage
This was used for point-to-point routing.
All network configuration will now be merged into the Bundle Manifest.
The files were called [networkId].json with no reference to "manifest",
so this is just a Java class rename.
Without this step, worker will fetch only .shp and will not have the
.dbf or .prj, so will report unknown projection and have no database.
this will be generalized for elevation, tree shade etc.
Move elevation loading code into new rastercost package.
Add RasterCost custom modification type to load and apply rasters.
Create CostField abstraction encompassing both elevation and tree shade.
Wire CostField into StreetRouter to actually affect costs.
Move elevation data into CostField from EdgeStore.
Enable inversion of elevation changes on reverse edges.
also added some comments and stub for sun cost
also changed default ltsAttribute to lts
small changes preparing to load sun data
@abyrd
Copy link
Member Author

abyrd commented Mar 25, 2022

A few observations on tile generation: when you zoom in, you can readily see the smoothness of the shapes improve as it fetches more detailed map tiles. This is intended behavior - lower zoom tiles are more simplified since they contain more features. The backend memory use increases while it's building tiles, but this seems ephemeral. After triggering a GC, I see very little ongoing memory impact even with spatial indexes built for 4 feeds. We'll want to check this out with some really big feeds like NL or CH.

Also handles lines that pass through a tile with no points in the tile.
@abyrd
Copy link
Member Author

abyrd commented Mar 25, 2022

Now using JTS intersection operation to clip, and handling cases that produce MultiLineStrings. Cases like the DC blue line exiting and re-entering a tile, and the long straight segments of the LA metrolink lines converging on Union Station now display properly. This should be ready for final review and merge.

final String workerVersion = request.params("workerVersion");

WorkerCategory workerCategory = new WorkerCategory(bundleId, workerVersion);
private String getOrStartWorker (String bundleId, String workerVersion, UserPermissions userPermissions) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is mostly a duplicate of the code in the BrokerController (in fact it was originally copied from there). Should we abstract this out as a helper method in the broker itself?

   /**
     * Get or start worker.
     */
    public synchronized String getOrStartWorker(WorkerCategory workerCategory, WorkerTags workerTags) {
        String address = getWorkerAddress(workerCategory);
        if (address == null) {
            // There are no workers that can handle this request. Request some.
            createOnDemandWorkerInCategory(workerCategory, workerTags);
        } else {
            // Workers exist in this category, clear out any record that we're waiting for one to start up.
            // FIXME the tracking of which workers are starting up should really be encapsulated using a "start up if needed" method.
            recentlyRequestedWorkers.remove(workerCategory);
        }
        return address;
    }

At the very least, we shouldn't leave the replicated FIXME in two different places.

Copy link
Member Author

@abyrd abyrd Mar 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I definitely agree, this should be factored out into a shared method. In fact, as a best practice we should avoid duplicating this much code even when making a prototype because it then requires us to spot and re-merge the duplicates months down the line - we should factor out early.

@@ -0,0 +1,293 @@
package com.conveyal.analysis.controllers;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the GtfsVectorTileMaker be moved into the com.conveyal.gtfs package since it's not a controller?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summarizing discussion on the call today, we will be splitting all functionality of GtfsVectorTileMaker into GtfsCache and MapTile (renamed to VectorTile until we actually merge it with any similar raster tile classes). MapTile is not a utility class strictly speaking, and its use is currently limited to vector tile generation. We tentatively planned to create a com.conveyal.analysis.vectortile package to situate related code under the application layer (analysis here refers to the application built on top of r5 and gtfs-lib, not just the action of analyzing things).

.build(this::buildShapesIndex);

/** A cache of spatial indexes of the transit stops, keyed on the BundleScopedFeedId. */
private final LoadingCache<String, STRtree> stopIndexCache = Caffeine.newBuilder()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the shape and stop indexes be moved into the GTFSCache? If they shouldn't, should they at least be extracted into their own file in case they ever want to be used outside of the GtfsVectorTileMaker? Additionally, is there a better way to distinguish between the type of cache data they contain vs what is currently in the GTFSCache? Possibly renaming GTFSCache -> GtfsMapDbCache and extracting the shape and stop indexes into a GtfsGeoSpatialInMemoryCache? I don't think these names distinguish between the two well enough, but I wanted to give an example.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summarizing discussion on the call today: we will move these indexes into GTFSCache, but leave these spatial index caches as two top-level LoadingCaches within GTFSCache rather than making the spatial indexes fields on the GTFSFeed values managed by the GTFSCache. This is not because we necessarily consider this an ideal design long term, but because 1. it retains the current level of flexibility with respect to eviction policies, allowing the geometry caches to hold less objects and evict them sooner than the feed cache. This is important because we haven't used this in production yet and unlike the GTFSFeeds the spatial indexes have potentially unlimited memory usage. We want to observe memory usage in practice and be able to tune these parameters. 2. it's easier to reason about reference chains and loops when they're not nested, 3. even though lazy loading synchronization would be relatively simple as it would be per-GTFSFeed instance, it's nice to uniformly use the well-audited per-key locking of LoadingCache.

STRtree stopsIndex = stopIndexCache.get(bundleScopedFeedId);

final long startTimeMs = System.currentTimeMillis();
final int tileExtent = 4096; // Standard is 4096, smaller can in theory make tiles more compact
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be moved into a private static final int TILE_EXTENT = 4096;?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we'll add symbolic constants for all magic numbers. This is just left over from prototyping.


// Combine these two layers in a tile
JtsMvt mvt = new JtsMvt(patternLayer, stopsLayer);
MvtLayerParams mvtLayerParams = new MvtLayerParams(256, tileExtent);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be moved into a private static final int TILE_SIZE = 256;?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we'll add symbolic constants for all magic numbers. This is just left over from prototyping. These also need some comments explaining that they're the usual standard values, and what they do.

properties.put("lat", stop.stop_lat);
properties.put("lon", stop.stop_lon);

// Factor out this duplicate code from clipScaleAndSimplify
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we factor this out?

    private static Coordinate projectCoordinate (Coordinate c, Envelope envelope) {
        double px = ((c.x - envelope.getMinX()) * TILE_EXTENT) / envelope.getWidth();
        double py = ((envelope.getMaxY() - c.y) * TILE_EXTENT) / envelope.getHeight();
        return new Coordinate(px, py);
    }

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this can be factored out. I think the original code that clones a coordinate and sets its fields was the result of a series of refactors, starting from a version that tried to avoid new object creation. In the end we create a lot of small objects in the vector tile generation, which is not a big deal for contemporary garbage collectors.

// final String modificationNonceDigest = request.params("modificationNonceDigest");
checkNotNull(bundleId);
TransportNetwork network = transportNetworkCache.getNetwork(bundleId);
// "61fe817974f6230b0363aae1-8c07ddd4f8bd29ac10a4a109dd27d7b58dabd56c" // elevation only
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These appear to be hard coded testing IDs. Should they be removed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, just left over from experiments and prototyping.

* Complex street geometries will be simplified, but the geometry will not deviate from the original by more
* than this many tile units. These are minuscule at 1/4096 of the tile width or height.
*/
private static final int LINE_SIMPLIFY_TOLERANCE = 5;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should clipScaleAndSimplify and common static values like LINE_SIMPLIFY_TOLERANCE, TILE_EXTENT, and TILE_SIZE be factored out into a TileUtils file?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summarizing discussion from today's call: they're not quite static utility methods, as the goal is to reuse the envelope and buffered envelope of a particular tile as fields on an object. So we want to factor them out, but to a concrete class in a vector tile-specific package.

trevorgerhardt and others added 9 commits March 28, 2022 13:33
This change factors out common code across the `GtfsVectorTileMaker` and the `NetworkTileController`. It moves manny common functions into the `MapTile` util. This is because many operations are dependent on the `envelope` that is produced by that util. Additionally, the remaining functionality of the `GtfsVectorTileMaker` was eseentially boiled down to just creating and querying the caches.
Since the cache can be used for anything, not just vector tiles. Also, factors out the "GeometryCache" part.
Creates a buffered envelope that can be re-used while clipping.
Also, add comments.
merge GtfsGeometryCache into GtfsCache
restore type information and documentation
make MapTile clearly a VectorMapTile
@abyrd
Copy link
Member Author

abyrd commented Mar 31, 2022

I merged in @trevorgerhardt's cleanup and added some more of my own based on our conversation and comments added to #796. Some minor changes that were not discussed on the call:

  1. I reintroduced type information where it had been replaced with var declarations. This is a question of style, but var is not used at all in the rest of the code base and arguably makes type information more obscure in most cases. This is usually compensated for by IDEs which then sort of defeats the shortening effect of var: an IDE will infer the type and display it inline, so changing long x to var x ends up displaying var x: long to make sure you're getting the type you expect. If we change to using var this should probably be a consistent decision for the whole project.
  2. I reintroduced some Javadoc that was deleted when code was moved from one file to another. This was not the most well-composed documentation but did take some effort to pull together and could be very useful to someone approaching vector tile generation in the future. I cleaned it up a bit when I added it back.
  3. I pulled out some method calls that had been inlined as parameters in another method call. In optimized JVM operation this should have no effect on performance, but it does help when debugging to see the values before they're passed into the method, and it can be easier to read to have symbolic names in the parameter positions.

* As a custom modification, a model class only exists in R5, there is no separate UI/backend modification class.
*
* TODO This modification is not effective together with transit - it doesn't cause the egress tables to be rebuilt.
* Hmm, actually we don't ever recompute bike egress. Changing the LTS value does not invalidate bike egress tables.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this comment need to be addressed? At the very least, is it clear in the documentation provided to the clients that use this feature?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may be properly handled because affectsStreetLayer() returns true, but I’d like to retrace how that works to make sure it handles this case. Of course additional documentation will be in order if egress is not always affected by modifications.

@abyrd abyrd merged commit fc009dc into dev Apr 1, 2022
@abyrd abyrd deleted the cost-rasters-and-lts branch April 1, 2022 02:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
nv-update Implies changes to the serialized network (either format or logical content changes)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants