Remove redundant LineStrings in order to save memory #2795
|
@gmellemstrand the PR diff shows over 100 files changed, can you confirm that this is built on top of #2794 (Use realtime data in TransitLayerMapper), and I should look at a diff relative to that PR to understand your changes? |
|
In that case let's aim to clean up and merge #2794, since Thomas has already provided a lot of comments on that one, then I'll start working on this one. |
ea7500a to
8585fd1
|
This is now rebased on top of dev-2x. Some things we need to discuss:
|
154474e to
cb575da
|
After discussing with @abyrd, I made the following changes:
|
|
I have made a new commit that should save about half of the space used for elevation profiles. Elevation is sampled along each StreetEdge at regular intervals (10m), and the resulting coordinates are then efficiently encoded using DlugoszVarLenIntPacker. A typical elevation profile can look like this: 0 = {Coordinate@4453} "(0.0, 239.16)". The last x-coordinate is the length of the StreetEdge. I have changed CompactElevationProfile to save only the y-values and reconstruct the x-values when uncompacting. I have also moved the distanceBetweenSamplesM field from ElevationModule to CompactElevationProfile. |
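The idea of storing only the y-values can be sketched roughly as follows. This is a minimal illustration and not the actual CompactElevationProfile code: the class name `ElevationProfileSketch`, its method signatures, and the use of plain `double` arrays instead of Coordinate objects are all assumptions made for the example. It relies only on what is described above: samples are evenly spaced, and the last x-coordinate equals the edge length.

```java
/** Hypothetical sketch, not the real CompactElevationProfile. */
final class ElevationProfileSketch {

    /** Keeps only the elevations; each input row is an {x, y} pair. */
    static double[] compact(double[][] profile) {
        double[] elevations = new double[profile.length];
        for (int i = 0; i < profile.length; i++) {
            elevations[i] = profile[i][1];
        }
        return elevations;
    }

    /**
     * Rebuilds the {x, y} pairs. Sample i sits at i * spacingM, except the
     * final sample, which is placed at the end of the edge (edgeLengthM).
     */
    static double[][] uncompact(double[] elevations, double edgeLengthM, double spacingM) {
        double[][] profile = new double[elevations.length][2];
        for (int i = 0; i < elevations.length; i++) {
            boolean isLast = i > 0 && i == elevations.length - 1;
            profile[i][0] = isLast ? edgeLengthM : i * spacingM;
            profile[i][1] = elevations[i];
        }
        return profile;
    }
}
```

Since the x-values follow from the sample spacing and the edge length, dropping them loses no information as long as the spacing used when uncompacting matches the one used when sampling.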
…structed when uncompacting.
236db2a to
26da2d5
|
Summarizing our discussion today: this PR looks good to me. One remaining issue is how to store and set the distance between elevation profile sample points. In the past it was effectively a constant, but it turns out Entur has made modifications that allow setting it higher, to 25m (since the horizontal resolution of their elevation data is only 50m). So the value needs to be saved at graph build and restored when the graph is loaded. We looked at making the (un)compacter a concrete class, serializing it with the Graph, or just passing the horizontal spacing into the unpack method. Unfortunately, in all the places where this method is called, we don't have access to a context object that lets us read configuration from the graph, so the horizontal spacing would have to be passed through multiple frames, adding parameters to multiple methods. We decided to just retain the current static configuration approach, which is stylistically kind of ugly but requires only one simple, easy-to-understand new line upon graph load. This does mean that in the general case we cannot have more than one graph loaded at once, because configuration in the global (static) scope is affected by the contents of a particular graph. Eliminating multiple routers was already planned for OTP2, but we are now making moves that really cement this decision. @gmellemstrand's changes add new methods Finally there is an issue of methods named |
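A rough sketch of the static configuration approach described above, for readers unfamiliar with the trade-off. All names here (`ProfileConfigSketch`, `GraphSketch`, `onLoad`) are hypothetical stand-ins, not the actual OTP classes or methods; the point is only that the graph carries the value and a single line on load copies it into global state.

```java
import java.io.Serializable;

/** Hypothetical stand-in for the class that unpacks elevation profiles. */
final class ProfileConfigSketch {
    // Global (static) state: shared by everything running in the same JVM.
    private static double distanceBetweenSamplesM = 10.0;

    static void setDistanceBetweenSamplesM(double meters) {
        distanceBetweenSamplesM = meters;
    }

    static double getDistanceBetweenSamplesM() {
        return distanceBetweenSamplesM;
    }
}

/** Hypothetical stand-in for the serialized graph. */
final class GraphSketch implements Serializable {
    // Saved at graph build time so it survives serialization.
    final double distanceBetweenSamplesM;

    GraphSketch(double distanceBetweenSamplesM) {
        this.distanceBetweenSamplesM = distanceBetweenSamplesM;
    }

    /** Assumed to be called once, right after the graph is deserialized. */
    static GraphSketch onLoad(GraphSketch graph) {
        // The single extra line: restore the spacing into static scope.
        ProfileConfigSketch.setDistanceBetweenSamplesM(graph.distanceBetweenSamplesM);
        return graph;
    }
}
```

Loading a second graph with a different spacing overwrites the static value, which is why this design rules out having several graphs with different spacings loaded at once.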
|
@gmellemstrand once you make the changes we discussed, I can review and potentially merge today. |
53c3671 to
7463dad
abyrd
left a comment
Thanks for working through all the details. Looks good.
After doing some memory profiling of OTP, we found that a lot of space is taken up by LineStrings. This affects both graph size and the memory requirement to run OTP. It is especially important for very large graphs, since you may run into a limit to the file size supported by the Kryo serializer.
Note that this is based on #2794, so only the last commit is relevant for this particular pull request.
The following changes have been made:
The geometries in SimpleTransfer and Raptor Transfer objects have been removed. In the first case they were saved as LineStrings, and in the second as Coordinate arrays. Since we already have a list of edges, it is possible to just combine these whenever we need the complete geometry. This is typically when building the itineraries, so performance is not so critical. The geometries of StreetEdges are delta compressed using CompactLineString, so they do not take up much space.
The geometries in TripPatterns have been removed. This information is duplicated in the hop geometries, and is not needed until the itineraries are created. In addition, the geometries in hop edges now use CompactLineString, as StreetEdges do.
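Combining edge geometries on demand, as described in the changes above, could look roughly like the sketch below. It is illustrative only: `GeometryConcatSketch` and its `concatenate` method are hypothetical, and points are plain `{x, y}` arrays rather than the Coordinate and LineString types used in OTP.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of rebuilding a full geometry from its edges. */
final class GeometryConcatSketch {

    /**
     * Joins the per-edge point sequences into one sequence. A point equal to
     * the previously emitted one is skipped, so the vertex shared by two
     * consecutive edges appears only once.
     */
    static List<double[]> concatenate(List<double[][]> edgeGeometries) {
        List<double[]> result = new ArrayList<>();
        for (double[][] geometry : edgeGeometries) {
            for (double[] point : geometry) {
                boolean repeatsPrevious = !result.isEmpty()
                        && Arrays.equals(result.get(result.size() - 1), point);
                if (!repeatsPrevious) {
                    result.add(point);
                }
            }
        }
        return result;
    }
}
```

The cost is linear in the total number of points, and it is paid only when an itinerary actually needs the full geometry, in exchange for not keeping a second copy in memory.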
In total this cuts the size of the serialized graph in half when testing with the Oslo GTFS and Oslo OSM files. I have tested with maxTransferDistance of 2000, which means there were relatively many SimpleTransfers, which took up a lot of space.
This does break the BusRouteStreetMatcher, as it tries to replace the whole TripPattern geometry, so that needs to be looked into before merging.