Skip to content

Experiment: Elevation data optimization

Andrew Byrd edited this page Dec 17, 2014 · 1 revision

This page is about experiment results on optimizing street edge memory (tickets #1527, #1533).

Caveat

For the record, the idea is to optimize the memory usage of street edges and elevation profile data, with the following changes:

  • Use bit fields for storing street edges flags (6 flags for now), instead of plain boolean which consume each 1 byte (according to latest JVM specs).
  • Transform the various bike safety, slope work cost and slope speed factors from the current "effective length" (where the values are stored either as length multiplied by some factor or total energy cost along the segment) to more bounded and somehow simpler unit-length factors (or for energy, average joules per meter on the segment). The effective length is then simply the factor multiplied by the segment length. This allow us to reduce the bounds of the values and help keep a good precision even with float factors.
  • Extract the bike safety factor from the elevation profile data to the street edge.
  • Those two latter changes allow us to use a "flat elevation profile" singleton instance for the case where no elevation is needed, greatly reducing memory usage.
  • Compact the elevation profile exact data (which is not needed during path computation) using delta coding (which fits well with the type of data: arc-length curve starting from 0 and elevation), fixed float values and variable-length int coding (following the same method as for geometry).
  • Invert the EdgeWithElevation <- StreetEdge relationship to StreetEdge <- StreetWithElevationEdge, to remove the elevation profile pointer when it's not needed.

Experiment results

With elevation data

Some results on a graph covering a 1x1 degree square (111km x ~80km), street data only:

     Graph size            Class size
         (mem)     (disk)  EPS   PSE   Gain (mem)
----------------------------------------------------
A    439161000  247433834  48    96    --
B    310603000  124029254  48    88    29%
C    304946000  119786150  32    88    1.8%
----------------------------------------------------
(Number of PlainStreetEdge objects: 353894)
  • A - original version
  • B - compact elevation profile, var-len int packing, float bike safety
  • C - B + float elevation factors (slope work, slope speed, max slope).

Note: we take as a base value for the various factors the "flattened" length (the projected length on the XY plane, not the effective one taking into account slope). But the effective result is the same in both case anyway.

Without elevation data

Gain using a "flat profile" shared singleton instance:

              Graph (mem)  Gain
Original       156768000
Singleton      145567000     7%

Note: the gain is 11% relative to the geometry-optimized version.

Removal of elevationProfile pointer

The changes in this modifications:

  • Create a StreetWithElevationEdge, extending StreetEdge, storing elevationProfileSegment pointer
  • We use the StreetEdgeFactory for OSM builder and ShapeFileBuilder, removing any OSM specific-stuff (label removal helped in this)
  • Old classes deriving from StreetEdge are deriving now from StreetWithElevationEdge (PartialStreetEdge, AreaEdge)
  • Rework a bit the various interfaces, removing carSpeed as constructor argument (we have a single constructor now, if one want to specify carSpeed there is a setter for this)

The only drawback with this optimization is that due to 8 bytes rounding of object size by the JVM, the StreetEdge object size goes down from 64 bytes to... 64 bytes. But in the "long term" we will shave off 4 bytes from every object, so it's an estimated 3% size improvement when no elevation data is needed.

Note: Now when building a graph we need to know beforehand if we want to have elevation data or not, as OSM/Shapefile builders need to have the properly configured edge factory.

Boolean bit field

Taking into account memory alignment rounding (8 bytes alignment), we did not reduce the plain street edge object size by replacing booleans by bit fields (88 bytes for the two versions). But we "potentially save in the long term" 5 bytes for every 353894 objects, that is 1769470 bytes, ie 0.4% of total memory (or 0.98% of a geometry-optimized graph size w/o elevation data). The change is rather simple and harmless so even if the gain is not that important we can keep it.

Potential future improvements

Half-precision float factors

We now store various unitary factors using floats, but we could save some more memory by switching to either:

  • fixed-sized short (2 bytes), the precision may be enough as the factor ranges are limited (for bike safety between 1 and around 100; 2 bytes allow a precision of around 0.003).
  • half-precision floats (2 bytes). The advantage of floats are that the precision is scaled to the value, which is a better fit for storing multipliers factors. However storing sign bit and exponent in half-precision leave not that much room for the mantissa (1 bit sign, 5 bit exponent, 10 bits mantissa). We could design our own format by removing the sign bit, use less bits for the exponent (the needed scale is not large), that would leave us with around 12 bits for the mantissa. Also, as the factor are unitary multipliers, we could store the value as the factor minus 1 instead (f - 1) to reduce even more the data entropy.

Removing JTS CoordinateSequence

We still use a "CoordinateSequence" and a OTP-customized "PackedCoordinateSequence" for the elevation profile data interface. They both rely on JTS and could be removed (especially the customized OTP version which is a copy-paste from JTS adding only a Serializable interface, which is not needed anymore as we serialize the packed form instead). The CoordinateSequence interface is a bit convoluted for what we need.

Details on memory usage for the 3 versions

A - Dominated objects of ElevationProfileSegment

Class Name                                                         |   Objects | Shallow Heap
----------------------------------------------------------------------------------------------
double[]                                                           |   353,591 |  143,276,384
org.opentripplanner.routing.util.ElevationProfileSegment           |   353,592 |   16,972,416
org.opentripplanner.common.geometry.PackedCoordinateSequence$Double|   353,591 |    8,486,184
Total: 3 entries                                                   | 1,060,774 |  168,734,984
----------------------------------------------------------------------------------------------

B - Dominated objects of ElevationProfileSegment

Class Name                                              |   Objects | Shallow Heap
-----------------------------------------------------------------------------------
byte[]                                                  |   353,591 | 26,035,784
org.opentripplanner.routing.util.ElevationProfileSegment|   353,592 | 16,972,416
Total: 2 entries                                        |   707,183 | 43,008,200
-----------------------------------------------------------------------------------

C - Dominated objects of ElevationProfileSegment

Class Name                                              |   Objects | Shallow Heap
-----------------------------------------------------------------------------------
byte[]                                                  |   353,591 | 26,035,784
org.opentripplanner.routing.util.ElevationProfileSegment|   353,592 | 11,314,944
Total: 2 entries                                        |   707,183 | 37,350,728
-----------------------------------------------------------------------------------

The documentation on this wiki is outdated and should not be used

unless you are intentionally working with legacy versions of OpenTripPlanner. Please consult the current documentation at readthedocs

Clone this wiki locally