Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Edge flags refactoring #1447

Merged
merged 49 commits into from Feb 19, 2019

Conversation

5 participants
@karussell
Copy link
Member

commented Aug 28, 2018

Mainly this PR avoids the 64 bit limitation of getFlags and makes working with edge properties much easier (direction-dependent or not):

// definition of the EncodedValue:
registerNewEncodedValue.add(averageSpeedEnc = new DecimalEncodedValue("some_name.average_speed", speedBits, defaultValue, speedFactor, speedTwoDirections));

// for the import:
edge.set(averageSpeedEnc, 60).setReverse(averageSpeedEnc, 30.0);

// in the Weighting:
edge.get(averageSpeedEnc);
edge.getReverse(averageSpeedEnc);

It is now very easy to enable direction-dependent values, just with a boolean flag in the constructor. One can now easily introduce FlagEncoder-independent values like we did already with the roundabout boolean. One can now easier use external edge data as there is no need to define a direction bit and one uses the reverse property of the EncodedValue (see REVERSE_STATE or graph.getEdgeIteratorState returns stored direction). And also the EncodedValues are independent of any OSM tags, interesting for other data sets like GTFS. This change also reduces the usage of getFlags until we can make it private at some point.

This PR fixes at least #472, #1164 and #728. This will come after 0.11.

The goal was to make a 'minimal' version of #1112 without introducing new features like TagParser and also we did not yet remove FlagEncoders. Also performance seems to be the same. All other features could be introduced later.

Done:

  1. remove reverseFlags
  2. use int[] (IntsRef) instead long for edge flags (edge.getFlags)
  3. introduce EncodedValue stuff and mostly hide getFlags via using edge.set(enc, value)! Do no longer use graph.isForward as it is not able to consider reverse flag of the edge to read IntsRef properly, same for get/setSpeed, see https://gist.github.com/karussell/c06486c92c85613a88c3201e6d6c2488
  4. handle graph.isForward(encoder) or encoder.isForward(flags) via accessEnc
  5. for chGraphImpl.edge.isForward we currently assume the boolean getter method is only called for access property! See https://github.com/graphhopper/graphhopper/blob/master/core/src/main/java/com/graphhopper/storage/CHGraphImpl.java#L477
  6. simplified FlagEncoder interface
  7. make it possible to reuse bits accross FlagEncoders. Example is "roundabout" which is created in EncodingManager. Throw exception for a duplicate EncodedValue.
  8. implement proper fetch of IntsRef from BaseGraph without reversing flags! This is a huge thing and directly related to 1. (no node-order dependent storage, mostly taken from 5da75c5)
  9. new speedDefault variable in AbstractFlagEncoder as a 0 should be interpreted as 0km/h and not as the default value in the new DecimalEncodedValue class
  10. getEdgeIteratorState => now we return the edge in "storage direction", which means there are no hacks necessary for direction dependent external properties
  11. refactored edge.copyPropertiesFrom as we need internal properties of the target edge
  12. use old handling internal to the FlagEncoders for slow speed => setSpeed to 0 and access to false, see #242, #367, #665
  13. CH => skippedEdges are now properly stored as A and B and do not swap, see "Later improvements" regarding Path4CH
  14. due to the shared roundabout flag now the default access bits cannot be used for CH shortcuts anymore => several tests needed a fix with the SC_ACCESS, see "Later improvements" of introducing proper edge flags for CH
  15. if speed in OSM tags exceeded storable maximum we silently use a maximum value stored in all encoders. The EncodedValue itself throws an exception.
  16. we do not allow bits == 0 like before as for spatialEncoder this change was required. The dynamic EncodedValue creation needs a different way instead of creating many unused&empty EncodedValues
  • remove possibility to specify default value for IntEncodedValue as we would have to associate the default value with 0 and associate the 0 entry with the index the default value had before. Instead one should now use MappedIntEncodedValue if there is no possibility to implement the use case with the FactorizedIntEncodedValue
  • change all EncodedValue classes into interfaces, new names for the IntEncodedValue implementation like FactorizedIntEncodedValue?
  • move the StringEncodedValue into an EnumEncodedValue? enum are a clean and type safe way to do this. As Enums are impossible to extend, we created the class EnumAlike https://stackoverflow.com/q/6511453/194609
  • gtfs module and all other modules need to compile and green tests
  • roundabout flag is initialized and overwritten multiple times if multiple FlagEncoders are added
  • something is broken with the bounding box (strange one is displayed on /maps)
  • remove many unnecessary edge.setFlags(edge.getFlags()); in GtfsReader
  • TODO NOW stuff
  • copy some tests from old branch
  • fix technical.md image => https://karussell.files.wordpress.com/2013/08/wiki-graph.png A and B are no longer ordered!

TODOs:

  • create test when flags are 8 bytes and this clashes with 4 byte shortcuts due to CommonEdgeIterator.setFlags. This will fail (as downstream code shows) - fix this!
  • rename EnumAlike to IndexBased and all getEnumXY to getObjXY?
  • merge master

Later improvements:

  1. reorder EncodedValues in EncodingManager to save bits e.g. 17, 20, 5, 5, 5 could be reordered to 17, 5, 5, 5, 20 and would then fit into just two ints without the need to implement "cross-int" support for EncodedValue
  2. increase edgeFlags to more than 2 integers and reorder EncodedValue for a more compact bit representation (or allow crossing the int-border for one EncodedValue).
  3. do the elevation interpolation for all FlagEncoders, where the type of the highway is stored just once like we do with the roundaboutEnc
  4. introduce proper CH edge flags with EncodedValue stuff that could be increased to more than 4 bytes. Integrate SC_ACCESS into CHGraphImpl
  5. introduce EncodedValue as first class citizen in EncodingManager, remove flagsDefault #800 or even better try to remove FlagEncoder usage. I.e. remove flagDefaults and get rid of the EncodingManager withing the BaseGraph
  6. introduce EncodedValueLookup lookup instead Encoder in Weighting
  7. remove Weighting.getFlagEncoder
  8. Instead of BooleanEncodedValue accessEnc = encoder.getBooleanEncodedValue(EncodingManager.getKey(encoder, "access")); use a more logical/convenient way like encoder.getBooleanEncodedValue("access") ?
  9. potentially we can improve handling in Path4CH.expandEdge as getEdgeIteratorState has "storage direction"?
  10. move clases into subpackages, especially "storage" package is full, see also #583
  11. avoid accessing DataAccess for every small set, instead require ending "store" method https://gist.github.com/karussell/f4c2b2b1191be978d7ee9ec8dd2cd48f
  12. think about a similar solution (edge flags) for nodes and relations, then deprecate EncodedValueOld
  13. we could now refactor the QueryGraph to use 2 bidirectional virtual edges instead of 4 directed, see #892

karussell added some commits Aug 14, 2018

@karussell karussell added this to the 0.12 milestone Aug 28, 2018

@karussell karussell force-pushed the intsref_refactoring branch from 12f4887 to 5c3d229 Sep 7, 2018

@karussell karussell force-pushed the intsref_refactoring branch from 395e3b0 to 1c917c7 Sep 7, 2018

karussell added some commits Sep 7, 2018

@karussell karussell closed this Sep 7, 2018

@karussell karussell reopened this Sep 7, 2018

@karussell karussell force-pushed the intsref_refactoring branch from 85cf3d9 to 037704b Sep 7, 2018

@karussell

This comment has been minimized.

Copy link
Member Author

commented Sep 7, 2018

Ok, I'm now happy with the scope of the change and I think this PR is in a good shape. All tests are green and performance is also ok, so we have something we can work on.

Before a merge we would need to test this on a world wide instance and also I need to know if it satisfies some of the use cases we had before (like external edge data etc). So I would need feedback regarding the EncodedValue-API, maybe @ammagamma & @michaz can have a look and already work a bit with it?

Another question is if we should use IntsRef or LongsRef. Currently we have not implemented a cross-int EncodedValue i.e. if you have two EncVs with 20 bits then the second one will be in the next int and 12 bits are unused. Maybe one can implement a cross-int solution (like we partially had before via ints-merging) or maybe we reduce problems with a LongsRef? The best thing would be to hide this decision as much as possible and make it trivial to change it later. Maybe we can then keep using IntsRef as also the underlying DataAccess for edges is of type int[], potentially making copying between DataAccess and the BaseGraph simpler&faster.

@zhiqiangZHAO

This comment has been minimized.

Copy link
Contributor

commented Sep 13, 2018

It's clever to separate direction-dependent and non-direction-dependent features. reverse speed is slow, though if the bits are same it could be optimized.

@karussell

This comment has been minimized.

Copy link
Member Author

commented Sep 13, 2018

What do you mean? btw: the underlying implementation could be easily changed (have several experiments done but as a start simplicity is more important), so that e.g. the value and its delta are stored like we do now for landmarks instead forward+backward values.

@karussell karussell force-pushed the intsref_refactoring branch from 991e7cf to cffad50 Sep 20, 2018

@@ -189,21 +191,22 @@ private void findAndHandleShortcuts(int node) {
}

private void countPreviousEdges(int node) {
BooleanEncodedValue accessEnc = encoder.getAccessEnc();

This comment has been minimized.

Copy link
@karussell

karussell Feb 5, 2019

Author Member

Just as an info: the dependency flagEncoder->EncodedValue is one that we should avoid and always prefer EncodedManager->EncodedValue if possible and simple. In this case it seems to be harder so it might be ok (that's why I created the method as a helper).

// TODO NOW creating a new intsref every time here is probably not efficient ?
IntsRef intsRef = new IntsRef(1);
intsRef.ints[0] = direction;
shortcut.setFlags(intsRef);
shortcut.setSkippedEdges(edgeFrom.edge, edgeTo.edge)

This comment has been minimized.

Copy link
@karussell

karussell Feb 5, 2019

Author Member

Good question. As you can see in the node based contractor (addShortcuts) I'm doing the same, but I've moved the creation out of the loop so that it can be reused. Reusing is especially important if this is a hot method. IMO this is the main reason the preparation is slower compared to master.

I tried to let the Graph handle this caching and called shortcut.getFlags (creating a cached IntsRef) but as we currently do not allow to directly call getFlags for shortcuts this fails. The cleanest thing would be to create a shortcutWeightEnc and shortcutAccessEnc like we have e.g. averageSpeedEnc or accessEnc. Still in case of a new shortcut it would not be possible for the PrepareGraph to cache this (as it should be stateless), unlike when it would be a EdgeIterator or EdgeExplorer (where we assume stateful).

This comment has been minimized.

Copy link
@karussell

karussell Feb 5, 2019

Author Member

Or maybe you have a better idea in this regard?

This comment has been minimized.

Copy link
@karussell

karussell Feb 6, 2019

Author Member

Some possibilities:

  • allow getFlags: this avoids an additional IntsRef object in case of a new shortcut as we can just call "shortcut.getFlags". This is not that bad as getFlags is now also discouraged for edges too due to the low level usability
  • introduce accessEnc and weightEnc, which allows e.g. dynamically configuring the precision but the overhead when comparing to pure bit operations might be too high?
  • use a different API for shortcuts, e.g. fallback to long getShortcutFlags as everything that we want to store is known before and we could create setter&getter for it (like setWeight or setFirstAndLastOrigEdges). But this is inconsistent compared to normal edges.

easbar added some commits Feb 6, 2019

Fixes EdgeBasedNodeContractorTest.
Signed-off-by: easbar <easbar.mail@posteo.net>
Merge branch 'master' into intsref_refactoring
Signed-off-by: easbar <easbar.mail@posteo.net>

# Conflicts:
#	core/src/test/java/com/graphhopper/routing/ch/PrepareContractionHierarchiesTest.java

karussell added a commit that referenced this pull request Feb 6, 2019

easbar and others added some commits Feb 7, 2019

Adds more efficient shortcut method (prevents iterator), but sth is w…
…rong.

Signed-off-by: easbar <easbar.mail@posteo.net>
Fixes error (write flags to edges DA instead of shortcuts) and uses n…
…ew method for node- and edge-based CH.

Signed-off-by: easbar <easbar.mail@posteo.net>
minor
Signed-off-by: easbar <easbar.mail@posteo.net>
Minor cleanup.
Signed-off-by: easbar <easbar.mail@posteo.net>
Avoids intsref for shortcut updates and removes IntsRef creation from…
… CHEdgeIteratorImpl.

Signed-off-by: easbar <easbar.mail@posteo.net>

@karussell karussell referenced this pull request Feb 13, 2019

Merged

Edge CH speedup #1542

@karussell

This comment has been minimized.

Copy link
Member Author

commented Feb 13, 2019

@easbar Do you think this change makes this PR a bit cleaner or at least more consistent of CH and none-CH stuff? There are two tiny advantages: the weight is 1 bit more precise and setWeight and access handling is independent, i.e. you do not need to call one before the other or something. I wasn't able to measure any performance difference but will do again with some improved measurement stuff. After we decided this I would finally merge this PR.
update: it seems to be slower for the CH preparation

There is also this lazy loading commit, but I do not like this pattern that much.

karussell added some commits Feb 13, 2019

@easbar

This comment has been minimized.

Copy link
Collaborator

commented Feb 15, 2019

update: it seems to be slower for the CH preparation

Ok I have not looked at the change you mentioned yet, should I wait for more changes or have a look ?

easbar and others added some commits Feb 15, 2019

Removes unused intsref.
Signed-off-by: easbar <easbar.mail@posteo.net>

@karussell karussell merged commit 022697d into master Feb 19, 2019

3 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details
@karussell

This comment has been minimized.

Copy link
Member Author

commented Feb 19, 2019

Ok, after some more tests this is now merged. Instead of the lazy loading commit we decided to use the intsref_no_iter_cached that gave a minor boost for CH preparation.

BTW: @easbar finally I got the co author thing working :D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.