Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make tile building reproducible #2480

Merged
merged 9 commits into from
Jul 28, 2020
Merged

Make tile building reproducible #2480

merged 9 commits into from
Jul 28, 2020

Conversation

merkispavel
Copy link
Contributor

@merkispavel merkispavel commented Jul 21, 2020

Issue

At the moment if you try to run valhalla_build_tiles a few times with the same input data(config, pbfs) you'll get different tilesets(tile hash sums are different). The PR fixes that behavior

Example of how to compare tilesets

cd ${PATH_TO_FIRST_TILESETS} && find . -type f -exec md5 {} + | sort -k 2 > ../first_hash.txt && cd -
cd ${PATH_TO_SECOND_TILESETS} && find . -type f -exec md5 {} + | sort -k 2 > ../second_hash.txt && cd -
diff first_hash.txt second_hash.txt # must be empty after this PR

Tasklist

  • Add tests
  • Add #fixes with the issue number that this PR addresses
  • Generally use squash merge to rebase and clean comments before merging
  • Update the changelog

Requirements / Relations

Link any requirements here. Other pull requests this PR is based on?

@danpaz
Copy link
Collaborator

danpaz commented Jul 21, 2020

Is this reproducible with a gurka test? If so it seems worth adding a simple test that builds tiles twice using the same input map and checking the resulting tiles are the same.

@merkispavel
Copy link
Contributor Author

merkispavel commented Jul 21, 2020

Is this reproducible with a gurka test? If so it seems worth adding a simple test that builds tiles twice using the same input map and checking the resulting tiles are the same.

Totally agreed! Didn't want to assign reviewers until I add tests.
The only thing is: I thought about using some small real world pbf(small country/extracted region) rather than "non-realworld" . Not sure which one is better

@danpaz
Copy link
Collaborator

danpaz commented Jul 21, 2020

The only thing is: I thought about using some small real world pbf(small country/extracted region) rather than "non-realworld" . Not sure which one is better

IMO, if a minimal fake map reproduces the issue it will be a lot simpler to write and understand that test, and it will run a lot faster in CI.

@gknisely gknisely self-requested a review July 24, 2020 14:45
gknisely
gknisely previously approved these changes Jul 24, 2020
@merkispavel
Copy link
Contributor Author

merkispavel commented Jul 24, 2020

Hm, tests pass on my machine but fails in CI. Seems there're another UB somewhere in the code..
I built and ran this test locally with UB sanitizer. 4 UB errors(3 unique?) were found. Need some time to figure out what UB-s are critical for this test/building tiles.
Makes sense to me not to merge this PR until that to avoid flaky tests in the master

[----------] 1 test from GraphTileBuilder
[ RUN      ] GraphTileBuilder.TestBuildIsReproducible
[          ] generating map PBF at test/data/gurka_reproduce_tile_build/1/map.pbf
[          ] building tiles in test/data/gurka_reproduce_tile_build/1
../valhalla/baldr/graphtileheader.h:166:12: runtime error: downcast of address 0x615000010200 which does not point to an object of type 'const midgard::PointLL' (aka 'const GeoPoint<float>')
0x615000010200: note: object has invalid vptr
 5f 00 00 36  82 5e 3f 00 00 00 00 00  00 00 00 00 00 00 00 00  33 2e 30 2e 39 00 00 00  00 00 00 00
              ^~~~~~~~~~~~~~~~~~~~~~~
              invalid vptr
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../valhalla/baldr/graphtileheader.h:166:12 in
../valhalla/baldr/nodeinfo.h:65:41: runtime error: member call on address 0x615000000800 which does not point to an object of type 'valhalla::midgard::GeoPoint<float>'
0x615000000800: note: object has invalid vptr
 7f 00 00 41  82 5e 3f 00 00 00 00 00  00 00 00 00 00 00 00 00  33 2e 30 2e 39 00 00 00  00 00 00 00
              ^~~~~~~~~~~~~~~~~~~~~~~
              invalid vptr
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../valhalla/baldr/nodeinfo.h:65:41 in
../valhalla/baldr/nodeinfo.h:66:41: runtime error: member call on address 0x615000000800 which does not point to an object of type 'valhalla::midgard::GeoPoint<float>'
0x615000000800: note: object has invalid vptr
 7f 00 00 41  82 5e 3f 00 00 00 00 00  00 00 00 00 00 00 00 00  33 2e 30 2e 39 00 00 00  00 00 00 00
              ^~~~~~~~~~~~~~~~~~~~~~~
              invalid vptr
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../valhalla/baldr/nodeinfo.h:66:41 in
../src/baldr/directededge.cc:506:22: runtime error: shift exponent 4294967295 is too large for 32-bit type 'int'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../src/baldr/directededge.cc:506:22 in
../src/mjolnir/graphvalidator.cc:634:33: runtime error: division by zero
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../src/mjolnir/graphvalidator.cc:634:33 in
[          ] generating map PBF at test/data/gurka_reproduce_tile_build/2/map.pbf
[          ] building tiles in test/data/gurka_reproduce_tile_build/2
[       OK ] GraphTileBuilder.TestBuildIsReproducible (5520 ms)
[----------] 1 test from GraphTileBuilder (5520 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (5520 ms total)
[  PASSED  ] 1 test.

test/gurka/test_reproduce_tile_build.cc Show resolved Hide resolved
test/gurka/test_reproduce_tile_build.cc Outdated Show resolved Hide resolved
valhalla/baldr/admin.h Show resolved Hide resolved
@@ -502,6 +502,8 @@ void DirectedEdge::set_shortcut(const uint32_t shortcut) {
void DirectedEdge::set_superseded(const uint32_t superseded) {
if (superseded > kMaxShortcutsFromNode) {
LOG_WARN("Exceeding max shortcut edges from a node: " + std::to_string(superseded));
} else if (superseded == 0) {
superseded_ = 0;
Copy link
Contributor Author

@merkispavel merkispavel Jul 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the fix of the failed build-release check a few days ago. If superseded == 0 then 1 << (superseded - 1) produces UB

This fix will be committed in different PR with most UB fixes
mookerji
mookerji previously approved these changes Jul 28, 2020
Copy link
Contributor

@mookerji mookerji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! The Azure pipelines build is unrelated to your PR, and is failing for unrelated reasons.

@mookerji
Copy link
Contributor

@merkispavel actually I think you need to add an entry to the CHANGELOG as well, as part of this change.

@merkispavel
Copy link
Contributor Author

@merkispavel actually I think you need to add an entry to the CHANGELOG as well, as part of this change.

Thanks for the reminder! Done

@merkispavel merkispavel merged commit bea80dc into master Jul 28, 2020
@merkispavel merkispavel deleted the consistent-tile-build branch July 28, 2020 13:35
yuzheyan added a commit that referenced this pull request Jul 28, 2020
* 'master' of github.com:valhalla/valhalla:
  Make tile building reproducible (#2480)
  Fix install-script for ubuntu 18.04 (#2306) (#2486)
  nit(git): Configures changelog to resolve conflicts with union strategy (#2489)
  feat(traffictile.h): Adds versioning checks (#2484)
  Fix dereferencing of end for std::lower_bound and possible UB (#2488)
  Minor fixes to tests (#2483)
  nit: Missing changelog entry (#2478)
  Safiefy protobuf (#2477)
yuzheyan added a commit that referenced this pull request Jul 28, 2020
* master:
  Make tile building reproducible (#2480)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants