Avoid UB on converting NaN to int in annotations #2992
Why does the duplicate occur? I thought the cutting logic was robust to this, which means it has to be a duplicated shape point in OSM itself? If so, that is an easy gurka test to write to actually get test coverage here. Shouldn't we add that quickly before merging?
Can someone add a test for this case? @kevinkreiser yeah, this is occurring when the underlying OSM data contains a zero-length, duplicate-coordinate node.
TEST(ShapeAttributes, test_shape_attributes_duplicated_point) {
  tyr::actor_t actor(conf);

  auto result_json = actor.trace_attributes(
This test passes on master as well...
@purew the test will pass, but is ubsan triggered?
I'm not sure how to test for UB without ubsan, so I'll enable it on CI.
it would be fine to do this locally on your computer:
- undo the fix in triplegbuilder but keep the unit test
- turn on ubsan locally and run the unit test
- show that the sanitizer finds the issue via the unit test and post it to this PR
we shouldn't change the behavior of CI in this PR. iirc when we had ubsan enabled there, we either got randomly OOM killed or the build took forever
In my opinion, CI should test valhalla for UB. It would help to avoid UB issues in the future, and this is not the only issue found with ubsan.
i wonder if this line is a contributing factor:
maybe speed is 0 if at the beginning or end of a map match we use 0% of an edge, or maybe edge_seconds is 0. @averkhaturau i assume you ran into this sometime. did you happen to debug it to see the root cause of the NaN?
In short, if we have a duplicated coordinate in route geometry (we usually do in multileg routes), the speed is calculated as So, the reason is actually a duplicated coordinate.
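To illustrate how a repeated coordinate could end up as a NaN speed, here is a minimal sketch. It assumes, hypothetically, that a duplicated point yields a piece of shape with both zero length and zero duration; the helper names `segment_speed_kph` and `is_degenerate_segment` are made up for this example and are not taken from triplegbuilder.

```cpp
#include <cmath>
#include <limits>

// Illustration only: these helpers are hypothetical and are not Valhalla code.
static_assert(std::numeric_limits<double>::is_iec559,
              "the sketch assumes IEEE 754 doubles, where 0.0 / 0.0 is NaN");

// Average speed in km/h over a piece of shape: metres travelled / seconds taken.
// A zero-length piece traversed in zero seconds gives 0.0 / 0.0, which is NaN.
inline double segment_speed_kph(double length_m, double seconds) {
  return (length_m / seconds) * 3.6;
}

// True when the computed speed is NaN or infinite, i.e. not safe to cast to an
// integer type.
inline bool is_degenerate_segment(double length_m, double seconds) {
  return !std::isfinite(segment_speed_kph(length_m, seconds));
}
```

Calling `segment_speed_kph(0.0, 0.0)` returns NaN on such platforms, and casting that value to an integer type is the undefined behavior this PR guards against.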
I do not understand your comment about multileg routes! TripLegBuilder only works on one leg at a time, so surely multileg routes cannot cause this problem. I'm just wondering where the point duplication comes in. Is it like that in the base OSM data, or do we have a bug elsewhere that is erroneously repeating a coordinate (i.e. when cutting the shape for the first or last edge of the route, also done in triplegbuilder)? It's good to handle it because OSM could have duplicate adjacent points (and I'm certain we don't weed that out at tile building time), but my main question is what is causing the duplicated points and whether we need to fix something more in a separate PR.
@kevinkreiser as far as I can see, we do have a request that reproduces the issue. I'll come back to you soon and let you know if we need one more fix somewhere else. But I believe this is not a blocker for this PR.
This PR is a hot workaround for the UB. The duplicated point is a separate issue.
yeah im not trying to block the PR. let me reiterate my points more concisely:
- we should ship this to protect us from NaN regardless of the cause
- lets investigate the cause of the duplicate adjacent shape points in a separate issue/PR
- revert the changes to CI
@kevinkreiser, I reverted the CI changes. Should I revert the test as well?
Looks good
Duplicated coordinates can easily exist in the original OSM data, so I don't think there's any need to look for another cause here. I'm sure a simple scan of OSM will expose thousands, if not millions, of cases of this. Valhalla needs to be robust against that, and as @averkhaturau pointed out, we've been lucky that NaN just so happens to come out as 0 in our serialization in most cases.
Issue
The speed annotation may hit undefined behavior when converting NaN to an integer if the same geometry point is duplicated in a route.
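As a rough sketch of the kind of guard this describes, the hypothetical helper below converts a floating-point speed to an unsigned integer without ever casting NaN, infinity, or an out-of-range value. The name `speed_to_uint`, the `fallback` parameter, and the choice of `std::uint32_t` are assumptions for illustration; the actual fix in this PR may look different.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper, not the code from this PR: convert a speed to an
// unsigned integer while avoiding the UB of casting NaN or out-of-range values.
inline std::uint32_t speed_to_uint(double speed, std::uint32_t fallback = 0) {
  // NaN, +/-infinity and negative speeds cannot be represented: use the fallback.
  if (!std::isfinite(speed) || speed < 0.0) {
    return fallback;
  }
  constexpr std::uint32_t kMax = std::numeric_limits<std::uint32_t>::max();
  // Clamp values that would overflow the target type.
  if (speed >= static_cast<double>(kMax)) {
    return kMax;
  }
  // In range and finite: round to nearest and cast.
  return static_cast<std::uint32_t>(std::llround(speed));
}
```

With this guard, `speed_to_uint(std::nan(""))` returns the fallback of 0, which matches the NaN to 0 outcome that serialization happened to produce before, but without relying on undefined behavior.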