fix: clamp content differences to prevent FMA-induced numerical instability#1455
fix: clamp content differences to prevent FMA-induced numerical instability#1455barendgehrels wants to merge 1 commit intoboostorg:developfrom
Conversation
…l instability Might fix boostorg#1452
| out_content_increase1 = content_incrase1; | ||
| out_content_increase2 = content_incrase2; | ||
| out_content_increase1 = content_increase1; | ||
| out_content_increase2 = content_increase2; |
There was a problem hiding this comment.
I also fixed a typo, that's why the diff is a bit larger here
|
I tried something similar to what the language model outputs here before I wrote #1453 , but instead of clamping, I tested the values of the content calls directly for equality. I went into a different direction because
Consider the following test using the instrumented main.cpp and the data.csv from the issue: #!/usr/bin/env bash
(cd ../../boost/libs/geometry/ && git reset --hard -q && git checkout $1 -q && sed -i '504i\cross_track_calls++;' include/boost/geometry/strategies/geographic/distance_cross_track.hpp && git branch --show-current)
g++ -g -I../../boost/ -DNDEBUG -O3 -march=$2 main.cpp -std=c++20 -o a.out && ./a.out | grep -E "^2:|query all took|cross_track calls:"Then we get: bartels@DESKTOP-BPNV3G3:~/dev/bg_1452/test$ sh test.sh develop x86-64-v2 # Baseline without issue (v2)
develop
2: 6763
query all took : 3460.036 ms
cross_track calls: 3303357
bartels@DESKTOP-BPNV3G3:~/dev/bg_1452/test$ sh test.sh develop x86-64-v3 # Issue (v3)
develop
2: 6685
query all took : 7346.178 ms
cross_track calls: 6117432
bartels@DESKTOP-BPNV3G3:~/dev/bg_1452/test$ sh test.sh fix/index-content-fp-contract-off x86-64-v3 #1453
fix/index-content-fp-contract-off
2: 6763 # Probably the same tree structure
query all took : 3496.460 ms
cross_track calls: 3306476 # Minor difference in distance calls, might be due to difference in distance itself.
bartels@DESKTOP-BPNV3G3:~/dev/bg_1452/test$ sh test.sh fix/issue-1452-index-content x86-64-v3 # This PR
fix/issue-1452-index-content
2: 6697 # Different tree structure from baseline.
query all took : 4120.981 ms
cross_track calls: 4200675 # Still a severe regression in the number of distance calls.So there almost certainly call sites of content that were not covered by the LLM-generated patch. I still like the change in itself, though, because it seems like a sensibly defensive thing to do that causes no branching and prevents potentially nonsensical values. It may be sensible to merge both PRs (the original PR also e.g. prevents issues with contraction of content at potential future call sites or existing sites unrelated to this bug if there are any). |
| content_type content_increase1 = enlarged_content1 - content1; | ||
| content_type content_increase2 = enlarged_content2 - content2; | ||
| content_type const content_increase1 = (std::max)(content_type(0), enlarged_content1 - content1); | ||
| content_type const content_increase2 = (std::max)(content_type(0), enlarged_content2 - content2); |
There was a problem hiding this comment.
Minor: These violate our max line length of 100.
Might fix #1452
@tinko92 could you verify if this helps the problem?
The fix is by Claude Code (reviewed/directed by me), it argues:
I think this makes sense, looks good - and locally it did not break any unit tests (the RST always failed for me - on the fly that is fixed now as well, see other PR)