fix: clamp content differences to prevent FMA-induced numerical instability by barendgehrels · Pull Request #1455 · boostorg/geometry

barendgehrels · 2026-03-30T15:18:28Z

Might fix #1452

@tinko92 could you verify if this helps the problem?

The fix is by Claude Code (reviewed/directed by me), it argues:

 Analysis

  The core invariant being violated

  When box_exp is expanded from box_orig to include an indexable, we have box_exp ⊇ box_orig, so mathematically content(box_exp) >=
  content(box_orig) and content_diff >= 0. With FMA fusion, this invariant breaks: the multiply inside content() fuses with the
  subsequent subtraction differently for the two calls, yielding negative or spuriously non-zero results.

  The dangerous patterns

  Every call site follows one of two shapes:

  Shape 1 — "enlargement cost" (5 sites): content(expanded) - content(original)
  - insert.hpp:92
  - rstar/choose_next_node.hpp:120, 249
  - quadratic/redistribute_elements.hpp:294-295
  - linear/redistribute_elements.hpp:435-436

  Shape 2 — "free content" (1 site): content(union) - content(a) - content(b)
  - quadratic/redistribute_elements.hpp:77-79

  In all cases the result is mathematically guaranteed >= 0, but FMA can make it negative or spuriously non-zero.

  Why negative matters more than "slightly different positive"

  When content_diff becomes negative for a box that didn't actually expand, that box appears to shrink when receiving the new element —
  making it the "best" choice by a wide margin. This cascades: wrong node selections lead to a suboptimal tree structure (6685 nodes
  instead of 6763 in the reported case), which means far less effective pruning during queries.

  The fix: clamp to the known invariant

  Since we know these differences are >= 0, clamping them is mathematically sound and eliminates the FMA issue without suppressing any  compiler optimizations globally:

  content_type content_diff = (std::max)(content_type(0),
      content - index::detail::content(ch_i.first));

  This fixes both failure modes:
  - Negative values become exactly 0 (correct: the box didn't enlarge)
  - "Should be zero but is epsilon" — while (std::max) doesn't fix this, it doesn't need to. Two boxes that both don't enlarge will both   clamp to exactly 0, so equality comparisons work. Two boxes that both enlarge by a tiny positive amount are just heuristic  tie-breakers where sub-ULP differences are irrelevant.

  Interesting existing precedent

  The code at rstar/choose_next_node.hpp:135 already does epsilon-aware comparison:
  if ( min_content_diff < -std::numeric_limits<content_type>::epsilon()
    || std::numeric_limits<content_type>::epsilon() < min_content_diff )
  This shows the codebase already anticipated near-zero issues, just not systematically.

  What about the == comparisons?

  Lines like content_diff == smallest_content_diff and content1 == content2 are used as tie-breakers. After clamping, the most important
   case — two equal boxes both showing zero enlargement — will compare as exactly 0 == 0. For two genuinely different enlargements,
  sub-ULP differences in tie-breaking are harmless for the heuristic.

  Concrete changes needed:

  │ insert.hpp                          │ 92      │ Clamp content_diff                 │
  │ rstar/choose_next_node.hpp          │ 120     │ Clamp content_diff                 │
  │ rstar/choose_next_node.hpp          │ 249     │ Clamp content_diff                 │
  │ quadratic/redistribute_elements.hpp │ 77-79   │ Clamp free_content                 │
  │ quadratic/redistribute_elements.hpp │ 294-295 │ Clamp both content_increase values │
  │ linear/redistribute_elements.hpp    │ 435-436 │ Clamp both content_increase values │

I think this makes sense, looks good - and locally it did not break any unit tests (the RST always failed for me - on the fly that is fixed now as well, see other PR)

…l instability Might fix boostorg#1452

barendgehrels · 2026-03-30T15:19:11Z

include/boost/geometry/index/detail/rtree/quadratic/redistribute_elements.hpp

-                out_content_increase1 = content_incrase1;
-                out_content_increase2 = content_incrase2;
+                out_content_increase1 = content_increase1;
+                out_content_increase2 = content_increase2;


I also fixed a typo, that's why the diff is a bit larger here

tinko92 · 2026-03-31T15:06:45Z

I tried something similar to what the language model outputs here before I wrote #1453 , but instead of clamping, I tested the values of the content calls directly for equality. I went into a different direction because

it added an extra branch and prevented contraction to FMA (because the compiler needed to compute both intermediate results for comparing anyway), so no advantage over turning of contraction for content directly. This clamping is better though because max is branchless and it still allows applying FMA, so 1. is resolved.
It did not fully resolve the issue.

Consider the following test using the instrumented main.cpp and the data.csv from the issue:

#!/usr/bin/env bash

(cd ../../boost/libs/geometry/ && git reset --hard -q && git checkout $1 -q && sed -i '504i\cross_track_calls++;' include/boost/geometry/strategies/geographic/distance_cross_track.hpp && git branch --show-current)
g++ -g -I../../boost/ -DNDEBUG -O3 -march=$2 main.cpp -std=c++20 -o a.out && ./a.out | grep -E "^2:|query all took|cross_track calls:"

Then we get:

bartels@DESKTOP-BPNV3G3:~/dev/bg_1452/test$ sh test.sh develop x86-64-v2 # Baseline without issue (v2)
develop
2: 6763
query all took :   3460.036 ms
cross_track calls: 3303357

bartels@DESKTOP-BPNV3G3:~/dev/bg_1452/test$ sh test.sh develop x86-64-v3 # Issue (v3)
develop
2: 6685
query all took :   7346.178 ms
cross_track calls: 6117432

bartels@DESKTOP-BPNV3G3:~/dev/bg_1452/test$ sh test.sh fix/index-content-fp-contract-off x86-64-v3 #1453 
fix/index-content-fp-contract-off
2: 6763 # Probably the same tree structure
query all took :   3496.460 ms
cross_track calls: 3306476 # Minor difference in distance calls, might be due to difference in distance itself.

bartels@DESKTOP-BPNV3G3:~/dev/bg_1452/test$ sh test.sh fix/issue-1452-index-content x86-64-v3 # This PR
fix/issue-1452-index-content
2: 6697 # Different tree structure from baseline.
query all took :   4120.981 ms
cross_track calls: 4200675 # Still a severe regression in the number of distance calls.

So there almost certainly call sites of content that were not covered by the LLM-generated patch. I still like the change in itself, though, because it seems like a sensibly defensive thing to do that causes no branching and prevents potentially nonsensical values. It may be sensible to merge both PRs (the original PR also e.g. prevents issues with contraction of content at potential future call sites or existing sites unrelated to this bug if there are any).

tinko92

I approve merging these changes, the one formatting comment is no blocker for me.

tinko92 · 2026-03-31T15:14:18Z

include/boost/geometry/index/detail/rtree/linear/redistribute_elements.hpp

-                        content_type content_increase1 = enlarged_content1 - content1;
-                        content_type content_increase2 = enlarged_content2 - content2;
+                        content_type const content_increase1 = (std::max)(content_type(0), enlarged_content1 - content1);
+                        content_type const content_increase2 = (std::max)(content_type(0), enlarged_content2 - content2);


Minor: These violate our max line length of 100.

fix(rtree): clamp content differences to prevent FMA-induced numerica…

4aa13e5

…l instability Might fix boostorg#1452

barendgehrels requested review from awulkiew, tinko92 and vissarion March 30, 2026 15:18

barendgehrels self-assigned this Mar 30, 2026

barendgehrels commented Mar 30, 2026

View reviewed changes

barendgehrels mentioned this pull request Mar 30, 2026

Prevent FP contractions to FMA in index/.../content.hpp to avoid numerical instability in rtree #1453

Open

tinko92 approved these changes Mar 31, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: clamp content differences to prevent FMA-induced numerical instability#1455

fix: clamp content differences to prevent FMA-induced numerical instability#1455
barendgehrels wants to merge 1 commit intoboostorg:developfrom
barendgehrels:fix/issue-1452-index-content

barendgehrels commented Mar 30, 2026

Uh oh!

barendgehrels Mar 30, 2026

Uh oh!

tinko92 commented Mar 31, 2026 •

edited

Loading

Uh oh!

tinko92 left a comment •

edited

Loading

Uh oh!

tinko92 Mar 31, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

barendgehrels commented Mar 30, 2026

Uh oh!

barendgehrels Mar 30, 2026

Choose a reason for hiding this comment

Uh oh!

tinko92 commented Mar 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tinko92 left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

tinko92 Mar 31, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

tinko92 commented Mar 31, 2026 •

edited

Loading

tinko92 left a comment •

edited

Loading