Skip to content

Mixed scalar/array fill zeroes the whole histogram when a non-inclusive axis drops the first entry #426

@henryiii

Description

@henryiii

The following report was generated from Claude Code (Opus 4.8) from an investigation of a boost-histogram issue.

🤖 AI text below. 🤖

Summary

When histogram::fill is called with a mix of an array argument and a scalar argument
(the scalar is broadcast over the array length), and one axis is non-inclusive
(no underflow/overflow, no growth) so that some entries fall out of range, the result
can be that every entry is discarded — not just the out-of-range ones. The whole
histogram ends up empty.

The trigger is specifically that the first entry of an index buffer chunk is dropped
by an axis that is processed before a scalar axis. The single-value operator() fill
path is unaffected; only the bulk fill path (detail/fill_n.hpp) is wrong.

This was reported downstream as scikit-hep/boost-histogram#960, where it silently zeroes
real analysis histograms (categorical axis used to "ignore" unmatched values without an
overflow bin).

Minimal reproducer (no downstream code)

#include <boost/histogram.hpp>
#include <boost/variant2/variant.hpp>
#include <iostream>
#include <vector>

namespace bh = boost::histogram;

int main() {
  // Axis 0: integer category, NO under/overflow, NO growth -> non-inclusive
  auto a0 = bh::axis::category<int, bh::use_default, bh::axis::option::none_t>({1, 2, 3});
  // Axis 1: ordinary axis; receives a scalar (broadcast) argument in the batched fill
  auto a1 = bh::axis::category<int>({7});

  // ground truth: single-value fills, one entry at a time
  auto h1 = bh::make_histogram(a0, a1);
  std::vector<int> xs = {0, 1, 2, 3}; // first element (0) is out of range on axis 0
  for (int xi : xs) h1(xi, 7);
  std::cout << "single-value fill sum   = " << bh::algorithm::sum(h1) << "\n";

  // batched fill: array on axis 0, scalar (broadcast) on axis 1
  auto h2 = bh::make_histogram(a0, a1);
  using V = boost::variant2::variant<int, std::vector<int>>;
  V xy[2];
  xy[0] = xs; // array on the non-inclusive axis; FIRST element is out of range
  xy[1] = 7;  // scalar on the second axis
  h2.fill(xy);
  std::cout << "batched (mixed) fill sum = " << bh::algorithm::sum(h2) << "\n";

  std::cout << "expected = 3 (only the out-of-range 0 is dropped)\n";
}

Output (Boost 1.90):

single-value fill sum   = 3
batched (mixed) fill sum = 0      <-- BUG, should be 3
expected = 3 (only the out-of-range 0 is dropped)

If the out-of-range element is moved so it is not the first element of the array
(e.g. {1, 2, 3, 0}), the bug disappears and the sum is correctly 3 — which points
straight at the root cause.

Root cause

include/boost/histogram/detail/fill_n.hpp, index_visitor::call_1 for the scalar
(broadcast) case:

template <class T>
void call_1(std::true_type, const T& value) const {
  // T is compatible value; fill single value N times
  const auto before = *begin_;
  call_2(IsGrowing{}, begin_, value);
  if (is_valid(*begin_)) {
    const auto delta =
        static_cast<std::intptr_t>(*begin_) - static_cast<std::intptr_t>(before);
    for (auto it = begin_ + 1; it != begin_ + size_; ++it) *it += delta;
  } else
    std::fill(begin_, begin_ + size_, invalid_index);   // <-- wrongly fires
}

The optimization computes the scalar's per-axis contribution once, using the first
index buffer entry *begin_ as the representative, then broadcasts the delta to the rest.

Axes are processed in order in fill_n_indices. When a preceding non-inclusive axis
maps the first array element out of range, it sets *begin_ = invalid_index
(optional_index is "sticky": operator+= on an already-invalid value is a no-op).
The scalar axis then runs call_1:

  • before = *begin_ is already invalid_index.
  • call_2 linearizes the (valid) scalar into *begin_, but *begin_ += idx*stride
    does nothing because it is already invalid, so *begin_ stays invalid.
  • is_valid(*begin_) is therefore false, and the else branch
    std::fill(begin_, begin_ + size_, invalid_index) invalidates all entries —
    even the ones that were perfectly valid.

So the validity of the first entry (which an earlier axis may have already killed) is
mistaken for the validity of the scalar on this axis, and one dropped leading entry
takes the entire buffer down with it. (For buffers larger than 1ul << 14, this happens
per chunk, whenever a chunk's first entry was dropped.)

Proposed fix

Compute the scalar's contribution on a fresh, valid temporary index instead of reading
it back from *begin_, then broadcast it to every entry. optional_index::operator+=
leaves entries already invalidated by previous axes untouched, while a scalar that is
itself out of range for this axis still invalidates every entry (because the temporary
becomes invalid). Behaviour is identical to the old code whenever *begin_ was valid.

+  // Compute the index contribution of a scalar value into a temporary that is
+  // independent of the existing index buffer. Mirrors call_2, but writes into `out`
+  // (initially a valid zero) instead of an index buffer entry, so there is no earlier
+  // index to shift; the growth shift, if any, is still recorded for the grower.
+  template <class T>
+  void scalar_index(std::true_type, index_type& out, const T& x) const {
+    axis::index_type shift;
+    linearize_growth(out, shift, stride_, axis_,
+                     try_cast<value_type, std::invalid_argument>(x));
+    if (shift > 0) *shift_ += shift;
+  }
+
+  template <class T>
+  void scalar_index(std::false_type, index_type& out, const T& x) const {
+    linearize(out, stride_, axis_, try_cast<value_type, std::invalid_argument>(x));
+  }
+
   template <class T>
   void call_1(std::true_type, const T& value) const {
     // T is compatible value; fill single value N times
 
-    // Optimization: We call call_2 only once and then add the index shift onto the
-    // whole array of indices, because it is always the same. This also works if the
-    // axis grows during this operation. There are no shifts to apply if the zero-point
-    // changes.
-    const auto before = *begin_;
-    call_2(IsGrowing{}, begin_, value);
-    if (is_valid(*begin_)) {
+    // Optimization: the value is a scalar, so its index contribution to this axis is
+    // the same for every entry. We compute that contribution once and add it to the
+    // whole array of indices. This also works if the axis grows during this operation.
+    //
+    // The contribution is computed on a fresh, valid temporary index rather than read
+    // back from *begin_: a previous axis may already have invalidated *begin_, and
+    // deriving the contribution from an invalid index would incorrectly invalidate the
+    // entire buffer (see https://github.com/scikit-hep/boost-histogram/issues/960).
+    // Adding the contribution with operator+= leaves entries already invalidated by
+    // previous axes untouched, while a scalar that is itself out of range for this axis
+    // correctly invalidates every entry.
+    index_type tmp{};
+    tmp = 0;
+    scalar_index(IsGrowing{}, tmp, value);
+    if (is_valid(tmp)) {
       // since index can be std::size_t or optional_index, must do conversion here
-      const auto delta =
-          static_cast<std::intptr_t>(*begin_) - static_cast<std::intptr_t>(before);
-      for (auto it = begin_ + 1; it != begin_ + size_; ++it) *it += delta;
+      const auto delta = static_cast<std::intptr_t>(tmp);
+      for (auto it = begin_; it != begin_ + size_; ++it) *it += delta;
     } else
       std::fill(begin_, begin_ + size_, invalid_index);
   }

Note: the original call_2(true_type, ...) also contains a
while (it != begin_) *--it += shift * stride_; loop that shifts earlier entries when a
growing axis changes its zero-point. In the scalar broadcast case that loop is always a
no-op (the single linearization happens at begin_), so scalar_index omits it and only
records *shift_ += shift for the storage grower — equivalent for this code path, and it
also avoids ever passing a non-buffer pointer into that decrement loop.

Verification

  • The reproducer above prints 3 for both fills after the patch.
  • Re-ordering so the out-of-range element is not first already gave 3 before and still does.
  • A growing second axis filled with a scalar (scalar_index(std::true_type, …) branch)
    matches the single-value fill (sum and grown axis size).
  • scikit-hep/boost-histogram's full test suite (940 tests) passes with the vendored copy
    patched; the downstream #960 reproducer now yields the expected non-zero sums.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions