Skip to content

reduce_sum: std::forward in fwd Loop Moves Shared Args #3303

@andrepfeuffer

Description

@andrepfeuffer

Silent Wrong Values

Files Affected

  • stan/lib/stan_math/stan/math/fwd/functor/reduce_sum.hpp
    • Line 93: Loop in auto_partitioning=false path

Root Cause Analysis

The forward differentiation (fwd) implementation of reduce_sum processes data in chunks. For each chunk, it calls the user-provided ReduceFunction:

// BEFORE (buggy)
for (size_t start = 0; start < num_chunks; ++start) {
  sum += ReduceFunction()(sub_slice, start, end, msgs,
                          std::forward<Args>(args)...);  // ← BUG
}

The Problem:

When args contains rvalue references (e.g., temporary objects passed to reduce_sum_static):

auto result = reduce_sum_static<F>(data, grainsize, nullptr,
                                   std::vector<double>{1.0, 2.0});
//                                   ^^^^^^^^^^^^^^^^^ rvalue

The std::forward<Args>(args)... unpacks and moves the rvalue references:

Iteration 1:
  std::forward<Args>(args)... → std::move(shared_arg)
  ReduceFunction() gets moved object ✅
  shared_arg is now in moved-from state

Iteration 2:
  std::forward<Args>(args)... → std::move(shared_arg)
  But shared_arg is already moved-from! ❌
  Undefined behavior — reading moved-from object

Detailed Technical Example

struct sum_with_shared_fn {
  double operator()(const std::vector<double>& slice, size_t, size_t,
                    std::ostream*, std::vector<double> shared) const {
    double s = 0;
    for (auto x : slice) s += x;
    for (auto x : shared) s += x;  // Add shared arg to sum
    return s;
  }
};

// Usage:
std::vector<double> data(4, 1.0);           // 4 elements, each = 1.0
std::vector<double> shared(2, 10.0);        // shared = {10.0, 10.0}

auto result = reduce_sum_static<sum_with_shared_fn>(
    data, 2, nullptr,                       // grainsize=2 → 2 chunks
    std::move(shared)                       // Pass shared as rvalue
);

// Expected result: (4 * 1.0) + (2 * 10.0) = 24.0
// With bug: 
//   Chunk 1: 2*1.0 + 2*10.0 = 22.0 ✅
//   Chunk 2: 2*1.0 + 0*? = 2.0 ❌ (shared moved-from)
//   Total: 24.0 expected, but 22.0 or garbage returned

Why std::forward Is Wrong Here

// Fundamental C++ rule:
// - Named rvalue references are lvalues in expressions
// - std::forward<T>(x) is only for perfectly-forwarded parameters

Args&&... args  // Rvalue references (parameters)

// In the loop:
std::forward<Args>(args)...
  // This unpacks to std::forward<T1>(arg1), std::forward<T2>(arg2), ...
  // Each std::forward performs a move (if T is rvalue)
  // → On iteration 1: moves
  // → On iteration 2+: moves moved-from object ❌

args...  // Without std::forward
  // Each arg is a named rvalue ref, which is an lvalue
  // Expression args treats them as lvalues
  // → Each iteration copies (safe to loop) ✅

The Fix

Changed: Removed std::forward<> to treat rvalue references as lvalues

// AFTER (correct)
for (size_t start = 0; start < num_chunks; ++start) {
  sum += ReduceFunction()(sub_slice, start, end, msgs,
                          args...);  // ← No std::forward
}

This is subtle but crucial:

  • Named parameters Args&&... args are declared as rvalue references
  • But in expressions, they behave as lvalues
  • Without std::forward<>, they're copied each iteration (safe)
  • This matches the prim implementation which stores args in a tuple and unpacks safely

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions