[tmva][sofie] ReduceMean: loop variable typo in kFirst path causes infinite loop or wrong result #21682

@harz05

Check duplicate issues.

  • Checked for duplicates

Description

After the outer accumulation loop finishes, i holds the fixed value reducedLength. The division loop then checks i < outputLength, a condition that never changes because the loop only increments j. When that condition is true, j increments forever, reading and writing memory far out of bounds.

I can think of two failure modes depending on tensor shape:

  • reducedLength < outputLength (e.g. reducing axis 0 of a [3, 4] tensor: reducedLength=3, outputLength=4, so 3 < 4 is always true): infinite loop + out-of-bounds memory access at inference time
  • reducedLength >= outputLength: division loop never executes at all, returning the sum instead of the mean
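
The two cases above can be modelled without running the generated code. The following is a minimal Python sketch, not SOFIE code: simulate_division_loop and its max_steps cap are hypothetical names introduced here, and it assumes, as described above, that i stays at reducedLength while only j advances.

```python
def simulate_division_loop(reduced_length, output_length, max_steps=1000):
    """Model `for (j = 0; i < outputLength; j++)` with i stuck at reduced_length.

    Returns (steps_executed, first_out_of_bounds_index_or_None, hit_cap).
    """
    i = reduced_length            # assumed value of i after the accumulation loop
    j = 0
    first_oob = None
    while i < output_length:      # buggy condition: tests i, never j
        if j >= max_steps:        # artificial cap so this model terminates
            return j, first_oob, True
        if j >= output_length and first_oob is None:
            first_oob = j         # the real loop would touch tensor_Y[j] here
        j += 1
    return j, first_oob, False


print(simulate_division_loop(3, 4))  # shape [3, 4], axis 0: runs until the cap
print(simulate_division_loop(4, 3))  # shape [4, 3], axis 0: loop body never runs
```

In the first call the model only stops because of the cap, and the first index past the output buffer is 4. In the second call no division happens, so the output would keep the raw sums.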

I've also reported this on the SOFIE repo: ML4EP/SOFIE#17

Reproducer

Step 1: create a minimal ONNX model with ReduceMean over axis 0:

import onnx
from onnx import helper, TensorProto

X = helper.make_tensor_value_info('X', TensorProto.FLOAT, [3, 4])
Y = helper.make_tensor_value_info('Y', TensorProto.FLOAT, [4])
node = helper.make_node('ReduceMean', inputs=['X'], outputs=['Y'], axes=[0], keepdims=0)
graph = helper.make_graph([node], 'test', [X], [Y])
onnx.save(helper.make_model(graph), '/tmp/reduce_test.onnx')

Step 2: parse and generate with SOFIE:

import ROOT
parser = ROOT.TMVA.Experimental.SOFIE.RModelParser_ONNX()
m = parser.Parse('/tmp/reduce_test.onnx')
m.Generate()
m.OutputGenerated('/tmp/reduce_test.hxx')

Step 3: check the generated loops:

grep -n "for (size_t j" /tmp/reduce_test.hxx
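
The grep output still has to be read by eye. As a rough alternative, a small script could flag any for loop whose condition tests a different variable than the one it declares. This is only a sketch: find_mismatched_loops and LOOP_RE are hypothetical helpers, and the regular expression assumes the simple single-line `for (size_t x = ...; y < ...; ...)` form shown in this report.

```python
import re

# Captures the declared loop variable and the variable tested in the condition.
LOOP_RE = re.compile(
    r"for\s*\(\s*size_t\s+(\w+)\s*=[^;]*;\s*(\w+)\s*(?:<=|<|!=)[^;]*;"
)


def find_mismatched_loops(source):
    """Return (line_number, line) pairs where the condition variable differs."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        m = LOOP_RE.search(line)
        if m and m.group(1) != m.group(2):
            hits.append((lineno, line.strip()))
    return hits


sample = """\
for (size_t i = 0; i < 3; i++) {
    for (size_t j = 0; j < 4; j++) {
        tensor_Y[j] += tensor_X[i * 4 + j];
    }
}
for (size_t j = 0; i < 4; j++) {
    tensor_Y[j] /= static_cast<float>(3);
}
"""

for lineno, line in find_mismatched_loops(sample):
    print(lineno, line)
```

To check the generated header instead of the inline sample, pass the contents of /tmp/reduce_test.hxx to find_mismatched_loops.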

The full generated doInfer function makes the problem clear: the division loop increments j, but its condition tests i, the counter of the outer accumulation loop:

for (size_t i = 0; i < 3; i++) {        // i ends at 3 after this loop
    for (size_t j = 0; j < 4; j++) {
        tensor_Y[j] += tensor_X[i * 4 + j];
    }
}
for (size_t j = 0; i < 4; j++) {        // i is still 3, 3<4 is always true -> infinite loop
    tensor_Y[j] /= static_cast<float>(3);
}
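
For comparison, here is a Python sketch of what the loops are presumably meant to compute once the division loop is bounded by j. It is a reference model under that assumption, not the actual patch: reduce_mean_first_axis is a hypothetical name, and the row-major indexing mirrors the `i * 4 + j` access in the generated code.

```python
def reduce_mean_first_axis(x_flat, reduced_length, output_length):
    """Mean over the leading axis of a row-major [reduced_length, output_length] tensor."""
    y = [0.0] * output_length
    # Accumulation: same structure as the generated outer and inner loops.
    for i in range(reduced_length):
        for j in range(output_length):
            y[j] += x_flat[i * output_length + j]
    # Division: bounded by j, so every output element is divided exactly once.
    for j in range(output_length):
        y[j] /= float(reduced_length)
    return y


x = [float(v) for v in range(12)]        # a [3, 4] tensor holding 0..11
print(reduce_mean_first_axis(x, 3, 4))   # [4.0, 5.0, 6.0, 7.0]
```

Values like these can serve as the expected output when checking a fixed build against the reproducer model.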

ROOT version

6.39.01 (built from source)

Installation method

Built from source with -Dtmva-sofie=ON -Dtmva-pymva=ON -Dbuiltin_protobuf=ON

Operating system

Linux (Ubuntu 22.04)

Additional context

kMiddle and kLast paths are unaffected. Only ReduceMean on leading axes triggers the kFirst path.
