Summary
Paraformer::Forward in runtime/onnxruntime/src/paraformer.cpp reads an ONNX int32 tensor (outputTensor[1], encoder_out_lens) as int64_t*, dereferencing 8 bytes from a 4-byte allocation. AddressSanitizer catches a heap-buffer-overflow read on every single inference call.
Location
runtime/onnxruntime/src/paraformer.cpp (current main), line 512 (and consumers at lines 529, 531, 538, 540):
auto outputTensor = m_session_->Run(...);
...
auto encoder_out_lens = outputTensor[1].GetTensorMutableData<int64_t>();
...
result = GreedySearch(floatData, *encoder_out_lens, outputShape[2]); // line 538
For at least the paraformer-large-contextual ONNX model (speech_paraformer-large-contextual_asr_nat-zh-cn-16k-common-vocab8404-onnx from ModelScope), outputTensor[1] is allocated as a 4-byte int32 tensor. Treating it as int64_t* and dereferencing reads 8 bytes, overflowing 4 bytes past the buffer.
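For reference, the emitted dtype can be checked directly on the returned Ort::Value. This is a diagnostic sketch, not part of the fix, using only the documented ONNX Runtime C++ API; it can be dropped in right after the Run() call:

// Diagnostic only: dump the element type and count of encoder_out_lens.
// The ONNXTensorElementDataType enum streams as a plain int (needs <iostream>).
// With the ModelScope model above, GetElementType() returns the INT32 value.
auto lens_info = outputTensor[1].GetTensorTypeAndShapeInfo();
std::cerr << "encoder_out_lens dtype=" << lens_info.GetElementType()
          << " (INT32=" << ONNX_TENSOR_ELEMENT_DATA_TYPE_INT32
          << ", INT64=" << ONNX_TENSOR_ELEMENT_DATA_TYPE_INT64 << ")"
          << ", elements=" << lens_info.GetElementCount() << std::endl;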
ASAN evidence
Rebuilt with -fsanitize=address and triggered a single inference call:
==1==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x... at pc 0x... thread T...
READ of size 8 at 0x... thread T...
#0 0x... in funasr::Paraformer::Forward[abi:cxx11](float**, int*, bool, ...) /onnxruntime/src/paraformer.cpp:538
#1 0x... in FunTpassInferBuffer ... /onnxruntime/src/funasrruntime.cpp:575
#2 0x... in funasr_tpass_offline_infer ... /onnxruntime/src/funasr_capi.cpp:287
0x...d44 is located 0 bytes to the right of 4-byte region [0x...d40, 0x...d44)
allocated by thread T... here:
#0 0x... in __interceptor_posix_memalign
#1 0x... in libonnxruntime.so.1.14.0
#2 0x... in libonnxruntime.so.1.14.0
The allocation is exactly 4 bytes (the int32 element). The read is 8 bytes (int64_t). Triggered every time the model runs, on the very first inference.
Why production typically doesn't crash
The 4-byte overrun reads into ONNX Runtime's posix_memalign padding / glibc tcache freelist metadata, which usually holds zeros. On a little-endian host, the truncation of *encoder_out_lens back to int at the GreedySearch/BeamSearch call sites then discards exactly the 4 overread bytes, so the result happens to equal the correct value.
This is undefined behavior nonetheless. Any change in glibc allocator behavior (a tcache fill-pattern change in a future glibc version, swapping in jemalloc, ASAN/MSAN/HWASAN builds, or simply different load patterns), or an optimizer that exploits the out-of-bounds read, can make this produce garbage values and wrong decoder output silently, with no crash to flag it.
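To make the truncation mechanism concrete, here is a minimal standalone sketch with hypothetical byte values, assuming a little-endian host (a memcpy from a valid 8-byte buffer stands in for the actual out-of-bounds load):

#include <cassert>
#include <cstdint>
#include <cstring>

int main() {
    // A real int32 length (42) followed by 4 bytes standing in for whatever
    // happens to sit past the end of the 4-byte allocation.
    unsigned char bytes[8] = {0x2A, 0x00, 0x00, 0x00, 0xDE, 0xAD, 0xBE, 0xEF};
    int64_t wide;
    std::memcpy(&wide, bytes, sizeof wide);  // stand-in for the 8-byte load
    int narrow = static_cast<int>(wide);     // the truncation at the call site
    assert(narrow == 42);  // little-endian: the overread bytes are discarded
    return 0;
}

This shows why the bug stays invisible today; it is not an argument that the read is safe, since the load itself remains out of bounds.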
Proposed fix
GreedySearch(float*, int n_len, ...) and BeamSearch(... int len, ...) (declared in the same file) take their length argument as int. Reading the tensor as int32_t* and passing *encoder_out_lens (4 bytes) to a function expecting int is, on little-endian hosts, semantically identical to the current int64_t* → truncate path, but without the UB:
- auto encoder_out_lens = outputTensor[1].GetTensorMutableData<int64_t>();
+ auto encoder_out_lens = outputTensor[1].GetTensorMutableData<int32_t>();
Verified: with the one-line change, ASAN no longer reports the OOB on inference, and the decoded text is byte-identical to what the existing build produces. Tested on paraformer-large-contextual end-to-end with hundreds of requests.
Suggested follow-up
If the ONNX model can in principle emit either int32 or int64 for this output across model variants, the proper fix is to inspect the tensor element type at runtime via GetTensorTypeAndShapeInfo().GetElementType() and branch. But for at least the published ModelScope paraformer-large-contextual model, the emitted dtype is unambiguously int32, and the current code is reading it incorrectly.
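A hedged sketch of that dispatch, using only documented Ort C++ API calls (the helper name ReadEncoderOutLen is illustrative, not from the codebase):

#include <onnxruntime_cxx_api.h>
#include <stdexcept>

// Read the scalar length output as int64_t whether the model emits
// int32 or int64 for encoder_out_lens.
static int64_t ReadEncoderOutLen(const Ort::Value& t) {
    switch (t.GetTensorTypeAndShapeInfo().GetElementType()) {
        case ONNX_TENSOR_ELEMENT_DATA_TYPE_INT32:
            return static_cast<int64_t>(*t.GetTensorData<int32_t>());
        case ONNX_TENSOR_ELEMENT_DATA_TYPE_INT64:
            return *t.GetTensorData<int64_t>();
        default:
            throw std::runtime_error("encoder_out_lens: unexpected element type");
    }
}

The call sites would then pass static_cast<int>(ReadEncoderOutLen(outputTensor[1])) into GreedySearch/BeamSearch, leaving their existing int interfaces untouched.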