arrow::Decimal128::FromString silently truncates when the input string has more than 38 significant digits #49817

@dmitry-chirkov-dremio

Description

Describe the bug, including details regarding any error messages, version, and platform.

arrow::Decimal128::FromString silently truncates (wraps mod 2¹²⁸) when the input string has more than 38 significant digits, producing a garbage unscaled value and returning Status::OK. The out-parameter precision does reflect the true parsed precision (> 38), but the Decimal128 value itself is already corrupted, and most callers do not validate precision against Decimal128::kMaxPrecision.

The same issue applies symmetrically in Decimal256::FromString when precision > 76 - both paths share DecimalFromString<T> in cpp/src/arrow/util/decimal.cc.

Why?

DecimalFromString feeds the digit string into ShiftAndAdd, which multiplies-and-adds into a fixed-size uint64_t array sized to the target decimal's bit width:

std::array<uint64_t, Decimal::kBitWidth / 64> little_endian_array{};
ShiftAndAdd(dec.whole_digits, little_endian_array.data(),
            little_endian_array.size());
ShiftAndAdd(dec.fractional_digits, little_endian_array.data(),
            little_endian_array.size());

ShiftAndAdd carries high bits only through out_size limbs and drops the remaining carry on the floor - no overflow check. DecimalFromString then returns Status::OK() without validating parsed_precision <= kMaxPrecision (and also without validating parsed_scale <= kMaxScale).

Relevant code (paths as of apache/arrow main):

  • ShiftAndAdd - cpp/src/arrow/util/decimal.cc:747
  • DecimalFromString - cpp/src/arrow/util/decimal.cc:857

Repro

// arrow_decimal_repro.cc
#include <arrow/util/decimal.h>
#include <iostream>
#include <string>

int main() {
  const std::string input =
      "1.55555555555555555555555555555555555555555555555555";

  arrow::Decimal128 dec;
  int32_t precision = 0, scale = 0;
  auto st = arrow::Decimal128::FromString(input, &dec, &precision, &scale);

  std::cout << "input:     " << input << "\n"
            << "status:    " << st.ToString() << "\n"
            << "precision: " << precision << "\n"
            << "scale:     " << scale << "\n"
            << "dec (raw): " << dec.ToString(0) << "\n";
  return 0;
}

Observed output (Arrow main, macOS arm64, clang++ -std=c++17):

input:     1.55555555555555555555555555555555555555555555555555
status:    OK
precision: 51
scale:     50
dec (raw): 151473969238762845885664163847966963939

That unscaled value is exactly the input's significand (the 51-digit integer 155555555555555555555555555555555555555555555555555) reduced mod 2¹²⁸:

>>> 155555555555555555555555555555555555555555555555555 % (1 << 128)
151473969238762845885664163847966963939

Desired (?) behavior

Return Status::Invalid when parsed precision exceeds kMaxPrecision (or parsed scale exceeds kMaxScale) for the target decimal type, matching the error-reporting contract of Decimal::Rescale and friends.

Impact (breaking change?)

Any caller that trusts FromString to fail on invalid input will silently get wrong answers for user-supplied decimal strings with more than 38 significant digits. Downstream, Gandiva's castDECIMAL_utf8 - and any compute kernel that routes VARCHAR-to-DECIMAL through FromString before rescaling via Decimal128::Rescale / gandiva::decimalops::Convert - returns numerically wrong results for perfectly legal SQL casts such as:

CAST('1.55555555555555555555555555555555555555555555555555' AS DECIMAL(38,37))
-- expected: 1.5555555555555555555555555555555555556  (HALF_UP)
-- observed:  0.0000000000015147396923876284588566416

Component(s)

C++, Gandiva
