[Possible Bug] - Display panics for large numbers #108

DanielBauman88 · 2023-07-19T23:31:04Z

Is this intentional? I think it'd be an improvement to avoid panics in display, and simply use a different display format for numbers that can't be displayed with the standard implementation.

EG: the following test will panic

    #[test]
    fn test_big_scale() {
        let min: i64 = i64::MIN;
        let str = &format!("1E{min}");
        BigDecimal::from_str(str).unwrap().to_string();
    }

Will panic with "attempt to subtract with overflow"

Same for example for

    #[test]
    fn test_small_scale() {
        let max: i64 = i64::MAX;
        let str = &format!("1E{}", max);
        BigDecimal::from_str(str).unwrap().to_string();
    }

The text was updated successfully, but these errors were encountered:

akubera · 2023-07-20T02:15:08Z

Ok: edge case comes from subtracting the length of the "decimal" string from the scale, counting how many digits should come before the decimal point:

bigdecimal-rs/src/lib.rs

Line 1957 in 2633d23

let location = abs_int.len() as i64 - self.scale;

Interesting, I wrote this unit test

        let dec = BigDecimal::new(1.into(), stdlib::i64::MAX);
        let expected = format!("1E{}", stdlib::i64::MAX);
        assert_eq!(dec.to_string(), expected);

and got memory allocation of 9223372036854775806 bytes failed

So... that's a bad sign.

akubera · 2023-07-20T02:33:34Z

It boils down to we don't support exponential syntax? In which case, anything close to i64 min or max (or even the 4GB of i32 min/max) will cause problems.

I don't know what heuristic to use for formatting decimals in exponential-form vs standard-form.

Definitely a bug that needs fixed, but I'm not too worried about it as it's been there for 6 years. It's on the todo list

DanielBauman88 · 2023-07-20T16:13:37Z

Another issue here which isn't related to the edge case, it's not obvious that display will allocate so much memory for large numbers.

EG - try those examples just below the limit. Display does huge allocations.

    #[test]
    fn test_big_scale() {
        let max: i64 = i64::MAX - 2;
        let str = &format!("1E{}", max);
        BigDecimal::from_str(str).unwrap().to_string();
    }

The to_string call here allocates 9223372036854775805 bytes of memory
https://github.com/akubera/bigdecimal-rs/blob/main/src/lib.rs#L1694-L1694

Doesn't seem necessary for displaying. Would be better to have some limit where you display in exponential form.

edit: I see now that you came across the same problem here - #108 (comment)

**Description** Add a second path to the `Display` impl so that numbers that have a large scale relative to length of their integer representation are formatted differently. Specifically, I switched it to use exponential number format like: ``` 1.23456789E38 ``` instead of ``` 123456789000000000000000000000000000000 ``` **Motivation** If you extrapolate from the example above, and also looking at issue #108, you could see that with large exponents you run the risk of large allocations, since the resulting bytestring is proportional in size to the exponent. Any software using the `bigdecimal` crate and converting unprotected uer input into the `BigDecimal` type runs the risk of allocating way too much and potentially crashing, if they happen to use either the `Debug` or `Display` impls. This change is useful because it bounds the output and prevents unexpected large allocations when formatting the `BigDecimal` type **Testing Done** `cargo test && ./scripts/bigdecimal-property-tests test` **Backwards Compatibility Criteria** I believe that this does not constitue a backwards incompatible change because the `BigDecimal::from_str` method already supports exponential format numbers as input.

ddimaria · 2024-02-21T14:40:38Z

We're running into the same issue in production when converting a large BigDecimal to a string. Design-wise, the to_string() method should return a result or there should be different ways of displaying the stringified BigDecimal that avoids panics. Are you open to a PR to fix this?

akubera · 2024-02-22T05:04:47Z

Does the branch in MR #121 fix the issue?

akubera · 2024-02-22T05:27:06Z

It did not.

This line had a potential out of bounds error if the exponent was near the i64 boundary, and had to be shifted forward or backwards past the boundaries when displaying in scientific notation:

bigdecimal-rs/src/impl_fmt.rs

Line 233 in dbf3b48

let exponent = abs_int.len() as i64 + exp - 1;

Casting the length and exp to i128's will solve that particular issue.

akubera · 2024-02-23T00:26:40Z

I casted a bunch of i64's to i128's before doing formatting stuff, and added some tests where the exponent is near the i64 boundary:

bigdecimal-rs/src/impl_fmt.rs

Lines 563 to 572 in 72818b7

    
               impl_case!(test_max: format!("1E{}", i64::MAX) => "1E+9223372036854775807"); 
        
               impl_case!(test_max_multiple_digits: format!("314156E{}", i64::MAX) => "3.14156E+9223372036854775812"); 
        
               impl_case!(test_min_scale: "1E9223372036854775808" => "1E+9223372036854775808"); 
        
               impl_case!(test_max_scale: "1E-9223372036854775807" => "1E-9223372036854775807"); 
        
               impl_case!(test_min_multiple_digits: format!("271828182E-{}", i64::MAX) => "2.71828182E-9223372036854775799"); 
        
               impl_case!((panics) test_max_exp_overflow: "1E9223372036854775809"); 
        
               impl_case!((panics) test_min_exp_overflow: "1E-9223372036854775808"); 
        
           }

Those test round-tripping, so everything rendered in .to_string() is able to be parsed (which wasn't true before).

akubera · 2024-02-23T00:42:59Z

Where did these issues come up? I know Python limits Decimal exponents to 10^18 (about 60 bits) and Java uses 32-bit a scale so I thought 64 bits would be enough for anybody. But I guess not?

Obviously these were bugs needing to be fixed, but I'm curious where they come up?

ddimaria · 2024-02-23T15:15:47Z

Is was while importing a user-submitted CSV. It comes in as text, then BigDecimal parses the exponential number successfully (signals that the text was a number). When we serialize the value we get the error. Our temp fix is to manually check for overflows.

akubera · 2024-03-06T01:14:06Z

Should be fixed in 0.4.3!

akubera added the bug label Jul 20, 2023

akubera mentioned this issue Jul 25, 2023

Comparisons with large exponents are extremely slow #109

Open

declanvk mentioned this issue Jan 8, 2024

Use exponential format for decimals with relatively large scale #121

Merged

akubera closed this as completed Mar 6, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Possible Bug] - Display panics for large numbers #108

[Possible Bug] - Display panics for large numbers #108

DanielBauman88 commented Jul 19, 2023 •

edited

akubera commented Jul 20, 2023

akubera commented Jul 20, 2023

DanielBauman88 commented Jul 20, 2023 •

edited

ddimaria commented Feb 21, 2024

akubera commented Feb 22, 2024

akubera commented Feb 22, 2024

akubera commented Feb 23, 2024 •

edited

akubera commented Feb 23, 2024

ddimaria commented Feb 23, 2024

akubera commented Mar 6, 2024

[Possible Bug] - Display panics for large numbers #108

[Possible Bug] - Display panics for large numbers #108

Comments

DanielBauman88 commented Jul 19, 2023 • edited

akubera commented Jul 20, 2023

akubera commented Jul 20, 2023

DanielBauman88 commented Jul 20, 2023 • edited

ddimaria commented Feb 21, 2024

akubera commented Feb 22, 2024

akubera commented Feb 22, 2024

akubera commented Feb 23, 2024 • edited

akubera commented Feb 23, 2024

ddimaria commented Feb 23, 2024

akubera commented Mar 6, 2024

DanielBauman88 commented Jul 19, 2023 •

edited

DanielBauman88 commented Jul 20, 2023 •

edited

akubera commented Feb 23, 2024 •

edited