Is it normal that to_string() for bigint takes so much(8-10 mins)? #79

Moldoteck · 2019-01-20T13:21:50Z

Well, i have an big int with ~50K digits. And when i want to print it/write it to file, to_string() takes very much. Is there a way to boost this?
P.s. i'm using --release
P.P.s this number is result for calculating factorial(1000000)

cuviper · 2019-01-24T01:16:32Z

Converting such a large binary number to decimal requires a lot of division by 10 (or powers of 10). I measured 560 seconds (9m20s) for this number from BigUint. For comparison, the same in Python took me 384 seconds (6m24s) -- same order of magnitude, but we do have room for improvement.

It's much faster to use hexadecimal -- less than 10ms.

Moldoteck · 2019-01-24T06:18:46Z

The ideea is that:
When i do something like
write!(filename,{:?},mynum)
It is almost instant, but it writes the number as json like many numbers separated by comma. It is possible to take advantage from this and print number faster by iterating the data array?

cuviper · 2019-01-24T18:57:29Z

We don't directly expose the data array, because we may change the internal details in the future, like choosing a different word size. However, if you enable the "serde" feature of num-bigint, you can use serde_json for this:
https://docs.rs/serde_json/1.0.37/serde_json/#creating-json-by-serializing-data-structures

This is quite fast, and also has the advantage that you can simply deserialize it. Any other serde format will work for this too.

programmerjake · 2019-08-25T10:10:42Z

decimal conversion takes so long because bigint doesn't use an efficient algorithm.
when using bc from GNU coreutils (which I think uses libgmp), printing 2^166096 (close to 10^50000) takes about a second on my smartphone.

using the Schönhage–Strassen algorithm for multiplication with Newton's method for division along with recursive subdivision for the base conversion should improve the algorithm to O(n * log(n)^j * log(log(n))^k) running time where j and k are small numbers that I won't bother to compute.

From what I recall, the currently implemented algorithm has O(n^3) running time, which is much worse (around 10000^3 operations, which would take around 300s at 3Gops/s).

cuviper · 2019-08-25T14:34:00Z

PRs welcome!

chirsz-ever · 2019-09-10T12:07:26Z

decimal conversion takes so long because bigint doesn't use an efficient algorithm.
when using bc from GNU coreutils (which I think uses libgmp), printing 2^166096 (close to 10^50000) takes about a second on my smartphone.

I read the source code of bc then I found that the reason why it outputs numbers so fast is that bc uses decimal digits representation internally...

programmerjake · 2019-09-10T13:25:06Z

decimal conversion takes so long because bigint doesn't use an efficient algorithm.
when using bc from GNU coreutils (which I think uses libgmp), printing 2^166096 (close to 10^50000) takes about a second on my smartphone.

I read the source code of bc then I found that the reason why it outputs numbers so fast is that bc uses decimal digits representation internally...

Well, if bc does use decimal, then all the time would be spent on converting 2^166096 to decimal in the process of calculating that power of 2, so it wouldn't be much faster.

I just tried using libgmp (through python3 with gmpy2), converting 2^33219280 (has exactly 10,000,000 decimal digits) takes around 2s on my desktop (Ryzen 5 2600).

from gmpy2 import mpz
len((mpz(1) << 33219280).digits())

I know for a fact that libgmp uses either base 2^32 or 2^64 internally.

tczajka · 2020-10-15T04:01:56Z

This will get fixed by #169.

Moldoteck · 2020-10-15T07:19:31Z

This will get fixed by #169.

Thanks 👍

seanyoung · 2023-05-24T07:52:05Z

For display purposes, we often types we don't care about the exact value, but we want a visual representation of some sort.For example I'm using bigint in solang, and if the source code is:

contract C {
    uint public a = 0x42 << 0x1000;
}

The error message is:

error: value 68929666173268065441655678907297209250277640437127329505359409896740957926122792152022569587709838850498747043152991347307179846920262378079373581798190108344433160451884859890463864948869325408752252620782492450940837043400249041214374089166742716754789321974583044468704747216847828201554886533717469914441337972474197993872495771592557222079572243926566348973999761965241895651054178391269138225751793146952429238097003702352676784449795583594206767356612476729034555526070556182110034136368176157825790350818918680381187579004110590505254414061021345580607020527860296286314915887827214755800931314516484844723571229390274210314498301135085782169058787061833048081352669155166748440562902015274828003678438834758235744983872502940358864173884374362418421893050854849978562010362466242620085674985353278126304735758052063795187738140115584115393670010597080656489254414544708321337728396892217877211121649029592465037162750628525088209561398145517576740389232774117638701953386535741992764967317125237969417691252073510511769718651225882588926353299502641873735260637594404246088340952162161101156716117693793415175187367098516802584372524774806321669598085959288834521960432288439734685792771594710206781956091110750466608176562176 does not fit into type uint256.

How about a to_string_abbreviated() which prints something like:

68929666173e500
68929666173..(500 digits)..6608176562176
68929666173..(abbreviated)..6608176562176

Not sure on the exact format - suggestions welcome.

estebank · 2023-09-27T16:24:45Z

Would the project consider accepting a PR that changes the BigInt type to be something along the lines of

pub struct BigInt<T: BigIntFormatting = Short> {
    sign: Sign,
    data: BigUint,
    formatter: PhantomData<T>,
}

In order to provide a "fast" default that prints normally only up until an acceptable size and uses scientific notation beyond that, but users still retain the ability to opt-into the current behavior relatively easily. I find this slightly better than .to_string_abbreviated() because it protects the unaware without overly restricting them.

All of this wouldn't be needed if Rust had custom formatting specifiers, but that's not the Rust we have today :)

cuviper · 2023-09-27T16:55:42Z

I think a type parameter would be way too intrusive. I also think that defaulting to an abbreviated form would be a surprising change for the "unaware" on the other end of the spectrum.

estebank · 2023-09-27T17:19:17Z

@cuviper would the prior suggestion of a new method with conditional formatting predicated on a length check be acceptable instead?

cuviper · 2024-03-05T01:37:04Z

Let's maybe start with an abbreviated form, and leave conditionals external for now. I fear that might still be somewhat slow to find the upper decimal digits, but maybe it can discard a lot with large divisors. I'm open to experimentation.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Is it normal that to_string() for bigint takes so much(8-10 mins)? #79

Is it normal that to_string() for bigint takes so much(8-10 mins)? #79

Moldoteck commented Jan 20, 2019

cuviper commented Jan 24, 2019

Moldoteck commented Jan 24, 2019

cuviper commented Jan 24, 2019

programmerjake commented Aug 25, 2019

cuviper commented Aug 25, 2019

chirsz-ever commented Sep 10, 2019

programmerjake commented Sep 10, 2019

tczajka commented Oct 15, 2020

Moldoteck commented Oct 15, 2020

seanyoung commented May 24, 2023

estebank commented Sep 27, 2023

cuviper commented Sep 27, 2023

estebank commented Sep 27, 2023

cuviper commented Mar 5, 2024

Is it normal that to_string() for bigint takes so much(8-10 mins)? #79

Is it normal that to_string() for bigint takes so much(8-10 mins)? #79

Comments

Moldoteck commented Jan 20, 2019

cuviper commented Jan 24, 2019

Moldoteck commented Jan 24, 2019

cuviper commented Jan 24, 2019

programmerjake commented Aug 25, 2019

cuviper commented Aug 25, 2019

chirsz-ever commented Sep 10, 2019

programmerjake commented Sep 10, 2019

tczajka commented Oct 15, 2020

Moldoteck commented Oct 15, 2020

seanyoung commented May 24, 2023

estebank commented Sep 27, 2023

cuviper commented Sep 27, 2023

estebank commented Sep 27, 2023

cuviper commented Mar 5, 2024