libstd: Implement BigInt and BigUint. Issue #37 #4198

Merged
merged 5 commits into rust-lang:incoming from gifnksm:bigint on Jan 8, 2013

Conversation

8 participants
Contributor

gifnksm commented Dec 15, 2012

Implement BigInt (arbitrary precision integer) type.

Contributor

brson commented Dec 16, 2012

I'm excited about this. Looks cool.

Owner

huonw commented Dec 16, 2012

How does this compare to rust-gmp? Presumably the rust-gmp bindings are faster, but this has the advantage of being pure rust.

Contributor

graydon commented Dec 17, 2012

Awesome. I figured we'd bind to acme bignum (it's in rt/) but this looks like a pretty direct reimplementation in plain rust. I'm into it!

bstrie referenced this pull request Dec 17, 2012

Closed

implement 'big' type #33

Contributor

thestinger commented Dec 17, 2012

@lenny222: The only real problem with using it from rust is the licensing one (it can't be in the standard library), and it's awesome to have pure rust libraries for things like this so the performance of the language as a whole can evolve.

@dbaupp: gmp is very fast because it chooses different algorithms for a task like multiplication based on the size of the input, so it has much better asymptotic performance than most other libraries (covered here). It's also full of hand-rolled assembly and years of optimization, so it would be quite hard to compete with it on performance. I think it would be a better idea to compare against the implementations other languages use (Go, Haskell, Python, etc.) and work towards beating all of those 😉.

kud1ing referenced this pull request in Mozilla-Student-Projects/Projects-Tracker Dec 18, 2012

Closed

Implement big integers for Rust #37

kud1ing commented Dec 18, 2012

Some of the missing Rust Shootout benchmarks need big numbers.
Those benchmarks could be used to test and benchmark the new code.

See mozilla#2776

ahmadsalim commented Dec 18, 2012

I had also done a Big Integer implementation in parallel (https://github.com/ahmadsalim/rust-bigint), in case it can be of any help.

Contributor

brson commented Dec 21, 2012

@ahmadsalim Thanks!

The two major differences I see are that the @gifnksm implementation has both BigInt and BigUint, while @ahmadsalim only has BigInt. @ahmadsalim uses ~ while @gifnksm uses @.

Let's consider this carefully.

Some observations:

  • we do need to avoid @ or else big ints will be second class compared to primitive ints.
  • having both signed and unsigned versions seems appropriate, but maybe they should both go in the same std::bigint module.
Contributor

thestinger commented Dec 21, 2012

I think using ~ is definitely the way to go, and I don't think it really implies more copies. It just requires the calling code to be a bit smarter in certain cases where unnecessary allocation can be avoided. There are idioms used by the APIs of C big integer libraries to do minimal copying, but they don't map well to operator overloading.

gmp has function definitions like this:

mpz_neg(output, input)
mpz_add(output, input1, input2)

The output can be the same variable as one of the inputs, or a different one. It will (for some operations) reuse the memory allocated for the output variable, whether or not the operation is actually in-place.

Equivalents in rust:

// gmp: mpz_add(x, y, z)
x = y + z // to initialize x
x.set_add(y, z) // to re-use memory allocated to x
// gmpxx uses an operator= overload using template hackery to do this for simple cases

An operation like i.set_add(x * y, z) still has an implicit temporary value from the multiplication, but in a loop you could store it in a variable and use .set_mul(x, y) instead of reallocating each time.

// gmp: mpz_add(y, y, z)
y.set_add(y, z)
y += z // once rust can overload += separately

So basically, you just need a mutable in-place API and then a pure API implemented with that (set_neg in addition to neg, etc.).
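
A minimal sketch of that shape, using present-day Rust syntax and an illustrative little-endian Vec<u32> digit representation (this is not the PR's actual API, just the pattern described above):

// Illustrative sketch only: a pure `add` built on top of an in-place,
// allocation-reusing `set_add`. Digits are stored least-significant first.
#[derive(Clone, Debug)]
struct BigUint {
    digits: Vec<u32>,
}

impl BigUint {
    fn from_u32(n: u32) -> BigUint {
        BigUint { digits: if n == 0 { vec![] } else { vec![n] } }
    }

    // In-place: overwrite `self` with `a + b`, reusing `self`'s buffer.
    fn set_add(&mut self, a: &BigUint, b: &BigUint) {
        self.digits.clear();
        let mut carry = 0u64;
        for i in 0..a.digits.len().max(b.digits.len()) {
            let x = *a.digits.get(i).unwrap_or(&0) as u64;
            let y = *b.digits.get(i).unwrap_or(&0) as u64;
            let sum = x + y + carry;
            self.digits.push(sum as u32);
            carry = sum >> 32;
        }
        if carry != 0 {
            self.digits.push(carry as u32);
        }
    }

    // Pure: allocates a fresh result, implemented with the in-place method.
    fn add(&self, other: &BigUint) -> BigUint {
        let mut out = BigUint { digits: Vec::new() };
        out.set_add(self, other);
        out
    }
}

fn main() {
    let y = BigUint::from_u32(u32::MAX);
    let z = BigUint::from_u32(1);
    let mut x = y.add(&z);      // like `x = y + z`
    let t = x.clone();
    x.set_add(&t, &z);          // like `x += z`, reusing x's buffer
    println!("{:?}", x.digits); // [1, 1], i.e. 2^32 + 1
}

In a loop, the caller keeps one mutable result value alive and repeatedly calls set_add / set_mul on it, which is the same memory reuse the mpz_* calling convention gives C code.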

Contributor

gifnksm commented Dec 22, 2012

Thank you for the many comments.

I'll try to reimplement std::bigint using ~ vectors.
After that, I'll compare the performance of std::bigint against the big integer libraries of other languages (perl, python, ruby, haskell, gmp) with some simple benchmarks.

If possible, I would also like to implement the in-place calculation pointed out by @thestinger.

ahmadsalim commented Dec 22, 2012

@gifnksm That seems to be the best solution, as you currently have an implementation of the optimized algorithms, although different strategies could also be used depending on the size of the input, as GMP does.

Contributor

gifnksm commented Dec 23, 2012

I did some simple benchmark tests. The results are at https://gist.github.com/4360131#file-benchmark-result-txt .

This benchmark has two tests for each language. The first test, fib(n), displays the nth Fibonacci number. The second test, factorial(n), displays the factorial of n. These tests measure the time it takes to compute the result and to display it (converting to decimal).
In Haskell, because I don't know how to disable lazy evaluation, computation time and display time are not cleanly separated. Since the test only works on python2, the python test fails on Arch Linux.
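
For reference, the general shape of such a benchmark, timing computation and decimal conversion separately. This is only a hypothetical sketch using today's external num-bigint crate, not the code measured in the gist above:

// Hypothetical benchmark sketch; requires the external num-bigint crate.
use num_bigint::BigUint;
use std::time::Instant;

fn fib(n: u32) -> BigUint {
    let (mut a, mut b) = (BigUint::from(0u32), BigUint::from(1u32));
    for _ in 0..n {
        let next = &a + &b;
        a = b;
        b = next;
    }
    a
}

fn main() {
    let t = Instant::now();
    let f = fib(100_000);
    let compute = t.elapsed();
    let s = f.to_string(); // radix conversion to decimal
    let display = t.elapsed() - compute;
    println!("compute: {:?}, display: {:?}, {} digits", compute, display, s.len());
}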

My first impression is that my implementation is very slow.
Addition and multiplication times are not so bad (but still slow; I think this is not a problem with the algorithm, only with how it is programmed).
Display time is terribly slow. I think this is caused by the following two points:

  • Radix conversion is implemented by simple repeated division.
  • The division itself is slow.

For now, I'll try to implement Karatsuba radix conversion.
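
To illustrate only the divide-and-conquer idea (shown on u128 so it runs standalone; a real bignum version splits on a precomputed power of ten near the square root of the value, and the asymptotic win also needs subquadratic multiplication and division underneath):

// Sketch of divide-and-conquer radix conversion: one large division per level
// of recursion instead of peeling off one decimal digit at a time.
fn to_decimal(n: u128, digits: u32) -> String {
    if digits <= 9 {
        // Base case: small enough to format directly, zero-padded to `digits` places.
        return format!("{:0width$}", n, width = digits as usize);
    }
    let low = digits / 2;
    let divisor = 10u128.pow(low);
    let (hi, lo) = (n / divisor, n % divisor);
    format!("{}{}", to_decimal(hi, digits - low), to_decimal(lo, low))
}

fn main() {
    let n = u128::MAX; // 39 decimal digits
    assert_eq!(to_decimal(n, 39), n.to_string());
    println!("{}", to_decimal(n, 39));
}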

ahmadsalim commented Dec 23, 2012

@gifnksm The reason that Haskell does not strictly evaluate your functions is that the let-bindings themselves are lazy.
This can be solved in two ways, the first of which is to force the let-bound variable with $! when it is used (in the factorial case):

...
let fac = factorial n
return $! fac
... 

The second is to use bang patterns to strictly evaluate the variable at binding time:

{-# LANGUAGE BangPatterns #-}
...
let !fac = factorial n
...

Hopefully this answer will help you to do a more accurate comparison.

Contributor

gifnksm commented Dec 29, 2012

@ahmadsalim
I tried the first way, and the program now seems to be measured accurately.
Thank you!

Now, I'm eliminating extra memory allocation operations.
Some calculations are now 10 times faster than before!

Contributor

nikomatsakis commented Jan 8, 2013

This looks like a good start! r+ from me. I'm sure we can optimize and improve over time, but the interface seems minimal and reasonable.

brson merged commit 68c689f into rust-lang:incoming Jan 8, 2013

Contributor

brson commented Jan 8, 2013

Merged. Thanks for the thorough review @ahmadsalim, @gifnksm, @thestinger and everyone.

Contributor

brson commented Jan 9, 2013

@gifnksm I had to disable some of the tests because they fail on x86: mozilla#4393. Please give them a look. You can test by configuring with --host-triple=i686-unknown-linux-gnu (or similar).

gifnksm deleted the gifnksm:bigint branch Jan 9, 2013

Contributor

gifnksm commented Jan 9, 2013

@brson Thank you for merging! I'll try to fix #4393.
