improve bigFactorial algorithm #1687

bartekleon · 2019-12-07T04:28:39Z

You can use different algorithms of factorial to speed this 2-8 times. Universal one (twice as fast):
[i am using bigint's in implementation, but logic is the same]

function factorial(n) {
  if(n === 0n || n === 1n) {
    return 1n;
  }
  if(n % 2n === 1n) {
    return n * factorial(n - 1n);
  }

  let p = n;
  let prod = n;

  while(p > 2n) {
    p -= 2n;
    n += p;

    prod *= n;
  }

  return prod;
}

If you are interested in the less universal (faster than this after n=1400), I could provide it in comment

josdejong · 2019-12-09T18:55:20Z

Performance improvements are of course always welcome Bartosz :) Would you like to work this out in a Pull request, or can we ask the community to pick this up?

I think it will be useful to write a little benchmark to make sure we're actually improving (see /benchmark folder.

bartekleon · 2019-12-09T20:30:03Z

@josdejong for now, I have benchmarks for my implementation. This algorithm is written to be fast tho. Should be always ~2 times faster since you have twice less numbers
20! = 1 * 2 * 3 * 4 * ... * 20 ( 20 terms )
20! = 20 * (20 + 18) * (20 + 18 + 16) * (20 + ... + 2) (9 terms or so)

And yes, I can take this feature :)

bartekleon · 2019-12-09T20:36:05Z

@josdejong I have one question tho. Should I use this math library or can I use BigInt?

bartekleon · 2019-12-09T21:43:55Z

After a small test, it seems it is faster (from 8! tho, but we can hardcode first 7 factorials I guess)

bartekleon · 2019-12-11T00:29:43Z

@josdejong would you also like a proof paper for this algorithm?

Proof for this algorithm: https://docs.google.com/document/d/13CYoqFFkcwNYRN8ObKEvyXpkCP8GSf14n9G7XeWK1OI/edit?usp=sharing

bartekleon · 2019-12-11T03:19:07Z

Update: I have been working theoretically investigating this subject, and I have managed to make this algorithm twice as fast, these are results on this moment (right now I have implemented square difference factorial, as you can see in PR, although I am planning to check other possibilities too)

calculating speed for factorial 10        
factorial by definition: 0.088ms
square difference factorial: 0.045ms      
doubleFactorial square difference: 0.082ms
recursive binary factorial: 0.112ms       

calculating speed for factorial 100       
factorial by definition: 0.025ms
square difference factorial: 0.024ms      
doubleFactorial square difference: 0.026ms
recursive binary factorial: 0.052ms       

calculating speed for factorial 1000      
factorial by definition: 1.438ms
square difference factorial: 0.357ms      
doubleFactorial square difference: 0.231ms
recursive binary factorial: 0.626ms       

calculating speed for factorial 10000     
factorial by definition: 73.748ms
square difference factorial: 38.261ms
doubleFactorial square difference: 17.682ms
recursive binary factorial: 11.47ms   

calculating speed for factorial 100000
factorial by definition: 10.981s
square difference factorial: 5.647s
doubleFactorial square difference: 2.272s
recursive binary factorial: 997.771ms

(It has been written using BigInt, not this library, so probably due to complexity I'll have to leave it as it is in PR right now)

josdejong · 2019-12-15T08:51:42Z

Thanks Bartosz, this benchmark clearly shows a factor 2 improvement, that's impressive :). I've merged the benchmark in the develop branch and fixed some linting errors that it complained about.

Anyone interested in replacing the current bignumber implementation with the faster one proposed here?

bartekleon · 2019-12-15T13:30:08Z

I am still working theoretically on that. Give me a week or two, I'll try to handle this PR or at least write benchmarks :)
Also, I need to check edge cases like n = 0 or n = 1 since in my proof I am omitting it. I need to consult my maths teacher so he can check my proofs :)

And @josdejong this version written in Benchmark is actually working one (it's not in BigInt)

josdejong · 2019-12-15T20:53:26Z

And @josdejong this version written in Benchmark is actually working one (it's not in BigInt)

Yes so it should be quite straight forward to replace the implementation with the one in from benchmark :)

I'll await your further experimentation.

bartekleon · 2019-12-15T21:10:52Z

Exactly. It will probably be just copy-paste implementation :)

I guess it is mostly done. Now I need to find crossing-point for these two functions and implement one which will decide, which one should be used (crossing-point is probably for n something around 1000)

(If you have other functions which could be optimized like trigonometric functions/logarithms etc. feel free to send me a message or create an issue and ping me there)

bartekleon · 2019-12-17T16:30:40Z

@josdejong If you like, I have also quite fast algorithm for double factorial (the same logic as for factorial, so double speed).

josdejong · 2019-12-21T12:45:13Z

That sounds interesting of course :)

bartekleon · 2019-12-21T12:56:39Z

If you are interested, I have created a math library based in bigints. You might want to take a look on some implementations :)
I'll also try putting some of my research / algorithms into docs and put it in different GitHub repo, but for now I don't have that much time.

If you are interested in some particular algorithm / operation, lemme know. I am always curious about this stuff and so ill try my best to test / find it :)

bartekleon · 2019-12-22T00:55:14Z

I was thinking of making a simple graph to illustrate the speed of algorithms and came up with something like this (it is a rough version for simple illustrations)

josdejong · 2019-12-23T09:29:27Z

That's nice, that really gives a clear insight on how much performance improvement the different give, and how good they score on different ranges of values. It's interesting to see that up to about 1450 the initializeFactorial has worse performance than the others. So if the typical use case is below that, it may yield a regression in real world applications. Do you have an idea on what range of values people normally need?

bartekleon · 2019-12-23T14:10:52Z

Honestly, I have not done research in this field. For now, I am only testing different algorithms since I'm super interested in the "speed" part, not the "usage"/"social" part.

Although the speed is worse for numbers 1450 < , I still find this algorithm usable. You just need to "swap" algorithms for different thresholds:

n < 8 -> just return value
8 <= n < 1450 -> factorial3
n >= 1450 -> initializeFactorial

using this method, we always get the best performance possible. The only drawback is that you need all these implementations - but I guess in the modern world, it is better to increase app size twice than having twice longer execution.

The reason why it probably is worse in performance than it's different versions is 'the speed for O',
so it might have a better algorithm like O(nlog(n)) instead of O(n^2) but the execution of one element it slightly longer so overall it is faster only for bigger numbers.

josdejong · 2019-12-24T21:03:30Z

Yeah I can imagine that it's really interesting to see how all these different approaches compare :) I'm sure it will also differ a lot whether working with numbers, BigNumbers, BigInt, etc.

I think it makes sense to put all your different algorithms and benchmarks and research in a separate repository (and maybe an npm library) purely focused on the best performance of factorial for large values. It goes a bit too far though for mathjs, where we try to balance usability, performance, code complexity and maintainability, and bundle size.

I would love to have the improvements that are currently in /test/benchmark/factorial.js integrated in the factorial function of mathjs, but let's not go for #1691 at this point, see my reasoning in #1691 (comment). Is that ok with you?

bartekleon · 2019-12-24T23:16:34Z

I'll remove the initializeFactorial algorithm due to the high complexity of it. Instead, I'll try benchmarking my other model (factorial3) which uses the fact that: n! can be almost written as `n!! * 2 ** (n/2) * (n/2)!. It has super easy proof by itself and both double factorial and factorial uses quite similar algorithms, so it is easy to understand. Full implementation in BigInt:

function doubleFactorial(n) {
  switch((n + 1n) % 4n) {
    case 0n:
      const p = (n - 1n) / 2n;
      n += 1n;
      let prod = 1n;
  
      for(let k = 1n; k <= p; k += 2n) {
        prod *= (n - k) * k;
      }
    
      return prod;
    case 2n:
      return n * doubleFactorial(n - 2n);
  }

  return 2n ** (n / 2n) * factorial3(n / 2n);
}

function factorial3(n) {
  if(n < 7n) {
    return [1n, 1n, 2n, 6n, 24n, 120n, 720n][+`${n}`]
  }

  if(n % 2n === 0n) {
    const k = n / 2n;
    return 2n ** k * doubleFactorial(n - 1n) * factorial3(k);
  }

  return factorial3(n - 1n) * n;
}

I guess it could be a nice middle ground between complexity and speed, what do you think?
(unless it's slower for BigNumber or difference is insignificant, then ill leave only that easy-one)

josdejong · 2019-12-25T11:16:32Z

These are very well understandable algorithms! Looks good to me 👍

Practical questions:

mathjs does not have any support for BigInt. It makes sense to me to add BigInt support for mathjs but that only works it we add support for at least all arithmetic functions for example and add documentation etc, that requires some work. I think that's too big to be part of this PR though.
The current implementation of factorial supports number and BigNumber. Can your algorithms be transformed for number and BigNumber instead of `BigInt? Are they then faster than the current implementations?
If I understand it correctly, the faster implementations that you propose above works only for integer numbers? And since it's using BigInt, it will not work on all browsers. I think it is essential to have a factorial implementation which works on all browsers and supports floating point. So should we create a switch in the current implementations for number and BigNumber to use this fast integer implementation when the input is an integer value, and otherwise use an other implementation?

bartekleon · 2019-12-25T12:26:25Z

There is no problem in changing it from bigint to bignumber, the only drawback is performance, which I'll measure to ensure its indeed faster.

If it comes to other functions, if needed, you could consider using parts of my library which basically works quite similar to BigNumber, but on BigInts.

But I am probably able to write every function using BigNumbers. Still I have to measure performance though :/

If it comes to switch. It is not that hard to do. I could try implementing it and see the difference :)

josdejong · 2019-12-26T20:04:46Z

Let's focus on improving factorial (number and BigNumber) for now and keep BigInt support separate ok? I'm looking forward to see what you come up with :)

bartekleon · 2019-12-27T13:56:07Z

@josdejong I have removed that complex algorithm and was playing with another algorithm, but I couldn't make it fast enough (even with BigInt, since I had to use BigInt(n) instead of 1n,2n etc which make it slow). Even after optimisations, the difference was insignificant, so this is the fastest available for now :/ (still double speed is good i guess)

josdejong · 2019-12-28T15:51:53Z

Double speed is awesome 😎 🚀 ! Bummer that the other optimizations didn't work out, but very good to have this verified with some benchmarks! Thanks for all your effort.

I think we can close #1691 then, and the next step would be to actually implement the 2x improvements that are already in test/benchmark/factorial.js?

bartekleon · 2019-12-28T17:16:11Z

Well yeah. I guess it's enough to just copy /paste implementation to SRC folder :) I can do it tomorrow /maybe today

josdejong · 2019-12-31T10:33:38Z

Thanks Bartosz, I've merged your PR now! It's a very nice improvement and I love it that it's achieved by just a few lines of code.

josdejong · 2019-12-31T10:56:53Z

Improved factorial is now available in v6.3.0

bartekleon · 2019-12-31T13:13:24Z

If you have any algorithm you want to be improved, message me. I'll take a look then ;p

josdejong · 2019-12-31T14:28:17Z

Thanks! It would be great if you could have a look at the performance of det some day. This is currently like 100 times slower than we would expect, see #908.

bartekleon · 2020-01-29T17:09:07Z

@josdejong If you are interested, we could change the factorial algorithm again in the future. It is quite fast and implementation isn't that hard to understand:

function FactorialHyper(n) {
  let nostart = false;
  const h = n / 2n;
  let s = h + 1n;
  let k = s + h;
  let a = (n & 1n) === 1n ? k : 1n;

  if ((h & 1n) === 1n) a = -a;
  k += 4n;

  function HyperFact(l) {
    if (l > 1n) {
      var m = l >> 1n;
      return HyperFact(m) * HyperFact(l - m);
    }
  
    if (nostart) {
      s -= k -= 4n;
      return s;
    }
    nostart = true;

    return a;
  }

  return HyperFact(h + 1n) << h;
}

bartekleon · 2020-01-29T17:11:57Z

Difference between this one (FactorialHyper) and current implementation [note that these are implemented using BigInt, not BigNumber so the results may differ a little]

josdejong · 2020-02-01T09:19:08Z

If you are interested, we could change the factorial algorithm again in the future. It is quite fast and implementation isn't that hard to understand

Yes of course :) That algorithm looks very simple!

We should be careful not to compare apples with pears though: so far mathjs has no built in support for BigInt. It would be great to add this but I think it only makes sense if we create support for it for all basic arithmetic functions (add, subtract, etc), we discussed that before I think.

I understand your implementation only works for BigInt. Can it be used for number too?

I was thinking, we could make a switch in the current number and BigNumber implementation: if the input is integer and the browser has BigInt support, switch to this fast algorithm you propose above. This way BigInt is only used under the hood. What do you think?

bartekleon · 2020-02-01T09:59:40Z

Most of this algorithm can use normal numbers. I am using BigInt everywhere since conversion number -> BigInt is slow. There is no problem for me for changing your implementations, but keep in mind, that BigInt isn't supported everywhere. If you take a look at my BigMath library, I have written there which browsers /node versions are supported

josdejong · 2020-02-01T10:26:11Z

If we do start using BigInt under the hood we definitely have to create a check on whether BigInt is supported in the first place.

bartekleon · 2020-02-01T11:17:51Z

@josdejong the problem is that BigInt(5) is slow, and 5n will throw an error if a browser doesn't support BigInt :/ well... I'll see what can be done to squeeze as much performance as I can

josdejong · 2020-02-01T11:33:49Z

You can probably create some constants for that like const five = BigInt(5).

bartekleon · 2020-02-01T11:41:25Z

Indeed. Well. We will see later. For now, I have exams at university and my own projects so I don't have that much time. But I'll inform you when I find some nice algorithm ;D

I'll take a look in free time, make benchmarks and see what we can achieve with this.

About gamma function - I guess there might be a problem with reflection formula since I have found somewhere that Lanczos works only for Re(z) > 0.5 and Stirling Re(z) > 0 (or 1)

josdejong · 2020-02-01T11:42:40Z

👍 good luck with your exams

josdejong added feature help wanted labels Dec 9, 2019

bartekleon mentioned this issue Dec 10, 2019

[WIP] improve bigFactorial algorithm #1690

Merged

josdejong closed this as completed in #1690 Dec 15, 2019

josdejong reopened this Dec 15, 2019

josdejong mentioned this issue Dec 15, 2019

feat: factorial for HUGE numbers #1691

Merged

josdejong closed this as completed Dec 31, 2019

improve bigFactorial algorithm #1687

improve bigFactorial algorithm #1687

Comments

bartekleon commented Dec 7, 2019 • edited

josdejong commented Dec 9, 2019

bartekleon commented Dec 9, 2019 • edited

bartekleon commented Dec 9, 2019

bartekleon commented Dec 9, 2019

bartekleon commented Dec 11, 2019 • edited

bartekleon commented Dec 11, 2019

josdejong commented Dec 15, 2019

bartekleon commented Dec 15, 2019 • edited

josdejong commented Dec 15, 2019

bartekleon commented Dec 15, 2019 • edited

bartekleon commented Dec 17, 2019

josdejong commented Dec 21, 2019

bartekleon commented Dec 21, 2019 • edited

bartekleon commented Dec 22, 2019 • edited

josdejong commented Dec 23, 2019

bartekleon commented Dec 23, 2019 • edited

josdejong commented Dec 24, 2019

bartekleon commented Dec 24, 2019 • edited

josdejong commented Dec 25, 2019 • edited

bartekleon commented Dec 25, 2019 • edited

josdejong commented Dec 26, 2019

bartekleon commented Dec 27, 2019 • edited

josdejong commented Dec 28, 2019

bartekleon commented Dec 28, 2019

josdejong commented Dec 31, 2019

josdejong commented Dec 31, 2019

bartekleon commented Dec 31, 2019

josdejong commented Dec 31, 2019

bartekleon commented Jan 29, 2020 • edited

bartekleon commented Jan 29, 2020 • edited

josdejong commented Feb 1, 2020

bartekleon commented Feb 1, 2020

josdejong commented Feb 1, 2020

bartekleon commented Feb 1, 2020 • edited

josdejong commented Feb 1, 2020

bartekleon commented Feb 1, 2020 • edited

josdejong commented Feb 1, 2020

bartekleon commented Dec 7, 2019 •

edited

bartekleon commented Dec 9, 2019 •

edited

bartekleon commented Dec 11, 2019 •

edited

bartekleon commented Dec 15, 2019 •

edited

bartekleon commented Dec 15, 2019 •

edited

bartekleon commented Dec 21, 2019 •

edited

bartekleon commented Dec 22, 2019 •

edited

bartekleon commented Dec 23, 2019 •

edited

bartekleon commented Dec 24, 2019 •

edited

josdejong commented Dec 25, 2019 •

edited

bartekleon commented Dec 25, 2019 •

edited

bartekleon commented Dec 27, 2019 •

edited

bartekleon commented Jan 29, 2020 •

edited

bartekleon commented Jan 29, 2020 •

edited

bartekleon commented Feb 1, 2020 •

edited

bartekleon commented Feb 1, 2020 •

edited