Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve bigFactorial algorithm #1687

Closed
bartekleon opened this issue Dec 7, 2019 · 37 comments · Fixed by #1690
Closed

improve bigFactorial algorithm #1687

bartekleon opened this issue Dec 7, 2019 · 37 comments · Fixed by #1690

Comments

@bartekleon
Copy link
Contributor

bartekleon commented Dec 7, 2019

You can use different algorithms of factorial to speed this 2-8 times. Universal one (twice as fast):
[i am using bigint's in implementation, but logic is the same]

function factorial(n) {
  if(n === 0n || n === 1n) {
    return 1n;
  }
  if(n % 2n === 1n) {
    return n * factorial(n - 1n);
  }

  let p = n;
  let prod = n;

  while(p > 2n) {
    p -= 2n;
    n += p;

    prod *= n;
  }

  return prod;
}

If you are interested in the less universal (faster than this after n=1400), I could provide it in comment

@josdejong
Copy link
Owner

Performance improvements are of course always welcome Bartosz :) Would you like to work this out in a Pull request, or can we ask the community to pick this up?

I think it will be useful to write a little benchmark to make sure we're actually improving (see /benchmark folder.

@bartekleon
Copy link
Contributor Author

bartekleon commented Dec 9, 2019

@josdejong for now, I have benchmarks for my implementation. This algorithm is written to be fast tho. Should be always ~2 times faster since you have twice less numbers
20! = 1 * 2 * 3 * 4 * ... * 20 ( 20 terms )
20! = 20 * (20 + 18) * (20 + 18 + 16) * (20 + ... + 2) (9 terms or so)

And yes, I can take this feature :)

@bartekleon
Copy link
Contributor Author

@josdejong I have one question tho. Should I use this math library or can I use BigInt?

@bartekleon
Copy link
Contributor Author

image
After a small test, it seems it is faster (from 8! tho, but we can hardcode first 7 factorials I guess)

@bartekleon
Copy link
Contributor Author

bartekleon commented Dec 11, 2019

@josdejong would you also like a proof paper for this algorithm?

Proof for this algorithm: https://docs.google.com/document/d/13CYoqFFkcwNYRN8ObKEvyXpkCP8GSf14n9G7XeWK1OI/edit?usp=sharing

@bartekleon
Copy link
Contributor Author

Update: I have been working theoretically investigating this subject, and I have managed to make this algorithm twice as fast, these are results on this moment (right now I have implemented square difference factorial, as you can see in PR, although I am planning to check other possibilities too)

calculating speed for factorial 10        
factorial by definition: 0.088ms
square difference factorial: 0.045ms      
doubleFactorial square difference: 0.082ms
recursive binary factorial: 0.112ms       

calculating speed for factorial 100       
factorial by definition: 0.025ms
square difference factorial: 0.024ms      
doubleFactorial square difference: 0.026ms
recursive binary factorial: 0.052ms       

calculating speed for factorial 1000      
factorial by definition: 1.438ms
square difference factorial: 0.357ms      
doubleFactorial square difference: 0.231ms
recursive binary factorial: 0.626ms       

calculating speed for factorial 10000     
factorial by definition: 73.748ms
square difference factorial: 38.261ms
doubleFactorial square difference: 17.682ms
recursive binary factorial: 11.47ms   

calculating speed for factorial 100000
factorial by definition: 10.981s
square difference factorial: 5.647s
doubleFactorial square difference: 2.272s
recursive binary factorial: 997.771ms

(It has been written using BigInt, not this library, so probably due to complexity I'll have to leave it as it is in PR right now)

@josdejong
Copy link
Owner

Thanks Bartosz, this benchmark clearly shows a factor 2 improvement, that's impressive :). I've merged the benchmark in the develop branch and fixed some linting errors that it complained about.

Anyone interested in replacing the current bignumber implementation with the faster one proposed here?

@bartekleon
Copy link
Contributor Author

bartekleon commented Dec 15, 2019

I am still working theoretically on that. Give me a week or two, I'll try to handle this PR or at least write benchmarks :)
Also, I need to check edge cases like n = 0 or n = 1 since in my proof I am omitting it. I need to consult my maths teacher so he can check my proofs :)

And @josdejong this version written in Benchmark is actually working one (it's not in BigInt)

@josdejong
Copy link
Owner

And @josdejong this version written in Benchmark is actually working one (it's not in BigInt)

Yes so it should be quite straight forward to replace the implementation with the one in from benchmark :)

I'll await your further experimentation.

@bartekleon
Copy link
Contributor Author

bartekleon commented Dec 15, 2019

Exactly. It will probably be just copy-paste implementation :)

I guess it is mostly done. Now I need to find crossing-point for these two functions and implement one which will decide, which one should be used (crossing-point is probably for n something around 1000)

(If you have other functions which could be optimized like trigonometric functions/logarithms etc. feel free to send me a message or create an issue and ping me there)

@bartekleon
Copy link
Contributor Author

@josdejong If you like, I have also quite fast algorithm for double factorial (the same logic as for factorial, so double speed).

@josdejong
Copy link
Owner

That sounds interesting of course :)

@bartekleon
Copy link
Contributor Author

bartekleon commented Dec 21, 2019

If you are interested, I have created a math library based in bigints. You might want to take a look on some implementations :)
I'll also try putting some of my research / algorithms into docs and put it in different GitHub repo, but for now I don't have that much time.

If you are interested in some particular algorithm / operation, lemme know. I am always curious about this stuff and so ill try my best to test / find it :)

@bartekleon
Copy link
Contributor Author

bartekleon commented Dec 22, 2019

download (2)

I was thinking of making a simple graph to illustrate the speed of algorithms and came up with something like this (it is a rough version for simple illustrations)

@josdejong
Copy link
Owner

That's nice, that really gives a clear insight on how much performance improvement the different give, and how good they score on different ranges of values. It's interesting to see that up to about 1450 the initializeFactorial has worse performance than the others. So if the typical use case is below that, it may yield a regression in real world applications. Do you have an idea on what range of values people normally need?

@bartekleon
Copy link
Contributor Author

bartekleon commented Dec 23, 2019

Honestly, I have not done research in this field. For now, I am only testing different algorithms since I'm super interested in the "speed" part, not the "usage"/"social" part.

Although the speed is worse for numbers 1450 < , I still find this algorithm usable. You just need to "swap" algorithms for different thresholds:

n < 8 -> just return value
8 <= n < 1450 -> factorial3
n >= 1450 -> initializeFactorial

using this method, we always get the best performance possible. The only drawback is that you need all these implementations - but I guess in the modern world, it is better to increase app size twice than having twice longer execution.

The reason why it probably is worse in performance than it's different versions is 'the speed for O',
so it might have a better algorithm like O(nlog(n)) instead of O(n^2) but the execution of one element it slightly longer so overall it is faster only for bigger numbers.

@josdejong
Copy link
Owner

Yeah I can imagine that it's really interesting to see how all these different approaches compare :) I'm sure it will also differ a lot whether working with numbers, BigNumbers, BigInt, etc.

I think it makes sense to put all your different algorithms and benchmarks and research in a separate repository (and maybe an npm library) purely focused on the best performance of factorial for large values. It goes a bit too far though for mathjs, where we try to balance usability, performance, code complexity and maintainability, and bundle size.

I would love to have the improvements that are currently in /test/benchmark/factorial.js integrated in the factorial function of mathjs, but let's not go for #1691 at this point, see my reasoning in #1691 (comment). Is that ok with you?

@bartekleon
Copy link
Contributor Author

bartekleon commented Dec 24, 2019

I'll remove the initializeFactorial algorithm due to the high complexity of it. Instead, I'll try benchmarking my other model (factorial3) which uses the fact that: n! can be almost written as `n!! * 2 ** (n/2) * (n/2)!. It has super easy proof by itself and both double factorial and factorial uses quite similar algorithms, so it is easy to understand. Full implementation in BigInt:

function doubleFactorial(n) {
  switch((n + 1n) % 4n) {
    case 0n:
      const p = (n - 1n) / 2n;
      n += 1n;
      let prod = 1n;
  
      for(let k = 1n; k <= p; k += 2n) {
        prod *= (n - k) * k;
      }
    
      return prod;
    case 2n:
      return n * doubleFactorial(n - 2n);
  }

  return 2n ** (n / 2n) * factorial3(n / 2n);
}

function factorial3(n) {
  if(n < 7n) {
    return [1n, 1n, 2n, 6n, 24n, 120n, 720n][+`${n}`]
  }

  if(n % 2n === 0n) {
    const k = n / 2n;
    return 2n ** k * doubleFactorial(n - 1n) * factorial3(k);
  }

  return factorial3(n - 1n) * n;
}

I guess it could be a nice middle ground between complexity and speed, what do you think?
(unless it's slower for BigNumber or difference is insignificant, then ill leave only that easy-one)

@josdejong
Copy link
Owner

josdejong commented Dec 25, 2019

These are very well understandable algorithms! Looks good to me 👍

Practical questions:

  1. mathjs does not have any support for BigInt. It makes sense to me to add BigInt support for mathjs but that only works it we add support for at least all arithmetic functions for example and add documentation etc, that requires some work. I think that's too big to be part of this PR though.
  2. The current implementation of factorial supports number and BigNumber. Can your algorithms be transformed for number and BigNumber instead of `BigInt? Are they then faster than the current implementations?
  3. If I understand it correctly, the faster implementations that you propose above works only for integer numbers? And since it's using BigInt, it will not work on all browsers. I think it is essential to have a factorial implementation which works on all browsers and supports floating point. So should we create a switch in the current implementations for number and BigNumber to use this fast integer implementation when the input is an integer value, and otherwise use an other implementation?

@bartekleon
Copy link
Contributor Author

bartekleon commented Dec 25, 2019

There is no problem in changing it from bigint to bignumber, the only drawback is performance, which I'll measure to ensure its indeed faster.

If it comes to other functions, if needed, you could consider using parts of my library which basically works quite similar to BigNumber, but on BigInts.

But I am probably able to write every function using BigNumbers. Still I have to measure performance though :/

If it comes to switch. It is not that hard to do. I could try implementing it and see the difference :)

@josdejong
Copy link
Owner

Let's focus on improving factorial (number and BigNumber) for now and keep BigInt support separate ok? I'm looking forward to see what you come up with :)

@bartekleon
Copy link
Contributor Author

bartekleon commented Dec 27, 2019

@josdejong I have removed that complex algorithm and was playing with another algorithm, but I couldn't make it fast enough (even with BigInt, since I had to use BigInt(n) instead of 1n,2n etc which make it slow). Even after optimisations, the difference was insignificant, so this is the fastest available for now :/ (still double speed is good i guess)

@josdejong
Copy link
Owner

Double speed is awesome 😎 🚀 ! Bummer that the other optimizations didn't work out, but very good to have this verified with some benchmarks! Thanks for all your effort.

I think we can close #1691 then, and the next step would be to actually implement the 2x improvements that are already in test/benchmark/factorial.js?

@bartekleon
Copy link
Contributor Author

Well yeah. I guess it's enough to just copy /paste implementation to SRC folder :) I can do it tomorrow /maybe today

@josdejong
Copy link
Owner

Thanks Bartosz, I've merged your PR now! It's a very nice improvement and I love it that it's achieved by just a few lines of code.

@josdejong
Copy link
Owner

Improved factorial is now available in v6.3.0

@bartekleon
Copy link
Contributor Author

If you have any algorithm you want to be improved, message me. I'll take a look then ;p

@josdejong
Copy link
Owner

Thanks! It would be great if you could have a look at the performance of det some day. This is currently like 100 times slower than we would expect, see #908.

@bartekleon
Copy link
Contributor Author

bartekleon commented Jan 29, 2020

@josdejong If you are interested, we could change the factorial algorithm again in the future. It is quite fast and implementation isn't that hard to understand:

function FactorialHyper(n) {
  let nostart = false;
  const h = n / 2n;
  let s = h + 1n;
  let k = s + h;
  let a = (n & 1n) === 1n ? k : 1n;

  if ((h & 1n) === 1n) a = -a;
  k += 4n;

  function HyperFact(l) {
    if (l > 1n) {
      var m = l >> 1n;
      return HyperFact(m) * HyperFact(l - m);
    }
  
    if (nostart) {
      s -= k -= 4n;
      return s;
    }
    nostart = true;

    return a;
  }

  return HyperFact(h + 1n) << h;
}

@bartekleon
Copy link
Contributor Author

bartekleon commented Jan 29, 2020

download (1)

Difference between this one (FactorialHyper) and current implementation [note that these are implemented using BigInt, not BigNumber so the results may differ a little]

@josdejong
Copy link
Owner

If you are interested, we could change the factorial algorithm again in the future. It is quite fast and implementation isn't that hard to understand

Yes of course :) That algorithm looks very simple!

We should be careful not to compare apples with pears though: so far mathjs has no built in support for BigInt. It would be great to add this but I think it only makes sense if we create support for it for all basic arithmetic functions (add, subtract, etc), we discussed that before I think.

I understand your implementation only works for BigInt. Can it be used for number too?

I was thinking, we could make a switch in the current number and BigNumber implementation: if the input is integer and the browser has BigInt support, switch to this fast algorithm you propose above. This way BigInt is only used under the hood. What do you think?

@bartekleon
Copy link
Contributor Author

Most of this algorithm can use normal numbers. I am using BigInt everywhere since conversion number -> BigInt is slow. There is no problem for me for changing your implementations, but keep in mind, that BigInt isn't supported everywhere. If you take a look at my BigMath library, I have written there which browsers /node versions are supported

@josdejong
Copy link
Owner

If we do start using BigInt under the hood we definitely have to create a check on whether BigInt is supported in the first place.

@bartekleon
Copy link
Contributor Author

bartekleon commented Feb 1, 2020

@josdejong the problem is that BigInt(5) is slow, and 5n will throw an error if a browser doesn't support BigInt :/ well... I'll see what can be done to squeeze as much performance as I can

@josdejong
Copy link
Owner

You can probably create some constants for that like const five = BigInt(5).

@bartekleon
Copy link
Contributor Author

bartekleon commented Feb 1, 2020

Indeed. Well. We will see later. For now, I have exams at university and my own projects so I don't have that much time. But I'll inform you when I find some nice algorithm ;D

I'll take a look in free time, make benchmarks and see what we can achieve with this.

About gamma function - I guess there might be a problem with reflection formula since I have found somewhere that Lanczos works only for Re(z) > 0.5 and Stirling Re(z) > 0 (or 1)

@josdejong
Copy link
Owner

👍 good luck with your exams

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants