Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of count on Combinations #58

Open
mdznr opened this issue Jan 14, 2021 · 3 comments
Open

Improve performance of count on Combinations #58

mdznr opened this issue Jan 14, 2021 · 3 comments
Labels
enhancement New feature or request

Comments

@mdznr
Copy link
Contributor

mdznr commented Jan 14, 2021

The implementation of the counts property on Combinations can be optimized to make use of the fact that the binomial is symmetrical around N/2 or (N - 1)/2.

More details available in this review comment on #51.

@mdznr mdznr added the enhancement New feature or request label Jan 14, 2021
@mdznr mdznr mentioned this issue Jan 14, 2021
4 tasks
@toddthomas
Copy link
Contributor

toddthomas commented Mar 11, 2021

I got a bit obsessed with investigating this, but I'm coming to the conclusion that what's there now is pretty optimal, in addition to being so beautiful I'm reluctant to add anything to it! 😍🤩 I think perhaps not all the current optimizations were implemented when this issue was created?

The case for k > n/2 makes use of the symmetry to take the shorter path back to a base case for one such k. The only duplication of effort I see now is when the range of k spans the line of symmetry, retracing paths already taken on the left side. Memoization would eliminate that, as @mdznr mentioned.

I wrote an iterative version which builds the left half of the triangle (keeping track of only the last two rows), then sums the required elements in the last row.

func binomialCount(n: Int, kRange: Range<Int>) -> Int {
  // Handling of `n < 2` cases omitted.
  let m = n / 2
  var a = Array(repeating: 0, count: m + 1)
  a[0] = 1
  var b = Array(repeating: 0, count: m + 1)
  b[0] = 1

  let relativeKRange = kRange.relative(to: 0..<(n + 1))

  for i in 2...n {
    for j in 1...(i / 2) {
      switch (i % 2 == 0, j < i / 2) {
      case (true, true):
        a[j] = b[j - 1] + b[j]
      case (true, false):
        a[j] = b[j - 1] * 2
      case (false, _):
        b[j] = a[j - 1] + a[j]
      }
    }
  }

  let sumIndices = relativeKRange.indices.lazy.map {
    $0 > m ? n - $0 : $0
  }
  let resultRow = n % 2 == 0 ? a : b

  return sumIndices.lazy.reduce(0) { $0 + resultRow[$1] }
}

But it's not faster, and it's much uglier than the lovely recursive version. It computes more elements for two reasons: first, I didn't bother to compute only what was needed for the given range of k. That would make an ugly method uglier, and also wouldn't eliminate much of the half triangle due to the second reason: it uses only addition, instead of taking advantage of the binomial(n, k) == n * binomial(n - 1, k - 1) / k relation. So it needs elements above the nice diagonal back to the base case traced by the recursive method.

Avoiding multiplication does give it the advantage of being able to compute up to a higher n or a broader range of k before succumbing to arithmetic overflow. For example, the iterative version can sneak right up to a count of Int.max via binomialCount(n: 63, kRange: 1..<64), whereas the recursive version maxes out at n: 61, kRange: 1..<62 when counting an analogous range of k.

With n being capped in the sixties, I don't think stack overflow is a concern, so there's not much gained by eliminating the recursion.

For both methods, we're talking execution times of just hundreds of microseconds on my machine, even for those high-n, broad-k-range cases, so it doesn't seem to me like there's a big performance problem here.

So: I had a lot of fun digging into this, but I'm wondering if this issue is still meant to be open. Do we want to do the memoization, or is there some other optimization opportunity I've missed?

@mdznr
Copy link
Contributor Author

mdznr commented Mar 11, 2021

It’s been a while since I’ve looked at this, but calculating the number of combinations of size 0...k (last line in Pascal’s triangle) can be done with a power of two. So if you wanted to calculate something like 1...k, it’s faster to calculate the sum of the whole last row, then subtract a value than it would be to calculate every other value on the line and sum them up.

I have some WIP code that starts to go down that route: main...mdznr:arithmetic_triangle

@toddthomas
Copy link
Contributor

Aha! Nice.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants