Improve performance of `count` on `Combinations` #58

mdznr · 2021-01-14T20:44:34Z

The implementation of the counts property on Combinations can be optimized to make use of the fact that the binomial is symmetrical around N/2 or (N - 1)/2.

More details available in this review comment on #51.

The text was updated successfully, but these errors were encountered:

toddthomas · 2021-03-11T00:18:37Z

I got a bit obsessed with investigating this, but I'm coming to the conclusion that what's there now is pretty optimal, in addition to being so beautiful I'm reluctant to add anything to it! 😍🤩 I think perhaps not all the current optimizations were implemented when this issue was created?

The case for k > n/2 makes use of the symmetry to take the shorter path back to a base case for one such k. The only duplication of effort I see now is when the range of k spans the line of symmetry, retracing paths already taken on the left side. Memoization would eliminate that, as @mdznr mentioned.

I wrote an iterative version which builds the left half of the triangle (keeping track of only the last two rows), then sums the required elements in the last row.

func binomialCount(n: Int, kRange: Range<Int>) -> Int {
  // Handling of `n < 2` cases omitted.
  let m = n / 2
  var a = Array(repeating: 0, count: m + 1)
  a[0] = 1
  var b = Array(repeating: 0, count: m + 1)
  b[0] = 1

  let relativeKRange = kRange.relative(to: 0..<(n + 1))

  for i in 2...n {
    for j in 1...(i / 2) {
      switch (i % 2 == 0, j < i / 2) {
      case (true, true):
        a[j] = b[j - 1] + b[j]
      case (true, false):
        a[j] = b[j - 1] * 2
      case (false, _):
        b[j] = a[j - 1] + a[j]
      }
    }
  }

  let sumIndices = relativeKRange.indices.lazy.map {
    $0 > m ? n - $0 : $0
  }
  let resultRow = n % 2 == 0 ? a : b

  return sumIndices.lazy.reduce(0) { $0 + resultRow[$1] }
}

But it's not faster, and it's much uglier than the lovely recursive version. It computes more elements for two reasons: first, I didn't bother to compute only what was needed for the given range of k. That would make an ugly method uglier, and also wouldn't eliminate much of the half triangle due to the second reason: it uses only addition, instead of taking advantage of the binomial(n, k) == n * binomial(n - 1, k - 1) / k relation. So it needs elements above the nice diagonal back to the base case traced by the recursive method.

Avoiding multiplication does give it the advantage of being able to compute up to a higher n or a broader range of k before succumbing to arithmetic overflow. For example, the iterative version can sneak right up to a count of Int.max via binomialCount(n: 63, kRange: 1..<64), whereas the recursive version maxes out at n: 61, kRange: 1..<62 when counting an analogous range of k.

With n being capped in the sixties, I don't think stack overflow is a concern, so there's not much gained by eliminating the recursion.

For both methods, we're talking execution times of just hundreds of microseconds on my machine, even for those high-n, broad-k-range cases, so it doesn't seem to me like there's a big performance problem here.

So: I had a lot of fun digging into this, but I'm wondering if this issue is still meant to be open. Do we want to do the memoization, or is there some other optimization opportunity I've missed?

mdznr · 2021-03-11T00:24:45Z

It’s been a while since I’ve looked at this, but calculating the number of combinations of size 0...k (last line in Pascal’s triangle) can be done with a power of two. So if you wanted to calculate something like 1...k, it’s faster to calculate the sum of the whole last row, then subtract a value than it would be to calculate every other value on the line and sum them up.

I have some WIP code that starts to go down that route: main...mdznr:arithmetic_triangle

toddthomas · 2021-03-11T00:34:51Z

Aha! Nice.

mdznr added the enhancement New feature or request label Jan 14, 2021

mdznr mentioned this issue Jan 14, 2021

All combinations #51

Merged

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve performance of `count` on `Combinations` #58

Improve performance of `count` on `Combinations` #58

mdznr commented Jan 14, 2021

toddthomas commented Mar 11, 2021 •

edited

mdznr commented Mar 11, 2021

toddthomas commented Mar 11, 2021

Improve performance of count on Combinations #58

Improve performance of count on Combinations #58

Comments

mdznr commented Jan 14, 2021

toddthomas commented Mar 11, 2021 • edited

mdznr commented Mar 11, 2021

toddthomas commented Mar 11, 2021

Improve performance of `count` on `Combinations` #58

Improve performance of `count` on `Combinations` #58

toddthomas commented Mar 11, 2021 •

edited