## All matching sets

[Blog post.](https://asteroid.divnull.com/2008/01/chance-of-reign/)

[Question on Reddit.](https://www.reddit.com/r/askmath/comments/rqtqkq/probability_value_has_chance_in_a_way_i_dont/)

[Another question on Reddit.](https://www.reddit.com/r/RPGdesign/comments/u8yuhg/odds_of_multiples_doubles_triples_quads_quints/)

[Question on StackExchange.](https://math.stackexchange.com/questions/4436121/probability-of-rolling-repeated-numbers)

Roll a pool of dice and find **all** matching sets (pairs, triples, etc.).

We *could* enumerate every case by hand, as the blog post does, but that is error-prone.
Fortunately, Icepool can compute the full distribution simply and reasonably efficiently, with no explicit combinatorics on the user's part.
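
To give a sense of what the by-hand approach involves, here is a minimal sketch that counts three of the simplest 10d10 cases with explicit combinatorics. The helper `one_set_weight` is a name made up for this illustration. It only covers rolls with exactly one matching set. Every further case (two pairs, a triple plus a pair, and so on) needs its own formula, which is where mistakes tend to creep in.

```python
import math

NUM_DICE = 10
NUM_SIDES = 10

def one_set_weight(set_size):
    """Number of 10d10 rolls with exactly one matching set of the given size
    and all remaining dice showing distinct, different faces."""
    rest = NUM_DICE - set_size
    return (
        NUM_SIDES                          # face shown by the matching set
        * math.comb(NUM_DICE, set_size)    # which dice belong to the set
        * math.perm(NUM_SIDES - 1, rest)   # distinct faces for the other dice
    )

# No matches at all: every die shows a different face.
no_match_weight = math.perm(NUM_SIDES, NUM_DICE)

print("()  ", no_match_weight)
print("(2,)", one_set_weight(2))
print("(3,)", one_set_weight(3))
```

These come out to 3628800, 163296000 and 217728000, which agree with the `()`, `(2,)` and `(3,)` weights in the 10d10 table below.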

In [1]:
import piplite
await piplite.install("icepool")

import icepool

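class AllMatchingSets(icepool.EvalPool):
    """Collects the sizes of all matching sets, largest first."""

    def next_state(self, state, outcome, count):
        # The state starts as None; treat that as "no sets found yet".
        if state is None:
            state = ()
        # If at least a pair, append the size of the matching set.
        if count >= 2:
            state += (count,)
        # Keep the sizes sorted largest-first so the order in which
        # outcomes are seen does not produce distinct states.
        return tuple(sorted(state, reverse=True))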

all_matching_sets = AllMatchingSets()

# Evaluate on 10d10.
print(all_matching_sets.eval(icepool.d10.pool(10)))

Denominator: 10000000000

|         Outcome |     Weight | Probability |
|----------------:|-----------:|------------:|
|              () |    3628800 |   0.036288% |
|            (2,) |  163296000 |   1.632960% |
|          (2, 2) | 1143072000 |  11.430720% |
|       (2, 2, 2) | 1905120000 |  19.051200% |
|    (2, 2, 2, 2) |  714420000 |   7.144200% |
| (2, 2, 2, 2, 2) |   28576800 |   0.285768% |
|            (3,) |  217728000 |   2.177280% |
|          (3, 2) | 1524096000 |  15.240960% |
|       (3, 2, 2) | 1905120000 |  19.051200% |
|    (3, 2, 2, 2) |  381024000 |   3.810240% |
|          (3, 3) |  317520000 |   3.175200% |
|       (3, 3, 2) |  381024000 |   3.810240% |
|    (3, 3, 2, 2) |   31752000 |   0.317520% |
|       (3, 3, 3) |   14112000 |   0.141120% |
|            (4,) |  127008000 |   1.270080% |
|          (4, 2) |  476280000 |   4.762800% |
|       (4, 2, 2) |  285768000 |   2.857680% |
|    (4, 2, 2, 2) |   15876000 |   0.158760% |
|          (4, 3) |  127008000 |   1.270080% |

### Mixed pools

In fact, Icepool can compute this for mixed pools of standard dice as well. [Similar StackExchange question.](https://rpg.stackexchange.com/questions/179043/how-to-count-duplicates-in-a-mixed-pool-using-anydice)

In [2]:
# Evaluate on a pool of 3d12, 2d10, 1d8.
print(all_matching_sets.eval(icepool.standard_pool(12, 12, 12, 10, 10, 8)))

Denominator: 1382400

|   Outcome | Weight | Probability |
|----------:|-------:|------------:|
|        () | 290304 |  21.000000% |
|      (2,) | 653184 |  47.250000% |
|    (2, 2) | 256608 |  18.562500% |
| (2, 2, 2) |   9936 |   0.718750% |
|      (3,) | 118080 |   8.541667% |
|    (3, 2) |  41088 |   2.972222% |
|    (3, 3) |    736 |   0.053241% |
|      (4,) |  10848 |   0.784722% |
|    (4, 2) |   1128 |   0.081597% |
|      (5,) |    480 |   0.034722% |
|      (6,) |      8 |   0.000579% |
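
Because this pool has only 1,382,400 possible rolls, the table can be cross-checked without Icepool by enumerating every roll directly. The following is a brute-force sketch, not a description of how Icepool computes the result. `brute_force_matching_sets` is a name invented here, and the run may take a few seconds in plain Python.

```python
from collections import Counter
from itertools import product

def brute_force_matching_sets(die_sizes):
    """Tally the matching-set sizes of every possible roll of the given dice."""
    weights = Counter()
    faces = [range(1, sides + 1) for sides in die_sizes]
    for roll in product(*faces):
        # Sizes of all groups of two or more equal faces, largest first.
        sets = sorted((n for n in Counter(roll).values() if n >= 2), reverse=True)
        weights[tuple(sets)] += 1
    return weights

# Same pool as above: 3d12, 2d10, 1d8.
weights = brute_force_matching_sets([12, 12, 12, 10, 10, 8])
denominator = sum(weights.values())
for outcome in sorted(weights):
    print(outcome, weights[outcome], f"{weights[outcome] / denominator:.6%}")
```

The `()` row is also easy to confirm by hand: assign the d8 first, then the two d10s, then the three d12s, giving 8 × 9 × 8 × 9 × 8 × 7 = 290304 rolls with no repeated face.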



### Improving efficiency for more specific queries

If your query is more specific than enumerating every combination of matching sets, you can shrink the state space and improve efficiency by retaining only the information needed to compute the answer. For example, to count just the number of pairs (where e.g. a quadruple counts as two pairs; to count each matching set only once, replace `count // 2` with `count >= 2`):

In [3]:
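class NumPairs(icepool.EvalPool):
    def next_state(self, state, outcome, count):
        # The state starts as None, hence `state or 0`.
        # Each outcome contributes one pair per two dice showing it.
        return (state or 0) + (count // 2)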

num_pairs = NumPairs()

# Evaluate on 10d10.
print(num_pairs.eval(icepool.Pool(icepool.d10, 10)))

Denominator: 10000000000

| Outcome |     Weight | Probability |
|--------:|-----------:|------------:|
|       0 |    3628800 |   0.036288% |
|       1 |  381024000 |   3.810240% |
|       2 | 3149798400 |  31.497984% |
|       3 | 4904524800 |  49.045248% |
|       4 | 1514960640 |  15.149606% |
|       5 |   46063360 |   0.460634% |
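
To make the difference between the two counting rules concrete, here is a plain-Python sketch that folds a single roll through the same kind of state update as `NumPairs`, without Icepool. It assumes the evaluator sees each outcome once, together with the number of dice showing it. The function `fold_roll` and the example roll are invented for illustration.

```python
from collections import Counter

def fold_roll(roll, count_rule):
    """Apply a NumPairs-style state update to one concrete roll."""
    state = None
    # Visit each face once, with the number of dice that rolled it.
    for outcome, count in sorted(Counter(roll).items()):
        state = (state or 0) + count_rule(count)
    return state or 0

roll = [3, 3, 3, 3, 5, 5, 7]

# `count // 2`: a quadruple counts as two pairs.
print(fold_roll(roll, lambda count: count // 2))  # 3
# `count >= 2`: each matching set counts once, whatever its size.
print(fold_roll(roll, lambda count: count >= 2))  # 2
```

In both variants the state is a single integer, so the evaluator never has to track which set sizes were seen, only a running total.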



### Loop over pool sizes

I placed this at the end because the tables are long.

In [4]:
for pool_size in range(1, 11):
    print(f'### Pool size {pool_size}')
    print(all_matching_sets.eval(icepool.Pool(icepool.d10, pool_size)))

### Pool size 1
Denominator: 10

| Weight | Probability |
|-------:|------------:|
|     10 | 100.000000% |

### Pool size 2
Denominator: 100

| Outcome | Weight | Probability |
|--------:|-------:|------------:|
|      () |     90 |  90.000000% |
|    (2,) |     10 |  10.000000% |

### Pool size 3
Denominator: 1000

| Outcome | Weight | Probability |
|--------:|-------:|------------:|
|      () |    720 |  72.000000% |
|    (2,) |    270 |  27.000000% |
|    (3,) |     10 |   1.000000% |

### Pool size 4
Denominator: 10000

| Outcome | Weight | Probability |
|--------:|-------:|------------:|
|      () |   5040 |  50.400000% |
|    (2,) |   4320 |  43.200000% |
|  (2, 2) |    270 |   2.700000% |
|    (3,) |    360 |   3.600000% |
|    (4,) |     10 |   0.100000% |

### Pool size 5
Denominator: 100000

| Outcome | Weight | Probability |
|--------:|-------:|------------:|
|      () |  30240 |  30.240000% |
|    (2,) |  50400 |  50.400000% |
|  (2, 2) |  10800 |  10.800000% |
|    (3,) |