## Three Set Mahjong Probability Calculations

This notebook tabulates hand probabilities for a teaching Mahjong variant with fewer tile types and a smaller hand size:
- **Tiles**: 84 tiles in 21 types: bamboo (1-9), circles (1-9), and dragons (white, green, red), with four copies of each type.
- **Hand Size**: A complete hand has 11 tiles: 3 sets of three (sequence or triplet; no quads) and 1 pair.
- **Calls**: As a side note, calls for sequences (_chii_) and triplets (_pon_) are allowed.

In traditional Mahjong, tile patterns in a completed hand earn point values based broadly on their elegance and rarity. How do the rarities of those patterns change when we limit both the tile types and the number of tiles in hand?

In [44]:
import math
import numpy as np
import pandas as pd

from itertools import product

In [45]:
# load pre-computed tile combination properties for numeric tiles
suited_df = pd.read_csv('./shanten_suuhai.csv', 
                        index_col='tile_int', 
                        dtype={'tile_vector': str})

# trim to only combinations with eleven or fewer tiles
suited_df = suited_df[suited_df['n_tiles'] <= 11]

print(suited_df.shape)
suited_df.sample(10)

(123275, 11)


tile_int,tile_vector,n_tiles,n_sets,n_triplets,n_sequences,n_blocks,n_pairs,max_pairs,n_koritsu,n_terminals,n_ways
556788999,21123,9,2,1,1,1,1,2,1,1,2304
15555889,100040021,8,1,1,0,1,1,1,2,2,96
22334467888,22201130,11,3,1,3,1,1,3,0,0,13824
2235788,21010120,7,0,0,0,3,2,2,0,0,2304
123555799,111030102,9,2,1,1,1,1,1,0,2,6144
12234677788,121101320,11,2,1,2,2,1,2,0,1,36864
3557777889,1020421,10,2,1,1,1,1,2,1,1,576
11223333478,224100110,11,2,1,2,2,1,2,0,1,2304
123344555,112230000,9,2,1,2,1,1,2,0,1,2304
1133334799,204100102,10,1,1,0,3,2,2,0,2,576


In [46]:
# load pre-computed tile combination properties for honor tiles
dragon_df = pd.read_csv('./shanten_jihai.csv', 
                        index_col='tile_int', 
                        dtype={'tile_vector': str})

# trim to only combinations with eleven or fewer tiles that only contain dragons
dragon_df = dragon_df[dragon_df['n_tiles'] <= 11]
no_winds = dragon_df['tile_vector'].apply(lambda x: x[:4]) == '0000'
dragon_df = dragon_df[no_winds]

print(dragon_df.shape)
dragon_df.sample(10)

(124, 7)


tile_int,tile_vector,n_tiles,n_triplets,n_pairs,n_koritsu,n_terminals,n_ways
55566677,332,8,2,1,0,3,96
555577,402,6,1,1,1,2,6
55566667777,344,11,3,0,2,3,4
566777,123,6,1,1,1,3,96
55556677,422,8,1,2,1,3,36
55567777,314,8,2,0,2,3,16
55556666,440,8,2,0,2,2,1
555566,420,6,1,1,1,2,6
555666777,333,9,3,0,0,3,64
567777,114,6,1,0,3,3,16


In [47]:
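def vector_to_int(t_vector):
    # encode a count vector as an integer: tile i is written once per copy
    # (e.g. [2, 0, 1, ...] -> 113); the empty combination maps to 0
    t_int = ''
    for i, cnt in enumerate(t_vector, start=1):
        t_int += str(i) * int(cnt)
    if t_int:
        return int(t_int)
    else:
        return 0

def int_to_vector(t_int, n_types=9):
    # decode an integer back into a count vector of length n_types
    t_vector = np.zeros(n_types, dtype=int)
    t_int = str(t_int)
    for i in t_int:
        if i != '0':  # 0 encodes the empty combination, not a tile
            t_vector[int(i)-1] += 1
    return t_vector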
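
As a quick illustration of the two encodings, here is a dependency-free sketch of the same round trip. The names `int_to_counts` and `counts_to_int` are hypothetical stand-ins for `int_to_vector` and `vector_to_int`, and they use plain lists rather than NumPy arrays.

```python
def int_to_counts(tile_int, n_types=9):
    """Count how many copies of each tile 1..n_types appear in tile_int."""
    counts = [0] * n_types
    for digit in str(tile_int):
        if digit != "0":  # 0 stands for the empty combination
            counts[int(digit) - 1] += 1
    return counts


def counts_to_int(counts):
    """Write each tile number once per copy, in ascending order."""
    digits = "".join(str(tile) * cnt for tile, cnt in enumerate(counts, start=1))
    return int(digits) if digits else 0


example = 556788999                      # a tile_int from the suited sample above
counts = int_to_counts(example)
print(counts)                            # [0, 0, 0, 0, 2, 1, 1, 2, 3]
print(counts_to_int(counts) == example)  # True
```

The integer form is convenient as a DataFrame index, while the count form is what gets summed when groups are combined.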

## General Probabilities
- How many possible hands are there?
- How many of those hands form a winning combination? (Tenhou/Chiihou equivalent)

In [48]:
### How many possible hands are there, winning or otherwise?
total_hands = math.comb(21*4,11)

print(total_hands)

18574174153080


In [49]:
### How many possible winning hands are there?
suited_complete = suited_df.query('(3 * n_sets + 2 * n_pairs == n_tiles) & (n_pairs <= 1)')
suited_complete_ways = suited_complete.groupby(['n_tiles', 'n_sets', 'n_pairs']).sum(numeric_only=True)['n_ways'].reset_index()
suited_complete_ways

index,n_tiles,n_sets,n_pairs,n_ways
0,0,0,0,1
1,2,0,1,54
2,3,1,0,484
3,5,1,1,19200
4,6,2,0,65272
5,8,2,1,1748756
6,9,3,0,2742868
7,11,3,1,47037380


In [50]:
dragon_complete = dragon_df.query('(3 * n_triplets + 2 * n_pairs == n_tiles) & (n_pairs <= 1)')
dragon_complete_ways = dragon_complete.groupby(['n_tiles', 'n_triplets', 'n_pairs']).sum(numeric_only=True)['n_ways'].reset_index()
dragon_complete_ways

index,n_tiles,n_triplets,n_pairs,n_ways
0,0,0,0,1
1,2,0,1,18
2,3,1,0,12
3,5,1,1,144
4,6,2,0,48
5,8,2,1,288
6,9,3,0,64


In [51]:
total_winning_hands = 0
# sou, pin, hon are row positions in the three summary tables above; a hand is
# complete when its bamboo, circle, and dragon parts hold 11 tiles in total
for sou, pin, hon in product(range(8), range(8), range(7)):
    total_tiles = suited_complete_ways.loc[sou,'n_tiles'] + suited_complete_ways.loc[pin,'n_tiles'] + dragon_complete_ways.loc[hon,'n_tiles']
    if total_tiles == 11:
        total_winning_hands += suited_complete_ways.loc[sou,'n_ways'] * suited_complete_ways.loc[pin,'n_ways'] * dragon_complete_ways.loc[hon,'n_ways']

print(total_winning_hands)
print(f"proportion: {total_winning_hands/total_hands:0.7f}; 1 in {total_hands/total_winning_hands:.0f}")

6232346696
proportion: 0.0003355; 1 in 2980
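
Checking only that the tile counts add up to 11 is sufficient here. Each part holds 3 tiles per set and 2 per pair, with at most one pair, and 11 = 3·3 + 2 is the only way to reach 11 with at most three pairs overall, so the parts must together contain exactly three sets and one pair. As a cross-check, the total can be recomputed from the summary tables alone. This sketch copies the printed `n_ways` values into two hypothetical dictionaries keyed by tile count:

```python
from itertools import product

# n_ways by tile count, copied from the two summary tables above
suited = {0: 1, 2: 54, 3: 484, 5: 19200, 6: 65272,
          8: 1748756, 9: 2742868, 11: 47037380}
dragon = {0: 1, 2: 18, 3: 12, 5: 144, 6: 48, 8: 288, 9: 64}

# multiply the independent choices for bamboo, circles, and dragons
total = sum(
    suited[sou] * suited[pin] * dragon[hon]
    for sou, pin, hon in product(suited, suited, dragon)
    if sou + pin + hon == 11
)
print(total)  # 6232346696
```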


## Specific Hand Type Proportions
- **Terminals and Honors**
  - All Simples (_tanyao_): only numeric tiles from 2-8
  - Included Terminals and Honors (_chanta_): each set and the pair includes a 1, 9, or dragon
  - Included Terminals (_junchan_): each set and the pair includes a 1 or 9; no dragons
  - All Terminals and Honors (_honroutou_): each set and the pair consists of only 1s, 9s, or dragons
  - All Terminals (_chinroutou_): each set and the pair consists of only 1s or 9s
- **Set Consistency**
  - All Sequences (_pinfu_-like): three sequences and a pair
  - All Triplets (_toitoi_; _sanankou_-like): three triplets and a pair
- **Dragon Triplets**
  - Dragon Triplet (_yakuhai_): triplet of dragons
  - 2x Dragon Triplet: two triplets of dragons
  - Small Three Dragons (_shousangen_): two triplets of dragons + pair of third
  - Big Three Dragons (_daisangen_): three triplets of dragons
- **Single Numeric Suit**
  - Half Flush (_honitsu_): all tiles are of a single numeric suit (bamboo, circles) or dragons
  - Full Flush (_chinitsu_): all tiles are of a single numeric suit; no dragons
- **Other Set Patterns**
  - Two Identical Sequences (_iipeikou_-like): two identical sequences in the same suit
  - Full Straight (_ikkitsuukan_): the sequences 1-2-3, 4-5-6, and 7-8-9 in a single suit

In [52]:
# defining sets for assembling winning combinations
sequences = [int_to_vector(123), int_to_vector(234), int_to_vector(345), int_to_vector(456),
             int_to_vector(567), int_to_vector(678), int_to_vector(789), np.zeros(9,dtype=int)]
triplets  = [int_to_vector(111), int_to_vector(222), int_to_vector(333), int_to_vector(444), int_to_vector(555),
             int_to_vector(666), int_to_vector(777), int_to_vector(888), int_to_vector(999), np.zeros(9,dtype=int)]
sets      = sequences[:-1] + triplets

pairs = [int_to_vector(11), int_to_vector(22), int_to_vector(33), int_to_vector(44), int_to_vector(55),
         int_to_vector(66), int_to_vector(77), int_to_vector(88), int_to_vector(99), np.zeros(9,dtype=int)]

terminal_sequences = [int_to_vector(123), int_to_vector(789), np.zeros(9,dtype=int)]
terminal_triplets  = [int_to_vector(111), int_to_vector(999), np.zeros(9,dtype=int)]
terminal_sets      = terminal_sequences[:-1] + terminal_triplets
terminal_pairs     = [int_to_vector(11),  int_to_vector(99),  np.zeros(9,dtype=int)]

In [53]:
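def assemble_from_groups(*args):
    # build every combination that takes one item from each group in args
    test_groups = product(*args)

    valid_groups = []
    for test_group in test_groups:
        test_vector = np.array(test_group).sum(axis=0)
        # keep only combinations that use at most four copies of each tile
        if (test_vector <= 4).all():
            valid_groups.append(vector_to_int(test_vector))
    # drop duplicates produced by picking the same groups in a different order
    valid_groups = np.unique(np.array(valid_groups))
    
    return valid_groups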
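
To see what the helper does, here is a small dependency-free sketch of the same idea. `assemble` and `counts` are hypothetical list-based stand-ins for `assemble_from_groups` and `int_to_vector`; each tile group is a 9-entry count list.

```python
from itertools import product


def counts(tile_int):
    """9-entry count list for a digit-encoded group such as 123 or 11."""
    out = [0] * 9
    for digit in str(tile_int):
        if digit != "0":
            out[int(digit) - 1] += 1
    return out


def assemble(*groups):
    """Distinct combinations with one item per group and at most 4 copies per tile."""
    found = set()
    for choice in product(*groups):
        total = [sum(column) for column in zip(*choice)]
        if max(total) <= 4:
            digits = "".join(str(t) * c for t, c in enumerate(total, start=1))
            found.add(int(digits) if digits else 0)
    return sorted(found)


seq_123, pair_11, pair_44 = counts(123), counts(11), counts(44)

# three copies of 1-2-3 plus a pair of 1s would need five 1s, so it is rejected
print(assemble([seq_123], [seq_123], [seq_123], [pair_11]))  # []
# the same three sets with a pair of 4s stay within four copies per tile
print(assemble([seq_123], [seq_123], [seq_123], [pair_44]))  # [11122233344]
```

Including an all-zero vector in a group, as the lists above do, lets that slot stay empty, which is how partial hands of fewer than 11 tiles are generated.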

#### Terminals and Honors

In [54]:
### All Simples
simple_complete = suited_complete.query('n_terminals == 0')
simple_complete_ways = simple_complete.groupby(['n_tiles', 'n_sets', 'n_pairs']).sum(numeric_only=True)['n_ways'].reset_index()

winning_hands = 0
for sou in range(8):
    pin = 7-sou
    winning_hands += simple_complete_ways.loc[sou, 'n_ways'] * simple_complete_ways.loc[pin, 'n_ways']

print(winning_hands)
print(f"proportion: {winning_hands/total_winning_hands:0.7f}; 1 in {total_winning_hands/winning_hands:.2f}")

835960840
proportion: 0.1341326; 1 in 7.46


In [87]:
### Included Terminals and Honors
valid_groups = assemble_from_groups(terminal_sets, terminal_sets, terminal_sets, terminal_pairs)

terminal_complete = suited_complete.loc[valid_groups,:]
terminal_complete_ways = terminal_complete.groupby(['n_tiles', 'n_sets', 'n_pairs']).sum(numeric_only=True)['n_ways'].reset_index()

winning_hands = 0
for sou, pin, hon in product(range(8), range(8), range(1,7)): # must include some number of honors
    total_tiles = terminal_complete_ways.loc[sou,'n_tiles'] + terminal_complete_ways.loc[pin,'n_tiles'] + dragon_complete_ways.loc[hon,'n_tiles']
    if total_tiles == 11:
        winning_hands += terminal_complete_ways.loc[sou,'n_ways'] * terminal_complete_ways.loc[pin,'n_ways'] * dragon_complete_ways.loc[hon,'n_ways']

winning_hands -= (math.comb(7, 3) * 4 - math.comb(4, 3)) * 4 ** 3 * 6 # exclude "All Terminals and Honors" hands

print(winning_hands)
print(f"proportion: {winning_hands/total_winning_hands:0.7f}; 1 in {total_winning_hands/winning_hands:.0f}")

35279040
proportion: 0.0056606; 1 in 177


In [85]:
### Included Terminals
winning_hands = 0
for sou in range(8):
    pin = 7-sou
    winning_hands += terminal_complete_ways.loc[sou, 'n_ways'] * terminal_complete_ways.loc[pin, 'n_ways']
winning_hands -= math.comb(4, 3) * 4 ** 3 * 6 # exclude "All Terminals" hands

print(winning_hands)
print(f"proportion: {winning_hands/total_winning_hands:0.7f}; 1 in {total_winning_hands/winning_hands:.0f}")


13579968
proportion: 0.0021789; 1 in 459


In [83]:
### All Terminals and Honors
winning_hands  = math.comb(7, 3) * 4 # select three sets and pair
winning_hands -= math.comb(4, 3) # exclude "All Terminals" hands
winning_hands *= 4 ** 3 * 6 # selecting tiles within each group

print(winning_hands)
print(f"proportion: {winning_hands/total_winning_hands:0.3e}; 1 in {total_winning_hands/winning_hands:.0f}")

52224
proportion: 8.380e-06; 1 in 119339


In [84]:
### All Terminals
winning_hands  = math.comb(4, 3) # select three sets (pair is set after selection)
winning_hands *= 4 ** 3 * 6 # select tiles within each group

print(winning_hands)
print(f"proportion: {winning_hands/total_winning_hands:0.3e}; 1 in {total_winning_hands/winning_hands:.0f}")

1536
proportion: 2.465e-07; 1 in 4057517
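
The two closed-form counts above can be confirmed by brute force, since only seven tile types are eligible. A minimal sketch; the tile labels are illustrative names, not identifiers from the notebook:

```python
from itertools import combinations

terminals = ["1s", "9s", "1p", "9p"]   # 1 and 9 of bamboo and circles
dragons = ["white", "green", "red"]
tiles = terminals + dragons

all_terminals_and_honors = 0
all_terminals = 0
for triplet_tiles in combinations(tiles, 3):
    for pair_tile in tiles:
        if pair_tile in triplet_tiles:
            continue  # a triplet and a pair of one type would need five copies
        if all(t in terminals for t in triplet_tiles + (pair_tile,)):
            all_terminals += 1
        else:
            all_terminals_and_honors += 1

ways_per_shape = 4 ** 3 * 6  # C(4,3) per triplet, C(4,2) for the pair
print(all_terminals_and_honors * ways_per_shape)  # 52224
print(all_terminals * ways_per_shape)             # 1536
```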


#### Set Consistency

In [58]:
### All Sequences
valid_groups = assemble_from_groups(sequences, sequences, sequences, pairs)

sequences_complete = suited_complete.loc[valid_groups,:]
sequences_complete_ways = sequences_complete.groupby(['n_tiles', 'n_sets', 'n_pairs']).sum(numeric_only=True)['n_ways'].reset_index()

winning_hands = 0
for sou, pin, hon in product(range(8), range(8), range(2)):
    total_tiles = sequences_complete_ways.loc[sou,'n_tiles'] + sequences_complete_ways.loc[pin,'n_tiles'] + dragon_complete_ways.loc[hon,'n_tiles']
    if total_tiles == 11:
        winning_hands += sequences_complete_ways.loc[sou,'n_ways'] * sequences_complete_ways.loc[pin,'n_ways'] * dragon_complete_ways.loc[hon,'n_ways']

print(winning_hands)
print(f"proportion: {winning_hands/total_winning_hands:0.7f}; 1 in {total_winning_hands/winning_hands:.2f}")

4274787328
proportion: 0.6859033; 1 in 1.46


In [59]:
### All Sequences; no dragons
winning_hands = 0
for sou in range(8):
    pin = 7-sou
    winning_hands += sequences_complete_ways.loc[sou, 'n_ways'] * sequences_complete_ways.loc[pin, 'n_ways']

print(winning_hands)
print(f"proportion: {winning_hands/total_winning_hands:0.7f}; 1 in {total_winning_hands/winning_hands:.2f}")

3344259328
proportion: 0.5365971; 1 in 1.86


In [60]:
### All Triplets
valid_groups = assemble_from_groups(triplets, triplets, triplets, pairs)

triplets_complete = suited_complete.loc[valid_groups,:]
triplets_complete_ways = triplets_complete.groupby(['n_tiles', 'n_sets', 'n_pairs']).sum(numeric_only=True)['n_ways'].reset_index()

winning_hands = 0
for sou, pin, hon in product(range(8), range(8), range(7)):
    total_tiles = triplets_complete_ways.loc[sou,'n_tiles'] + triplets_complete_ways.loc[pin,'n_tiles'] + dragon_complete_ways.loc[hon,'n_tiles']
    if total_tiles == 11:
        winning_hands += triplets_complete_ways.loc[sou,'n_ways'] * triplets_complete_ways.loc[pin,'n_ways'] * dragon_complete_ways.loc[hon,'n_ways']

# same result as math.comb(21,3) * 18 * 4 ** 3 * 6

print(winning_hands)
print(f"proportion: {winning_hands/total_winning_hands:0.7f}; 1 in {total_winning_hands/winning_hands:.0f}")

9192960
proportion: 0.0014750; 1 in 678
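
The closed form noted in the comment above can be read factor by factor: choose three of the 21 tile types for the triplets, one of the remaining 18 types for the pair, and then the physical copies of each tile. A short sketch of that arithmetic:

```python
import math

triplet_types = math.comb(21, 3)  # which three tile types form the triplets
pair_types = 21 - 3               # the pair must use a different tile type
copies = math.comb(4, 3) ** 3 * math.comb(4, 2)  # 4 * 4 * 4 * 6

all_triplets = triplet_types * pair_types * copies
print(all_triplets)  # 9192960
```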


#### Dragon Triplets

In [61]:
### One Dragon Triplet
# honors index 2 (dragon triplet only; the pair is numeric) or 3 (dragon triplet + dragon pair)
winning_hands = 0
for sou, pin, hon in product(range(6), range(6), range(2,4)):
    total_tiles = suited_complete_ways.loc[sou,'n_tiles'] + suited_complete_ways.loc[pin,'n_tiles'] + dragon_complete_ways.loc[hon,'n_tiles']
    if total_tiles == 11:
        winning_hands += suited_complete_ways.loc[sou, 'n_ways'] * suited_complete_ways.loc[pin, 'n_ways'] * dragon_complete_ways.loc[hon,'n_ways']

print(winning_hands)
print(f"proportion: {winning_hands/total_winning_hands:0.7f}; 1 in {total_winning_hands/winning_hands:.1f}")

402121056
proportion: 0.0645216; 1 in 15.5


In [62]:
### Two Dragon Triplets
# honors index 4

winning_hands = 0
for sou in range(4):
    pin = 3-sou
    winning_hands += suited_complete_ways.loc[sou, 'n_ways'] * suited_complete_ways.loc[pin, 'n_ways'] * dragon_complete_ways.loc[4,'n_ways']

print(winning_hands)
print(f"proportion: {winning_hands/total_winning_hands:0.7f}; 1 in {total_winning_hands/winning_hands:.0f}")

4352256
proportion: 0.0006983; 1 in 1432


In [76]:
### Small Three Dragons
# honors index 5; numeric index 2 (one set); multiply by 2 for both suits
winning_hands = 2 * suited_complete_ways.loc[2, 'n_ways'] * dragon_complete_ways.loc[5,'n_ways']

print(winning_hands)
print(f"proportion: {winning_hands/total_winning_hands:0.3e}; 1 in {total_winning_hands/winning_hands:.0f}")

278784
proportion: 4.473e-05; 1 in 22355


In [77]:
### Big Three Dragons
# honors index 6; numeric index 1 (pair); multiply by 2 for both suits
winning_hands = 2 * suited_complete_ways.loc[1, 'n_ways'] * dragon_complete_ways.loc[6,'n_ways']

print(winning_hands)
print(f"proportion: {winning_hands/total_winning_hands:0.3e}; 1 in {total_winning_hands/winning_hands:.0f}")

6912
proportion: 1.109e-06; 1 in 901671


#### Single Numeric Suit

In [80]:
### Half Flush
# calculate for single suit; multiply by 2 for both suits

winning_hands = 0
for hon in range(1,7): # must have some number of honors
    sou = 7-hon
    winning_hands += suited_complete_ways.loc[sou, 'n_ways'] * dragon_complete_ways.loc[hon,'n_ways']
winning_hands *= 2

print(winning_hands)
print(f"proportion: {winning_hands/total_winning_hands:0.7f}; 1 in {total_winning_hands/winning_hands:.1f}")

161640624
proportion: 0.0259358; 1 in 38.6


In [66]:
### Full Flush
# calculate for a single suit; multiply by 2 for both suits
winning_hands = 2 * suited_complete_ways.loc[7, 'n_ways']

print(winning_hands)
print(f"proportion: {winning_hands/total_winning_hands:0.7f}; 1 in {total_winning_hands/winning_hands:.1f}")

94074760
proportion: 0.0150946; 1 in 66.2
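
Both flush counts follow directly from the summary tables. This sketch copies the printed `n_ways` values into two hypothetical dictionaries keyed by tile count and pairs each dragon part with the single-suit part that fills the hand to 11 tiles:

```python
# n_ways by tile count, copied from the summary tables earlier in the notebook
suited = {0: 1, 2: 54, 3: 484, 5: 19200, 6: 65272,
          8: 1748756, 9: 2742868, 11: 47037380}
dragon = {0: 1, 2: 18, 3: 12, 5: 144, 6: 48, 8: 288, 9: 64}

# half flush: at least one dragon group, remaining tiles from one suit
half_flush = 2 * sum(
    suited[11 - hon_tiles] * ways
    for hon_tiles, ways in dragon.items()
    if hon_tiles > 0
)
# full flush: all 11 tiles from one suit
full_flush = 2 * suited[11]

print(half_flush)  # 161640624
print(full_flush)  # 94074760
```

The factor of 2 accounts for the choice between bamboo and circles.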


#### Other Set Patterns

In [67]:
### Two Identical Sequences
# calculate for single suit; multiply by 2 for both suits

identical_sequences = [x+x for x in sequences[:-1]]
valid_groups = assemble_from_groups(identical_sequences, sets, pairs)

iipeikou_complete = suited_complete.loc[valid_groups,:]
iipeikou_complete_ways = iipeikou_complete.groupby(['n_tiles', 'n_sets', 'n_pairs']).sum(numeric_only=True)['n_ways'].reset_index()

winning_hands = 0
for sou, pin, hon in product(range(4), range(4), range(4)):
    total_tiles = iipeikou_complete_ways.loc[sou,'n_tiles'] + suited_complete_ways.loc[pin,'n_tiles'] + dragon_complete_ways.loc[hon,'n_tiles']
    if total_tiles == 11:
        winning_hands += iipeikou_complete_ways.loc[sou, 'n_ways'] * suited_complete_ways.loc[pin, 'n_ways'] * dragon_complete_ways.loc[hon, 'n_ways']

winning_hands *= 2

print(winning_hands)
print(f"proportion: {winning_hands/total_winning_hands:0.7f}; 1 in {total_winning_hands/winning_hands:.1f}")

195974976
proportion: 0.0314448; 1 in 31.8


In [68]:
### Full Straight
# calculate for single suit; multiply by 2 for both suits
winning_hands = suited_complete.loc[123456789,'n_ways'] # making a straight
winning_hands *= 9 * 1 + 12 * 6 # selecting a pair: in suit (weight 1, since C(4,3) == C(4,1)) + out of suit (C(4,2) == 6 each)
winning_hands *= 2 # flipping the numeric suit

print(winning_hands)
print(f"proportion: {winning_hands/total_winning_hands:0.7f}; 1 in {total_winning_hands/winning_hands:.0f}")

42467328
proportion: 0.0068140; 1 in 147
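
The factor of 1 for an in-suit pair may look surprising. When the pair shares a tile with the straight, that tile appears three times instead of once, and choosing 3 of 4 copies has as many ways as choosing 1 of 4. A sketch of the full count under that reading:

```python
import math

straight = math.comb(4, 1) ** 9  # one copy of each tile 1..9

# in-suit pair: eight tiles appear once, one tile appears three times
in_suit = math.comb(4, 1) ** 8 * math.comb(4, 3)
assert in_suit == straight       # same number of ways, hence the factor of 1

# out-of-suit pair: 9 tiles of the other suit plus 3 dragons, C(4,2) ways each
out_of_suit = straight * math.comb(4, 2)

full_straight = 2 * (9 * in_suit + 12 * out_of_suit)
print(full_straight)  # 42467328
```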


## Shanten Calculations

- What is the shanten distribution across all ten-tile hands, and what is the average shanten count?

In [69]:
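# total n_ways for each (tiles, sets, blocks, pairs) profile within one numeric suit
suited_ways = suited_df.groupby(['n_tiles', 'n_sets', 'n_blocks', 'n_pairs']).agg({'n_ways': 'sum'}).reset_index()
suited_ways = suited_ways[suited_ways['n_tiles'] <= 10]

# same for dragons, with columns renamed to match the suited table; dragons
# cannot form sequences, so their only incomplete blocks are pairs
dragon_ways = dragon_df.groupby(['n_tiles', 'n_triplets', 'n_pairs']).agg({'n_ways': 'sum'}).reset_index()
dragon_ways = dragon_ways[dragon_ways['n_tiles'] <= 10]
dragon_ways = dragon_ways.rename(columns={'n_triplets':'n_sets'})
dragon_ways['n_blocks'] = dragon_ways['n_pairs']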

In [70]:
shanten_ways = np.zeros(7,dtype=np.int64)

for sou_idx in suited_ways.index:
    sou_part = suited_ways.loc[sou_idx]
    
    pin_ways = suited_ways[suited_ways['n_tiles'] <= 10-sou_part['n_tiles']]
    for pin_idx in pin_ways.index:
        pin_part = pin_ways.loc[pin_idx]

        hon_ways = dragon_ways[dragon_ways['n_tiles'] == 10-sou_part['n_tiles']-pin_part['n_tiles']]
        for hon_idx in hon_ways.index:
            hon_part = hon_ways.loc[hon_idx]
            hand = sou_part + pin_part + hon_part

            # calculate shanten
            has_pair = min(hand['n_pairs'], 1)
            shanten = 6 - 2 * hand['n_sets'] - has_pair - min(hand['n_blocks']-has_pair, 3-hand['n_sets'])
            shanten_ways[shanten] += sou_part['n_ways'] * pin_part['n_ways'] * hon_part['n_ways']
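
The shanten line in the loop above can be isolated as a small function to make its terms easier to inspect. This is a sketch with a hypothetical function name; it assumes, as the dragon table's `n_blocks = n_pairs` assignment suggests, that `n_blocks` counts pairs as well as partial sequences.

```python
def shanten_three_sets(n_sets, n_blocks, n_pairs):
    """Shanten for a hand that needs three sets and one pair."""
    has_pair = min(n_pairs, 1)  # only one pair can serve as the head
    # remaining blocks only help while there is room for more sets
    usable_blocks = min(n_blocks - has_pair, 3 - n_sets)
    return 6 - 2 * n_sets - has_pair - usable_blocks


print(shanten_three_sets(3, 0, 0))  # 0: three sets plus one loose tile
print(shanten_three_sets(2, 2, 1))  # 0: two sets, a pair, and one more block
print(shanten_three_sets(1, 1, 0))  # 3
print(shanten_three_sets(0, 0, 0))  # 6: no sets, blocks, or pairs
```

Each complete set is worth two steps and each usable block or head pair one step, starting from a maximum of 6, which is why `shanten_ways` has seven slots.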

In [78]:
print(math.comb(84,10))
print(sum(shanten_ways))

2761025887620
2761025887620


In [74]:
print('hands by shanten (in billions):')
print(np.round(shanten_ways / 1e9, 3))
print(np.round(shanten_ways / shanten_ways.sum(), 6))
print('')
print(f'tenpai chance: {shanten_ways[0] / shanten_ways.sum():0.7f}; 1 in {shanten_ways.sum() / shanten_ways[0]:0.0f}')
print(f'average shanten: {(shanten_ways * np.arange(7)).sum() / shanten_ways.sum():0.2f}')

hands by shanten (in billions):
[  14.024  369.033 1367.33   881.904  125.447    3.287    0.   ]
[0.005079 0.133658 0.495225 0.319412 0.045435 0.001191 0.      ]

tenpai chance: 0.0050794; 1 in 197
average shanten: 2.27
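
The average is an expected value over the distribution: each shanten count weighted by its share of hands. It can be reproduced from the rounded proportions printed above, which is a quick way to confirm that the array is indexed by shanten count:

```python
# proportions of ten-tile hands at shanten 0..6, copied from the output above
proportions = [0.005079, 0.133658, 0.495225, 0.319412, 0.045435, 0.001191, 0.0]

average = sum(shanten * share for shanten, share in enumerate(proportions))
print(round(average, 2))  # 2.27
```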
