# Problem 470 - Super Ramvok
<p>Consider a single game of Ramvok:</p>

<p>Let $t$ represent the maximum number of turns the game lasts. If $t = 0$, then the game ends immediately. Otherwise, on each turn $i$, the player rolls a die. After rolling, if $i \lt t$ the player can either stop the game and receive a prize equal to the value of the current roll, or discard the roll and try again next turn. If $i = t$, then the roll cannot be discarded and the prize must be accepted. Before the game begins, $t$ is chosen by the player, who must then pay an up-front cost $ct$ for some constant $c$. For $c = 0$, $t$ can be chosen to be infinite (with an up-front cost of $0$). Let $R(d, c)$ be the expected profit (i.e. net gain) that the player receives from a single game of optimally-played Ramvok, given a fair $d$-sided die and cost constant $c$. For example, $R(4, 0.2) = 2.65$. Assume that the player has sufficient funds for paying any/all up-front costs.</p>
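
The single-game value quoted above can be checked with a short backward recursion. The sketch below is a minimal illustration rather than the notebook's own code: `ramvok_profit` is a hypothetical helper name, and it assumes that the profit rises and then falls as $t$ grows, so the search can stop at the first drop.

```python
def ramvok_profit(faces, c):
    """Expected profit of one optimally played Ramvok game (illustrative sketch).

    faces: the values on the die, each equally likely
    c:     cost per turn, assumed positive here
    """
    n = len(faces)
    best = 0.0                 # t = 0: pay nothing, win nothing
    value = sum(faces) / n     # expected prize when exactly one turn is left
    t = 1
    # Assumption: profit rises and then falls as t grows, so stop at the first drop.
    while value - c * t >= best:
        best = value - c * t
        # With one more turn available, keep a roll only if it beats `value`.
        value = sum(max(value, f) for f in faces) / n
        t += 1
    return best


print(round(ramvok_profit((1, 2, 3, 4), 0.2), 6))  # 2.65
```

For the 4-sided die with $c = 0.2$ the profits for $t = 1, 2, 3, 4$ are $2.3$, $2.6$, $2.65$ and $2.6375$, so the best choice is $t = 3$.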

<p>Now consider a game of Super Ramvok:</p>

<p>In Super Ramvok, the game of Ramvok is played repeatedly, but with a slight modification. After each game, the die is altered. The alteration process is as follows: The die is rolled once, and if the resulting face has its pips visible, then that face is altered to be blank instead. If the face is already blank, then it is changed back to its original value. After the alteration is made, another game of Ramvok can begin (and during such a game, at each turn, the die is rolled until a face with a value on it appears). The player knows which faces are blank and which are not at all times. The game of Super Ramvok ends once all faces of the die are blank.</p>

<p>Let $S(d, c)$ be the expected profit that the player receives from an optimally-played game of Super Ramvok, given a fair $d$-sided die to start (with all sides visible), and cost constant $c$. For example, $S(6, 1) = 208.3$.</p>
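
For a small die, the value $S(6, 1) = 208.3$ can be reproduced by brute force: treat every visible/blank pattern as a state and solve one linear equation per state. This is only a sketch under assumptions. The helper names `game_value` and `super_ramvok_bruteforce` are hypothetical, each state is encoded as a bitmask, and the single-game search again assumes the profit is unimodal in $t$.

```python
import numpy as np


def game_value(faces, c):
    """Best single-game profit on the visible faces (hypothetical helper)."""
    if c == 0:
        return float(max(faces))   # free turns: wait for the largest face
    n = len(faces)
    best, value, t = 0.0, sum(faces) / n, 1
    while value - c * t >= best:   # assumes profit is unimodal in t
        best = value - c * t
        value = sum(max(value, f) for f in faces) / n
        t += 1
    return best


def super_ramvok_bruteforce(d, c):
    """Solve E[m] = R(m) + (1/d) * sum_i E[m with face i flipped], E[0] = 0."""
    size = 1 << d                  # one state per visible/blank pattern
    A = np.eye(size)
    y = np.zeros(size)
    for mask in range(1, size):
        # Bit i of mask set means the face with value i + 1 is visible.
        faces = tuple(i + 1 for i in range(d) if mask >> i & 1)
        y[mask] = game_value(faces, c)
        for i in range(d):
            A[mask, mask ^ (1 << i)] -= 1.0 / d
    E = np.linalg.solve(A, y)
    return E[size - 1]             # start with every face visible


print(round(super_ramvok_bruteforce(6, 1), 4))  # 208.3
```

The dense system has $2^d$ unknowns, so this check is practical only for small $d$; for $d = 20$ a more compact formulation, such as the one used in `S` in the solution, is needed.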

<p>Let $F(n) = \sum_{4 \le d \le n} \sum_{0 \le c \le n} S(d, c)$.</p>

<p>Calculate $F(20)$, rounded to the nearest integer.</p>

## Solution.

In [1]:
from functools import cache, lru_cache
from itertools import combinations

import numpy as np
from tqdm import tqdm

In [2]:
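@cache
def R_help(dice, c, t):
    '''
    Expected profit of one Ramvok game with at most t turns.

    dice - a tuple of the visible face values on the die
    c    - cost constant (the up-front cost is c*t)
    t    - maximum number of turns paid for
    '''
    if c == 0:
        # Turns are free, so t may be infinite: wait for the largest face.
        return max(dice)

    if t == 0:
        return 0

    if t == 1:
        # The single roll must be accepted.
        return sum(dice)/len(dice) - c

    # Expected gross prize with t-1 turns left (add the cost back to the profit).
    keep_going = R_help(dice, c, t-1) + c*(t-1)
    total = 0
    for d in dice:
        # Keep the roll only if it beats the value of continuing.
        total += max(keep_going, d)

    return total/len(dice) - c*t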

In [10]:
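@cache
def R(dice, c):
    # Check for the empty die first: max(()) would raise ValueError.
    if dice == ():
        return 0

    if c > max(dice):
        # Even a single turn costs more than the best possible prize.
        return 0

    if c == 0:
        return max(dice)

    # The gain from an extra turn shrinks while its cost stays c,
    # so increase t until the profit first drops.
    e = -1
    t = 0
    while R_help(dice, c, t) >= e:
        e = R_help(dice, c, t)
        t += 1

    return e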

In [11]:
def generate_layer(k, d):
    if k == 0:
        return [()]
    return list(combinations(range(1, d + 1), k))

In [16]:
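@lru_cache
def S(d, c):
    if c > d:
        # No game is ever worth paying for, so every R is 0.
        return 0

    # x[k] = sum of the expected Super Ramvok profit over all dice with exactly k visible faces.
    # Flipping a uniformly random face links layer k to layers k-1 and k+1.
    A = np.zeros((d+1, d+1), dtype=float)
    y = np.zeros(d+1)
    A[0][0] = 1  # all faces blank: the game is over, so x[0] = 0

    for k in range(1, d):
        y[k] = sum(R(v, c) for v in generate_layer(k, d))
        A[k][k] = 1
        A[k][k-1] = -(d-(k-1))/d  # each (k-1)-face die is reached from d-(k-1) k-face dice
        A[k][k+1] = -(k+1)/d      # each (k+1)-face die is reached from k+1 k-face dice

    A[d][d] = 1
    A[d][d-1] = -1/d
    y[d] = R(tuple(range(1, d+1)), c)

    x = np.linalg.solve(A, y)

    # Layer d holds only the full die, so x[d] is S(d, c) itself.
    return x[-1]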
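
The tridiagonal system built in `S` can be read as the state-by-state recurrence summed over each layer of dice with the same number of visible faces. The derivation below is a reconstruction from the code, not a statement made in the notebook; the symbols $E(v)$, $X_k$ and $Y_k$ are introduced here for illustration, and $R(v, c)$ denotes the single-game value for the visible-face set $v$, as in the code's `R(dice, c)`.

```latex
% E(v): expected Super Ramvok profit starting from visible-face set v, with E(\emptyset) = 0
% v \oplus i: the set v with face i toggled between visible and blank
E(v) = R(v, c) + \frac{1}{d} \sum_{i=1}^{d} E(v \oplus i)

% Sum over all v with |v| = k, writing
%   X_k = \sum_{|v| = k} E(v),   Y_k = \sum_{|v| = k} R(v, c).
% Each (k-1)-set neighbours d-(k-1) sets of size k; each (k+1)-set neighbours k+1 of them.
X_k = Y_k + \frac{d-(k-1)}{d} X_{k-1} + \frac{k+1}{d} X_{k+1}, \qquad 1 \le k \le d-1

% Boundary layers: the blank die ends the game, and the full die is the only d-set.
X_0 = 0, \qquad X_d = Y_d + \frac{1}{d} X_{d-1}, \qquad S(d, c) = X_d
```

Row $k$ of `A` holds the coefficients of $X_{k-1}$, $X_k$ and $X_{k+1}$ moved to the left-hand side, and `y[k]` holds $Y_k$.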


In [13]:
S(6,1)

np.float64(208.29999999999913)

In [14]:
def F(n):
    s = 0
    for d in tqdm(range(4, n+1)):
        # S(d, c) is 0 whenever c > d, so the terms with d < c <= n add nothing.
        for c in range(0, n+1):
            s += S(d, c)

    return s

In [15]:
F(20)

100%|██████████████████████████████████████████████████████████████████████████████████| 17/17 [04:32<00:00, 16.06s/it]


np.float64(147668793.78897253)

ANSWER: 147668794