# Implementing the beer game described by John Sterman (1989)

## Introduction

The stock management control problem is divided into two parts:

 1. the stock and flow structure of the system
 2. the decision rule used by the manager

### Variables

 - $S(t)$ : Stock level at time $t$ (i.e. inventory).
 - $A(t)$ : Acquisition rate at time $t$ (i.e. additions to stock, such as deliveries from the supplier).
 - $L(t)$ : Loss rate at time $t$ (i.e. reductions from stock, such as shipments to the customer).
 - $O(t)$ : Actual orders at time $t$, $O(t) \ge 0$.
 - ${IO}(t)$ : Indicated orders at time $t$.
 - $\hat{L}(t)$ : Expected stock losses at time $t$.
 - ${AS}(t)$ : Adjustment for stock at time $t$, proportional to the difference between desired and actual stock.
 - ${ASL}(t)$ : Adjustment for supply line at time $t$, proportional to the difference between desired and actual supply line.
 - $S^*(t)$ : Desired stock level at time $t$.
 - ${SL}^*(t)$ : Desired supply line level at time $t$.
 - $\alpha_{s}$ : Decision rule parameter – the adjustment rate for the stock.
 - $\alpha_{sl}$ : Decision rule parameter – the adjustment rate for the supply line.

### Stock and flow system model

The stock at the next time instant, $S(t+1)$, is the stock at the current time instant plus the current acquisition rate, less the current loss rate (I assume the acquisition and loss rates are defined in units per period and are constant during each period):

\begin{equation}
S(t+1) = A(t) - L(t) + S(t). \tag{1}
\end{equation}

The acquisition rate depends on the supply line of units, $SL(t)$, which have been ordered but not yet received, and the average acquisition lag, $\lambda$. The supply line at the next time instant is simply the accumulation of the orders placed in the current period $O(t)$, less those which have been delivered $A(t)$:

\begin{equation}
SL(t+1) = O(t) - A(t) + {SL}(t). \tag{2}
\end{equation}
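
As a minimal sketch of how Eqs. 1 and 2 evolve together, the loop below iterates both updates over a few periods. The text only says that the acquisition rate depends on the supply line and the acquisition lag, so the specific form $A(t) = SL(t) / \lambda$ used here is an assumption (a first-order delay), and the function name `simulate_stock_and_supply_line` is hypothetical.

```python
def simulate_stock_and_supply_line(orders, losses, stock_0, supply_line_0, acquisition_lag):
    """Iterate Eqs. 1 and 2 over the given order and loss sequences.

    Assumption (not from the source): A(t) = SL(t) / lambda.
    """
    stock, supply_line = stock_0, supply_line_0
    history = [(stock, supply_line)]
    for order_rate, loss_rate in zip(orders, losses):
        acquisition_rate = supply_line / acquisition_lag            # assumed form of A(t)
        stock = stock + acquisition_rate - loss_rate                # Eq. 1
        supply_line = supply_line + order_rate - acquisition_rate   # Eq. 2
        history.append((stock, supply_line))
    return history


# Orders equal to losses, with SL = lambda * L: the system stays in equilibrium.
steady = simulate_stock_and_supply_line(
    orders=[4, 4, 4], losses=[4, 4, 4], stock_0=12, supply_line_0=8, acquisition_lag=2
)

# A one-period surge in orders fills the supply line first and reaches the stock later.
surge = simulate_stock_and_supply_line(
    orders=[8, 4], losses=[4, 4], stock_0=12, supply_line_0=8, acquisition_lag=2
)
```

In the second run the extra order raises the supply line in the first period and only raises the stock one period later, which is the delay the decision rule has to account for.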


### Management decision rule

Managers are assumed to choose orders so as to:
 1. replace expected losses from stock
 2. reduce the discrepancy between the desired and actual stock
 3. maintain an adequate supply line of unfilled orders.

Note that orders must be non-negative, therefore

\begin{equation}
O(t) = \max(0, IO(t)). \tag{3}
\end{equation}

The indicated order rate is based on the anchoring and 
adjustment heuristic (Tversky and Kahneman 1974).
Anchoring and adjustment is a common strategy in which 
an unknown quantity is estimated by first recalling a
known reference point (the anchor) and then adjusting
for the effects of other factors which may be less
salient or whose effects are obscure, requiring the
subject to estimate these effects by what Kahneman and
Tversky (1982) call 'mental simulation.' Here, the
anchor is the expected loss rate $\hat{L}$. Adjustments are then
made to correct discrepancies between the desired and
actual stock ($AS$), and between the desired and actual 
supply line ($ASL$).

\begin{equation}
IO(t) = \hat{L}(t) + AS(t) + ASL(t). \tag{4}
\end{equation}

There is a negative feedback loop which regulates the stock level.  The adjustment rate is linear in the discrepancy between the desired stock $S^*(t)$ and the actual stock:

\begin{equation}
AS(t) = \alpha_s (S^*(t) - S(t)). \tag{5}
\end{equation}

where $\alpha_s$ is the stock adjustment parameter.

Similarly, there is a second feedback loop which regulates the supply line level. The adjustment rate is linear in the discrepancy between the desired supply line level ${SL}^*(t)$ and the actual supply line:

\begin{equation}
ASL(t) = \alpha_{sl} (SL^*(t) - SL(t)). \tag{6}
\end{equation}

where $\alpha_{sl}$ is the supply line adjustment parameter.

In general, the desired supply line is not constant but depends on the desired throughput, $\Phi^*(t)$, and the expected lag between ordering and acquisition of goods, $\hat{\lambda}(t)$:

\begin{equation}
SL^*(t) = \hat{\lambda}(t) \cdot \Phi^*(t). \tag{7}
\end{equation}
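
To make the arithmetic concrete, here is a small sketch that chains Eqs. 3 to 7 into a single ordering decision. The function names `desired_supply_line` and `order_decision` and all numeric values are illustrative assumptions, not taken from Sterman (1989).

```python
def desired_supply_line(expected_lag, desired_throughput):
    """Eq. 7: SL*(t) = lambda_hat(t) * Phi*(t)."""
    return expected_lag * desired_throughput


def order_decision(expected_losses, stock, desired_stock, supply_line,
                   expected_lag, desired_throughput, alpha_s, alpha_sl):
    """Combine Eqs. 3-7 into one order quantity (hypothetical helper)."""
    adjustment_for_stock = alpha_s * (desired_stock - stock)                 # Eq. 5
    supply_line_target = desired_supply_line(expected_lag, desired_throughput)  # Eq. 7
    adjustment_for_supply_line = alpha_sl * (supply_line_target - supply_line)  # Eq. 6
    indicated_orders = (expected_losses + adjustment_for_stock
                        + adjustment_for_supply_line)                        # Eq. 4
    return max(0, indicated_orders)                                          # Eq. 3


# Stock and supply line both 2 units short of target:
# IO = 4 + 0.5 * 2 + 0.25 * 2 = 5.5
shortfall_orders = order_decision(
    expected_losses=4, stock=10, desired_stock=12, supply_line=6,
    expected_lag=2, desired_throughput=4, alpha_s=0.5, alpha_sl=0.25,
)

# A large stock surplus drives indicated orders negative, so Eq. 3 clamps them to 0.
surplus_orders = order_decision(
    expected_losses=4, stock=30, desired_stock=12, supply_line=6,
    expected_lag=2, desired_throughput=4, alpha_s=0.5, alpha_sl=0.25,
)
```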

## Functions to implement system model

In [44]:
def calc_stock_tp1(stock_t, acquisition_rate_t, loss_rate_t, dt=1):
    """Calculate stock at time t+1

        S(t+1) = S(t) + (A(t) - L(t)) * dt

    This is a discrete-time version of Eqn. 1 in Sterman (1989).
    """
    return stock_t + (acquisition_rate_t - loss_rate_t) * dt

assert calc_stock_tp1(10, 4, 3, 1) == 11
assert calc_stock_tp1(10, 4, 3, 0.5) == 10.5

In [45]:
def calc_supply_line_tp1(supply_line_t, order_rate_t, acquisition_rate_t, dt=1):
    """Calculate supply line at time t+1

        SL(t+1) = SL(t) + (O(t) - A(t)) * dt

    This is a discrete-time version of Eqn. 2 in Sterman (1989).
    """
    return supply_line_t + (order_rate_t - acquisition_rate_t) * dt

assert calc_supply_line_tp1(10, 4, 3, 1) == 11
assert calc_supply_line_tp1(10, 4, 3, 0.5) == 10.5

## Functions to implement decision rule

In [46]:
import numpy as np

def calc_orders_t(expected_losses_t, adjustment_for_stock_t, adjustment_for_supply_line_t):
    """Calculate order decision at time t

        IO(t) = L_hat(t) + AS(t) + ASL(t)
        O(t) = max(0, IO(t))

    These are discrete-time versions of Eqns. 3 and 4 in Sterman (1989).
    """
    indicated_orders = expected_losses_t + adjustment_for_stock_t + adjustment_for_supply_line_t
    # np.clip with no upper bound enforces O(t) >= 0 for scalars and arrays alike
    return np.clip(indicated_orders, 0, None)

assert calc_orders_t(1, 2, 3) == 6
assert calc_orders_t(1, 2, -3) == 0
assert calc_orders_t(1, 2, -10) == 0

In [47]:
def calc_adjustment_for_stock_t(stock_t, stock_target_t, alpha_s):
    """Calculate order adjustment for stock discrepancy at time t

        AS(t) = alpha_s * (S_target(t) - S(t))

    This is a discrete-time version of Eqn. 5 in Sterman (1989).
    """
    return alpha_s * (stock_target_t - stock_t)

assert calc_adjustment_for_stock_t(10, 12, 0.5) == 1

In [48]:
def calc_adjustment_for_supply_line_t(supply_line_t, supply_line_target_t, alpha_sl):
    """Calculate order adjustment for supply line discrepancy at time t

        ASL(t) = alpha_sl * (SL_target(t) - SL(t))

    This is a discrete-time version of Eqn. 6 in Sterman (1989).
    """
    return alpha_sl * (supply_line_target_t - supply_line_t)

assert calc_adjustment_for_supply_line_t(12, 10, 0.5) == -1
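
The system model and the decision rule can be closed into a single feedback loop. The sketch below is one possible way to do that for a single stock, not the full multi-stage beer game. It rests on three assumptions that are not stated in the text above: acquisitions follow $A(t) = SL(t) / \lambda$, expected losses equal current losses ($\hat{L}(t) = L(t)$), and the desired throughput equals expected losses ($\Phi^*(t) = \hat{L}(t)$). The function name `simulate_stock_management` and the parameter values are hypothetical.

```python
def simulate_stock_management(losses, desired_stock, acquisition_lag, alpha_s, alpha_sl):
    """Closed-loop sketch of Eqs. 1-7 for a single stock.

    Assumptions (not from the source): A(t) = SL(t) / lambda,
    L_hat(t) = L(t), and Phi*(t) = L_hat(t).
    """
    stock = desired_stock
    supply_line = acquisition_lag * losses[0]  # start in equilibrium
    stocks, orders = [stock], []
    for loss_rate in losses:
        expected_losses = loss_rate                              # assumed static expectations
        supply_line_target = acquisition_lag * expected_losses   # Eq. 7
        indicated_orders = (expected_losses
                            + alpha_s * (desired_stock - stock)              # Eq. 5
                            + alpha_sl * (supply_line_target - supply_line))  # Eq. 6
        order_rate = max(0, indicated_orders)                    # Eqs. 3 and 4
        acquisition_rate = supply_line / acquisition_lag         # assumed form of A(t)
        stock = stock + acquisition_rate - loss_rate             # Eq. 1
        supply_line = supply_line + order_rate - acquisition_rate  # Eq. 2
        stocks.append(stock)
        orders.append(order_rate)
    return stocks, orders


# Constant losses: the system stays at its equilibrium.
flat_stocks, flat_orders = simulate_stock_management(
    losses=[4] * 5, desired_stock=12, acquisition_lag=2, alpha_s=0.5, alpha_sl=0.5
)

# Losses step from 4 to 8: the first order after the step exceeds the new loss rate.
step_stocks, step_orders = simulate_stock_management(
    losses=[4, 8], desired_stock=12, acquisition_lag=2, alpha_s=0.5, alpha_sl=0.5
)
```

With these values the order placed in the period of the step is 12 units against a loss rate of 8, because the rule both replaces the higher expected losses and rebuilds the supply line toward its new target.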

## Beer Game Experiment

## References

 - John Sterman (1989). Modeling Managerial Behavior: Misperceptions of Feedback in a Dynamic Decision Making Experiment, Management Science, Vol. 35, No. 3 (Mar., 1989), pp. 321-339, https://www.jstor.org/stable/2631975. 
