# Sequential Decision Modeling

## Step 1: Narrative

## Step 2: Identify Core Metrics

* Maximize contribution.
* Demand for each day is unknown.
    * $\hat{W}_{t+1}=\hat{D}_{t+1}$: Demand realized during day $t+1$, after the order $x_t$ has been placed.
    * $\hat{D}_{t+1} \sim \mathcal{N}(\mu = 50, \sigma^2 = 5)$
* Decision variable $x_t$: How many pounds of bacon we order at the end of day $t$.
* Constants:
    * $p=25$: selling price per pound
    * $c=15$: purchase cost per pound
* Initial values:
    * $R_0=10$: starting inventory in pounds
* Policies:
    * Order-up-to level $\theta \in [40,60]$

## Step 3: Mathematical Model

### State:

$$S_t=(R_t, \hat{D}_{t+1}, c, p, \mu, \sigma^2)$$

### Decision Variables:

$$x_t = X^\pi(S_t|\theta) = \max(\theta - R_t, 0)$$

### Exogenous Information:

$$\hat{D}_{t+1} \sim \mathcal{N}(\mu = 50, \sigma^2 = 5)$$
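
The model specifies a variance, not a standard deviation, so a sampler needs $\sigma = \sqrt{5} \approx 2.24$. The following is a minimal sketch, assuming NumPy's `Generator` API and an arbitrarily chosen seed; `sample_demand` is a hypothetical helper name, not part of the notebook.

```python
import numpy as np

MU = 50.0   # mean daily demand (from the model above)
VAR = 5.0   # variance of daily demand (from the model above)

def sample_demand(rng: np.random.Generator, size: int) -> np.ndarray:
    """Draw daily demands from N(MU, VAR), rounded to whole pounds."""
    # Generator.normal expects the standard deviation (scale), not the variance.
    return np.round(rng.normal(loc=MU, scale=np.sqrt(VAR), size=size))

rng = np.random.default_rng(0)  # assumption: seed picked only for reproducibility
demand = sample_demand(rng, 100_000)
print(demand.mean(), demand.std())
```

The sample standard deviation should land close to 2.24, slightly inflated by the rounding to whole pounds.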

### Transition Function:

$$R_{t+1}=S^M(S_t,x_t,\hat{D}_{t+1})=\max(R_t+x_t-\hat{D}_{t+1}, 0)$$

### Objective Function:

$$\max_{\pi} \mathbb{E}\left\{ \sum_{t=1}^T C(S_t,x_t) \mid S_0 \right\}$$

$$C(S_t,x_t)=-c\,x_t+p\,\min(R_t+x_t,\hat{D}_{t+1})$$
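
As a worked instance of the policy, transition, and contribution for a single day, the sketch below uses the constants from Step 2 together with an assumed order-up-to level of 50 and an assumed demand realization of 48. The helper `step` is hypothetical and only illustrates the formulas.

```python
def step(r_t: float, theta: float, demand: float,
         c: float = 15.0, p: float = 25.0):
    """Apply one day of the model; returns (x_t, r_next, contribution)."""
    x_t = max(theta - r_t, 0)            # policy: order up to theta
    sold = min(r_t + x_t, demand)        # sales are capped by stock on hand
    contribution = -c * x_t + p * sold   # revenue minus purchase cost
    r_next = max(r_t + x_t - demand, 0)  # leftover inventory carries over
    return x_t, r_next, contribution

# Assumed inputs: theta = 50 and a demand of 48 are illustrative choices.
x_0, r_1, c_0 = step(r_t=10, theta=50, demand=48)
print(x_0, r_1, c_0)
```

Here 40 pounds are ordered, 48 are sold, and 2 pounds carry over, for a contribution of $-15 \cdot 40 + 25 \cdot 48 = 600$.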

## Step 4: Uncertainty Model

* Defined in Step 3: $\hat{D}_{t+1} \sim \mathcal{N}(\mu = 50, \sigma^2 = 5)$, drawn independently each day.

## Step 5: Designing Policies

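Order-up-to policy: at the end of each day, order $\max(\theta - R_t, 0)$ pounds so that inventory is restored to $\theta$. The candidate levels are

$$\theta \in [40,60]$$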

## Step 6: Evaluating Policies

In [1]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

In [2]:
def X_t(r_t: float, theta: int) -> float:
    """Order-up-to policy: order enough to bring inventory up to theta."""
    return max(0, theta - r_t)

In [3]:
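c = 15    # purchase cost per pound
p = 25    # selling price per pound
mu = 50   # mean daily demand
var = 5   # variance of daily demand (standard deviation is sqrt(var))
r_0 = 10  # initial inventory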

theta_space = list(range(40, 61))

for theta in theta_space:
    contributions = []

    for _ in range(1000):
        scenarios = []
        r_t = r_0
    
        for i in range(1, 11):
            # Buy inventory
            x_t = X_t(r_t, theta)

            # Exogenous information
            D_t_1 = np.round(np.random.normal(mu, np.sqrt(var)))

            # Transition function
            r_t_1 = np.max([x_t + r_t - D_t_1, 0])

            # Contribution: revenue from pounds actually sold minus purchase cost
            contribution = -c * x_t + p * (np.min([x_t + r_t, D_t_1]))

            scenarios.append({
                "t": i,
                "r_t": r_t,
                "x_t": x_t,
                "D_t_1": D_t_1,
                "C": contribution,
            })

            r_t = r_t_1

        theta_df = pd.DataFrame.from_records(scenarios)

        contributions.append(theta_df.C.sum())

    print(f"Theta = {theta} -> {np.mean(contributions):.2f}")

Theta = 40 -> 4150.00
Theta = 41 -> 4250.00
Theta = 42 -> 4349.97
Theta = 43 -> 4449.97
Theta = 44 -> 4549.69
Theta = 45 -> 4648.86
Theta = 46 -> 4746.46
Theta = 47 -> 4838.99
Theta = 48 -> 4923.11
Theta = 49 -> 4996.15
Theta = 50 -> 5045.38
Theta = 51 -> 5083.36
Theta = 52 -> 5093.66
Theta = 53 -> 5096.34
Theta = 54 -> 5086.23
Theta = 55 -> 5071.60
Theta = 56 -> 5061.10
Theta = 57 -> 5042.52
Theta = 58 -> 5033.94
Theta = 59 -> 5011.31
Theta = 60 -> 5003.91
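
The means for neighboring values of theta near the top of this list differ by only a few units, so it may help to report a standard error next to each mean. The sketch below is one way to do that, not the notebook's own code: it assumes the same 10-day horizon and 1000 replications, uses a seeded generator, and reuses the same demand draws for every theta (common random numbers). The helper `simulate` is hypothetical.

```python
import numpy as np

def simulate(theta: int, n_runs: int = 1000, horizon: int = 10,
             c: float = 15.0, p: float = 25.0, mu: float = 50.0,
             var: float = 5.0, r_0: float = 10.0, seed: int = 0):
    """Return (mean, standard error) of total contribution for one theta."""
    rng = np.random.default_rng(seed)  # same seed => same demands for every theta
    # One row per replication, one column per day.
    demand = np.round(rng.normal(mu, np.sqrt(var), size=(n_runs, horizon)))
    r_t = np.full(n_runs, r_0)
    total = np.zeros(n_runs)
    for t in range(horizon):
        x_t = np.maximum(theta - r_t, 0)                          # policy
        total += -c * x_t + p * np.minimum(r_t + x_t, demand[:, t])  # contribution
        r_t = np.maximum(r_t + x_t - demand[:, t], 0)             # transition
    return total.mean(), total.std(ddof=1) / np.sqrt(n_runs)

results = {theta: simulate(theta) for theta in range(40, 61)}
best_theta = max(results, key=lambda th: results[th][0])
for theta, (mean, se) in results.items():
    print(f"Theta = {theta} -> {mean:.2f} +/- {se:.2f}")
print("Best theta:", best_theta)
```

Sharing the demand draws across theta values reduces the noise in comparisons between policies, at the cost of the estimates no longer being independent of each other.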


In [4]:
import yfinance as yf

In [14]:
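ticker = "AAPL"
# Daily closing prices for the last five years
df = yf.download(ticker, period='5y', interval='1d')['Close']
# 5-day forward return, P[t+5] / P[t] - 1 (a return, not a price level, despite the name)
df['future_price'] = df['AAPL'].pct_change(periods=5).shift(-5)
df.head(10)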



| Date | AAPL | future_price |
|---|---|---|
| 2020-10-26 | 111.911186 | -0.054585 |
| 2020-10-27 | 113.418884 | -0.05283 |
| 2020-10-28 | 108.166222 | 0.033723 |
| 2020-10-29 | 112.173828 | 0.032171 |
| 2020-10-30 | 105.89006 | 0.09218 |
| 2020-11-02 | 105.802505 | 0.071258 |
| 2020-11-03 | 107.426956 | 0.051884 |
| 2020-11-04 | 111.813911 | 0.041289 |
| 2020-11-05 | 115.7826 | 0.00324 |
| 2020-11-06 | 115.651054 | 0.004802 |
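
The `pct_change(periods=5).shift(-5)` idiom aligns each row with the return realized five rows later. A small check on synthetic prices (made-up numbers, no download required) may make the alignment easier to see; it uses a horizon of 2 instead of 5 for brevity.

```python
import pandas as pd

# Synthetic closing prices, invented purely for illustration.
prices = pd.Series([100.0, 110.0, 121.0, 108.9, 130.68], name="close")
h = 2  # look-ahead horizon in rows (the notebook uses 5)

# pct_change(h) is the backward-looking return; shift(-h) moves it h rows earlier.
forward_return = prices.pct_change(periods=h).shift(-h)

# Equivalent explicit form: P[t+h] / P[t] - 1.
explicit = prices.shift(-h) / prices - 1

print(pd.DataFrame({"close": prices, "fwd": forward_return, "explicit": explicit}))
```

The last `h` rows are `NaN` because the future price is not yet observed, which is also why the most recent rows of the AAPL frame have no `future_price`.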


In [16]:
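# Three-class label from the 5-day forward return, with a dead band of +/- alpha
df['final_signal'] = 0  # default: neutral (also applies to trailing rows where future_price is NaN)
alpha = 0.02            # 2% threshold
df.loc[df['future_price'] > alpha, 'final_signal'] = 1    # return above +alpha
df.loc[df['future_price'] < -alpha, 'final_signal'] = -1  # return below -alpha
df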

| Date | AAPL | future_price | final_signal |
|---|---|---|---|
| 2020-10-26 | 111.911186 | -0.054585 | -1 |
| 2020-10-27 | 113.418884 | -0.052830 | -1 |
| 2020-10-28 | 108.166222 | 0.033723 | 1 |
| 2020-10-29 | 112.173828 | 0.032171 | 1 |
| 2020-10-30 | 105.890060 | 0.092180 | 1 |
| ... | ... | ... | ... |
| 2025-10-20 | 262.239990 | NaN | 0 |
| 2025-10-21 | 262.769989 | NaN | 0 |
| 2025-10-22 | 258.450012 | NaN | 0 |
| 2025-10-23 | 259.579987 | NaN | 0 |
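
One caveat with this labeling is that the final rows, whose forward return is `NaN`, receive the default label 0 and so look like genuine neutral observations. The sketch below shows one possible alternative, assuming a nullable integer label is acceptable downstream; the input values and the helper name `label_signal` are illustrative only.

```python
import numpy as np
import pandas as pd

def label_signal(forward_return: pd.Series, alpha: float = 0.02) -> pd.Series:
    """Map forward returns to 1, -1, or 0, keeping unknown returns as missing."""
    labels = np.select(
        [forward_return > alpha, forward_return < -alpha],  # comparisons with NaN are False
        [1, -1],
        default=0,
    )
    out = pd.Series(labels, index=forward_return.index, dtype="Int64")
    # Rows without a known future return become <NA> instead of a misleading 0.
    return out.mask(forward_return.isna())

# Illustrative inputs: three returns plus one missing value.
fwd = pd.Series([-0.054585, 0.033723, 0.00324, np.nan])
signal = label_signal(fwd)
print(signal.tolist())
```

Dropping or masking the unlabeled rows before any model fitting keeps them from being counted as neutral examples.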
