# Structural Price-Sales Coupling Prediction (2024-2030)

## Objective

Generate market-linked predictions for the Problem 3 optimization by:
- Modeling price evolution via Geometric Brownian Motion (GBM)
- Coupling sales response through a cross-price elasticity matrix
- Simulating yield and cost with bounded uniform random shocks

## Key Innovation

**Problem 2** (independent): Each crop's sales predicted separately

**Problem 3** (coupled): Sales respond to all crops' price changes via elasticity matrix

$$\Delta \ln Q_{it} = \sum_{j=1}^{41} E_{ij} \cdot \Delta \ln P_{jt} + \mu_q$$

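Here $Q_{it}$ and $P_{jt}$ are the sales volume of crop $i$ and the price of crop $j$ in year $t$, $E_{ij}$ is the elasticity of crop $i$'s demand with respect to crop $j$'s price, and $\mu_q$ is a baseline demand growth rate. This captures "one price affects all demands" market dynamics: a price change in any one crop shifts demand for every crop with a nonzero $E_{ij}$.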
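
To make the coupling concrete, here is a minimal sketch with a made-up 3×3 elasticity matrix. The values and the names `E_toy` and `d_ln_P_toy` are illustrative assumptions, not taken from `cross_price_elasticity_matrix.csv`. Only crop 0's price moves, yet the demand change differs across all three crops.

```python
import numpy as np

# Hypothetical 3-crop elasticity matrix (illustrative values only).
# Row i = crop whose demand responds, column j = crop whose price moves.
E_toy = np.array([
    [-0.8,  0.2,  0.0],
    [ 0.3, -0.6, -0.1],
    [ 0.0, -0.2, -0.5],
])
mu_q = 0.01  # base demand growth, same value as in the model

# Only crop 0's price changes: +0.10 in log terms.
d_ln_P_toy = np.array([0.10, 0.0, 0.0])

# Delta ln Q_i = mu_q + sum_j E_ij * Delta ln P_j
d_ln_Q_toy = mu_q + E_toy @ d_ln_P_toy
print(d_ln_Q_toy)
```

Crop 0's own demand falls (log change of about -0.07), crop 1 gains as a substitute (about +0.04), and crop 2, with zero cross-elasticity to crop 0, only sees the base growth of 0.01.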

In [6]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(2024)
plt.rcParams['font.size'] = 10

## 1. Data Loading

Load the elasticity matrix and the 2023 baseline data (price, sales, cost, yield).

In [7]:
# Load elasticity matrix
E_df = pd.read_csv('cross_price_elasticity_matrix.csv', index_col=0)
CROPS = E_df.index.tolist()
E = E_df.values
N_CROPS = len(CROPS)

print(f"Loaded elasticity matrix: {E.shape}")
print(f"  Substitution strength: {E[E > 0].mean():.3f}")
print(f"  Complementarity strength: {E[(E < 0) & (~np.eye(N_CROPS, dtype=bool))].mean():.3f}")

Loaded elasticity matrix: (41, 41)
  Substitution strength: 0.210
  Complementarity strength: -0.234


In [8]:
# Load baseline data
base_df = pd.read_csv('sales_volume_data.csv')
base_data = base_df.groupby('Crop Name').agg({
    'Avg_Price': 'mean',
    'Expected_Sales_Volume': 'max',
    'Cost_per_mu': 'mean',
    'Yield_per_mu': 'mean'
})

# Initialize arrays
P0 = np.zeros(N_CROPS)
Q0 = np.zeros(N_CROPS)
C0 = np.zeros(N_CROPS)
Y0 = np.zeros(N_CROPS)

for i, crop in enumerate(CROPS):
    if crop in base_data.index:
        P0[i] = base_data.loc[crop, 'Avg_Price'] * 2  # Yuan/jin → Yuan/kg
        Q0[i] = base_data.loc[crop, 'Expected_Sales_Volume']
        C0[i] = base_data.loc[crop, 'Cost_per_mu']
        Y0[i] = base_data.loc[crop, 'Yield_per_mu']
    else:
        # Defaults for missing crops
        P0[i] = 10.0
        Q0[i] = 50000.0
        C0[i] = 1000.0
        Y0[i] = 500.0
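
The loop above falls back to default values without reporting which crops were missing. One way to surface them, sketched here on toy data, is to align the baseline table to the crop list with `reindex`. The names `crops_toy` and `base_toy` and all values are hypothetical stand-ins for `CROPS` and `base_data`.

```python
import pandas as pd

# Hypothetical stand-ins for CROPS and base_data (illustrative values only).
crops_toy = ['Wheat', 'Corn', 'Morel Mushroom']
base_toy = pd.DataFrame(
    {
        'Avg_Price': [1.5, 1.2],
        'Expected_Sales_Volume': [60000.0, 80000.0],
        'Cost_per_mu': [450.0, 500.0],
        'Yield_per_mu': [800.0, 1000.0],
    },
    index=['Wheat', 'Corn'],
)

# Rows come back in crops_toy order; crops absent from the table are all-NaN.
aligned = base_toy.reindex(crops_toy)
missing = aligned.index[aligned.isna().all(axis=1)].tolist()
print("Crops using default values:", missing)
```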

## 2. Structural Simulation Model

### Price Evolution (Geometric Brownian Motion)

$$P_{it} = P_{i,t-1} \cdot \exp(\mu_p + \sigma_{macro} Z_t + \sigma_{idio} \epsilon_{it})$$

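- $\mu_p = 0.02$: Annual log-price drift (long-term inflation trend)
- $\sigma_{macro} = 0.05$: Volatility of the macroeconomic shock $Z_t \sim N(0, 1)$, which is shared by all crops
- $\sigma_{idio} = 0.03$: Volatility of the crop-specific shock $\epsilon_{it} \sim N(0, 1)$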
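
Because $Z_t$ is common to all crops, the annual log returns of any two crops are correlated with coefficient $\sigma_{macro}^2 / (\sigma_{macro}^2 + \sigma_{idio}^2) \approx 0.735$. The sketch below checks this empirically for two crops. It uses a local `default_rng` generator with an arbitrary seed, which is an assumption of this example and differs from the notebook's global `np.random.seed`.

```python
import numpy as np

rng = np.random.default_rng(0)  # local generator; seed chosen arbitrarily
MU_P, SIGMA_MACRO, SIGMA_IDIO = 0.02, 0.05, 0.03
n_draws = 200_000

Z = rng.normal(size=(n_draws, 1))    # one macro shock per draw, shared by both crops
eps = rng.normal(size=(n_draws, 2))  # independent crop-specific shocks
log_ret = MU_P + SIGMA_MACRO * Z + SIGMA_IDIO * eps  # broadcasts Z across crops

rho_theory = SIGMA_MACRO**2 / (SIGMA_MACRO**2 + SIGMA_IDIO**2)
rho_empirical = np.corrcoef(log_ret[:, 0], log_ret[:, 1])[0, 1]
total_vol = np.sqrt(SIGMA_MACRO**2 + SIGMA_IDIO**2)  # per-crop annual log volatility

print(f"theory {rho_theory:.3f}, simulated {rho_empirical:.3f}, total vol {total_vol:.4f}")
```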

### Sales Response (Elasticity Coupling)

$$\Delta \ln Q_{it} = \mu_q + \sum_{j=1}^{41} E_{ij} \cdot \Delta \ln P_{jt}$$

$$Q_{it} = Q_{i,t-1} \cdot \exp\left(\mu_q + \sum_{j} E_{ij} \cdot \ln\frac{P_{jt}}{P_{j,t-1}}\right)$$

- $\mu_q = 0.01$: Base demand growth
- $E_{ij}$: Cross-price elasticity from Step 1
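
The sum over $j$ is a matrix product applied to every scenario at once. As a sanity check on the index convention, this sketch compares the `einsum` pattern used in the simulation with a plain matrix product and an explicit triple loop. The random inputs are synthetic.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed for this illustration
n_scenarios, n_crops = 4, 3
E_toy = rng.normal(scale=0.3, size=(n_crops, n_crops))        # synthetic elasticities
d_ln_P = rng.normal(scale=0.05, size=(n_scenarios, n_crops))  # synthetic log-price changes

# Same subscript pattern as in the simulation function.
via_einsum = np.einsum('si,ji->sj', d_ln_P, E_toy)

# Equivalent matrix form: each scenario row is multiplied by E transposed.
via_matmul = d_ln_P @ E_toy.T

# Literal transcription of sum_j E_ij * Delta ln P_j.
via_loops = np.zeros((n_scenarios, n_crops))
for s in range(n_scenarios):
    for i in range(n_crops):
        for j in range(n_crops):
            via_loops[s, i] += E_toy[i, j] * d_ln_P[s, j]

print(np.allclose(via_einsum, via_matmul), np.allclose(via_einsum, via_loops))
```

All three agree, which confirms that row $i$ of $E$ holds the responses of crop $i$'s demand and is not accidentally transposed.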

### Cost Evolution (Input Price Shock)

$$C_{it} = C_{i,t-1} \cdot (1.01 + \xi_{it}), \quad \xi_{it} \sim U(-0.03, 0.03)$$

Base trend: 1% annual increase, plus an independent uniform shock of up to ±3% per crop and year
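
Since the shocks have zero mean and are independent across years, the expected cumulative cost growth over seven years is $1.01^7 \approx 1.072$. A quick Monte Carlo check, using a local generator and an arbitrary seed as assumptions of this sketch:

```python
import numpy as np

rng = np.random.default_rng(3)  # arbitrary seed for this illustration
n_scenarios, n_years = 100_000, 7

# Annual multipliers 1.01 + xi with xi ~ U(-0.03, 0.03)
factors = 1.01 + rng.uniform(-0.03, 0.03, size=(n_scenarios, n_years))
cumulative = factors.prod(axis=1)  # C_2030 / C_2023 for each scenario

expected_growth = 1.01 ** n_years  # analytic mean of the cumulative factor
print(f"analytic mean: {expected_growth:.4f}, simulated mean: {cumulative.mean():.4f}")
```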

### Yield Evolution (Climate Uncertainty)

$$Y_{it} = Y_{i,t-1} \cdot (1 + \eta_{it}), \quad \eta_{it} \sim U(-0.10, 0.10)$$

Consistent with Problem 2: ±10% annual yield variability
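
The yield process has no drift, so the mean of $Y_{i,2030} / Y_{i,2023}$ stays at one, while the spread widens each year and the median drifts slightly below one because the shocks compound multiplicatively. The sketch below estimates the seven-year distribution. The seed and sample size are arbitrary choices for this example.

```python
import numpy as np

rng = np.random.default_rng(4)  # arbitrary seed for this illustration
n_scenarios, n_years = 100_000, 7

# Annual multipliers 1 + eta with eta ~ U(-0.10, 0.10)
factors = 1.0 + rng.uniform(-0.10, 0.10, size=(n_scenarios, n_years))
ratio = factors.prod(axis=1)  # Y_2030 / Y_2023 for each scenario

p05, p50, p95 = np.percentile(ratio, [5, 50, 95])
print(f"mean {ratio.mean():.3f}, median {p50:.3f}, 5%-95% band [{p05:.3f}, {p95:.3f}]")
```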

In [9]:
def simulate_coupled_market(E, P0, Q0, C0, Y0, n_scenarios=1000, years=7):
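    """
    Monte Carlo simulation with market coupling.

    Args:
        E: (n_crops, n_crops) elasticity matrix; E[i, j] is the response of
           crop i's demand to crop j's price.
        P0, Q0, C0, Y0: (n_crops,) baseline price, sales, cost, and yield (2023).
        n_scenarios: number of Monte Carlo paths.
        years: number of simulated years after the baseline.

    Returns:
        P, Q, C, Y: (n_scenarios, n_crops, years+1) arrays
    """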
    n_crops = len(P0)
    
    # Initialize (index 0 = 2023, index 7 = 2030)
    P = np.zeros((n_scenarios, n_crops, years + 1))
    Q = np.zeros((n_scenarios, n_crops, years + 1))
    C = np.zeros((n_scenarios, n_crops, years + 1))
    Y = np.zeros((n_scenarios, n_crops, years + 1))
    
    P[:, :, 0] = P0
    Q[:, :, 0] = Q0
    C[:, :, 0] = C0
    Y[:, :, 0] = Y0
    
    # Parameters
    MU_P = 0.02
    MU_Q = 0.01
    SIGMA_MACRO = 0.05
    SIGMA_IDIO = 0.03
    
    for t in range(1, years + 1):
        # Price: GBM with macro + idiosyncratic shocks
        Z_macro = np.random.normal(0, 1, (n_scenarios, 1))
        eps_idio = np.random.normal(0, 1, (n_scenarios, n_crops))
        
        shock = MU_P + SIGMA_MACRO * Z_macro + SIGMA_IDIO * eps_idio
        P[:, :, t] = P[:, :, t-1] * np.exp(shock)
        
        # Sales: Elasticity-driven response
        d_ln_P = np.log(P[:, :, t]) - np.log(P[:, :, t-1])
        d_ln_Q = np.einsum('si, ji -> sj', d_ln_P, E)  # (n_scenarios, n_crops)
        Q[:, :, t] = Q[:, :, t-1] * np.exp(MU_Q + d_ln_Q)
        
        # Cost: Trend + random shock
        cost_shock = np.random.uniform(-0.03, 0.03, (n_scenarios, n_crops))
        C[:, :, t] = C[:, :, t-1] * (1.01 + cost_shock)
        
        # Yield: ±10% random walk (same as Problem 2)
        yield_shock = np.random.uniform(-0.10, 0.10, (n_scenarios, n_crops))
        Y[:, :, t] = Y[:, :, t-1] * (1 + yield_shock)
    
    return P, Q, C, Y

P_sim, Q_sim, C_sim, Y_sim = simulate_coupled_market(E, P0, Q0, C0, Y0)
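
The returned arrays keep the scenario axis first, so summary statistics and uncertainty bands can be read off by reducing over `axis=0`. The sketch below does this on a synthetic stand-in with the same `(n_scenarios, n_crops, years+1)` layout. The array `paths` and its parameters are made up for illustration; in the notebook one would pass `P_sim` instead.

```python
import numpy as np

rng = np.random.default_rng(2)  # arbitrary seed for this illustration
n_scenarios, n_crops, n_years = 500, 3, 7

# Synthetic stand-in for P_sim: log-normal paths starting from a base price of 10.
log_steps = rng.normal(0.02, 0.06, size=(n_scenarios, n_crops, n_years))
paths = np.concatenate(
    [
        np.full((n_scenarios, n_crops, 1), 10.0),      # index 0 = baseline year
        10.0 * np.exp(np.cumsum(log_steps, axis=2)),   # indices 1..n_years
    ],
    axis=2,
)

mean_path = paths.mean(axis=0)                     # (n_crops, n_years + 1)
p05, p95 = np.percentile(paths, [5, 95], axis=0)   # each (n_crops, n_years + 1)

print(mean_path.shape, p05.shape, p95.shape)
```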

## 3. Export for Optimization

Format the scenario-averaged predictions for the Problem 3 optimization model.

In [11]:
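# Scenario means, shape (n_crops, years + 1).
# Assumption: the *_mean arrays are plain averages of the simulated
# arrays over the scenario axis.
P_mean = P_sim.mean(axis=0)
Q_mean = Q_sim.mean(axis=0)
C_mean = C_sim.mean(axis=0)
Y_mean = Y_sim.mean(axis=0)

# Build output DataFrame
output_rows = []
years = np.arange(2024, 2031)  # 2024-2030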

for i, crop in enumerate(CROPS):
    for t_idx, year in enumerate(years):
        output_rows.append({
            'Crop': crop,
            'Year': year,
            'Price_mean': round(P_mean[i, t_idx + 1], 2),  # t_idx+1 because index 0 is 2023
            'Sales_Volume_mean': round(Q_mean[i, t_idx + 1], 0),
            'Cost_mean': round(C_mean[i, t_idx + 1], 2),
            'Yield_per_Mu_mean': round(Y_mean[i, t_idx + 1], 1)
        })

output_df = pd.DataFrame(output_rows)

# Add category
category_map = {}
G1 = ['Wheat', 'Corn', 'Rice', 'Sorghum', 'Millet', 'Broomcorn Millet', 'Buckwheat', 'Barley', 'Naked Oat',
    'Soybean', 'Black Bean', 'Red Bean', 'Mung Bean', 'Climbing Bean', 'Cowpea', 'Sword Bean', 'Kidney Bean',
    'Potato', 'Sweet Potato', 'Pumpkin']
G3 = ['Morel Mushroom', 'Shiitake Mushroom', 'Golden Oyster Mushroom', 'White Elf Mushroom']

for crop in CROPS:
    if crop in G1:
        category_map[crop] = 'G1'
    elif crop in G3:
        category_map[crop] = 'G3'
    else:
        category_map[crop] = 'G2'

output_df['Category'] = output_df['Crop'].map(category_map)

# Save
output_path = 'crop_predictions_problem3.csv'
output_df.to_csv(output_path, index=False)

print(f"\nExported to: {output_path}")


Exported to: crop_predictions_problem3.csv
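
The export is in long format, with one row per crop and year. A downstream optimization model may prefer a crop × year matrix per variable. Assuming the column names written above, a pivot is one way to get there. The toy table below is made up and stands in for the exported CSV.

```python
import pandas as pd

# Toy long-format table using the export's column names (values are made up).
toy = pd.DataFrame({
    'Crop': ['Wheat', 'Wheat', 'Corn', 'Corn'],
    'Year': [2024, 2025, 2024, 2025],
    'Price_mean': [3.10, 3.16, 2.40, 2.45],
})

# One row per crop, one column per year.
price_matrix = toy.pivot(index='Crop', columns='Year', values='Price_mean')
print(price_matrix)
```

With the real file, `toy` would be replaced by `pd.read_csv('crop_predictions_problem3.csv')`, and the same call can be repeated for the sales, cost, and yield columns.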
