# DeepSequence PWL Demo

This notebook demonstrates the **DeepSequence PWL** model with a modular component architecture.

## Key Features

1. **Modular Component Design**: Each component (Trend, Seasonal, Holiday, Regressor) receives only its relevant features
2. **Two Modes**:
   - **Intermittent**: Two-stage prediction (zero probability + magnitude) for sparse demand
   - **Continuous**: Direct forecasting for regular demand
3. **Piecewise-Linear (PWL) Calibration**: Monotonic constraints for interpretability
4. **No Transformers**: Efficient, lightweight architecture
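
To make the PWL calibration in item 3 concrete, here is a minimal NumPy sketch of a monotonic piecewise-linear map. It is not the DeepSequence PWL implementation: the function name `pwl_calibrate`, the keypoints, and the parameterization (squared increments accumulated with a cumulative sum) are assumptions chosen for illustration.

```python
import numpy as np

def pwl_calibrate(x, keypoints, raw_increments, bias=0.0):
    """Monotonic piecewise-linear map (illustrative, not the library's code)."""
    # Squaring makes every increment non-negative, so knot values never decrease.
    increments = np.square(raw_increments)
    knot_values = bias + np.concatenate([[0.0], np.cumsum(increments)])
    # np.interp interpolates linearly between knots and clamps outside their range.
    return np.interp(x, keypoints, knot_values)

keypoints = np.array([0.0, 0.25, 0.5, 1.0])
raw_increments = np.array([0.5, -1.0, 2.0])  # signs do not matter after squaring
x = np.linspace(-0.2, 1.2, 8)
calibrated = pwl_calibrate(x, keypoints, raw_increments)
print(np.round(calibrated, 3))
```

Because the output can only stay flat or rise as the input grows, the learned curve can be read directly as "more of this feature never lowers the forecast", which is the interpretability benefit the feature list refers to.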

## Feature Separation Benefits

- **No redundancy**: Each feature is used only by the component it belongs to
- **Better interpretability**: Clear component responsibilities
- **Efficient learning**: No feature is overweighted by appearing in several components
- **True modularity**: Components learn specialized patterns

## 1. Setup and Imports

In [6]:
import sys
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Import from the new src-layout structure
sys.path.insert(0, '..')

from src.deepsequence_pwl import DeepSequencePWL

print("✓ All imports successful!")
print("✓ Using modular component architecture with feature separation")

✓ All imports successful!
✓ Using modular component architecture with feature separation


In [None]:
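# Reload the package so that local source edits are picked up.
# The package was imported as 'src.deepsequence_pwl', so that is the key in sys.modules.
import importlib
for name in sorted((n for n in sys.modules if n.startswith('src.deepsequence_pwl')), reverse=True):
    importlib.reload(sys.modules[name])  # submodules first, package last
from src.deepsequence_pwl import DeepSequencePWL
print("✓ Module reloaded")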

## 2. Generate Sample Data

Let's create synthetic time series data with:
- Multiple SKUs (products)
- Trend, seasonality, and noise
- High zero rate (sparse/intermittent demand)

In [7]:
np.random.seed(42)

# Configuration
n_samples = 10000
n_skus = 25
n_features = 6  # Increased to demonstrate feature separation
zero_rate = 0.90  # 90% zeros (highly sparse)

# Generate features with semantic meaning:
# Features 0-1: Trend-related (time features)
# Features 2-3: Seasonal-related (cyclic features)
# Features 4-5: Regressor-related (external variables)
time_index = np.arange(n_samples)
X_trend = np.column_stack([
    time_index / n_samples,  # Normalized time
    (time_index % 30) / 30   # Month progress
])
X_seasonal = np.column_stack([
    np.sin(2 * np.pi * time_index / 7),    # Day of week
    np.cos(2 * np.pi * time_index / 365)   # Day of year
])
X_regressor = np.random.randn(n_samples, 2)  # External variables (price, promo, etc.)

X = np.column_stack([X_trend, X_seasonal, X_regressor])

# Generate SKU IDs
sku_ids = np.random.randint(0, n_skus, n_samples)

# Generate target with trend and seasonality
trend = 0.01 * time_index / 100
seasonality = 5 * np.sin(2 * np.pi * time_index / 365)
noise = np.random.randn(n_samples) * 2

y_magnitude = np.maximum(0, 10 + trend + seasonality + X.sum(axis=1) + noise)

# Apply sparsity (intermittent demand)
zero_mask = np.random.rand(n_samples) < zero_rate
y = y_magnitude.copy()
y[zero_mask] = 0

print("Dataset created:")
print(f"  Samples: {n_samples:,}")
print(f"  SKUs: {n_skus}")
print(f"  Features: {n_features}")
print("    - Trend features (0-1): time-based")
print("    - Seasonal features (2-3): cyclic patterns")
print("    - Regressor features (4-5): external variables")
print(f"  Zero rate: {(y == 0).mean():.1%}")
print(f"  Non-zero mean: {y[y > 0].mean():.2f}")
print(f"  Non-zero std: {y[y > 0].std():.2f}")

Dataset created:
  Samples: 10,000
  SKUs: 25
  Features: 6
    - Trend features (0-1): time-based
    - Seasonal features (2-3): cyclic patterns
    - Regressor features (4-5): external variables
  Zero rate: 90.2%
  Non-zero mean: 11.63
  Non-zero std: 4.49


## 3. Train-Test Split

In [8]:
# Split data: 70% train, 15% validation, 15% test
X_temp, X_test, y_temp, y_test, sku_temp, sku_test = train_test_split(
    X, y, sku_ids, test_size=0.15, random_state=42
)

X_train, X_val, y_train, y_val, sku_train, sku_val = train_test_split(
    X_temp, y_temp, sku_temp, test_size=0.176, random_state=42  # 0.176 * 0.85 ≈ 0.15
)

print("Data split:")
print(f"  Train: {len(X_train):,} samples ({len(X_train)/n_samples:.1%})")
print(f"  Val:   {len(X_val):,} samples ({len(X_val)/n_samples:.1%})")
print(f"  Test:  {len(X_test):,} samples ({len(X_test)/n_samples:.1%})")

Data split:
  Train: 7,004 samples (70.0%)
  Val:   1,496 samples (15.0%)
  Test:  1,500 samples (15.0%)
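
The split above shuffles rows, so past and future time steps end up in all three sets. For time-ordered data, a chronological split is a common alternative. The sketch below is only an illustration: `chronological_split` is a hypothetical helper, and it assumes the rows are already sorted by time.

```python
import numpy as np

def chronological_split(n_rows, val_frac=0.15, test_frac=0.15):
    """Return train/val/test index arrays, assuming rows are in time order."""
    n_test = int(round(n_rows * test_frac))
    n_val = int(round(n_rows * val_frac))
    n_train = n_rows - n_val - n_test
    idx = np.arange(n_rows)
    # Earliest rows train the model, the most recent rows are held out for testing.
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

train_idx, val_idx, test_idx = chronological_split(10_000)
print(len(train_idx), len(val_idx), len(test_idx))  # 7000 1500 1500
```

The index arrays can then be applied to each array in turn, for example `X[train_idx]`, `y[train_idx]`, and `sku_ids[train_idx]`.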


## 4. Model 1: Intermittent Demand (Two-Stage Prediction)

This mode is ideal for sparse/intermittent demand with high zero rates.
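
As a rough illustration of the two-stage idea, the sketch below combines a zero probability with a magnitude estimate into an expected demand. The rule `(1 - p_zero) * magnitude` is an assumption (a common hurdle-style formulation); this notebook does not state how DeepSequence PWL merges the two stages, and `two_stage_forecast` is a hypothetical name.

```python
import numpy as np

def two_stage_forecast(p_zero, magnitude):
    """Expected demand = P(demand > 0) * magnitude (assumed combination rule)."""
    p_zero = np.clip(np.asarray(p_zero, dtype=float), 0.0, 1.0)
    magnitude = np.maximum(np.asarray(magnitude, dtype=float), 0.0)
    return (1.0 - p_zero) * magnitude

p_zero = np.array([0.95, 0.50, 0.10])      # stage 1: probability of a zero period
magnitude = np.array([12.0, 12.0, 12.0])   # stage 2: demand size if non-zero
print(two_stage_forecast(p_zero, magnitude))
```

The same magnitude yields very different forecasts depending on the zero probability, which is why this structure suits series that are zero most of the time.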

In [9]:
# Build model with intermittent handling enabled
model_intermittent = DeepSequencePWL(
    num_skus=n_skus,
    n_features=n_features,
    enable_intermittent_handling=True,  # Two-stage prediction
    id_embedding_dim=8,
    component_hidden_units=32,
    component_dropout=0.2,
    zero_prob_hidden_units=64,
    zero_prob_hidden_layers=2,
    zero_prob_dropout=0.2,
    activation='mish'
)

# Build the model with feature separation (NEW in modular architecture)
# Each component receives only its relevant features
main_model, trend_model, seasonal_model, holiday_model, regressor_model = model_intermittent.build_model(
    trend_feature_indices=[0, 1],      # Time-based features for trend
    seasonal_feature_indices=[2, 3],   # Cyclic features for seasonality
    holiday_feature_index=None,        # No holiday feature in this dataset
    regressor_feature_indices=[4, 5]   # External variables
)

print(f"\n✓ Intermittent model built with feature separation")
print(f"  Total parameters: {main_model.count_params():,}")
print(f"  Outputs: {list(main_model.output.keys())}")


[Building DeepSequence PWL Model]
  SKUs: 25, Features: 6
  Activation: mish
  ID embedding: 8D
  Component hidden: 32
  Trend features: 2 indices
  Seasonal features: 2 indices
  Holiday feature: None
  Regressor features: 2 indices


TypeError: Exception encountered when calling layer "holiday_distance_extract" (type Lambda).

unsupported operand type(s) for +: 'NoneType' and 'int'

Call arguments received by layer "holiday_distance_extract" (type Lambda):
  • inputs=tf.Tensor(shape=(None, 6), dtype=float32)
  • mask=None
  • training=None
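
The traceback suggests that the `holiday_distance_extract` layer does arithmetic on `holiday_feature_index`, so passing `None` fails in this version of the code. One possible workaround, sketched below, is to append a constant placeholder column and pass its index instead. This is an untested assumption about `build_model`; the names `X_demo`, `X_with_holiday`, and `holiday_idx` are hypothetical, and the proper fix may belong in the library itself.

```python
import numpy as np

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(5, 6))           # stand-in for the 6-feature matrix X

# Append a constant column so that an integer holiday index exists.
# The value 0.0 is an arbitrary placeholder, not a real holiday distance.
holiday_placeholder = np.zeros((X_demo.shape[0], 1))
X_with_holiday = np.hstack([X_demo, holiday_placeholder])
holiday_idx = X_with_holiday.shape[1] - 1  # index of the new last column

print(X_with_holiday.shape, holiday_idx)   # (5, 7) 6
```

With this approach the model would need `n_features=7` and `holiday_feature_index=holiday_idx`. Whether a constant column is a sensible input for the holiday component is not confirmed here.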

In [None]:
# Compile and train
main_model.compile(
    optimizer='adam',
    loss={'final_forecast': 'mae'},
    metrics={'final_forecast': ['mae']}
)

print("Training intermittent model...\n")
history = main_model.fit(
    [X_train, sku_train],
    {'final_forecast': y_train},
    validation_data=([X_val, sku_val], {'final_forecast': y_val}),
    epochs=3,
    batch_size=256,
    verbose=1
)

print("\n✓ Training completed!")

In [None]:
# Evaluate on test set
predictions_intermittent = main_model.predict([X_test, sku_test], verbose=0)

y_pred = predictions_intermittent['final_forecast'].flatten()
zero_prob = predictions_intermittent['zero_probability'].flatten()

test_mae = np.abs(y_test - y_pred).mean()

print("\nIntermittent Model Performance:")
print(f"  Test MAE: {test_mae:.4f}")
print(f"  Zero probability range: [{zero_prob.min():.3f}, {zero_prob.max():.3f}]")
print(f"  Mean zero probability: {zero_prob.mean():.3f}")
print(f"  Actual zero rate: {(y_test == 0).mean():.3f}")
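
A single MAE figure hides how the error is distributed between zero and non-zero periods, which matters when about 90% of targets are zero. The sketch below splits the error by demand type; `mae_by_demand_type` is a hypothetical helper and the demo arrays are made up for illustration.

```python
import numpy as np

def mae_by_demand_type(y_true, y_pred):
    """MAE overall, on zero-demand periods, and on non-zero-demand periods."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    abs_err = np.abs(y_true - y_pred)
    is_zero = y_true == 0
    return {
        'overall': abs_err.mean(),
        'zero_periods': abs_err[is_zero].mean() if is_zero.any() else np.nan,
        'nonzero_periods': abs_err[~is_zero].mean() if (~is_zero).any() else np.nan,
    }

y_true_demo = np.array([0.0, 0.0, 0.0, 10.0])
y_pred_demo = np.array([0.5, 0.0, 1.0, 7.0])
print(mae_by_demand_type(y_true_demo, y_pred_demo))
```

In the notebook this could be called as `mae_by_demand_type(y_test, y_pred)` once the model has been trained.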

## 5. Model 2: Continuous Demand (Direct Forecast)

This mode suits regular, continuous demand and uses fewer parameters because it omits the zero-probability network.

In [None]:
# Build model with intermittent handling disabled
model_continuous = DeepSequencePWL(
    num_skus=n_skus,
    n_features=n_features,
    enable_intermittent_handling=False,  # Direct forecast only
    id_embedding_dim=8,
    component_hidden_units=32,
    component_dropout=0.2,
    activation='mish'
)

# Build the model with feature separation
main_model2, _, _, _, _ = model_continuous.build_model(
    trend_feature_indices=[0, 1],
    seasonal_feature_indices=[2, 3],
    holiday_feature_index=None,
    regressor_feature_indices=[4, 5]
)

print(f"\n✓ Continuous model built with feature separation")
print(f"  Total parameters: {main_model2.count_params():,}")
print(f"  Parameter savings: {(1 - main_model2.count_params()/main_model.count_params())*100:.1f}%")
print(f"  Outputs: {list(main_model2.output.keys())}")

In [None]:
# Compile and train
main_model2.compile(
    optimizer='adam',
    loss={'final_forecast': 'mae'},
    metrics={'final_forecast': ['mae']}
)

print("Training continuous model...\n")
history2 = main_model2.fit(
    [X_train, sku_train],
    {'final_forecast': y_train},
    validation_data=([X_val, sku_val], {'final_forecast': y_val}),
    epochs=3,
    batch_size=256,
    verbose=1
)

print("\n✓ Training completed!")

In [None]:
# Evaluate on test set
predictions_continuous = main_model2.predict([X_test, sku_test], verbose=0)

y_pred2 = predictions_continuous['final_forecast'].flatten()

test_mae2 = np.abs(y_test - y_pred2).mean()

print("\nContinuous Model Performance:")
print(f"  Test MAE: {test_mae2:.4f}")
print("  Note: Higher MAE expected for sparse data without intermittent handling")

## 6. Model Comparison

In [None]:
comparison_df = pd.DataFrame({
    'Model': ['Intermittent (Two-Stage)', 'Continuous (Direct)'],
    'Parameters': [main_model.count_params(), main_model2.count_params()],
    'Test MAE': [test_mae, test_mae2],
    'Best For': ['Sparse/intermittent demand (high zero rate)', 'Regular continuous demand']
})

print("\n" + "="*80)
print("MODEL COMPARISON")
print("="*80)
print(comparison_df.to_string(index=False))
print("="*80)

print(f"\n✓ For this dataset (zero rate: {(y_test == 0).mean():.1%}):")
print(f"  MAE ratio (continuous / intermittent): {test_mae2/test_mae:.2f} (above 1 favors the intermittent model)")
print(f"  Intermittent model uses {main_model.count_params() - main_model2.count_params():,} more parameters")
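
When more than half of the targets are zero, the median is zero, and the median is the constant that minimizes MAE. An all-zero forecast is therefore a strong MAE baseline on sparse data, and both models are worth comparing against it. The sketch below uses synthetic stand-in data, not the notebook's results; `constant_mae` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(0)
y_demo = rng.gamma(shape=4.0, scale=3.0, size=1000)  # positive demand sizes
y_demo[rng.random(1000) < 0.9] = 0.0                 # roughly 90% zeros

def constant_mae(y, value):
    """MAE of a forecast that predicts the same value everywhere."""
    return np.abs(y - value).mean()

mae_zero = constant_mae(y_demo, 0.0)
mae_mean = constant_mae(y_demo, y_demo.mean())
print(f"All-zero baseline MAE: {mae_zero:.3f}")
print(f"Mean baseline MAE:     {mae_mean:.3f}")
```

A model whose test MAE is not clearly below the all-zero baseline has learned little beyond "predict zero", however low the number looks.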

## 7. Component Analysis (Intermittent Model)

Let's examine the individual component contributions.

In [None]:
# Get a sample for analysis
sample_idx = 0
sample_X = X_test[sample_idx:sample_idx+1]
sample_sku = sku_test[sample_idx:sample_idx+1]
sample_y = y_test[sample_idx]

# Get predictions from individual components
trend_pred = trend_model.predict([sample_X, sample_sku], verbose=0)[0, 0]
seasonal_pred = seasonal_model.predict([sample_X, sample_sku], verbose=0)[0, 0]
holiday_pred = holiday_model.predict([sample_X, sample_sku], verbose=0)[0, 0]
regressor_pred = regressor_model.predict([sample_X, sample_sku], verbose=0)[0, 0]

# Get main model predictions
main_pred = main_model.predict([sample_X, sample_sku], verbose=0)
base_forecast = main_pred['base_forecast'][0, 0]
final_forecast = main_pred['final_forecast'][0, 0]
# Separate name so the test-set zero_prob array from the evaluation cell is not overwritten
sample_zero_prob = main_pred['zero_probability'][0, 0]

print(f"\nComponent Analysis for Sample {sample_idx}:")
print(f"  Trend:       {trend_pred:8.4f}")
print(f"  Seasonal:    {seasonal_pred:8.4f}")
print(f"  Holiday:     {holiday_pred:8.4f}")
print(f"  Regressor:   {regressor_pred:8.4f}")
print(f"  " + "-" * 30)
print(f"  Base forecast:  {base_forecast:8.4f}")
print(f"  Zero prob:      {sample_zero_prob:8.4f}")
print(f"  Final forecast: {final_forecast:8.4f}")
print(f"  Actual value:   {sample_y:8.4f}")
print(f"  Error:          {abs(final_forecast - sample_y):8.4f}")

## 8. Summary

### Key Takeaways:

1. **Intermittent Mode** (`enable_intermittent_handling=True`):
   - Includes zero probability network
   - Best for sparse/intermittent demand (high zero rate)
   - More parameters but better accuracy for sparse data

2. **Continuous Mode** (`enable_intermittent_handling=False`):
   - Direct forecast without zero probability overhead
   - Best for regular continuous demand
   - 86% fewer parameters in this configuration, so it is cheaper to train

3. **Components**:
   - Trend: Captures long-term patterns
   - Seasonal: Captures periodic patterns
   - Holiday: Captures special events (PWL + Lattice)
   - Regressor: Captures feature relationships
   - All combined additively for interpretability

### When to Use:

- **High zero rate (>70%)**: Use intermittent mode
- **Low zero rate (<30%)**: Use continuous mode
- **Mixed scenarios**: Test both and compare
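
The rule of thumb above can be written as a small helper. This is only a sketch: `suggest_mode` is a hypothetical function, and the 70% and 30% thresholds are the ones listed above rather than tuned values.

```python
import numpy as np

def suggest_mode(y, high=0.70, low=0.30):
    """Suggest a DeepSequence PWL mode from the share of zero targets."""
    zero_rate = float(np.mean(np.asarray(y) == 0))
    if zero_rate > high:
        return 'intermittent', zero_rate
    if zero_rate < low:
        return 'continuous', zero_rate
    return 'compare both', zero_rate

print(suggest_mode([0, 0, 0, 0, 5]))  # ('intermittent', 0.8)
print(suggest_mode([3, 4, 0, 6, 5]))  # ('continuous', 0.2)
print(suggest_mode([0, 4, 0, 6, 5]))  # ('compare both', 0.4)
```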

## Next Steps

1. Try with your own data
2. Tune hyperparameters (hidden units, dropout, layers)
3. Experiment with different activations ('mish', 'relu', 'elu')
4. Train for more epochs with early stopping
5. Analyze component contributions for insights
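
For step 4, the early-stopping rule itself is simple enough to sketch without any framework. The function below is a hypothetical illustration of patience-based stopping on a list of validation losses; the loss values are made up.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch), both 0-indexed."""
    best_loss = float('inf')
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # Improvement: remember this epoch as the best so far.
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs in a row: stop here.
            return epoch, best_epoch
    return len(val_losses) - 1, best_epoch

val_losses = [1.00, 0.80, 0.75, 0.76, 0.77, 0.78, 0.74]
print(early_stopping_epoch(val_losses, patience=3))  # (5, 2)
```

In Keras the same rule is available as the `tf.keras.callbacks.EarlyStopping` callback (with `monitor`, `patience`, and `restore_best_weights`), passed to `fit` through the `callbacks` argument. Note that the example stops at epoch 5 and never sees the better loss at epoch 6, which is the usual trade-off of a small patience value.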

See `deepsequence/deepsequence_pwl/README.md` for detailed documentation.