MODEL BUILDING FOR PRICE OPTIMIZATION

BUSINESS OBJECTIVE: Build ML model to predict profit at different price points
TECHNICAL OBJECTIVE: Compare models, select best for production deployment

MODEL SELECTION CRITERIA:
1. Prediction accuracy (R², MAE, RMSE)
2. Interpretability (can we explain to stakeholders?)
3. Speed (real-time pricing decisions)
4. Robustness (works across all products/segments)

MODELS TO TEST:
1. Linear Regression - Baseline, highly interpretable
2. Ridge Regression - Regularized linear, reduces overfitting
3. Random Forest - Non-linear, feature importance, robust
4. Gradient Boosting - Often the most accurate on tabular data, widely used
5. XGBoost - Fast gradient-boosting library (listed as a candidate; not trained in this notebook)


In [3]:
import pandas as pd
import numpy as np
import json
from sklearn.model_selection import train_test_split, TimeSeriesSplit, cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.metrics import mean_squared_error, r2_score, mean_absolute_error, mean_absolute_percentage_error
import joblib
import warnings
warnings.filterwarnings('ignore')


# ============================================================================
# LOAD DATA AND FEATURES
# ============================================================================
print("="*80)
print("1. LOAD DATA")
print("="*80)
print()

df = pd.read_csv('lab_equipment_pricing_features.csv')
print(f"✓ Loaded: {len(df):,} records with {df.shape[1]} columns")

# Load feature metadata
with open('feature_metadata.json', 'r') as f:
    feature_metadata = json.load(f)

all_features = feature_metadata['all_features']
target = feature_metadata['target']

print(f"✓ Features: {len(all_features)}")
print(f"✓ Target: {target}")
print()

1. LOAD DATA

✓ Loaded: 10,000 records with 54 columns
✓ Features: 35
✓ Target: profit



In [4]:
# ============================================================================
# PREPARE TRAIN/TEST SPLIT
# ============================================================================
print("="*80)
print("2. TRAIN/TEST SPLIT STRATEGY")
print("="*80)
print()

print("WHY TIME-BASED SPLIT:")
print("  - Can't use random split (data leakage risk)")
print("  - Must simulate real scenario: train on past, predict future")
print("  - Business: Model must work on upcoming quarters")
print()

# Sort by date
df_sorted = df.sort_values('date').reset_index(drop=True)

# Remove rows with NaN in features (from rolling calculations)
df_model = df_sorted[all_features + [target]].dropna()
print(f"Rows after removing NaN: {len(df_model):,}")

X = df_model[all_features]
y = df_model[target]

# Time-based split: 80% train, 20% test
split_idx = int(len(X) * 0.8)
X_train, X_test = X.iloc[:split_idx], X.iloc[split_idx:]
y_train, y_test = y.iloc[:split_idx], y.iloc[split_idx:]

print(f"\nTrain set: {len(X_train):,} records (80%)")
print(f"Test set: {len(X_test):,} records (20%)")
print(f"Train target mean: ${y_train.mean():,.0f}")
print(f"Test target mean: ${y_test.mean():,.0f}")
print()


2. TRAIN/TEST SPLIT STRATEGY

WHY TIME-BASED SPLIT:
  - Can't use random split (data leakage risk)
  - Must simulate real scenario: train on past, predict future
  - Business: Model must work on upcoming quarters

Rows after removing NaN: 10,000

Train set: 8,000 records (80%)
Test set: 2,000 records (20%)
Train target mean: $467,910
Test target mean: $452,682
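
The positional 80/20 cut used in this cell can be wrapped in a small helper that also makes it easy to confirm that no test row predates the training window. This is a minimal sketch on synthetic data; the helper name time_based_split and the toy frame are illustrative assumptions, not part of the notebook.

```python
import numpy as np
import pandas as pd


def time_based_split(frame, date_col, train_frac=0.8):
    """Sort by date and cut at a positional index, without shuffling."""
    ordered = frame.sort_values(date_col).reset_index(drop=True)
    cut = int(len(ordered) * train_frac)
    return ordered.iloc[:cut], ordered.iloc[cut:]


rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "date": pd.date_range("2023-01-01", periods=100, freq="D"),
    "profit": rng.normal(1000, 50, size=100),
}).sample(frac=1.0, random_state=0)  # shuffle rows to mimic unsorted input

train, test = time_based_split(toy, "date")
print(f"Train rows: {len(train)}, test rows: {len(test)}")
print("Train ends before test starts:", train["date"].max() < test["date"].min())
```

If several rows share the same date, a positional cut can place that date on both sides of the boundary; cutting on a date value instead avoids this.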



In [5]:
# ============================================================================
# FEATURE SCALING
# ============================================================================
print("="*80)
print("3. FEATURE SCALING")
print("="*80)
print()

print("WHY SCALING:")
print("  - Linear models sensitive to feature scales")
print("  - Tree models don't need scaling (but doesn't hurt)")
print("  - Speeds up convergence")
print()

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print("✓ Features scaled using StandardScaler")
print("  Mean: ~0, Std: ~1 for all features")
print()


3. FEATURE SCALING

WHY SCALING:
  - Linear models sensitive to feature scales
  - Tree models don't need scaling (but doesn't hurt)
  - Speeds up convergence

✓ Features scaled using StandardScaler
  Mean: ~0, Std: ~1 for all features
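
One way to keep the scaler and a linear model together is a scikit-learn Pipeline, so the scaler is refit on each training fold during cross-validation rather than once on all training rows. The sketch below uses synthetic data; X_demo, y_demo and ridge_pipeline are illustrative names, not objects from the notebook.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import TimeSeriesSplit, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Four features on very different scales
X_demo = rng.normal(size=(300, 4)) * np.array([1.0, 10.0, 100.0, 1000.0])
y_demo = X_demo @ np.array([5.0, 0.5, 0.05, 0.005]) + rng.normal(scale=0.1, size=300)

# The scaler is fitted inside each fold, only on that fold's training rows
ridge_pipeline = make_pipeline(StandardScaler(), Ridge(alpha=10.0))
scores = cross_val_score(
    ridge_pipeline, X_demo, y_demo, cv=TimeSeriesSplit(n_splits=5), scoring="r2"
)
print("Fold R2:", scores.round(3))
```

A fitted pipeline can also be saved as a single artifact, which removes the need to ship the scaler separately.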



In [None]:
# ============================================================================
# MODEL 1: LINEAR REGRESSION (BASELINE)
# ============================================================================
print("="*80)
print("4. MODEL 1: LINEAR REGRESSION (BASELINE)")
print("="*80)
print()

print("WHY LINEAR REGRESSION:")
print("  Business: Highly interpretable, shows feature importance")
print("  Technical: Fast, simple, good baseline")
print("  Limitation: Assumes linear relationships")
print()

lr_model = LinearRegression()
lr_model.fit(X_train_scaled, y_train)

# Predictions
y_pred_train_lr = lr_model.predict(X_train_scaled)
y_pred_test_lr = lr_model.predict(X_test_scaled)

# Metrics
train_r2_lr = r2_score(y_train, y_pred_train_lr)
test_r2_lr = r2_score(y_test, y_pred_test_lr)
test_mae_lr = mean_absolute_error(y_test, y_pred_test_lr)
test_rmse_lr = np.sqrt(mean_squared_error(y_test, y_pred_test_lr))
test_mape_lr = mean_absolute_percentage_error(y_test, y_pred_test_lr)

print("Performance Metrics:")
print(f"  Train R²: {train_r2_lr:.4f}")
print(f"  Test R²: {test_r2_lr:.4f}")
print(f"  Test MAE: ${test_mae_lr:,.0f}")
print(f"  Test RMSE: ${test_rmse_lr:,.0f}")
print(f"  Test MAPE: {test_mape_lr:.1%}")
print()

# Feature importance (coefficients)
feature_importance_lr = pd.DataFrame({
    'feature': all_features,
    'coefficient': lr_model.coef_
}).sort_values('coefficient', key=abs, ascending=False)

print("Top 10 Most Important Features (by coefficient magnitude):")
print(feature_importance_lr.head(10).to_string(index=False))
print()

print("BUSINESS INTERPRETATION:")
top_feature = feature_importance_lr.iloc[0]
print(f"  Most influential: {top_feature['feature']} (coef: {top_feature['coefficient']:.2f})")
# Features were standardized, so each coefficient is the profit change per one
# standard deviation of the feature, not per raw unit. Features that are likely
# near-collinear (e.g. price_pct_vs_competitor and price_ratio_competitor) get
# large, offsetting coefficients, so individual magnitudes need cautious reading.
print(f"  → Each one-standard-deviation increase in {top_feature['feature']} changes predicted profit by ${top_feature['coefficient']:.2f}")
print()


4. MODEL 1: LINEAR REGRESSION (BASELINE)

WHY LINEAR REGRESSION:
  Business: Highly interpretable, shows feature importance
  Technical: Fast, simple, good baseline
  Limitation: Assumes linear relationships

Performance Metrics:
  Train R²: 0.8683
  Test R²: 0.8548
  Test MAE: $130,994
  Test RMSE: $170,992
  Test MAPE: 273.5%

Top 10 Most Important Features (by coefficient magnitude):
                feature    coefficient
price_pct_vs_competitor  922799.049438
 price_ratio_competitor -922754.994543
          inventory_pct  609219.710505
        inventory_level -607863.198778
                  price  187453.749424
       competitor_price  184646.196100
        segment_encoded   99683.411301
            price_ma_7d   89570.647195
           price_ma_30d  -62655.987861
     is_summer_slowdown  -47234.461899

BUSINESS INTERPRETATION:
  Most influential: price_pct_vs_competitor (coef: 922799.05)
  → Each one-standard-deviation increase in price_pct_vs_competitor changes predicted profit by $922799.05
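
Because the linear model is fitted on StandardScaler output, its coefficients are expressed per standard deviation of each feature. Dividing by the scaler's scale_ attribute converts them to raw units. The sketch below shows this on synthetic data; X_raw, scaler_demo and model_scaled are illustrative names, and the two toy features only stand in for the notebook's columns.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X_raw = np.column_stack([
    rng.uniform(100, 500, size=200),  # stand-in for a price in dollars
    rng.uniform(0, 1, size=200),      # stand-in for a 0-1 share
])
y_demo = 3.0 * X_raw[:, 0] - 40.0 * X_raw[:, 1] + rng.normal(scale=1.0, size=200)

scaler_demo = StandardScaler().fit(X_raw)
model_scaled = LinearRegression().fit(scaler_demo.transform(X_raw), y_demo)

per_std = model_scaled.coef_                        # change per one standard deviation
per_unit = model_scaled.coef_ / scaler_demo.scale_  # change per raw unit
print("Per standard deviation:", per_std.round(2))
print("Per raw unit:          ", per_unit.round(2))
```

For ordinary least squares the per-unit values match what a fit on the unscaled features would return; with Ridge the conversion still applies, but the penalty acts on the standardized coefficients.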



In [7]:
# ============================================================================
# MODEL 2: RIDGE REGRESSION (REGULARIZED)
# ============================================================================
print("="*80)
print("5. MODEL 2: RIDGE REGRESSION (REGULARIZED)")
print("="*80)
print()

print("WHY RIDGE:")
print("  Business: Same interpretability as linear, but more robust")
print("  Technical: L2 regularization prevents overfitting")
print("  Use case: When we have many correlated features")
print()

ridge_model = Ridge(alpha=10.0)  # Regularization strength
ridge_model.fit(X_train_scaled, y_train)

# Predictions
y_pred_train_ridge = ridge_model.predict(X_train_scaled)
y_pred_test_ridge = ridge_model.predict(X_test_scaled)

# Metrics
train_r2_ridge = r2_score(y_train, y_pred_train_ridge)
test_r2_ridge = r2_score(y_test, y_pred_test_ridge)
test_mae_ridge = mean_absolute_error(y_test, y_pred_test_ridge)
test_rmse_ridge = np.sqrt(mean_squared_error(y_test, y_pred_test_ridge))
test_mape_ridge = mean_absolute_percentage_error(y_test, y_pred_test_ridge)

print("Performance Metrics:")
print(f"  Train R²: {train_r2_ridge:.4f}")
print(f"  Test R²: {test_r2_ridge:.4f}")
print(f"  Test MAE: ${test_mae_ridge:,.0f}")
print(f"  Test RMSE: ${test_rmse_ridge:,.0f}")
print(f"  Test MAPE: {test_mape_ridge:.1%}")
print()


5. MODEL 2: RIDGE REGRESSION (REGULARIZED)

WHY RIDGE:
  Business: Same interpretability as linear, but more robust
  Technical: L2 regularization prevents overfitting
  Use case: When we have many correlated features

Performance Metrics:
  Train R²: 0.8682
  Test R²: 0.8549
  Test MAE: $130,944
  Test RMSE: $170,881
  Test MAPE: 273.0%
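
A MAPE near 273% alongside a test R² of about 0.85 would be consistent with some test rows having profit close to zero, since MAPE divides each error by the actual value. That is an assumption about the data, not something the notebook verifies. The sketch below illustrates the effect on made-up numbers and compares it with a weighted alternative (WAPE: total absolute error divided by total absolute actuals).

```python
import numpy as np
from sklearn.metrics import mean_absolute_percentage_error

# Three ordinary rows and one row whose actual profit is close to zero
y_true = np.array([500_000.0, 400_000.0, 300_000.0, 1_000.0])
y_pred = np.array([480_000.0, 430_000.0, 290_000.0, 41_000.0])

mape = mean_absolute_percentage_error(y_true, y_pred)
wape = np.abs(y_true - y_pred).sum() / np.abs(y_true).sum()

print(f"MAPE: {mape:.1%}")  # dominated by the single near-zero row
print(f"WAPE: {wape:.1%}")  # weights errors by the size of the actuals
```

Checking the distribution of the target for values near zero would show whether this explanation applies here.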



In [8]:

# ============================================================================
# MODEL 3: RANDOM FOREST (NON-LINEAR)
# ============================================================================
print("="*80)
print("6. MODEL 3: RANDOM FOREST")
print("="*80)
print()

print("WHY RANDOM FOREST:")
print("  Business: Captures non-linear relationships (price curves aren't straight lines)")
print("  Technical: Robust, handles outliers well, provides feature importance")
print("  Use case: When relationships are complex")
print()

rf_model = RandomForestRegressor(
    n_estimators=100,        # Number of trees
    max_depth=15,            # Prevent overfitting
    min_samples_split=20,    # Minimum samples to split
    min_samples_leaf=10,     # Minimum samples per leaf
    max_features='sqrt',     # Features per split
    random_state=42,
    n_jobs=-1               # Use all CPU cores
)

print("Training Random Forest...")
rf_model.fit(X_train, y_train)  # Trees don't need scaling

# Predictions
y_pred_train_rf = rf_model.predict(X_train)
y_pred_test_rf = rf_model.predict(X_test)

# Metrics
train_r2_rf = r2_score(y_train, y_pred_train_rf)
test_r2_rf = r2_score(y_test, y_pred_test_rf)
test_mae_rf = mean_absolute_error(y_test, y_pred_test_rf)
test_rmse_rf = np.sqrt(mean_squared_error(y_test, y_pred_test_rf))
test_mape_rf = mean_absolute_percentage_error(y_test, y_pred_test_rf)

print("\nPerformance Metrics:")
print(f"  Train R²: {train_r2_rf:.4f}")
print(f"  Test R²: {test_r2_rf:.4f}")
print(f"  Test MAE: ${test_mae_rf:,.0f}")
print(f"  Test RMSE: ${test_rmse_rf:,.0f}")
print(f"  Test MAPE: {test_mape_rf:.1%}")
print()

# Feature importance
feature_importance_rf = pd.DataFrame({
    'feature': all_features,
    'importance': rf_model.feature_importances_
}).sort_values('importance', ascending=False)

print("Top 10 Most Important Features (by RF importance):")
print(feature_importance_rf.head(10).to_string(index=False))
print()




6. MODEL 3: RANDOM FOREST

WHY RANDOM FOREST:
  Business: Captures non-linear relationships (price curves aren't straight lines)
  Technical: Robust, handles outliers well, provides feature importance
  Use case: When relationships are complex

Training Random Forest...

Performance Metrics:
  Train R²: 0.9580
  Test R²: 0.9385
  Test MAE: $64,946
  Test RMSE: $111,266
  Test MAPE: 17.2%

Top 10 Most Important Features (by RF importance):
              feature  importance
                price    0.219923
         price_ma_30d    0.178998
     competitor_price    0.162480
          price_ma_7d    0.125968
      product_encoded    0.100218
      segment_encoded    0.070031
price_diff_competitor    0.040215
            qty_ma_7d    0.025073
           qty_ma_30d    0.012561
   is_summer_slowdown    0.008494
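
The importances printed above are impurity-based, which can favor continuous or high-cardinality features and split credit among correlated ones (several price columns here are likely correlated). Permutation importance on held-out rows is a common cross-check. A minimal sketch on synthetic data follows; forest, X_hold and the two toy features are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(42)
n_rows = 400
signal = rng.normal(size=n_rows)  # feature that drives the target
noise = rng.normal(size=n_rows)   # feature unrelated to the target
X_demo = np.column_stack([signal, noise])
y_demo = 3.0 * signal + rng.normal(scale=0.1, size=n_rows)

X_fit, X_hold = X_demo[:300], X_demo[300:]
y_fit, y_hold = y_demo[:300], y_demo[300:]

forest = RandomForestRegressor(n_estimators=50, random_state=42).fit(X_fit, y_fit)

# Shuffle one column at a time on held-out rows and measure the score drop
result = permutation_importance(forest, X_hold, y_hold, n_repeats=5, random_state=42)
print("Mean importance [signal, noise]:", result.importances_mean.round(3))
```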



In [9]:
# ============================================================================
# MODEL 4: GRADIENT BOOSTING (BEST PERFORMANCE)
# ============================================================================
print("="*80)
print("7. MODEL 4: GRADIENT BOOSTING")
print("="*80)
print()

print("WHY GRADIENT BOOSTING:")
print("  Business: Industry standard for pricing, high accuracy")
print("  Technical: Sequential learning, corrects previous errors")
print("  Use case: When we need best possible predictions")
print()

gb_model = GradientBoostingRegressor(
    n_estimators=200,        # More trees = better but slower
    learning_rate=0.05,      # Smaller = more conservative
    max_depth=5,             # Tree depth
    min_samples_split=20,
    min_samples_leaf=10,
    subsample=0.8,           # Use 80% of data per tree
    max_features='sqrt',
    random_state=42
)

print("Training Gradient Boosting...")
gb_model.fit(X_train, y_train)

# Predictions
y_pred_train_gb = gb_model.predict(X_train)
y_pred_test_gb = gb_model.predict(X_test)

# Metrics
train_r2_gb = r2_score(y_train, y_pred_train_gb)
test_r2_gb = r2_score(y_test, y_pred_test_gb)
test_mae_gb = mean_absolute_error(y_test, y_pred_test_gb)
test_rmse_gb = np.sqrt(mean_squared_error(y_test, y_pred_test_gb))
test_mape_gb = mean_absolute_percentage_error(y_test, y_pred_test_gb)

print("\nPerformance Metrics:")
print(f"  Train R²: {train_r2_gb:.4f}")
print(f"  Test R²: {test_r2_gb:.4f}")
print(f"  Test MAE: ${test_mae_gb:,.0f}")
print(f"  Test RMSE: ${test_rmse_gb:,.0f}")
print(f"  Test MAPE: {test_mape_gb:.1%}")
print()

# Feature importance
feature_importance_gb = pd.DataFrame({
    'feature': all_features,
    'importance': gb_model.feature_importances_
}).sort_values('importance', ascending=False)

print("Top 10 Most Important Features (by GB importance):")
print(feature_importance_gb.head(10).to_string(index=False))
print()

7. MODEL 4: GRADIENT BOOSTING

WHY GRADIENT BOOSTING:
  Business: Industry standard for pricing, high accuracy
  Technical: Sequential learning, corrects previous errors
  Use case: When we need best possible predictions

Training Gradient Boosting...

Performance Metrics:
  Train R²: 0.9779
  Test R²: 0.9670
  Test MAE: $51,315
  Test RMSE: $81,482
  Test MAPE: 18.6%

Top 10 Most Important Features (by GB importance):
              feature  importance
                price    0.228834
         price_ma_30d    0.202077
      product_encoded    0.147952
      segment_encoded    0.110623
          price_ma_7d    0.097988
     competitor_price    0.097488
price_diff_competitor    0.025583
            qty_ma_7d    0.019619
           qty_ma_30d    0.015748
    is_academic_start    0.010357
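
The cell fixes n_estimators at 200 without checking whether fewer or more trees would do better. One way to inspect this is staged_predict, which yields predictions after each boosting stage. The sketch below uses synthetic data and a separate validation slice; booster, X_val and the toy target are illustrative assumptions, and in the notebook the validation slice would need to come from the training period, not the test set.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(42)
X_demo = rng.uniform(-2, 2, size=(500, 3))
y_demo = np.sin(X_demo[:, 0]) + 0.5 * X_demo[:, 1] ** 2 + rng.normal(scale=0.2, size=500)
X_fit, X_val = X_demo[:400], X_demo[400:]
y_fit, y_val = y_demo[:400], y_demo[400:]

booster = GradientBoostingRegressor(
    n_estimators=200, learning_rate=0.05, max_depth=3, subsample=0.8, random_state=42
)
booster.fit(X_fit, y_fit)

# Validation RMSE after 1, 2, ..., 200 trees
val_rmse = [
    np.sqrt(mean_squared_error(y_val, stage_pred))
    for stage_pred in booster.staged_predict(X_val)
]
best_n = int(np.argmin(val_rmse)) + 1
print(f"Lowest validation RMSE at {best_n} trees: {val_rmse[best_n - 1]:.4f}")
```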



In [10]:
# ============================================================================
# MODEL COMPARISON
# ============================================================================
print("="*80)
print("8. MODEL COMPARISON")
print("="*80)
print()

comparison = pd.DataFrame({
    'Model': ['Linear Regression', 'Ridge Regression', 'Random Forest', 'Gradient Boosting'],
    'Train_R²': [train_r2_lr, train_r2_ridge, train_r2_rf, train_r2_gb],
    'Test_R²': [test_r2_lr, test_r2_ridge, test_r2_rf, test_r2_gb],
    'Test_MAE': [test_mae_lr, test_mae_ridge, test_mae_rf, test_mae_gb],
    'Test_RMSE': [test_rmse_lr, test_rmse_ridge, test_rmse_rf, test_rmse_gb],
    'Test_MAPE_%': [test_mape_lr*100, test_mape_ridge*100, test_mape_rf*100, test_mape_gb*100]
})

comparison = comparison.round(4)
print(comparison.to_string(index=False))
print()

# Select best model
best_idx = comparison['Test_R²'].idxmax()
best_model_name = comparison.loc[best_idx, 'Model']
best_model_r2 = comparison.loc[best_idx, 'Test_R²']

print(f"🏆 BEST MODEL: {best_model_name}")
print(f"   Test R²: {best_model_r2:.4f}")
print(f"   Explains {best_model_r2*100:.1f}% of profit variation")
print()

# Business interpretation
print("BUSINESS INTERPRETATION:")
if best_model_name == 'Gradient Boosting':
    print("  ✓ Gradient Boosting wins - complex non-linear relationships captured")
    print("  ✓ Trade-off: Less interpretable than linear models")
    print("  ✓ Solution: Use SHAP values or feature importance for explainability")
    best_model = gb_model
elif best_model_name == 'Random Forest':
    print("  ✓ Random Forest wins - good balance of accuracy and speed")
    print("  ✓ Feature importance built-in for explainability")
    best_model = rf_model
else:
    print("  ✓ Linear model wins - simple relationships, highly interpretable")
    print("  ✓ Can directly explain coefficient impact")
    best_model = ridge_model if best_model_name == 'Ridge Regression' else lr_model
print()


8. MODEL COMPARISON

            Model  Train_R²  Test_R²    Test_MAE   Test_RMSE  Test_MAPE_%
Linear Regression    0.8683   0.8548 130993.9822 170991.5566     273.5090
 Ridge Regression    0.8682   0.8549 130943.6738 170881.4504     272.9689
    Random Forest    0.9580   0.9385  64945.8018 111265.8225      17.1546
Gradient Boosting    0.9779   0.9670  51315.0579  81482.2924      18.6169

🏆 BEST MODEL: Gradient Boosting
   Test R²: 0.9670
   Explains 96.7% of profit variation

BUSINESS INTERPRETATION:
  ✓ Gradient Boosting wins - complex non-linear relationships captured
  ✓ Trade-off: Less interpretable than linear models
  ✓ Solution: Use SHAP values or feature importance for explainability
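
Speed is listed as a selection criterion, but the comparison covers accuracy only. Single-row prediction latency can be measured with time.perf_counter, as in the sketch below. The helper median_latency_ms and the synthetic data are illustrative assumptions; real numbers depend on hardware and on the fitted models.

```python
import time

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import Ridge


def median_latency_ms(model, row, repeats=50):
    """Median wall-clock time of model.predict on one row, in milliseconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        model.predict(row)
        timings.append((time.perf_counter() - start) * 1000.0)
    return float(np.median(timings))


rng = np.random.default_rng(0)
X_demo = rng.normal(size=(500, 5))
y_demo = 2.0 * X_demo[:, 0] + rng.normal(scale=0.1, size=500)
single_row = X_demo[:1]

candidates = [
    ("Ridge", Ridge(alpha=10.0)),
    ("Gradient Boosting", GradientBoostingRegressor(n_estimators=50, random_state=0)),
]
for name, model in candidates:
    model.fit(X_demo, y_demo)
    print(f"{name}: {median_latency_ms(model, single_row):.3f} ms per single-row prediction")
```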



In [11]:
# ============================================================================
# CROSS-VALIDATION
# ============================================================================
print("="*80)
print("9. CROSS-VALIDATION (TIME SERIES)")
print("="*80)
print()

print("WHY CROSS-VALIDATION:")
print("  - Validate model stability across different time periods")
print("  - Detect overfitting")
print("  - Ensure model works on future data")
print()

# Time series cross-validation
tscv = TimeSeriesSplit(n_splits=5)

print("Running 5-fold time series cross-validation on best model...")
cv_scores = cross_val_score(best_model, X_train, y_train, cv=tscv, 
                             scoring='r2', n_jobs=-1)

print(f"\nCV R² Scores: {[f'{s:.4f}' for s in cv_scores]}")
print(f"Mean CV R²: {cv_scores.mean():.4f} (+/- {cv_scores.std():.4f})")
print()

if cv_scores.std() < 0.05:
    print("✓ Low variance across folds - model is stable")
else:
    print("⚠ High variance - model performance varies by time period")
print()


9. CROSS-VALIDATION (TIME SERIES)

WHY CROSS-VALIDATION:
  - Validate model stability across different time periods
  - Detect overfitting
  - Ensure model works on future data

Running 5-fold time series cross-validation on best model...

CV R² Scores: ['0.8637', '0.9510', '0.9635', '0.9605', '0.9662']
Mean CV R²: 0.9410 (+/- 0.0390)

✓ Low variance across folds - model is stable
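
TimeSeriesSplit uses an expanding training window, so the first fold trains on far fewer rows than the last. That is one plausible reason the first score (0.8637) is lower than the rest, though the notebook does not test this. The sketch below prints the fold sizes for 8,000 training rows; n_train_rows and tscv_demo are illustrative names.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

n_train_rows = 8_000  # size of the training set in this notebook
tscv_demo = TimeSeriesSplit(n_splits=5)

# Only the number of rows matters for the split, so a placeholder array is enough
placeholder = np.zeros((n_train_rows, 1))
fold_sizes = [
    (len(train_idx), len(val_idx)) for train_idx, val_idx in tscv_demo.split(placeholder)
]
for fold, (n_fit, n_val) in enumerate(fold_sizes, start=1):
    print(f"Fold {fold}: train on first {n_fit:,} rows, validate on next {n_val:,}")
```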



In [None]:
# ============================================================================
# RESIDUAL ANALYSIS
# ============================================================================
print("="*80)
print("10. RESIDUAL ANALYSIS")
print("="*80)
print()

print("WHY RESIDUALS:")
print("  - Check for systematic errors (bias)")
print("  - Validate model assumptions")
print("  - Identify segments where model struggles")
print()

# Residuals of the selected model (Gradient Boosting in this run)
residuals = y_test - y_pred_test_gb

print("Residual Statistics:")
print(f"  Mean error: ${residuals.mean():,.0f} (should be ~0)")
print(f"  Median error: ${residuals.median():,.0f}")
print(f"  Std dev: ${residuals.std():,.0f}")
print()

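# Check for bias by segment. Select the test rows by index label so residuals stay
# aligned with df_sorted even if dropna() removed rows earlier.
df_test = df_sorted.loc[y_test.index].copy()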
df_test['residual'] = residuals.values
df_test['abs_error'] = np.abs(residuals.values)

print("Mean Absolute Error by Product:")
product_errors = df_test.groupby('product')['abs_error'].mean().sort_values(ascending=False)
print(product_errors.apply(lambda x: f"${x:,.0f}"))
print()

print("BUSINESS INSIGHT:")
worst_product = product_errors.index[0]
best_product = product_errors.index[-1]
print(f"  Hardest to predict: {worst_product} (error: ${product_errors.iloc[0]:,.0f})")
print(f"  Easiest to predict: {best_product} (error: ${product_errors.iloc[-1]:,.0f})")
print(f"  → May need product-specific models or more features for {worst_product}")
print()


10. RESIDUAL ANALYSIS

WHY RESIDUALS:
  - Check for systematic errors (bias)
  - Validate model assumptions
  - Identify segments where model struggles

Residual Statistics:
  Mean error: $-1,029 (should be ~0)
  Median error: $-718
  Std dev: $81,496

Mean Absolute Error by Product:
product
Microscope     $103,226
Centrifuge      $89,649
PCR_System      $55,657
Pipettes         $6,515
Reagent_Kit      $6,185
Name: abs_error, dtype: object

BUSINESS INSIGHT:
  Hardest to predict: Microscope (error: $103,226)
  Easiest to predict: Reagent_Kit (error: $6,185)
  → May need product-specific models or more features for Microscope
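
Absolute error in dollars tends to grow with the size of the product's profit, so the ranking above may partly reflect scale rather than model quality. The notebook does not report per-product profit levels, so this is only a possibility. The sketch below shows, on made-up numbers, how dividing each product's MAE by its mean absolute profit can reverse the ranking; the demo frame and column names are illustrative.

```python
import pandas as pd

demo = pd.DataFrame({
    "product": ["Microscope", "Microscope", "Pipettes", "Pipettes"],
    "profit": [1_000_000.0, 800_000.0, 20_000.0, 30_000.0],
    "predicted": [900_000.0, 880_000.0, 26_000.0, 24_000.0],
})
demo["abs_error"] = (demo["profit"] - demo["predicted"]).abs()
demo["abs_profit"] = demo["profit"].abs()

by_product = demo.groupby("product").agg(
    mae=("abs_error", "mean"),
    mean_abs_profit=("abs_profit", "mean"),
)
# Error relative to the typical profit size of each product
by_product["relative_mae"] = by_product["mae"] / by_product["mean_abs_profit"]
print(by_product)
```

In this toy example the high-ticket product has the larger dollar error but the smaller relative error.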



In [13]:

# ============================================================================
# SAVE FINAL MODEL
# ============================================================================
print("="*80)
print("11. SAVE FINAL MODEL FOR PRODUCTION")
print("="*80)
print()

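# Save model
joblib.dump(best_model, 'price_optimization_model.pkl')
print(f"✓ Saved: price_optimization_model.pkl ({best_model_name})")

# Save scaler (used only by the linear models; the tree-based models were
# trained on unscaled features and must not receive scaled input)
joblib.dump(scaler, 'feature_scaler.pkl')
print("✓ Saved: feature_scaler.pkl")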

# Save model metadata
model_metadata = {
    'model_type': best_model_name,
    'features': all_features,
    'target': target,
    'train_size': len(X_train),
    'test_size': len(X_test),
    'test_r2': float(best_model_r2),
    'test_mae': float(test_mae_gb),    # GB metrics: correct only while Gradient Boosting is the selected model
    'test_rmse': float(test_rmse_gb),
    'cv_mean_r2': float(cv_scores.mean()),
    'cv_std_r2': float(cv_scores.std())
}

with open('model_metadata.json', 'w') as f:
    json.dump(model_metadata, f, indent=2)
print("✓ Saved: model_metadata.json")
print()

# ============================================================================
# FINAL SUMMARY
# ============================================================================
print("="*80)
print("MODEL BUILDING COMPLETE - PRODUCTION READY")
print("="*80)
print()

print("DELIVERABLES:")
print("  ✓ Trained model: price_optimization_model.pkl")
print("  ✓ Feature scaler: feature_scaler.pkl")
print("  ✓ Model metadata: model_metadata.json")
print("  ✓ Feature definitions: feature_metadata.json")
print()

print("MODEL PERFORMANCE:")
print(f"  Test R²: {best_model_r2:.4f} ({best_model_r2*100:.1f}% variance explained)")
print(f"  Test MAE: ${test_mae_gb:,.0f}")
print(f"  Test MAPE: {test_mape_gb:.1%}")
print()

print("NEXT STEPS:")
print("  1. Build Streamlit app for interactive optimization")
print("  2. Test with real pricing scenarios")
print("  3. A/B test recommendations vs current pricing")
print("  4. Monitor model performance in production")
print("  5. Retrain monthly with new data")
print()

print("="*80)

11. SAVE FINAL MODEL FOR PRODUCTION

✓ Saved: price_optimization_model.pkl (Gradient Boosting)
✓ Saved: feature_scaler.pkl
✓ Saved: model_metadata.json

MODEL BUILDING COMPLETE - PRODUCTION READY

DELIVERABLES:
  ✓ Trained model: price_optimization_model.pkl
  ✓ Feature scaler: feature_scaler.pkl
  ✓ Model metadata: model_metadata.json
  ✓ Feature definitions: feature_metadata.json

MODEL PERFORMANCE:
  Test R²: 0.9670 (96.7% variance explained)
  Test MAE: $51,315
  Test MAPE: 18.6%

NEXT STEPS:
  1. Build Streamlit app for interactive optimization
  2. Test with real pricing scenarios
  3. A/B test recommendations vs current pricing
  4. Monitor model performance in production
  5. Retrain monthly with new data
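
The business objective is to predict profit at different price points, which in practice means loading the saved model and scanning a grid of candidate prices. The sketch below illustrates that loop end to end on a synthetic stand-in model with three features. The toy demand curve, the file name toy_price_model.pkl and the helper scan_prices are assumptions for illustration; with the real model, every price-derived feature (moving averages, competitor ratios and so on) would have to be recomputed for each candidate price.

```python
import os
import tempfile

import joblib
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic stand-in for the training data (not the notebook's dataset)
rng = np.random.default_rng(0)
price = rng.uniform(50, 150, size=2000)
competitor_price = rng.uniform(80, 120, size=2000)
demand = np.clip(200 - 1.5 * price + 0.5 * competitor_price, 0, None)
profit = (price - 40.0) * demand + rng.normal(scale=50, size=2000)

features = pd.DataFrame({
    "price": price,
    "competitor_price": competitor_price,
    "price_ratio_competitor": price / competitor_price,
})
toy_model = GradientBoostingRegressor(n_estimators=100, random_state=0).fit(features, profit)

# Round-trip through joblib, as the notebook does for the real model
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "toy_price_model.pkl")
    joblib.dump(toy_model, model_path)
    loaded = joblib.load(model_path)


def scan_prices(model, competitor, grid):
    """Predict profit for each candidate price and return the best one."""
    candidate_rows = pd.DataFrame({
        "price": grid,
        "competitor_price": competitor,
        "price_ratio_competitor": grid / competitor,  # derived feature recomputed per price
    })
    predicted = model.predict(candidate_rows)
    best = int(np.argmax(predicted))
    return float(grid[best]), float(predicted[best])


grid = np.linspace(50, 150, 101)
best_price, best_profit = scan_prices(loaded, competitor=100.0, grid=grid)
print(f"Best candidate price: ${best_price:,.2f} (predicted profit ${best_profit:,.0f})")
```

Tree ensembles produce piecewise-constant predictions, so the predicted-profit curve has flat steps and the argmax can jump between neighboring prices; restricting the grid to the price range seen in training avoids extrapolation.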

