# Hybrid Modelling of HUMINT Source Performance: ML-TSSP Model

This notebook walks through the **entire HUMINT ML-TSSP pipeline** as implemented in the project: data generation/preprocessing, classification (XGBoost + SMOTE), regression (GRU for reliability/deception), TSSP optimization, cost analysis, and advanced metrics (EVPI, EMV, sensitivity, efficiency frontier).

Run the notebook from the **project root** so that `src` and the config paths resolve correctly. GLPK (or CBC) must be installed for the TSSP and advanced-metrics sections.
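
Before running the optimization cells, it can help to confirm that a solver executable is actually reachable. The sketch below is not part of the project; it assumes the GLPK command-line binary is named `glpsol` and the CBC binary `cbc`, and only checks the `PATH`. The helper name `find_solvers` is hypothetical.

```python
import shutil


def find_solvers(candidates=("glpsol", "cbc")):
    """Return {executable name: resolved path or None} for each candidate."""
    return {name: shutil.which(name) for name in candidates}


# Report which solver binaries are visible from this Python process.
for name, path in find_solvers().items():
    print(f"{name}: {path or 'not found on PATH'}")
```

If both entries come back as not found, the `solve(solver_name="glpk")` call later in the notebook is unlikely to succeed.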

## 1. Title and Setup

In [1]:
import sys
from pathlib import Path

# Add project root to path (run notebook from project root). Matches src/pipeline.py logic.
PROJECT_ROOT = Path.cwd()
if str(PROJECT_ROOT) not in sys.path:
    sys.path.insert(0, str(PROJECT_ROOT))

import pandas as pd
import numpy as np
import joblib

from src.data import (
    generate_humint_dataset,
    prepare_classification_data,
    prepare_regression_data,
    scale_features,
    load_features_from_file,
)
from src.ml import ClassificationModelTrainer, RegressionModelTrainer
from src.optimization import TSSPModel
from src.analysis import (
    analyze_costs,
    generate_cost_report,
)
from src.analysis.advanced_metrics import (
    calculate_evpi,
    calculate_emv,
    sensitivity_analysis,
    generate_advanced_metrics_report,
    calculate_efficiency_frontier,
    plot_efficiency_frontier,
)
from src.utils.config import (
    PROJECT_ROOT as CONFIG_ROOT,
    MODELS_DIR,
    OUTPUT_DIR,
    BEHAVIOR_CLASSES,
    RECOURSE_COSTS,
    CLASSIFICATION_FEATURES_FILE,
    REGRESSION_FEATURES_FILE,
)
# Use config paths; config PROJECT_ROOT is parent of src/
PROJECT_ROOT = CONFIG_ROOT

import matplotlib.pyplot as plt
import seaborn as sns

RANDOM_SEED = 42
np.random.seed(RANDOM_SEED)
print(f"Project root: {PROJECT_ROOT}")
print(f"Models dir: {MODELS_DIR}")
print(f"Output dir: {OUTPUT_DIR}")

Project root: D:\Updated-FINAL DASH
Models dir: D:\Updated-FINAL DASH\models
Output dir: D:\Updated-FINAL DASH\output


## 2. Data — Load or Generate Synthetic HUMINT Dataset

In [2]:
DATA_PATH = PROJECT_ROOT / "humint_source_dataset_15000_enhanced.csv"
if DATA_PATH.exists():
    print(f"Loading dataset from: {DATA_PATH}")
    df = pd.read_csv(DATA_PATH)
else:
    print("Generating new dataset with 15000 sources...")
    df = generate_humint_dataset(
        n_sources=15000,
        random_seed=RANDOM_SEED,
        output_path=DATA_PATH,
    )
print(f"Dataset loaded: {len(df)} sources")

Loading dataset from: D:\Updated-FINAL DASH\humint_source_dataset_15000_enhanced.csv
Dataset loaded: 15000 sources


In [3]:
# EDA
print("Shape:", df.shape)
print("\nColumns:", list(df.columns))
print("\nBehavior class counts:")
print(df["behavior_class"].value_counts())
df.head()

Shape: (15000, 16)

Columns: ['source_id', 'task_success_rate', 'corroboration_score', 'report_timeliness', 'handler_confidence', 'deception_score', 'ci_flag', 'report_accuracy', 'report_frequency', 'access_level', 'information_value', 'handling_cost_kes', 'threat_relevant_features', 'reliability_score', 'behavior_class', 'scenario_probability']

Behavior class counts:
behavior_class
uncertain      6537
deceptive      5952
coerced        2482
cooperative      29
Name: count, dtype: int64


| | source_id | task_success_rate | corroboration_score | report_timeliness | handler_confidence | deception_score | ci_flag | report_accuracy | report_frequency | access_level | information_value | handling_cost_kes | threat_relevant_features | reliability_score | behavior_class | scenario_probability |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | SRC_00001 | 0.555 | 0.575 | 0.783 | 0.528 | 0.593 | 0 | 0.449 | 4 | 3 | 0.591 | 116416 | 1 | 0.43 | deceptive | 0.203 |
| 1 | SRC_00002 | 0.946 | 0.76 | 0.676 | 0.864 | 0.705 | 1 | 0.898 | 3 | 1 | 0.558 | 69237 | 3 | 0.454 | deceptive | 0.348 |
| 2 | SRC_00003 | 0.798 | 0.622 | 0.979 | 0.986 | 0.371 | 0 | 0.815 | 2 | 1 | 0.627 | 60343 | 4 | 0.567 | uncertain | 0.474 |
| 3 | SRC_00004 | 0.707 | 0.262 | 0.531 | 0.88 | 0.231 | 0 | 0.693 | 6 | 2 | 0.553 | 110441 | 4 | 0.448 | uncertain | 0.434 |
| 4 | SRC_00005 | 0.406 | 0.339 | 0.753 | 0.539 | 0.255 | 1 | 0.42 | 2 | 3 | 0.347 | 104764 | 5 | 0.277 | deceptive | 0.261 |


In [4]:
df.describe()

| | task_success_rate | corroboration_score | report_timeliness | handler_confidence | deception_score | ci_flag | report_accuracy | report_frequency | access_level | information_value | handling_cost_kes | threat_relevant_features | reliability_score | scenario_probability |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 15000.0 | 15000.0 | 15000.0 | 15000.0 | 15000.0 | 15000.0 | 15000.0 | 15000.0 | 15000.0 | 15000.0 | 15000.0 | 15000.0 | 15000.0 | 15000.0 |
| mean | 0.63766 | 0.576945 | 0.699178 | 0.645361 | 0.400574 | 0.120467 | 0.636433 | 4.020467 | 2.512667 | 0.470227 | 113178.9866 | 2.7592 | 0.473362 | 0.35282 |
| std | 0.196309 | 0.215348 | 0.173455 | 0.199181 | 0.229557 | 0.325517 | 0.209727 | 2.001112 | 1.328341 | 0.164145 | 33220.231543 | 1.788434 | 0.109111 | 0.127994 |
| min | 0.3 | 0.2 | 0.4 | 0.3 | 0.0 | 0.0 | 0.07 | 0.0 | 1.0 | 0.0 | 32919.0 | 0.0 | 0.111 | 0.0 |
| 25% | 0.468 | 0.393 | 0.548 | 0.474 | 0.203 | 0.0 | 0.469 | 3.0 | 1.0 | 0.355 | 87329.25 | 1.0 | 0.396 | 0.265 |
| 50% | 0.636 | 0.579 | 0.698 | 0.647 | 0.3995 | 0.0 | 0.638 | 4.0 | 2.0 | 0.468 | 109877.0 | 3.0 | 0.475 | 0.353 |
| 75% | 0.809 | 0.76 | 0.849 | 0.819 | 0.602 | 0.0 | 0.809 | 5.0 | 4.0 | 0.583 | 137107.75 | 4.0 | 0.552 | 0.441 |
| max | 0.98 | 0.95 | 1.0 | 0.99 | 0.8 | 1.0 | 1.0 | 13.0 | 5.0 | 1.0 | 231925.0 | 13.0 | 0.802 | 0.805 |


## 3. Classification — Behavior Prediction (XGBoost + SMOTE)

In [5]:
X_train, y_train, X_test, y_test, label_encoder = prepare_classification_data(
    df,
    feature_file=CLASSIFICATION_FEATURES_FILE,
    random_state=RANDOM_SEED,
)
classification_trainer = ClassificationModelTrainer(random_state=RANDOM_SEED)
X_train, y_train = classification_trainer.apply_smote(X_train, y_train)
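
The class counts above show a heavily under-represented `cooperative` class, which is what `apply_smote` is meant to address. The project's implementation is not shown here; the following is a simplified, stdlib-only sketch of the core SMOTE idea (a synthetic point is placed at a random position on the segment between a minority sample and one of its nearest minority neighbours). The function name `smote_sketch` and the toy points are hypothetical.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=42):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen point (excluding itself).
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        partner = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> partner
        synthetic.append([b + gap * (q - b) for b, q in zip(base, partner)])
    return synthetic


# Toy minority class: the four corners of the unit square.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_sketch(minority, n_new=6, k=2)
print(len(new_points))
```

Because every synthetic point is a convex combination of two real minority samples, it stays inside the region the minority class already occupies. Oversampling should be applied to the training split only, as the cell above does.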

In [6]:
xgb_results = classification_trainer.train_xgboost(X_train, y_train, X_test, y_test)
m = xgb_results["metrics"]
print(f"Accuracy:  {m['accuracy']:.4f}")
print(f"F1:        {m['f1']:.4f}")
print(f"Precision: {m['precision']:.4f}")
print(f"Recall:    {m['recall']:.4f}")
if "roc_auc" in m:
    print(f"ROC-AUC:   {m['roc_auc']:.4f}")
classification_trainer.best_model = xgb_results["model"]
classification_trainer.best_model_name = "xgboost"

Accuracy:  0.9950
F1:        0.9950
Precision: 0.9951
Recall:    0.9950
ROC-AUC:   1.0000
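
With a class as small as `cooperative`, aggregate scores can stay high even if the rare class is missed entirely, so per-class precision and recall are worth checking next to the numbers above. The sketch below is a stdlib-only illustration on made-up labels; `per_class_report` is a hypothetical helper, not a project function.

```python
from collections import Counter


def per_class_report(y_true, y_pred):
    """Return {label: (precision, recall)} computed from paired label lists."""
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    report = {}
    for label in sorted(true_counts | pred_counts):
        precision = hits[label] / pred_counts[label] if pred_counts[label] else 0.0
        recall = hits[label] / true_counts[label] if true_counts[label] else 0.0
        report[label] = (precision, recall)
    return report


# Illustrative labels only: the single cooperative source is misclassified.
y_true_demo = ["uncertain"] * 6 + ["deceptive"] * 3 + ["cooperative"]
y_pred_demo = ["uncertain"] * 6 + ["deceptive"] * 3 + ["uncertain"]
accuracy = sum(t == p for t, p in zip(y_true_demo, y_pred_demo)) / len(y_true_demo)
report = per_class_report(y_true_demo, y_pred_demo)
print(f"accuracy = {accuracy:.2f}")
for label, (precision, recall) in report.items():
    print(f"{label:12s} precision={precision:.2f} recall={recall:.2f}")
```

In this toy case accuracy is 0.90 while recall for the rare class is zero, which is the failure mode that aggregate metrics can hide.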


In [7]:
from sklearn.metrics import ConfusionMatrixDisplay

y_pred = classification_trainer.best_model.predict(X_test)
fig, ax = plt.subplots(figsize=(8, 6))
ConfusionMatrixDisplay.from_predictions(y_test, y_pred, ax=ax)
plt.title("Confusion Matrix (XGBoost)")
plt.tight_layout()
plt.show()



In [8]:
classification_trainer.save_model(MODELS_DIR / "classification_model.pkl", label_encoder)

Label encoder saved to: D:\Updated-FINAL DASH\models\classification_model_label_encoder.pkl
Model saved to: D:\Updated-FINAL DASH\models\classification_model.pkl


## 4. Regression — Reliability and Deception Scores (GRU)

In [9]:
X_train_r, y_train_r, X_test_r, y_test_r = prepare_regression_data(
    df,
    feature_file=REGRESSION_FEATURES_FILE,
    target_col="reliability_score",
    random_state=RANDOM_SEED,
)
X_train_scaled, X_test_scaled, reliability_scaler = scale_features(X_train_r, X_test_r)
reliability_trainer = RegressionModelTrainer(random_state=RANDOM_SEED)
rel_results = reliability_trainer.train_gru(
    X_train_scaled, y_train_r, X_test_scaled, y_test_r
)
reliability_trainer.best_model = rel_results["model"]
reliability_trainer.best_model_name = "gru"
rm = rel_results["metrics"]
print(f"Reliability GRU R²:   {rm['r2']:.4f}")
print(f"Reliability GRU RMSE: {rm['rmse']:.4f}")
print(f"Reliability GRU MAE:  {rm['mae']:.4f}")

Reliability GRU R²:   0.9539
Reliability GRU RMSE: 0.0233
Reliability GRU MAE:  0.0187
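
For reference, the three reported metrics can be written out directly. This is a stdlib-only sketch of the standard definitions on toy values; it is not the trainer's own code, and `regression_metrics` is a hypothetical name.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute R^2, RMSE and MAE from two equal-length sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": mae,
    }


# Toy targets and predictions, each off by 0.05.
metrics_demo = regression_metrics([0.2, 0.4, 0.6, 0.8], [0.25, 0.35, 0.65, 0.75])
print(metrics_demo)
```

As a rough cross-check, R² is approximately 1 − (RMSE/σ)², where σ is the standard deviation of the target. Using the full-dataset σ of about 0.109 for `reliability_score` from `df.describe()` and the RMSE of 0.0233 gives about 0.954, in line with the reported R²; the σ of the test split may differ slightly.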


In [10]:
reliability_trainer.save_model(MODELS_DIR / "reliability_model.keras")
joblib.dump(reliability_scaler, MODELS_DIR / "reliability_scaler.pkl")

Model saved to: D:\Updated-FINAL DASH\models\reliability_model.keras


['D:\\Updated-FINAL DASH\\models\\reliability_scaler.pkl']

In [11]:
X_train_d, y_train_d, X_test_d, y_test_d = prepare_regression_data(
    df,
    feature_file=REGRESSION_FEATURES_FILE,
    target_col="deception_score",
    random_state=RANDOM_SEED,
)
X_train_ds, X_test_ds, deception_scaler = scale_features(X_train_d, X_test_d)
deception_trainer = RegressionModelTrainer(random_state=RANDOM_SEED)
dec_results = deception_trainer.train_gru(
    X_train_ds, y_train_d, X_test_ds, y_test_d
)
deception_trainer.best_model = dec_results["model"]
deception_trainer.best_model_name = "gru"
dm = dec_results["metrics"]
print(f"Deception GRU R²:   {dm['r2']:.4f}")
print(f"Deception GRU RMSE: {dm['rmse']:.4f}")
print(f"Deception GRU MAE:  {dm['mae']:.4f}")

Deception GRU R²:   0.6375
Deception GRU RMSE: 0.1379
Deception GRU MAE:  0.1107


In [12]:
deception_trainer.save_model(MODELS_DIR / "deception_model.keras")
joblib.dump(deception_scaler, MODELS_DIR / "deception_scaler.pkl")

Model saved to: D:\Updated-FINAL DASH\models\deception_model.keras


['D:\\Updated-FINAL DASH\\models\\deception_scaler.pkl']

## 5. TSSP — Two-Stage Stochastic Optimization

In [13]:
opt_n_sources = 100
opt_n_tasks = 10
sources_df = df.head(opt_n_sources).copy()
sources = sources_df["source_id"].tolist()
tasks = [f"TASK_{i:03d}" for i in range(1, opt_n_tasks + 1)]

# Behavior probabilities P(b | s) from the classifier.
features = load_features_from_file(CLASSIFICATION_FEATURES_FILE)
available_features = [f for f in features if f in sources_df.columns]
X_pred = sources_df[available_features]
proba = classification_trainer.best_model.predict_proba(X_pred)

# NOTE: assumes BEHAVIOR_CLASSES lists the classes in the same order as the
# columns returned by predict_proba.
behavior_probabilities = {}
for idx, source_id in enumerate(sources):
    for class_idx, behavior in enumerate(BEHAVIOR_CLASSES):
        behavior_probabilities[(source_id, behavior)] = float(proba[idx, class_idx])

# Reliability and deception scores from the GRU regressors.
reg_features = load_features_from_file(REGRESSION_FEATURES_FILE)
available_reg = [f for f in reg_features if f in sources_df.columns]
X_reg = sources_df[available_reg]

# Scale with each model's own scaler, then reshape to
# (samples, timesteps=1, features) as expected by the GRU.
X_reg_rel = reliability_scaler.transform(X_reg)
X_reg_rel = X_reg_rel.reshape(X_reg_rel.shape[0], 1, X_reg_rel.shape[1])
reliability_predictions = reliability_trainer.best_model.predict(X_reg_rel, verbose=0).flatten()
X_reg_dec = deception_scaler.transform(X_reg)
X_reg_dec = X_reg_dec.reshape(X_reg_dec.shape[0], 1, X_reg_dec.shape[1])
deception_predictions = deception_trainer.best_model.predict(X_reg_dec, verbose=0).flatten()
sources_df["predicted_reliability"] = reliability_predictions
sources_df["predicted_deception"] = deception_predictions
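
The probability dictionary built in this cell pairs column `class_idx` of `predict_proba` with entry `class_idx` of `BEHAVIOR_CLASSES`, which is only correct if both use the same class order. If the labels were encoded with scikit-learn's `LabelEncoder`, its `classes_` attribute holds the labels in sorted order, and that is the order the columns follow; whether `BEHAVIOR_CLASSES` matches is not visible from this notebook. The stdlib-only sketch below shows the mapping driven by an explicit class order; `probabilities_by_class` and the sample values are hypothetical.

```python
def probabilities_by_class(source_ids, proba_rows, class_order):
    """Key each probability by (source_id, class name) using the model's column order."""
    table = {}
    for source_id, row in zip(source_ids, proba_rows):
        if len(row) != len(class_order):
            raise ValueError("probability row width does not match class_order")
        for name, p in zip(class_order, row):
            table[(source_id, name)] = float(p)
    return table


# Assumed column order: alphabetical, as a LabelEncoder would produce.
class_order_demo = ["coerced", "cooperative", "deceptive", "uncertain"]
table_demo = probabilities_by_class(
    ["SRC_A"], [[0.1, 0.0, 0.6, 0.3]], class_order_demo
)
print(table_demo[("SRC_A", "deceptive")])
```

In the notebook, passing `list(label_encoder.classes_)` as `class_order` would tie the names to the model's actual column order, assuming `label_encoder` is the encoder used to fit the classifier.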

In [14]:
# Stage 1 cost: the lower the predicted reliability, the costlier the source
# is to task (scaled by 10). The same cost applies to every task.
stage1_costs = {}
for source_id in sources:
    row = sources_df[sources_df["source_id"] == source_id].iloc[0]
    base = 10.0 * (1.0 - row["predicted_reliability"])
    for task_id in tasks:
        stage1_costs[(source_id, task_id)] = round(base, 2)

# Information value: mean of the predicted reliability and the dataset's
# information_value (0.5 if that column is missing).
information_values = {}
for source_id in sources:
    row = sources_df[sources_df["source_id"] == source_id].iloc[0]
    info_val = row.get("information_value", 0.5)
    base_value = (row["predicted_reliability"] + info_val) / 2
    for task_id in tasks:
        information_values[(source_id, task_id)] = base_value
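
To make the two formulas in this cell concrete, here they are as standalone functions with a worked example. The inputs reuse the first row of the dataset (`reliability_score` 0.43, `information_value` 0.591) as a stand-in for the model's prediction, which is an assumption made only for illustration; the function names are hypothetical.

```python
def stage1_cost(predicted_reliability, scale=10.0):
    """Tasking cost: scale * (1 - reliability), rounded to 2 decimals."""
    return round(scale * (1.0 - predicted_reliability), 2)


def information_value(predicted_reliability, info_value=0.5):
    """Mean of predicted reliability and the source's information value."""
    return (predicted_reliability + info_value) / 2


cost_demo = stage1_cost(0.43)
value_demo = information_value(0.43, 0.591)
print(f"stage 1 cost: {cost_demo}, information value: {value_demo:.4f}")
```

A perfectly reliable source (1.0) would cost 0 to task and a fully unreliable one (0.0) would cost 10, so the optimizer is steered toward sources the GRU rates as reliable.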

In [15]:
tssp_inputs = {
    "sources": sources,
    "tasks": tasks,
    "behavior_classes": BEHAVIOR_CLASSES,
    "behavior_probabilities": behavior_probabilities,
    "stage1_costs": stage1_costs,
    "recourse_costs": RECOURSE_COSTS,
    "information_values": information_values,
}
tssp_model = TSSPModel(**tssp_inputs)
tssp_model.build_model()
success = tssp_model.solve(solver_name="glpk")
print(f"TSSP solve success: {success}")
if success:
    print(f"Objective value: {tssp_model.solution.get('objective_value', None)}")
    n_assign = sum(1 for v in tssp_model.solution.get("assignments", {}).values() if v)
    print(f"Number of assignments: {n_assign}")

TSSP solve success: True
Objective value: 34.16757663701156
Number of assignments: 10
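
The objective value combines a first-stage tasking cost with an expected second-stage recourse cost. The internals of `TSSPModel` are not shown in this notebook, so the sketch below is an assumption about the general shape of such an objective: for a fixed set of assignments, sum the Stage 1 costs and add, for each assigned source, the recourse cost of each behavior weighted by its probability. It also assumes `recourse_costs` maps a behavior name to a scalar cost. All names and numbers are illustrative.

```python
def expected_total_cost(assignments, stage1_costs, behavior_probabilities,
                        recourse_costs):
    """Return (stage 1 cost, expected stage 2 cost, total) for fixed assignments."""
    stage1 = sum(stage1_costs[pair] for pair in assignments)
    stage2 = 0.0
    for source_id, _task_id in assignments:
        for behavior, penalty in recourse_costs.items():
            # Recourse cost weighted by P(behavior | source).
            stage2 += behavior_probabilities[(source_id, behavior)] * penalty
    return stage1, stage2, stage1 + stage2


assignments_demo = [("S1", "T1"), ("S2", "T2")]
stage1_demo = {("S1", "T1"): 3.0, ("S2", "T2"): 4.0}
probs_demo = {
    ("S1", "cooperative"): 0.9, ("S1", "deceptive"): 0.1,
    ("S2", "cooperative"): 0.5, ("S2", "deceptive"): 0.5,
}
recourse_demo = {"cooperative": 0.0, "deceptive": 2.0}
s1, s2, total = expected_total_cost(
    assignments_demo, stage1_demo, probs_demo, recourse_demo
)
print(f"stage 1 = {s1:.2f}, stage 2 (expected) = {s2:.2f}, total = {total:.2f}")
```

The solver's job is to choose the assignments that minimize this total subject to the model's constraints; the function here only evaluates a given choice.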


## 6. Cost Analysis and Reporting

In [16]:
analysis_results = analyze_costs(tssp_model, output_dir=OUTPUT_DIR)
decomposition = analysis_results["decomposition"]
verification = analysis_results["verification"]
print("Stage 1 cost:", decomposition["stage1_cost"])
print("Stage 2 expected cost:", decomposition["stage2_expected_cost"])
print("Verified:", verification["verified"])


Passing `palette` without assigning `hue` is deprecated and will be removed in v0.14.0. Assign the `x` variable to `hue` and set `legend=False` for the same effect.

  sns.barplot(x=behavior_classes, y=costs, palette='viridis')


Saved plot to: D:\Updated-FINAL DASH\output\cost_by_behavior.png



Passing `palette` without assigning `hue` is deprecated and will be removed in v0.14.0. Assign the `x` variable to `hue` and set `legend=False` for the same effect.

  sns.barplot(x=sources, y=costs, palette='viridis')


Saved plot to: D:\Updated-FINAL DASH\output\cost_by_source.png


Saved plot to: D:\Updated-FINAL DASH\output\cost_pie_chart.png
Stage 1 cost: 34.02
Stage 2 expected cost: 0.14757641766646507
Verified: True


In [17]:
report_path = OUTPUT_DIR / "cost_analysis_report.txt"
report_text = generate_cost_report(decomposition, verification, output_path=report_path)
print(report_text)

Report saved to: D:\Updated-FINAL DASH\output\cost_analysis_report.txt
TSSP COST ANALYSIS REPORT

COST VERIFICATION:
  Optimal Objective Value: 34.17
  Calculated Total Cost: 34.17
  Difference: 0.000000
  Verified: ✓

COST DECOMPOSITION:
  Stage 1 Cost (Strategic Tasking): 34.02
  Stage 2 Expected Recourse Cost: 0.15
  Total Expected Cost: 34.17
  Stage 2 Proportion: 0.43%

STAGE 2 COST BY BEHAVIOR CLASS:
  deceptive      :       0.10 (67.7%)
  coerced        :       0.04 (28.1%)
  uncertain      :       0.01 ( 4.2%)
  cooperative    :       0.00 ( 0.0%)

TOP 10 SOURCES BY STAGE 2 COST:
  SRC_00054      :       0.04
  SRC_00012      :       0.02
  SRC_00013      :       0.02
  SRC_00063      :       0.02
  SRC_00034      :       0.01
  SRC_00071      :       0.01
  SRC_00035      :       0.01
  SRC_00087      :       0.01
  SRC_00052      :       0.01
  SRC_00070      :       0.01



In [18]:
pd.DataFrame({
    "Stage 1": [decomposition["stage1_cost"]],
    "Stage 2 (expected)": [decomposition["stage2_expected_cost"]],
    "Total": [decomposition["stage1_cost"] + decomposition["stage2_expected_cost"]],
})

| | Stage 1 | Stage 2 (expected) | Total |
|---|---|---|---|
| 0 | 34.02 | 0.147576 | 34.167576 |
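
The verification and the Stage 2 proportion in the cost report can be reproduced by hand from the values printed above. This sketch only redoes the arithmetic; it does not call the project code.

```python
# Values copied from the notebook output above.
stage1 = 34.02
stage2 = 0.14757641766646507
objective = 34.16757663701156

total = stage1 + stage2
difference = abs(objective - total)
stage2_share = 100.0 * stage2 / total

print(f"Total: {total:.2f}")
print(f"Difference: {difference:.6f}")
print(f"Stage 2 proportion: {stage2_share:.2f}%")
```

The recomputed total agrees with the solver objective to six decimal places, and the expected recourse cost accounts for about 0.43% of the total, so almost all of the cost in this run comes from Stage 1 tasking.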


## 7. Advanced Metrics — EVPI, EMV, Sensitivity, Efficiency Frontier

In [19]:
evpi_results = None
emv_results = None
sensitivity_results = None
frontier_results = None

try:
    evpi_results = calculate_evpi(
        tssp_model=tssp_model,
        behavior_classes=BEHAVIOR_CLASSES,
        behavior_probabilities=tssp_inputs["behavior_probabilities"],
        sources=tssp_inputs["sources"],
        tasks=tssp_inputs["tasks"],
        stage1_costs=tssp_inputs["stage1_costs"],
        recourse_costs=tssp_inputs["recourse_costs"],
        solver_name="glpk",
    )
    print(f"EVPI: {evpi_results.get('evpi', 0):.2f}")
    print(f"EVPI %: {evpi_results.get('evpi_percentage', 0):.2f}%")
except Exception as e:
    print(f"EVPI failed: {e}")

try:
    emv_results = calculate_emv(
        tssp_model=tssp_model,
        information_values=tssp_inputs.get("information_values"),
    )
    print(f"EMV: {emv_results.get('emv', 0):.2f}")
    print(f"Information value: {emv_results.get('information_value', 0):.2f}")
except Exception as e:
    print(f"EMV failed: {e}")

  Calculating wait-and-see value for 4 scenarios...


EVPI: -102.41
EVPI %: -299.72%
EMV: -27.87
Information value: 6.30
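
For a cost-minimization problem, EVPI is the here-and-now expected cost minus the wait-and-see expected cost. With perfect information the decision can be tailored to each scenario, so the wait-and-see cost cannot exceed the here-and-now cost and EVPI should not be negative. The stdlib-only sketch below illustrates the definition on a toy problem with enumerated decisions; it is not the implementation of `calculate_evpi`, and all names and numbers are hypothetical.

```python
def evpi_minimisation(decisions, scenarios, cost):
    """scenarios: {name: probability}; cost(decision, scenario) -> float."""
    # Here-and-now: one decision is fixed before the scenario is known.
    here_and_now = min(
        sum(p * cost(d, s) for s, p in scenarios.items()) for d in decisions
    )
    # Wait-and-see: the best decision is chosen per scenario, and the
    # per-scenario optima are weighted by scenario probability.
    wait_and_see = sum(
        p * min(cost(d, s) for d in decisions) for s, p in scenarios.items()
    )
    return here_and_now, wait_and_see, here_and_now - wait_and_see


cost_table = {
    ("reliable_source", "cooperative"): 5.0,
    ("reliable_source", "deceptive"): 6.0,
    ("cheap_source", "cooperative"): 3.0,
    ("cheap_source", "deceptive"): 12.0,
}
hn, ws, evpi = evpi_minimisation(
    decisions=["reliable_source", "cheap_source"],
    scenarios={"cooperative": 0.7, "deceptive": 0.3},
    cost=lambda d, s: cost_table[(d, s)],
)
print(f"here-and-now = {hn:.2f}, wait-and-see = {ws:.2f}, EVPI = {evpi:.2f}")
```

Against that definition, the negative EVPI printed above deserves a closer look. The advanced metrics report lists a wait-and-see value of 136.58, roughly four times the here-and-now value of 34.17, and the calculation runs over 4 scenarios. One possible explanation, which this notebook cannot confirm, is that the per-scenario optima are summed without probability weights.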


In [20]:
try:
    sensitivity_results = sensitivity_analysis(
        tssp_model=tssp_model,
        behavior_classes=BEHAVIOR_CLASSES,
        behavior_probabilities=tssp_inputs["behavior_probabilities"],
        sources=tssp_inputs["sources"],
        tasks=tssp_inputs["tasks"],
        stage1_costs=tssp_inputs["stage1_costs"],
        recourse_costs=tssp_inputs["recourse_costs"],
        variation_range=0.2,
        solver_name="glpk",
        output_dir=OUTPUT_DIR,
    )
    print("Sensitivity analysis done. Baseline:", sensitivity_results.get("baseline_value"))
except Exception as e:
    print(f"Sensitivity failed: {e}")

Performing sensitivity analysis on recourse costs...


Performing sensitivity analysis on behavior probabilities...


Performing sensitivity analysis on Stage 1 costs...


Saved sensitivity plot to: D:\Updated-FINAL DASH\output\sensitivity_recourse_costs.png


Saved sensitivity plot to: D:\Updated-FINAL DASH\output\sensitivity_behavior_probs.png
Sensitivity analysis done. Baseline: 34.16757663701156
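
The idea behind `variation_range=0.2` can be sketched as a one-at-a-time analysis: each parameter is scaled between 80% and 120% of its baseline while the others are held fixed, and the spread of the objective is recorded. The project's `sensitivity_analysis` presumably re-solves the TSSP at each step; the stdlib-only sketch below substitutes a simple closed-form objective, and every name and number in it is hypothetical.

```python
def one_at_a_time_sensitivity(objective, baseline_params,
                              variation_range=0.2, steps=5):
    """Vary each parameter by +/- variation_range and record the objective spread."""
    results = {}
    for name, base_value in baseline_params.items():
        values = []
        for i in range(steps):
            # Factor moves linearly from 1 - range to 1 + range.
            factor = 1.0 - variation_range + 2.0 * variation_range * i / (steps - 1)
            trial = dict(baseline_params, **{name: base_value * factor})
            values.append(objective(trial))
        results[name] = {
            "min": min(values),
            "max": max(values),
            "range": max(values) - min(values),
        }
    return results


def toy_objective(recourse):
    # Fixed tasking cost plus probability-weighted recourse costs (made-up weights).
    return 10.0 + 0.02 * recourse["deceptive"] + 0.01 * recourse["coerced"]


sens_demo = one_at_a_time_sensitivity(
    toy_objective, {"deceptive": 5.0, "coerced": 4.0}
)
for name, stats in sens_demo.items():
    print(f"{name:10s} range = {stats['range']:.3f}")
```

A small range for a parameter means the objective barely moves when that parameter is perturbed by ±20%.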


In [21]:
try:
    frontier_results = calculate_efficiency_frontier(
        sources=tssp_inputs["sources"],
        tasks=tssp_inputs["tasks"],
        behavior_classes=BEHAVIOR_CLASSES,
        behavior_probabilities=tssp_inputs["behavior_probabilities"],
        stage1_costs=tssp_inputs["stage1_costs"],
        recourse_costs=tssp_inputs["recourse_costs"],
        n_scenarios=20,
        solver_name="glpk",
    )
    plot_efficiency_frontier(
        frontier_results,
        output_path=OUTPUT_DIR / "efficiency_frontier.png",
    )
    print(f"Efficiency frontier: {len(frontier_results['frontier_points'])} points")
except Exception as e:
    print(f"Efficiency frontier failed: {e}")

Calculating efficiency frontier with 20 allocation scenarios...


Efficiency frontier calculated: 9 frontier points, 11 dominated points


Efficiency frontier plot saved to: D:\Updated-FINAL DASH\output\efficiency_frontier.png
Efficiency frontier: 9 points
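
The split into frontier and dominated points reported above can be illustrated with a plain Pareto filter. The axes used by `calculate_efficiency_frontier` are not shown in this notebook; the sketch assumes each allocation scenario is a `(cost, value)` pair where lower cost and higher value are better. The function name and the points are hypothetical.

```python
def efficiency_frontier(points):
    """Keep the (cost, value) points that no other point dominates."""
    frontier = []
    for cost, value in points:
        # A point is dominated if another is no worse on both axes
        # and strictly better on at least one.
        dominated = any(
            (c <= cost and v >= value) and (c < cost or v > value)
            for c, v in points
        )
        if not dominated:
            frontier.append((cost, value))
    return sorted(frontier)


points_demo = [(1, 1), (2, 3), (3, 2), (4, 5), (5, 4)]
frontier_demo = efficiency_frontier(points_demo)
print(frontier_demo)  # [(1, 1), (2, 3), (4, 5)]
```

Here `(3, 2)` is dominated by `(2, 3)` and `(5, 4)` by `(4, 5)`, so three of the five points remain on the frontier.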


In [22]:
if evpi_results and emv_results and sensitivity_results:
    try:
        adv_report = generate_advanced_metrics_report(
            evpi_results=evpi_results,
            emv_results=emv_results,
            sensitivity_results=sensitivity_results,
            output_path=OUTPUT_DIR / "advanced_metrics_report.txt",
        )
        print(adv_report)
    except Exception as e:
        print(f"Advanced metrics report failed: {e}")
else:
    print("Skipping advanced metrics report (EVPI/EMV/sensitivity missing).")

Advanced metrics report saved to: D:\Updated-FINAL DASH\output\advanced_metrics_report.txt
ADVANCED METRICS REPORT

EXPECTED VALUE OF PERFECT INFORMATION (EVPI):
  Current Value (Here-and-Now): 34.17
  Wait-and-See Value (Perfect Info): 136.58
  EVPI: -102.41
  EVPI Percentage: -299.72%
  Interpretation: EVPI = -102.41 means we would save up to -102.41 units (-299.7%) with perfect information about behaviors.

EXPECTED MISSION VALUE (EMV):
  Total Cost: 34.17
  Stage 1 Cost: 34.02
  Stage 2 Cost: 0.15
  Information Value: 6.30
  EMV (Net Mission Value): -27.87
  EMV per Source: -2.79
  Interpretation: EMV = -27.87 represents the net mission value (Information Value: 6.30 - Total Cost: 34.17).

SENSITIVITY ANALYSIS:
  Baseline Objective Value: 34.17

  Recourse Cost Sensitivity:
    deceptive      : Range = 0.04 (Min: 34.15, Max: 34.19)
    coerced        : Range = 0.02 (Min: 34.16, Max: 34.18)
    uncertain      : Range = 0.00 (Min: 34.17, Max: 34.17)

  Behavior Probability Sensitivit

## 8. (Optional) Run Full Pipeline in One Go

In [23]:
from src.pipeline import MLTSSPPipeline

data_path = PROJECT_ROOT / "humint_source_dataset_15000_enhanced.csv"
pipeline = MLTSSPPipeline(data_path=data_path if data_path.exists() else None, random_seed=RANDOM_SEED)
results = pipeline.run_full_pipeline(
    n_sources=15000,
    opt_n_sources=100,
    opt_n_tasks=10,
    train_ml=True,
    solver_name="glpk",
)
print("Results keys:", list(results.keys()))
if "tssp" in results:
    print("TSSP solved:", results["tssp"].get("solved"))
if "analysis" in results:
    print("Analysis keys:", list(results["analysis"].keys()))


HUMINT ML-TSSP PIPELINE
Loading dataset from: D:\Updated-FINAL DASH\humint_source_dataset_15000_enhanced.csv
Dataset loaded: 15000 sources

TRAINING CLASSIFICATION MODEL
Applying SMOTE for class imbalance...

Training XGBoost Classifier (best performing model)...


XGBoost Accuracy: 0.9950
XGBoost F1-Score: 0.9950
XGBoost Precision: 0.9951
XGBoost Recall: 0.9950
Label encoder saved to: D:\Updated-FINAL DASH\models\classification_model_label_encoder.pkl
Model saved to: D:\Updated-FINAL DASH\models\classification_model.pkl

TRAINING REGRESSION MODELS

Training Reliability Score Model with GRU (best performing model)...


Reliability GRU R²: 0.9545
Reliability GRU RMSE: 0.0231
Reliability GRU MAE: 0.0186
Model saved to: D:\Updated-FINAL DASH\models\reliability_model.keras

Training Deception Score Model with GRU (best performing model)...


Deception GRU R²: 0.6381
Deception GRU RMSE: 0.1378
Deception GRU MAE: 0.1111
Model saved to: D:\Updated-FINAL DASH\models\deception_model.keras

PREPARING TSSP INPUTS
Generating behavior probabilities P(b | s) from XGBoost Classifier...
Generating reliability and deception scores from GRU models...


Calculating Stage 1 costs using ML predictions...
Calculating information values using ML predictions...

Summary of ML predictions used in TSSP:
  - Behavior probabilities: XGBoost Classifier
  - Reliability scores: GRU Regressor (mean: 0.457)
  - Deception scores: GRU Regressor (mean: 0.367)
Prepared inputs for 100 sources and 10 tasks

SOLVING TSSP OPTIMIZATION MODEL
Model built with 100 sources, 10 tasks, 4 behavior classes



✓ Optimization solved successfully!
  Optimal Objective Value: 34.16
  Number of assignments: 10

ANALYZING RESULTS



Passing `palette` without assigning `hue` is deprecated and will be removed in v0.14.0. Assign the `x` variable to `hue` and set `legend=False` for the same effect.

  sns.barplot(x=behavior_classes, y=costs, palette='viridis')


Saved plot to: D:\Updated-FINAL DASH\output\cost_by_behavior.png



Passing `palette` without assigning `hue` is deprecated and will be removed in v0.14.0. Assign the `x` variable to `hue` and set `legend=False` for the same effect.

  sns.barplot(x=sources, y=costs, palette='viridis')


Saved plot to: D:\Updated-FINAL DASH\output\cost_by_source.png


Saved plot to: D:\Updated-FINAL DASH\output\cost_pie_chart.png
Report saved to: D:\Updated-FINAL DASH\output\cost_analysis_report.txt

TSSP COST ANALYSIS REPORT

COST VERIFICATION:
  Optimal Objective Value: 34.16
  Calculated Total Cost: 34.16
  Difference: 0.000000
  Verified: ✓

COST DECOMPOSITION:
  Stage 1 Cost (Strategic Tasking): 34.02
  Stage 2 Expected Recourse Cost: 0.14
  Total Expected Cost: 34.16
  Stage 2 Proportion: 0.42%

STAGE 2 COST BY BEHAVIOR CLASS:
  deceptive      :       0.09 (66.4%)
  coerced        :       0.04 (29.2%)
  uncertain      :       0.01 ( 4.3%)
  cooperative    :       0.00 ( 0.0%)

TOP 10 SOURCES BY STAGE 2 COST:
  SRC_00054      :       0.04
  SRC_00012      :       0.02
  SRC_00013      :       0.02
  SRC_00063      :       0.02
  SRC_00071      :       0.01
  SRC_00035      :       0.01
  SRC_00087      :       0.01
  SRC_00026      :       0.01
  SRC_00052      :       0.01
  SRC_00070      :       0.01


CALCULATING ADVANCED METRICS

PREPARING

Calculating Stage 1 costs using ML predictions...
Calculating information values using ML predictions...

Summary of ML predictions used in TSSP:
  - Behavior probabilities: XGBoost Classifier
  - Reliability scores: GRU Regressor (mean: 0.457)
  - Deception scores: GRU Regressor (mean: 0.367)
Prepared inputs for 100 sources and 10 tasks

Calculating Expected Value of Perfect Information (EVPI)...
  Calculating wait-and-see value for 4 scenarios...


  EVPI: -102.39
  EVPI Percentage: -299.72%

Calculating Expected Mission Value (EMV)...
  EMV: -27.85
  Information Value: 6.31

Performing Sensitivity Analysis...
Performing sensitivity analysis on recourse costs...


Performing sensitivity analysis on behavior probabilities...


Performing sensitivity analysis on Stage 1 costs...


Saved sensitivity plot to: D:\Updated-FINAL DASH\output\sensitivity_recourse_costs.png


Saved sensitivity plot to: D:\Updated-FINAL DASH\output\sensitivity_behavior_probs.png
  Sensitivity analysis completed

Calculating Efficiency Frontier...
Calculating efficiency frontier with 20 allocation scenarios...


Efficiency frontier calculated: 9 frontier points, 11 dominated points
  Efficiency frontier calculated: 9 frontier points


Efficiency frontier plot saved to: D:\Updated-FINAL DASH\output\efficiency_frontier.png
Advanced metrics report saved to: D:\Updated-FINAL DASH\output\advanced_metrics_report.txt

ADVANCED METRICS REPORT

EXPECTED VALUE OF PERFECT INFORMATION (EVPI):
  Current Value (Here-and-Now): 34.16
  Wait-and-See Value (Perfect Info): 136.55
  EVPI: -102.39
  EVPI Percentage: -299.72%
  Interpretation: EVPI = -102.39 means we would save up to -102.39 units (-299.7%) with perfect information about behaviors.

EXPECTED MISSION VALUE (EMV):
  Total Cost: 34.16
  Stage 1 Cost: 34.02
  Stage 2 Cost: 0.14
  Information Value: 6.31
  EMV (Net Mission Value): -27.85
  EMV per Source: -2.79
  Interpretation: EMV = -27.85 represents the net mission value (Information Value: 6.31 - Total Cost: 34.16).

SENSITIVITY ANALYSIS:
  Baseline Objective Value: 34.16

  Recourse Cost Sensitivity:
    deceptive      : Range = 0.04 (Min: 34.14, Max: 34.18)
    coerced        : Range = 0.02 (Min: 34.15, Max: 34.17)
    