This notebook shows how to train a predictive model with the decision-focused learning (DFL) approaches SPO+ [1] and PFYL [2] on the shortest path problem, using PyEPO [3].

In [None]:
import os
import sys

path_to_project = os.path.dirname(os.path.abspath("")) + "/"
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath("decision-focused-learning-codebase"))))
print(path_to_project)

In [1]:
from src.concrete_models.grbpy_shortest_path import ShortestPath
from src.decision_makers.differentiable_decision_maker import (
    DifferentiableDecisionMaker,
)
from src.generate_data_functions.generate_data_shortestpath import gen_data_shortestpath
from src.problem import Problem
from src.runner import Runner

/Users/noah/Documents/Python projects/decision-focused-learning-codebase/
Auto-Sklearn cannot be imported.




The goal in the shortest path problem is to move from the start (NW corner) to the end (SE corner) of a 2D grid at minimal total cost. In the predict-and-optimize setting considered here, the cost of each edge is uncertain but correlated with some feature values. Our aim is to learn a predictor of the edge costs such that the resulting decision (the chosen path) has minimal true cost on average.


We first construct the `OptimizationModel`, here a `ShortestPath` instance on a 10x10 grid:

In [2]:
model = ShortestPath(grid=(10, 10))
print(f"For a grid of size {model.grid}, there are {len(model.arcs)} arcs.")
print(f"The uncertain parameters we have to predict for this problem are {model.param_to_predict_names} with size {model.param_to_predict_shapes}.")

Set parameter Username
Academic license - for non-commercial use only - expires 2026-05-16
For a grid of size (10, 10), there are 180 arcs.
The uncertain parameters we have to predict for this problem are ['c'] with size {'c': (180,)}.
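
The arc count above can be checked by hand. The sketch below assumes that every grid node has one arc to its east neighbour and one to its south neighbour, which gives 180 arcs on a 10x10 grid. The helper `grid_arcs` and its arc ordering are illustrative assumptions and need not match how `ShortestPath` builds `model.arcs`.

```python
def grid_arcs(rows, cols):
    """Enumerate directed arcs (tail, head) of a rows x cols grid.

    Nodes are numbered row by row. Each node gets an arc to its east
    neighbour and to its south neighbour, when those exist.
    (Hypothetical helper; the ordering is an assumption.)
    """
    arcs = []
    for r in range(rows):
        for c in range(cols):
            node = r * cols + c
            if c + 1 < cols:
                arcs.append((node, node + 1))      # move east
            if r + 1 < rows:
                arcs.append((node, node + cols))   # move south
    return arcs


arcs = grid_arcs(10, 10)
print(len(arcs))  # 90 east arcs + 90 south arcs = 180
```

Under this assumption, every NW-to-SE path uses exactly 18 of the 180 arcs.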


We generate synthetic data with a data generation script. In practice, this step could be replaced by reading the data from a file.

In [3]:
data_dict = gen_data_shortestpath(
    seed=5,
    num_data=500,
    num_features=5,
    grid=(10, 10),
    polynomial_degree=5,
    noise_width=0.5,
)
print(f"The data dictionary contains the following: {list(data_dict.keys())}")
for key in data_dict.keys():
    print(f"The shape of {key} is {data_dict[key].shape}")

Shape of values: (500, 180), features: (500, 5).
The data dictionary contains the following: ['c', 'features']
The shape of c is (500, 180)
The shape of features is (500, 5)
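
To give an idea of what such a generator might do, here is a dependency-free sketch in the style of the polynomial generators often used in SPO+ experiments: Gaussian features, a fixed random 0/1 matrix that links features to arcs, a polynomial of the linear signal, and multiplicative uniform noise. The function `gen_costs` and all constants in it are assumptions for illustration; `gen_data_shortestpath` may use a different formula and scaling.

```python
import math
import random


def gen_costs(num_data, num_features, num_arcs, degree, noise_width, seed):
    """Return (features, costs) as nested lists (illustrative sketch only)."""
    rng = random.Random(seed)
    # Fixed random 0/1 matrix: which features influence which arc.
    B = [[1.0 if rng.random() < 0.5 else 0.0 for _ in range(num_features)]
         for _ in range(num_arcs)]
    features, costs = [], []
    for _ in range(num_data):
        x = [rng.gauss(0.0, 1.0) for _ in range(num_features)]
        row = []
        for b in B:
            # Linear signal, normalized by sqrt(num_features).
            linear = sum(bk * xk for bk, xk in zip(b, x)) / math.sqrt(num_features)
            # Polynomial of integer degree makes the mapping nonlinear.
            base = (linear + 3.0) ** degree + 1.0
            # Multiplicative noise drawn from [1 - width, 1 + width].
            noise = rng.uniform(1.0 - noise_width, 1.0 + noise_width)
            row.append(base * noise)
        features.append(x)
        costs.append(row)
    return features, costs


features, costs = gen_costs(num_data=500, num_features=5, num_arcs=180,
                            degree=5, noise_width=0.5, seed=5)
print(len(costs), len(costs[0]), len(features), len(features[0]))
```

A higher `degree` makes the feature-to-cost mapping harder to approximate with a simple predictor, and a larger `noise_width` weakens the correlation between features and costs.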


We use the `Problem` class, which holds the optimization model and the data, and splits the data into a training set, a validation set, and a test set. *The `compute_optimal_decisions` and `compute_optimal_objectives` parameters pre-compute the optimal decisions and objectives. This is convenient for most DFL approaches, but should be switched off when the dataset is so large that the results do not fit in memory.*

In [4]:
problem = Problem(
    data_dict=data_dict,
    opt_model=model,
    train_ratio=0.4,
    val_ratio=0.2,
    compute_optimal_decisions=True,
    compute_optimal_objectives=True,
)
print(f"The size of the training set equals {problem.train_size}, which corresponds to the number of training indices {len(problem.train_indices)}.")

Computing optimal decisions for the entire dataset...
Optimal decisions computed and added to dataset.
Computing optimal objectives for the entire dataset...
Optimal objectives computed and added to dataset.
Shuffling indices before splitting...
Dataset split completed: Train=200, Validation=100, Test=200
The size of the training set equals 200, which corresponds to the number of training indices 200.
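
The split sizes follow directly from the ratios: 40% of 500 samples for training, 20% for validation, and the remainder for testing. A minimal sketch of such a shuffled split is given below; `split_indices` is a hypothetical helper, and the shuffling and rounding inside `Problem` may differ.

```python
import random


def split_indices(num_data, train_ratio, val_ratio, seed=0):
    """Shuffle indices and cut them into train / validation / test parts."""
    indices = list(range(num_data))
    random.Random(seed).shuffle(indices)
    n_train = round(train_ratio * num_data)
    n_val = round(val_ratio * num_data)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # everything that is left over
    return train, val, test


train, val, test = split_indices(500, train_ratio=0.4, val_ratio=0.2)
print(len(train), len(val), len(test))  # 200 100 200
```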


Now we are almost set to run a DFL algorithm. In the general predict-and-optimize setting, our aim is to make the best decision given the feature values. This is why we use a `DecisionMaker` object, which includes a predictive model. Let's first run prediction-focused learning with a mean squared error loss as a baseline. For this we use the `DifferentiableDecisionMaker`.

In [5]:
predictor_kwargs = {
    "size": 256,
    "num_hidden_layers": 2,
    "activation": "leaky_relu",
    "output_activation": "identity",
}
predictor_str = "MLP"
loss_function_str = "mse"
decision_maker = DifferentiableDecisionMaker(
    problem=problem,
    learning_rate=0.005,
    batch_size=32,
    device_str="cpu",
    loss_function_str=loss_function_str,
    predictor_str=predictor_str,
    predictor_kwargs=predictor_kwargs,
)

Problem mode set to: train
Problem mode set to: train


We use the `Runner` class to train and validate the decision maker:

In [6]:
runner = Runner(decision_maker, num_epochs=5, use_wandb=False)
runner.run()

Epoch 0/5: Starting initial validation...
Problem mode set to: validation
Epoch Results:
validation/objective_mean: 10.7414
validation/sym_rel_regret_mean: 0.3425
validation/abs_regret_mean: 4.8216
validation/rel_regret_mean: 1.3711
validation/c_mean: 0.7929
validation/x_mean: 0.1000
Initial best validation metric (abs_regret): 4.821632385253906
Starting training...
Epoch: 1/5
Problem mode set to: train
Epoch Results:
validation/objective_mean: 10.7414
validation/sym_rel_regret_mean: 0.3425
validation/abs_regret_mean: 4.8216
validation/rel_regret_mean: 1.3711
train/solver_calls_mean: 100.0000
validation/c_mean: 0.7929
train/loss_mean: 0.7084
validation/x_mean: 0.1000
train/grad_norm_mean: 0.4628
Problem mode set to: validation
Epoch Results:
validation/objective_mean: 8.9880
validation/sym_rel_regret_mean: 0.2295
validation/abs_regret_mean: 3.0683
validation/rel_regret_mean: 0.8306
train/solver_calls_mean: 100.0000
validation/c_mean: 0.6754
train/loss_mean: 0.7084
validation/x_mean: 0.

0.79881155
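
The regret metrics in the log compare the decision made under predicted costs with the optimal decision in hindsight. Assuming the standard definition, the absolute regret is the true cost of the path chosen under the predicted costs minus the true cost of the optimal path. The sketch below illustrates this on a 2x2 grid. It assumes east/south arcs and uses a hypothetical dynamic-programming helper `solve_grid` rather than the notebook's `ShortestPath` model.

```python
def solve_grid(costs, rows, cols):
    """Return (objective, decision) for the cheapest east/south path.

    `costs` has one entry per arc, in the enumeration order used below.
    `decision` is a 0/1 list marking the arcs on the chosen path.
    """
    arc_index = {}
    for r in range(rows):
        for c in range(cols):
            node = r * cols + c
            if c + 1 < cols:
                arc_index[(node, node + 1)] = len(arc_index)
            if r + 1 < rows:
                arc_index[(node, node + cols)] = len(arc_index)

    # Arcs are stored by increasing tail node, which is a topological
    # order of the grid, so a single forward pass finds shortest paths.
    n = rows * cols
    dist = [float("inf")] * n
    pred = [None] * n
    dist[0] = 0.0
    for (tail, head), k in arc_index.items():
        if dist[tail] + costs[k] < dist[head]:
            dist[head] = dist[tail] + costs[k]
            pred[head] = (tail, k)

    # Walk back from the SE corner to mark the arcs on the path.
    decision = [0] * len(arc_index)
    node = n - 1
    while node != 0:
        node, k = pred[node]
        decision[k] = 1
    return dist[n - 1], decision


def abs_regret(pred_costs, true_costs, rows, cols):
    """True cost of the predicted-cost decision minus the optimal cost."""
    _, x_pred = solve_grid(pred_costs, rows, cols)
    z_opt, _ = solve_grid(true_costs, rows, cols)
    realized = sum(c * x for c, x in zip(true_costs, x_pred))
    return realized - z_opt


true_costs = [1, 4, 1, 1]  # arcs: (0,1), (0,2), (1,3), (2,3)
pred_costs = [3, 1, 3, 1]  # misleads the solver into going south first
print(abs_regret(pred_costs, true_costs, 2, 2))  # 3.0
```

A predictor can have a large mean squared error and still achieve zero regret, as long as the predicted costs rank the paths correctly. This is the motivation for training on a decision-focused loss instead.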

Now let's run SPO+ [1]. Only the loss function changes; the predictor and all other settings stay the same.

In [7]:
loss_function_str = "SPOPlus"
decision_maker = DifferentiableDecisionMaker(
    problem=problem,
    learning_rate=0.005,
    batch_size=32,
    device_str="cpu",
    loss_function_str=loss_function_str,
    predictor_str=predictor_str,
    predictor_kwargs=predictor_kwargs,
)
runner = Runner(decision_maker, num_epochs=5, use_wandb=False)
runner.run()

Problem mode set to: train
Problem mode set to: train
Num of cores: 1
Epoch 0/5: Starting initial validation...
Problem mode set to: validation
Epoch Results:
validation/objective_mean: 10.6073
validation/sym_rel_regret_mean: 0.3438
validation/abs_regret_mean: 4.6875
validation/rel_regret_mean: 1.3681
validation/c_mean: 0.7934
validation/x_mean: 0.1000
Initial best validation metric (abs_regret): 4.687525272369385
Starting training...
Epoch: 1/5
Problem mode set to: train
Epoch Results:
validation/objective_mean: 10.6073
validation/sym_rel_regret_mean: 0.3438
validation/abs_regret_mean: 4.6875
train/rel_regret_mean: 0.4497
validation/rel_regret_mean: 1.3681
train/sym_rel_regret_mean: 0.1490
train/solver_calls_mean: 224.5714
train/objective_mean: 9.5865
train/abs_regret_mean: 2.3322
validation/c_mean: 0.7934
train/loss_mean: 10.8837
validation/x_mean: 0.1000
train/grad_norm_mean: 14.7431
Problem mode set to: validation
Epoch Results:
validation/objective_mean: 8.7504
validation/sym_rel_

0.58558434
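
For a minimization problem with true costs $c$, predicted costs $\hat{c}$, optimal decision $w^*(c)$ and optimal value $z^*(c)$, the SPO+ loss of [1] is $-\min_{w} (2\hat{c} - c)^\top w + 2\hat{c}^\top w^*(c) - z^*(c)$, with subgradient $2\,(w^*(c) - w^*(2\hat{c} - c))$ with respect to $\hat{c}$. The sketch below evaluates both on a 2x2 grid. It assumes east/south arcs and a hypothetical dynamic-programming helper `solve_grid`; in the notebook, PyEPO computes this loss with the actual optimization model.

```python
def solve_grid(costs, rows, cols):
    """Cheapest east/south path: returns (objective, 0/1 arc decision)."""
    arc_index = {}
    for r in range(rows):
        for c in range(cols):
            node = r * cols + c
            if c + 1 < cols:
                arc_index[(node, node + 1)] = len(arc_index)
            if r + 1 < rows:
                arc_index[(node, node + cols)] = len(arc_index)
    n = rows * cols
    dist, pred = [float("inf")] * n, [None] * n
    dist[0] = 0.0
    # Arcs are visited by increasing tail node (a topological order).
    for (tail, head), k in arc_index.items():
        if dist[tail] + costs[k] < dist[head]:
            dist[head], pred[head] = dist[tail] + costs[k], (tail, k)
    decision, node = [0] * len(arc_index), n - 1
    while node != 0:
        node, k = pred[node]
        decision[k] = 1
    return dist[n - 1], decision


def spo_plus(pred_costs, true_costs, rows, cols):
    """Return the SPO+ loss and a subgradient with respect to pred_costs."""
    z_opt, w_opt = solve_grid(true_costs, rows, cols)
    # One extra solver call with the shifted cost vector 2 * c_hat - c.
    shifted = [2 * p - c for p, c in zip(pred_costs, true_costs)]
    z_shift, w_shift = solve_grid(shifted, rows, cols)
    loss = -z_shift + 2 * sum(p * w for p, w in zip(pred_costs, w_opt)) - z_opt
    grad = [2 * (a - b) for a, b in zip(w_opt, w_shift)]
    return loss, grad


true_costs = [1, 4, 1, 1]  # arcs: (0,1), (0,2), (1,3), (2,3)
pred_costs = [3, 1, 3, 1]
loss, grad = spo_plus(pred_costs, true_costs, 2, 2)
print(loss, grad)  # 11.0 [2, -2, 2, -2]
```

In this example the regret of the prediction is 3, while the SPO+ loss is 11: SPO+ is a convex upper bound on the regret, and it vanishes when the predicted costs equal the true costs. A gradient step lowers the predicted costs on the truly optimal path and raises them on the path favoured by the shifted costs.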

A wide variety of DFL algorithms is available for optimization models in which the uncertain parameters enter the objective linearly, since we use PyEPO [3] under the hood for this type of model. The algorithm is selected through the `loss_function_str` argument, and the allowed values are:

In [8]:
decision_maker.allowed_losses

['mse',
 'objective',
 'regret',
 'SPOPlus',
 'perturbedOpt',
 'perturbedFenchelYoung',
 'implicitMLE',
 'blackboxOpt',
 'negativeIdentity',
 'smooth']
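
As one example from this list, `perturbedFenchelYoung` corresponds to PFYL [2]. Its gradient for a minimization problem can be estimated by Monte Carlo sampling: perturb the predicted costs with Gaussian noise, solve each perturbed problem, and subtract the average solution from the true optimal decision. The sketch below shows this estimator. It assumes east/south arcs, a hypothetical helper `solve_grid`, and illustrative values for `sigma` and `n_samples`; PyEPO's implementation may differ in its details.

```python
import random


def solve_grid(costs, rows, cols):
    """Cheapest east/south path: returns (objective, 0/1 arc decision)."""
    arc_index = {}
    for r in range(rows):
        for c in range(cols):
            node = r * cols + c
            if c + 1 < cols:
                arc_index[(node, node + 1)] = len(arc_index)
            if r + 1 < rows:
                arc_index[(node, node + cols)] = len(arc_index)
    n = rows * cols
    dist, pred = [float("inf")] * n, [None] * n
    dist[0] = 0.0
    # Arcs are visited by increasing tail node (a topological order).
    for (tail, head), k in arc_index.items():
        if dist[tail] + costs[k] < dist[head]:
            dist[head], pred[head] = dist[tail] + costs[k], (tail, k)
    decision, node = [0] * len(arc_index), n - 1
    while node != 0:
        node, k = pred[node]
        decision[k] = 1
    return dist[n - 1], decision


def pfy_gradient(pred_costs, true_costs, rows, cols,
                 sigma=1.0, n_samples=20, seed=0):
    """Monte Carlo estimate of the PFYL gradient w.r.t. pred_costs."""
    rng = random.Random(seed)
    _, w_true = solve_grid(true_costs, rows, cols)
    mean_w = [0.0] * len(pred_costs)
    for _ in range(n_samples):
        # Solve the problem under Gaussian-perturbed predicted costs.
        perturbed = [p + sigma * rng.gauss(0.0, 1.0) for p in pred_costs]
        _, w = solve_grid(perturbed, rows, cols)
        mean_w = [m + x / n_samples for m, x in zip(mean_w, w)]
    # True optimal decision minus the expected perturbed decision.
    return [t - m for t, m in zip(w_true, mean_w)]


true_costs = [1, 4, 1, 1]  # arcs: (0,1), (0,2), (1,3), (2,3)
pred_costs = [3, 1, 3, 1]
print(pfy_gradient(pred_costs, true_costs, 2, 2))
```

Unlike SPO+, this estimator needs only the true optimal decisions and not the true costs, at the price of `n_samples` solver calls per training example.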

To run full-fledged experiments more easily, use the `run()` function. It takes a configuration, which can be stored as a YAML file to keep different settings clear and concise (a plain dictionary also works). Here we repeat the same experiment for five seeds.

In [10]:
import yaml

from src.utils.experiments import run

sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(""))))


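yaml_dir = "examples/linear_objective_config.yml"
for seed in range(5):
    # Reload the config for every seed and close the file after reading.
    with open(path_to_project + yaml_dir) as config_file:
        config = yaml.safe_load(config_file)
    config["seed"] = seed
    run(config)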

Generating data using shortestpath_5x5
Shape of values: (250, 40), features: (250, 5).
Features normalized (mean 0, std 1).
Computing optimal decisions for the entire dataset...
Optimal decisions computed and added to dataset.
Computing optimal objectives for the entire dataset...
Optimal objectives computed and added to dataset.
Shuffling indices before splitting...
Dataset split completed: Train=20, Validation=20, Test=210
Problem mode set to: train
Problem mode set to: train
Epoch 0/5: Starting initial validation...
Problem mode set to: validation
Epoch Results:
validation/objective_mean: 5.2171
validation/sym_rel_regret_mean: 0.2091
validation/abs_regret_mean: 1.9413
validation/rel_regret_mean: 0.5920
validation/c_mean: 0.6214
validation/x_mean: 0.2000
Initial best validation metric (rel_regret): 0.5919764041900635
Starting training...
Epoch: 1/5
Problem mode set to: train
Epoch Results:
validation/objective_mean: 5.2171
validation/sym_rel_regret_mean: 0.2091
validation/abs_regret_