Motivation
The catalog already has Steiner-tree models, but it does not have the biology-facing prize-collecting Steiner forest variant used to recover multiple pathway components at once.
This model is useful because it:
- generalizes prize-collecting Steiner tree from one connected tree to a forest,
- keeps the paper's node-prize / edge-cost objective directly in the model,
- separates the biological model from the artificial-root trick, which should live in a reduction rule rather than in the base problem definition.
Associated rule:
Definition
Name: PrizeCollectingSteinerForest
Reference:
- Nurcan Tuncbag et al., "Simultaneous Reconstruction of Multiple Signaling Pathways via the Prize-Collecting Steiner Forest Problem," Journal of Computational Biology 20(2):124-136, 2013. https://doi.org/10.1089/cmb.2012.0092
- Nurcan Tuncbag et al., "Simultaneous Reconstruction of Multiple Signaling Pathways via the Prize-Collecting Steiner Forest Problem," RECOMB 2012, LNBI 7262, pp. 287-301, 2012. https://doi.org/10.1007/978-3-642-29627-7_31
For a network
G = (V, E, c, p)
with nonnegative edge costs c(e) >= 0, nonnegative vertex prizes p(v) >= 0, and parameters beta >= 0, omega >= 0, find a forest subgraph
F = (V_F, E_F)
minimizing
beta * sum_{v notin V_F} p(v) + sum_{e in E_F} c(e) + omega * kappa(F),
where kappa(F) is the number of tree components of F.
Important conventions for this proposal:
- singleton vertices are allowed as one-vertex tree components,
- the empty forest is feasible,
- at the paper level, the input network may be directed or undirected,
- in the directed case, "forest" means acyclic after forgetting edge directions.
So this is a minimization problem.
Variables
Let n = |V| and m = |E|.
- Count:
n + m binary variables
- Per-variable domain:
{0,1}
- Meaning:
x_v = 1 iff vertex v belongs to V_F
y_e = 1 iff edge or arc e belongs to E_F
A configuration is feasible iff the selected edges are incident only to selected vertices and the selected subgraph is a forest. Singleton selected vertices are allowed.
Schema (data type)
Type name: PrizeCollectingSteinerForest
Paper-level variants:
- undirected networks
- directed networks
Recommended first concrete repo variant:
SimpleGraph initially
- weight types
i32 and f64
- directed-network support can be added later as a variant extension
| Field |
Type |
Description |
graph |
G |
The underlying network |
vertex_prizes |
Vec<W> |
Vertex prizes p : V -> R_{>= 0} |
edge_costs |
Vec<W> |
Edge or arc costs c : E -> R_{>= 0} |
beta |
W |
Tradeoff coefficient on omitted prizes |
omega |
W |
Cost per tree component |
Complexity
A conservative exact baseline is brute-force over vertex selections and edge selections, checking whether the chosen subgraph is a forest and evaluating the objective.
This gives a conservative worst-case bound
O(2^(num_vertices + num_edges) * poly(num_vertices + num_edges))
so the registry complexity string should be
2^(num_vertices + num_edges).
I have not verified a sharper exact worst-case bound for this specific paper-style PCSF variant, so the proposal should use the conservative baseline above.
References for the model and its practical solver context:
Extra Remark
This is not the standard demand-pair penalty version of PCSF.
The intended model is the biology-paper variant with:
- node prizes,
- edge costs,
- an explicit component penalty
omega * kappa(F).
The artificial root is not part of the base model. It belongs in the companion reduction rule.
Reduction Rule Crossref
How to solve
Example Instance
Take the undirected path
0 - 1 - 2
with:
- edge costs
- vertex prizes
p(0) = 5
p(1) = 2
p(2) = 5
beta = 1
omega = 2
Expected Outcome
An optimal forest is:
V_F = {0,1,2}
E_F = {(0,1)}
So the forest has two components:
- the tree on
{0,1}
- the singleton tree
{2}
Its objective value is:
- omitted-prize term:
0
- edge-cost term:
1
- component term:
2 * 2 = 4
Total:
1 + 4 = 5
This is better than:
- the full path
{(0,1),(1,2)}, which costs 1 + 6 + 2 = 9
- three singleton vertices, which cost
0 + 0 + 3 * 2 = 6
- the empty forest, which costs
5 + 2 + 5 = 12
So the optimum is 5.
BibTeX
@article{TuncbagEtAl2013PCSF,
author = {Nurcan Tuncbag and Alfredo Braunstein and Andrea Pagnani and Shao-Shan Carol Huang and Jennifer Chayes and Christian Borgs and Riccardo Zecchina and Ernest Fraenkel},
title = {Simultaneous Reconstruction of Multiple Signaling Pathways via the Prize-Collecting Steiner Forest Problem},
journal = {Journal of Computational Biology},
volume = {20},
number = {2},
pages = {124--136},
year = {2013},
doi = {10.1089/cmb.2012.0092},
url = {https://doi.org/10.1089/cmb.2012.0092}
}
@inproceedings{TuncbagEtAl2012RECOMB,
author = {Nurcan Tuncbag and Alfredo Braunstein and Andrea Pagnani and Shao-Shan Carol Huang and Jennifer Chayes and Christian Borgs and Riccardo Zecchina and Ernest Fraenkel},
title = {Simultaneous Reconstruction of Multiple Signaling Pathways via the Prize-Collecting Steiner Forest Problem},
booktitle = {Research in Computational Molecular Biology (RECOMB 2012)},
series = {Lecture Notes in Bioinformatics},
volume = {7262},
pages = {287--301},
year = {2012},
publisher = {Springer},
doi = {10.1007/978-3-642-29627-7_31},
url = {https://doi.org/10.1007/978-3-642-29627-7_31}
}
Motivation
The catalog already has Steiner-tree models, but it does not have the biology-facing prize-collecting Steiner forest variant used to recover multiple pathway components at once.
This model is useful because it:
Associated rule:
PrizeCollectingSteinerForest -> SteinerTreeDefinition
Name:
PrizeCollectingSteinerForestReference:
For a network
G = (V, E, c, p)with nonnegative edge costs
c(e) >= 0, nonnegative vertex prizesp(v) >= 0, and parametersbeta >= 0,omega >= 0, find a forest subgraphF = (V_F, E_F)minimizing
beta * sum_{v notin V_F} p(v) + sum_{e in E_F} c(e) + omega * kappa(F),where
kappa(F)is the number of tree components ofF.Important conventions for this proposal:
So this is a minimization problem.
Variables
Let
n = |V|andm = |E|.n + mbinary variables{0,1}x_v = 1iff vertexvbelongs toV_Fy_e = 1iff edge or arcebelongs toE_FA configuration is feasible iff the selected edges are incident only to selected vertices and the selected subgraph is a forest. Singleton selected vertices are allowed.
Schema (data type)
Type name:
PrizeCollectingSteinerForestPaper-level variants:
Recommended first concrete repo variant:
SimpleGraphinitiallyi32andf64graphGvertex_prizesVec<W>p : V -> R_{>= 0}edge_costsVec<W>c : E -> R_{>= 0}betaWomegaWComplexity
A conservative exact baseline is brute-force over vertex selections and edge selections, checking whether the chosen subgraph is a forest and evaluating the objective.
This gives a conservative worst-case bound
O(2^(num_vertices + num_edges) * poly(num_vertices + num_edges))so the registry complexity string should be
2^(num_vertices + num_edges).I have not verified a sharper exact worst-case bound for this specific paper-style PCSF variant, so the proposal should use the conservative baseline above.
References for the model and its practical solver context:
Extra Remark
This is not the standard demand-pair penalty version of PCSF.
The intended model is the biology-paper variant with:
omega * kappa(F).The artificial root is not part of the base model. It belongs in the companion reduction rule.
Reduction Rule Crossref
How to solve
[Rule] PrizeCollectingSteinerForest to SteinerTree).Example Instance
Take the undirected path
0 - 1 - 2with:
c(0,1) = 1c(1,2) = 6p(0) = 5p(1) = 2p(2) = 5beta = 1omega = 2Expected Outcome
An optimal forest is:
V_F = {0,1,2}E_F = {(0,1)}So the forest has two components:
{0,1}{2}Its objective value is:
012 * 2 = 4Total:
1 + 4 = 5This is better than:
{(0,1),(1,2)}, which costs1 + 6 + 2 = 90 + 0 + 3 * 2 = 65 + 2 + 5 = 12So the optimum is
5.BibTeX