
Add datacenter_compute_allocation template (multi-reasoner) #68

Merged
cafzal merged 20 commits into main from worktree-agent-a2bcce56b6427b60d
May 18, 2026


@cafzal cafzal commented May 13, 2026

What this template does

Inside-the-fence GPU allocation across the 5 hyperscaler campuses that an upstream interconnection-planning step (see energy_grid_planning) has approved and energized. A datacenter operator decides which workloads from 6 AI labs (frontier-anchor, applied, research) get which GPUs in which pool this period, and is accountable on three publicly discussed dimensions: substation power envelope, gross margin after energy + depreciation, and anchor-concentration risk. The compounding wrinkle: a GPU-hour reserved by a workload that stalls is depreciation accruing without offsetting revenue, so stranded capacity is the operator's biggest economic exposure.

Reasoner chain (4 stages on a shared ontology)

  • Stage 1 — Predictive (GNN) — Heterogeneous temporal binary-classification GNN over LabMetric→Workload, WorkloadDependency.blocks, and cross-lab co_dated edges. Writes Workload.utilization_probability (110 rows). Frontier pretrain shards top the distribution (~0.78), Stability evals bottom (~0.38).
  • Stage 2 — Rules — Hardware eligibility + priority-tier classification. Writes Compatibility(workload, gpu_pool) (1,918 pairs), Workload.priority_tier (P0=15 / P1=80 / P2=15), .priority_weight (100/10/1), .under_provisioning_penalty (1.0/0.3/0.0).
  • Stage 3 — Graph — Reverse-PageRank on the workload-dependency DAG. Writes Workload.gating_score; GPT-Next pretrain shard 02 tops at 0.0310.
  • Stage 4 — Prescriptive (MIP) — Assignment MIP under a 24-cell scenario sweep (2 envelopes × 3 margin floors × 4 diversity caps). Objective amplifies the four upstream signals: priority × gating × utilization × strategic_value × (1 + penalty). Persists AllocationPlan singleton, Assignment.is_chosen (110 rows), DemandScenario + DemandScenarioOutlook (4 risk scenarios) — all queryable as ontology.
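As a rough illustration of how the Stage 4 objective combines the upstream signals, the sketch below computes a per-workload reward coefficient in plain Python. The tier weights and penalties are the values listed for Stage 2; the `WorkloadSignals` class, the `assignment_reward` helper, the `strategic_value` units, and the P2 sample values are assumptions made for the example, not the template's actual code.

```python
from dataclasses import dataclass

# Tier -> (priority_weight, under_provisioning_penalty), as listed for Stage 2.
TIER_PARAMS = {"P0": (100, 1.0), "P1": (10, 0.3), "P2": (1, 0.0)}


@dataclass
class WorkloadSignals:
    """Hypothetical container for the derived properties Stage 4 reads."""
    priority_tier: str
    gating_score: float             # Stage 3 output
    utilization_probability: float  # Stage 1 output
    strategic_value: float          # illustrative units


def assignment_reward(w: WorkloadSignals) -> float:
    """priority x gating x utilization x strategic_value x (1 + penalty)."""
    weight, penalty = TIER_PARAMS[w.priority_tier]
    return (
        weight
        * w.gating_score
        * w.utilization_probability
        * w.strategic_value
        * (1 + penalty)
    )


# A P0 pretrain shard using the reported top gating score and utilization.
pretrain = WorkloadSignals("P0", 0.0310, 0.78, 1.0)
# A P2 eval; the gating score here is an invented sample value.
evaluation = WorkloadSignals("P2", 0.0050, 0.38, 1.0)

print(assignment_reward(pretrain), assignment_reward(evaluation))
```

Because the factors multiply, a P0 workload with a high gating score outweighs a P2 eval by several orders of magnitude, which is why the tighter cells shed P2 work first.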

Headline output (live Gurobi run, 4 min 21 s end-to-end)

  • OPTIMAL in 6.3 s on the prescriptive engine. 24 cells: 16 OPTIMAL + 8 INFEASIBLE (designed-INFEASIBLE diagnostic at strictest anchor-cap + type-floor combinations).
  • Baseline (100pct / unconstrained / none): 110 workloads assigned · $25.28M revenue · $4.18M cost · 83% margin · 95% anchor · binding axis = power envelope.
  • Pareto frontier 1 (margin × revenue): cliff at 85% (-13% revenue, 90 workloads dropped, 100% anchor).
  • Pareto frontier 2 (diversity × revenue): cliff at any anchor cap (-82% at 70%, -90% at 50%, INFEASIBLE at 40% + type floor).
  • DemandScenario overlay: stranded $200K / $400K / $667K under diffusion_slowdown / scaling_break / frontier_loss.
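The 24-cell sweep behind these numbers can be pictured as a Cartesian product of the three scenario axes. The sketch below is an assumption-laden illustration: the `85pct` envelope label, the `floor_a` placeholder for the third margin floor, and the 70% / 50% cap names are guesses or placeholders, since the description only names the baseline labels and `anchor_max_40pct_with_type_floor`.

```python
from itertools import product

# Axis labels: only "100pct", "unconstrained", "none" and the 40% cap name
# appear in the PR text; the rest are placeholders for illustration.
POWER_ENVELOPES = ["85pct", "100pct"]
MARGIN_FLOORS = ["unconstrained", "floor_a", "85pct"]
DIVERSITY_CAPS = [
    "none",
    "anchor_max_70pct",
    "anchor_max_50pct",
    "anchor_max_40pct_with_type_floor",
]

# One cell per (envelope, margin floor, diversity cap) combination.
cells = list(product(POWER_ENVELOPES, MARGIN_FLOORS, DIVERSITY_CAPS))
BASELINE = ("100pct", "unconstrained", "none")

print(len(cells))         # 24
print(BASELINE in cells)  # True
```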

Verified end-to-end

  • Live GNN script run + live GNN paste-test of all 10 runbook prompts reproduce the headline numbers to the cent (revenue, anchor share, demand-overlay stranded $). Per-(workload, pool) cost varies within multi-optimal MIP tolerance; GNN run-to-run probabilities vary within seed/CUDA non-determinism.
  • Script flattened to canonical v1 shape (module scope under # ---- Stage N ---- banners; no def main, no _Ctx).
  • Runbook Response density tightened to canonical conciseness (mean 44 words, in range with energy_grid_planning and telco_network_recovery).
  • dev-templates-review skill updated with a new bullet covering multi-reasoner script structure and one covering Runbook Response density.

Test plan

  • Template script runs end-to-end (live GNN on GPU_NV_S + Gurobi prescriptive engine, 4 min 21 s wall)
  • Runbook prompts paste-tested live against the chain (10 / 10 verified)
  • README expected output matches captured stdout
  • Runbook chain ASCII and Response blocks match captured stdout
  • --no-gnn fallback CSV refreshed from a real GNN run (no more GNN-vs-fallback drift)

…low-up to energy_grid_planning)

Inside-the-fence GPU allocation across the 5 hyperscaler campuses the
upstream energy_grid_planning $300M solve approves. Four reasoner stages
on a shared ontology:

- Stage 1 Predictive: heterogeneous-graph GNN forecasts per-lab training
  intensity (cross-lab co_dated edges carry industry co-movement;
  --no-gnn falls back to a precomputed CSV)
- Stage 2 Rules: hardware compatibility (memory + GPU-type allowlist)
  + priority-tier classification (P0/P1/P2 from contract_tier)
- Stage 3 Graph: reverse-PageRank on the WorkloadDependency.blocks DAG
- Stage 4 Prescriptive: assignment MIP indexed by a 3D Scenario sweep
  (PowerEnvelopeLevel x MarginFloor x DiversityCap = 48 cells)
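For readers unfamiliar with the Stage 3 technique, here is a minimal pure-Python sketch of reverse PageRank: ordinary PageRank run on the dependency graph with every `blocks` edge reversed, so that rank flows from blocked workloads back to the workloads that gate them. The function name, damping factor, iteration count, and toy DAG are assumptions; the template's own graph reasoner call is not shown in this PR.

```python
def reverse_pagerank(blocks, damping=0.85, iterations=100):
    """Power-iteration PageRank over reversed (blocker, blocked) edges."""
    nodes = sorted({n for edge in blocks for n in edge})
    # Reverse each blocker -> blocked edge so rank flows upstream.
    out_links = {n: [] for n in nodes}
    for blocker, blocked in blocks:
        out_links[blocked].append(blocker)

    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Nodes with no reversed out-links spread their rank uniformly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1 - damping) / len(nodes) + damping * dangling / len(nodes)
        new_rank = {n: base for n in nodes}
        for n in nodes:
            for target in out_links[n]:
                new_rank[target] += damping * rank[n] / len(out_links[n])
        rank = new_rank
    return rank


# Toy dependency DAG: the pretrain blocks two finetunes, one of which blocks an eval.
toy_blocks = [
    ("pretrain", "finetune_a"),
    ("pretrain", "finetune_b"),
    ("finetune_a", "eval_a"),
]
scores = reverse_pagerank(toy_blocks)
print(max(scores, key=scores.get))  # pretrain
```

The workload that gates the most downstream work, directly or transitively, ends up with the highest score, which matches the "if they slip, many later workloads slip too" reading of `gating_score`.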

Each stage writes derived properties back to the ontology that the
next stage reads. After the solve, the chosen baseline cell
(100pct / unconstrained / none) is persisted as a singleton
AllocationPlan Concept plus an Assignment.is_chosen unary Relationship,
mirroring telco_network_recovery's RestorePlan / is_selected_upgrade
pattern.

Runs in both chain mode (binds to upstream Model("Energy Grid
Infrastructure") and reads x_approve outcomes) and standalone mode
(loads the bundled DC snapshot).

References:
- runbook.md: operational runbook (prereqs, two-mode run, GNN fallback,
  troubleshooting)
- references/runbook.md: analyst paste-test runbook with per-stage
  skill callouts (rai-build-starter-ontology, rai-querying,
  rai-discovery, rai-predictive-modeling/-training, rai-rules-authoring,
  rai-graph-analysis, rai-prescriptive-problem-formulation,
  rai-prescriptive-results-interpretation, rai-ontology-design)

github-actions Bot commented May 13, 2026

The docs preview for this pull request has been deployed to Vercel!

✅ Preview: https://relationalai-docs-88ek3szky-relationalai.vercel.app/build/templates
🔍 Inspect: https://vercel.com/relationalai/relationalai-docs/7q6E2Rb5CePuZx5UQYM5rw4Z6TnY

Match the structural conventions of telco_network_recovery and
energy_grid_planning:

- runbook.md is now the analyst paste-test walkthrough (chain ASCII +
  10 Workflow prompts with per-stage rai-* skill callouts + Data),
  not an operational doc. Same heading shape and Prompt/Response
  fenced-block style as the peer multi-reasoner runbooks.
- references/ directory removed (no peer template has it).
- README.md gains the peer sections: Prerequisites (with Access / Tools /
  Snowflake setup), Quickstart (numbered download/venv/install/rai-init/
  run with expected output and runtime), Template structure (file tree),
  How it works (code-snippet walkthrough per stage), Troubleshooting
  (collapsible <details> blocks), Learn more, Support.
- Operational content (prereqs, run modes, GNN-vs-baseline lift hint,
  troubleshooting table) folded from the old runbook.md into the
  appropriate README sections.
- Module docstring gains the Output: block alongside Run: (peer
  convention from telco_network_recovery and energy_grid_planning).

No behavior change in the script.
…y_grid_planning style

Apply the dev-templates-review checklist against the upstream sibling
(energy_grid_planning) and align:

Runbook
- Shorten prompts from 100-250 words to ~30-70 words each, matching
  energy's brevity.
- Convert imperative prompts to question form (Steps 5, 7, 8, 9).
- Describe the structural test in Step 7 ("workloads where, if they
  slip, many later workloads slip too, and those gate even more")
  instead of naming reverse-PageRank.
- Tighten Response paragraphs to 1-2 sentences with the headline numbers,
  matching energy's punchier shape.
- Fold the persist step into Stage 4 in the chain ASCII so the headline
  plan + persisted-ontology bullets appear together, matching energy's
  pattern.
- Add a sequential-cascade preface: "Prompts below are designed to run
  in order in a single session so each step inherits the ontology
  state from the previous step."

README
- "What this template is for" now names reasoning types in **bold**
  ("predictive ... rules ... graph ... prescriptive") per the
  dev-templates-review checklist + energy convention.
- Replace the apologetic "Rules alone classify... Graph alone ranks..."
  enumeration with the peer-style bolded reasoning-type paragraph.
- Drop the "GNN lift over a per-lab tabular baseline" framing in
  How it works Stage 1; describe the cross-concept signal a
  heterogeneous GNN propagates instead. ("Why X reasoner over Y" prose
  is inside-baseball and belongs in evals, not READMEs.)

No script behavior change.
…ual lineage

The literal cross-process chain (binding to upstream
Model('Energy Grid Infrastructure') populated by energy_grid_planning)
hits a cross-process model-state visibility issue in PyRel 1.x and isn't
the demo we want anyway. Reframe energy_grid_planning as conceptual
lineage instead: the bundled data_centers.csv is a snapshot of that
upstream $300M-approved campus set, and this template demonstrates the
operator-side allocation decision that picks up after.

Script
- Drop --standalone and --investment-level CLI flags. Only --no-gnn
  and --gnn-strict remain.
- Drop the chain-bind branch in main() (upstream Model() rebind,
  InvestmentLevel filter, x_approve lookup, side-table dollars_per_mwh
  attach, RuntimeError pre-flight).
- Drop the data_center_attrs.csv side table (dollars_per_mwh now lives
  directly on data_centers.csv).
- main() now just loads the snapshot CSV into a fresh
  Model('Datacenter Compute Allocation'). ~110 LOC simpler.

README
- Reframe lede as conceptual sequence: "This template demonstrates the
  operator-side decision that picks up where energy_grid_planning
  leaves off. ... The two templates form a conceptual sequence over the
  same domain, not a literal engine-level chain."
- Drop the "0. Chain bind" row from Reasoner overview (4 stages now,
  not 5).
- Quickstart now has 7 steps (was 9); single run command + --no-gnn.
- Drop "Cross-template ontology extension" key design pattern bullet
  (no longer applicable).
- Drop data_center_attrs.csv from What's included + Template structure.
- Drop chain-mode troubleshooting entry.
- Customize section: rephrase DC editing in terms of the snapshot CSV
  rather than chain-mode investment-level switching.

Runbook
- Lead-in: "campuses ... a snapshot of the campus set the upstream
  energy_grid_planning $300M solve approved" rather than "the 5
  hyperscaler campuses the upstream solve approved".
Distill the AI-compute capital-allocation framework into the operator
template's narrative + structural additions. Operator POV preserved
throughout; no name-checks of any specific source.

Narrative (README — items 1-5):
- Power envelope cells now framed explicitly as the cone of uncertainty:
  85% lower-cone curtailment / 100% expected / 110% upper-cone planning
  point. Planning to the midpoint is how operators end up under-
  provisioned to anchors when frontier demand outruns the forecast.
- Gross-margin discipline reframed at the envelope level, not per-token.
- Anchor concentration reframed as a strategic floor: must be served
  wherever feasible because penalty is contractual + reputational, not
  just foregone revenue.
- New bullet on the generational layer cake (H100/H200/GB200 sit at
  different points on the price-per-effective-GPU-hour curve;
  effective capacity != nameplate).
- Four-factor objective reframed as envelope-level ROI across anchor /
  opportunistic / research uses, single capital-allocation lens.

Adaptation opportunities (README — items 9-10):
- Pool fungibility added to "What this template abstracts" as the
  natural extension capturing the dedicated / swing / scratch
  distinction (fungible pools are strategically more valuable than
  single-purpose pools of equal nameplate).
- Per-lab range forecast (p10/p50/p90) added as the other natural
  extension capturing lab-side uncertainty at the same fidelity as
  the envelope axis.

Structural — item 7 (under-provisioning penalty):
- Stage 2 (rules) derives Workload.under_provisioning_penalty from
  priority_tier: P0 = 1.0, P1 = 0.3, P2 = 0.0.
- Stage 4 (prescriptive) objective amplifies the assignment reward
  by (1 + under_provisioning_penalty), so the solver treats anchor
  under-provisioning as a 2x foregone-revenue loss instead of 1x.
- README Reasoner overview + How it works updated.
- Runbook Step 6 (rules) prompt + response updated.

Structural — item 8 (DemandScenario overlay):
- After main solve + AllocationPlan persist, replay the chosen plan
  under four risk scenarios: expected (factor 1.0), diffusion_slowdown
  (0.85), scaling_break (0.70), frontier_loss (0.50). P0 anchor
  revenue is contractual; P1/P2 opportunistic seats realize only the
  scenario factor.
- DemandScenario Concept (4 rows) + DemandScenarioOutlook(scenario)
  Concept persisted as ontology so the stranded-capacity exposure
  survives the chain run.
- README Reasoner overview + How it works + expected output sample
  updated.
- Runbook Step 9 (results interpretation) combined with the overlay;
  Step 10 (persist) extended to mention the new concepts.
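
The overlay described above is a post-solve replay, which can be sketched as follows. This is a minimal illustration assuming "stranded" means the share of non-P0 revenue that goes unrealized under a scenario factor; the `stranded_revenue` helper and the plan's dollar figures are hypothetical, and only the four scenario names and factors come from this commit.

```python
# Scenario factors as listed in the commit message.
SCENARIO_FACTORS = {
    "expected": 1.0,
    "diffusion_slowdown": 0.85,
    "scaling_break": 0.70,
    "frontier_loss": 0.50,
}


def stranded_revenue(assignments, factor):
    """Unrealized revenue: P0 is contractual, P1/P2 realize only `factor`."""
    return sum(
        revenue * (1 - factor)
        for tier, revenue in assignments
        if tier != "P0"
    )


# Hypothetical chosen plan as (priority_tier, revenue_usd) pairs.
plan = [("P0", 5_000_000.0), ("P1", 800_000.0), ("P2", 200_000.0)]

outlook = {
    name: stranded_revenue(plan, factor)
    for name, factor in SCENARIO_FACTORS.items()
}
for name, stranded in outlook.items():
    print(f"{name}: ${stranded:,.0f} stranded")
```

Since the replay never re-solves the MIP, the overlay figures depend only on the P0 versus non-P0 revenue split of the chosen plan.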

Module docstring Output: block extended to cover the new outputs.

No behavior change on the baseline cell (100pct / unconstrained /
none) -- all 110 workloads still fit, so the under-provisioning
amplifier doesn't change which workloads are selected when all fit.
The amplifier matters at tighter cells where P0 vs P1/P2 trade-offs
become active.
After running datacenter_compute_allocation.py end-to-end with the new
objective (under-provisioning amplification + DemandScenario overlay),
update README + runbook where actual outputs drift from prior text.

- 85pct margin / none diversity cell now fits 20 workloads (not 18):
  the (1 + under_provisioning_penalty) amplifier on P0 nudges the
  solver into selecting 2 additional P2 evals that fit under the 85%
  floor without violating it. Update both:
  - README Quickstart expected-output table: 85pct row 18 -> 20,
    revenue 22,032,899.59 -> 22,032,951.05, cost 3,304,895.87 ->
    3,304,939.84
  - README "Expected per-cell behavior" bullet: drops 89 -> 90
    workloads, mention the retained P1 finetunes + P2 evals
  - Runbook Step 9 Response: 89 workloads dropped -> 90, add the
    14 P0 + 4 P1 + 2 P2 breakdown
- Baseline cell total_cost shifts $4,190,130.34 -> $4,187,977.91
  (~$2K solver-noise within TIME_LIMIT). Update README expected output
  to match the actual run; "$4.19M" rounding holds elsewhere.
- DemandScenario overlay numbers (~$200K / $400K / $667K stranded)
  match real run to within cents.

No script change; this is doc-to-real reconciliation only.
Real paste-test of the runbook Step 2 prompt against the live ontology
shows 11 concepts (not 12 as previously stated), with WorkloadGpuCompat
missing from the prior enumeration:

  5 DataCenterRequest, 28 GpuPool, 6 AILab, 110 Workload,
  181 WorkloadGpuCompat, 138 WorkloadDependency, 2,190 LabMetric
  (date range 2025-05-11 .. 2026-05-10), 6 LabGrowth, plus 3/4/4
  Scenario Concepts.

Also add the explicit date range to the Response since the Step 2
prompt asks for it.
GNN's strongest task types are node classification and link prediction,
not time-series forecasting. The prior per-LabMetric regression on
training_intensity_growth_rate was effectively a tabular forecast
dressed up with cross-concept edges -- the cross-lab co_dated edges
gave it lift, but the *task* was the wrong fit.

Reframe Stage 1 as the operator's truly load-bearing forward-looking
signal: per-workload binary classification of utilization probability
(will this workload actually use its allocated capacity at high duty
cycle, or stall / be repaced?). Stranded capacity -- depreciation
accruing without offsetting revenue -- is the operator's biggest
economic exposure, and a per-workload signal is sharper than a per-lab
demand multiplier the contracts already lock in.

The heterogeneous message-passing is genuinely load-bearing now:
- LabMetric -> Workload: a workload owned by a fast-ramping lab
  inherits signal from lab-side recent activity
- WorkloadDependency.blocks: a workload downstream of a high-utilization
  gating pretrain inherits signal through the dep chain
- cross-lab co_dated (LabMetric <-> LabMetric): industry-wide
  co-movement signal a per-workload tabular model can't see

Data changes:
- DROP data/train_metrics.csv, val_metrics.csv, test_metrics.csv,
  lab_growth_forecasts.csv (per-lab forecast paradigm).
- ADD data/workload_utilization_train.csv (80 workloads + label),
  _val.csv (15 workloads + label), _test.csv (110 workloads, no label
  -- ALL workloads get a prediction).
- ADD data/workload_utilization_fallback.csv -- the deterministic
  latent probability used to generate the synthetic labels; this is
  what --no-gnn loads.

Labels generated by a deterministic synthetic process (seed=42) from:
  - lab recent training-intensity (lab_growth_forecasts seed values)
  - workload type (pretrain bonus, eval penalty)
  - dep-DAG gating position (workloads gating many downstream get bonus)
  - small gaussian noise
Then thresholded to binary at 0.5.
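
A minimal sketch of such a generator is shown below, assuming a simple additive latent score. Every coefficient, the `TYPE_ADJUSTMENT` table, and the `synth_label` helper are invented for illustration; only the ingredients (lab intensity, type bonus or penalty, gating position, gaussian noise), the seed of 42, and the 0.5 threshold come from this commit.

```python
import random

# Illustrative per-type adjustments: pretrain bonus, eval penalty.
TYPE_ADJUSTMENT = {"pretrain": 0.15, "finetune": 0.0, "eval": -0.15}


def synth_label(rng, lab_intensity, workload_type, n_downstream, noise_sd=0.05):
    """Return (latent probability, binary label thresholded at 0.5)."""
    latent = (
        0.4                                # assumed base rate
        + 0.3 * lab_intensity              # lab recent training intensity
        + TYPE_ADJUSTMENT[workload_type]   # workload-type bonus or penalty
        + 0.02 * n_downstream              # dep-DAG gating position
        + rng.gauss(0.0, noise_sd)         # small gaussian noise
    )
    latent = min(1.0, max(0.0, latent))    # clamp to a valid probability
    return latent, int(latent >= 0.5)


rng = random.Random(42)  # fixed seed keeps the labels reproducible
latent, label = synth_label(
    rng, lab_intensity=0.8, workload_type="pretrain", n_downstream=4
)
print(round(latent, 3), label)
```

Keeping the latent probability alongside the label is what lets a fallback file carry the pre-threshold value.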

Script changes:
- stage1_predictive() rewritten as per-Workload binary_classification:
  - Drop LabGrowth concept entirely. Workload.utilization_probability
    is the direct output, bound from gnn.predictions().probs[1].
  - Task tables join Workload to TrainTable/ValTable/TestTable by
    workload_id (single-PK join, GNN-friendly).
  - GNN(task_type="binary_classification", eval_metric="roc_auc").
- stage4_prescriptive() objective replaces projected_demand_growth
  with utilization_probability. Same structure; same four-factor
  multiplied by (1 + under_provisioning_penalty).
- Module docstring updated.

Doc changes:
- README front matter description, "What this template is for"
  reasoning-types paragraph, "What you'll build", "What's included",
  Reasoner overview Stage 1 row, "How it works" Stage 1 code snippet
  + narrative, Customize bullets (drop tabular-baseline RMSE
  comparison, replace with ROC-AUC comparison), Troubleshooting
  collapsible, "What this template abstracts" point-estimate item
  reframed for per-workload range forecast.
- Runbook chain ASCII Stage 1, Step 2 Response (drop LabGrowth from
  the 10-concept enumeration), Step 3 Discovery routing description,
  Step 5 prompt + Response entirely rewritten.
- Quickstart Expected output Stage 1 sample updated to show
  utilization-probability top/bottom 5 instead of per-lab multipliers.
…ions

The prior per-workload single-shot labels (80 training examples) gave
the GNN too little signal to discriminate -- with 68/42 class imbalance
and only 80 examples, the model converged to the positive-class prior
and compressed all predictions to 0.78-0.91.

Reframe as the realistic operator data shape: each workload is observed
monthly. The historical labels CSV now carries 9 months × 110 workloads
of (workload, observation_date, is_high_utilization) tuples, generated
deterministically (seed=42) from:
  - structural propensity per workload (lab growth, type bonus, dep
    gating position)
  - per-month lab activity perturbation (mean training_intensity_growth_rate
    from LabMetric for that month)
  - cross-lab macro shock per month (cross-lab mean)
  - period-specific gaussian noise

Splits:
  Train: 7 historical months × 110 workloads = 770 observations
  Val:   1 month × 110 = 110
  Test:  current month × 110 = 110 (no labels; predict for all workloads)
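
The split arithmetic can be checked with a small sketch. The calendar months, the dummy labels, and the `temporal_split` helper are assumptions for illustration; only the 7 / 1 / 1 month layout and the 110-workload count come from the text above.

```python
from datetime import date

N_WORKLOADS = 110
# Nine monthly observation dates; the specific months are placeholders.
MONTHS = [date(2025, m, 1) for m in range(4, 13)]

# (workload_id, observation_date, is_high_utilization) with dummy labels.
observations = [
    (w, d, (w + d.month) % 2) for d in MONTHS for w in range(N_WORKLOADS)
]


def temporal_split(rows, months):
    """Oldest months train, the next-to-last validates, the latest is test."""
    ordered = sorted(months)
    train_months = set(ordered[:-2])
    val_month, test_month = ordered[-2], ordered[-1]
    train = [r for r in rows if r[1] in train_months]
    val = [r for r in rows if r[1] == val_month]
    # The test split is 2-arity: labels are withheld for the current month.
    test = [(w, d) for w, d, _ in rows if d == test_month]
    return train, val, test


train, val, test = temporal_split(observations, MONTHS)
print(len(train), len(val), len(test))  # 770 110 110
```

Splitting by date rather than at random keeps every validation and test observation strictly later than the training data.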

Script changes:
- _train_gnn_and_predict: task relationships now use `at {Date:obs_date}`
  time slots. Train + Val are 3-arity (workload, obs_date, label);
  Test is 2-arity (workload, obs_date).
- LabMetric.metric_date typed as Date (was String). PropertyTransformer
  has datetime=[LabMetric.metric_date], time_col=[LabMetric.metric_date]
  per the rai-predictive-training triple-coupling rule for
  has_time_column=True.
- GNN(has_time_column=True, ...). Same edges, same task type.
- Import Date from relationalai.semantics.

Doc changes:
- README "What's included" + "Template structure" updated for the new
  data shape (770/110/110 instead of 80/15/110).
- README Reasoner overview Stage 1 reframed: "Heterogeneous-graph
  temporal GNN" with explicit mention of has_time_column=True and the
  per-month observation count.
- README "How it works" Stage 1 code snippet rewritten to show the
  temporal task relationships with at clause + the GNN constructor
  with has_time_column=True.
- Runbook chain ASCII Stage 1 updated to mention the 770 historical
  obs + 110 val structure.
- Runbook Step 5 prompt + Response reframed for temporal training data.

Same chain shape downstream; same Stage 4 objective. The expected
output of Stage 1 should now show better-discriminated per-workload
probabilities (frontier 0.85+, Stability 0.20-0.35) once the GNN has
real training variety to learn from.
Real chain run with temporal GNN + Date-typed metric_date / observation_date:

Stage 1 GNN now genuinely discriminates per-workload:
  n_total=110, n>=0.5: 99, n<0.5: 11
  Top 5 (frontier pretrains): p ≈ 0.880-0.881 (Claude/Grok/GPT-Next shards)
  Bottom 5 (Stability evals): p ≈ 0.338-0.342
  (was compressed at 0.78-0.91 in v3 with 80-example single-shot labels;
   the 770 historical (workload, month) observations + temporal alignment
   give the GNN real signal to learn from.)

Stage 4 baseline cell (100pct / unconstrained / none) shifts slightly:
  total_cost_usd: $4,187,977.91 -> $4,204,035.68
  realized_margin: 0.834322 -> 0.833687
  (revenue, n_assigned, anchor_share, binding_axis all unchanged.)

Stage 4 85% margin / none cell:
  n_assigned: 20 -> 18
  revenue: $22,032,951.05 -> $22,032,899.59
  cost: $3,304,939.84 -> $3,304,895.87
  breakdown: 14 P0 + 4 P1 + 2 P2  ->  14 P0 + 4 P1 + 0 P2

The 85%-floor cell change has a meaningful narrative reason: P2 evals
now have utilization_probability ~0.34 (vs P0 pretrains at ~0.88), so
the four-factor objective devalues them ~2.6x relative to P0. They no
longer "sneak in" to fill the 85%-floor cell -- the cell goes back to
the 18-workload anchor-only shape.

DemandScenario overlay numbers unchanged to within cents (P0 vs non-P0
strategic-value split is the same; the overlay is purely a post-solve
multiplication).

Doc changes:
- README Quickstart Expected-output table updated: Stage 1 distribution
  (n>=0.5: 99, n<0.5: 11) + top-5 / bottom-5 lists; baseline cell row
  (cost 4,204,035.68); 85pct cell row (n=18, revenue 22,032,899.59,
  cost 3,304,895.87); AllocationPlan singleton row.
- README "Expected per-cell behavior" 85% bullet: "drops 90 -> drops 92",
  remove "+ 2 P2 evals" mention, add the under-provisioning-amplifier +
  GNN-utilization rationale.
- Runbook Step 9 Response: baseline cost $4.19M -> $4.20M; 85% margin
  cliff "90 workloads dropped -> 92", same breakdown rationale.
- Runbook Step 5 Response + chain-ASCII Stage 1 annotation: Stability
  evals "0.20-0.35" -> "~0.34" (matches actual GNN output range).
…PTIMAL)

HiGHS at 900s consistently hits TIME_LIMIT on this 48-cell sweep --
acceptable for a tutorial but a poor demo experience (15 min wall
time before the script prints Stage 4 results).

Switch to Gurobi as the recommended solver; it typically converges
to OPTIMAL across all feasible cells well under 60s. HiGHS remains
documented as a fallback for customers without a Gurobi-licensed
prescriptive engine (raise time_limit_sec to ~900 and accept the
feasible-but-not-proven-optimal solution).

Changes:
- problem.solve("gurobi", time_limit_sec=60)
- Stage 4 print "Solving 48-cell scenario sweep with Gurobi..."
- README Reasoner overview row: MIP (Gurobi)
- README Prerequisites: Gurobi-enabled prescriptive engine
  (recommended); HiGHS works as a fallback
- README Quickstart Expected output: Termination OPTIMAL (was TIME_LIMIT)
- README Customize bullet: swap to "highs" for non-Gurobi prescriptive
  engines, raise time_limit_sec to ~900
- Runbook Step 8 Response: Gurobi typically OPTIMAL under 60s
- Troubleshooting `TIME_LIMIT` collapsible reframed: only happens if
  Gurobi can't reach OPTIMAL on a cell in 60s, or if you've swapped
  to HiGHS
Real run with Gurobi:
- Termination: OPTIMAL (was TIME_LIMIT)
- Solve time: 9.4s (was 900s)
- Objective: $1,301,756,328.76 (was $1,255,809,068.45 -- better solution
  because Gurobi reached OPTIMAL where HiGHS hit the wall)

Per-cell shifts driven by the better solution:
- Baseline total_cost: $4,204,035.68 -> $4,197,891.06 (~$4.20M, still)
- realized_margin: 0.833687 -> 0.833930
- 85pct/none cell: n_assigned 18 -> 20 (Gurobi finds the v2-era
  20-workload solution by squeezing 2 P2 evals in under the floor);
  revenue $22,032,899.59 -> $22,032,951.05; cost $3,304,895.87 ->
  $3,304,939.27; breakdown back to 14 P0 + 4 P1 + 2 P2.
- "drops 92" -> "drops 90" in narrative.
- Diversity frontier 70% cap revenue: $4,171,456 -> $4,437,625 (Gurobi
  finds a better assignment within the 70% anchor cap).

DemandScenario overlay: unchanged (computation depends only on the
chosen-cell P0/non-P0 strategic-value split, which is unchanged).

Doc updates:
- README Quickstart expected output: 85pct row, baseline cost,
  AllocationPlan singleton row.
- README Expected runtime: Gurobi ~10s solve; total wall time ~5 min
  end-to-end with Gurobi vs ~17 min with HiGHS.
- README "Expected per-cell behavior" 85% bullet.
- Runbook Step 8 Response: ~10s solve detail.
- Runbook Step 9 Response: 90 dropped, 14 P0 + 4 P1 + 2 P2.
…ng peer templates

Drop gurobi/HiGHS mentions from README and runbook narrative. Stage 4
script call switches to problem.solve("highs", time_limit_sec=120),
matching energy_grid_planning and telco_network_recovery conventions
(both peers use highs+120s). README Prerequisites now just says
"prescriptive engine for Stage 4"; Customize section has a single
solver-tuning bullet that's solver-agnostic.

TIME_LIMIT remains the expected termination on the default config;
already-documented as 'signal, not failure' per
rai-prescriptive-results-interpretation. No behavior change on the
baseline cell (all 110 still fit); tight cells may have slightly
different feasible solutions across runs but the documented patterns
hold.
@cafzal cafzal changed the title Add datacenter_compute_allocation template (multi-reasoner, chain follow-up to energy_grid_planning) Add datacenter_compute_allocation template (multi-reasoner) May 13, 2026
cafzal added 2 commits May 13, 2026 15:42
…vention until GNN GAs)

Matches telco_network_recovery, subscriber_retention, demand_forecasting
— all GNN-using templates carry private: true until the GNN reasoner
is generally available.
…ocs to live Gurobi run

- Flatten script to module scope under `# ---- Stage N ----` banners (no `def main`,
  no `_Ctx`, helpers reduced to `load_csv` + `_train_gnn_and_predict`)
- Trim scenario sweep: 2 × 3 × 4 = 24 cells (drop 110pct envelope and 75% margin;
  anchor_max_40pct_with_type_floor retained as the designed-INFEASIBLE example)
- Top-level `SOLVER` + `SOLVER_TIME_LIMIT_SEC` constants; default Gurobi, HiGHS option
- Stage 4 status labelling: distinguish INFEASIBLE (proven) from UNSOLVED (timeout)
- Predictive: `device="cuda"`, matches the configured `GPU_NV_S` engine
- pyproject.toml: pin `relationalai==1.4.2`
- Reconcile README expected-output + per-cell behavior + framing to the actual run:
  16 OPTIMAL / 8 INFEASIBLE, baseline 110/$25.28M/83%/95%, OPTIMAL in 6.3s
- Tighten runbook Response blocks (mean 44 words, in range with canonical runbooks)
- Refresh `data/workload_utilization_fallback.csv` from a real GNN run so the
  `--no-gnn` path matches GNN-shape probabilities (~0.78 top / ~0.45 bottom)
- Clean up apologetics in README, stale function references after refactor,
  and the phantom `data_center_attrs.csv` listing
- README front matter description: "3D-scenario MIP" → "24-cell scenario MIP"
- README Template structure tree: power_envelope 3→2 rows, margin_floors 4→3 rows,
  workload_utilization_fallback annotated as "GNN-shape" not "deterministic"
- Runbook Data footer: "3 / 4 / 4 scenario rows" → "2 / 3 / 4"
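
One way to picture the Stage 4 status labelling listed above is a small classifier over the solver's termination string. This is a sketch under assumptions: the `label_cell` helper and the `has_solution` flag are hypothetical, and the real script may derive its labels differently.

```python
def label_cell(termination: str, has_solution: bool) -> str:
    """Separate proven infeasibility from a cell that merely ran out of time."""
    if termination == "OPTIMAL":
        return "OPTIMAL"
    if termination == "INFEASIBLE":
        # The solver proved that no assignment satisfies the cell's constraints.
        return "INFEASIBLE"
    if termination == "TIME_LIMIT" and has_solution:
        # A feasible incumbent exists but optimality was not proven.
        return "TIME_LIMIT"
    # Timed out (or stopped) with no incumbent: nothing is proven either way.
    return "UNSOLVED"


print(label_cell("INFEASIBLE", False))  # INFEASIBLE
print(label_cell("TIME_LIMIT", False))  # UNSOLVED
```

The distinction matters when reading the sweep: an INFEASIBLE cell is a diagnostic about the constraints, while an UNSOLVED cell only says the time limit was too short.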
@cafzal cafzal merged commit da174bd into main May 18, 2026
3 checks passed
@cafzal cafzal deleted the worktree-agent-a2bcce56b6427b60d branch May 18, 2026 20:26