
Conversation

@prathi-wind
Contributor

@prathi-wind prathi-wind commented Nov 19, 2025

User description

Description

Adding OpenMP support changed the gpu state variable. This caused CPU builds to request GPU resources on SLURM because the templates were not updated. I updated the templates so they no longer request GPU resources when they are not needed.
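
The root cause is Python truthiness: the gpu value the templates receive is now a string rather than a boolean, so a bare % if gpu: check is true even for CPU builds. A minimal self-contained sketch of the failure mode (the enum members shown are assumed from the review comments in this thread, which name 'no', 'acc', and 'mp'; the real mfc.state.gpuConfigOptions may differ in detail):

```python
from enum import Enum

# Hypothetical stand-in for mfc.state.gpuConfigOptions; the real enum
# lives in the MFC toolchain and may differ.
class gpuConfigOptions(Enum):
    NONE = "no"   # CPU-only build
    ACC = "acc"   # OpenACC offloading
    MP = "mp"     # OpenMP offloading

gpu = gpuConfigOptions.NONE.value  # what a CPU build now hands to the templates

# Old template logic: truthiness. Any non-empty string is truthy in
# Python, so even "no" selects the GPU branch.
old_check = bool(gpu)

# Fixed template logic: explicit comparison against the NONE value.
new_check = gpu != gpuConfigOptions.NONE.value

print(old_check, new_check)  # the old check wrongly enables the GPU branch
```

This is why the fix replaces truthiness checks with explicit comparisons in every template.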

Fixes #(issue) [optional]

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Something else

Scope

  • This PR comprises a set of related changes with a common goal

If you cannot check the above box, please split your PR into multiple PRs that each have a common goal.

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.
Provide instructions so we can reproduce.
Please also list any relevant details for your test configuration

  • Test A
  • Test B

Test Configuration:

  • What computers and compilers did you use to test this:

I tested on Phoenix by printing what the template expanded to for the different build types.

Checklist

  • I have added comments for the new code
  • I added Doxygen docstrings to the new code
  • I have made corresponding changes to the documentation (docs/)
  • I have added regression tests to the test suite so that people can verify in the future that the feature is behaving as expected
  • I have added example cases in examples/ that demonstrate my new feature performing as expected.
    They run to completion and demonstrate "interesting physics"
  • I ran ./mfc.sh format before committing my code
  • New and existing tests pass locally with my changes, including with GPU capability enabled (both NVIDIA hardware with NVHPC compilers and AMD hardware with CRAY compilers) and disabled
  • This PR does not introduce any repeated code (it follows the DRY principle)
  • I cannot think of a way to condense this code and reduce any introduced additional line count

If your code changes any code source files (anything in src/simulation)

To make sure the code is performing as expected on GPU devices, I have:

  • Checked that the code compiles using NVHPC compilers
  • Checked that the code compiles using CRAY compilers
  • Ran the code on either V100, A100, or H100 GPUs and ensured the new feature performed as expected (the GPU results match the CPU results)
  • Ran the code on MI200+ GPUs and ensured the new feature performed as expected (the GPU results match the CPU results)
  • Enclosed the new feature via nvtx ranges so that they can be identified in profiles
  • Ran a Nsight Systems profile using ./mfc.sh run XXXX --gpu -t simulation --nsys, and have attached the output file (.nsys-rep) and plain text results to this PR
  • Ran a Rocprof Systems profile using ./mfc.sh run XXXX --gpu -t simulation --rsys --hip-trace, and have attached the output file and plain text results to this PR.
  • Ran my code using various numbers of different GPUs (1, 2, and 8, for example) in parallel and made sure that the results scale similarly to what happens if you run without the new code/feature

PR Type

Bug fix


Description

  • Replace boolean gpu checks with gpuConfigOptions.NONE.value comparison

  • Import gpuConfigOptions from mfc.state in all SLURM templates

  • Fix CPU builds incorrectly requesting GPU resources on SLURM

  • Ensure GPU directives only apply when GPU is explicitly configured


Diagram Walkthrough

flowchart LR
  A["Boolean gpu variable"] -->|"Changed to"| B["gpu != gpuConfigOptions.NONE.value"]
  B -->|"Applied to"| C["16 SLURM templates"]
  C -->|"Result"| D["CPU builds no longer request GPU"]

File Walkthrough

Relevant files

Bug fix (14 files):
  • bridges2.mako: Import gpuConfigOptions and fix GPU check (+2/-1)
  • carpenter-cray.mako: Import gpuConfigOptions and add blank line (+2/-0)
  • carpenter.mako: Import gpuConfigOptions and add blank line (+2/-0)
  • default.mako: Import gpuConfigOptions with blank line (+2/-0)
  • delta.mako: Import gpuConfigOptions and fix GPU check (+2/-1)
  • deltaai.mako: Import gpuConfigOptions and fix GPU check (+2/-1)
  • frontier.mako: Import gpuConfigOptions and fix multiple GPU checks (+4/-3)
  • hipergator.mako: Import gpuConfigOptions and fix GPU check (+2/-1)
  • nautilus.mako: Import gpuConfigOptions and fix GPU check (+2/-1)
  • oscar.mako: Import gpuConfigOptions and fix GPU check (+2/-1)
  • phoenix-bench.mako: Import gpuConfigOptions and fix GPU check (+2/-1)
  • phoenix.mako: Import gpuConfigOptions and fix GPU check (+2/-1)
  • santis.mako: Import gpuConfigOptions and fix GPU check (+2/-1)
  • summit.mako: Import gpuConfigOptions module (+1/-0)

Summary by CodeRabbit

  • New Features

    • Templates now use a unified GPU-active flag so GPU vs CPU execution paths are selected consistently.
  • Bug Fixes

    • GPU-related job options and module loading are properly gated by the GPU-active state, preventing GPU configuration when no GPU is requested.
  • Documentation

    • Minor wording clarity update in a simulation module header.



CodeAnt-AI Description

Stop requesting GPU resources when running CPU-only templates

What Changed

  • Job templates now compute whether GPUs are required before emitting resource flags, so CPU builds no longer generate GPU-related SBATCH/PBS options
  • Module loading scripts and MPI launch wrappers use the new GPU-active indicator to load CPU-only configurations and avoid GPU bindings when none are requested
  • Helper templates expose the shared GPU-active flag so every cluster template stays aligned with the current build requirements

Impact

✅ No GPU requests during CPU-only builds
✅ Fewer GPU-related submission errors on SLURM
✅ More accurate resource allocations for CPU jobs


@coderabbitai

coderabbitai bot commented Nov 19, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Multiple Mako templates now use a template-level boolean is_gpu_active (added in the helpers include by importing gpuConfigOptions and comparing gpu to gpuConfigOptions.NONE.value) and replace prior checks of gpu with is_gpu_active for GPU SBATCH directives, module-load mode selection, and related GPU gating.
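
The helpers change described here can be sketched roughly as follows (a sketch only: the exact layout of include/helpers.mako, and whether the flag is set in a module-level or body-level block, may differ):

```mako
<%! from mfc.state import gpuConfigOptions %>
<%
    # Single shared flag: true for any non-NONE GPU mode ('acc', 'mp', ...)
    is_gpu_active = (gpu != gpuConfigOptions.NONE.value)
%>
```

Cluster templates that include the helpers can then write % if is_gpu_active: instead of the truthy % if gpu: check, so every template stays consistent with the enum-based state.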

Changes

Cohort / File(s): Summary
  • Templates (GPU gating switched to is_gpu_active): toolchain/templates/bridges2.mako, toolchain/templates/carpenter-cray.mako, toolchain/templates/carpenter.mako, toolchain/templates/default.mako, toolchain/templates/delta.mako, toolchain/templates/deltaai.mako, toolchain/templates/frontier.mako, toolchain/templates/hipergator.mako, toolchain/templates/nautilus.mako, toolchain/templates/oscar.mako, toolchain/templates/phoenix-bench.mako, toolchain/templates/phoenix.mako, toolchain/templates/santis.mako, toolchain/templates/summit.mako
    Replaced GPU truthiness checks (% if gpu: / ${'g' if gpu else 'c'} and similar) with is_gpu_active equivalents (% if is_gpu_active: / ${'g' if is_gpu_active else 'c'}) across SBATCH GPU directives, module-load mode selection, MPICH/interactive gating, and other GPU-related branches; some templates now import or rely on gpuConfigOptions where required.
  • Helpers (define is_gpu_active): toolchain/templates/include/helpers.mako
    Expanded imports to include from mfc.state import gpuConfigOptions and defines is_gpu_active = (gpu != gpuConfigOptions.NONE.value), exposing a consistent boolean for GPU activity to templates.
  • Minor docs change: src/simulation/m_bubbles.fpp
    Small wording change in module header documentation; no functional changes.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Renderer as Template renderer
  participant Helpers as include/helpers.mako
  participant Template as .mako template
  participant SBATCH as SBATCH / module loader

  rect rgb(232, 244, 255)
  Note over Renderer,Helpers: render start — helpers compute GPU state
  Renderer->>Helpers: import gpuConfigOptions\ncompute is_gpu_active = (gpu != NONE)
  end

  Renderer->>Template: render template (context includes is_gpu_active)
  alt is_gpu_active
    Template->>SBATCH: emit GPU SBATCH directives\ninvoke mfc.sh load with 'g'
  else not is_gpu_active
    Template->>SBATCH: omit GPU SBATCH directives\ninvoke mfc.sh load with 'c'
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Review focus:
    • Verify is_gpu_active expression matches intended gpuConfigOptions.NONE.value semantics.
    • Spot-check templates with multiple GPU-related blocks (e.g., frontier.mako, deltaai.mako, phoenix.mako) for consistent replacement.
    • Confirm added imports do not introduce name collisions or missing imports in templates that now reference gpuConfigOptions.

Poem

🐰 I hopped through Mako lines so spry,
I taught the files to check the sky,
is_gpu_active now takes the lead,
Quiet swaps that do the deed,
Cheers — a tidy code-field sigh.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Description check ⚠️ Warning: The description is largely incomplete. While it includes a problem statement, PR type selection, and testing notes, most critical checklist sections remain unchecked, and the description lacks details on broader testing, documentation updates, and regression tests. Resolution: complete the checklist items by documenting testing on GPU/CPU builds with NVHPC and CRAY compilers, add documentation updates, and include regression test plans if applicable to this bug fix.
✅ Passed checks (2 passed)
  • Title check ✅ Passed: The title clearly and concisely summarizes the main change: preventing CPU builds from requesting GPU resources on SLURM templates.
  • Docstring Coverage ✅ Passed: No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 214ee17 and e5315b2.

📒 Files selected for processing (1)
  • src/simulation/m_bubbles.fpp (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • src/simulation/m_bubbles.fpp
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: cubic · AI code reviewer
  • GitHub Check: Code Cleanliness Check
  • GitHub Check: Coverage Test on CodeCov
  • GitHub Check: Github (ubuntu, mpi, debug, false)
  • GitHub Check: Github (macos, mpi, no-debug, false)
  • GitHub Check: Github (ubuntu, no-mpi, single, no-debug, false)
  • GitHub Check: Github (ubuntu, mpi, no-debug, false)
  • GitHub Check: Github (ubuntu, mpi, no-debug, true)
  • GitHub Check: Self Hosted (gpu, omp, gt)
  • GitHub Check: Self Hosted (cpu, none, gt)
  • GitHub Check: Github (macos, mpi, debug, false)
  • GitHub Check: Github (ubuntu, mpi, debug, true)
  • GitHub Check: Self Hosted (gpu, acc, gt)
  • GitHub Check: Build & Publish


@sbryngelson
Member

@prathi-wind does this need to be a draft or is it ready to merge?

@sbryngelson sbryngelson marked this pull request as ready for review November 19, 2025 17:36
@sbryngelson sbryngelson requested review from a team and Copilot November 19, 2025 17:36
@qodo-merge-pro
Contributor

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Possible Issue

Mixed usage of gpu as boolean in . ./mfc.sh load -m ${'g' if gpu else 'c'} while other directives now check gpu != gpuConfigOptions.NONE.value. If gpu is no longer boolean, this conditional may mis-select modules. Ensure the module selection uses the same enum-based logic.

cd "${MFC_ROOT_DIR}"
% if engine == 'batch':
. ./mfc.sh load -c f -m ${'g' if gpu else 'c'}
% endif
cd - > /dev/null
echo

% if gpu != gpuConfigOptions.NONE.value:
    export MPICH_GPU_SUPPORT_ENABLED=1

Consistency
New import <%! from mfc.state import gpuConfigOptions %> added in many templates, but verify all templates that gate GPU resources have been updated uniformly. Any missed template would reintroduce the CPU-requesting-GPU issue.

@prathi-wind
Contributor Author

@prathi-wind does this need to be a draft or is it ready to merge?

It is ready to merge.

Contributor


High-level Suggestion

The logic for checking GPU usage is duplicated across 14 template files. This should be centralized into a single helper function or variable to improve maintainability and consistency. [High-level, importance: 8]

Solution Walkthrough:

Before:

// In toolchain/templates/bridges2.mako
<%! from mfc.state import gpuConfigOptions %>
...
% if gpu != gpuConfigOptions.NONE.value:
  #SBATCH --gres=gpu:v100-16:${tasks_per_node}
% endif

// In toolchain/templates/delta.mako
<%! from mfc.state import gpuConfigOptions %>
...
% if gpu != gpuConfigOptions.NONE.value:
  #SBATCH --gpus-per-node=${tasks_per_node}
% endif

// ... and so on for many other template files

After:

// In a central Python file that renders templates:
from mfc.state import gpuConfigOptions
...
is_gpu_active = (gpu_mode != gpuConfigOptions.NONE.value)
template.render(..., is_gpu_active=is_gpu_active)


// In toolchain/templates/bridges2.mako (and all others)
// No import or complex check needed
...
% if is_gpu_active:
  #SBATCH --gres=gpu:v100-16:${tasks_per_node}
% endif

// In toolchain/templates/delta.mako (and all others)
...
% if is_gpu_active:
  #SBATCH --gpus-per-node=${tasks_per_node}
% endif


@cursor cursor bot left a comment



Bug: GPU load command uses stale boolean check

The line . ./mfc.sh load -c cc -m ${'g' if gpu else 'c'} still uses the old boolean check for gpu. Since gpu is now a string ('no', 'acc', or 'mp'), the string 'no' is truthy and will incorrectly load GPU modules on CPU builds. This should use gpu != gpuConfigOptions.NONE.value to match the pattern in other fixes.

toolchain/templates/carpenter-cray.mako#L30-L31

. ./mfc.sh load -c cc -m ${'g' if gpu else 'c'}
cd - > /dev/null



echo

% if gpu:
% if gpu != gpuConfigOptions.NONE.value:

Bug: GPU load command uses stale boolean check

The expression . ./mfc.sh load -c f -m ${'g' if gpu else 'c'} still uses the old boolean check for gpu. Since gpu is now a string ('no', 'acc', or 'mp'), the string 'no' evaluates to truthy, causing CPU builds to incorrectly load GPU modules. Should be updated to ${'g' if gpu != gpuConfigOptions.NONE.value else 'c'} to match other fixed conditions in this PR.
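
A sketch of the one-line fix both bots point at (the -c f compiler flag is the invocation quoted in the surrounding comments; other templates pass different flags):

```mako
## Before: 'no' is a non-empty string, so 'g' (GPU modules) is always chosen
. ./mfc.sh load -c f -m ${'g' if gpu else 'c'}

## After: explicit comparison against the NONE enum value
. ./mfc.sh load -c f -m ${'g' if gpu != gpuConfigOptions.NONE.value else 'c'}
```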


@sbryngelson
Member

some ai reviews that seem reasonable

Copilot finished reviewing on behalf of sbryngelson November 19, 2025 17:39

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.


Comment on lines 43 to 46
% if gpu != gpuConfigOptions.NONE.value:
export MPICH_GPU_SUPPORT_ENABLED=1
% else:
export MPICH_GPU_SUPPORT_ENABLED=0


P1 Badge Update remaining GPU conditionals to handle enum values

Changing the job header to gpu != gpuConfigOptions.NONE.value avoids requesting GPUs via SLURM, but other logic in this template still relies on if gpu (e.g. the module load line . ./mfc.sh load -c f -m ${'g' if gpu else 'c'}). Because gpu is now a non‑empty string ('no', 'acc', etc.), those expressions are always truthy and CPU builds will still load GPU modules and pass GPU runtime flags. This leaves CPU jobs misconfigured despite the header change. Please apply the same explicit comparison for the module selection here and in the other templates touched by this commit (bridges2.mako, carpenter*.mako, default.mako, hipergator.mako, summit.mako, etc.).



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (3)
toolchain/templates/carpenter.mako (1)

4-5: Unused import can be removed.

The gpuConfigOptions import is not used in this template since it doesn't contain GPU-related conditional logic. Consider removing it to keep the imports clean.

Apply this diff to remove the unused import:

-<%! from mfc.state import gpuConfigOptions %>
-
toolchain/templates/default.mako (1)

5-6: Unused import can be removed.

The gpuConfigOptions import is not used in this template. While line 53 references the gpu variable, it doesn't use gpuConfigOptions.NONE.value for comparison. Consider removing the unused import.

Apply this diff to remove the unused import:

-<%! from mfc.state import gpuConfigOptions %>
-
toolchain/templates/carpenter-cray.mako (1)

4-5: Unused import can be removed.

The gpuConfigOptions import is not used in this template since it doesn't contain GPU-related conditional logic.

Apply this diff to remove the unused import:

-<%! from mfc.state import gpuConfigOptions %>
-
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 22af239 and 985a861.

📒 Files selected for processing (14)
  • toolchain/templates/bridges2.mako (2 hunks)
  • toolchain/templates/carpenter-cray.mako (1 hunks)
  • toolchain/templates/carpenter.mako (1 hunks)
  • toolchain/templates/default.mako (1 hunks)
  • toolchain/templates/delta.mako (2 hunks)
  • toolchain/templates/deltaai.mako (2 hunks)
  • toolchain/templates/frontier.mako (4 hunks)
  • toolchain/templates/hipergator.mako (2 hunks)
  • toolchain/templates/nautilus.mako (2 hunks)
  • toolchain/templates/oscar.mako (2 hunks)
  • toolchain/templates/phoenix-bench.mako (2 hunks)
  • toolchain/templates/phoenix.mako (2 hunks)
  • toolchain/templates/santis.mako (2 hunks)
  • toolchain/templates/summit.mako (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: cubic · AI code reviewer
  • GitHub Check: Cursor Bugbot
🔇 Additional comments (22)
toolchain/templates/nautilus.mako (2)

4-4: LGTM: Import enables explicit GPU configuration check.

The import is correctly used at line 18 to gate GPU resource requests.


18-21: The code change is correct and verified. Approval is appropriate.

The template properly imports gpuConfigOptions at line 4, and the explicit comparison gpu != gpuConfigOptions.NONE.value correctly gates GPU resource requests. When gpu is 'no' (the NONE value), the GPU directives are skipped; otherwise they're included. This pattern is consistent across all compute cluster templates in the codebase.

toolchain/templates/delta.mako (2)

4-4: LGTM: Import enables explicit GPU configuration check.


18-22: LGTM: Explicit NONE check prevents unintended GPU requests.

The change correctly gates GPU-specific SBATCH directives (gpus-per-node, mem, gpu-bind) to only activate when a real GPU configuration is selected.

toolchain/templates/bridges2.mako (2)

4-4: LGTM: Import enables explicit GPU configuration check.


18-21: LGTM: Explicit NONE check prevents unintended GPU requests.

The condition correctly prevents GPU resource allocation on CPU builds.

toolchain/templates/phoenix.mako (2)

4-4: LGTM: Import enables explicit GPU configuration check.


21-24: LGTM: Explicit NONE check prevents unintended GPU requests.

The condition correctly gates GPU resource directives for the Phoenix cluster.

Note: Line 23 has a trailing backslash after 16G\ which may be unintentional but appears to be pre-existing.

toolchain/templates/phoenix-bench.mako (2)

4-4: LGTM: Import enables explicit GPU configuration check.


21-24: LGTM: Explicit NONE check prevents unintended GPU requests.

The condition correctly gates GPU resource directives. Same trailing backslash note as in phoenix.mako.

toolchain/templates/deltaai.mako (2)

4-4: LGTM: Import added for GPU configuration.

The import of gpuConfigOptions enables explicit GPU gating in this template.


18-22: LGTM: GPU resources correctly gated on non-NONE configuration.

The explicit comparison against gpuConfigOptions.NONE.value ensures SBATCH GPU directives are only emitted when GPU is actually configured, preventing CPU builds from incorrectly requesting GPU resources.

toolchain/templates/oscar.mako (2)

4-4: LGTM: Import added for GPU configuration.

The import enables explicit GPU gating in this template.


18-22: LGTM: GPU resources correctly gated.

The explicit comparison ensures GPU directives are only emitted for non-NONE GPU configurations, preventing unintended GPU resource requests.

toolchain/templates/santis.mako (2)

4-4: LGTM: Import added for GPU configuration.


78-80: LGTM: GPU flag correctly gated in srun command.

The explicit comparison ensures --gpus-per-task is only passed to srun when GPU is actually configured, preventing CPU builds from requesting GPU resources.

toolchain/templates/frontier.mako (4)

4-4: LGTM: Import added for GPU configuration.


14-17: LGTM: GPU resources correctly gated in batch mode.

The explicit comparison ensures SBATCH GPU directives are only emitted for non-NONE GPU configurations.


43-47: LGTM: MPICH_GPU_SUPPORT_ENABLED explicitly controlled.

Setting MPICH_GPU_SUPPORT_ENABLED based on GPU configuration is good practice and ensures MPI behavior aligns with resource allocation.


70-72: LGTM: GPU flags correctly gated in interactive mode.

toolchain/templates/hipergator.mako (2)

4-4: LGTM: Import added for GPU configuration.


13-16: LGTM: GPU resources correctly gated.

The explicit comparison against gpuConfigOptions.NONE.value ensures GPU directives are only emitted when GPU is configured, fixing the issue where CPU builds incorrectly requested GPU resources.

Contributor

Copilot AI left a comment


Pull Request Overview

This PR fixes a bug where CPU builds incorrectly requested GPU resources on SLURM/PBS batch systems. The issue arose after OpenMP support changed the state variable's representation: the gpu variable now holds string values ('no', 'acc', or 'mp') instead of boolean values. Since any non-empty string is truthy in Python/Mako templates, the previous % if gpu: conditionals were evaluating to True even for CPU builds (gpu = 'no'), causing SLURM/PBS to request GPU resources unnecessarily.

The fix updates all batch scheduler templates to properly check gpu != gpuConfigOptions.NONE.value instead of relying on truthiness, ensuring GPU resources are only requested when actually needed.

  • Updated conditional checks from % if gpu: to % if gpu != gpuConfigOptions.NONE.value: across all affected templates
  • Added from mfc.state import gpuConfigOptions import to access the enum value
  • Applied changes consistently across 13 batch scheduler template files

Reviewed Changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
toolchain/templates/summit.mako Added gpuConfigOptions import to support future GPU-related conditionals (no direct GPU resource requests in this template)
toolchain/templates/santis.mako Fixed conditional to prevent GPU resource requests on CPU builds
toolchain/templates/phoenix.mako Fixed conditional to prevent GPU resource requests on CPU builds
toolchain/templates/phoenix-bench.mako Fixed conditional to prevent GPU resource requests on CPU builds
toolchain/templates/oscar.mako Fixed conditional to prevent GPU resource requests on CPU builds
toolchain/templates/nautilus.mako Fixed conditional to prevent GPU resource requests on CPU builds
toolchain/templates/hipergator.mako Fixed conditional to prevent GPU resource requests on CPU builds
toolchain/templates/frontier.mako Fixed conditionals for both SBATCH directives and runtime environment variables to prevent GPU requests on CPU builds
toolchain/templates/deltaai.mako Fixed conditional to prevent GPU resource requests on CPU builds
toolchain/templates/delta.mako Fixed conditional to prevent GPU resource requests on CPU builds
toolchain/templates/default.mako Added gpuConfigOptions import to support future GPU-related conditionals
toolchain/templates/carpenter.mako Added gpuConfigOptions import to support future GPU-related conditionals
toolchain/templates/carpenter-cray.mako Added gpuConfigOptions import to support future GPU-related conditionals
toolchain/templates/bridges2.mako Fixed conditional to prevent GPU resource requests on CPU builds

<%namespace name="helpers" file="helpers.mako"/>
<%! from mfc.state import gpuConfigOptions %>



Copilot AI Nov 19, 2025


[nitpick] Inconsistent blank line spacing after the import statement. This file has two blank lines after <%! from mfc.state import gpuConfigOptions %> (lines 5-6), while other template files like phoenix.mako, frontier.mako, etc. have only one blank line. For consistency, remove one of the blank lines.

<%namespace name="helpers" file="helpers.mako"/>
<%! from mfc.state import gpuConfigOptions %>



Copilot AI Nov 19, 2025


[nitpick] Inconsistent blank line spacing after the import statement. This file has two blank lines after <%! from mfc.state import gpuConfigOptions %> (lines 5-6), while other template files like phoenix.mako, frontier.mako, etc. have only one blank line. For consistency, remove one of the blank lines.

Comment on lines 4 to 5
%>
<%! from mfc.state import gpuConfigOptions %>

Copilot AI Nov 19, 2025


[nitpick] The import statement is placed in a separate <%! ... %> block from the existing imports (lines 1-4). For consistency and cleaner code organization, consider consolidating all module-level imports into a single <%! ... %> block:

<%!
import os
from mako.exceptions import RuntimeException
from mfc.state import gpuConfigOptions
%>
@prathi-wind
Contributor Author

Actually, I realized I missed a few things I need to update.

Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment


2 issues found across 14 files

Prompt for AI agents (all 2 issues)

Understand the root cause of the following 2 issues and fix them.


<file name="toolchain/templates/delta.mako">

<violation number="1" location="toolchain/templates/delta.mako:18">
This condition still allocates GPUs for OpenMP-only builds; it should only request GPUs when the configuration is explicitly the accelerator option.</violation>
</file>

<file name="toolchain/templates/oscar.mako">

<violation number="1" location="toolchain/templates/oscar.mako:18">
This condition still requests GPUs for OpenMP CPU builds (gpu == gpuConfigOptions.MP.value); it should only request GPUs for the ACC (actual GPU) mode to avoid consuming GPU resources on CPU jobs.</violation>
</file>


#SBATCH --account="${account}"
% endif
-% if gpu:
+% if gpu != gpuConfigOptions.NONE.value:
Contributor

@cubic-dev-ai cubic-dev-ai bot Nov 19, 2025


This condition still allocates GPUs for OpenMP-only builds; it should only request GPUs when the configuration is explicitly the accelerator option.

Prompt for AI agents
Address the following comment on toolchain/templates/delta.mako at line 18:

<comment>This condition still allocates GPUs for OpenMP-only builds; it should only request GPUs when the configuration is explicitly the accelerator option.</comment>

<file context>
@@ -14,7 +15,7 @@
 #SBATCH --account=&quot;${account}&quot;
 % endif
-% if gpu:
+% if gpu != gpuConfigOptions.NONE.value:
 #SBATCH --gpus-per-node=${tasks_per_node}
 #SBATCH --mem=208G
</file context>
Suggested change
-% if gpu != gpuConfigOptions.NONE.value:
+% if gpu == gpuConfigOptions.ACC.value:

#SBATCH --account="${account}"
% endif
-% if gpu:
+% if gpu != gpuConfigOptions.NONE.value:
Contributor

@cubic-dev-ai cubic-dev-ai bot Nov 19, 2025


This condition still requests GPUs for OpenMP CPU builds (gpu == gpuConfigOptions.MP.value); it should only request GPUs for the ACC (actual GPU) mode to avoid consuming GPU resources on CPU jobs.

Prompt for AI agents
Address the following comment on toolchain/templates/oscar.mako at line 18:

<comment>This condition still requests GPUs for OpenMP CPU builds (gpu == gpuConfigOptions.MP.value); it should only request GPUs for the ACC (actual GPU) mode to avoid consuming GPU resources on CPU jobs.</comment>

<file context>
@@ -14,7 +15,7 @@
 #SBATCH --account=&quot;${account}&quot;
 % endif
-% if gpu:
+% if gpu != gpuConfigOptions.NONE.value:
 #SBATCH --gpus-per-node=${tasks_per_node}
 #SBATCH --mem=64G
</file context>
Suggested change
-% if gpu != gpuConfigOptions.NONE.value:
+% if gpu == gpuConfigOptions.ACC.value:

@codeant-ai

codeant-ai bot commented Nov 19, 2025

CodeAnt AI is reviewing your PR.

@codeant-ai codeant-ai bot added the size:M This PR changes 30-99 lines, ignoring generated files label Nov 19, 2025

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (1)
toolchain/templates/oscar.mako (1)

4-4: GPU gating now centralized via is_gpu_active, but please re-confirm its semantics for CPU/OpenMP builds

Using is_gpu_active here to guard --gpus-per-node, --mem, --gpu-bind, and the -m g/c switch in mfc.sh is consistent and keeps the site template itself simple. However, because is_gpu_active is currently defined as gpu != gpuConfigOptions.NONE.value in helpers.mako, this still treats every non‑NONE GPU mode as “GPU active”.

If your gpu state includes non‑GPU CPU‑only modes (e.g., an OpenMP/MP option distinct from the real GPU/ACC option), this condition will continue to request GPUs for those CPU‑only builds—the same behavior this PR is trying to eliminate. In that case, is_gpu_active should instead be tied explicitly to the actual GPU mode(s) (e.g., gpu == gpuConfigOptions.ACC.value or a small set of GPU modes), or delegated to a helper in mfc.state so the mapping lives in one place.

Please double‑check the gpu enum and is_gpu_active implementation against your intended state model and confirm that CPU/OpenMP configurations no longer trigger these GPU directives on Oscar.

Also applies to: 18-22, 36-36

🧹 Nitpick comments (1)
toolchain/templates/include/helpers.mako (1)

3-3: Clarify and possibly tighten is_gpu_active so it truly means “GPU hardware in use”

Centralizing GPU gating as:

<% is_gpu_active = (gpu != gpuConfigOptions.NONE.value) %>

is a good step, but a few details are worth double‑checking:

  • Semantic scope: If gpu is an enum with more than two states (e.g., NONE, MP for OpenMP‑only CPU builds, ACC for actual GPU builds), then gpu != gpuConfigOptions.NONE.value will be true for all non‑NONE modes, including CPU‑only ones. That would mean is_gpu_active is still true for OpenMP CPU builds, and all the site templates that rely on it would continue to request GPUs for those jobs. If that’s not intended, define is_gpu_active in terms of the specific GPU mode(s), e.g., equality against the GPU enum member(s) or via a small Python helper in mfc.state (so the mapping between state and “uses GPUs” lives in one place).

  • Context dependencies: This expression assumes both gpu and gpuConfigOptions are always present and well‑typed in the template context for every engine/site that pulls in helpers.mako. Please verify that all render paths set these consistently; otherwise this comparison could behave unexpectedly or fail at render time.

Given that every template in this PR now keys its SBATCH GPU options and module mode selection off is_gpu_active, it’s worth nailing down this definition so it exactly matches your conceptual “GPU enabled” state.
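If the `gpu` state really does carry a CPU-only OpenMP mode alongside the accelerator mode, the tightened helper the reviewers describe could look like the following minimal sketch. The enum member names (`NONE`, `MP`, `ACC`) and their values are assumptions inferred from this review thread; the actual definitions live in `mfc.state` and may differ.

```python
from enum import Enum

# Hypothetical reconstruction of the state enum discussed above;
# the real names live in mfc.state and may differ.
class gpuConfigOptions(Enum):
    NONE = "none"  # plain CPU build
    MP = "mp"      # OpenMP build (still CPU-only on these clusters)
    ACC = "acc"    # OpenACC build that actually uses GPU hardware

def is_gpu_active(gpu: str) -> bool:
    # Gate on the accelerator mode only, so OpenMP CPU builds
    # never request GPU resources from SLURM.
    return gpu == gpuConfigOptions.ACC.value

print(is_gpu_active("acc"))   # accelerator build
print(is_gpu_active("mp"))    # OpenMP CPU build: no GPUs requested
print(is_gpu_active("none"))  # plain CPU build
```

Under this definition, only the `ACC` mode triggers the `--gpus-per-node`/`--gres` directives, which is the mapping cubic's suggested changes above ask for.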

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 985a861 and 3503f95.

📒 Files selected for processing (15)
  • toolchain/templates/bridges2.mako (3 hunks)
  • toolchain/templates/carpenter-cray.mako (2 hunks)
  • toolchain/templates/carpenter.mako (2 hunks)
  • toolchain/templates/default.mako (2 hunks)
  • toolchain/templates/delta.mako (3 hunks)
  • toolchain/templates/deltaai.mako (3 hunks)
  • toolchain/templates/frontier.mako (4 hunks)
  • toolchain/templates/hipergator.mako (3 hunks)
  • toolchain/templates/include/helpers.mako (1 hunks)
  • toolchain/templates/nautilus.mako (3 hunks)
  • toolchain/templates/oscar.mako (3 hunks)
  • toolchain/templates/phoenix-bench.mako (3 hunks)
  • toolchain/templates/phoenix.mako (3 hunks)
  • toolchain/templates/santis.mako (3 hunks)
  • toolchain/templates/summit.mako (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (9)
  • toolchain/templates/carpenter.mako
  • toolchain/templates/bridges2.mako
  • toolchain/templates/summit.mako
  • toolchain/templates/phoenix.mako
  • toolchain/templates/frontier.mako
  • toolchain/templates/carpenter-cray.mako
  • toolchain/templates/delta.mako
  • toolchain/templates/deltaai.mako
  • toolchain/templates/santis.mako
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: cubic · AI code reviewer
  • GitHub Check: Build & Publish
🔇 Additional comments (6)
toolchain/templates/default.mako (1)

5-5: jsrun GPU count wiring via is_gpu_active looks correct; just ensure the helper matches your GPU enum

Switching --gpu_per_rs to ${1 if is_gpu_active else 0} is the right pattern: the launcher now cleanly depends on a single boolean that encodes “this run actually wants GPUs”.

Given that is_gpu_active is defined in helpers.mako as gpu != gpuConfigOptions.NONE.value, please confirm that every non‑NONE value of gpu truly corresponds to a GPU‑using configuration. If there are CPU‑only modes encoded in gpu (e.g., OpenMP), you may still end up with --gpu_per_rs 1 for CPU jobs unless is_gpu_active is tightened to check only the real GPU mode(s).

Also applies to: 53-53

toolchain/templates/nautilus.mako (1)

4-4: Nautilus GPU directives correctly gated on is_gpu_active; behavior depends on helper definition

On Nautilus, the SBATCH GPU options (--gpu-bind, --gres=gpu:...) and the mfc.sh load -m g/c switch are now consistently guarded by is_gpu_active, which is exactly what you want at the template level.

Just as with the other templates, please verify that is_gpu_active in helpers.mako accurately captures “this job actually uses GPUs” for all combinations of your build options. If gpu encodes CPU‑only modes as non‑NONE, consider tightening is_gpu_active to match only the actual GPU mode(s) to avoid continuing to request V100 resources for CPU/OpenMP cases on Nautilus.

Also applies to: 18-21, 35-35

toolchain/templates/phoenix-bench.mako (1)

4-4: Phoenix-bench GPU resource flags are cleanly gated on is_gpu_active

Here too, the SBATCH GPU options (--gres=gpu:V100:..., --mem-per-gpu) and the mfc.sh load -m g/c mode now consistently depend on is_gpu_active, which is the right abstraction for the template.

As with the other templates, please confirm that is_gpu_active (currently gpu != gpuConfigOptions.NONE.value in helpers.mako) is false for all CPU‑only Phoenix bench configurations, including any OpenMP‑only modes, so these jobs don’t continue to consume V100 GPUs unintentionally.

Also applies to: 21-24, 35-35

toolchain/templates/hipergator.mako (3)

13-16: GPU directive gating looks correct.

The change from gpu to is_gpu_active properly gates the GPU-specific SBATCH directives, which should resolve the issue where CPU builds incorrectly requested GPU resources. Ensure that is_gpu_active correctly evaluates to False for CPU builds.

The verification script above (for line 4) will confirm that is_gpu_active is properly defined in helpers.mako.


39-39: LGTM! Consistent with GPU directive gating.

The change from gpu to is_gpu_active in the module load command is consistent with the conditional on line 13, ensuring CPU builds correctly load CPU mode ('c') instead of GPU mode ('g').


4-4: Remove the unused import or verify if it's necessary for context passing.

The import <%! from mfc.state import gpuConfigOptions %> on line 4 does not appear to be directly used in this file—only is_gpu_active is referenced (lines 13, 39). Code within <%! %> tags is executed at the module level of the template, and not within the rendering function of the template. Each template module maintains its own module-level imports independently. Since is_gpu_active is defined in helpers.mako using gpuConfigOptions, verify whether:

  • helpers.mako also imports gpuConfigOptions independently, or
  • The import here is needed to pass gpuConfigOptions through the context to helpers.mako

If the import is truly unused after verification, it should be removed as dead code.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
toolchain/templates/summit.mako (1)

32-35: Inconsistent GPU conditionals remain in jsrun flags.

While line 20 correctly uses is_gpu_active for module loading, lines 32 and 35 still use truthiness checks (if gpu) for the jsrun GPU flags. This is inconsistent with the PR objective and the pattern used in other templates. The gpu variable could have a non-NONE truthy value (e.g., 'no') that would incorrectly trigger GPU flags even for CPU-only builds.
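The truthiness hazard flagged here is easy to demonstrate in plain Python: any non-empty string is truthy, so a CPU-only marker like `'no'` or `'mp'` still passes a bare `% if gpu:` check in a Mako template. The specific string values below are illustrative assumptions, not the exact values used by `mfc.state`.

```python
# Compare the legacy truthiness check against an explicit mode check
# for a few representative values of the `gpu` state string.
for gpu in ("", "no", "mp", "acc"):
    legacy = bool(gpu)        # old `% if gpu:` behavior
    explicit = (gpu == "acc") # explicit accelerator-mode check
    print(f"{gpu!r}: legacy={legacy} explicit={explicit}")
```

Every non-empty value passes the legacy check, which is exactly why the jsrun flags on lines 32 and 35 can still fire for CPU-only builds.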

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3503f95 and 90e5f36.

📒 Files selected for processing (15)
  • toolchain/templates/bridges2.mako (2 hunks)
  • toolchain/templates/carpenter-cray.mako (1 hunks)
  • toolchain/templates/carpenter.mako (1 hunks)
  • toolchain/templates/default.mako (2 hunks)
  • toolchain/templates/delta.mako (2 hunks)
  • toolchain/templates/deltaai.mako (2 hunks)
  • toolchain/templates/frontier.mako (3 hunks)
  • toolchain/templates/hipergator.mako (2 hunks)
  • toolchain/templates/include/helpers.mako (1 hunks)
  • toolchain/templates/nautilus.mako (2 hunks)
  • toolchain/templates/oscar.mako (2 hunks)
  • toolchain/templates/phoenix-bench.mako (2 hunks)
  • toolchain/templates/phoenix.mako (2 hunks)
  • toolchain/templates/santis.mako (2 hunks)
  • toolchain/templates/summit.mako (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (9)
  • toolchain/templates/include/helpers.mako
  • toolchain/templates/carpenter.mako
  • toolchain/templates/santis.mako
  • toolchain/templates/delta.mako
  • toolchain/templates/oscar.mako
  • toolchain/templates/nautilus.mako
  • toolchain/templates/phoenix-bench.mako
  • toolchain/templates/carpenter-cray.mako
  • toolchain/templates/default.mako
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: cubic · AI code reviewer
  • GitHub Check: Build & Publish
🔇 Additional comments (13)
toolchain/templates/frontier.mako (5)

13-16: LGTM: SBATCH GPU directives now properly gated.

The GPU resource requests are now correctly conditional on is_gpu_active, preventing CPU builds from requesting GPU resources via SLURM.


37-37: Past review issue resolved: Module loading now uses is_gpu_active.

The module loading command now correctly uses is_gpu_active instead of the boolean gpu check, ensuring CPU builds load CPU modules. This addresses the issues flagged in previous reviews.


42-46: LGTM: MPICH GPU support flag correctly configured.

The MPICH_GPU_SUPPORT_ENABLED environment variable is now properly set based on is_gpu_active, ensuring it matches the actual GPU configuration.


69-71: LGTM: Interactive srun GPU flags properly gated.

The GPU flags for interactive execution are correctly conditional on is_gpu_active, maintaining consistency with the batch execution path.


3-3: Verification confirmed: is_gpu_active is correctly defined in helpers.mako.

The definition in toolchain/templates/include/helpers.mako (line 5) correctly sets is_gpu_active = (gpu != gpuConfigOptions.NONE.value), matching the intended logic across all template changes.

toolchain/templates/summit.mako (1)

20-20: Module load update looks correct.

The change from if gpu to if is_gpu_active correctly aligns with the PR objective to prevent CPU builds from incorrectly requesting GPU resources.

toolchain/templates/bridges2.mako (2)

17-20: GPU SBATCH directives correctly updated.

The condition change from if gpu to if is_gpu_active correctly ensures GPU resources are only requested when GPU is explicitly configured, addressing the PR's objective.


34-34: Module load update looks correct.

The change to use is_gpu_active for module selection aligns with the updated SBATCH directives and PR objective.

toolchain/templates/deltaai.mako (2)

17-21: GPU SBATCH directives correctly updated.

The condition change from if gpu to if is_gpu_active properly gates all GPU-related SBATCH options, preventing CPU builds from incorrectly requesting GPU resources.


35-35: Module load update looks correct.

The change to use is_gpu_active for mode selection ('g' vs 'c') aligns with the updated SBATCH directives.

toolchain/templates/hipergator.mako (2)

12-15: GPU SBATCH directives correctly updated.

The condition change from if gpu to if is_gpu_active properly gates GPU resource requests, addressing the PR's objective.


38-38: Module load update looks correct.

The change to use is_gpu_active for module selection aligns with the updated SBATCH directives.

toolchain/templates/phoenix.mako (1)

34-34: Module load update looks correct.

The change to use is_gpu_active for module selection aligns with the updated SBATCH directives.

Comment on lines +20 to 23
% if is_gpu_active:
#SBATCH --gres=gpu:V100:${tasks_per_node}
#SBATCH --mem-per-gpu=16G\
% endif


⚠️ Potential issue | 🟡 Minor

GPU SBATCH directives updated, but trailing backslash needs removal.

The condition change from if gpu to if is_gpu_active is correct. However, line 22 has a trailing backslash (16G\) with no line continuation, which could cause issues with SLURM directive parsing.

Apply this diff to remove the trailing backslash:

-#SBATCH --mem-per-gpu=16G\
+#SBATCH --mem-per-gpu=16G
📝 Committable suggestion


Suggested change
 % if is_gpu_active:
 #SBATCH --gres=gpu:V100:${tasks_per_node}
-#SBATCH --mem-per-gpu=16G\
+#SBATCH --mem-per-gpu=16G
 % endif
🤖 Prompt for AI Agents
In toolchain/templates/phoenix.mako around lines 20 to 23, the GPU SBATCH block
contains a trailing backslash on the --mem-per-gpu directive ("16G\") which
should be removed; edit that line to remove the backslash so it reads a normal
SLURM directive (e.g., "--mem-per-gpu=16G") and ensure no unintended line
continuation characters remain in the SBATCH directives.

@codeant-ai
Copy link

codeant-ai bot commented Nov 19, 2025

Pull Request Feedback 🔍

🔒 No security issues identified
⚡ Recommended areas for review

  • No effect
    The new is_gpu_active flag is never consumed anywhere in this template, so CPU builds still request GPU resources just as before. Please gate the SLURM GPU directives on this boolean (or remove the flag) so that CPU-only builds stop requesting GPUs.

  • Undefined variable risk
    The new module-load line now dereferences is_gpu_active. Please confirm that every code path that renders this template populates that flag (or provide a safe fallback), otherwise the template raises a NameError and the job script will not be generated. You might keep the previous gpu flag for backwards compatibility or ensure the template context always provides is_gpu_active for both CPU and GPU builds.

  • Possible Bug
    The new module‑loading line now depends on is_gpu_active, but nothing in this file (or the diff) guarantees that this identifier exists in every execution context. If the template still runs in places where only the old gpu flag was provided, rendering will fail and the batch script won't be generated. Please ensure is_gpu_active is always exported to the template or keep using the previous symbol so CPU builds stay GPU‑free.

  • Undefined Context Variable
    is_gpu_active is now used in both the module-load command and the srun branch to decide whether GPUs should be requested, but the template does not define it anywhere. Unless the rendering context already provides this name, the template will raise an undefined-variable error and fail on CPU builds. Please derive is_gpu_active from the imported gpuConfigOptions (or fall back to the old gpu flag) before those new conditionals run, so the template renders successfully in all build configurations.

#SBATCH --account="${account}"
% endif
-% if gpu:
+% if is_gpu_active:


Suggestion: Use a guarded lookup such as globals().get('is_gpu_active', False) for the GPU block so that rendering still works even if only the old gpu flag is provided. [possible bug]

#SBATCH --account="${account}"
% endif
-% if gpu:
+% if is_gpu_active:


Suggestion: Guard the SBATCH GPU directives with a context lookup so missing is_gpu_active values don't crash the template. [possible bug]

@codeant-ai

codeant-ai bot commented Nov 19, 2025

CodeAnt AI finished reviewing your PR.

Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment


3 issues found across 15 files (reviewed changes from recent commits).

Prompt for AI agents (all 3 issues)

Understand the root cause of the following 3 issues and fix them.


<file name="toolchain/templates/delta.mako">

<violation number="1" location="toolchain/templates/delta.mako:35">
`gpuConfigOptions` is referenced here but the template no longer imports it, so rendering the delta SLURM template now fails with a NameError instead of launching jobs. Reintroduce the state import before using the enum value.</violation>
</file>

<file name="toolchain/templates/include/helpers.mako">

<violation number="1" location="toolchain/templates/include/helpers.mako:2">
The newly added `from mfc.state import gpuConfigOptions` line is indented even though it is top-level Python inside `&lt;%! ... %&gt;`, so the template module fails to compile with `IndentationError`, breaking every template that includes helpers.mako.</violation>
</file>

<file name="toolchain/templates/phoenix.mako">

<violation number="1" location="toolchain/templates/phoenix.mako:20">
`is_gpu_active` is referenced here, but the phoenix template never defines or imports that name (it only exists inside the helpers namespace), so rendering this template will raise `NameError` before the batch script is produced.</violation>
</file>


ok ":) Loading modules:\n"
cd "${MFC_ROOT_DIR}"
-. ./mfc.sh load -c d -m ${'g' if gpu else 'c'}
+. ./mfc.sh load -c d -m ${'g' if gpu != gpuConfigOptions.NONE.value else 'c'}
Contributor

@cubic-dev-ai cubic-dev-ai bot Nov 19, 2025


gpuConfigOptions is referenced here but the template no longer imports it, so rendering the delta SLURM template now fails with a NameError instead of launching jobs. Reintroduce the state import before using the enum value.

Prompt for AI agents
Address the following comment on toolchain/templates/delta.mako at line 35:

<comment>`gpuConfigOptions` is referenced here but the template no longer imports it, so rendering the delta SLURM template now fails with a NameError instead of launching jobs. Reintroduce the state import before using the enum value.</comment>

<file context>
@@ -33,7 +32,7 @@ ${helpers.template_prologue()}
 ok &quot;:) Loading modules:\n&quot;
 cd &quot;${MFC_ROOT_DIR}&quot;
-. ./mfc.sh load -c d -m ${&#39;g&#39; if gpu else &#39;c&#39;}
+. ./mfc.sh load -c d -m ${&#39;g&#39; if gpu != gpuConfigOptions.NONE.value else &#39;c&#39;}
 cd - &gt; /dev/null
 echo
</file context>

✅ Addressed in 3503f95

@@ -1,4 +1,8 @@
-<%! import os %>
+<%! import os
+    from mfc.state import gpuConfigOptions
Contributor

@cubic-dev-ai cubic-dev-ai bot Nov 19, 2025


The newly added from mfc.state import gpuConfigOptions line is indented even though it is top-level Python inside <%! ... %>, so the template module fails to compile with IndentationError, breaking every template that includes helpers.mako.

Prompt for AI agents
Address the following comment on toolchain/templates/include/helpers.mako at line 2:

<comment>The newly added `from mfc.state import gpuConfigOptions` line is indented even though it is top-level Python inside `&lt;%! ... %&gt;`, so the template module fails to compile with `IndentationError`, breaking every template that includes helpers.mako.</comment>

<file context>
@@ -1,4 +1,8 @@
-&lt;%! import os %&gt;
+&lt;%! import os 
+    from mfc.state import gpuConfigOptions
+%&gt;
+
</file context>
Suggested change
-    from mfc.state import gpuConfigOptions
+from mfc.state import gpuConfigOptions

✅ Addressed in 214ee17

#SBATCH --qos=${quality_of_service}
% endif
-% if gpu:
+% if is_gpu_active:
Contributor

@cubic-dev-ai cubic-dev-ai bot Nov 19, 2025


is_gpu_active is referenced here, but the phoenix template never defines or imports that name (it only exists inside the helpers namespace), so rendering this template will raise NameError before the batch script is produced.

Prompt for AI agents
Address the following comment on toolchain/templates/phoenix.mako at line 20:

<comment>`is_gpu_active` is referenced here, but the phoenix template never defines or imports that name (it only exists inside the helpers namespace), so rendering this template will raise `NameError` before the batch script is produced.</comment>

<file context>
@@ -18,7 +17,7 @@
 #SBATCH --qos=${quality_of_service}
 % endif
-% if gpu != gpuConfigOptions.NONE.value:
+% if is_gpu_active:
 #SBATCH --gres=gpu:V100:${tasks_per_node}
 #SBATCH --mem-per-gpu=16G\
</file context>

Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment


1 issue found across 15 files (reviewed changes from recent commits).

Prompt for AI agents (all 1 issues)

Understand the root cause of the following 1 issues and fix them.


<file name="toolchain/templates/delta.mako">

<violation number="1" location="toolchain/templates/delta.mako:35">
`gpuConfigOptions` is used in the updated module-load command even though this template no longer imports it, so Delta submissions will crash with a NameError before running; reintroduce the import (or qualify the reference) so the CPU/GPU selection works.</violation>
</file>


ok ":) Loading modules:\n"
cd "${MFC_ROOT_DIR}"
-. ./mfc.sh load -c d -m ${'g' if gpu else 'c'}
+. ./mfc.sh load -c d -m ${'g' if gpu != gpuConfigOptions.NONE.value else 'c'}
Contributor

@cubic-dev-ai cubic-dev-ai bot Nov 19, 2025


gpuConfigOptions is used in the updated module-load command even though this template no longer imports it, so Delta submissions will crash with a NameError before running; reintroduce the import (or qualify the reference) so the CPU/GPU selection works.

Prompt for AI agents
Address the following comment on toolchain/templates/delta.mako at line 35:

<comment>`gpuConfigOptions` is used in the updated module-load command even though this template no longer imports it, so Delta submissions will crash with a NameError before running; reintroduce the import (or qualify the reference) so the CPU/GPU selection works.</comment>

<file context>
@@ -33,7 +32,7 @@ ${helpers.template_prologue()}
 ok &quot;:) Loading modules:\n&quot;
 cd &quot;${MFC_ROOT_DIR}&quot;
-. ./mfc.sh load -c d -m ${&#39;g&#39; if gpu else &#39;c&#39;}
+. ./mfc.sh load -c d -m ${&#39;g&#39; if gpu != gpuConfigOptions.NONE.value else &#39;c&#39;}
 cd - &gt; /dev/null
 echo
</file context>

@sbryngelson
Member

/improve

Comment on lines +17 to 20
% if gpu != gpuConfigOptions.NONE.value:
#SBATCH --gpus-per-node=${tasks_per_node}
#SBATCH --mem=208G
#SBATCH --gpu-bind=closest
Contributor


Suggestion: To ensure consistency and prevent potential NameError exceptions, replace the direct GPU check in delta.mako with the centralized is_gpu_active variable. [possible issue, importance: 9]

New proposed code:
-% if gpu != gpuConfigOptions.NONE.value:
+% if is_gpu_active:
 #SBATCH --gpus-per-node=${tasks_per_node}
 #SBATCH --mem=208G
 #SBATCH --gpu-bind=closest
 % endif

ok ":) Loading modules:\n"
cd "${MFC_ROOT_DIR}"
-. ./mfc.sh load -c d -m ${'g' if gpu else 'c'}
+. ./mfc.sh load -c d -m ${'g' if gpu != gpuConfigOptions.NONE.value else 'c'}
Contributor


Suggestion: To maintain consistency and prevent potential errors, replace the inline GPU check with the is_gpu_active variable for module loading. [possible issue, importance: 9]

Suggested change
-. ./mfc.sh load -c d -m ${'g' if gpu != gpuConfigOptions.NONE.value else 'c'}
+. ./mfc.sh load -c d -m ${'g' if is_gpu_active else 'c'}

${helpers.template_prologue()}

ok ":) Loading modules:\n"
cd "${MFC_ROOTDIR}"
Contributor


Suggestion: Correct the typo in the environment variable from MFC_ROOTDIR to MFC_ROOT_DIR to prevent a script failure. [possible issue, importance: 10]

Suggested change
-cd "${MFC_ROOTDIR}"
+cd "${MFC_ROOT_DIR}"

@sbryngelson sbryngelson requested a review from a team November 19, 2025 23:39
@codecov

codecov bot commented Nov 20, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 44.34%. Comparing base (22af239) to head (e5315b2).

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1055   +/-   ##
=======================================
  Coverage   44.34%   44.34%           
=======================================
  Files          71       71           
  Lines       20578    20578           
  Branches     1990     1990           
=======================================
  Hits         9125     9125           
  Misses      10308    10308           
  Partials     1145     1145           

☔ View full report in Codecov by Sentry.
@sbryngelson
Member

hmmm it's failing a bunch of tests

@sbryngelson
Member

hi @prathi-wind can you explain what's going on with this PR?
