Fix variable bound violation in CPUFJ moves by aliceb-nv · Pull Request #930 · NVIDIA/cuopt

aliceb-nv · 2026-03-04T17:14:39Z

CPUFJ did not ensure that the moves it generated were clamped to the variable bounds. This allowed very slight (but within-tolerance) bound violations to accumulate, resulting in numerically flawed objective values.
This fixes a bug seen on cbs-cta, where solutions were found that beat the optimal objective value of 0.
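
As a rough illustration of the failure mode described above, the sketch below uses a toy problem (minimize the sum of `x_i` with `0 <= x_i <= 1`, so the optimum is 0) rather than the actual cbs-cta model. The helper names and the tolerance of `1e-6` are assumptions made for this example and are not taken from cuOpt. Every variable passes a tolerance-based bound check, yet the summed objective drops below the true optimum; strict clamping restores it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tolerance-based bound check (illustrative, not the cuOpt implementation).
inline bool within_bounds_tol(double val, double lb, double ub, double tol)
{
  return val >= lb - tol && val <= ub + tol;
}

// Strict clamp into [lb, ub].
inline double clamp_to_bounds(double val, double lb, double ub)
{
  return std::min(std::max(val, lb), ub);
}

// Objective of the toy problem: the plain sum of all variables.
inline double toy_objective(const std::vector<double>& x)
{
  double obj = 0.0;
  for (double v : x) { obj += v; }
  return obj;
}

// Builds n variables that each sit `violation` below a lower bound of 0.
inline std::vector<double> slightly_violated_assignment(std::size_t n, double violation)
{
  return std::vector<double>(n, -violation);
}

// Returns a copy of x with every entry clamped into [lb, ub].
inline std::vector<double> clamped_copy(std::vector<double> x, double lb, double ub)
{
  for (double& v : x) { v = clamp_to_bounds(v, lb, ub); }
  return x;
}
```

With 1000 variables each violating the lower bound by `5e-7`, the unclamped objective is negative even though each variable individually passes `within_bounds_tol`; the clamped copy evaluates to exactly 0.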

Description

Issue

Checklist

I am familiar with the Contributing Guidelines.
Testing
- New or existing tests cover these changes
- Added tests
- Created an issue to follow-up
- NA
Documentation
- The documentation is up to date with these changes
- Added new documentation
- NA

copy-pr-bot · 2026-03-04T17:14:45Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

aliceb-nv · 2026-03-04T17:17:58Z

/ok to test 42267dd

coderabbitai · 2026-03-04T17:19:49Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7daa9100-1883-4cf8-84b1-5e1146b9a8a8

📥 Commits

Reviewing files that changed from the base of the PR and between 42267dd and d07e2ae.

📒 Files selected for processing (1)

cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu

🚧 Files skipped from review as they are similar to previous changes (1)

cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu

📝 Walkthrough

The change updates the feasibility jump CPU implementation to consistently use incumbent objective tracking instead of best objective, adds defensive bounds clamping in variable assignment and constraint recomputation, and refines move application logic to ensure variables remain within bounds.
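
The "defensive bounds clamping in constraint recomputation" mentioned above could look roughly like the following. This is a sketch on a toy CSR problem, not the actual `recompute_lhs` from `fj_cpu.cu`; the struct and its field names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy problem in CSR form; field names are illustrative only.
struct toy_problem_t {
  std::vector<std::size_t> row_offsets;  // size = number of rows + 1
  std::vector<std::size_t> col_indices;  // variable index of each nonzero
  std::vector<double> coeffs;            // coefficient of each nonzero
  std::vector<double> lower;             // per-variable lower bound
  std::vector<double> upper;             // per-variable upper bound
};

// Clamps every variable into its bounds first, then rebuilds each row's LHS
// from scratch so that the LHS values match the (clamped) assignment.
inline std::vector<double> recompute_lhs_clamped(const toy_problem_t& pb,
                                                 std::vector<double>& assignment)
{
  assert(!pb.row_offsets.empty());
  assert(assignment.size() == pb.lower.size() && assignment.size() == pb.upper.size());
  for (std::size_t j = 0; j < assignment.size(); ++j) {
    assignment[j] = std::min(std::max(assignment[j], pb.lower[j]), pb.upper[j]);
  }
  const std::size_t n_rows = pb.row_offsets.size() - 1;
  std::vector<double> lhs(n_rows, 0.0);
  for (std::size_t r = 0; r < n_rows; ++r) {
    for (std::size_t k = pb.row_offsets[r]; k < pb.row_offsets[r + 1]; ++k) {
      lhs[r] += pb.coeffs[k] * assignment[pb.col_indices[k]];
    }
  }
  return lhs;
}
```

Clamping before the recomputation matters here: if the LHS were rebuilt from an out-of-bounds assignment, the refreshed values would simply bake the violation back in.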

Changes

| Cohort / File(s) | Summary |
|---|---|
| Objective reporting and bounds clamping `cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu` | Updated all objective references from h_best_objective to h_incumbent_objective in logging and improvement callbacks. Refined apply_move to compute new_val, clamp to variable bounds, snap to integer when applicable, and recompute delta with finiteness/bounds validation. Added defensive bounds clamping in recompute_lhs before LHS and constraint updates. |
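
The order of operations described for `apply_move` in the summary (compute `new_val`, clamp, snap to integer, recompute the delta, validate) can be sketched as a small standalone helper. `move_result_t` and `resolve_move` are hypothetical names for this example, and the sketch assumes that integer variables have integral bounds, so rounding after the clamp cannot leave the interval.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Result of resolving a requested move: the value actually assigned and the
// step actually taken (hypothetical type, for illustration only).
struct move_result_t {
  double new_val;
  double applied_delta;
};

// Resolves a requested delta into a bounded (and, if needed, integral) value.
// Assumes lb <= ub and that integer variables have integral bounds.
inline move_result_t resolve_move(double old_val, double delta, double lb, double ub, bool is_integer)
{
  double new_val = old_val + delta;
  new_val        = std::min(std::max(new_val, lb), ub);  // clamp to bounds
  if (is_integer) { new_val = std::round(new_val); }     // snap to integer
  assert(std::isfinite(new_val));
  assert(new_val >= lb && new_val <= ub);
  // Recompute the delta from the value that is actually stored.
  return {new_val, new_val - old_val};
}
```

For example, a request to move a continuous variable at its lower bound of 0 by `-1e-7` resolves to `new_val == 0` and `applied_delta == 0`, so the move becomes a no-op instead of a tiny violation.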

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'Fix variable bound violation in CPUFJ moves' accurately and concisely describes the main change: fixing a bug where variable bound violations were not being prevented in CPUFJ move generation. |
| Description check | ✅ Passed | The description clearly explains the bug (moves not clamped to bounds), the consequence (numerical objective flaws), and the affected case (cbs-cta), all directly related to the changeset. |

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu (1)

764-779: ⚠️ Potential issue | 🔴 Critical

Propagate the clamped delta through all state updates.

new_val is clamped, but LHS/objective logic still uses the original delta. When clamp changes the move, h_assignment can diverge from h_lhs/h_incumbent_objective, so drift can still accumulate.

Proposed fix (use applied_delta consistently)

 static void apply_move(fj_cpu_climber_t<i_t, f_t>& fj_cpu,
                        i_t var_idx,
                        f_t delta,
                        bool localmin = false)
 {
+  f_t old_val = fj_cpu.h_assignment[var_idx];
+  f_t new_val = old_val + delta;
+  if (is_integer_var<i_t, f_t>(fj_cpu, var_idx)) {
+    cuopt_assert(fj_cpu.view.pb.integer_equal(new_val, round(new_val)), "new_val is not integer");
+    new_val = round(new_val);
+  }
+  new_val = std::min(std::max(new_val, get_lower(fj_cpu.h_var_bounds[var_idx].get())),
+                     get_upper(fj_cpu.h_var_bounds[var_idx].get()));
+  f_t applied_delta = new_val - old_val;
+
   // Update the LHSs of all involved constraints.
   ...
-    f_t y                          = cstr_coeff * delta - fj_cpu.h_lhs_sumcomp[cstr_idx];
+    f_t y                          = cstr_coeff * applied_delta - fj_cpu.h_lhs_sumcomp[cstr_idx];
   ...
-  f_t new_val = fj_cpu.h_assignment[var_idx] + delta;
-  ...
   fj_cpu.h_assignment[var_idx] = new_val;
-  fj_cpu.h_incumbent_objective += fj_cpu.h_obj_coeffs[var_idx] * delta;
+  fj_cpu.h_incumbent_objective += fj_cpu.h_obj_coeffs[var_idx] * applied_delta;
-  if (delta > 0) {
+  if (applied_delta > 0) {

As per coding guidelines: "Validate algorithm correctness in optimization logic: ... constraint/objective handling must produce correct results" and "Check numerical stability: prevent ... precision loss ...".
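
To make the consistency argument concrete, here is a minimal sketch with a single variable that appears in several constraints. The struct and function names are hypothetical and the layout is far simpler than the real `fj_cpu_climber_t`; the point is only that every piece of incrementally maintained state is updated with the same `applied_delta`.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Toy state: one variable x that appears in several constraints.
struct toy_state_t {
  double x;                    // current assignment
  double lb;                   // lower bound
  double ub;                   // upper bound
  std::vector<double> coeffs;  // coefficient of x in each constraint
  std::vector<double> lhs;     // incrementally maintained LHS per constraint
  double obj_coeff;            // objective coefficient of x
  double objective;            // incrementally maintained objective
};

// Applies a move using the clamped step for every state update.
inline void apply_move_consistent(toy_state_t& s, double delta)
{
  const double old_val       = s.x;
  const double new_val       = std::min(std::max(old_val + delta, s.lb), s.ub);
  const double applied_delta = new_val - old_val;
  for (std::size_t i = 0; i < s.lhs.size(); ++i) {
    s.lhs[i] += s.coeffs[i] * applied_delta;
  }
  s.x = new_val;
  s.objective += s.obj_coeff * applied_delta;
}

// Largest gap between the incremental state and a from-scratch recomputation.
inline double max_drift(const toy_state_t& s)
{
  double drift = std::fabs(s.objective - s.obj_coeff * s.x);
  for (std::size_t i = 0; i < s.lhs.size(); ++i) {
    drift = std::max(drift, std::fabs(s.lhs[i] - s.coeffs[i] * s.x));
  }
  return drift;
}
```

Starting from `x = 0.5` in `[0, 1]` and requesting `delta = 1.0`, the applied step is `0.5`. Had the LHS been updated with the requested `delta` instead, the first constraint with coefficient 2 would hold `3` while the stored assignment implies `2`.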

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu` around lines 764 - 779,
The code clamps new_val but continues to use the original delta, causing
h_assignment to diverge from h_lhs and h_incumbent_objective; compute an
applied_delta = new_val - old_value (old_value is the prior
fj_cpu.h_assignment[var_idx]) immediately after clamping and then use
applied_delta everywhere the move is applied (replace uses of delta for updating
fj_cpu.h_lhs, fj_cpu.h_incumbent_objective, and any other state updates that
depend on the actual change); ensure checks
(cuopt_assert/check_variable_within_bounds/isfinite) still refer to new_val and
that h_obj_coeffs[var_idx] is multiplied by applied_delta when updating the
objective.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2c0c0fb1-61d1-41b5-851e-f10d7599ac0a

📥 Commits

Reviewing files that changed from the base of the PR and between 0aa1b31 and 42267dd.

📒 Files selected for processing (2)

cpp/src/mip_heuristics/feasibility_jump/feasibility_jump_kernels.cu
cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu

coderabbitai · 2026-03-04T17:19:52Z

cpp/src/mip_heuristics/feasibility_jump/feasibility_jump_kernels.cu

    f_t new_val = fj.incumbent_assignment[var_idx] + fj.jump_move_delta[var_idx];
-
    cuopt_assert(fj.pb.check_variable_within_bounds(var_idx, new_val),
                 "assignment not within bounds");
+    // clamping to err on the safe side - assert catches this
+    auto bounds = fj.pb.variable_bounds[var_idx];
+    f_t lb      = get_lower(bounds);
+    f_t ub      = get_upper(bounds);
+    new_val     = min(max(new_val, lb), ub);


⚠️ Potential issue | 🔴 Critical

Use clamped applied delta for kernel LHS/objective updates.

The kernel clamps new_val, but incumbent_lhs and incumbent_objective are still updated with jump_move_delta. If clamp changes the step, state becomes inconsistent.

Proposed fix (shared clamped value + shared applied_delta)

 __global__ void update_assignment_kernel(...)
 {
+  __shared__ f_t clamped_new_val;
+  __shared__ f_t applied_delta;
   ...
+  if (FIRST_THREAD) {
+    auto bounds = fj.pb.variable_bounds[var_idx];
+    f_t old_val = fj.incumbent_assignment[var_idx];
+    f_t lb = get_lower(bounds);
+    f_t ub = get_upper(bounds);
+    clamped_new_val = min(max(old_val + fj.jump_move_delta[var_idx], lb), ub);
+    applied_delta = clamped_new_val - old_val;
+  }
+  __syncthreads();
+
   for (auto i = offset_begin + blockIdx.x; i < offset_end; i += gridDim.x) {
     ...
-    f_t new_lhs = old_lhs + cstr_coeff * fj.jump_move_delta[var_idx];
+    f_t new_lhs = old_lhs + cstr_coeff * applied_delta;
     ...
-    f_t y = cstr_coeff * fj.jump_move_delta[var_idx] - fj.incumbent_lhs_sumcomp[cstr_idx];
+    f_t y = cstr_coeff * applied_delta - fj.incumbent_lhs_sumcomp[cstr_idx];
   }
   ...
   if (FIRST_THREAD) {
-    fj.incumbent_assignment[var_idx] = new_val;
-    *fj.incumbent_objective += fj.pb.objective_coefficients[var_idx] * fj.jump_move_delta[var_idx];
+    fj.incumbent_assignment[var_idx] = clamped_new_val;
+    *fj.incumbent_objective += fj.pb.objective_coefficients[var_idx] * applied_delta;
   }
 }

As per coding guidelines: "Validate algorithm correctness in optimization logic: ... constraint/objective handling must produce correct results" and "Check numerical stability: prevent ... precision loss ...".

Also applies to: 560-587, 654-657
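
The `y = cstr_coeff * applied_delta - fj.incumbent_lhs_sumcomp[cstr_idx]` line in the proposed fix has the shape of the first step of a Kahan (compensated) summation update. The rest of that update is elided in the diff, so the following is only the textbook form, written in plain C++ with hypothetical names, and it assumes `sumcomp` plays the role of the compensation term.

```cpp
#include <cassert>
#include <cmath>

// Running sum plus its compensation term (hypothetical type, for illustration).
struct compensated_sum_t {
  double sum  = 0.0;
  double comp = 0.0;
};

// Textbook Kahan update: adds `term` to the running sum while carrying the
// low-order bits lost to rounding in `comp`.
inline void kahan_add(compensated_sum_t& acc, double term)
{
  const double y = term - acc.comp;     // re-inject the previously lost bits
  const double t = acc.sum + y;         // low-order bits of y may be lost here
  acc.comp       = (t - acc.sum) - y;   // recover what was lost
  acc.sum        = t;
}
```

If the LHS really is maintained this way, feeding the update a `term` built from the unclamped delta would carry the inconsistency into both the sum and the compensation term, which is one more reason to derive the term from the applied step. Note that compensated summation relies on strict floating-point semantics and can be defeated by flags such as `-ffast-math`.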

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/feasibility_jump/feasibility_jump_kernels.cu` around lines 592 - 599, The kernel clamps new_val but still uses fj.jump_move_delta when updating fj.incumbent_lhs and fj.incumbent_objective, leading to inconsistent state; change the logic to compute the clamped new_val from fj.incumbent_assignment[var_idx] and variable_bounds (using get_lower/get_upper), then compute applied_delta = new_val - fj.incumbent_assignment[var_idx] and use applied_delta for all subsequent updates to fj.incumbent_lhs and fj.incumbent_objective (and any other places that currently use jump_move_delta), so the stored assignment, LHS and objective reflect the actual clamped step; apply the same fix at the other similar blocks mentioned around the earlier (560-587) and later (654-657) ranges.

aliceb-nv · 2026-03-04T17:42:31Z

/ok to test d07e2ae

copy-pr-bot · 2026-03-04T17:42:35Z

/ok to test d07e2ae

@aliceb-nv, there was an error processing your request: E2

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/2/

aliceb-nv · 2026-03-04T17:45:00Z

/ok to test d07e2ae

aliceb-nv · 2026-03-04T21:35:19Z

/merge

aliceb-nv added 2 commits March 4, 2026 07:11

bump

fc0106b

fix cpufj bug

42267dd

aliceb-nv added this to the 26.04 milestone Mar 4, 2026

aliceb-nv requested a review from a team as a code owner March 4, 2026 17:14

aliceb-nv requested review from Kh4ster and hlinsen March 4, 2026 17:14

aliceb-nv added the bug and non-breaking labels Mar 4, 2026

rg20 approved these changes Mar 4, 2026

coderabbitai bot reviewed Mar 4, 2026

ai review

d07e2ae

rapids-bot bot merged commit d61b196 into NVIDIA:main Mar 4, 2026
182 of 185 checks passed

rapids-bot[bot] merged 3 commits into NVIDIA:main from aliceb-nv:cbs-cta-fix


2 participants
