Skip to content

Fix variable bound violation in CPUFJ moves#930

Merged
rapids-bot[bot] merged 3 commits intoNVIDIA:mainfrom
aliceb-nv:cbs-cta-fix
Mar 4, 2026
Merged

Fix variable bound violation in CPUFJ moves#930
rapids-bot[bot] merged 3 commits intoNVIDIA:mainfrom
aliceb-nv:cbs-cta-fix

Conversation

@aliceb-nv
Copy link
Contributor

CPUFJ did not ensure that moves generated were clamped to the variable bounds. This allowed very slight (but within tolerance) bound violations accumulate, resulting in numerically flawed objective values.
This fixes a bug seen on cbs-cta where solutions were found that beat the optimal of 0.

Description

Issue

Checklist

  • I am familiar with the Contributing Guidelines.
  • Testing
    • New or existing tests cover these changes
    • Added tests
    • Created an issue to follow-up
    • NA
  • Documentation
    • The documentation is up to date with these changes
    • Added new documentation
    • NA

@aliceb-nv aliceb-nv added this to the 26.04 milestone Mar 4, 2026
@aliceb-nv aliceb-nv requested a review from a team as a code owner March 4, 2026 17:14
@aliceb-nv aliceb-nv requested review from Kh4ster and hlinsen March 4, 2026 17:14
@aliceb-nv aliceb-nv added bug Something isn't working non-breaking Introduces a non-breaking change labels Mar 4, 2026
@copy-pr-bot
Copy link

copy-pr-bot bot commented Mar 4, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@aliceb-nv
Copy link
Contributor Author

/ok to test 42267dd

@coderabbitai
Copy link

coderabbitai bot commented Mar 4, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7daa9100-1883-4cf8-84b1-5e1146b9a8a8

📥 Commits

Reviewing files that changed from the base of the PR and between 42267dd and d07e2ae.

📒 Files selected for processing (1)
  • cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu
🚧 Files skipped from review as they are similar to previous changes (1)
  • cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu

📝 Walkthrough

Walkthrough

The change updates the feasibility jump CPU implementation to consistently use incumbent objective tracking instead of best objective, adds defensive bounds clamping in variable assignment and constraint recomputation, and refines move application logic to ensure variables remain within bounds.

Changes

Cohort / File(s) Summary
Objective reporting and bounds clamping
cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu
Updated all objective references from h_best_objective to h_incumbent_objective in logging and improvement callbacks. Refined apply_move to compute new_val, clamp to variable bounds, snap to integer when applicable, and recompute delta with finiteness/bounds validation. Added defensive bounds clamping in recompute_lhs before LHS and constraint updates.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Fix variable bound violation in CPUFJ moves' accurately and concisely describes the main change: fixing a bug where variable bound violations were not being prevented in CPUFJ move generation.
Description check ✅ Passed The description clearly explains the bug (moves not clamped to bounds), the consequence (numerical objective flaws), and the affected case (cbs-cta), all directly related to the changeset.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Tip

Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs).
Share your feedback on Discord.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu (1)

764-779: ⚠️ Potential issue | 🔴 Critical

Propagate the clamped delta through all state updates.

new_val is clamped, but LHS/objective logic still uses the original delta. When clamp changes the move, h_assignment can diverge from h_lhs/h_incumbent_objective, so drift can still accumulate.

Proposed fix (use applied_delta consistently)
 static void apply_move(fj_cpu_climber_t<i_t, f_t>& fj_cpu,
                        i_t var_idx,
                        f_t delta,
                        bool localmin = false)
 {
+  f_t old_val = fj_cpu.h_assignment[var_idx];
+  f_t new_val = old_val + delta;
+  if (is_integer_var<i_t, f_t>(fj_cpu, var_idx)) {
+    cuopt_assert(fj_cpu.view.pb.integer_equal(new_val, round(new_val)), "new_val is not integer");
+    new_val = round(new_val);
+  }
+  new_val = std::min(std::max(new_val, get_lower(fj_cpu.h_var_bounds[var_idx].get())),
+                     get_upper(fj_cpu.h_var_bounds[var_idx].get()));
+  f_t applied_delta = new_val - old_val;
+
   // Update the LHSs of all involved constraints.
   ...
-    f_t y                          = cstr_coeff * delta - fj_cpu.h_lhs_sumcomp[cstr_idx];
+    f_t y                          = cstr_coeff * applied_delta - fj_cpu.h_lhs_sumcomp[cstr_idx];
   ...
-  f_t new_val = fj_cpu.h_assignment[var_idx] + delta;
-  ...
   fj_cpu.h_assignment[var_idx] = new_val;
-  fj_cpu.h_incumbent_objective += fj_cpu.h_obj_coeffs[var_idx] * delta;
+  fj_cpu.h_incumbent_objective += fj_cpu.h_obj_coeffs[var_idx] * applied_delta;
-  if (delta > 0) {
+  if (applied_delta > 0) {

As per coding guidelines: "Validate algorithm correctness in optimization logic: ... constraint/objective handling must produce correct results" and "Check numerical stability: prevent ... precision loss ...".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu` around lines 764 - 779,
The code clamps new_val but continues to use the original delta, causing
h_assignment to diverge from h_lhs and h_incumbent_objective; compute an
applied_delta = new_val - old_value (old_value is the prior
fj_cpu.h_assignment[var_idx]) immediately after clamping and then use
applied_delta everywhere the move is applied (replace uses of delta for updating
fj_cpu.h_lhs, fj_cpu.h_incumbent_objective, and any other state updates that
depend on the actual change); ensure checks
(cuopt_assert/check_variable_within_bounds/isfinite) still refer to new_val and
that h_obj_coeffs[var_idx] is multiplied by applied_delta when updating the
objective.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cpp/src/mip_heuristics/feasibility_jump/feasibility_jump_kernels.cu`:
- Around line 592-599: The kernel clamps new_val but still uses
fj.jump_move_delta when updating fj.incumbent_lhs and fj.incumbent_objective,
leading to inconsistent state; change the logic to compute the clamped new_val
from fj.incumbent_assignment[var_idx] and variable_bounds (using
get_lower/get_upper), then compute applied_delta = new_val -
fj.incumbent_assignment[var_idx] and use applied_delta for all subsequent
updates to fj.incumbent_lhs and fj.incumbent_objective (and any other places
that currently use jump_move_delta), so the stored assignment, LHS and objective
reflect the actual clamped step; apply the same fix at the other similar blocks
mentioned around the earlier (560-587) and later (654-657) ranges.

---

Outside diff comments:
In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu`:
- Around line 764-779: The code clamps new_val but continues to use the original
delta, causing h_assignment to diverge from h_lhs and h_incumbent_objective;
compute an applied_delta = new_val - old_value (old_value is the prior
fj_cpu.h_assignment[var_idx]) immediately after clamping and then use
applied_delta everywhere the move is applied (replace uses of delta for updating
fj_cpu.h_lhs, fj_cpu.h_incumbent_objective, and any other state updates that
depend on the actual change); ensure checks
(cuopt_assert/check_variable_within_bounds/isfinite) still refer to new_val and
that h_obj_coeffs[var_idx] is multiplied by applied_delta when updating the
objective.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2c0c0fb1-61d1-41b5-851e-f10d7599ac0a

📥 Commits

Reviewing files that changed from the base of the PR and between 0aa1b31 and 42267dd.

📒 Files selected for processing (2)
  • cpp/src/mip_heuristics/feasibility_jump/feasibility_jump_kernels.cu
  • cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu

Comment on lines +592 to +599
f_t new_val = fj.incumbent_assignment[var_idx] + fj.jump_move_delta[var_idx];

cuopt_assert(fj.pb.check_variable_within_bounds(var_idx, new_val),
"assignment not within bounds");
// clamping to err on the safe size - assert catches this
auto bounds = fj.pb.variable_bounds[var_idx];
f_t lb = get_lower(bounds);
f_t ub = get_upper(bounds);
new_val = min(max(new_val, lb), ub);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Use clamped applied delta for kernel LHS/objective updates.

The kernel clamps new_val, but incumbent_lhs and incumbent_objective are still updated with jump_move_delta. If clamp changes the step, state becomes inconsistent.

Proposed fix (shared clamped value + shared applied_delta)
 __global__ void update_assignment_kernel(...){
+  __shared__ f_t clamped_new_val;
+  __shared__ f_t applied_delta;
   ...
+  if (FIRST_THREAD) {
+    auto bounds = fj.pb.variable_bounds[var_idx];
+    f_t old_val = fj.incumbent_assignment[var_idx];
+    f_t lb = get_lower(bounds);
+    f_t ub = get_upper(bounds);
+    clamped_new_val = min(max(old_val + fj.jump_move_delta[var_idx], lb), ub);
+    applied_delta = clamped_new_val - old_val;
+  }
+  __syncthreads();
+
   for (auto i = offset_begin + blockIdx.x; i < offset_end; i += gridDim.x) {
     ...
-    f_t new_lhs  = old_lhs + cstr_coeff * fj.jump_move_delta[var_idx];
+    f_t new_lhs  = old_lhs + cstr_coeff * applied_delta;
     ...
-    f_t y = cstr_coeff * fj.jump_move_delta[var_idx] - fj.incumbent_lhs_sumcomp[cstr_idx];
+    f_t y = cstr_coeff * applied_delta - fj.incumbent_lhs_sumcomp[cstr_idx];
   }
   ...
   if (FIRST_THREAD) {
-    fj.incumbent_assignment[var_idx] = new_val;
-    *fj.incumbent_objective += fj.pb.objective_coefficients[var_idx] * fj.jump_move_delta[var_idx];
+    fj.incumbent_assignment[var_idx] = clamped_new_val;
+    *fj.incumbent_objective += fj.pb.objective_coefficients[var_idx] * applied_delta;
   }
 }

As per coding guidelines: "Validate algorithm correctness in optimization logic: ... constraint/objective handling must produce correct results" and "Check numerical stability: prevent ... precision loss ...".

Also applies to: 560-587, 654-657

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/feasibility_jump/feasibility_jump_kernels.cu` around
lines 592 - 599, The kernel clamps new_val but still uses fj.jump_move_delta
when updating fj.incumbent_lhs and fj.incumbent_objective, leading to
inconsistent state; change the logic to compute the clamped new_val from
fj.incumbent_assignment[var_idx] and variable_bounds (using
get_lower/get_upper), then compute applied_delta = new_val -
fj.incumbent_assignment[var_idx] and use applied_delta for all subsequent
updates to fj.incumbent_lhs and fj.incumbent_objective (and any other places
that currently use jump_move_delta), so the stored assignment, LHS and objective
reflect the actual clamped step; apply the same fix at the other similar blocks
mentioned around the earlier (560-587) and later (654-657) ranges.

@aliceb-nv
Copy link
Contributor Author

/ok to test d07e2ae

@copy-pr-bot
Copy link

copy-pr-bot bot commented Mar 4, 2026

/ok to test d07e2ae

@aliceb-nv, there was an error processing your request: E2

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/2/

@aliceb-nv
Copy link
Contributor Author

/ok to test d07e2ae

@aliceb-nv
Copy link
Contributor Author

/merge

@rapids-bot rapids-bot bot merged commit d61b196 into NVIDIA:main Mar 4, 2026
182 of 185 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working non-breaking Introduces a non-breaking change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants