Contributor

@akifcorduk akifcorduk commented Sep 9, 2025

Description

This PR restructures the heuristics to create a natural balance between solution generation and improvement.
The FP/FJ loop now adds each solution it finds to the population, and we exit the loop and run the population improvement only once we have enough diverse solutions. The diversity is increased to sqrt(n_integers). If stagnation is detected in the FP/FJ loop, the recombiners are run between the current best solution and every other solution in the current population, and then the loop continues. The bounds propagation rounding in the context of FP is also improved. When the dual simplex solution is set, PDLP is now warm started with both the primal and dual solutions.
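
As a rough illustration of the generation/improvement balance described above, here is a minimal sketch of a population that only accepts sufficiently different solutions. Everything in it is an assumption made for illustration: `diverse_population`, `try_add`, `target_size`, and the use of Hamming distance as the diversity measure are hypothetical and are not taken from the PR's code.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical representation: one value per integer variable.
using solution_t = std::vector<int>;

// Hamming distance over the integer variables (assumed diversity metric).
inline std::size_t hamming_distance(const solution_t& a, const solution_t& b)
{
  std::size_t d = 0;
  for (std::size_t i = 0; i < a.size(); ++i) { d += (a[i] != b[i]) ? 1 : 0; }
  return d;
}

// Hypothetical population that keeps only mutually diverse solutions.
struct diverse_population {
  std::vector<solution_t> solutions;
  std::size_t min_distance;  // floor(sqrt(n_integers)), following the PR description
  std::size_t target_size;   // assumed number of solutions needed before improvement

  diverse_population(std::size_t n_integers, std::size_t target)
    : min_distance(static_cast<std::size_t>(std::sqrt(static_cast<double>(n_integers)))),
      target_size(target)
  {
  }

  // Accept a candidate only if it is far enough from every stored solution.
  bool try_add(const solution_t& candidate)
  {
    for (const auto& other : solutions) {
      if (hamming_distance(candidate, other) < min_distance) { return false; }
    }
    solutions.push_back(candidate);
    return true;
  }

  // The generation loop would stop once this returns true.
  bool is_diverse_enough() const { return solutions.size() >= target_size; }
};
```

In this sketch the FP/FJ loop would call `try_add` on each new solution and leave the loop for population improvement once `is_diverse_enough()` returns true.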

The default tolerances are now 1e-6 (absolute) and 1e-12 (relative).
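
To make the two tolerances concrete, here is a small sketch of one common way to combine an absolute and a relative tolerance when checking a constraint `lhs <= rhs`. The helper name `within_tolerance` and the combination rule `abs_tol + rel_tol * |rhs|` are assumptions; the PR does not state how the solver applies the two values.

```cpp
#include <cmath>

// Hypothetical helper: accepts lhs <= rhs up to a combined tolerance.
// The rule abs_tol + rel_tol * |rhs| is an assumed convention, not the solver's documented one.
inline bool within_tolerance(double lhs,
                             double rhs,
                             double abs_tol = 1e-6,
                             double rel_tol = 1e-12)
{
  const double violation = lhs - rhs;
  return violation <= abs_tol + rel_tol * std::fabs(rhs);
}
```

With these defaults the relative term only starts to matter for right-hand sides of very large magnitude; for values near 1 the absolute tolerance dominates.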

This PR includes bug fixes on:

  • Appearance of inf/nan in the z vector in dual simplex phase 2.
  • Invalid launch dimensions on FJ and hash kernels.
  • Timer diff and function time limit issues when the solver is run with unlimited time limit.
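
The PR does not describe the root cause of the timer issue. As one possible illustration of handling an unlimited time limit safely, the sketch below keeps the limit as a floating-point value and treats infinity explicitly. The class `deadline_timer` and its methods are hypothetical and are not the solver's timer.

```cpp
#include <algorithm>
#include <chrono>
#include <cmath>
#include <limits>

// Hypothetical timer that tolerates an infinite (unlimited) time limit.
class deadline_timer {
 public:
  explicit deadline_timer(double time_limit_seconds)
    : start_(std::chrono::steady_clock::now()), limit_(time_limit_seconds)
  {
  }

  // Seconds elapsed since construction.
  double elapsed() const
  {
    return std::chrono::duration<double>(std::chrono::steady_clock::now() - start_).count();
  }

  // Remaining budget; stays +infinity when the limit is unlimited, never negative.
  double remaining() const
  {
    if (std::isinf(limit_)) { return std::numeric_limits<double>::infinity(); }
    const double r = limit_ - elapsed();
    return r > 0.0 ? r : 0.0;
  }

  bool expired() const { return remaining() <= 0.0; }

  // Budget for a sub-routine: the requested amount, capped by what is left overall.
  double sub_limit(double requested_seconds) const
  {
    return std::min(requested_seconds, remaining());
  }

 private:
  std::chrono::steady_clock::time_point start_;
  double limit_;
};
```

Keeping the limit as a `double` avoids converting infinity into an integer duration, which is one way such "unlimited" runs can go wrong.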

Benchmark results for a 10-minute run on an H100:

  • Main branch: 207 feasible solutions, average gap 28.54, 3 unfinished/crashed
  • This PR: 213 feasible solutions, average gap 23.11, 1 unfinished/crashed. (The PR had no crashes before merging with the main branch.)

closes #142
closes #374
closes #218

@akifcorduk akifcorduk marked this pull request as ready for review September 16, 2025 16:15
@akifcorduk akifcorduk requested review from a team as code owners September 16, 2025 16:15
context(context_)
{
setup(problem_);
// setup(problem_);
Contributor

Should we remove setup completely?

Contributor Author

There is a bug in the load-balanced setup; @kaatish is fixing it. This should normally be enabled. I will check whether it is resolved with Aatish's recommendation.

if (settings.heuristic_preemption_callback != nullptr) {
settings.heuristic_preemption_callback();
}
// FIXME: rarely dual simplex detects infeasible whereas it is feasible.
Contributor

So if heuristics has a feasible solution, but branch and bound says it is infeasible, you take the feasible solution?

Contributor Author

Yes, that's what I wanted to do for now.

for (i_t k = 0; k < delta_z_nz; ++k) {
const i_t j = delta_z_indices[k];
z[j] += step_length * delta_z[j];
if (std::isnan(z[j]) || std::isinf(z[j])) { return -1; }
Contributor

This is a performance critical loop. Hopefully the compiler is smart about this, but would you mind doing something like:

f_t zj = z[j] + step_length * delta_z[j];
z[j] = zj;
if (zj != zj || std::isinf(zj)) {
 return -1;
}

Also, can you file an issue so that we can track that this should be removed?

Contributor Author

I think isfinite is probably better suited as a single instruction; let me know if you think it is better.

Contributor Author

Oh, you are worried about the double access. I am 90% sure the compiler will optimize this, but sure, I can change it to use a single access.
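
For reference, a minimal stand-alone sketch that combines the two suggestions in this thread (a single access to `z[j]` and `std::isfinite`) could look like the following. The function name `update_z` and its signature are hypothetical; only the identifiers inside the loop come from the snippet under review.

```cpp
#include <cmath>
#include <vector>

// Hypothetical stand-alone version of the update loop under review.
// Returns -1 as soon as an updated entry is NaN or infinite, 0 otherwise.
template <typename i_t, typename f_t>
int update_z(std::vector<f_t>& z,
             const std::vector<f_t>& delta_z,
             const std::vector<i_t>& delta_z_indices,
             i_t delta_z_nz,
             f_t step_length)
{
  for (i_t k = 0; k < delta_z_nz; ++k) {
    const i_t j  = delta_z_indices[k];
    const f_t zj = z[j] + step_length * delta_z[j];  // read z[j] once
    z[j]         = zj;                               // write it back once
    if (!std::isfinite(zj)) { return -1; }           // covers both NaN and +/-inf
  }
  return 0;
}
```

`std::isfinite` returns false for NaN and for both infinities, so it replaces the separate `isnan` and `isinf` checks with one call.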

Contributor Author

We might not need this at all because the bound_flipping_ratio test has solved the numerical issue.

Contributor Author

Indeed, we don't need this. Removed.

Contributor

Is there a reason why we are limiting the number of threads to 8? Isn't it better to let it be a user-controlled setting?

Contributor Author

Actually yes, but since we don't have a thread pool yet, it is kept separate for now. Once we have a thread pool, we should enable a more dynamic strategy.
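
As a sketch of what a user-controlled setting with a fallback cap might look like, consider the helper below. The function `resolve_num_threads`, its parameters, and the convention that a non-positive setting means "use the built-in cap" are all assumptions; only the cap of 8 comes from this thread.

```cpp
#include <algorithm>
#include <thread>

// Hypothetical helper: pick a thread count from a user setting,
// falling back to a fixed cap when the setting is not positive.
inline int resolve_num_threads(int user_setting, int hard_cap = 8)
{
  const unsigned hw   = std::thread::hardware_concurrency();  // may be 0 if unknown
  const int available = (hw == 0) ? 1 : static_cast<int>(hw);
  const int requested = (user_setting > 0) ? user_setting : hard_cap;
  return std::max(1, std::min(requested, available));
}
```

The result is always at least 1 and never exceeds the hardware concurrency reported by the standard library.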

#endif
logger_.set_level(default_level());
logger_.flush_on(rapids_logger::level_enum::info);
logger_.flush_on(rapids_logger::level_enum::debug);
Contributor

Maybe we should keep the flush in the info option unless there is a performance penalty.

Contributor Author

Debug is disabled in production, so there will be no performance penalty.

Contributor

@aliceb-nv aliceb-nv left a comment

LGTM engine side, approving; thanks a lot for the great work Akif!



option(BUILD_MIP_BENCHMARKS "Build MIP benchmarks" OFF)
set(BUILD_MIP_BENCHMARKS ON)
Contributor

Debugging leftover I assume :)
On my setup I add --cmake-args=\"-DBUILD_MIP_BENCHMARKS=ON\" to my build.sh command which does the trick

Contributor Author

Makes sense, thanks, I will do that. I think we should add a build.sh command for this.

Comment on lines +97 to +100
if (nonbasic_entering == -1) {
// -1,-2 and -3 are reserved for other things
return -4;
}
Contributor

This issue is tracked, right?

Contributor Author

I will create an issue, thanks!

@akifcorduk
Contributor Author

/merge

@rapids-bot rapids-bot bot merged commit 3307a41 into NVIDIA:branch-25.10 Sep 18, 2025
201 of 202 checks passed
copy-pr-bot bot pushed a commit that referenced this pull request Sep 22, 2025
…uristics (#382)

Authors:
  - Akif ÇÖRDÜK (https://github.com/akifcorduk)

Approvers:
  - Ramakrishnap (https://github.com/rgsl888prabhu)
  - Alice Boucher (https://github.com/aliceb-nv)

URL: #382
Labels

improvement (Improves an existing functionality), non-breaking (Introduces a non-breaking change)
