[AP][Solver] Ignored Disconnected Blocks in AP Solver #3152

Merged

Conversation

AlexandreSinger
Contributor

After investigating some of the slowest running testcases, I realized that we were not handling disconnected blocks in the solver.

After we started thresholding out high-fanout nets, some circuits were taking far longer to solve than they should, with much of that time spent setting up the matrices. It turned out that many blocks were completely disconnected from the rest of the circuit. There is no reason to optimize the location of these blocks, since the AP objective is formulated based on net connectivity. As such, these disconnected blocks should be completely ignored during placement.

Ignoring these blocks reduces the number of variables in the A matrix, which can greatly improve runtime. Early results on Titan show up to a 3.5x improvement in GP runtime, and a 20% improvement on average.
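
As a rough illustration of the idea (not the code in this PR), the sketch below gives a matrix row only to blocks that touch at least one non-ignored net with two or more pins. The `Net` struct, `kNoRow`, and `assign_solver_rows` are hypothetical names invented for this example, and fixed blocks are not modelled.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified netlist: each net lists the blocks it connects.
struct Net {
    std::vector<std::size_t> blocks;
    bool ignored = false;  // e.g. thresholded out because of high fanout
};

// Sentinel meaning "this block has no row in the solver matrix".
constexpr std::size_t kNoRow = static_cast<std::size_t>(-1);

// Returns, for each block, its row index in the solver matrix, or kNoRow if
// the block is disconnected. num_rows receives the number of rows assigned.
std::vector<std::size_t> assign_solver_rows(std::size_t num_blocks,
                                            const std::vector<Net>& nets,
                                            std::size_t& num_rows) {
    std::vector<bool> connected(num_blocks, false);
    for (const Net& net : nets) {
        // Ignored nets and single-pin nets add no connectivity term.
        if (net.ignored || net.blocks.size() < 2) continue;
        for (std::size_t blk : net.blocks) connected[blk] = true;
    }

    // Only connected blocks become variables in the linear system.
    std::vector<std::size_t> row(num_blocks, kNoRow);
    num_rows = 0;
    for (std::size_t blk = 0; blk < num_blocks; ++blk) {
        if (connected[blk]) row[blk] = num_rows++;
    }
    return row;
}
```

With four blocks, one live net connecting blocks 0 and 1, one ignored net connecting blocks 1 and 3, and a single-pin net on block 2, only blocks 0 and 1 receive rows, so the system shrinks from four variables to two.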

Future work is to be more methodical about which nets to mark as ignored. The AP flow currently does not directly mark signals like clocks as ignored; doing so may allow us to label more blocks as disconnected.

@github-actions bot added the "VPR" (VPR FPGA Placement & Routing Tool) and "lang-cpp" (C/C++ code) labels on Jun 19, 2025
@AlexandreSinger
Contributor Author

Results on Titan (timing-driven, no fixed blocks):

| Metric | Value normalized to AP Baseline |
| --- | --- |
| post_fl_hpwl | 0.99 |
| post_dp_hpwl | 1.00 |
| total_wirelength | 1.00 |
| post_fl_cpd | 1.00 |
| post_dp_cpd | 1.01 |
| crit_path_delay | 1.01 |
| ap_gp_runtime | 0.78 |
| ap_runtime | 0.93 |
| total_runtime | 0.94 |

The outliers are denoise and sparcT1_chip2, whose GP runtime improved by 3x.

Practically no loss in quality and a 6% improvement in overall runtime.

@AlexandreSinger
Contributor Author

In a future PR I will explore marking nets as global if they are a clock or a "non-clock global". This would match how the placer handles them.

Then I may re-sweep the high-fanout threshold since clocks will always be ignored.

Contributor

@amin1377 amin1377 left a comment

Looks good. Thanks!

@AlexandreSinger AlexandreSinger merged commit bd390e3 into verilog-to-routing:master Jun 19, 2025
33 checks passed
@AlexandreSinger AlexandreSinger deleted the feature-ap-solver branch June 19, 2025 13:29
Contributor

@haydar-c haydar-c left a comment

Thanks @AlexandreSinger, LGTM as well.

Contributor

@vaughnbetz vaughnbetz left a comment

LGTM but see one question on width/height


/// @brief The width of the device grid. Used for randomly generating points
/// on the grid.
size_t device_grid_width_;
Contributor

Why do you keep cached copies of the grid width and height in the solver instead of asking the grid for width and height?

Contributor Author

During Global Placement, the device size is assumed to be fixed. We also only really use this information in the first iteration. I preferred to just pass the device size in instead of holding on to a reference to the device itself.
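
To make the trade-off concrete, here is a minimal sketch of that design choice, assuming a hypothetical `SolverSketch` class (not the actual solver in this PR). It copies the two dimensions at construction and uses them to draw random points on the grid, so it never needs a reference to the device grid. The member names follow the snippet under review; everything else is invented for illustration, and both dimensions are assumed to be non-zero.

```cpp
#include <cstddef>
#include <random>
#include <utility>

// Hypothetical solver sketch: keeps only the grid dimensions it needs,
// rather than a reference to the device grid object.
class SolverSketch {
public:
    // Assumes both dimensions are greater than zero.
    SolverSketch(std::size_t device_grid_width, std::size_t device_grid_height)
        : device_grid_width_(device_grid_width),
          device_grid_height_(device_grid_height) {}

    // Draws a random (x, y) point within the bounds of the device grid.
    std::pair<double, double> random_point(std::mt19937& rng) const {
        std::uniform_real_distribution<double> x_dist(
            0.0, static_cast<double>(device_grid_width_));
        std::uniform_real_distribution<double> y_dist(
            0.0, static_cast<double>(device_grid_height_));
        double x = x_dist(rng);
        double y = y_dist(rng);
        return {x, y};
    }

private:
    /// Width of the device grid, copied at construction.
    std::size_t device_grid_width_;
    /// Height of the device grid, copied at construction.
    std::size_t device_grid_height_;
};
```

The cost of this approach is that the cached values would go stale if the device were ever resized, which is why it relies on the device size being fixed during Global Placement.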
