
Support Blocks of Multiple Weight Types? #126

Open · Licheng-Guo opened this issue Oct 3, 2022 · 11 comments

Comments

@Licheng-Guo

Dear authors,

Our lab has been working on partitioning a C++ FPGA design for parallel placement and routing. We are very interested in trying KaHyPar as a replacement for our current ILP solution.

However, in our situation, each block offers several types of physical resources (e.g., Look-Up Tables (LUTs), Flip-Flops (FFs), Block RAM (BRAM), DSPs, etc.), and each vertex of the graph consumes different amounts of each resource (e.g., one vertex may require 5 LUTs and 10 DSPs, another 10 FFs and 3 BRAMs, etc.).
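To make the setup concrete, here is a minimal sketch of what such a data model could look like (all names are hypothetical, not an existing KaHyPar API): each vertex carries a demand vector with one entry per resource type, each block a capacity vector, and feasibility means every resource fits in every block.

```cpp
#include <array>
#include <cstddef>
#include <vector>

constexpr int kNumResources = 4;  // LUT, FF, BRAM, DSP

using ResourceVector = std::array<int, kNumResources>;

struct Vertex {
  ResourceVector demand;  // e.g. {5, 0, 0, 10} = 5 LUTs and 10 DSPs
};

struct Block {
  ResourceVector capacity;  // physical resources available in this region
};

// A partition (blockOf[v] = block of vertex v) is feasible only if, for every
// block and every resource type, the summed demand stays within capacity.
bool isFeasible(const std::vector<Vertex>& vertices,
                const std::vector<int>& blockOf,
                const std::vector<Block>& blocks) {
  std::vector<ResourceVector> usage(blocks.size(), ResourceVector{});
  for (std::size_t v = 0; v < vertices.size(); ++v) {
    for (int r = 0; r < kNumResources; ++r) {
      usage[blockOf[v]][r] += vertices[v].demand[r];
    }
  }
  for (std::size_t b = 0; b < blocks.size(); ++b) {
    for (int r = 0; r < kNumResources; ++r) {
      if (usage[b][r] > blocks[b].capacity[r]) return false;
    }
  }
  return true;
}
```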

Do you have plans to extend the "variable block weights" feature to support multiple weight types? If not, how difficult would it be to implement this change?

Thanks in advance for any explanation!

Best,
Licheng

@SebastianSchlag
Member

Dear @Licheng-Guo,

Thank you very much for reaching out. Supporting multiple weights per vertex is indeed a feature that is currently missing in KaHyPar and its shared-memory parallelization Mt-KaHyPar. I believe that it would not be too difficult to support, but it might require small changes in a lot of places. Since we don't have much manpower at the moment, there are no concrete plans to address this in the near future.
However, if you or somebody else in your lab would be interested in contributing this feature, @kittobi1992, @larsgottesbueren, and I would be happy to help and provide guidance.

Also, it is worth noting that PaToH already supports multi-constraint partitioning.

@Licheng-Guo
Author

Thanks a lot for your reply, and I appreciate the pointer to PaToH! I will take a look at it first.

@Licheng-Guo
Author

Hi @SebastianSchlag,

A while ago, I asked about supporting different types of weights. We are still very interested in this feature. I wonder if you have an estimate of how much effort it would take to add this feature?

Another feature we need is allowing different weights for different block connections. For example, connections between blocks 1 and 2 might be more costly than connections between blocks 2 and 3. Do you think this is doable?

We'd be happy to contribute to the project if things are not too difficult :)

Thanks for your time!

Best,
Licheng

@Licheng-Guo reopened this Mar 9, 2023
@SebastianSchlag
Member

Hey @Licheng-Guo,

> A while ago, I asked about supporting different types of weights. We are still very interested in this feature. I wonder if you have an estimate of how much effort it would take to add this feature?

Starting from the hypergraph data structure, we'd have to propagate the feature through coarsening, initial partitioning, and the refinement algorithms. It should certainly be doable, but it is a non-trivial amount of work. At the moment, we don't have any active developers, though. @kittobi1992 and I are mostly just supporting the project in our spare time.

> Another feature we need is allowing different weights for different block connections. For example, connections between blocks 1 and 2 might be more costly than connections between blocks 2 and 3. Do you think this is doable?

This sounds interesting. Just to give an example, let's assume that a vertex in block 1 is connected to another vertex in block 2. Would that mean that if we now move the vertex in block 1 to block 3, the edge weight would change/be smaller?
I'm just brainstorming here, but this sounds like we'd need dynamic edge weights that depend on the block assignment of the pins. What would that look like if a hyperedge connects more than two blocks?

Integrating something like this efficiently with the refinement algorithms could be challenging, because the notion of the gain of a move (in terms of cut/connectivity reduction) becomes much more dynamic.
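To illustrate the brainstorming above with plain graph edges (a minimal sketch; all names are hypothetical, and this sidesteps the hyperedge question): if the cost of an edge is its base weight times a distance between the blocks of its endpoints, the gain of a move has to be recomputed from the block distances instead of being read off a static edge weight.

```cpp
#include <vector>

struct Neighbor {
  int vertex;
  double baseWeight;
};

// dist[a][b] = cost multiplier for a connection between blocks a and b.
double moveGain(int u, int targetBlock,
                const std::vector<std::vector<Neighbor>>& adj,
                const std::vector<int>& blockOf,
                const std::vector<std::vector<double>>& dist) {
  double gain = 0.0;
  for (const Neighbor& n : adj[u]) {
    const int nb = blockOf[n.vertex];
    // Positive if moving u to targetBlock shortens this connection.
    gain += n.baseWeight * (dist[blockOf[u]][nb] - dist[targetBlock][nb]);
  }
  return gain;
}
```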

> We'd be happy to contribute to the project if things are not too difficult :)

That would be very much appreciated. :) I'd be more than happy to help/advise on the changes that we'd need to make. In terms of difficulty, I assume that supporting multiple weights would be significantly easier than dynamic edge weights.

@Licheng-Guo
Author

Thanks for the feedback! We will look into the multi-weight feature first.

> This sounds interesting. Just to give an example, let's assume that a vertex in block 1 is connected to another vertex in block 2. Would that mean that if we now move the vertex in block 1 to block 3, the edge weight would change/be smaller?
> I'm just brainstorming here, but this sounds like we'd need dynamic edge weights that depend on the block assignment of the pins. What would that look like if a hyperedge connects more than two blocks?

I'm not sure we are exactly on the same page. We're aiming to use the tool for scalable VLSI circuit floorplanning. The goal is to minimize the total net "length" during partitioning by taking the "distance" between blocks into account. Do you think this would require a lot of changes?

For example, if the layout of a chip looks like:

Block 1
Block 2
Block 3

For a net UV, if U is fixed to Block 1, then it is better to assign V to Block 2, because the resulting net length will be shorter than if V were assigned to Block 3.

Thanks for your time!

@kittobi1992
Member

Hi @Licheng-Guo,

I also want to share my thoughts on the multi-constraint partitioning feature (multiple weights per vertex). Keep in mind that finding a balanced solution is already NP-hard with one weight per vertex, even without optimizing an objective function on the hyperedges. We recently published a paper on balanced hypergraph partitioning and showed that existing partitioning algorithms struggle considerably to find balanced solutions for weighted instances, especially VLSI instances (see https://drops.dagstuhl.de/opus/volltexte/2021/13771/pdf/lipics-vol190-sea2021-complete.pdf#page=133). Adding multiple weights per vertex makes the problem even harder.

I think there are some simple strategies that could be integrated into KaHyPar to support this feature, but they would give you no guarantee that a feasible solution will be found (or even exists), and they can drastically reduce the flexibility of our local search algorithms, which can negatively impact solution quality. For a good and reliable solution, we would need to generalize the techniques from our balancing paper to multi-constraint partitioning, which is a research project in its own right.
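For reference, a sketch of the balance condition this implies in the standard multi-constraint formulation (notation mine, not taken from the paper): the usual (1+ε) bound must hold for every resource type in every block separately, so a single over-full resource already makes a solution infeasible.

```latex
% Multi-constraint balance: for every block V_i and every resource type r,
\forall i \in \{1,\dots,k\},\ \forall r:\qquad
\sum_{v \in V_i} w_r(v) \;\le\; (1+\varepsilon) \left\lceil \frac{1}{k} \sum_{v \in V} w_r(v) \right\rceil
```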

@N-Maas
Collaborator

N-Maas commented Mar 10, 2023

Hey,
I'd like to add some ideas with regard to the dynamic net weights. I think this feature might be more manageable than it looks at first glance.

If I understood @Licheng-Guo correctly, the net weights can be modeled using a base weight for each net and a multiplier that depends on the blocks the net connects. This is much more restricted than fully dynamic net weights, which is important because it means there is no need to modify the hypergraph data structure, the coarsening, or the bipartitioning algorithms. It should also not be necessary to modify two-way refinement algorithms. I think that even the flow-based refinement should still work, since blocks are always considered in pairs (correct me if I'm wrong @SebastianSchlag @kittobi1992, I don't know how that works in detail). So it is primarily the k-way FM refinement where significant changes would be required. However, those changes only affect the gain calculation, which should be doable.

In addition, changes to the initial partitioning are probably necessary to achieve good quality. Assuming uniform blocks, my first idea would be to calculate a k-way partition as usual (using recursive bipartitioning) and afterwards compute an assignment of the parts of the graph to the blocks that minimizes the cut. Unfortunately, this is not possible at all if it is combined with the different types of weights, since the blocks are then no longer interchangeable. In that case, your best bet might be to repeat the whole k-way initial partitioning process multiple times, using a random permutation of the blocks for each pass (of course, this might increase the running time significantly).
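A brute-force sketch of that assignment step for small k (hypothetical names; it enumerates all k! label permutations, so it is only viable for roughly k ≤ 8):

```cpp
#include <algorithm>
#include <limits>
#include <numeric>
#include <vector>

struct CutEdge {
  int partA, partB;   // parts of the partition this edge connects
  double weight;      // summed base weight of nets between the two parts
};

// Returns best[part] = physical block assigned to this part, minimizing the
// distance-weighted cut over all permutations of block labels.
std::vector<int> bestLabelAssignment(
    int k, const std::vector<CutEdge>& cutEdges,
    const std::vector<std::vector<double>>& dist) {
  std::vector<int> perm(k);
  std::iota(perm.begin(), perm.end(), 0);
  std::vector<int> best = perm;
  double bestCost = std::numeric_limits<double>::max();
  do {
    double cost = 0.0;
    for (const CutEdge& e : cutEdges) {
      cost += e.weight * dist[perm[e.partA]][perm[e.partB]];
    }
    if (cost < bestCost) {
      bestCost = cost;
      best = perm;
    }
  } while (std::next_permutation(perm.begin(), perm.end()));
  return best;
}
```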

> For a net UV, if U is fixed to Block 1, then it is better to assign V to Block 2, because the resulting net length will be shorter than if V were assigned to Block 3.

The question is what the weight of a net with 3 or more pins is when they are assigned to 3 or more different blocks. Since you mentioned this is about the length of the net, I think the following generalization could work: we model the distances between the blocks as a (complete) graph G, where an edge weight is the distance between two blocks. The weight of a net that connects a set of blocks B is then the weight of a minimum spanning tree of the subgraph induced by B. In the case of 3 blocks, this would be the sum of the two smaller distances (and we can probably hope that, in practice, a net almost never connects more than 3 blocks).
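A minimal sketch of this net-weight rule (hypothetical names; Prim's algorithm on the blocks the net touches, which is cheap as long as a net touches only a few blocks):

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

// Weight of a net spanning the given blocks = MST weight of the block-distance
// graph restricted to those blocks.
double netSpanWeight(const std::vector<int>& blocks,
                     const std::vector<std::vector<double>>& dist) {
  const std::size_t n = blocks.size();
  if (n <= 1) return 0.0;
  std::vector<bool> inTree(n, false);
  std::vector<double> minEdge(n, std::numeric_limits<double>::max());
  minEdge[0] = 0.0;  // start Prim's algorithm from the first block
  double total = 0.0;
  for (std::size_t step = 0; step < n; ++step) {
    std::size_t next = n;
    for (std::size_t i = 0; i < n; ++i) {
      if (!inTree[i] && (next == n || minEdge[i] < minEdge[next])) next = i;
    }
    inTree[next] = true;
    total += minEdge[next];
    for (std::size_t i = 0; i < n; ++i) {
      if (!inTree[i]) {
        minEdge[i] = std::min(minEdge[i], dist[blocks[next]][blocks[i]]);
      }
    }
  }
  return total;
}
```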

One more, albeit more complicated, idea with regard to initial partitioning: it is likely best if the first bipartition separates blocks with comparatively high distance, while blocks that are close to each other are separated in later recursive calls. For this, you could simply specify an ordering of the blocks manually, or you could do it in a general way by computing a bipartition of the block graph G with maximum cut and using the result to determine the assignment of blocks for the actual bipartition.
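A sketch of that max-cut step (hypothetical names; brute force over bipartitions, viable only for small k since max cut is NP-hard in general, assuming 2 ≤ k ≤ 31):

```cpp
#include <cstdint>
#include <vector>

// Returns a bitmask: bit i = side of block i in the first bipartition,
// chosen so that the total distance across the cut is maximal.
std::uint32_t maxCutBipartition(
    int k, const std::vector<std::vector<double>>& dist) {
  std::uint32_t bestMask = 1;
  double bestCut = -1.0;
  // Block k-1 is fixed on side 0, which halves the search space
  // (mirrored bipartitions are equivalent).
  for (std::uint32_t mask = 1; mask < (1u << (k - 1)); ++mask) {
    double cut = 0.0;
    for (int a = 0; a < k; ++a) {
      for (int b = a + 1; b < k; ++b) {
        if (((mask >> a) & 1u) != ((mask >> b) & 1u)) cut += dist[a][b];
      }
    }
    if (cut > bestCut) {
      bestCut = cut;
      bestMask = mask;
    }
  }
  return bestMask;
}
```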

Hope this helps.

@kittobi1992
Member

In addition to @N-Maas's answer, I want to note that your problem can be reduced to process mapping (see the following paper: https://drops.dagstuhl.de/opus/volltexte/2020/12078/pdf/LIPIcs-SEA-2020-4.pdf). In this problem, you want to embed a communication graph into a computing network such that communication is minimized. The computing network is also described as a graph. Essential for your problem is that communication between two compute nodes that are not connected is more expensive than communication between two adjacent nodes. Thus, you want to place nodes of the communication graph that are connected via a heavy edge close to each other in the computing network. For your application, you can model the block layout of the chip as a computing network and then embed the hypergraph by solving the process mapping problem. Unfortunately, such work only exists for graphs. However, this could be an interesting research topic, since it would be one of the first applications of the problem to hypergraphs.
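To make the objective concrete, a minimal sketch of the process-mapping cost for the graph case (hypothetical names; a hypergraph variant could replace the per-edge term with, e.g., the spanning-tree weight sketched earlier in this thread):

```cpp
#include <vector>

struct Edge {
  int u, v;
  double volume;  // communication volume / base net weight
};

// Total cost = sum over edges of communication volume times the distance
// between the blocks (compute nodes) the endpoints are mapped to.
double mappingCost(const std::vector<Edge>& edges,
                   const std::vector<int>& blockOf,
                   const std::vector<std::vector<double>>& dist) {
  double cost = 0.0;
  for (const Edge& e : edges) {
    cost += e.volume * dist[blockOf[e.u]][blockOf[e.v]];
  }
  return cost;
}
```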

@SebastianSchlag
Member

> In addition to @N-Maas's answer, I want to note that your problem can be reduced to process mapping (see the following paper: https://drops.dagstuhl.de/opus/volltexte/2020/12078/pdf/LIPIcs-SEA-2020-4.pdf). In this problem, you want to embed a communication graph into a computing network such that communication is minimized. The computing network is also described as a graph. Essential for your problem is that communication between two compute nodes that are not connected is more expensive than communication between two adjacent nodes. Thus, you want to place nodes of the communication graph that are connected via a heavy edge close to each other in the computing network. For your application, you can model the block layout of the chip as a computing network and then embed the hypergraph by solving the process mapping problem. Unfortunately, such work only exists for graphs. However, this could be an interesting research topic, since it would be one of the first applications of the problem to hypergraphs.

That's a very good observation, @kittobi1992. From a high-level perspective, that would actually do the trick.

@Licheng-Guo
Author

Thanks a lot for everyone's suggestions and comments, @SebastianSchlag @kittobi1992 @N-Maas! I really appreciate your feedback. I'm reading the related papers to better understand the discussion here.

We might first look into multiple weight types to see if there are easy shortcuts. Our problems typically have only hundreds to tens of thousands of nodes, and we are OK with solutions that are not perfectly balanced. Also, thanks a lot for the idea of converting the problem to process mapping; I'll look into that direction!

Appreciate all your helpful replies!

@kittobi1992
Member

Hi @Licheng-Guo,

I want to let you know that we will start working on the process mapping topic soon (probably in one month). The feature will be integrated into Mt-KaHyPar, which is the shared-memory parallelization of KaHyPar. I am very interested in collaborating with you on this, as it has the potential to lead to a more accurate hypergraph partitioning model for VLSI design and could also remove some software components from the VLSI placement toolchain. However, I am still not an expert in this area and would like to learn more about your specific problem.
