repacker
is a highly effective solver for 2D-rectangle-packing problem based on a dedicated data structure and heuristics like greedy algorithm.
The 2D-rectangle-packing problem can be literally formalized as
Item | Setting |
---|---|
Input | • A set of 2D rectangles |
Output | • Arrangement of all these rectangles in a 2D space |
Constraints | • Each rectangle is aligned to X and Y axes of the space. • No overlapping is allowed. |
Objective | • Minimize the area of the bounding box aligned to axes and including all rectangles. |
The objective value is also denoted as occupancy rate.
Given a set of 200 (random-generated) example rectangles, repacker
delivers a highly optimized packing plan like:
with 93.92% occupancy rate. With another amount of example rectangles, say 1000:
solution achieves 95.62% occupancy rate.
From testing of repacker
, such rate is on average above 90% for input sets with size more than 50.
The algorithm is firstly implemented in Python then ported to C++. The performance is heavily improved. Given ca. 6000 rectangles, C++ code produces a output within ca. 5~6 seconds.
The performance of C++ code is surprisingly good, however I do not know yet how to plot the result conveniently using C++.
The algorithm can be called via command-line
python repacker.py <input-file> [<output-file>] [-n]
where
-
<input-file>
is text file as a list of 2-tuples in Python syntax, representing the sizes of the rectangles. -
<output-file
is the path to the solution file, which is a Python list. -
-n
means producing no figure for depicting the solution.
[*] No 3rd-party module is required for solving, but module PIL
is required for drawing the results like the figures above.
- No utilities from computational graphics or computational geometry are used, since the underlying data structure is indeed a list. I personally name it threaded quad-pointers, which is a 1D list of all modelled rectangle corners.
- With this structure, spatial relations can be computed via simple arithmetics, instead of any complicated geometric algorithm.
- The complexity is roughly O(N²). In pure Python, an input with 1000 rectangles can be solved within several seconds.
- The heuristical approach for deciding optimal placement can be categorized as a combination of strategies of Greedy, Bottom-Left, Best-Fit, which have been explored extensively in various literature.
The solution to this problem may find usage in various fields. For example, given a large plate of steel, we may be required to slice it into small rectangle pieces for future assembling work according to some engineering plot and hope to use this plate most economically.
Based on the threaded quad-pointers data structure, instead of computing the occupancy rate i.e. the bounding box area,
an alternative function F
is proved to be more effective as the objective function.
Given a rectangle r
and potential placement of it at coordinate c
,
F(r, c) = B(r, c).width + B(r, c).height
where B
is the bounding box determined via placing (r, c)
※.
A possible interpretation for F
's benefit is that using B.area
may lead to the rectangle cluster growing like a
long band, rather than growing like a box. Suppose we have already a "long" bounding box B
where B.height
is much greater than B.width
and we are to place a rectangle r
with roughly the size d
,
then placing r
on the short leads to a huge increment of the bounding area, i.e.
Δ(B.area) == d * B.height
which is much greater than placing on the short side where
Δ(B.area) == d * B.width
As a consequence, following the trend that r
is always placed along the long side when using B.area
as the
evaluation function, the objective value gradually loses further chance to get improved.
In contrast, with F
, the increment is roughly
Δ(F) == d
no matter r
is placed on the short side or the long side. This helps avoid the "long-stacking" problem effectively
and provides more choices for rest placements since the bounding box is stably very close to a rectangle .
Since there may be multiple placements of a rectangle resulting in bounding box size unchanged, the secondary objective function plays very important role for breaking the tie. Considering the following Best-Fit:
-
Any corner corresponds to a "slot" for placing a rectangle. The fill-rate of such slot matters - the greater the better, meaning there are less holes in the stack;
-
The absolute coordinate values of placement matters - the closer the placement to the axes, the more space it leaves for subsequent placements.
There seems to be no rule-of-thumb choosing potential tie-breakers. Ideally could be, the choice of tie-breaker is adaptive to the distribution of given rectangle sizes, which is yet to be explored systematically.