Optimizations from guy that did n=16 #5

Open
joulebit opened this issue Jul 13, 2023 · 6 comments

@joulebit (Contributor)

Here is the paper from the guy who computed n=16 back in 2005, in 11 days: http://kevingong.com/Polyominoes/ParallelPoly.html

He wrote something very interesting about how to keep the hash tables small and more divisible:

4.1.2 Hashing

One of the most fundamental tasks of the polyomino program is to support uniqueness. We have to ensure that a polyomino has not already been counted. To do this, we store all of the unique polyominoes in a hash tree/table structure, and compare the candidate polyomino with only a small subset of the unique polyominoes. To access the hash structure, we first branch on some criterion to find the correct hash table, then index into the hash table.

If we compare the candidate polyomino only with polyominoes of the same height and width, this narrows down the search considerably. We can also branch on the y-coordinate of the last square. This seems a bit strange, but it works well. (See Figure 4.1.2-1) To narrow it down even further, we use a hash function. This takes the XOR of every n bits, where n = height-1. (See Figure 4.1.2-2) This is not necessarily the best hash function, but it is simple to implement, and gives fair results, as we will see.
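The hash function Gong describes could be sketched like this (my own sketch; the function name and bucket count are assumptions, the paper only specifies XOR-ing every n bits with n = height-1):

```python
def xor_fold_hash(bits: int, height: int, num_buckets: int = 256) -> int:
    """Hash a polyomino's bit representation by XOR-ing together
    every (height - 1)-bit chunk, as described in section 4.1.2."""
    n = max(height - 1, 1)   # chunk width in bits
    mask = (1 << n) - 1
    h = 0
    while bits:
        h ^= bits & mask     # fold the lowest n bits into the hash
        bits >>= n
    return h % num_buckets
```

For a height-3 polyomino the chunks are 2 bits wide, so a 9-bit representation folds down to a small value before the modulo.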

@ClaasF commented Jul 13, 2023

While watching the video, I wondered if there's not a simpler way to canonicalize a representation, which would result in almost trivial hashing.

Each cube is essentially a string of 1s and 0s, just packed a bit weirdly (I don't know how efficient numpy is here). So you could interpret each cube simply as a number.

Example in 2D:

Given the map

```
0 1 0
1 1 0
0 1 0
```

you might interpret that as the binary "010110010". Each of the rotations should have a different binary representation (unless it's symmetrical, in which case it doesn't matter).

What I would call the "canonical" representation of a polycube is simply the rotation with the smallest binary value.
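In 2D this could be sketched as follows (a rough illustration of the idea, not code from this repo; `np.rot90` supplies the four planar rotations):

```python
import numpy as np

def binary_value(grid):
    # read the flattened 0/1 grid as one binary number, row by row
    return int("".join(str(b) for b in grid.flatten()), 2)

def canonical_value(grid):
    # the canonical form is the rotation with the smallest binary value
    return min(binary_value(np.rot90(grid, k)) for k in range(4))
```

For the T-shaped map above, the four rotations give four different binary values, and the smallest one serves as the canonical signature.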

@mikepound (Owner)

These are really interesting ideas - I didn't read around much when I implemented mine in front of the TV! Actually, I had in mind a video that would spark some discussions, so in some sense my "do it badly" approach has been a success!

The speed of hashing and the speed of lookup are a concern. I certainly think the encoding / hashing could be improved; I'm also not sure what the maximum capacity of the Python set is. If it's small, then there'll be an increasing number of collisions that will massively slow down lookup. In this case, splitting the problem on some trivial condition and having multiple hashes may be a lot more efficient.
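That splitting idea could look something like this (a sketch only; the class name and the choice to key on the sorted bounding-box dimensions are my assumptions, not anything from the repo):

```python
from collections import defaultdict

class BucketedSignatureSet:
    """Keep one Python set per bounding-box shape so that no single
    set grows huge; a lookup only touches the bucket for its shape."""

    def __init__(self):
        self._buckets = defaultdict(set)

    def add(self, shape, signature):
        """Insert a signature; return True if it was not seen before."""
        # sort the dimensions so rotated bounding boxes share a bucket
        bucket = self._buckets[tuple(sorted(shape))]
        if signature in bucket:
            return False
        bucket.add(signature)
        return True
```

Each bucket stays small, so even a weak hash suffers fewer collisions per lookup.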

Binary representation would also speed up hashing a lot. What was on my mind when I didn't do this was the number of bits required to hold the biggest possible polycube for some n. For example, at what n do we get polycubes that need more than 32 bits? Python actually uses arbitrary-precision integers, but I'd imagine these may take a performance hit.

@ClaasF commented Jul 13, 2023

> Binary representation would also speed up hashing a lot. What was on my mind when I didn't do this was the number of bits required to hold the biggest possible polycube for some n. For example, at what n do we get polycubes that need more than 32 bits? Python actually uses arbitrary-precision integers, but I'd imagine these may take a performance hit.

Since it's just plain and simple hashing anyway, that shouldn't be too bad. One could even just use the pure String (capital S) and hash that. If the Bitcoin hype taught us anything, it's that modern hardware is really good at hashing.

@ClaasF commented Jul 13, 2023

I just played around with the binary representation, and it turns out it's not that simple:

cube 1:

```
[[[0]
  [1]]

 [[1]
  [1]]]
```

cube 2:

```
[[[0 1]]

 [[1 1]]]
```

And flattened:

```
[0 1 1 1]
[0 1 1 1]
```
So two very different sequences have the same signature - if one uses that simple approach, at least.
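The collision is easy to reproduce with numpy (my own minimal repro of the example above):

```python
import numpy as np

# the two cubes from the example, with different bounding boxes
cube1 = np.array([[[0], [1]], [[1], [1]]], dtype=np.uint8)  # shape (2, 2, 1)
cube2 = np.array([[[0, 1]], [[1, 1]]], dtype=np.uint8)      # shape (2, 1, 2)

# different shapes, yet identical flattened bit strings
assert cube1.shape != cube2.shape
assert np.array_equal(cube1.flatten(), cube2.flatten())
```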

@ClaasF commented Jul 13, 2023

Update number 2: I found a relatively simple, but not exactly optimal, solution: padding the array with zeros:

```python
import numpy as np

def sig(polycube):
    # zero-pad so arrays of different shapes can't pack to the same bits
    polycube = np.pad(polycube, 1, 'constant', constant_values=0)
    packed_bytes = np.packbits(polycube)
    # byteorder is required before Python 3.11
    return int.from_bytes(packed_bytes, 'big')

def canonical_sig(polycube):
    # all_rotations() yields the rotations of the cube (defined elsewhere)
    sigs = [(sig(rotation), rotation) for rotation in all_rotations(polycube)]
    return min(sigs, key=lambda tup: tup[0])
```

This does yield the correct result, but padding is apparently really expensive, so it takes almost twice as long.

Not exactly the result I hoped for...
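One alternative that avoids the `np.pad` call entirely (my own sketch, not something from this thread or the repo): include the bounding-box shape in the signature itself, so arrays with different shapes can never share a signature:

```python
import numpy as np

def sig_with_shape(polycube):
    # mixing the shape into the signature makes padding unnecessary:
    # two arrays with different bounding boxes can never collide
    packed = np.packbits(polycube.flatten())
    return polycube.shape + (int.from_bytes(packed.tobytes(), 'big'),)
```

This resolves the `[0 1 1 1]` collision above, since the two cubes have shapes (2, 2, 1) and (2, 1, 2).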

bertie2 pushed a commit to bertie2/opencubes that referenced this issue Jul 15, 2023
@prophaner

I don't know where else to submit my entry. Check it out and see if it helps:
https://github.com/prophaner/polycubes/tree/main
