Representation and orientation by prime products #12

erik-mansson · 2023-07-15T17:10:09Z

Here's a proposal that I hope to eventually start implementing, but feel free to run with the idea if you wish.

This concerns the core topics of how to efficiently look up whether a polycube candidate is a rotation (including the null rotation) of any in the set of canonical polycubes, and how to possibly iterate over candidates in a way that lets us skip iterations/branches early if it can be concluded that they would not lead to valid or canonical polycubes.

One way to define the canonical orientation of a polycube (once it has been cropped to a tight bounding block) is to pick the orientation that gives the minimum hash value. If we can construct the hashing function to have some nice properties, it might be possible to know that candidates of size N are iterated in a canonical order, predict whether an cube-extension would give a non-canonical orientation and then start taking shortcuts in the iteration...

I propose the idea to assign a prime number to each 3D array element and then and define the hash number as the product of the primes whose locations are occupied with cubes. I.e. if occupied is a 3D-array with 1 where cubes are located and 0 elsewhere and primes is another 3D-array, the prime product would be P = np.product(primes ** occupied).

Consequences for iterating

If we for the moment ignore the issue of how many and large numbers would be needed, some nice properties for the iteration appear from the definition of canonical orientation as having the smallest P. I defined the order by defining primes = primes_1D[:H*W*L].reshape(H, W, L) for the case of a search space of shape (H, W, L) and with primes_1D being the sequence of enough primes, e.g. [2, 3, 5, 7, 11...]. Since this means that primes[0, 1, :] > primes[0, 0, -1] and so on, the canonical orientation keeps the occupied bounding box as close to the origin as possible (no padding before) and prefers to extend along along the last axis, i.e. keeping h <= w <= l for the bounding block of height h, width w and length l. Some pseudocode made me convinced that there are ways to skip many polycube extension candidates just by comparing the coordinates and whether the h <= w <= l invariant would be invalidated.

When extending towards positive coordinates or filling internal voids, I believe the canonical new P_candidate can be computed as simply as P_smaller_polycube * primes[where the new cube is put] without having to do any array padding or rotation! For extending towards negative coordinates, a special case per axis can be used, and here it seems the various rotations need to be computed to guarantee finding the canonical rotation of the candidate.

I don't have a very strong proof that my iteration scheme will find all polycubes of size N when starting from only canonically oriented cubes of size N-1, this remains to be shown empirially by comparison with the results of others...

Consequences for hashing and equality check

If we never reuse primes within the 3D array, the factorization of a product P is unique so the hashing is actually a bijective function between canonical polycube and its corresponding integer. Thus there's no need for storing something like a RLE-encoded polycube in memory to check for equality in the case of hash-collisions, we just need to count how many unique hash numbers we find, while ensuring we only do it for polycubes in canonical orientation. We can of course still write (append) newly found polycubes to a cache file in some format.

Size of numbers

Whether this would be faster than existing approaches is of course not obvious. If the prime product can fit in an unsigned 64-bit integer I have some hope. But so far I have not been able to prove that less than 173 bits would suffice for N=17 (but N=9 should definitively fit). I know my estimate is an unnecessarily high bound and Python does have arbitrary-size integers that could be used in a first implementation for exploring the range of canonical P-values empirically (letting the hash map use some internal non-unique machine-sized integers). I did consider optimizations like leaving zeros in some far-off elements of primes_3D known to not be needed for canonically oriented polycubes, and even special-case treatment of very straight polycubes to let the 3D-array side length be at most N-2. (Hash codes that involve squares of prime numbers are available for special meanings, since occupied.max() being 1 means that squares will never appear in P!) But to ensure my bit-bound is not an underestimate I currently allow prime-products that are clearly unreachable for canonically oriented cubes.

The day after first posting this, I edit to include a minor optimization towards smaller products: Instead of just using primes as factors, we can allow prime**2, prime**4, prime**8 and so on (powers of powers of 2) since there's still no way for two polycubes to get the same product. instead of primes = [2, 3, 5, ...] we now use numbers = [2, 3, 4, 5, 7, 9, 11, 13, 16, 17, ...] and for instance only the straight polycube of length 3 will have the product P = 24 = 2**1 * 3**1 * 2**2. Put differently: if we previously needed to use the V = len(primes) smallest primes, we can instead use the V smallest numbers in the set of primes**(2**k) for k < np.log2(primes.max()). For N=17, this just improves my upper bound from 173.4 to 171.2 bits, but even a two bit reduction might for some N make the difference between fitting in 64 bits or not.

The text was updated successfully, but these errors were encountered:

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Representation and orientation by prime products #12

Representation and orientation by prime products #12

erik-mansson commented Jul 15, 2023 •

edited

Loading

Representation and orientation by prime products #12

Representation and orientation by prime products #12

Comments

erik-mansson commented Jul 15, 2023 • edited Loading

Consequences for iterating

Consequences for hashing and equality check

Size of numbers

erik-mansson commented Jul 15, 2023 •

edited

Loading