Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Representation and orientation by prime products #12

Open
erik-mansson opened this issue Jul 15, 2023 · 0 comments
Open

Representation and orientation by prime products #12

erik-mansson opened this issue Jul 15, 2023 · 0 comments

Comments

@erik-mansson
Copy link

erik-mansson commented Jul 15, 2023

Here's a proposal that I hope to eventually start implementing, but feel free to run with the idea if you wish.

This concerns the core topics of how to efficiently look up whether a polycube candidate is a rotation (including the null rotation) of any in the set of canonical polycubes, and how to possibly iterate over candidates in a way that lets us skip iterations/branches early if it can be concluded that they would not lead to valid or canonical polycubes.

One way to define the canonical orientation of a polycube (once it has been cropped to a tight bounding block) is to pick the orientation that gives the minimum hash value. If we can construct the hashing function to have some nice properties, it might be possible to know that candidates of size N are iterated in a canonical order, predict whether an cube-extension would give a non-canonical orientation and then start taking shortcuts in the iteration...

I propose the idea to assign a prime number to each 3D array element and then and define the hash number as the product of the primes whose locations are occupied with cubes. I.e. if occupied is a 3D-array with 1 where cubes are located and 0 elsewhere and primes is another 3D-array, the prime product would be P = np.product(primes ** occupied).

Consequences for iterating

If we for the moment ignore the issue of how many and large numbers would be needed, some nice properties for the iteration appear from the definition of canonical orientation as having the smallest P. I defined the order by defining primes = primes_1D[:H*W*L].reshape(H, W, L) for the case of a search space of shape (H, W, L) and with primes_1D being the sequence of enough primes, e.g. [2, 3, 5, 7, 11...]. Since this means that primes[0, 1, :] > primes[0, 0, -1] and so on, the canonical orientation keeps the occupied bounding box as close to the origin as possible (no padding before) and prefers to extend along along the last axis, i.e. keeping h <= w <= l for the bounding block of height h, width w and length l. Some pseudocode made me convinced that there are ways to skip many polycube extension candidates just by comparing the coordinates and whether the h <= w <= l invariant would be invalidated.

When extending towards positive coordinates or filling internal voids, I believe the canonical new P_candidate can be computed as simply as P_smaller_polycube * primes[where the new cube is put] without having to do any array padding or rotation! For extending towards negative coordinates, a special case per axis can be used, and here it seems the various rotations need to be computed to guarantee finding the canonical rotation of the candidate.

I don't have a very strong proof that my iteration scheme will find all polycubes of size N when starting from only canonically oriented cubes of size N-1, this remains to be shown empirially by comparison with the results of others...

Consequences for hashing and equality check

If we never reuse primes within the 3D array, the factorization of a product P is unique so the hashing is actually a bijective function between canonical polycube and its corresponding integer. Thus there's no need for storing something like a RLE-encoded polycube in memory to check for equality in the case of hash-collisions, we just need to count how many unique hash numbers we find, while ensuring we only do it for polycubes in canonical orientation. We can of course still write (append) newly found polycubes to a cache file in some format.

Size of numbers

Whether this would be faster than existing approaches is of course not obvious. If the prime product can fit in an unsigned 64-bit integer I have some hope. But so far I have not been able to prove that less than 173 bits would suffice for N=17 (but N=9 should definitively fit). I know my estimate is an unnecessarily high bound and Python does have arbitrary-size integers that could be used in a first implementation for exploring the range of canonical P-values empirically (letting the hash map use some internal non-unique machine-sized integers). I did consider optimizations like leaving zeros in some far-off elements of primes_3D known to not be needed for canonically oriented polycubes, and even special-case treatment of very straight polycubes to let the 3D-array side length be at most N-2. (Hash codes that involve squares of prime numbers are available for special meanings, since occupied.max() being 1 means that squares will never appear in P!) But to ensure my bit-bound is not an underestimate I currently allow prime-products that are clearly unreachable for canonically oriented cubes.

The day after first posting this, I edit to include a minor optimization towards smaller products: Instead of just using primes as factors, we can allow prime**2, prime**4, prime**8 and so on (powers of powers of 2) since there's still no way for two polycubes to get the same product. instead of primes = [2, 3, 5, ...] we now use numbers = [2, 3, 4, 5, 7, 9, 11, 13, 16, 17, ...] and for instance only the straight polycube of length 3 will have the product P = 24 = 2**1 * 3**1 * 2**2. Put differently: if we previously needed to use the V = len(primes) smallest primes, we can instead use the V smallest numbers in the set of primes**(2**k) for k < np.log2(primes.max()). For N=17, this just improves my upper bound from 173.4 to 171.2 bits, but even a two bit reduction might for some N make the difference between fitting in 64 bits or not.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant