-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rotation invariant representation for 2D shapes #25
Comments
Unless Im miss understanding something does this not always respond with 511 (0b1_1111_1111) times the number of squares? Because each set square contributes each bit once to another square as you do the convolution and then you sum all the numbers at the end recombining the bits? |
Ah, you are right, damn! The sum is basically counting the set blocks in the shape... I was a bit suspicious about using the sum as the fingerprint, but it seemed to work 🙃 I wonder if we could use some other property to match the feature matrices. |
it would certainly be useful. there might even be a lot of value in just a rotation invariant hash but im not 100% clear on how to use one if we had one to save wasting time performing a bunch of rotations when we can be quickly sure theyre different. Maybe taking the mask of 3x3s from above or 2x2x2 for 3d as 27bits is quite a big key, converting each to a canonical rotation through a table, then counting how many of what patterns there are and generating a hash from that but I havent put that much thought into it |
I have been tinkering around to find a rotation invariant representation for the shapes, and so far it seems that I have found one for at least 2D. I am now trying to extend this to 3D.
My idea was inspired by Local Binary Patterns (LBP). It is used in machine vision to extract features from an image.
The idea
To represent a shape in a rotation invariant way, the encoding should be based on relating all the cubes to all the other cubes in the shape, instead of some reference frame around the shape (because we have no way of fixing that reference frame). This makes it so that it doesn't matter in which order we compute and we still get the same result.
We first figure out a way to represent all distinct Moore neighborhoods of a cell. In 2D, there are 2^9 = 512 possible distinct Moore neighborhoods (the central cell plus all the 8-neighbors).
Then we calculate the distinct cell values for all the cells in the shape (including the empty cells) to get a feature matrix. That matrix should be unique for all shapes, but it is the same for rotated variants of that same shape.
Example in 2D
Say, we have a shape like this:
1. Add zero padding for calculation
We start by adding 2 levels of zero padding around the shape. One level is needed because the feature matrix is always larger than the shape (since all neighbors affect the cell value), and one level is needed to make sure we don't try reading outside the matrix when calculating the cell values.
2. Calculate the cell values using modified LBP
Then we can start calculating the feature matrix. We start from cell
(1, 1)
, the second cell of the second row (so that there are not edges next to it).To understand how the cell value is calculated, see the figure. The cell
(1, 1)
is highlighted in orange and its Moore neighborhood is enumerated from 1 to 9 (starting from top-left, spiraling into the center):Since there are 512 distinct Moore neighborhoods, 9 bits is sufficient to represent all of them. We start setting the bits one by one, starting from the MSB:
Now we have a 9-bit cell value for the first cell:
00001_0000
, which is16
in decimal. We can then move to the next cell. This is repeated for all cells except cells that are on the edge (so skipping the first and last columns as well as the first and last rows; they would be zeros anyway).Here are all the cell values for the example shape:
Then, the feature vector of this shape is:
3. Calculate the sum of the matrix values
Finally, we sum all the cell values in the feature matrix to get the fingerprint of that shape. This fingerprint is the same for all rotations of the same shape.
Comparing the variants
Here are all the variants of the example shape, along with their feature matrices and their sums (the fingerprints).
As you see, all variants have the same fingerprint 3577!
Final words
As of yet, I have not been able to prove or disprove this method, but so far it seems very promising. I have tested it on tens of various shapes in 2D, as you can see in my repository.
I want to prove that this method actually works and never gives the same fingerprint to 2 different shapes, then proceed to applying this to 3D, if possible.
The text was updated successfully, but these errors were encountered: