Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enumerating 3d and 4d rotations at the same time #52

Open
pbj4 opened this issue Sep 25, 2023 · 8 comments
Open

Enumerating 3d and 4d rotations at the same time #52

pbj4 opened this issue Sep 25, 2023 · 8 comments

Comments

@pbj4
Copy link

pbj4 commented Sep 25, 2023

It's possible to count the 3d rotation polycubes when generating the 4d rotation polycubes using the following relationship:

If a 4d rotation polycube has a reflectional symmetry, then it counts as one 3d rotation polycube, else it counts as two.

I have a repo that implements this idea, #11, #49, and a streaming articulation point algorithm. The performance seems competitive, but I would appreciate if someone else could benchmark it as well. For n = 12 my laptop takes 5.6s to run mine, 7s to run #49 with ./polycube_generator 12 -a, 11.2s to run #27, and 39.6s to run the current rust version with cargo run -r -- enumerate -m hashless 12 and a n = 11 cache file.

@snowmanam2
Copy link

I did some testing on my FX-8350 (mostly "y" revision, but the "z" seems similar) and got the following timing results (with updated numbers from mine for comparison):
N=12: 8s (9s)
N=13: 58s (63s)
N=14: 437s (546s)

Indeed it seems your methods are faster, and it seems the scaling is better as N increases. I also find it interesting this was possible while checking reflections at the same time. I do wonder how much of the speedup was due to improvement of the removal check versus other factors. I did notice revision "v" is 13.6s vs "p" at 23.29s (N=12), so perhaps that might indicate part of the speed difference - though I'm sure there's some other optimizations you're doing somewhere in there.

I'm not sure if I'm interpreting what you're doing correctly, but are you only propagating the "4D-rotation" unique polycubes and then checking for symmetry to back-calculate the "3D-rotation" unique polycubes? I could see why that would scale better to larger N values if that's the case, and I would expect it to be the main difference in speed. I was also thinking it might be interesting to do something similar with the "translation" unique polycubes, though I'm not sure how that could be done efficiently.

From what I can see, the big thing missing is the ability to read / write some sort of output file, or otherwise some way to feed a known set of starting polycubes to generate segments for distributed jobs. With that in place, I would expect N=19 or N=20 could be quite feasible.

@pbj4
Copy link
Author

pbj4 commented Oct 2, 2023

@snowmanam2 Thank you for your feedback. Your interpretation is correct.

I've written the distributed system now and used it to run n = 17 in 8h37m, and it agrees with the 3d rotation count of 457409613979 as well as generating the 4d rotation count of 228779330204.

@hidny seems interested in this version of the problem as well. I think there's a high probability my count is correct since the one derived from it agrees with the other implementations, but it would be good for someone else to confirm as well.

@hidny
Copy link

hidny commented Oct 3, 2023

Unfortunately, I won't be able to confirm your number anytime soon. My java implementation is pretty slow compared to the Rust version of it and it would take me around 2 weeks of CPU time to get the number. I also have no experience with Rust. I might eventually experiment with the Rust code and get it that way, but that won't be any time soon.

@HakMe2Deth
Copy link

@snowmanam2 Thank you for your feedback. Your interpretation is correct.

I've written the distributed system now and used it to run n = 17 in 8h37m, and it agrees with the 3d rotation count of 457409613979 as well as generating the 4d rotation count of 228779330204.

@hidny seems interested in this version of the problem as well. I think there's a high probability my count is correct since the one derived from it agrees with the other implementations, but it would be good for someone else to confirm as well.

Hi all
First time posting on this although I have been following it for a couple of months after I saw Mike's first video.

I have been doing some work on the processing side, not the coding or algorithms and wanted to know what is best to feedback/upload. I have migrated the RUST implementation to Windows as I have access to a few machines with that OS.
Also optimised it on Mint. Definitely getting a performance boost. anywhere from 10% to 50% although I am dubious on the higher increase as that may be cached data. Running in a VM but 20 threads does work well. Servers on Intel are terrible though (or it may be MS VS crap compiler) Also good results in a VM on my PC. an oc'd AMD R7 5700X. And AMD multi-threading is better for these sort of calculations.

So wonder if you can get back to me @pbj4 to see if what I am working on can help?

Thanks

Tony

Some results...


Fresh compile on Intel server - Windows 11 VM Hyper-V on Server 2022

c:\Scripts\polycubes-single>%CD%\target\x86_64-pc-windows-msvc\release\polycubes 12
enumerating up to n = 12...
results:
n: 1, r: 1, p: 1
n: 2, r: 1, p: 1
n: 3, r: 2, p: 2
n: 4, r: 8, p: 7
n: 5, r: 29, p: 23
n: 6, r: 166, p: 112
n: 7, r: 1023, p: 607
n: 8, r: 6922, p: 3811
n: 9, r: 48311, p: 25413
n: 10, r: 346543, p: 178083
n: 11, r: 2522522, p: 1279537
n: 12, r: 18598427, p: 9371094
total time: 3.3252505s
performance: 6472882 r/s, 3265525 p/s

c:\Scripts\polycubes-single>%CD%\target\x86_64-pc-windows-msvc\release\polycubes 13
enumerating up to n = 13...
results:
n: 1, r: 1, p: 1
n: 2, r: 1, p: 1
n: 3, r: 2, p: 2
n: 4, r: 8, p: 7
n: 5, r: 29, p: 23
n: 6, r: 166, p: 112
n: 7, r: 1023, p: 607
n: 8, r: 6922, p: 3811
n: 9, r: 48311, p: 25413
n: 10, r: 346543, p: 178083
n: 11, r: 2522522, p: 1279537
n: 12, r: 18598427, p: 9371094
n: 13, r: 138462649, p: 69513546
total time: 23.9943311s
performance: 6667683 r/s, 3349634 p/s

c:\Scripts\polycubes-single>%CD%\target\x86_64-pc-windows-msvc\release\polycubes 14
enumerating up to n = 14...
results:
n: 1, r: 1, p: 1
n: 2, r: 1, p: 1
n: 3, r: 2, p: 2
n: 4, r: 8, p: 7
n: 5, r: 29, p: 23
n: 6, r: 166, p: 112
n: 7, r: 1023, p: 607
n: 8, r: 6922, p: 3811
n: 9, r: 48311, p: 25413
n: 10, r: 346543, p: 178083
n: 11, r: 2522522, p: 1279537
n: 12, r: 18598427, p: 9371094
n: 13, r: 138462649, p: 69513546
n: 14, r: 1039496297, p: 520878101
total time: 193.2894437s
performance: 6205630 r/s, 3110621 p/s

c:\Scripts\polycubes-single>%CD%\target\x86_64-pc-windows-msvc\release\polycubes 15
enumerating up to n = 15...
results:
n: 1, r: 1, p: 1
n: 2, r: 1, p: 1
n: 3, r: 2, p: 2
n: 4, r: 8, p: 7
n: 5, r: 29, p: 23
n: 6, r: 166, p: 112
n: 7, r: 1023, p: 607
n: 8, r: 6922, p: 3811
n: 9, r: 48311, p: 25413
n: 10, r: 346543, p: 178083
n: 11, r: 2522522, p: 1279537
n: 12, r: 18598427, p: 9371094
n: 13, r: 138462649, p: 69513546
n: 14, r: 1039496297, p: 520878101
n: 15, r: 7859514470, p: 3934285874
total time: 1538.2618343s
performance: 5889112 r/s, 2948481 p/s

@HakMe2Deth
Copy link

Update - heres my AMD - not on a VM - compiled exe:
F:\VirtualMachines\ISOs>polycubes 14
enumerating up to n = 14...
results:
n: 1, r: 1, p: 1
n: 2, r: 1, p: 1
n: 3, r: 2, p: 2
n: 4, r: 8, p: 7
n: 5, r: 29, p: 23
n: 6, r: 166, p: 112
n: 7, r: 1023, p: 607
n: 8, r: 6922, p: 3811
n: 9, r: 48311, p: 25413
n: 10, r: 346543, p: 178083
n: 11, r: 2522522, p: 1279537
n: 12, r: 18598427, p: 9371094
n: 13, r: 138462649, p: 69513546
n: 14, r: 1039496297, p: 520878101
total time: 91.5285238s
performance: 13105017 r/s, 6568994 p/s

And heres the "server" - compiled exe - not a VM -

C:\Scripts\Polycubes>polycubes 14
enumerating up to n = 14...
results:
n: 1, r: 1, p: 1
n: 2, r: 1, p: 1
n: 3, r: 2, p: 2
n: 4, r: 8, p: 7
n: 5, r: 29, p: 23
n: 6, r: 166, p: 112
n: 7, r: 1023, p: 607
n: 8, r: 6922, p: 3811
n: 9, r: 48311, p: 25413
n: 10, r: 346543, p: 178083
n: 11, r: 2522522, p: 1279537
n: 12, r: 18598427, p: 9371094
n: 13, r: 138462649, p: 69513546
n: 14, r: 1039496297, p: 520878101
total time: 191.1922358s
performance: 6273700 r/s, 3144742 p/s

@HakMe2Deth
Copy link

Updated the Linux optimisation - and also implemented another tweak - that definitely has a positive impact
LinuxMint VM on my AMD 7 5700X

enumerating up to n = 12...
results:
n: 1, r: 1, p: 1
n: 2, r: 1, p: 1
n: 3, r: 2, p: 2
n: 4, r: 8, p: 7
n: 5, r: 29, p: 23
n: 6, r: 166, p: 112
n: 7, r: 1023, p: 607
n: 8, r: 6922, p: 3811
n: 9, r: 48311, p: 25413
n: 10, r: 346543, p: 178083
n: 11, r: 2522522, p: 1279537
n: 12, r: 18598427, p: 9371094
total time: 1.473500887s
performance: 14607358 r/s, 7369314 p/s

enumerating up to n = 13...
results:
n: 1, r: 1, p: 1
n: 2, r: 1, p: 1
n: 3, r: 2, p: 2
n: 4, r: 8, p: 7
n: 5, r: 29, p: 23
n: 6, r: 166, p: 112
n: 7, r: 1023, p: 607
n: 8, r: 6922, p: 3811
n: 9, r: 48311, p: 25413
n: 10, r: 346543, p: 178083
n: 11, r: 2522522, p: 1279537
n: 12, r: 18598427, p: 9371094
n: 13, r: 138462649, p: 69513546
total time: 11.122578489s
performance: 14383949 r/s, 7226043 p/s

enumerating up to n = 14...
results:
n: 1, r: 1, p: 1
n: 2, r: 1, p: 1
n: 3, r: 2, p: 2
n: 4, r: 8, p: 7
n: 5, r: 29, p: 23
n: 6, r: 166, p: 112
n: 7, r: 1023, p: 607
n: 8, r: 6922, p: 3811
n: 9, r: 48311, p: 25413
n: 10, r: 346543, p: 178083
n: 11, r: 2522522, p: 1279537
n: 12, r: 18598427, p: 9371094
n: 13, r: 138462649, p: 69513546
n: 14, r: 1039496297, p: 520878101
total time: 88.986149132s
performance: 13479433 r/s, 6756673 p/s

enumerating up to n = 15...
results:
n: 1, r: 1, p: 1
n: 2, r: 1, p: 1
n: 3, r: 2, p: 2
n: 4, r: 8, p: 7
n: 5, r: 29, p: 23
n: 6, r: 166, p: 112
n: 7, r: 1023, p: 607
n: 8, r: 6922, p: 3811
n: 9, r: 48311, p: 25413
n: 10, r: 346543, p: 178083
n: 11, r: 2522522, p: 1279537
n: 12, r: 18598427, p: 9371094
n: 13, r: 138462649, p: 69513546
n: 14, r: 1039496297, p: 520878101
n: 15, r: 7859514470, p: 3934285874
total time: 676.740578156s
performance: 13386218 r/s, 6702030 p/s

@HakMe2Deth
Copy link

a small improvement on the linux compile again. But the 16 calculation had an error earlier when I tried. Probably the overclocked memory or CPU at constant 100% skipped a cycle.

enumerating up to n = 15...
results:
n: 1, r: 1, p: 1
n: 2, r: 1, p: 1
n: 3, r: 2, p: 2
n: 4, r: 8, p: 7
n: 5, r: 29, p: 23
n: 6, r: 166, p: 112
n: 7, r: 1023, p: 607
n: 8, r: 6922, p: 3811
n: 9, r: 48311, p: 25413
n: 10, r: 346543, p: 178083
n: 11, r: 2522522, p: 1279537
n: 12, r: 18598427, p: 9371094
n: 13, r: 138462649, p: 69513546
n: 14, r: 1039496297, p: 520878101
n: 15, r: 7859514470, p: 3934285874
total time: 634.009750047s
performance: 14288419 r/s, 7153732 p/s

@HakMe2Deth
Copy link

HakMe2Deth commented Nov 29, 2023

Added my suggestions, optimisations and tutorials to @pbj4 update - pbj4/polycubes#1

Let me know your thoughts please

Thanks

Tony

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants