Skip to content

OlegKonings/GPU_4Polygon

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

17 Commits
 
 
 
 
 
 
 
 

Repository files navigation

GPU_4Polygon

4p

Similar to the CUDA_brute_triangle project, but this implementation brute-force examines all N choose 4 2-D points and finds which polygon(s) contain the greatest number of points inside.

Not quite as fast as the 3 point version, but much faster than a serial implementation. If you want an estimate of a multi-core implementation you can divide the CPU time by the number of cores (not really that simple in real life). Even then much much faster than a multi-core implementation, and this performance despite the fact there is a great deal of 64-bit integer operations which are much slower than 32-bit operations on consumer GPUs.

CPU used= Intel i7 4820k 4.5 GHZ

GPU used= NVIDIA GTX 1080 1.8 Ghz

CUDA 8.0 RC

Windows 7 x64

compile with --use_fast_math flag for better performance

Optimal Polygon Running Time comparison:

NumPointsNumCombosEvaluated 4.5 Ghz CPU time 1.8 Ghz GPU time CUDA Speedup
1003,921,225 4,245 ms 9 ms 471.6x
20064,684,950 148,411ms 250 ms 593.6x

Still think this can be improved, but in general this technique works and will run even faster on a Pascal Titan X, so will post those results soon.

Can handle up to 2000 points, but any more will overflow constant memory.

Linux users need to change the long long(a) casts to (long long)a . Does need a GPU will compute capability >3.0.

<script> (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){ (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o), m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m) })(window,document,'script','//www.google-analytics.com/analytics.js','ga'); ga('create', 'UA-60172288-1', 'auto'); ga('send', 'pageview'); </script>

Releases

No releases published

Packages

No packages published

Languages