Project Euler-inspired primes-count algorithm. #89

Closed


shlomif

@shlomif shlomif commented Apr 21, 2020

Based on a solution to a PE problem which I found in a solved problem forum.

Attributed to "Lucy Hedgehog".

Works well for larger ranges. Benchmark:

[shlomif@localhost ~]$ LD_LIBRARY_PATH=~/apps/primesieve/lib64 /usr/bin/time ~/apps/primesieve/bin/primesieve -c1 $((10 ** 12 / 2)) $(( 10 ** 12))
18299775876
4.83user 0.01system 0:04.85elapsed 99%CPU (0avgtext+0avgdata 18828maxresident)k
0inputs+0outputs (0major+6819minor)pagefaults 0swaps
[shlomif@localhost ~]$ time primesieve -c1 $(( 10 ** 12 / 2)) $(( 10 ** 12 ))
Sieve size = 128 KiB
Threads = 8
100%
Seconds: 31.459
Primes: 18299775876
primesieve -c1 $(( 10 ** 12 / 2)) $(( 10 ** 12 ))  249.77s user 0.23s system 794% cpu 31.462 total
[shlomif@localhost ~]$
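
For readers who have not seen it, here is a minimal Python sketch of the counting method usually attributed to "Lucy Hedgehog". It is not the code from this pull request; the function names `count_primes` and `count_primes_in_range` are made up for illustration, and deriving a range count as a difference of two prefix counts is an assumption about how a range such as the one benchmarked above could be handled. The idea is to keep a count for every distinct value of `n // i` and to remove, prime by prime, the numbers whose smallest prime factor is that prime.

```python
from math import isqrt


def count_primes(n):
    """Return the number of primes <= n (illustrative sketch)."""
    if n < 2:
        return 0
    r = isqrt(n)
    # Every distinct value of n // i, in decreasing order.
    values = [n // i for i in range(1, r + 1)]
    values += list(range(values[-1] - 1, 0, -1))
    # counts[v] starts as the number of integers in [2, v].
    counts = {v: v - 1 for v in values}
    for p in range(2, r + 1):
        if counts[p] > counts[p - 1]:  # p survived all smaller primes, so it is prime
            primes_below_p = counts[p - 1]
            p_squared = p * p
            for v in values:
                if v < p_squared:
                    break
                # Drop the numbers in [2, v] whose smallest prime factor is p.
                counts[v] -= counts[v // p] - primes_below_p
    return counts[n]


def count_primes_in_range(start, stop):
    """Count primes in [start, stop] as a difference of two prefix counts."""
    return count_primes(stop) - count_primes(start - 1)
```

This form needs roughly O(n^(3/4)) time and O(sqrt(n)) memory, which is why it can beat a full sieve on large ranges even when the sieve uses several threads.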

shlomif added 2 commits Apr 21, 2020
@shlomif
Author

shlomif commented Apr 21, 2020

Another benchmark:

[shlomif@localhost ~]$ LD_LIBRARY_PATH=~/apps/primesieve/lib64 /usr/bin/time ~/apps/primesieve/bin/primesieve -c1 $((2)) $(( 10 ** 12))
37607912018
2.73user 0.01system 0:02.74elapsed 99%CPU (0avgtext+0avgdata 18700maxresident)k
0inputs+0outputs (0major+4053minor)pagefaults 0swaps
[shlomif@localhost ~]$ /usr/bin/time primesieve -c1 2 $(( 10 ** 12 ))
Sieve size = 128 KiB
Threads = 8
100%
Seconds: 58.963
Primes: 37607912018
464.70user 0.31system 0:58.96elapsed 788%CPU (0avgtext+0avgdata 18156maxresident)k
0inputs+0outputs (0major+158385minor)pagefaults 0swaps
[shlomif@localhost ~]$ improvement-percent from 58.96 to 2.74
2051.82481751825%
[shlomif@localhost ~]$
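
The `improvement-percent` command above is the author's own helper and its source is not shown in this thread. The sketch below is only a guess at the formula; it is included because it reproduces the printed figure from the two elapsed times. The function name is hypothetical.

```python
def improvement_percent(old_seconds, new_seconds):
    """Relative improvement of the new time over the old one, in percent.

    Assumed formula: (old - new) / new * 100.
    """
    return (old_seconds - new_seconds) / new_seconds * 100.0


# Elapsed times taken from the benchmark above.
print(improvement_percent(58.96, 2.74))
```

Put differently, 58.96 s against 2.74 s is a speedup factor of about 21.5.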

@kimwalisch
Owner

kimwalisch commented Apr 21, 2020

Hi shlomif,

Even though I don't fully understand your algorithm, I think it is a combinatorial prime counting algorithm based on Legendre's inclusion-exclusion formula. I am aware that there are faster ways to count primes than the sieve of Eratosthenes (see here: https://github.com/kimwalisch/primecount#algorithms); for this reason I created primecount (and also primesum).
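
As a point of reference, Legendre's formula can be sketched in a few lines of Python. This is a textbook-style illustration only, not code from primesieve, primecount, or the pull request, and the names `small_primes`, `legendre_pi` and `phi` are chosen here for the example. It uses pi(x) = phi(x, a) + a - 1 with a = pi(sqrt(x)), where phi(y, k) counts the integers in [1, y] that are not divisible by any of the first k primes.

```python
from functools import lru_cache
from math import isqrt


def small_primes(limit):
    """Return all primes <= limit using a plain sieve of Eratosthenes."""
    if limit < 2:
        return []
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    for i in range(2, isqrt(limit) + 1):
        if is_prime[i]:
            # Cross off the multiples of i, starting at i * i.
            is_prime[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(is_prime) if flag]


def legendre_pi(x):
    """Count the primes <= x with Legendre's inclusion-exclusion formula."""
    if x < 2:
        return 0
    primes = small_primes(isqrt(x))
    a = len(primes)

    @lru_cache(maxsize=None)
    def phi(y, k):
        # Integers in [1, y] not divisible by any of the first k primes.
        if k == 0:
            return y
        return phi(y, k - 1) - phi(y // primes[k - 1], k - 1)

    # phi(x, a) counts 1 plus the primes in (sqrt(x), x]; add the a small primes.
    return phi(x, a) + a - 1
```

The recursion only needs the primes up to sqrt(x), which is what separates this family of algorithms from sieving the whole interval.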

However, in primesieve I really only want to include prime sieving algorithms. primesieve is already very complex (it is nearly 8000 lines of code), and it is difficult for me to keep everything in my head when working on it. If I started adding optimized combinatorial prime counting algorithms, it would become extremely painful for me to work on this project (and nobody else could possibly understand the code anymore).

@shlomif
Author

shlomif commented Apr 21, 2020

Thanks for the reply, @kimwalisch ! I'll give a more meaningful reply later.

@shlomif
Author

shlomif commented Apr 21, 2020

@kimwalisch

OK, here goes: first of all, I should note that primecount is much faster here than primesieve -c1, even with my changes. Thanks for mentioning it. I may not have registered it when I went over your public GitHub repos.

Anyway, I think you are underestimating yourself and are fully capable of managing primesieve even if it gains more functionality. One of my projects ( https://github.com/shlomif/fc-solve ) has over 30,000 lines of C and C++ code in the production code directory, and 62,175 lines total there, whereas the repo contains 170,740 lines of code. And it is still manageable, and I recently found a way to improve the benchmark time by 10-20%. There are far larger open source and non-FOSS projects, and many are also manageable.

Anyway, I think you should not compromise on what I call external quality (the user experience which includes run time speed) out of fear that the internal quality will suffer. You can look into the literature about refactoring (also https://www.joelonsoftware.com/2000/04/06/things-you-should-never-do-part-i/ and https://www.joelonsoftware.com/2002/01/23/rub-a-dub-dub/ ) to clean up code and manage its complexity. For example, you can extract a function or a method with a meaningful name (say fasterCountPrimesFrom2ToMaxNum(long N) ) and implement the logic there - and test it using automated tests.
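
To make the suggestion concrete, here is a rough Python sketch of what "one well-named function plus an automated test" could look like. Everything in it is hypothetical: the function body is a plain sieve standing in for whatever faster logic would go there, and `check_prime_counter` is a made-up helper. Only the reference values of the prime counting function at powers of ten are established facts.

```python
from math import isqrt

# Known values of the prime counting function at powers of ten.
KNOWN_PRIME_COUNTS = {
    10: 4,
    100: 25,
    1000: 168,
    10**4: 1229,
    10**5: 9592,
    10**6: 78498,
}


def faster_count_primes_from_2_to_max_num(max_num):
    """Stand-in body: a simple sieve. The faster logic would live here."""
    if max_num < 2:
        return 0
    sieve = bytearray([1]) * (max_num + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, isqrt(max_num) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, max_num + 1, i)))
    return sum(sieve)


def check_prime_counter(counter):
    """Raise AssertionError if `counter` disagrees with a known value."""
    for limit, expected in KNOWN_PRIME_COUNTS.items():
        got = counter(limit)
        assert got == expected, f"pi({limit}): expected {expected}, got {got}"
    return True


check_prime_counter(faster_count_primes_from_2_to_max_num)
```

Because the check takes the counter as an argument, the same test can guard both the sieving implementation and any replacement behind the same interface.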

I see an unnecessarily slow primesieve -c1 speed as a misfeature which I'd rather not have to live with. It will be better to fix it once and for all than to have to explain it time and again.

Thanks!

@kimwalisch
Owner

kimwalisch commented Apr 22, 2020

I see an unnecessarily slow primesieve -c1 speed as a misfeature which I'd rather not have to live with. It will be better to fix it once and for all than to have to explain it time and again.

That's where our opinions differ. A large number of primesieve's users are interested in benchmarking their own fast sieve of Eratosthenes implementations against primesieve. If I improved primesieve's counting implementation using a combinatorial prime counting algorithm, this use case would no longer be possible.

Also I like the Unix philosophy: write programs that do one thing and do it well. For this reason primesieve only includes prime sieving algorithms and primecount includes combinatorial prime counting algorithms.

@kimwalisch kimwalisch closed this Apr 22, 2020
@shlomif
Author

shlomif commented Apr 22, 2020

Hi!

I have a page that tries to debunk the "doing one thing well" mantra here: http://shlomifishswiki.branchable.com/Unix_Philosophy_of_One_Tool_for_One_Job/ . The Unix philosophy has some other aspects that I'm more fond of, like producing text output that is easy to parse and process, etc. (For more info see http://www.catb.org/esr/writings/taoup/ .)

And regarding that, primesieve already provides the -c1 option, so it already takes on that task, only it does it in suboptimal time (which means it is not doing it all that well). If one wishes to benchmark sieving, one can run time bash -c 'primesieve -p1 2 1e9 > /dev/null' or time bash -c 'primesieve -p1 2 1e9 > goodresults.txt', or pipe the output through sha256sum to have evidence that the output is correct.
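
The same idea can be scripted. The Python sketch below times an arbitrary command and hashes its standard output, so two sieve implementations can be compared on both speed and output. The helper name `time_and_hash` is made up for this example, and the timing includes the cost of reading the output through a pipe, so it is not identical to redirecting to /dev/null.

```python
import hashlib
import subprocess
import time


def time_and_hash(command):
    """Run `command` (a list of arguments); return (seconds, sha256 of stdout)."""
    digest = hashlib.sha256()
    start = time.perf_counter()
    with subprocess.Popen(command, stdout=subprocess.PIPE) as proc:
        # Read in chunks so a very long list of primes never sits in memory.
        for chunk in iter(lambda: proc.stdout.read(1 << 16), b""):
            digest.update(chunk)
    elapsed = time.perf_counter() - start
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, command)
    return elapsed, digest.hexdigest()


# Example call, using the command line quoted above (requires primesieve on PATH):
#   seconds, checksum = time_and_hash(["primesieve", "-p1", "2", "1e9"])
```

Two implementations that print the same primes in the same format should yield the same checksum, which is the "evidence" mentioned above.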
