Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

One off error in C/CPP API libprimesieve multi-threading demonstrations #137

Closed
marioroy opened this issue Aug 24, 2023 · 8 comments
Closed
Labels

Comments

@marioroy
Copy link

marioroy commented Aug 24, 2023

I came across your GitHub repo and read through the C/CPP documentation. And tried the OpenMP demonstrations. I changed it to counting sum += 1 (experimenting) and noticed the count was incorrect, not matching the primesieve utility.

C API libprimesieve multi-threading example

CPP API libprimesieve multi-threading example

The 1-off error is caused by the following line. I was getting the correct/wrong output depending on the number of threads specified.

uint64_t stop = start + thread_dist < dist ? start + thread_dist : dist;

Fix: Subtract 1 e.g. start + thread_dist - 1.

uint64_t stop = start + thread_dist < dist ? start + thread_dist - 1 : dist;
$ OMP_NUM_THREADS=10 ./primecnt
start = 1, stop = 1000000001
start = 1000000002, stop = 2000000002
start = 2000000003, stop = 3000000003
start = 3000000004, stop = 4000000004
start = 4000000005, stop = 5000000005
start = 5000000006, stop = 6000000006
start = 6000000007, stop = 7000000007
start = 7000000008, stop = 8000000008
start = 8000000009, stop = 9000000009
start = 9000000010, stop = 10000000000
Count of the primes <= 10^10: 455052511
@kimwalisch
Copy link
Owner

I tested both the C and C++ OpenMP sum primes examples using your sum += 1 modification using the latest primesieve-11.1 and it worked correctly for me.

Which version of primesieve are you using (use primesieve -v to find out)? Can you post the example program you used which did not work correctly? Currently I am guessing the bug you have witnessed comes from a breaking change introduced in primesieve-11, I guess you are using an older version of primesieve (in which case you should refer to the documentation of that specific version e.g.: https://github.com/kimwalisch/primesieve/blob/v8.0/doc/CPP_API.md)

@marioroy
Copy link
Author

$ primesieve -v
primesieve 11.1, <https://github.com/kimwalisch/primesieve>
Copyright (C) 2010 - 2023 Kim Walisch
BSD 2-Clause License <https://opensource.org/licenses/BSD-2-Clause>

The C and C++ OpenMP demonstrations sum += 1 work fine for OMP_NUM_THREADS 1 through 9. Then, began failing running with 10 threads. The above output (ran with fix applied) is correct e.g. the start and start values are unique per each segment. That isn't the case without subtracting 1.

@kimwalisch
Copy link
Owner

Then, began failing running with 10 threads.

Yes, you are right. I was able to reproduce the bug now.

@marioroy
Copy link
Author

marioroy commented Aug 24, 2023

Great! For debugging the issue, I added the ordered OpenMP directive to get ordered output.

#include <primesieve.h>
#include <inttypes.h>
#include <stdio.h>
#include <omp.h>

int main(void)
{
  uint64_t sum = 0;
  uint64_t dist = 1e10;
  int threads = omp_get_max_threads();
  uint64_t thread_dist = (dist / threads) + 1;

  #pragma omp parallel for ordered reduction(+: sum)
  for (int i = 0; i < threads; i++)
  {
    uint64_t start = i * thread_dist + 1;
    uint64_t stop = start + thread_dist < dist ? start + thread_dist : dist;

    #pragma omp ordered
    printf("start = %ld, stop = %ld\n", start, stop);

    primesieve_iterator it;
    primesieve_init(&it);
    primesieve_jump_to(&it, start, stop);
    uint64_t prime = primesieve_next_prime(&it);

    /* Sum primes inside [start, stop] */
    for (; prime <= stop; prime = primesieve_next_prime(&it))
      sum += 1;
  }

  printf("Count of the primes <= 10^10: %" PRIu64 "\n", sum);

  return 0;
}

@marioroy
Copy link
Author

Notice the 1-off error. The first segment has a stop value ending in 3. And the 2nd segment has a starting value ending in 3 as well. It just happens that the 1-off error doesn't manifest itself running with 1 to 9 threads.

$ OMP_NUM_THREADS=9 ./primecnt 
start = 1, stop = 1111111113
start = 1111111113, stop = 2222222225
start = 2222222225, stop = 3333333337
start = 3333333337, stop = 4444444449
start = 4444444449, stop = 5555555561
start = 5555555561, stop = 6666666673
start = 6666666673, stop = 7777777785
start = 7777777785, stop = 8888888897
start = 8888888897, stop = 10000000000
Count of the primes <= 10^10: 455052511

$ OMP_NUM_THREADS=10 ./primecnt 
start = 1, stop = 1000000002
start = 1000000002, stop = 2000000003
start = 2000000003, stop = 3000000004
start = 3000000004, stop = 4000000005
start = 4000000005, stop = 5000000006
start = 5000000006, stop = 6000000007
start = 6000000007, stop = 7000000008
start = 7000000008, stop = 8000000009
start = 8000000009, stop = 9000000010
start = 9000000010, stop = 10000000000
Count of the primes <= 10^10: 455052512 -- incorrect

Fix: start + thread_dist - 1 Start and stop values are unique.

$ OMP_NUM_THREADS=9 ./primecnt 
start = 1, stop = 1111111112
start = 1111111113, stop = 2222222224
start = 2222222225, stop = 3333333336
start = 3333333337, stop = 4444444448
start = 4444444449, stop = 5555555560
start = 5555555561, stop = 6666666672
start = 6666666673, stop = 7777777784
start = 7777777785, stop = 8888888896
start = 8888888897, stop = 10000000000
Count of the primes <= 10^10: 455052511

$ OMP_NUM_THREADS=10 ./primecnt 
start = 1, stop = 1000000001
start = 1000000002, stop = 2000000002
start = 2000000003, stop = 3000000003
start = 3000000004, stop = 4000000004
start = 4000000005, stop = 5000000005
start = 5000000006, stop = 6000000006
start = 6000000007, stop = 7000000007
start = 7000000008, stop = 8000000008
start = 8000000009, stop = 9000000009
start = 9000000010, stop = 10000000000
Count of the primes <= 10^10: 455052511

@kimwalisch
Copy link
Owner

The cleanest fix I have come up so far is:

#pragma omp parallel for reduction(+: sum)
for (int i = 0; i < threads; i++)
{
  uint64_t start = i * thread_dist;
  uint64_t stop = start + thread_dist;
  stop = stop < dist + 1 ? stop : dist + 1;
  primesieve_iterator it;
  primesieve_init(&it);
  primesieve_jump_to(&it, start, stop);
  uint64_t prime = primesieve_next_prime(&it);
  
  /* Sum primes inside [start, stop[ */
  for (; prime < stop; prime = primesieve_next_prime(&it))
    sum += 1;
}

I removed + 1 from start, added dist + 1 and use prime < stop instead of prime <= stop.

@marioroy
Copy link
Author

marioroy commented Aug 24, 2023

The output is correct, but not seeing unique starting and stopping values. Here, the segment 1 stopping value is the same as the segment 2 starting value.

$ OMP_NUM_THREADS=10 ./primesum2
start = 0, stop = 1000000001
start = 1000000001, stop = 2000000002
start = 2000000002, stop = 3000000003
start = 3000000003, stop = 4000000004
start = 4000000004, stop = 5000000005
start = 5000000005, stop = 6000000006
start = 6000000006, stop = 7000000007
start = 7000000007, stop = 8000000008
start = 8000000008, stop = 9000000009
start = 9000000009, stop = 10000000001
Count of the primes <= 10^10: 455052511

@kimwalisch
Copy link
Owner

The output is correct, but not seeing unique starting and stopping values.

Is this really an issue? The stop number is simply non-inclusive as mentioned in the comment: Sum primes inside [start, stop[

I also tried to find a fix with an inclusive stop number, but the code is not as clean and requires more + 1, -1 corrections...

kimwalisch added a commit that referenced this issue Aug 24, 2023
@kimwalisch kimwalisch added the bug label Aug 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants