Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gmpy2 Sieve Example Code Problem #492

Open
Landryj opened this issue Jul 13, 2024 · 10 comments
Open

gmpy2 Sieve Example Code Problem #492

Landryj opened this issue Jul 13, 2024 · 10 comments

Comments

@Landryj
Copy link

Landryj commented Jul 13, 2024

When I run the Sieve of Eratosthenes example code (Python 3.11.5, Win 11 Pro, VS Code editor, RAM-32GB), it fails at around 4.5*10^9. The error message is: "gmp: overflow in mpz type". I've also run the code with WSL on a machine with nearly 200GB RAM. Previously, I've had gotten more extensive messages previously.
image

@skirpichev
Copy link
Contributor

Could you provide your code? Which gmpy2 version you are using?

Please note, that 4.5*10**9 is not an integer literal: that explains previous traceback. I.e. when you are using your limit variable - it has a wrong type and you got a type error.

@Landryj
Copy link
Author

Landryj commented Jul 14, 2024

I am running gmpy2==2.2.0 using the code below. When I run it, I get no error and no output!

`import time
import gmpy2

def sieve(limit=5000000000):
'''Returns a generator that yields the prime numbers up to limit.'''

# Increment by 1 to account for the fact that slices  do not include
# the last index value but we do want to include the last value for
# calculating a list of primes.
sieve_limit = gmpy2.isqrt(limit) + 1
limit += 1

# Mark bit positions 0 and 1 as not prime.
bitmap = gmpy2.xmpz(3)

# Process 2 separately. This allows us to use p+p for the step size
# when sieving the remaining primes.
bitmap[4 : limit : 2] = -1
print(len(bitmap))

# Sieve the remaining primes.
for p in bitmap.iter_clear(3, sieve_limit):
    bitmap[p*p : limit : p+p] = -1

return bitmap.iter_clear(2, limit)

if name == "main":
start = time.time()
result = list(sieve())
print(time.time() - start)
print(len(result))
`

@skirpichev
Copy link
Contributor

This is not, apparently, the code you are trying to run in the issue description. This is just an example from docs with an insane limit value (could you think about memory requirements for such a bitmap?!) and some debugging statements. I see no issue here. Just decrease limit value to fit available memory of some computer on Earth.

Could you, please, show us code which you run before and where you got different error messages?

@casevh
Copy link
Collaborator

casevh commented Jul 14, 2024

The fundamental issue is that we are over-flowing the limits of a C long which is always 32 bit for a Windows application. It works fine under WSL which has a 64 bit C long.

The GMP library just does a core dump when it detects an overflow. gmpy2 v2.1.5 links to a C runtime library that outputs the error message gmp: overflow in mpz type. The GMP library used for v2.2.0 doesn't output the error message.

limit is the number of bits in an mpz type. I'll do more research on the actual limits on different platforms.

@casevh
Copy link
Collaborator

casevh commented Jul 14, 2024

I have found some other strange behaviors. I will continue to investigate.

@Landryj
Copy link
Author

Landryj commented Jul 14, 2024

Thanks for investigating. I've gotten various error messages on different machines and on Windows or WSL.

@skirpichev
Copy link
Contributor

The GMP library used for v2.2.0 doesn't output the error message.

It does.

$ ulimit -v $((1024*512))
$ python
Python 3.11.3+ (heads/3.11:9fbb614c4e, Apr 29 2023, 14:18:05) [GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import gmpy2
>>> bitmap = gmpy2.xmpz(3)
>>> bitmap[3:5000000000:2]=-1
GNU MP: Cannot reallocate memory (old_size=8 new_size=625000000)
Aborted

Which is a bit strange, that OP claims he have seen "gmp: overflow in mpz type" (same type error for earlier versions, see #280) and that gmpy2 version is 2.2.

@casevh
Copy link
Collaborator

casevh commented Jul 14, 2024

I should be more specific. This is a native Windows issue.

The GMP library used in v2.2.0 for x64 native Windows applications is linked against UCRT. It does not display the error message. The GMP library used in v2.1.5 for x64 native Windows applications is linked against MSVCRT. It does display the error message. The GMP library used on Linux displays the expected error message.

I haven't researched why, but I'm guessing that the GMP error message is sent to stderr and UCRT handles it differently.

In either case, GMP is aborting. We get an error message in most environments but not all. It is a red herring for the real issue - how can we make an x64 native application have the same limits as (for example) an x64 Linux application.

@skirpichev
Copy link
Contributor

It does display the error message.

Nothing at all, i.e. no something like "Aborted" above (this is coming from libc)?

Just in case, are you running Python interpreter from CLI or from some IDE like VS? If later, stderr might be redirected somehow.

@Landryj
Copy link
Author

Landryj commented Jul 14, 2024

For whatever it's worth, I run large primes using the 'bitstring' module. The Sieve code example can be found at [https://github.com/scott-griffiths/bitstring/blob/main/doc/walkthrough.ipynb]. I modified the code, as shown below, to run on my SurfacePro6 with 8GB RAM. I set the limit is 10,000,000,000. It ran in about 120 seconds.

`#### Landry Note: source is - https://github.com/scott-griffiths/bitstring/blob/main/doc/walkthrough.ipynb

Added 'import bitstring' and 'import math'

Added code for timing

from bitstring import BitArray
import math
import time

Code_Start_Time = time.time()

Scott Griffiths code below

His code example has 'limit = 100_000_000'

I have increasted it to: 10_000_000_000

Create a BitArray with a hundred million 'one' bits

limit = 10_000_000_000
is_prime = BitArray(limit)
is_prime.set(True)

Manually set 0 and 1 to be not prime.

is_prime.set(False, [0, 1])

For every other integer, if it's set as prime then unset all of its multiples

for i in range(2, math.ceil(math.sqrt(limit))):
if is_prime[i]:
is_prime.set(False, range(i*i, limit, i))

My timing code

Total_Execution_Time = time.time() - Code_Start_Time
print('Execution time:', Total_Execution_Time)

end my timing code

print(f"There are {is_prime.count(True)} primes less than {limit},")
print(f"the largest one of which is {is_prime.rfind('0b1')[0]}")
print(f"and there are {len(list(is_prime.findall('0b101')))} twin primes.")`

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants