Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Abnormal situation of converting "mpz" to "str" #489

Closed
gpchn opened this issue Jul 1, 2024 · 4 comments
Closed

Abnormal situation of converting "mpz" to "str" #489

gpchn opened this issue Jul 1, 2024 · 4 comments

Comments

@gpchn
Copy link

gpchn commented Jul 1, 2024

Hardware: Windows 10 PC (x86_64)

Python version: 3.12.1

gmpy2 version: 2.2.0

Problem: I am writing a program about calculating pi to n decimal places. It uses the Chudnovsky formula and binary splitting algorithm. Its calculation result is correct and the speed is very fast. Here is the code:

def chudnovsky_binsplit_mpz(digits:int):
    from gmpy2 import mpz, sqrt

    digits *= 2
    def binsplit(a, b):
        if b - a == 1:
            if a == 0:
                Pab = Qab = 1
            else:
                Pab = (6*a-5) * (2*a-1) * (6*a-1)
                Qab = 640320**3//24 * a**3
            Tab = (13591409 + 545140134 * a) * Pab
            if a & 1:
                Tab = -Tab
            return mpz(Pab), mpz(Qab), mpz(Tab)
        else:
            m = (a + b) // 2
            Pam, Qam, Tam = binsplit(a, m)
            Pmb, Qmb, Tmb = binsplit(m, b)
            Pab = Pam * Pmb
            Qab = Qam * Qmb
            Tab = Qmb * Tam + Pam * Tmb
            return Pab, Qab, Tab
    
    terms = digits // 14 + 1
    P, Q, T = binsplit(0, terms)
    return mpz(Q * 426880 * sqrt(10005 * mpz(10)**digits) // T)

def main():
    n = 1000000
    pi = str(chudnovsky_binsplit_mpz(n))
    print(pi)


if __name__ == "__main__":
    main()

The calculated result is "31415......0995". I am sure it is correct because it is the same as the result before acceleration.

The problem is that when I try to take the last number, it does not match the expected result:

print(pi[-1])
# printed 4, not 5

This is indeed confusing. So I started trying to debug:

def main():
    n = 1000000
    pi = str(chudnovsky_binsplit_mpz(n))
    print(pi)

    print(len(pi))
    # 1000001
    correct_pi = "31415926......0995" # Ctrl-C from terminal
    print(type(pi) == type(correct_pi) == str)
    # true
    print(pi == correct_pi)
    # false
    for i,j in zip(pi, correct_pi):
        if i != j:
            print("!")
    # (no "!" was printed)

So the length is correct, the type is correct, each character is correct, but it's just different, and slicing can also cause problems!

Is this a problem of CPython, gmpy2, or something else?...

@gpchn
Copy link
Author

gpchn commented Jul 1, 2024

Just now, I tried to use math.isort() instead gmpy2. I recalculated and the last few digits became "8778"...

Even worse, when I print(pi[-1]) again, the output is 1...

I hate mathematics

update: I found online that the number 1000000th is actually 1

@gpchn gpchn closed this as completed Jul 1, 2024
@gpchn gpchn reopened this Jul 1, 2024
@casevh
Copy link
Collaborator

casevh commented Jul 2, 2024

The key issue in your code it the use of sqrt versus isqrt. You've been encountering precision errors with sqrt. The following code should run correctly both with and without gmpy2. You will need to comment out the gmpy2 import line and uncomment the following lines. See comments in code. I have verified calculations to both 1_000_000 and 10_000_000 digits.

def chudnovsky_binsplit_mpz(digits:int):
    # Use just the next line only to use gmpy2.
    from gmpy2 import mpz, isqrt

    # Comment out the previous line and uncomment the following lines
    # to use native Python. Limited to 10_000_000 digits.

    # from math import isqrt
    # import sys
    # sys.set_int_max_str_digits(10_000_100)
    # mpz = int

    digits *= 2
    def binsplit(a, b):
        if b - a == 1:
            if a == 0:
                Pab = Qab = 1
            else:
                Pab = (6*a-5) * (2*a-1) * (6*a-1)
                Qab = 640320**3//24 * a**3
            Tab = (13591409 + 545140134 * a) * Pab
            if a & 1:
                Tab = -Tab
            return mpz(Pab), mpz(Qab), mpz(Tab)
        else:
            m = (a + b) // 2
            Pam, Qam, Tam = binsplit(a, m)
            Pmb, Qmb, Tmb = binsplit(m, b)
            Pab = Pam * Pmb
            Qab = Qam * Qmb
            Tab = Qmb * Tam + Pam * Tmb
            return Pab, Qab, Tab

    terms = digits // 14 + 1
    P, Q, T = binsplit(0, terms)
    # Use 'isqrt' (integer square root) instead of 'sqrt' (floating
    # point square root).
    #
    # Remove redundant 'mpz(...)' call.
    return Q * 426880 * isqrt(10005 * mpz(10)**digits) // T

def main():
    n = 10_000_000
    pi = str(chudnovsky_binsplit_mpz(n))
    # Print exactly 'n' digits.
    print(pi[:n])


if __name__ == "__main__":
    main()

@gpchn
Copy link
Author

gpchn commented Jul 3, 2024

Thank you for response. But when I run your code, the result is incorrect. I tried several times and eventually found that it lost accuracy when n >= 16383 (perhaps only on my computer).

Also, interestingly, if only print(pi[-1]), it is correct(when n = 1000000).

I think this goes far beyond my knowledge of programming. The purpose of this issue is only to remind you if there are any bugs in gmpy2. If I wrote the wrong code, you can close this issue.

In fact, I have given up on continuing to delve into calculating pi or computer processing large numbers, which is too difficult.

@casevh
Copy link
Collaborator

casevh commented Jul 15, 2024

Closing.

@casevh casevh closed this as completed Jul 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants