gmpy2 can't parse valid Python number literals #210
Comments
I'm testing a fix for mpz(). I'll fix mpfr() next. The use of spaces as separators is a GMP behavior, not an MPFR one. Underscores were introduced in Python 3.6.
I've done some testing, and it becomes complicated to exactly match Python's handling of underscores in all cases. I tried the simple approach of just removing the underscore characters, but then strings that are invalid for Python become valid in gmpy2. (int('1_2_') raises an exception in Python but returns mpz(12) with gmpy2.) If the expectation is that valid Python strings are converted to the same value, then just deleting underscores should work. If the expectation is that all invalid Python strings raise an exception, then I don't think that will ever be possible.
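The trade-off described above can be reproduced with plain Python, without gmpy2. This is only a minimal sketch: `naive_strip` and `python_accepts` are hypothetical helpers written for illustration, not gmpy2's actual pre-processor, and the built-in int() stands in for mpz().

```python
def python_accepts(text: str) -> bool:
    """Return True if Python's own int() parses the string."""
    try:
        int(text)
    except ValueError:
        return False
    return True


def naive_strip(text: str) -> int:
    """Hypothetical pre-processor: delete every underscore, then parse."""
    return int(text.replace("_", ""))


# One valid literal and three that Python rejects
# (trailing, leading, and doubled underscores).
samples = ["1_000", "1_2_", "_12", "1__2"]
results = {s: (python_accepts(s), naive_strip(s)) for s in samples}

for text, (valid, stripped) in results.items():
    print(f"{text!r}: accepted by int()={valid}, after stripping={stripped}")
```

Every valid string keeps its value, but the three invalid ones are silently accepted once the underscores are gone, which is the mismatch the comment describes.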
On Sun, Mar 10, 2019 at 09:25:32AM -0700, casevh wrote:
If the expectation is that valid Python strings are converted to the same
value, then just deleting underscores should work. If the expectation
is that all invalid Python strings raise an exception, then I don't think
that will ever be possible.
I think that both goals should be possible. But certainly, this issue is
not so severe that it should block the next release.
I will get back to this when I have more time (i.e. after the next release).
This issue also came up in the SageMath project. See for how this was dealt with there.
@casevh, maybe calling int() on the string first could be a solution?
Refreshing my memory... Calling int() would cause performance delays that I don't want to introduce. Do we want to change the behavior for strings that contain embedded spaces (i.e. mpz("1 2 3") currently returns mpz(123))? The simplest solution would be to just strip space and underscore characters. It wouldn't change the current embedded-space behavior, which is a side effect of GMP, and it would accept valid strings that contain underscores. Rewriting the string pre-processor in C to follow Python's behavior exactly is more than I want to do right now.
This is true. But how common is such a path of initialization? Second, we could try int()'s string parsing only on failure...
Probably not. The reason for this issue is to simplify replacing int with mpz and vice versa. If the mpz() constructor supports everything that the int() constructor does, that will be fine.
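The "try int() only on failure" idea could look roughly like the sketch below. It assumes a fast parser that knows nothing about underscores. `strict_parse` is a hypothetical pure-Python stand-in for that fast path (it is not gmpy2 code and handles only unsigned ASCII digits), and `parse_with_fallback` is likewise a made-up name.

```python
def strict_parse(text: str) -> int:
    """Hypothetical stand-in for a fast parser without underscore support."""
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"invalid digits: {text!r}")
    return int(text)


def parse_with_fallback(text: str) -> int:
    """Use the fast path first; pay for int()'s full parsing only on failure."""
    try:
        return strict_parse(text)
    except ValueError:
        # Slow path: int() applies Python's literal rules, so it accepts
        # '1_000' and raises ValueError for strings such as '1_2_'.
        return int(text)


print(parse_with_fallback("123"))    # fast path
print(parse_with_fallback("1_000"))  # slow path via int()
```

Under this scheme, plain digit strings never touch the slow path, so the cost of int() is paid only for inputs the fast parser rejects. It does not reject strings that the fast parser itself accepts more leniently than Python, such as the embedded-space case mentioned earlier.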
Maybe we should patch gmplib/mpfr to support underscores?
Ok, @casevh, it seems you are happy with the current solution (which accepts invalid numeric literals), so I'll close this issue.
I expect that any correct number coming in str form must be accepted by gmpy2's mpz/mpq/mpfr. Unfortunately, this is not the case. For example, underscores can't be used to group digits (spaces sometimes are accepted instead):