substr() indexing is wrong on very big numbers #20105
Comments
Here is some additional numerology. We get the entire string as long as the most significant bit (MSB) of a pointer-sized value is not set.
Then we get no value, as long as the MSB is set.
Then we start getting some of the string again, probably following the overflow, which should unset the MSB.
But then it behaves as if the sign bit (whichever one it is) is set ...
... because it behaves the same way as this sequence of negative numbers.
Personally I don't really think this is a bug in substr(). This is an example where Perl's flexibility with numeric types produces surprising results. Let's look at what perl thinks of that first number:
$ perl -MDevel::Peek -le'Dump(18446744073709551615)'
This value happens to be the decimal representation of UV_MAX, i.e., 2**64-1. It is the highest value perl alone can represent as a true integer type. Add one and you get this:
$ perl -MDevel::Peek -le'Dump(18446744073709551616)'
That is, it has been converted to an NV (a double, i.e., floating point), which best represents the original integer.
The internals logic uses the macro SvIV() to get a signed representation of the argument, and then checks whether this IV is actually a UV. As you can see, in the first case it is a UV, so substr() does not think it is a negative number. In the NV case, the NV is converted to an IV, the "is UV" flag is not set, and it turns into -1. If there is a bug, I guess it would be here.
The second set of output you provided relates to bignum, and I believe you are seeing a similar set of effects. I am not sure of the exact details, but it wouldn't surprise me if a bignum turns into a float, which is then converted to an IV, and we see the same issue as above.
This would occur in just about any part of our APIs where we internally cast data into a UV/IV, so if we need to address it (it feels like a "well, don't do that") then we should address it at the numeric layer, not the substr() layer. I don't know the details of NV -> IV/UV conversion, but it feels wrong that the sign changes in the above cases.
Yves
but then what should someone with a string of 16 exbibytes (
I can see consistency in the non-bignum examples.
To me, that lends credence to the behaviour. Cheers,
Above a certain value, the output of
substr EXPR,OFFSET
is the last character of the string.
And above the same value, the output of
substr EXPR,OFFSET,LENGTH
chops off the last character of the string.