Parsing literal assignment of min int64 is not an integer (-9223372036854775808) #14589

Closed
lewismoten opened this issue Jun 16, 2024 · 14 comments

Comments

@lewismoten

lewismoten commented Jun 16, 2024

Description

I'm testing some edge cases with my code and traced a problem down to PHP itself. It turns out that assigning the minimum value of a 64-bit signed integer as a literal (-9223372036854775808) to a variable does not produce an integer. However, minimum value + 1 (-9223372036854775807) is recognized as an integer. In addition, the PHP_INT_MIN constant, which has the same value, is recognized as an integer. Something is going on with the interpretation of numeric literals for this specific value only.

Furthermore, if I instead take the minimum value + 1 and subtract 1 from it (i.e. $value = -9223372036854775807 - 1), the minimum 64-bit signed value is correctly interpreted as an integer.

The following code:

<?php
error_reporting(E_ALL);

function foo(int $value) {
    return "Yes!";
}

echo "PHP Version: ".PHP_VERSION."<br>";

echo "Minimum value: ".PHP_INT_MIN."<br>"; // -9223372036854775808
echo "is_int: ".is_int(PHP_INT_MIN)."<br>";
echo "gettype: ".gettype(PHP_INT_MIN)."<br>";
echo "binary: 0x". strtoupper(unpack('H*', pack('J', PHP_INT_MIN))[1])."<br>";
try {
    echo "function worked: " . foo(PHP_INT_MIN)."<br>";
} catch (Exception $e) {
    echo "function failed: ".$e->getMessage()."<br>";
}
echo "<br>";

echo "PHP_INT_MIN assignment to variable<br>";
$value = PHP_INT_MIN;
echo "number: PHP_INT_MIN<br>";
echo "variable: ".$value."<br>";
echo "is_int: ".is_int($value)."<br>";
echo "gettype: ".gettype($value)."<br>";
echo "binary: 0x". strtoupper(unpack('H*', pack('J', $value))[1])."<br>";
try {
    echo "function type declaration passed: " . foo($value)."<br>";
} catch (Exception $e) {
    echo "function type declaration failed: ".$e->getMessage()."<br>";
}
echo "<br>";

echo "Hard-coded/literal assignment of -9223372036854775808 to variable<br>";
$value = -9223372036854775808;
echo "number: -9223372036854775808<br>";
echo "variable: ".$value."<br>"; // -9.2233720368548E+18 ?
echo "is_int: ".is_int($value)."<br>"; // false?
echo "gettype: ".gettype($value)."<br>"; // double?
echo "binary: 0x". strtoupper(unpack('H*', pack('J', $value))[1])."<br>";
try {
    echo "function type declaration passed: " . foo($value)."<br>";
} catch (Exception $e) {
    echo "function type declaration failed: ".$e->getMessage()."<br>";
}
echo "<br>";

$value = -9223372036854775807;
echo "test: minimum signed int64 + 1<br>";
echo "number: -9223372036854775807<br>";
echo "variable: ".$value."<br>";
echo "is_int: ".is_int($value)."<br>";
echo "gettype: ".gettype($value)."<br>";
echo "binary: 0x". strtoupper(unpack('H*', pack('J', $value))[1])."<br>";
try {
    echo "function type declaration passed: " . foo($value)."<br>";
} catch (Exception $e) {
    echo "function type declaration failed: ".$e->getMessage()."<br>";
}
echo "<br>";

$value = -9223372036854775807 - 1;
echo "test: minimum signed (int64 + 1) - 1<br>";
echo "number: -9223372036854775808<br>";
echo "variable: ".$value."<br>";
echo "is_int: ".is_int($value)."<br>";
echo "gettype: ".gettype($value)."<br>";
echo "binary: 0x". strtoupper(unpack('H*', pack('J', $value))[1])."<br>";
try {
    echo "function type declaration passed: " . foo($value)."<br>";
} catch (Exception $e) {
    echo "function type declaration failed: ".$e->getMessage()."<br>";
}
?>

Resulted in this output:

PHP Version: 8.3.6
Minimum value: -9223372036854775808
is_int: 1
gettype: integer
binary: 0x8000000000000000
function worked: Yes!

PHP_INT_MIN assignment to variable
number: PHP_INT_MIN
variable: -9223372036854775808
is_int: 1
gettype: integer
binary: 0x8000000000000000
function type declaration passed: Yes!

Hard-coded/literal assignment of -9223372036854775808 to variable
number: -9223372036854775808
variable: -9.2233720368548E+18
is_int:
gettype: double
binary: 0x8000000000000000
function type declaration passed: Yes!

test: minimum signed int64 + 1
number: -9223372036854775807
variable: -9223372036854775807
is_int: 1
gettype: integer
binary: 0x8000000000000001
function type declaration passed: Yes!

test: minimum signed (int64 + 1) - 1
number: -9223372036854775808
variable: -9223372036854775808
is_int: 1
gettype: integer
binary: 0x8000000000000000
function type declaration passed: Yes!

But I expected this output instead:

PHP Version: 8.3.6
Minimum value: -9223372036854775808
is_int: 1
gettype: integer
binary: 0x8000000000000000
function worked: Yes!

PHP_INT_MIN assignment to variable
number: PHP_INT_MIN
variable: -9223372036854775808
is_int: 1
gettype: integer
binary: 0x8000000000000000
function type declaration passed: Yes!

Hard-coded/literal assignment of -9223372036854775808 to variable
number: -9223372036854775808
variable: -9223372036854775808
is_int: 1
gettype: integer
binary: 0x8000000000000000
function type declaration passed: Yes!

test: minimum signed int64 + 1
number: -9223372036854775807
variable: -9223372036854775807
is_int: 1
gettype: integer
binary: 0x8000000000000001
function type declaration passed: Yes!

test: minimum signed (int64 + 1) - 1
number: -9223372036854775808
variable: -9223372036854775808
is_int: 1
gettype: integer
binary: 0x8000000000000000
function type declaration passed: Yes!

PHP Version

8.3.6

Operating System

No response

@mvorisek
Contributor

here is minimal repro: https://3v4l.org/hps4B

@lewismoten
Author

Uncertain, but this may be somehow related to issue #13243

@lewismoten
Author

> here is minimal repro: https://3v4l.org/hps4B

Thanks for the minimal repro. I wasn't aware of that site before. I was trying to convey as much detail as possible about what works and what doesn't. For now, the workaround for this issue is to use the penultimate value and subtract one:

<?php
$value = -9223372036854775807 - 1;
?>

@damianwadley
Member

damianwadley commented Jun 17, 2024

This is a result of how PHP parses numbers in code: -9223372036854775808 is not a negative number but actually 9223372036854775808 negated. And since that number is too large for a (positive) integer, the result is a negative float.

I thought this was documented but I'm not finding where...

edit: See php/doc-en#2400
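
To make the parsing order described above concrete, here is a minimal sketch, assuming a 64-bit PHP build (PHP_INT_SIZE === 8). The variable names are illustrative only. It checks the type of the magnitude on its own, and then of the negated literal:

```php
<?php
// Assumes a 64-bit build; on 32-bit builds the boundary values differ.
$magnitude = 9223372036854775808;   // one past PHP_INT_MAX, so it is parsed as a float
$negated   = -9223372036854775808;  // unary minus applied to that float
$inRange   = -9223372036854775807;  // magnitude fits in an int, so the result is an int

var_dump(is_float($magnitude)); // bool(true)
var_dump(is_float($negated));   // bool(true)
var_dump(is_int($inRange));     // bool(true)
```

The only difference between the last two lines of assignments is whether the digits after the minus sign fit in an int before the sign is applied.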

@lewismoten
Author

Thanks @damianwadley. Looking at how other literals are formatted, I can understand the concept of what is going on with the following example.

<?php
var_dump(PHP_INT_MIN); // integer
var_dump(-9223372036854775808); // literal decimal
var_dump(-0x8000000000000000); // literal hex
var_dump(-0b1000000000000000000000000000000000000000000000000000000000000000); // literal binary
var_dump(-01000000000000000000000); // literal octal
var_dump(PHP_INT_MIN === -9223372036854775808);

int(-9223372036854775808)
float(-9.223372036854776E+18)
float(-9.223372036854776E+18)
float(-9.223372036854775E+18)
float(-9.223372036854776E+18)
bool(false)
?>

So with these literals: although they technically represent the minimum value that can be stored as an integer, they are not themselves parsed as integers. I think this edge case should be documented, since it is odd to write code that plainly states a well-known value and have it interpreted as something else.

I previously tried hunting through the documentation (i.e. Type Declarations, Integer, Arithmetic Operators, Assignment Operators) to see if this was known behavior, but came up empty. Still, this seems like a bug with the parser itself. Technically, -9223372036854775808 is a valid 64-bit integer. I know it is. Isn't it? ... isn't it? no? 😢 😭 Since it's only a problem with this specific value (-9223372036854775808), there should be a hook that detects if the string is equal to PHP_INT_MIN, and treats it as such. I mean, that's what PHP_INT_MIN is for, right? Why else would it print out as a signed integer if we can't use that as a hard-coded value?

I'm starting to find references to this issue online:

And now I'm finding a flood of bug reports

@lewismoten
Author

I'm noticing that the integer documentation actually does reference it, but doesn't make it explicitly clear that the literal value -9223372036854775808 on 64-bit systems will not be interpreted as an integer. To the layman, PHP "appears" to support negative integer literals, since -9223372036854775807 is interpreted as an integer. It's a bit vague about the limitation it's trying to convey. It needs to come right out and say what it's trying to say, or provide an example.

The size of an int is platform-dependent, although a maximum value of about two billion is the usual value (that's 32 bits signed). 64-bit platforms usually have a maximum value of about 9E18. PHP does not support unsigned ints. int size can be determined using the constant PHP_INT_SIZE, maximum value using the constant PHP_INT_MAX, and minimum value using the constant PHP_INT_MIN.

Needs an update/addition to clarify what it's trying to say:

The negative sign operator prefixing literal numbers is not evaluated when parsing numbers as integers, thus making the minimum value on 32-bit and 64-bit systems impossible to write as an integer literal. Instead, use PHP_INT_MIN to represent these values when needed. For example, on a 64-bit system, use PHP_INT_MIN instead of hard-coding -9223372036854775808.
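
A short example of the kind the note above asks for might look like the following. This is only a sketch: `$portableMin` is an illustrative name, not something taken from the PHP manual.

```php
<?php
// Works on both 32-bit and 64-bit builds: pure integer arithmetic, no overflow.
$portableMin = -PHP_INT_MAX - 1;

var_dump($portableMin === PHP_INT_MIN);         // bool(true)

// The hard-coded literal is parsed as a float, so the strict comparison fails.
var_dump(-9223372036854775808 === PHP_INT_MIN); // bool(false)
```

Writing the value as `-PHP_INT_MAX - 1` or `PHP_INT_MIN` also avoids tying the code to one integer width.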

@lewismoten
Author

Closing as this is a duplicate of many issues that have already been vetted and closed as well.

@mvorisek
Contributor

Might I ask some php dev if the parsing is really that hard to be changed to parse negative integers first instead of negative operator?

@lewismoten
Author

> Might I ask some php dev if the parsing is really that hard to be changed to parse negative integers first instead of negative operator?

You could try, but I suspect you won’t get far. Read through some of the comments on those links I posted. Since the number is parsed separately from the negate operator, it doesn’t have the context necessary to convert to a signed integer. The problem has been rediscovered many times. The only action taken has been to update the documentation. Mind you, the documentation is not clear about the actual problem and why this specific value is not an integer.

@mvorisek
Contributor

mvorisek commented Jun 17, 2024

I believe this needs to be changed - https://github.com/php/php-src/blob/php-8.3.8/Zend/zend_ini_scanner.l#L354 - to parse signed numbers

testcase/repro: https://3v4l.org/MFUWB

discussion: https://bugs.php.net/bug.php?id=78081#1559116523

@damianwadley
Member

> I believe this needs to be changed - https://github.com/php/php-src/blob/php-8.3.8/Zend/zend_ini_scanner.l#L354 - to parse signed numbers

That's the INI parser. Check the zend_language files instead.

> testcase/repro: https://3v4l.org/MFUWB

The tokenizer is a separate thing - one that doesn't have much knowledge about the actual semantics of PHP code, and I suspect something that can't be modified to tokenize a signed number without that also impacting how it tokenizes other code.

> discussion: https://bugs.php.net/bug.php?id=78081#1559116523

Oh hey, look at that, been five years since I ~~complained~~ mentioned that this wasn't documented... 😅
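
The split between the minus sign and the number can be observed with PHP's tokenizer extension. The sketch below assumes that extension is enabled and that PHP is a 64-bit build; `tokenNames()` is a hypothetical helper written only for this illustration.

```php
<?php
// Hypothetical helper: returns the token names for a snippet of PHP source.
// Single-character tokens such as "-" and ";" are returned as-is.
function tokenNames(string $code): array
{
    $names = [];
    foreach (token_get_all($code) as $token) {
        $names[] = is_array($token) ? token_name($token[0]) : $token;
    }
    return $names;
}

print_r(tokenNames('<?php -9223372036854775808;'));
print_r(tokenNames('<?php -9223372036854775807;'));
```

On a 64-bit build, the first call should list a separate `-` followed by `T_DNUMBER`, while the second should list `-` followed by `T_LNUMBER`. In both cases the sign is never part of the number token.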

@lewismoten
Author

> > I believe this needs to be changed - https://github.com/php/php-src/blob/php-8.3.8/Zend/zend_ini_scanner.l#L354 - to parse signed numbers
>
> That's the INI parser. Check the zend_language files instead.
>
> > testcase/repro: https://3v4l.org/MFUWB
>
> The tokenizer is a separate thing - one that doesn't have much knowledge about the actual semantics of PHP code, and I suspect something that can't be modified to tokenize a signed number without that also impacting how it tokenizes other code.
>
> > discussion: https://bugs.php.net/bug.php?id=78081#1559116523
>
> Oh hey, look at that, been five years since I ~~complained~~ mentioned that this wasn't documented... 😅

IMHO, it's still a documentation problem. I overlooked the existing documentation since it wasn't clear that it was related to this specific scenario. Oftentimes you can see others in the comments talking about gotchas and demonstrating examples. None of that is present. This specific issue needs to be called out as clear as daylight. A one-line code example would go far.

@damianwadley
Member

Agreed. Note that php/doc-en#2400 exists.

@lewismoten
Author

Looking at the zend_language files, it's hard to discern what is going on. It looks like the negate operator is applied to an expression here as it tries to create a token. Note that the code is not aware that it is parsing a number - it just knows that an expression is being negated. The expression 9223372036854775808 has already been (or will be) evaluated as a float.

https://github.com/php/php-src/blob/ac947925c0f2e6d8733b530179fb4ed465918f11/Zend/zend_language_parser.y#L1198C26-L1198C51

Perhaps, if anything were to happen, the evaluated expression 9223372036854775808 could carry some kind of state during parsing to indicate that, although it is currently treated as a float, it could be coerced back to an integer (PHP_INT_MIN) if it is negated.
