Permalink
Browse files

json_decode() should generate a syntax error when given "".

Fixes bug #68938 (json_decode() decodes empty string without error).
Patch by jeremy at bat-country dot us.
  • Loading branch information...
LawnGnome committed Feb 2, 2015
1 parent fb803ff commit a7b3abe4e6f5e2fdfd8d55b676c9ca6b3f9c8cc8
Showing with 23 additions and 2 deletions.
  1. +4 −0 NEWS
  2. +1 −0 ext/json/json.c
  3. +7 −2 ext/json/tests/bug54484.phpt
  4. +11 −0 ext/json/tests/bug68938.phpt
View
4 NEWS
@@ -24,6 +24,10 @@ PHP NEWS
. Fixed bug #68571 (core dump when webserver close the socket).
(redfoxli069 at gmail dot com, Laruence)
- JSON:
. Fixed bug #68938 (json_decode() decodes empty string without error).
(jeremy at bat-country dot us)
- Libxml:
. Fixed bug #64938 (libxml_disable_entity_loader setting is shared
between threads). (Martin Jansen)
View
@@ -818,6 +818,7 @@ static PHP_FUNCTION(json_decode)
JSON_G(error_code) = 0;
if (!str_len) {
JSON_G(error_code) = PHP_JSON_ERROR_SYNTAX;
RETURN_NULL();
}
@@ -15,11 +15,16 @@ json_decode("invalid json");
var_dump(json_last_error());
json_decode("\001 invalid json");

This comment has been minimized.

Show comment
Hide comment
@bukka

bukka Feb 2, 2015

Contributor

@LawnGnome This is interesting one! I don't really think that this should be JSON_ERROR_CTRL_CHAR because the control characters are considered only for strings ( https://tools.ietf.org/html/rfc7159#section-7 ) and this is not a string as there are missing double quotes around it. The old decoder set ctrl char error but the new one (jsond based) set syntax error which makes more sense IMHO. It means a tiny BC break (if we consider errors as BC break) that I wasn't aware of. I changed the test in my PR ( bukka@c3cd2d3 ) for now and I plan to merge tomorrow. It's not a big issue that should delay the merging jsond but I'll think about it, check other cases and possibly later ping internals about it... Anyway thanks for adding this test even if it's not really related to this bug. ;)

@bukka

bukka Feb 2, 2015

Contributor

@LawnGnome This is interesting one! I don't really think that this should be JSON_ERROR_CTRL_CHAR because the control characters are considered only for strings ( https://tools.ietf.org/html/rfc7159#section-7 ) and this is not a string as there are missing double quotes around it. The old decoder set ctrl char error but the new one (jsond based) set syntax error which makes more sense IMHO. It means a tiny BC break (if we consider errors as BC break) that I wasn't aware of. I changed the test in my PR ( bukka@c3cd2d3 ) for now and I plan to merge tomorrow. It's not a big issue that should delay the merging jsond but I'll think about it, check other cases and possibly later ping internals about it... Anyway thanks for adding this test even if it's not really related to this bug. ;)

This comment has been minimized.

Show comment
Hide comment
@LawnGnome

LawnGnome Feb 3, 2015

Contributor

To be clear: this isn't my patch. I submitted it on behalf of the reporter of https://bugs.php.net/bug.php?id=68938.

I think you could argue it either way, but as long as it generates an error, I'm not too concerned.

@LawnGnome

LawnGnome Feb 3, 2015

Contributor

To be clear: this isn't my patch. I submitted it on behalf of the reporter of https://bugs.php.net/bug.php?id=68938.

I think you could argue it either way, but as long as it generates an error, I'm not too concerned.

This comment has been minimized.

Show comment
Hide comment
@SteelPangolin

SteelPangolin Feb 3, 2015

I'm the reporter of 68938, and just wanted a different error code from 4 (JSON_ERROR_SYNTAX) to preserve the original intent of the regression test for 54484: that the error code should change after a call to json_decode(''). Pretty much by accident, I found that error code 3 (JSON_ERROR_CTRL_CHAR) was easy to induce this way, but I'm not actually sure if it's the correct error; one could just as easily make a case for JSON_ERROR_SYNTAX, since there's no valid production that starts with a U+0001.

@SteelPangolin

SteelPangolin Feb 3, 2015

I'm the reporter of 68938, and just wanted a different error code from 4 (JSON_ERROR_SYNTAX) to preserve the original intent of the regression test for 54484: that the error code should change after a call to json_decode(''). Pretty much by accident, I found that error code 3 (JSON_ERROR_CTRL_CHAR) was easy to induce this way, but I'm not actually sure if it's the correct error; one could just as easily make a case for JSON_ERROR_SYNTAX, since there's no valid production that starts with a U+0001.

var_dump(json_last_error());
json_decode("");
var_dump(json_last_error());
?>
--EXPECT--
int(0)
int(0)
int(4)
int(0)
int(4)
int(3)
int(4)
@@ -0,0 +1,11 @@
--TEST--
Bug #68938 (json_decode() decodes empty string without indicating error)
--SKIPIF--
<?php if (!extension_loaded("json")) print "skip"; ?>
--FILE--
<?php
json_decode("");
var_dump(json_last_error());
?>
--EXPECT--
int(4)

8 comments on commit a7b3abe

@Majkl578

This comment has been minimized.

Show comment
Hide comment
@Majkl578

Majkl578 Feb 4, 2015

Contributor

This is a BC break, which should be reverted in 5.5 and 5.6 branches.

Contributor

Majkl578 replied Feb 4, 2015

This is a BC break, which should be reverted in 5.5 and 5.6 branches.

@LawnGnome

This comment has been minimized.

Show comment
Hide comment
@LawnGnome

LawnGnome Feb 4, 2015

Contributor

One person's BC break is another person's bug fix. I think it's OK (obviously, since I committed it) — "" is clearly invalid JSON, so signalling it as such seems reasonable.

Contributor

LawnGnome replied Feb 4, 2015

One person's BC break is another person's bug fix. I think it's OK (obviously, since I committed it) — "" is clearly invalid JSON, so signalling it as such seems reasonable.

@Majkl578

This comment has been minimized.

Show comment
Hide comment
@Majkl578

Majkl578 Feb 4, 2015

Contributor

Consider this piece of code (which exists in real applications/frameworks):

$data = json_decode($json);
if (json_last_error() !== JSON_ERROR_NONE) {
    throw new JsonException();
}

While it was completely OK to pass NULL or empty string before (e.g. $json was nullable in database), now it throws an exception.
Here are some real-world examples which will break when NULL/"" is passed: [1], [2], [3], [4], [5], [6], etc..

I'm not arguing about empty string being valid or invalid JSON, it is clearly invalid. But this change is a BC break nevertheless.

But let's see what RM thinks, cc @jpauli.

Contributor

Majkl578 replied Feb 4, 2015

Consider this piece of code (which exists in real applications/frameworks):

$data = json_decode($json);
if (json_last_error() !== JSON_ERROR_NONE) {
    throw new JsonException();
}

While it was completely OK to pass NULL or empty string before (e.g. $json was nullable in database), now it throws an exception.
Here are some real-world examples which will break when NULL/"" is passed: [1], [2], [3], [4], [5], [6], etc..

I'm not arguing about empty string being valid or invalid JSON, it is clearly invalid. But this change is a BC break nevertheless.

But let's see what RM thinks, cc @jpauli.

@bukka

This comment has been minimized.

Show comment
Hide comment
@bukka

bukka Feb 4, 2015

Contributor
Contributor

bukka replied Feb 4, 2015

@jpauli

This comment has been minimized.

Show comment
Hide comment
@jpauli

jpauli Feb 4, 2015

Contributor

This breaks BC and will break applications migrating to a micro version of mature stable branches.
This breaks our release process and so will be reverted from our stable branches (5.5 and 5.6), and will be applied/kept in master

Contributor

jpauli replied Feb 4, 2015

This breaks BC and will break applications migrating to a micro version of mature stable branches.
This breaks our release process and so will be reverted from our stable branches (5.5 and 5.6), and will be applied/kept in master

@Tyrael

This comment has been minimized.

Show comment
Hide comment
@Tyrael

Tyrael Feb 4, 2015

Member

for the record bugfixes are BC breaks by nature, and those are allowed by our rfc process.
but I do agree that in this specific case the bugfix isn't really that important to break applications.
for 5.5/5.6 we could maybe consider adding an E_DEPRECATED (instead setting json_globals.error_code) to indicate that this behavior will be changed/removed in the future, but previously we discussed on the list that introducing new E_DEPRECATED in a micro version isn't that nice.

Member

Tyrael replied Feb 4, 2015

for the record bugfixes are BC breaks by nature, and those are allowed by our rfc process.
but I do agree that in this specific case the bugfix isn't really that important to break applications.
for 5.5/5.6 we could maybe consider adding an E_DEPRECATED (instead setting json_globals.error_code) to indicate that this behavior will be changed/removed in the future, but previously we discussed on the list that introducing new E_DEPRECATED in a micro version isn't that nice.

@Majkl578

This comment has been minimized.

Show comment
Hide comment
@Majkl578

Majkl578 Feb 6, 2015

Contributor

Hmm, it's still present in latest RCs, php-5.6.6RC1 and php-5.5.22RC1. :(

@Tyrael: I think this change is ok for x.y+1.z as is (without E_DEPRECATED), but not for stable branches.

Contributor

Majkl578 replied Feb 6, 2015

Hmm, it's still present in latest RCs, php-5.6.6RC1 and php-5.5.22RC1. :(

@Tyrael: I think this change is ok for x.y+1.z as is (without E_DEPRECATED), but not for stable branches.

@jpauli

This comment has been minimized.

Show comment
Hide comment
@jpauli

jpauli Feb 6, 2015

Contributor

I reverted those from our current RC , 5.5.22RC1 and 5.6.6RC1.
See the commit log

Contributor

jpauli replied Feb 6, 2015

I reverted those from our current RC , 5.5.22RC1 and 5.6.6RC1.
See the commit log

Please sign in to comment.