New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bug #75237 (jsonSerialize() - Returning new instance of self causes segfault) #2763

Closed
wants to merge 1 commit into
base: master
from

Conversation

7 participants
@SammyK
Contributor

SammyK commented Sep 20, 2017

This patch fixes bug 75237. I'm pretty sure the garbage collection is correct, but just need to have a second set of eyes on it to make sure I didn't inadvertently break anything.

Also - this just throws a generic Exception. If you think we need to throw a specific exception, let me know and I can update the PR. :)

@jhdxr

This comment has been minimized.

Show comment
Hide comment
@jhdxr

jhdxr Sep 21, 2017

Contributor

I didn't think this PR fix the core problem. this test script still crash.

<?php
class Foo implements JsonSerializable {
  public function jsonSerialize() {
    return new Bar;
  }
}

class Bar implements JsonSerializable {
  public function jsonSerialize() {
    return new Foo;
  }
}

try {
  var_dump(json_encode(new Foo));
} catch (Exception $e) {
  echo $e->getMessage();
}
Contributor

jhdxr commented Sep 21, 2017

I didn't think this PR fix the core problem. this test script still crash.

<?php
class Foo implements JsonSerializable {
  public function jsonSerialize() {
    return new Bar;
  }
}

class Bar implements JsonSerializable {
  public function jsonSerialize() {
    return new Foo;
  }
}

try {
  var_dump(json_encode(new Foo));
} catch (Exception $e) {
  echo $e->getMessage();
}
@bukka

This comment has been minimized.

Show comment
Hide comment
@bukka

bukka Sep 21, 2017

Contributor

This is not a but but just an infinite recursion...

This will limit any recursion which can be easily valid. Just imagine code like this:

<?php

class Foo implements JsonSerializable {
    private $i;

    public function __construct($i) {
        $this->i = $i;
    }
    
    public function jsonSerialize() {
        if ($this->i === 0) {
            return [];
        }
        return [new self($this->i - 1)];
    }
}

$foo = new Foo(2);

echo json_encode($foo);

So nothing to fix in here...

Contributor

bukka commented Sep 21, 2017

This is not a but but just an infinite recursion...

This will limit any recursion which can be easily valid. Just imagine code like this:

<?php

class Foo implements JsonSerializable {
    private $i;

    public function __construct($i) {
        $this->i = $i;
    }
    
    public function jsonSerialize() {
        if ($this->i === 0) {
            return [];
        }
        return [new self($this->i - 1)];
    }
}

$foo = new Foo(2);

echo json_encode($foo);

So nothing to fix in here...

@SammyK

This comment has been minimized.

Show comment
Hide comment
@SammyK

SammyK Sep 21, 2017

Contributor

@jhdxr Nice catch! We'll have to think harder about this.

@bukka I see your point, but a segfault in any case should be fixed with a PHP error message. A possible solution would be to honor the $depth argument in this case.

Any more thoughts on this? :)

Contributor

SammyK commented Sep 21, 2017

@jhdxr Nice catch! We'll have to think harder about this.

@bukka I see your point, but a segfault in any case should be fixed with a PHP error message. A possible solution would be to honor the $depth argument in this case.

Any more thoughts on this? :)

@jhdxr

This comment has been minimized.

Show comment
Hide comment
@jhdxr

jhdxr Sep 21, 2017

Contributor

In fact I'm working on a similar issue: https://bugs.php.net/bug.php?id=74977

previously, my idea is to detect if there is a recursion. but as @bukka pointed out, it also limits some good recursion.

So, how about if we throw a stack overflow exception if we detected the level is too deep?

Contributor

jhdxr commented Sep 21, 2017

In fact I'm working on a similar issue: https://bugs.php.net/bug.php?id=74977

previously, my idea is to detect if there is a recursion. but as @bukka pointed out, it also limits some good recursion.

So, how about if we throw a stack overflow exception if we detected the level is too deep?

@SammyK

This comment has been minimized.

Show comment
Hide comment
@SammyK

SammyK Sep 21, 2017

Contributor

I like that idea. I'm still quite a newbie to internals dev so I'm not quite sure how we'd check to see how deep in the stack we are, but I'm guessing we might be able to access it via EG() somehow? Or I guess we could just add a new var to track how deep we are? :)

Contributor

SammyK commented Sep 21, 2017

I like that idea. I'm still quite a newbie to internals dev so I'm not quite sure how we'd check to see how deep in the stack we are, but I'm guessing we might be able to access it via EG() somehow? Or I guess we could just add a new var to track how deep we are? :)

@SammyK

This comment has been minimized.

Show comment
Hide comment
@SammyK

SammyK Sep 21, 2017

Contributor

I'm not sure if we can harness the same type of error checking when something like this happens:

function foo() {
    foo();
}

foo();

On 3v4l it gives a memory error:

Fatal error: Allowed memory size of %d bytes exhausted (tried to allocate %d bytes) in %s on line %d

On my machine it gives a max function nesting error:

Fatal error: Maximum function nesting level of '%d' reached, aborting! in %s on line %d

Just throwing out some ideas. :)

Contributor

SammyK commented Sep 21, 2017

I'm not sure if we can harness the same type of error checking when something like this happens:

function foo() {
    foo();
}

foo();

On 3v4l it gives a memory error:

Fatal error: Allowed memory size of %d bytes exhausted (tried to allocate %d bytes) in %s on line %d

On my machine it gives a max function nesting error:

Fatal error: Maximum function nesting level of '%d' reached, aborting! in %s on line %d

Just throwing out some ideas. :)

@rmccullagh

This comment has been minimized.

Show comment
Hide comment
@rmccullagh

rmccullagh Sep 21, 2017

Contributor

If there is a defined limit on stack depth, there should be an ini setting that controls it.

Contributor

rmccullagh commented Sep 21, 2017

If there is a defined limit on stack depth, there should be an ini setting that controls it.

@SammyK

This comment has been minimized.

Show comment
Hide comment
@SammyK

SammyK Sep 21, 2017

Contributor

@rmccullagh True that. I'm curious if we should be looking at a stack depth setting or look at the $depth argument in json_encode()? Or perhaps look to $depth first and then fall back to a max stack depth setting in the event someone does something like $depth = PHP_INT_MAX or something?

Contributor

SammyK commented Sep 21, 2017

@rmccullagh True that. I'm curious if we should be looking at a stack depth setting or look at the $depth argument in json_encode()? Or perhaps look to $depth first and then fall back to a max stack depth setting in the event someone does something like $depth = PHP_INT_MAX or something?

@@ -483,6 +483,14 @@ static int php_json_encode_serializable_object(smart_str *buf, zval *val, int op
}
if ((Z_TYPE(retval) == IS_OBJECT) &&
(Z_OBJ(retval) != Z_OBJ_P(val)) &&
ce->name == Z_OBJCE(retval)->name) {

This comment has been minimized.

@rmccullagh

rmccullagh Sep 21, 2017

Contributor

Are you trying to compare strings here? ce->name is a zend_string, so if you want to compare the char values, == will only compare the addresses.

@rmccullagh

rmccullagh Sep 21, 2017

Contributor

Are you trying to compare strings here? ce->name is a zend_string, so if you want to compare the char values, == will only compare the addresses.

This comment has been minimized.

@SammyK

SammyK Sep 24, 2017

Contributor

Nice catch! So if I wanted to compare zend_string's, would I change it to something like this: ZSTR_VAL(ce->name) == ZSTR_VAL(Z_OBJCE(retval)->name)?

@SammyK

SammyK Sep 24, 2017

Contributor

Nice catch! So if I wanted to compare zend_string's, would I change it to something like this: ZSTR_VAL(ce->name) == ZSTR_VAL(Z_OBJCE(retval)->name)?

This comment has been minimized.

@rmccullagh

rmccullagh Sep 24, 2017

Contributor

You could use zend_string_equals, which takes 2 zend_string * arguments. So I believe you could do: zend_string_equals(ce->name, Z_OBJCE(retval)->name) I'm not sure what Z_OBJCE(retval)->name expands to? Is it a zend_string *. or a char *?

@rmccullagh

rmccullagh Sep 24, 2017

Contributor

You could use zend_string_equals, which takes 2 zend_string * arguments. So I believe you could do: zend_string_equals(ce->name, Z_OBJCE(retval)->name) I'm not sure what Z_OBJCE(retval)->name expands to? Is it a zend_string *. or a char *?

This comment has been minimized.

@SammyK

SammyK Sep 25, 2017

Contributor

Oh awesome - thanks so much! :)

@SammyK

SammyK Sep 25, 2017

Contributor

Oh awesome - thanks so much! :)

This comment has been minimized.

@jhdxr

jhdxr Sep 26, 2017

Contributor

@rmccullagh it's a zend_string *, Z_OBJCE returns a _zend_class_entry, and you can find the definition of the structure here. Anyway, I recommend zend_string_equals as well. If there is a wrapper method, then use it.

@jhdxr

jhdxr Sep 26, 2017

Contributor

@rmccullagh it's a zend_string *, Z_OBJCE returns a _zend_class_entry, and you can find the definition of the structure here. Anyway, I recommend zend_string_equals as well. If there is a wrapper method, then use it.

This comment has been minimized.

@sgolemon

sgolemon Oct 10, 2017

Contributor

Already merged, but I just watched your video so here's some thoughts:

  1. If you're just trying to match the class, you can compare the ce only. This will be more correct than comparing names and (nominally) more performant. if ((Z_TYPE_P(retval) == IS_OBJECT) && (Z_OBJCE_P(retval) == ce)) { ... }

  2. The segfault is still easily triggerable via this script. As you mention in your video, there's a much bigger task involved in fixing this properly.

class A implements \JsonSerializable {
  public function jsonSerialize() {
    return new B;
  }
}

class B implements \JsonSerializable {
  public function jsonSerialize() {
    return new A;
  }
}
@sgolemon

sgolemon Oct 10, 2017

Contributor

Already merged, but I just watched your video so here's some thoughts:

  1. If you're just trying to match the class, you can compare the ce only. This will be more correct than comparing names and (nominally) more performant. if ((Z_TYPE_P(retval) == IS_OBJECT) && (Z_OBJCE_P(retval) == ce)) { ... }

  2. The segfault is still easily triggerable via this script. As you mention in your video, there's a much bigger task involved in fixing this properly.

class A implements \JsonSerializable {
  public function jsonSerialize() {
    return new B;
  }
}

class B implements \JsonSerializable {
  public function jsonSerialize() {
    return new A;
  }
}

This comment has been minimized.

@jhdxr

jhdxr Oct 10, 2017

Contributor

@sgolemon this pr has been closed without merge. and the example is exactly what I gave here. BTW, what do you mean by video? I don't see any video mentioned here

@jhdxr

jhdxr Oct 10, 2017

Contributor

@sgolemon this pr has been closed without merge. and the example is exactly what I gave here. BTW, what do you mean by video? I don't see any video mentioned here

This comment has been minimized.

@SammyK

SammyK Oct 10, 2017

Contributor

Thanks @sgolemon! I'll update my blog with your comment. :)

@jhdxr I made a little screencast of how I found & patched this bug. :)

@SammyK

SammyK Oct 10, 2017

Contributor

Thanks @sgolemon! I'll update my blog with your comment. :)

@jhdxr I made a little screencast of how I found & patched this bug. :)

This comment has been minimized.

@sgolemon

sgolemon Oct 10, 2017

Contributor

@jhdxr Oh, heh. I should really read the conversation first. I was only looking at the code review comments and interpreted the "Closed" as merged because we don't really merge via github. :p

@sgolemon

sgolemon Oct 10, 2017

Contributor

@jhdxr Oh, heh. I should really read the conversation first. I was only looking at the code review comments and interpreted the "Closed" as merged because we don't really merge via github. :p

@jhdxr

This comment has been minimized.

Show comment
Hide comment
@jhdxr

jhdxr Sep 21, 2017

Contributor

On my machine it gives a max function nesting error:
Fatal error: Maximum function nesting level of '%d' reached, aborting! in %s on line %d

It's an error generated by xdebug.

I'm curious if we should be looking at a stack depth setting or look at the $depth argument in json_encode()?

IMHO, using $depth of json_encode is enough. It's set to 512 by default, and user cannot bypass this limit.

Contributor

jhdxr commented Sep 21, 2017

On my machine it gives a max function nesting error:
Fatal error: Maximum function nesting level of '%d' reached, aborting! in %s on line %d

It's an error generated by xdebug.

I'm curious if we should be looking at a stack depth setting or look at the $depth argument in json_encode()?

IMHO, using $depth of json_encode is enough. It's set to 512 by default, and user cannot bypass this limit.

@bukka

This comment has been minimized.

Show comment
Hide comment
@bukka

bukka Sep 24, 2017

Contributor

@SammyK Well depth is just for the depth of the final encoded data but recursive call doesn't have to cause extra depth. Consider following code:

<?php

class Foo implements JsonSerializable {
    private $i;

    public function __construct($i) {
        
        $this->i = $i;
    }
    
    public function jsonSerialize() {
        if ($this->i === 0) {
            return 1;
        }
        echo "do something\n";
        return new self($this->i - 1);
    }
}

$foo = new Foo(2);

echo json_encode($foo);

I'm not saying this is the right thing to do but it's a perfectly valid code. In this case the final depth is 0 and you still have recursion...

I think this is a much wider problem and it needs a more generic solution.

Contributor

bukka commented Sep 24, 2017

@SammyK Well depth is just for the depth of the final encoded data but recursive call doesn't have to cause extra depth. Consider following code:

<?php

class Foo implements JsonSerializable {
    private $i;

    public function __construct($i) {
        
        $this->i = $i;
    }
    
    public function jsonSerialize() {
        if ($this->i === 0) {
            return 1;
        }
        echo "do something\n";
        return new self($this->i - 1);
    }
}

$foo = new Foo(2);

echo json_encode($foo);

I'm not saying this is the right thing to do but it's a perfectly valid code. In this case the final depth is 0 and you still have recursion...

I think this is a much wider problem and it needs a more generic solution.

@SammyK

This comment has been minimized.

Show comment
Hide comment
@SammyK

SammyK Sep 24, 2017

Contributor

It's an error generated by xdebug.

Ah, gotcha - thanks. :)

IMHO, using $depth of json_encode is enough. It's set to 512 by default, and user cannot bypass this limit.

The user can change the depth in the third argument of json_encode() by doing something like this:

$json = json_encode($value, 0, PHP_INT_MAX);

So that's the other case I was wanting to protect against. :)

Contributor

SammyK commented Sep 24, 2017

It's an error generated by xdebug.

Ah, gotcha - thanks. :)

IMHO, using $depth of json_encode is enough. It's set to 512 by default, and user cannot bypass this limit.

The user can change the depth in the third argument of json_encode() by doing something like this:

$json = json_encode($value, 0, PHP_INT_MAX);

So that's the other case I was wanting to protect against. :)

@SammyK

This comment has been minimized.

Show comment
Hide comment
@SammyK

SammyK Sep 24, 2017

Contributor

I think this is a much wider problem and it needs a more generic solution.

Gotcha - so you think something like a generic max stack ini setting or something that could apply to all cases that have this issue?

Contributor

SammyK commented Sep 24, 2017

I think this is a much wider problem and it needs a more generic solution.

Gotcha - so you think something like a generic max stack ini setting or something that could apply to all cases that have this issue?

@bukka

This comment has been minimized.

Show comment
Hide comment
@bukka

bukka Sep 24, 2017

Contributor

I think you might want to read this first: #290 :)

Contributor

bukka commented Sep 24, 2017

I think you might want to read this first: #290 :)

@nikic

This comment has been minimized.

Show comment
Hide comment
@nikic

nikic Oct 5, 2017

Member

Closing this per the above discussion.

If someone wants to take a shot at fixing the root problem, my suggestion would be to look into stack guard pages. We may be able to catch segmentation faults caused by stack overflow an display a more user-friendly error message.

Member

nikic commented Oct 5, 2017

Closing this per the above discussion.

If someone wants to take a shot at fixing the root problem, my suggestion would be to look into stack guard pages. We may be able to catch segmentation faults caused by stack overflow an display a more user-friendly error message.

@nikic nikic closed this Oct 5, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment