Segfault in _zval_dtor_func when PHP built with ZTS #155

Closed

pierre-labastie

A bunch of tests in the PHP testsuite fail with the error message "tsrm_ls undefined" if PHP is compiled with ZTS enabled.
The proposed change allows those tests to pass.

@wsfulton
Member

@ojwb I presume this is okay to merge in

@ojwb
Member

ojwb commented Mar 30, 2014

I had already seen this, but wanted to take a look into it rather than just applying it (unless 3.0.1 is imminent). The patch looks OK (aside from the unrelated whitespace change) and should work, but I suspect it's easy to just pass the ZTS context info through from the caller, which should be more efficient than using TSRMLS_FETCH(); to get it.

I'm slightly curious why this wasn't spotted before - ZTS is rarely used except on Windows (which is why we get issues with missing things like this), but we had a ZTS fix not that long ago I think. I guess that person didn't run the testsuite, and so only fixed the issues they saw in their own wrappers.

@pierre-labastie
Author

On 30/03/2014 05:48, Olly Betts wrote:

> I had already seen this, but wanted to take a look into it rather than just
> applying it (unless 3.0.1 is imminent). The patch looks OK (aside from the
> unrelated whitespace change) and should work, but I suspect it's easy to just
> pass the ZTS context info through from the caller, which should be more
> efficient than using TSRMLS_FETCH(); to get it.
>
> I'm slightly curious why this wasn't spotted before - ZTS is rarely used
> except on Windows (which is why we get issues with missing things like this),
> but we had a ZTS fix not that long ago I think. I guess that person didn't run
> the testsuite, and so only fixed the issues they saw in their own wrappers.

Hi,

You may not have seen my original post to swig-devel (sorry if I used the
wrong list). It was when running the testsuite that I spotted those errors.
I am one of the editors of the BLFS project
(http://www.linuxfromscratch.org/blfs).
We used to build PHP with ZTS support, because we build a threaded MPM for
Apache, and used --with-apxs2 in the PHP configure switches (we moved to using
fpm in the development version of the book yesterday; you can see our former
instructions in the stable version). So I ran the SWIG testsuite (several
times) with ZTS enabled. Several tests were failing in the PHP testsuite
with 'tsrm_ls undeclared' (among them "Examples/pointer" and
"Examples/cpointer", which I studied in more detail). Actually, I never worried
too much about those tests failing, until we got the same error when building
graphviz. So I dug into it and found that the t_output_helper function did not
have the TSRMLS_FETCH() instruction, while tsrm_ls is used at least in
FREE_ZVAL. Since the macro TSRMLS_FETCH() is essentially a declaration, I
thought I would put it at the beginning of the function. A few other functions
in phprun.swg have the same layout.

Some info:
gcc-4.8.2
Swig-3.0.0
PHP-5.5.10
Apache-2.4.9 (not really needed, I guess).

Regards
Pierre Labastie

@ojwb
Member

ojwb commented Apr 2, 2014

The issue with TSRMLS_FETCH() is that it's relatively expensive, so when we already have the information it returns, it's much better to instead use TSRMLS_CC, TSRMLS_DC, etc. In a ZTS build, this just results in passing an extra void*** parameter to the function (and in a non-ZTS build, these macros expand to nothing).
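
To make the difference concrete, here is a minimal sketch that can be compiled without the PHP headers. It assumes stand-in macros shaped like the PHP 5 ones: every DEMO_ and demo_ name is hypothetical, and demo_fetch_context() merely plays the role of the lookup that TSRMLS_FETCH() performs. It counts how many lookups happen when a helper fetches the context itself versus when the caller passes it down.

```cpp
#include <cassert>

// Stand-ins modelled on the PHP 5 TSRM macros. Every DEMO_/demo_ name is
// hypothetical; nothing here comes from the PHP headers.
#define DEMO_ZTS 1

static int demo_fetch_count = 0;  // how many context lookups were performed

// Plays the role of the lookup that TSRMLS_FETCH() performs.
static void ***demo_fetch_context() {
  static void **slot = nullptr;
  ++demo_fetch_count;
  return &slot;
}

#if DEMO_ZTS
#define DEMO_TSRMLS_D void ***tsrm_ls
#define DEMO_TSRMLS_DC , DEMO_TSRMLS_D
#define DEMO_TSRMLS_C tsrm_ls
#define DEMO_TSRMLS_CC , DEMO_TSRMLS_C
#define DEMO_TSRMLS_FETCH() void ***tsrm_ls = demo_fetch_context()
#define DEMO_HAVE_CTX (tsrm_ls != nullptr)
#else
// Non-ZTS build: the parameter-passing macros expand to nothing.
#define DEMO_TSRMLS_D void
#define DEMO_TSRMLS_DC
#define DEMO_TSRMLS_C
#define DEMO_TSRMLS_CC
#define DEMO_TSRMLS_FETCH()
#define DEMO_HAVE_CTX true
#endif

// Variant 1: the helper looks the context up itself on every call.
static int demo_helper_fetching(int value) {
  DEMO_TSRMLS_FETCH();
  return DEMO_HAVE_CTX ? value : -1;
}

// Variant 2: the helper receives the context as a trailing parameter.
static int demo_helper_passed(int value DEMO_TSRMLS_DC) {
  return DEMO_HAVE_CTX ? value : -1;
}

// A wrapper that already has the context and simply hands it on.
static int demo_wrapper(int value DEMO_TSRMLS_DC) {
  return demo_helper_passed(value DEMO_TSRMLS_CC) +
         demo_helper_passed(value DEMO_TSRMLS_CC);
}

// Small self-check showing the difference in lookup counts.
static void demo_run() {
  demo_fetch_count = 0;
  DEMO_TSRMLS_FETCH();                     // one lookup at the entry point
  assert(demo_wrapper(2 DEMO_TSRMLS_CC) == 4);
  assert(demo_fetch_count == 1);           // two helper calls, still one lookup
  assert(demo_helper_fetching(3) == 3);
  assert(demo_fetch_count == 2);           // a fetching helper adds a lookup per call
}
```

Calling demo_run() from a main() exercises both variants. Setting DEMO_ZTS to 0 should still compile, which also illustrates why a forgotten macro goes unnoticed in a non-ZTS build.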

I've committed a fix (and also eliminated most of the existing places which call TSRMLS_FETCH()) - please can you retry with the latest git master?

@ojwb
Member

ojwb commented Apr 2, 2014

Here's a thread about why you want to avoid calling TSRMLS_FETCH() unnecessarily:

http://thread.gmane.org/gmane.comp.php.devel/59197

The only remaining uses in SWIG are in the SWIG_GetModule() and SWIG_SetModule() macros - AIUI, these are for user use, and I'm unsure if they are expected to work in any function or only in a wrapped method. It seems they are only expected to be called at most once each per load of the module.

@pierre-labastie
Author

On 02/04/2014 12:43, Olly Betts wrote:

> The issue with TSRMLS_FETCH() is that it's relatively expensive, so
> when we already have the information it returns, it's much better to
> instead use TSRMLS_CC, TSRMLS_DC, etc. In a ZTS build, this just
> results in passing an extra void*** parameter to the function (and in
> a non-ZTS build, these macros expand to nothing).
>
> I've committed a fix (and also eliminated most of the existing places
> which call TSRMLS_FETCH()) - please can you retry with the latest git
> master?

Your fix eliminated all the 'tsrm_ls undeclared' issues (as did what I
proposed, which I must admit was less clever).

There are quite a few other issues, some with the director tests, which
I have not looked at:

checking php testcase director_abstract (with run test)
../../../Examples/Makefile:991: recipe for target 'php_run' failed
make[2]: *** [php_run] Segmentation fault
Makefile:33: recipe for target 'director_abstract.cpptest' failed
make[1]: *** [director_abstract.cpptest] Error 2
checking php testcase director_alternating
checking php testcase director_basic (with run test)
In file included from director_basic_wrap.cxx:1472:0:
director_basic_wrap.h:25:5: error: default argument missing for parameter 4 of ‘SwigDirector_A::SwigDirector_A(zval*, std::complex<int>, double, void***)’
SwigDirector_A(zval *self, std::complex< int > i, double d = 0.0 TSRMLS_DC);
^
director_basic_wrap.h:26:5: error: default argument missing for parameter 4 of ‘SwigDirector_A::SwigDirector_A(zval*, int, bool, void***)’
SwigDirector_A(zval *self, int i, bool j = false TSRMLS_DC);
^
director_basic_wrap.h:34:5: error: default argument missing for parameter 3 of ‘SwigDirector_MyClass::SwigDirector_MyClass(zval*, int, void***)’
SwigDirector_MyClass(zval *self, int a = 0 TSRMLS_DC);
^
director_basic_wrap.cxx: In function ‘void _wrap_new_A__SWIG_1(int, zval*, zval**, zval*, int, void***)’:
director_basic_wrap.cxx:1925:58: error: no matching function for call to ‘SwigDirector_A::SwigDirector_A(zval*&, std::complex<int>&, void***&)’
result = (A *)new SwigDirector_A(arg0, arg1 TSRMLS_CC);
^
director_basic_wrap.cxx:1925:58: note: candidates are:
director_basic_wrap.cxx:1535:1: note: SwigDirector_A::SwigDirector_A(zval*, int, bool, void***)
SwigDirector_A::SwigDirector_A(zval *self, int i, bool j TSRMLS_DC): A(i, j), Swig::Director(self TSRMLS_CC) {
^
director_basic_wrap.cxx:1535:1: note: no known conversion for argument 2 from ‘std::complex<int>’ to ‘int’
director_basic_wrap.cxx:1529:1: note: SwigDirector_A::SwigDirector_A(zval*, std::complex<int>, double, void***)
SwigDirector_A::SwigDirector_A(zval *self, std::complex< int > i, double d TSRMLS_DC): A(i, d), Swig::Director(self TSRMLS_CC) {
^
director_basic_wrap.cxx:1529:1: note: no known conversion for argument 3 from ‘void***’ to ‘double’
In file included from director_basic_wrap.cxx:1472:0:
director_basic_wrap.h:22:8: note: SwigDirector_A::SwigDirector_A(const SwigDirector_A&)
struct SwigDirector_A : public A, public Swig::Director {
^
director_basic_wrap.h:22:8: note: candidate expects 1 argument, 3 provided
director_basic_wrap.cxx: In function ‘void _wrap_new_MyClass__SWIG_1(int, zval*, zval**, zval*, int, void***)’:
director_basic_wrap.cxx:2788:64: error: invalid conversion from ‘void***’ to ‘int’ [-fpermissive]
result = (MyClass *)new SwigDirector_MyClass(arg0 TSRMLS_CC);
^
director_basic_wrap.cxx:1606:1: error: initializing argument 2 of ‘SwigDirector_MyClass::SwigDirector_MyClass(zval*, int, void***)’ [-fpermissive]
SwigDirector_MyClass::SwigDirector_MyClass(zval *self, int a TSRMLS_DC): MyClass(a), Swig::Director(self TSRMLS_CC) {
../../../Examples/Makefile:982: recipe for target 'php_cpp' failed
make[2]: *** [php_cpp] Error 1
Makefile:33: recipe for target 'director_basic.cpptest' failed
make[1]: *** [director_basic.cpptest] Error 2

And similar errors for director_classic, director_enum

Then there are segfaults for several tests. I have studied the
Examples/variables test in some detail. Amazingly, even if you comment out
all the PHP code in runme.php except the first 'require "example.php"' line,
and you comment out the various class definitions in example.php (leaving
only the module loading part), you still get the segmentation fault. It
occurs during the php_request_shutdown part. Actually, it seems that it is
the char *strvar variable which causes the issue.
More to come...

Pierre

@ojwb
Member

ojwb commented Apr 2, 2014

At least part of that is down to just adding the TSRMLS_DC parameter to the end of the parameter list, which doesn't work when a constructor of a director class has default parameters - working on a fix for that.
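
For illustration, the C++ rule behind the "default argument missing" errors is that once a parameter has a default, every later parameter needs one too. The sketch below shows one possible way round that, replacing the default with an overload so the context can stay last. It is only an assumption about how such a constructor could be shaped, not necessarily what the committed fix does, and DemoCtx and DemoDefaultsDirector are hypothetical names.

```cpp
#include <cassert>

struct DemoCtx {};  // hypothetical stand-in for the ZTS context pointer type

// Ill-formed shape, as in the generated header:
//   DemoDefaultsDirector(void *self, int a = 0, DemoCtx *ctx);
// Once a parameter has a default, all later parameters need one too.

struct DemoDefaultsDirector {
  void *self;
  int a;
  DemoCtx *ctx;

  // Full form: no defaults, context last.
  DemoDefaultsDirector(void *self_, int a_, DemoCtx *ctx_)
      : self(self_), a(a_), ctx(ctx_) {}

  // Overload standing in for the "a = 0" default argument.
  DemoDefaultsDirector(void *self_, DemoCtx *ctx_)
      : self(self_), a(0), ctx(ctx_) {}
};

// Small self-check: both call shapes resolve and carry the context.
static void demo_run_defaults() {
  int object = 0;
  DemoCtx ctx;
  DemoDefaultsDirector with_default(&object, &ctx);
  DemoDefaultsDirector explicit_arg(&object, 7, &ctx);
  assert(with_default.a == 0 && with_default.ctx == &ctx);
  assert(explicit_arg.a == 7 && explicit_arg.ctx == &ctx);
}
```

The two-argument call mirrors the shape of the failing 'new SwigDirector_MyClass(arg0 TSRMLS_CC)' call, which passes only the object and the context.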

@ojwb
Member

ojwb commented Apr 2, 2014

OK, I've pushed a fix for that, which should fix many if not all of the compilation errors above - can you please retest?

I see a segfault for director_basic on Debian unstable with a non-ZTS build, and have for some time but haven't managed to get to the bottom of it. It only seems to happen with PHP >= 5.4. With the fix above, can you let me know which tests are segfaulting?

@ojwb added the bug label on Apr 2, 2014
@pierre-labastie
Author

On 02/04/2014 23:43, Olly Betts wrote:

> OK, I've pushed a fix for that, which should fix many if not all of
> the compilation errors above - can you please retest?
>
> I see a segfault for director_basic on Debian unstable with a non-ZTS
> build, and have for some time but haven't managed to get to the bottom
> of it. It only seems to happen with PHP >= 5.4. With the fix above,
> can you let me know which tests are segfaulting?

Yes, the compilation errors above are fixed. Still we have segmentation
faults in:

  • Examples/extend
  • Examples/funcptr
  • Examples/variables (more about this one below)
  • test-suite/autodoc
  • test-suite/char_binary
  • test-suite/char_strings
  • test-suite/director_abstract
  • test-suite/director_basic
  • test-suite/director_classic
  • test-suite/director_enum
  • test-suite/director_exception
  • test-suite/director_finalizer
  • test-suite/director_nested
  • test-suite/director_protected
  • test-suite/funcptr_cpp
  • test-suite/grouping
  • test-suite/template_default
  • test-suite/typedef_funcptr
  • test-suite/director_string

The "test-suite/director_thread" has a more explicit error (and ends up
segfaulting too):

checking php testcase director_thread (with run test)
[Thu Apr  3 10:00:14 2014]  Script: '/sources/swig/swig-git/Examples/test-suite/php/director_thread_runme.php'
---------------------------------------
/sources/php/php-5.5.10/Zend/zend_hash.c(563) : Block 0x7f02c85f50d8 status:
Invalid pointer: ((thread_id=0xC8E34700) != (expected=0xCC552700))
---------------------------------------
[Thu Apr  3 10:00:14 2014]  Script: '/sources/swig/swig-git/Examples/test-suite/php/director_thread_runme.php'
---------------------------------------
/sources/php/php-5.5.10/Zend/zend_hash.c(565) : Block 0x7f02c85f5480 status:
Invalid pointer: ((thread_id=0xC8E34700) != (expected=0xCC552700))
---------------------------------------
[Thu Apr  3 10:00:14 2014]  Script: '/sources/swig/swig-git/Examples/test-suite/php/director_thread_runme.php'
---------------------------------------
/sources/php/php-5.5.10/Zend/zend_hash.c(568) : Block 0x7f02c85f53e0 status:
Invalid pointer: ((thread_id=0xC8E34700) != (expected=0xCC552700))
---------------------------------------
[Thu Apr  3 10:00:14 2014]  Script: '/sources/swig/swig-git/Examples/test-suite/php/director_thread_runme.php'
---------------------------------------
/sources/php/php-5.5.10/Zend/zend_objects.c(41) : Block 0x7f02c85f5338 status:
Invalid pointer: ((thread_id=0xC8E34700) != (expected=0xCC552700))
---------------------------------------
[Thu Apr  3 10:00:14 2014]  Script: '/sources/swig/swig-git/Examples/test-suite/php/director_thread_runme.php'
---------------------------------------
/sources/php/php-5.5.10/Zend/zend_opcode.c(361) : Block 0x7f02c85f5728 status:
Invalid pointer: ((thread_id=0xC8E34700) != (expected=0xCC552700))
---------------------------------------
[Thu Apr  3 10:00:14 2014]  Script: '/sources/swig/swig-git/Examples/test-suite/php/director_thread_runme.php'
---------------------------------------
/sources/php/php-5.5.10/Zend/zend_opcode.c(361) : Block 0x7f02c85f5608 status:
Invalid pointer: ((thread_id=0xC8E34700) != (expected=0xCC552700))
---------------------------------------
[Thu Apr  3 10:00:14 2014]  Script: '/sources/swig/swig-git/Examples/test-suite/php/director_thread_runme.php'
---------------------------------------
/sources/php/php-5.5.10/Zend/zend_opcode.c(361) : Block 0x7f02c85f52b8 status:
Invalid pointer: ((thread_id=0xC8E34700) != (expected=0xCC552700))
---------------------------------------
../../../Examples/Makefile:991: recipe for target 'php_run' failed

Coming to the variables test. If I suppress the 'char *strvar' variable,
it works. OTOH, if I just create in some test directory:

/* File example.c */
#include "example.h"
char *somevar=0;
-------
/* File example.h */
extern char *somevar;
-------
/* File example.i */
%module example
%{
#include "example.h"
%}
extern char *somevar;
-------
/* File runme.php */
<?php
require "example.php";
?>

and I run:

 > swig -php example.i
 > gcc -c -I /usr/include/php -I /usr/include/php/main -I /usr/include/php/Zend -I /usr/include/php/TSRM -fpic -g example.c
 > gcc -c -I /usr/include/php -I /usr/include/php/main -I /usr/include/php/Zend -I /usr/include/php/TSRM -fpic -g example_wrap.c
 > ld -shared example.o example_wrap.o -o example.so
 > php -n -d extension_dir=. runme.php

I get the segfault. Here is the gdb backtrace (note that PHP was compiled
with --enable-debug). You can see that the segfault happens when
applying a dtor. I do not know much about the internals of PHP,
so I cannot say more:

#0  0x00000000008af5e4 in _zval_dtor_func (zvalue=0x7ffff7eefac8,
    __zend_filename=0xdebb58 "/sources/php/php-5.5.10/Zend/zend_execute.h",
    __zend_lineno=81) at /sources/php/php-5.5.10/Zend/zend_variables.c:35
#1  0x0000000000899c83 in _zval_dtor (zvalue=0x7ffff7eefac8,
    __zend_filename=0xdebb58 "/sources/php/php-5.5.10/Zend/zend_execute.h",
    __zend_lineno=81) at /sources/php/php-5.5.10/Zend/zend_variables.h:35
#2  0x0000000000899d7d in i_zval_ptr_dtor (zval_ptr=0x7ffff7eefac8,
    __zend_filename=0xdedac0 "/sources/php/php-5.5.10/Zend/zend_variables.c",
    __zend_lineno=182) at /sources/php/php-5.5.10/Zend/zend_execute.h:81
#3  0x000000000089bf25 in _zval_ptr_dtor (zval_ptr=0x7ffff7eefb60,
    __zend_filename=0xdedac0 "/sources/php/php-5.5.10/Zend/zend_variables.c",
    __zend_lineno=182) at /sources/php/php-5.5.10/Zend/zend_execute_API.c:426
#4  0x00000000008afb29 in _zval_ptr_dtor_wrapper (zval_ptr=0x7ffff7eefb60)
    at /sources/php/php-5.5.10/Zend/zend_variables.c:182
#5  0x00000000008c8582 in zend_hash_apply_deleter (ht=0x1132298,
    p=0x7ffff7eefb48) at /sources/php/php-5.5.10/Zend/zend_hash.c:650
#6  0x00000000008c871d in zend_hash_graceful_reverse_destroy (ht=0x1132298)
    at /sources/php/php-5.5.10/Zend/zend_hash.c:687
#7  0x000000000089af8c in shutdown_executor (tsrm_ls=0x112ea80)
    at /sources/php/php-5.5.10/Zend/zend_execute_API.c:247
#8  0x00000000008b2de7 in zend_deactivate (tsrm_ls=0x112ea80)
    at /sources/php/php-5.5.10/Zend/zend.c:935
#9  0x00000000007fa3f0 in php_request_shutdown (dummy=0x0)
    at /sources/php/php-5.5.10/main/main.c:1808
#10 0x00000000009751e5 in do_cli (argc=5, argv=0x112e920, tsrm_ls=0x112ea80)
    at /sources/php/php-5.5.10/sapi/cli/php_cli.c:1177
#11 0x00000000009759ba in main (argc=5, argv=0x112e920)
    at /sources/php/php-5.5.10/sapi/cli/php_cli.c:1378

Regards
Pierre

@ojwb
Member

ojwb commented Apr 3, 2014

I'm dubious that director_thread is necessarily valid for all supported languages, as it assumes it's OK to call back into the interpreter from a different thread. It looks like it was originally Python only, where I assume this is OK, but I suspect those errors are because this isn't supported by PHP.

I don't know enough about PHP internals to know what's going on with the segfault in _zval_dtor_func either, I'm afraid.

Thanks for the reduced testcase though.

@ojwb changed the title from "Define tsrm_ls before using it" to "Segfault in _zval_dtor_func" on Apr 3, 2014
@wsfulton
Member

Bump... Is anyone going to fix this patch?

@ojwb
Member

ojwb commented May 18, 2014

I can disable director_thread for PHP if you want. Or make it Python-only again - it seems dubious to expect it to work in general.

The other issue only seems to affect a ZTS build of PHP, which is only really used on Windows. I don't use Windows, so it's not something I can easily investigate. It might be related to the segfault we see with recent PHP (I'd expect someone would have reported this before if it had been broken since director support was added).

@wsfulton
Member

The director_thread test looks generic enough to me for all languages. Perhaps only run the test if you are sure threading works (such as when ZTS is not enabled). From what I can make out here though, director_thread isn't the main problem; it is the segfaults in the 20-odd testcases that @pierre-labastie mentioned and the 'char *' variables.

BTW, I'd like to change this from a patch to an issue, but I can't see how to do this in Github :(

@ojwb changed the title from "Segfault in _zval_dtor_func" to "Segfault in _zval_dtor_func when PHP built with ZTS" on Jun 2, 2014
@ojwb
Member

ojwb commented Oct 16, 2014

@pierre-labastie Do you still see these segfaults with SWIG trunk? There was a fix for segfaults with directors in PHP >= 5.4 in 0dd7b61, and you reported using 5.5 in an earlier comment, so I wonder if that fix has also solved the remainder of this issue.

@ojwb
Member

ojwb commented Oct 16, 2014

BTW, you should only need to apply the single character change to Lib/php/phprun.swg from that commit to test, if that's simpler than building the latest version from git.

@pierre-labastie
Author

On 16/10/2014 02:08, Olly Betts wrote:

> @pierre-labastie Do you still see these segfaults with SWIG trunk? There
> was a fix for segfaults with directors in PHP >= 5.4 in 0dd7b61, and you
> reported using 5.5 in an earlier comment, so I wonder if that fix has also
> solved the remainder of this issue.

I have not tested that for a while, since we (at BLFS,
http://www.linuxfromscratch.org/blfs/) now build PHP without ZTS. I'll have
a look at that, but expect some delay (a week or so).

Pierre

@ojwb
Member

ojwb commented Jan 9, 2015

@pierre-labastie Did you get a chance to re-test?

@pierre-labastie
Author

On 09/01/2015 01:50, Olly Betts wrote:

> @pierre-labastie Did you get a chance to re-test?

No, sorry. Actually, I am currently having problems with other parts of
SWIG (CLISP tests, Ruby tests, and Go tests). You may want to
see http://wiki.linuxfromscratch.org/blfs/ticket/5992 (see comments 4 to 7).
The Ruby part comes from the use of an obsolete variable (Config instead
of RbConfig) in configure, and is new with Ruby 2.2 (up to Ruby 2.1,
the use of Config was flagged as deprecated, but still possible). For CLISP,
it seems that swig does not accept "class" or "union", only "struct".
For Go, I have not had a chance to retest.

Pierre

@ojwb
Member

ojwb commented Jan 11, 2015

It would be better to report those as separate issues, especially as others maintain those parts of SWIG.

@pierre-labastie
Author

On 11/01/2015 12:20, Olly Betts wrote:

> It would be better to report those as separate issues, especially as others
> maintain those parts of SWIG.

Understood.

I have reported to swig-devel list. Do you think there is a better place?

Pierre

@ojwb
Member

ojwb commented Jan 11, 2015

Opening a ticket is better for reporting a bug.

@pierre-labastie
Author

On 09/01/2015 01:50, Olly Betts wrote:

> @pierre-labastie Did you get a chance to re-test?

While I was working with SWIG-3.0.3, I recompiled PHP with ZTS support. The PHP
version is 5.6.4, the gcc/g++ version is 4.9.2, and the binutils version is 2.24.

Here is what I get:
+++++++++++++++++
$ make check-php-examples
[...]
checking Examples/php/extend
/bin/sh: line 1: 26317 Segmentation fault php -n -q -d extension_dir=. -d
safe_mode=Off runme.php > /dev/null
../../Makefile:1132: recipe for target 'php_run' failed
make[3]: *** [php_run] Error 139
Makefile:10: recipe for target 'check' failed
make[2]: *** [check] Error 2
Makefile:246: recipe for target 'extend.actionexample' failed
make[1]: *** [extend.actionexample] Error
[..all other tests pass..]
+++++++++++++++++
$ make check-php-test-suite
[...]
checking php testcase director_abstract (with run test)
../../../Examples/Makefile:1132: recipe for target 'php_run' failed
make[2]: *** [php_run] Segmentation fault
[...]
checking php testcase director_basic (with run test)
../../../Examples/Makefile:1132: recipe for target 'php_run' failed
make[2]: *** [php_run] Segmentation fault
[...]
checking php testcase director_classic (with run test)
../../../Examples/Makefile:1132: recipe for target 'php_run' failed
make[2]: *** [php_run] Segmentation fault
[...]
checking php testcase director_enum (with run test)
../../../Examples/Makefile:1132: recipe for target 'php_run' failed
make[2]: *** [php_run] Segmentation fault
checking php testcase director_exception (with run test)
../../../Examples/Makefile:1132: recipe for target 'php_run' failed
make[2]: *** [php_run] Segmentation fault
[...]
checking php testcase director_finalizer (with run test)
../../../Examples/Makefile:1132: recipe for target 'php_run' failed
make[2]: *** [php_run] Segmentation fault
[...]
checking php testcase director_nested (with run test)
../../../Examples/Makefile:1132: recipe for target 'php_run' failed
make[2]: *** [php_run] Segmentation fault
[...]
checking php testcase director_protected (with run test)
../../../Examples/Makefile:1132: recipe for target 'php_run' failed
make[2]: *** [php_run] Segmentation fault
[...]
checking php testcase director_thread (with run test)
../../../Examples/Makefile:1132: recipe for target 'php_run' failed
make[2]: *** [php_run] Segmentation fault
[...]
checking php testcase director_string (with run test)
../../../Examples/Makefile:1132: recipe for target 'php_run' failed
make[2]: *** [php_run] Segmentation fault
[..all the other tests pass..]
++++++++++++++++

That is, only the "extend" examples, and a few "director" tests in the test
suite fail. A lot of "director" tests pass and all the other tests pass.

Pierre

@ojwb
Member

ojwb commented Jan 11, 2015

Thanks for retesting.

Comparing with the list of failing tests from above, these are now passing (so some progress):

  • Examples/funcptr
  • Examples/variables
  • test-suite/autodoc
  • test-suite/char_binary
  • test-suite/char_strings
  • test-suite/funcptr_cpp
  • test-suite/grouping
  • test-suite/template_default
  • test-suite/typedef_funcptr

Leaving the following failures:

  • Examples/extend
  • test-suite/director_abstract
  • test-suite/director_basic
  • test-suite/director_classic
  • test-suite/director_enum
  • test-suite/director_exception
  • test-suite/director_finalizer
  • test-suite/director_nested
  • test-suite/director_protected
  • test-suite/director_thread
  • test-suite/director_string

I notice that all the tests which still fail make use of subclassing via directors (though some passing tests do too).

Rereading the discussion, it seems the originally proposed patch made all the tests pass, or is that no longer true with newer versions of PHP?

While avoiding calling TSRMLS_FETCH() is desirable, avoiding segfaults in generated wrapper code clearly trumps that. I don't have any good ideas as to why the current code isn't working. Perhaps the value we store in the swig_zts_ctx member of class Director doesn't remain valid, but that seems very surprising as it belongs to the current thread. And at least the overhead is only incurred when using a ZTS build of PHP.

@pierre-labastie
Author

On 11/01/2015 22:04, Olly Betts wrote:

> Thanks for retesting.
>
> [...]
> I notice that all the tests which still fail make use of subclassing via
> directors (though some passing tests do too).
>
> Rereading the discussion, it seems the originally proposed patch made all the
> test pass, or is that no longer true with newer versions of PHP?

Sorry, I only meant that tests failing with "tsrm_ls undefined" passed with
the patch I proposed. There were also segfaulting tests, and the patch did not
address that.

What I have been able to find out with gdb is that the segfault occurs
because some code assumes that the pointer "this" is defined, while it is
actually NULL. Why that is so, I do not know. Clearly, the constructor has been
called before that part of the code is executed, but somehow "this" is lost. I
have not been able to follow the code (the call stack contains tens of
functions).

Pierre

@ojwb
Member

ojwb commented Jan 12, 2015

I think I've found the problem - the upcall check is currently being performed on a NULL object. We get away with this in a non-ZTS build, but in a ZTS build we try to access the swig_zts_ctx member and the NULL this causes the segfault.
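
As a rough sketch of why the member access is what matters: a check that reads a data member needs a valid this, so it cannot safely be invoked through a NULL pointer. The names below (DemoUpcallDirector, owner, demo_upcall_check) are hypothetical, only swig_zts_ctx is taken from the discussion, and the null guard is there just to keep the demo well-defined; it is not a description of what commit 682b4dd does.

```cpp
#include <cassert>

struct DemoUpcallDirector {
  void *swig_zts_ctx;  // member name from the discussion; the type is a placeholder
  const void *owner;   // hypothetical: the object this director wraps

  DemoUpcallDirector(void *ctx, const void *owner_)
      : swig_zts_ctx(ctx), owner(owner_) {}

  // Reads data members, so it needs a valid `this`.
  bool is_upcall(const void *caller) const {
    return swig_zts_ctx != nullptr && caller == owner;
  }
};

// Only consult the director when there actually is one.
static bool demo_upcall_check(const DemoUpcallDirector *director,
                              const void *caller) {
  return director != nullptr && director->is_upcall(caller);
}

// Small self-check covering a match, a mismatch and the missing-director case.
static void demo_run_upcall() {
  int ctx = 0, object = 0, other = 0;
  DemoUpcallDirector director(&ctx, &object);
  assert(demo_upcall_check(&director, &object));
  assert(!demo_upcall_check(&director, &other));
  assert(!demo_upcall_check(nullptr, &object));
}
```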

Can you test with SWIG commit 682b4dd?

@wsfulton
Member

682b4dd breaks the PHP Travis test case called 'director_finalizer' - https://travis-ci.org/swig/swig/jobs/46731933
I would like to release SWIG this Wed, @ojwb can you either revert the patch or fix it before then? Thanks.

@ojwb
Member

ojwb commented Jan 12, 2015

I added some tracing code and this isn't a bug in my change, but a bug uncovered by my change.

new SwigDirector_Foo -> 0x147f7e0
_wrap_Foo_orStatus : arg1 = 0x147f7e0
_wrap_Foo_orStatus : director = 0x147f7e8
delete Foo 0x147f7e0
new SwigDirector_Foo -> 0x147f830
_wrap_Foo_orStatus : arg1 = 0x147f830
_wrap_Foo_orStatus : director = 0x147f838
delete Foo 0x147f830
new SwigDirector_Foo -> 0x147f7e0
delete Foo 0x147f7e0
_wrap_Foo_orStatus : arg1 = 0x147f7e0
Segmentation fault

So that last SwigDirector_Foo object is deleted before the orStatus method gets called. Previously we'd call a method on a NULL pointer here (which I'm fairly sure is undefined behaviour, but we probably mostly get away with). Now we use the correct this pointer for the object and so actually notice that the object is no longer valid.
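
Tracing of this kind can be sketched roughly as follows. This is not the tracing code actually used, just a minimal illustration with hypothetical names (DemoTracedFoo, demo_live, demo_is_live): constructors and destructors log the object address and maintain a set of live addresses, so a wrapper could check whether the object it was handed has already been deleted.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <set>

// Addresses of objects that have been constructed and not yet destroyed.
static std::set<std::uintptr_t> demo_live;

struct DemoTracedFoo {
  DemoTracedFoo() {
    demo_live.insert(reinterpret_cast<std::uintptr_t>(this));
    std::fprintf(stderr, "new DemoTracedFoo -> %p\n", static_cast<void *>(this));
  }
  ~DemoTracedFoo() {
    demo_live.erase(reinterpret_cast<std::uintptr_t>(this));
    std::fprintf(stderr, "delete DemoTracedFoo %p\n", static_cast<void *>(this));
  }
};

// True if an object at this address is currently alive.
static bool demo_is_live(std::uintptr_t address) {
  return demo_live.count(address) != 0;
}

// Small self-check: the address is live until the object is deleted.
static void demo_run_trace() {
  DemoTracedFoo *foo = new DemoTracedFoo;
  const std::uintptr_t address = reinterpret_cast<std::uintptr_t>(foo);
  assert(demo_is_live(address));
  delete foo;
  assert(!demo_is_live(address));  // a later method call on it would be invalid
}
```

Addresses are stored as integers so that the check after delete never touches the dead pointer itself. As the trace shows, the allocator may hand the same address out again, so a registry like this only tells you whether some object currently lives there.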

I'll see if I can work out what's up. Reverting seems a very unappealing option.

@pierre-labastie
Author

On 12/01/2015 01:57, Olly Betts wrote:

> I think I've found the problem - the upcall check is currently being
> performed on a NULL object. We get away with this in a non-ZTS
> build, but in a ZTS build we try to access the swig_zts_ctx member
> and the NULL this causes the segfault.
>
> Can you test with SWIG commit 682b4dd?

Done. I am sorry to say that there are still failures, and that there are
regressions with ZTS not enabled:

Without ZTS:

checking php testcase default_args
Parse error: syntax error, unexpected 'LL' (T_STRING), expecting ')' in
/sources/swig/Examples/test-suite/php/default_args.php on line 50
../../../Examples/Makefile:1132: recipe for target 'php_run' failed
make[2]: *** [php_run] Error 255

checking php testcase director_finalizer (with run test)
../../../Examples/Makefile:1132: recipe for target 'php_run' failed
make[2]: *** [php_run] Segmentation fault

With ZTS:
$ make check-php-examples
[...]
checking Examples/php/extend
Segmentation fault
../../Makefile:1132: recipe for target 'php_run' failed
make[3]: *** [php_run] Error 139

$ make check-php-test-suite
[...]
checking php testcase default_args
Parse error: syntax error, unexpected 'LL' (T_STRING), expecting ')' in
/sources/swig/Examples/test-suite/php/default_args.php on line 50
../../../Examples/Makefile:1132: recipe for target 'php_run' failed
make[2]: *** [php_run] Error 255
[...]
checking php testcase director_basic (with run test)
../../../Examples/Makefile:1132: recipe for target 'php_run' failed
make[2]: *** [php_run] Segmentation fault

[...]
checking php testcase director_classic (with run test)
../../../Examples/Makefile:1132: recipe for target 'php_run' failed
make[2]: *** [php_run] Segmentation fault
[...]
checking php testcase director_enum (with run test)
../../../Examples/Makefile:1132: recipe for target 'php_run' failed
make[2]: *** [php_run] Segmentation fault
[...]
checking php testcase director_finalizer (with run test)
../../../Examples/Makefile:1132: recipe for target 'php_run' failed
make[2]: *** [php_run] Segmentation fault
[...]
checking php testcase director_protected (with run test)
../../../Examples/Makefile:1132: recipe for target 'php_run' failed
make[2]: *** [php_run] Segmentation fault
[...]
checking php testcase director_thread (with run test)
../../../Examples/Makefile:1132: recipe for target 'php_run' failed
make[2]: *** [php_run] Segmentation fault

The good news is that director_abstract, director_nested and director_string now pass.

Pierre

@pierre-labastie
Author

The regression with testcase default_args has been introduced by:
commit 34787ab
Author: Vadim Zeitlin vz-swig@zeitlins.org
Date: Mon Jan 5 02:50:24 2015 +0100

Python default argument test cases from issue #294

@wsfulton
Member

@pierre-labastie Can you check you have a stable build as my local testing and Travis tests at https://travis-ci.org/swig/swig/jobs/46824614 don't show the extend example segfault, nor the director_xxxx tests. Travis does show the default_args breakage (which is down to additional tests for issue #294) and so isn't a regression, but I'd like to see it fixed so the test-suite passes for the release due out now. The most concerning thing is the segfault in director_finalizer. @ojwb, could you look at that please and I'll figure out a fix for the default_args testcase.

@wsfulton
Member

Mmm, it seems the new default arg tests aimed at testing Python default arg handling have highlighted shortcomings in the PHP default arg handling. There are quite a few problems; I'm wondering if the approach taken for Python could be re-used for PHP. @ojwb, I hope you don't mind, but I'm going to leave this for you and disable the test for PHP.

@pierre-labastie
Author

@wsfulton The extend example segfault, and the director_xxxx tests segfaults all occur when using PHP built with ZTS. I do not know whether this may be called a "stable" build, but this is how this thread started. The reasons why we used ZTS are in my above comment of March 30, 2014. Some comments above seem to imply that the ZTS is not the main target, but @ojwb asked to retest, and this is what I am doing now.

@ojwb
Member

ojwb commented Jan 13, 2015

SWIG ought to work with PHP built with ZTS enabled, but generally ZTS is only enabled on Windows which isn't a platform I can test on, and we have had the occasional regression, such as this.

The PHP API has these macros which expand to pass the ZTS pointer around in a ZTS build, and expand to nothing otherwise, so if the use of these macros in a patch is wrong, you don't spot it until you build for ZTS.
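
A rough sketch of why such a mistake stays hidden, assuming made-up stand-in macros: the `DEMO_` names, `helper`, `good_caller` and `demo_entry` are all hypothetical, and the macros only mirror the general shape of PHP 5's `TSRMLS_DC`/`TSRMLS_CC` rather than reproducing the real definitions.

```cpp
#include <cassert>

// Hypothetical stand-ins for the PHP 5 ZTS macros (not the real definitions).
// Compile with -DDEMO_ZTS to simulate a thread-safe build.
#ifdef DEMO_ZTS
#  define DEMO_TSRMLS_DC , void ***tsrm_ls   // extra parameter in declarations
#  define DEMO_TSRMLS_CC , tsrm_ls           // extra argument at call sites
#else
#  define DEMO_TSRMLS_DC                     // expands to nothing
#  define DEMO_TSRMLS_CC                     // expands to nothing
#endif

// Declares the context parameter, so the body may refer to tsrm_ls.
static int helper(int x DEMO_TSRMLS_DC) {
#ifdef DEMO_ZTS
    (void)tsrm_ls;  // a real function would hand this on to PHP API calls
#endif
    return x + 1;
}

// Correct: receives the context and forwards it.
static int good_caller(int x DEMO_TSRMLS_DC) {
    return helper(x DEMO_TSRMLS_CC);
}

// Buggy pattern, kept in a comment so the sketch builds in both modes:
//   static int bad_caller(int x) { return helper(x DEMO_TSRMLS_CC); }
// Without DEMO_ZTS both macros vanish and this compiles cleanly.
// With -DDEMO_ZTS the call site mentions tsrm_ls, which was never declared.

int demo_entry(int x) {
#ifdef DEMO_ZTS
    void ***tsrm_ls = nullptr;  // placeholder; a real build gets this from PHP
#endif
    int result = good_caller(x DEMO_TSRMLS_CC);
    assert(result == x + 1);
    return result;
}
```

Calling `demo_entry(41)` returns 42 in either mode; only the commented-out `bad_caller` behaves differently, and only once `DEMO_ZTS` is defined.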

@ojwb
Member

ojwb commented Jan 13, 2015

@wsfulton I've fixed the default_args issue, so no need to disable that testcase.

@ojwb
Member

ojwb commented Jan 13, 2015

@wsfulton And I'm already looking at the director_finalizer testcase. I'd like to better understand what's going on before we decide to just revert it, as I think we've uncovered a deeper problem from what I've seen so far.

@wsfulton
Member

@pierre-labastie Sorry, I read your post too quickly. When ZTS is not enabled, you get the same results as Travis.

Is it worth finding or building an Ubuntu PPA with ZTS enabled and using this in the future on Travis?

@ojwb Thanks for the default fix.

@ojwb I'm more interested in releasing 3.0.4 asap without regressions than fixing bugs. If you can't fix in the next 18 hours or so, then I suggest we revert for the release and then you can apply a fixed patch at your leisure for 3.0.5.

@ojwb
Member

ojwb commented Jan 13, 2015

If you can find a ZTS build for automatically testing with, that'd be very handy. I've contemplated building a custom version to test with, but PHP is a bit of a beast to build, and getting a version on all the machines I work on SWIG on isn't an appealing task.

The issue with reverting is really what we revert to that's actually a better state than where we are now. It's not as simple as that 682b4dd regressed us and pre-682b4dd8 is unregressed - there are different problems at different points in the increasingly epic saga that is this ticket.

Just reverting 682b4dd leaves us invoking undefined behaviour every time we call a director method. It seems in a non-ZTS build we may get away with this, but that's at best brittle, and could be causing problems that just haven't been reported yet. We could revert to an earlier state, for example before any of the fixes related to this ticket, since the ZTS build is already broken with 682b4dd reverted anyway.

I'd really like to understand what's going on to inform the decision about what is most sensible to do for 3.0.5, as the fix could easily be a change as small as the revert, and it would be a shame to release 3.0.4 in 18 hours and in 19 hours time to discover this is a serious problem that needs us to release 3.0.5 next week.

@ojwb
Member

ojwb commented Jan 14, 2015

OK, 0dd685b fixes the crash in director_finalizer in a simple way (just make that method static, since it only needs the object context for the ZTS pointer, and we can just pass that in from the caller).
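
As a minimal sketch of that kind of change (all names here are hypothetical and not taken from the commit; `DemoContext` merely stands in for the ZTS pointer):

```cpp
#include <cassert>

// Hypothetical stand-in for the per-thread context a ZTS build passes around.
struct DemoContext {
    int lookups = 0;
};

class DemoDirector {
public:
    // Before (sketch): a non-static member, reachable only through an object.
    //   bool is_overridden(const char *name);
    // Calling that through a null or already deleted object is undefined.
    //
    // After: static, so no object is involved; the caller supplies the context.
    static bool is_overridden(const char *name, DemoContext *ctx) {
        assert(ctx != nullptr);
        ++ctx->lookups;                             // uses only the passed-in context
        return name != nullptr && name[0] != '\0';  // placeholder logic
    }
};

bool demo_call() {
    DemoContext ctx;
    // No DemoDirector instance exists here, and none is required.
    return DemoDirector::is_overridden("finalize", &ctx) && ctx.lookups == 1;
}
```

The point of the static form is that the call is well-defined regardless of the state of any director object, since nothing is dereferenced except the context the caller already holds.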

I think there's something screwy going on, as the object destruction seems to happen too early (compared to when I'd expect it to happen, and to when it happens under Python). I'm dubious about how disowning is implemented for PHP (it's nothing like the implementation for Python or Perl, but instead does stuff with the "newobject" flag), but attempting to reimplement that machinery for 3.0.4 seems extremely foolish. With this patch, we aren't calling a method on a NULL pointer or a deleted object, making it better than the situation either before or after 682b4dd.

I'm hopeful this will pass the testsuite for a ZTS build too, but that has been broken for some time, so I don't think holding up 3.0.4 to test that is worthwhile.

@pierre-labastie
Author

About building PHP, with or without ZTS, I give below the instructions I use to build it in /usr/local (so no interference with system PHP). You then just have to change your PATH so that it begins with /usr/local/bin.
I think you need (on Debian) zlib1g-dev, libgmp-dev, libgdbm-dev, libgettextpo-dev, libreadline6-dev, libbz2-dev, libxml2-dev. I may have forgotten some (I assume you have a development environment, with the libc and libstdc++ development versions).

./configure --with-zlib \
            --enable-bcmath \
            --with-bz2 \
            --enable-dba=shared \
            --with-gdbm \
            --with-gmp \
            --enable-ftp \
            --with-gettext \
            --enable-mbstring \
            --with-readline \
            --enable-maintainer-zts

If you do not want ZTS, just remove the last line (and the trailing backslash on the line before it). If you want debugging symbols for use with gdb, add --enable-debug.

@pierre-labastie
Author

@ojwb Great job: the last commits you have done allow all the tests to pass with ZTS enabled, except the director_thread one (all the tests pass without ZTS).

@ojwb
Member

ojwb commented Jan 14, 2015

How to build it isn't the issue - it's that it's a slow build (https://buildd.debian.org/status/logs.php?pkg=php5&arch=amd64 suggests ~30-90 minutes, though I bet it's longer on my netbook), plus I commonly work on SWIG on at least 4 machines and would need to rebuild a newer version periodically. I don't really want most of my SWIG hacking time to be spent building custom versions of PHP.

I take it that the director_thread testcase still fails like this:

checking php testcase director_thread (with run test)
[Thu Apr 3 10:00:14 2014] Script: '/sources/swig/swig-git/Examples/test-suite/php/director_thread_runme.php'
---------------------------------------
/sources/php/php-5.5.10/Zend/zend_hash.c(563) : Block 0x7f02c85f50d8 status: Invalid pointer: ((thread_id=0xC8E34700) != (expected=0xCC552700)) 

This error reads to me that PHP has noted down the current thread id when creating a data structure and then fails when it isn't the same on a later access. As I said above already, we're assuming that one can call back into the target language interpreter from a different thread, which isn't necessarily going to work for all target languages. I don't see that we can fix this - it just seems it inherently isn't supported by PHP (but in a non-ZTS build, it presumably doesn't bother to check thread_id in this way).
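
To illustrate that reading of the error (a sketch only: `DemoBlock` is a hypothetical structure, not PHP's actual memory manager), a block that records its creating thread will flag any access made from another thread:

```cpp
#include <cassert>
#include <thread>

// Hypothetical block header that remembers which thread created it.
struct DemoBlock {
    std::thread::id owner = std::this_thread::get_id();

    // True only when called on the thread that constructed the block.
    bool accessed_from_owner() const {
        return std::this_thread::get_id() == owner;
    }
};

// Returns true if an access from a second thread is seen as foreign.
bool demo_foreign_access_detected() {
    DemoBlock block;                      // created on the calling thread
    assert(block.accessed_from_owner());  // same thread: check passes

    bool ok_on_worker = true;
    std::thread worker([&] { ok_on_worker = block.accessed_from_owner(); });
    worker.join();                        // join makes the write visible here

    return !ok_on_worker;                 // worker's id differs from owner
}
```

A director callback arriving on a different thread would be in the position of `worker` here, which is consistent with the mismatch between `thread_id` and `expected` in the log above.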

Having a known failure like this isn't useful, so I've disabled this runme in ecf3ab5.

Which means we can at long last close this ticket. Thanks to everyone for your help and patience.

@ojwb closed this Jan 14, 2015
@pierre-labastie
Author

@ojwb: With a Core i5 at 2.6 GHz, it takes a couple of minutes to build PHP with the instructions above (using make -j5). I do not know what the Debian timings mean. I guess they build several flavours of PHP, and maybe a lot of modules, which are not built (and not useful for SWIG tests) with the instructions above. Anyway, it is up to you, and I congratulate you again for resolving the bug.
