Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

segault in tes suite with 7.4.0beta1 #295

Open
remicollet opened this issue Jul 23, 2019 · 19 comments
Open

segault in tes suite with 7.4.0beta1 #295

remicollet opened this issue Jul 23, 2019 · 19 comments

Comments

@remicollet
Copy link
Contributor

@remicollet remicollet commented Jul 23, 2019

=====================================================================
FAILED TEST SUMMARY
---------------------------------------------------------------------
Casting color and opacity to pixel [tests/003_cast_color_opacity.phpt]
Testing clone keyword [tests/004_clone.phpt]
Test thumbnail bestfit [tests/005_bestfit.phpt]
Test cropthumbnail [tests/006_cropthumbnail.phpt]
Test filling thumbnail with color [tests/007_thumbnail_fill.phpt]
Test importimagepixels [tests/010_importimagepixels.phpt]
Testing that cloned object does not affect the original [tests/012-clone-separation.phpt]
Test Imagick, progressMonitor [tests/127_Imagick_progressMonitor_basic.phpt]
Test Imagick, shaveImage [tests/138_Imagick_shaveImage_basic.phpt]
ImagickKernel::fromMatrix test [tests/145_imagickkernel_coverage.phpt]
Test Imagick, Imagick::exportImagePixels [tests/256_Imagick_exportImagePixels_basic.phpt]
Test localContrastImage [tests/260_localContrastImage.phpt]
Test compositeImageGravity [tests/261_compositeImageGravity.phpt]
Test autoGammaImage [tests/263_autoGammaImage.phpt]
Test Imagick::colorDecisionListImage [tests/277_Imagick_colorDecisionListImage.phpt]
Test Imagick::optimizeimagelayers and Imagick::optimizeimagetransparency [tests/278_Imagick_optimaze_gif.phpt]
Test PECL bug #20636 [tests/bug20636.phpt]
Test PHP bug #59378 writing to php://memory is incomplete [tests/bug59378.phpt]
Imagick::resizeImage prevent 0 width/height images [tests/github_174.phpt]
=====================================================================

Early report, will dig later

@remicollet

This comment has been minimized.

Copy link
Contributor Author

@remicollet remicollet commented Jul 23, 2019

Strange backtrace:

#0  0x00007fffef9ebb56 in ?? () from /lib64/libgomp.so.1
#1  0x00007fffef9e9448 in ?? () from /lib64/libgomp.so.1
#2  0x00007ffff727e58e in start_thread () from /lib64/libpthread.so.0
#3  0x00007ffff73b0713 in clone () from /lib64/libc.so.6

Notice: test suite passes with 7.2 and same version of IM (6.9.10-56)

@Danack

This comment has been minimized.

Copy link
Collaborator

@Danack Danack commented Jul 23, 2019

Do you have any other extensions loaded that are touching threads?

I think I looked at some similar bugs before, and the result was adding this to the readme: https://github.com/Imagick/imagick#openmp

If you don't have any other extensions loaded, and the tests were working reliably before on the same system, and now they're failing consistently, I think doing a git bisect (fun times) on ImageMagick would be the way to progress this issue.

@remicollet

This comment has been minimized.

Copy link
Contributor Author

@remicollet remicollet commented Jul 23, 2019

Do you have any other extensions loaded that are touching threads?

No, test suite run without any other ext

export TEST_PHP_EXECUTABLE=$(which php)
export TEST_PHP_ARGS="-n -d extension=$PWD/modules/imagick.so"
php -n run-tests.php -q --show-diff

I think doing a git bisect (fun times) on ImageMagick

PHP 7.2 and 7.3 build with same IM version are OK...
Rather a change in 7.4 ? (last build OK from me was with alpha1)

@remicollet

This comment has been minimized.

Copy link
Contributor Author

@remicollet remicollet commented Jul 23, 2019

Indeed, setting <policy domain="resource" name="thread" value="1"/> allow test suite to pass...

@remicollet

This comment has been minimized.

Copy link
Contributor Author

@remicollet remicollet commented Jul 23, 2019

Tried with some old versions (6.9.10-40 from April and 6.9.10-46 from May) and encounter the same segfaults.

Notice: imagick 3.4.4 was build May 29th (~7.4.0-alpha1) without any issue (IIRC with IM 6.9.10-48)

@remicollet

This comment has been minimized.

Copy link
Contributor Author

@remicollet remicollet commented Jul 23, 2019

@krakjoe perhaps, as a thread expert, you may have some idea about recent change in PHP which may explain this ?
(notice: this happen for a NTS build, but IM use thread for fatest computation)

@Danack

This comment has been minimized.

Copy link
Collaborator

@Danack Danack commented Jul 23, 2019

PHP 7.2 and 7.3 build with same IM version are OK...
Rather a change in 7.4 ?

The impression I got from looking at similar bug reports was that it might be due to a race condition as although I could get it to happen occasionally, it was never that reliable a reproduce case e.g. 1 in 10 times.

If it's happening constantly for you, then theoretically it should be easier to investigate.

Do you have a Dockerfile for it?

@remicollet

This comment has been minimized.

Copy link
Contributor Author

@remicollet remicollet commented Jul 24, 2019

Do you have a Dockerfile for it?

Dirty and Quickly written

FROM fedora:30

RUN dnf -y update

RUN dnf -y install 'dnf-command(config-manager)'
RUN dnf -y install https://rpms.remirepo.net/fedora/remi-release-30.rpm
RUN dnf config-manager --set-enabled remi
RUN dnf config-manager --set-enabled remi-debuginfo
RUN dnf -y install php74 php74-php-devel php74-php-debuginfo ImageMagick-devel gdb git make
RUN scl enable php74 'php -v'
RUN git clone https://github.com/Imagick/imagick.git
RUN scl enable php74 'cd imagick; phpize; ./configure'
RUN scl enable php74 'cd imagick; make -j4; make test TESTS="-q --show-diff"'
@Danack

This comment has been minimized.

Copy link
Collaborator

@Danack Danack commented Jul 25, 2019

From 8 years ago: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=652960 a very similar backtrace:

Backtrace is
#0  do_wait (val=12, addr=0x100f154) at ../../../src/libgomp/config/linux/wait.h:53
#1  gomp_barrier_wait_end (bar=0x100f150, state=12) at ../../../src/libgomp/config/linux/bar.c:49
#2  0x00007ffff1a8964e in gomp_thread_start (xdata=<optimized out>) at ../../../src/libgomp/team.c:119
#3  0x00007ffff49c1b40 in start_thread (arg=<optimized out>) at pthread_create.c:304
#4  0x00007ffff4ec236d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#5  0x0000000000000000 in ?? ()
@Danack

This comment has been minimized.

Copy link
Collaborator

@Danack Danack commented Jul 25, 2019

Thanks for the docker file. I've added a version that to the repo's docker compose to make this easier for anyone to debug.

Checking out that branch and then doing docker-compose up --build fedora will bring a box up, with everything read to compile inside it, and instructions on how to do that on the screen in front of you.

@Danack

This comment has been minimized.

Copy link
Collaborator

@Danack Danack commented Jul 25, 2019

Opened upstream bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91256

fyi, although yesterday the segfault was happening almost consistently, today with the weather being warmer, it seems to be happening less consistently.

@Danack

This comment has been minimized.

Copy link
Collaborator

@Danack Danack commented Jul 25, 2019

My spidey sense is tingling on the word TLS: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52738

Also, in PHP_MSHUTDOWN_FUNCTION changing:

MagickWandTerminus();

to

MagickWandTerminus();
usleep(2000000);

apparently makes the problem 'go away'.

@Danack

This comment has been minimized.

Copy link
Collaborator

@Danack Danack commented Jul 25, 2019

@Danack

This comment has been minimized.

Copy link
Collaborator

@Danack Danack commented Jul 25, 2019

So according to a guy named Jeff the correct thing to do would be something like:

#if defined(MAGICKCORE_OPENMP_SUPPORT)
	omp_pause_resource_all(omp_pause_hard);
#endif

Difficulty - the function is omp_pause_resource_all is only available in GCC 9, which isn't available on many systems.

Calling:

usleep(1000);

Just after MagickWandTerminus(); in the module shutdown function appears to make the problem go away.

Testing with usleep(1); and the problem is still there...

@Danack

This comment has been minimized.

Copy link
Collaborator

@Danack Danack commented Jul 30, 2019

@remicollet Don't read this until you're off holiday. It can wait until then, but putting it here so I don't forget to check it in.

What I've implemented in the libgomp_segfault branch is, not one, but two new ini settings.

  • imagick.set_single_thread - default Off. If this is set to truthy, the thread limit will be set to one on Imagick module init. If it's falsy there is no change.

  • imagick.shutdown_sleep_count - default 10. This is an integer number for the number of milliseconds() to sleep (achieved through multiple usleep(1000) calls) in the module shutdown.

The reasons I've done it like this are that the sleep hack appears to completely avoid segfaults on shutdown, so I don't think always setting the thread count to 1 is required. However there could easily be platforms where it still segfaults. Having it as an ini setting allows either the end-user, or release manager to tweak that setting on discovery of new information from that particularly platform.

The sleep hack is still going to be required as well as the thread limit setting, as there will be people who want to use multiple threads, but who also would prefer that their software didn't segault randomly.

Doing it like that I think maximises the chance of there being no problems for people, while still allowing sys-admins to control the number of threads used without touching code, which would be a bit annoying otherwise.

Let me know your thoughts when you get back.

@remicollet

This comment has been minimized.

Copy link
Contributor Author

@remicollet remicollet commented Aug 27, 2019

This workaround seems OK to me

And FYI, gmagick is affected in exactly the same way ;)
https://bugs.php.net/bug.php?id=78465

@Danack

This comment has been minimized.

Copy link
Collaborator

@Danack Danack commented Aug 27, 2019

Cool. Guess it's time for a release.

I'll do that imminently, but I'm going to finish and release my secret project first, so that there can be a link in the release notes.

@remicollet

This comment has been minimized.

Copy link
Contributor Author

@remicollet remicollet commented Nov 25, 2019

Cool. Guess it's time for a release.

;)

@ludoge

This comment has been minimized.

Copy link

@ludoge ludoge commented Dec 11, 2019

Platform.sh's imagick support on 7.4 is blocking on this issue - any news on that release?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.