Very high occurrence of segmentation faults in production using PHP FPM - unable to use it #634

Closed
Tracked by #740
MattBred opened this issue Mar 3, 2022 · 7 comments
Labels: agent-php, bug (Something isn't working)

Comments


MattBred commented Mar 3, 2022

Describe the bug
Version: 1.4.2
System: Ubuntu 20.04 ARM64 (AWS graviton)

I was getting a segfault in my PHP-FPM logs every 1-3 minutes. I confirmed it's from the Elastic APM agent: when I turn the extension off, the segfaults stop, and when I turn it back on, they happen again.

Unfortunately, I cannot use the extension, as it is causing far too many errors in production.

I dumped the core and looked at the backtrace with GDB. It looks like an infinite loop/recursion? The entire stack trace is the sequence below on repeat, about 34,000 frames long.

#1880 0x0000ffff90ce6ae8 in elasticApmZendErrorCallback () from /usr/lib/php/20190902/elastic_apm.so
#1881 0x0000aaaac9aa242c in ?? ()
#1882 0x0000aaaac9aa29bc in zend_error ()
#1883 0x0000aaaac9aa2c70 in zend_throw_error ()
#1884 0x0000aaaac9aa1f88 in ?? ()
#1885 0x0000aaaac9c3e77c in zend_fetch_class_by_name ()
#1886 0x0000aaaac9cbd4fc in ?? ()
#1887 0x0000aaaac9cdbbf8 in execute_ex ()
#1888 0x0000aaaac9c3d7b8 in zend_call_function ()
#1889 0x0000aaaac9c3db3c in _call_user_function_ex ()
#1890 0x0000ffff90cf9ee8 in callPhpFunction () from /usr/lib/php/20190902/elastic_apm.so
#1891 0x0000ffff90cfa4b4 in callPhpFunctionRetVoid () from /usr/lib/php/20190902/elastic_apm.so
#1892 0x0000ffff90cf8238 in onPhpErrorToTracerPhpPart () from /usr/lib/php/20190902/elastic_apm.so
#1893 0x0000ffff90ce6880 in elasticApmZendErrorCallbackImpl () from /usr/lib/php/20190902/elastic_apm.so
#1894 0x0000ffff90ce6ae8 in elasticApmZendErrorCallback () from /usr/lib/php/20190902/elastic_apm.so
#1895 0x0000aaaac9aa242c in ?? ()
#1896 0x0000aaaac9aa29bc in zend_error ()
#1897 0x0000aaaac9aa2c70 in zend_throw_error ()
#1898 0x0000aaaac9aa1f88 in ?? ()
#1899 0x0000aaaac9c3e77c in zend_fetch_class_by_name ()
#1900 0x0000aaaac9cbd4fc in ?? ()
#1901 0x0000aaaac9cdbbf8 in execute_ex ()
#1902 0x0000aaaac9c3d7b8 in zend_call_function ()
#1903 0x0000aaaac9c3db3c in _call_user_function_ex ()
#1904 0x0000ffff90cf9ee8 in callPhpFunction () from /usr/lib/php/20190902/elastic_apm.so
#1905 0x0000ffff90cfa4b4 in callPhpFunctionRetVoid () from /usr/lib/php/20190902/elastic_apm.so
#1906 0x0000ffff90cf8238 in onPhpErrorToTracerPhpPart () from /usr/lib/php/20190902/elastic_apm.so
#1907 0x0000ffff90ce6880 in elasticApmZendErrorCallbackImpl () from /usr/lib/php/20190902/elastic_apm.so
#1908 0x0000ffff90ce6ae8 in elasticApmZendErrorCallback () from /usr/lib/php/20190902/elastic_apm.so
#1909 0x0000aaaac9aa242c in ?? ()
#1910 0x0000aaaac9aa29bc in zend_error ()
#1911 0x0000aaaac9aa2c70 in zend_throw_error ()
#1912 0x0000aaaac9aa1f88 in ?? ()
#1913 0x0000aaaac9c3e77c in zend_fetch_class_by_name ()
#1914 0x0000aaaac9cbd4fc in ?? ()
#1915 0x0000aaaac9cdbbf8 in execute_ex ()
#1916 0x0000aaaac9c3d7b8 in zend_call_function ()
#1917 0x0000aaaac9c3db3c in _call_user_function_ex ()
#1918 0x0000ffff90cf9ee8 in callPhpFunction () from /usr/lib/php/20190902/elastic_apm.so
#1919 0x0000ffff90cfa4b4 in callPhpFunctionRetVoid () from /usr/lib/php/20190902/elastic_apm.so
#1920 0x0000ffff90cf8238 in onPhpErrorToTracerPhpPart () from /usr/lib/php/20190902/elastic_apm.so
#1921 0x0000ffff90ce6880 in elasticApmZendErrorCallbackImpl () from /usr/lib/php/20190902/elastic_apm.so
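
For anyone triaging a similar core dump, a quick way to confirm that a long backtrace really is one sequence on repeat is to extract the function names and look for the shortest repeating period. The following is a minimal Python sketch, assuming the backtrace has been saved as plain text in the format shown above; the helper names `frame_names` and `shortest_cycle` are made up for this illustration and are not part of GDB or the agent.

```python
import re

# Matches GDB backtrace lines such as:
#   #1882 0x0000aaaac9aa29bc in zend_error ()
# Lines that are not frames (for example pager prompts) do not match.
FRAME_RE = re.compile(r"^#(\d+)\s+0x[0-9a-fA-F]+\s+in\s+(\S+)\s*\(")


def frame_names(backtrace_text):
    """Return the function name of each frame, in the order GDB printed them."""
    names = []
    for line in backtrace_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            names.append(match.group(2))
    return names


def shortest_cycle(names, min_repeats=2):
    """Return the smallest period p for which the whole list repeats every
    p frames (at least min_repeats times), or None if there is none."""
    for period in range(1, len(names) // min_repeats + 1):
        if all(names[i] == names[i - period] for i in range(period, len(names))):
            return period
    return None
```

Feed `shortest_cycle` only the repeating portion of the trace (such as the excerpt above); the non-repeating frames at the very bottom of the stack would otherwise prevent a clean period from being found.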

This is the beginning of the trace (the outermost frames, at the bottom of the stack):

#34193 0x0000aaaac9aa242c in ?? ()
#34194 0x0000aaaac9aa29bc in zend_error ()
#34195 0x0000aaaac9aa2c70 in zend_throw_error ()
#34196 0x0000aaaac9aa1f88 in ?? ()
#34197 0x0000aaaac9c3e77c in zend_fetch_class_by_name ()
#34198 0x0000aaaac9cbd4fc in ?? ()
#34199 0x0000aaaac9cdbbf8 in execute_ex ()
#34200 0x0000aaaac9c3d7b8 in zend_call_function ()
#34201 0x0000aaaac9c3db3c in _call_user_function_ex ()
#34202 0x0000ffff90cf9ee8 in callPhpFunction () from /usr/lib/php/20190902/elastic_apm.so
#34203 0x0000ffff90cfa4b4 in callPhpFunctionRetVoid () from /usr/lib/php/20190902/elastic_apm.so
#34204 0x0000ffff90cf8238 in onPhpErrorToTracerPhpPart () from /usr/lib/php/20190902/elastic_apm.so
#34205 0x0000ffff90ce6880 in elasticApmZendErrorCallbackImpl () from /usr/lib/php/20190902/elastic_apm.so
#34206 0x0000ffff90ce6ae8 in elasticApmZendErrorCallback () from /usr/lib/php/20190902/elastic_apm.so
#34207 0x0000aaaac9aa242c in ?? ()
#34208 0x0000aaaac9aa29bc in zend_error ()
#34209 0x0000aaaac9c33ac0 in ?? ()
#34210 0x0000aaaac9c36ccc in ?? ()
#34211 0x0000aaaac9c300a0 in ?? ()
#34212 0x0000aaaac9c37a7c in ?? ()
#34213 0x0000aaaac9c38bb0 in ?? ()
#34214 0x0000aaaac9c379e4 in ?? ()
#34215 0x0000aaaac9c3a3e4 in ?? ()
#34216 0x0000aaaac9c37b64 in ?? ()
#34217 0x0000aaaac9c38bb0 in ?? ()
#34218 0x0000aaaac9c379e4 in ?? ()
#34219 0x0000aaaac9c39154 in ?? ()
#34220 0x0000aaaac9c37b4c in ?? ()
#34221 0x0000aaaac9c38bb0 in ?? ()
#34222 0x0000aaaac9c379e4 in ?? ()
#34223 0x0000aaaac9c39860 in ?? ()
#34224 0x0000aaaac9c3a81c in ?? ()
#34225 0x0000aaaac9c3a848 in ?? ()
#34226 0x0000aaaac9c10f9c in ?? ()
#34227 0x0000aaaac9c126e0 in compile_file ()
#34228 0x0000ffff90ed2284 in ?? () from /usr/lib/php/20190902/phar.so
#34229 0x0000ffff932b9a10 in ?? () from /usr/lib/php/20190902/opcache.so
#34230 0x0000ffff932bc5a8 in ?? () from /usr/lib/php/20190902/opcache.so
#34231 0x0000aaaac9c9e860 in ?? ()
#34232 0x0000aaaac9cc02a8 in ?? ()
#34233 0x0000aaaac9cda44c in execute_ex ()
#34234 0x0000aaaac9c3d7b8 in zend_call_function ()
#34235 0x0000aaaac9b47fa4 in ?? ()
#34236 0x0000aaaac9c3d5ac in zend_call_function ()
#34237 0x0000aaaac9c3ddbc in zend_lookup_class_ex ()
#34238 0x0000aaaac9c3e664 in zend_fetch_class_by_name ()
#34239 0x0000aaaac9cbdf88 in ?? ()
#34240 0x0000aaaac9cdbdf0 in execute_ex ()
#34241 0x0000aaaac9ce465c in zend_execute ()
#34242 0x0000aaaac9c4d1ec in zend_execute_scripts ()
#34243 0x0000aaaac9bebb20 in php_execute_script ()
#34244 0x0000aaaac9ab7510 in ?? ()
#34245 0x0000ffff95423d50 in __libc_start_main (main=0xaaaac9ab6b00, argc=4, argv=0xffffe25d9de8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=<optimized out>) at ../csu/libc-start.c:308
#34246 0x0000aaaac9ab844c in _start ()
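
One possible reading of these frames, offered as an interpretation and not a confirmed diagnosis: the extension's error callback forwards the error to the tracer's PHP part, that PHP code itself raises an error (a failed class lookup in the trace), and the new error re-enters the same callback until the stack is exhausted. The toy Python model below only illustrates that pattern and a generic re-entrancy guard. `ErrorHook` and `failing_handler` are hypothetical names, this is not the agent's C code, and it is not necessarily how any release addressed the problem.

```python
class ErrorHook:
    """Toy model of an engine-level error callback that forwards each error
    to a handler written in the scripting language itself."""

    def __init__(self, handler, guarded):
        self.handler = handler
        self.guarded = guarded
        self.in_handler = False  # re-entrancy flag
        self.depth = 0           # current nesting depth of on_error
        self.max_depth = 0       # deepest nesting observed so far

    def on_error(self, message):
        if self.guarded and self.in_handler:
            # Already forwarding an error: do not call the handler again.
            return
        self.depth += 1
        self.max_depth = max(self.max_depth, self.depth)
        self.in_handler = True
        try:
            self.handler(self, message)
        finally:
            self.in_handler = False
            self.depth -= 1


def failing_handler(hook, message):
    # Stand-in for handler code that triggers a new error while it is
    # still processing the first one.
    hook.on_error("error raised while handling: " + message[:20])
```

With `guarded=False`, calling `on_error` recurses until Python raises `RecursionError`, which plays the role of the stack overflow behind the segfault. With `guarded=True`, the nested error is dropped and the call returns after a single level.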

My Elastic APM config is:

extension = elastic_apm.so
elastic_apm.bootstrap_php_part_file = /opt/elastic/apm-agent-php/src/bootstrap_php_part.php
elastic_apm.enabled = false
elastic_apm.server_url = <removed>
elastic_apm.service_name = prod-app
elastic_apm.api_key = <removed>
elastic_apm.environment = prod
elastic_apm.transaction_max_spans = 500
elastic_apm.transaction_sample_rate = 0.02
elastic_apm.server_timeout = 2s

To Reproduce
I cannot say - it was happening every 1-3 minutes on production.

Expected behavior
Not to segfault.

@MattBred MattBred added the bug Something isn't working label Mar 3, 2022

pjaak commented Mar 24, 2022

I can confirm we are seeing the same behavior on 1.4.2

SergeyKleyman (Contributor) commented

@pjaak @MattBred Could you please check whether you are still experiencing the issue with the latest release (v1.6)? If so, could you please share a simple Docker image definition that reproduces it?

MattBred (Author) commented Sep 12, 2022

Sorry @SergeyKleyman, we're not going to be running this in production for the foreseeable future, and production is where I saw the issue occur.

Edit: Sorry I didn't mean to close with comment.

@MattBred MattBred reopened this Sep 12, 2022

pjaak commented Sep 12, 2022

Hi @SergeyKleyman ,
I will try out 1.6 and get back to you.
Thanks


xyu commented Nov 21, 2022

We are also seeing this problem on AMD64 with v1.5.2, but only with some requests. We will also give 1.6.x a try.


xyu commented Nov 29, 2022

For what it's worth, we were getting the same segfault (same recursion stack trace) on PHP 8.0 with v1.5.2, but no longer on PHP 8.1 with v1.6.2. Currently compiling PHP 8.0 with v1.6.2 to check whether it goes away there as well.

EDIT: This is not true. The segfaults seem to be happening in a different (later) place, so the page I'm testing with renders, but PHP-FPM still crashes.

intuibase (Contributor) commented

Hey @MattBred @xyu @pjaak

We were working on a similar issue and I think the fix will solve this problem too. Please verify whether the v1.8.4 release fixes the issue for you. If you are still having problems after updating, please reopen the issue.

Regards,
Pawel
