-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RedisCluster segfaults after second connection with cache_slots enabled #1591
Comments
@zorro-fr24 thank you for this report! |
Can confirm we also experienced this issue with a Laravel product on PHP 7.3.6. |
This should now be fixed in the develop branch. |
@zorro-fr24 Are you able to verify that the fix works on your end. We probably want to get 5.0.1 out soon since this bug crashes phpredis. |
Apologies, I was planning to test it directly, but I went on a poorly timed vacation ;) I was able to login and compile the latest develop (5276474) in our test environment and I am no longer seeing segfaults. Our applications seem to run/pass tests etc. Thanks for the quick fix! |
We released 5.0.1 with the fix of this issue. Thank again to @zorro-fr24 for detailed report which allowed quickly fix this critical bug : |
Hi guys - we are still experiencing some problems with cache_slots enabled. The problems we are seeing currently: number of redis connections grows indefinitely, and we are seeing random SIGABRT in php-fpm logs. When trying to run same test script as in issue description (under same environment except that now we run phpredis 5.0.1), we see segmentation fault: Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/bin/php...Reading symbols from /usr/lib/debug/usr/bin/php.debug...done.
done.
(gdb) run -d redis.clusters.cache_slots=0 cache_slots_testcase.php
Starting program: /usr/bin/php -d redis.clusters.cache_slots=0 cache_slots_testcase.php
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
1
[Inferior 1 (process 93728) exited normally]
(gdb) run -d redis.clusters.cache_slots=1 cache_slots_testcase.php
Starting program: /usr/bin/php -d redis.clusters.cache_slots=1 cache_slots_testcase.php
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
1
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff4bb87b9 in malloc_consolidate (av=av@entry=0x7ffff4efe760 <main_arena>) at malloc.c:4153
4153 unlink(av, p, bck, fwd);
(gdb) bt
#0 0x00007ffff4bb87b9 in malloc_consolidate (av=av@entry=0x7ffff4efe760 <main_arena>) at malloc.c:4153
#1 0x00007ffff4bb94ee in _int_free (av=0x7ffff4efe760 <main_arena>, p=0x555555ce3a80, have_lock=0) at malloc.c:4053
#2 0x0000555555833278 in zend_shutdown () at /usr/src/debug/php-7.2.20/Zend/zend.c:910
#3 0x00005555557cdc7b in php_module_shutdown () at /usr/src/debug/php-7.2.20/main/main.c:2453
#4 0x000055555563dceb in main (argc=6, argv=0x555555c75ec0) at /usr/src/debug/php-7.2.20/sapi/cli/php_cli.c:1418 ``` |
diff --git a/redis.c b/redis.c
index b087ac4..504fcda 100644
--- a/redis.c
+++ b/redis.c
@@ -559,8 +559,10 @@ free_reply_callbacks(RedisSock *redis_sock)
/* Passthru for destroying cluster cache */
static void cluster_cache_dtor(zend_resource *rsrc) {
- redisCachedCluster *rcc = (redisCachedCluster*)rsrc->ptr;
- cluster_cache_free(rcc);
+ if (rsrc->ptr) {
+ redisCachedCluster *rcc = (redisCachedCluster*)rsrc->ptr;
+ cluster_cache_free(rcc);
+ }
} @kalmykov could you try to apply this patch and test? |
Hi, @kalmykov Let us know if this solves your issue. II'll see if there's a simple way to add automatic tests that do things like use the connection pool and cluster cache across multiple connections. Right now all of our tests are via cli which makes bugs like this less likely to present themselves. Edit: Also it would be awesome if others can help test slot caching in more production like environment (staging or similar) before we prepare a 5.0.2 release. Neither @yatsukhnenko or myself use RedisCluster (let alone slot caching) in production so it's harder for us to do this. |
A bit more info for everybody. I've been running the RedisCluster unit tests under php-fpm all day and no longer see any crashes with @yatsukhnenko's patch applied. Without the patch I see the exact same I'm going to keep the tests running for a while to make sure we don't see any other issues. |
Ping @kalmykov, @zorro-fr24 |
Are you sure you have applied the patch? 😄 |
Yes, I suspected that as well 😄 On my local machine (OSX) I stopped getting this error, but not on our web servers. Here is what git diff says: index ad25b43..15ec3ce 100644
--- a/php_redis.h
+++ b/php_redis.h
@@ -25,7 +25,7 @@
#define PHP_REDIS_H
/* phpredis version */
-#define PHP_REDIS_VERSION "5.1.0-dev"
+#define PHP_REDIS_VERSION "5.1.0-cluster-cache-patch"
PHP_METHOD(Redis, __construct);
PHP_METHOD(Redis, __destruct);
diff --git a/redis.c b/redis.c
index b087ac4..504fcda 100644
--- a/redis.c
+++ b/redis.c
@@ -559,8 +559,10 @@ free_reply_callbacks(RedisSock *redis_sock)
/* Passthru for destroying cluster cache */
static void cluster_cache_dtor(zend_resource *rsrc) {
- redisCachedCluster *rcc = (redisCachedCluster*)rsrc->ptr;
- cluster_cache_free(rcc);
+ if (rsrc->ptr) {
+ redisCachedCluster *rcc = (redisCachedCluster*)rsrc->ptr;
+ cluster_cache_free(rcc);
+ }
} After that in phpredis dir I executed:
If I run |
Have you restarted php-fpm? |
Yes, however I am running php-cli when getting segfault so I don't think it makes a difference? Edit: I also ran php-fpm for a short while and got the same errors as before |
I'll see if I can get it to break in a CentOS docker environment. And just to be clear, you're able to segfault redis by just running that three line script above? Theoretically that should make it much simpler to fix once we can replicate the issue. |
If I run a script in cli, I get segfault in ~80%. If I run it through gdb I get segfault all the time |
Awesome, thanks for the info. We actually had a recent issue on production that only presented itself under CentOS, so hopefully it's as simple as running it there. Are you on the most current patch of CentOS 7 (1810 I think)? |
yes: |
Hi @kalmykov I've pushed a fix that incorporates @yatsukhnenko's changes and one more correction. This bug would've been detected if I had run the tests through valgrind as the double free was reported. It's probably smart to add valgrind to the automatic testing for situations like this. Hopefully this fixes the issue for you. |
@michael-grunder @yatsukhnenko thanks a lot! I compiled extension from |
Glad to hear it! I'll go ahead and close the issue but don't hesitate to let us know if you have any other issues. |
Hi guys, do you know when this fix is planned to be released in PECL? |
On Monday :) |
Expected behaviour
phpredis should be able to open a connection more than once without segfaulting with the slots cache enabled
Actual behaviour
We are seeing a segfault occur on the second connection attempt. With php-cli we can reproduce by simply connecting twice to the same endpoint in a row ( see below ). With php-fpm behind nginx we see a SIGSEGV on the second http request to the same endpoint.
Setting other cluster distribution and connection persistence and pooling options appear to make no difference
Let me know if you need any more details.
I'm seeing this behaviour on
Steps to reproduce, backtrace or example script
Installed packages:
yum list installed | grep php newrelic-php5.x86_64 8.7.0.242-1 @newrelic newrelic-php5-common.noarch 8.7.0.242-1 @newrelic php-cli.x86_64 7.2.20-1.el7.remi @remi-php72 php-common.x86_64 7.2.20-1.el7.remi @remi-php72 php-debuginfo.x86_64 7.2.20-1.el7.remi @remi-php72-debuginfo php-fedora-autoloader.noarch 1.0.0-1.el7 @epel php-fpm.x86_64 7.2.20-1.el7.remi @remi-php72 php-intl.x86_64 7.2.20-1.el7.remi @remi-php72 php-json.x86_64 7.2.20-1.el7.remi @remi-php72 php-mbstring.x86_64 7.2.20-1.el7.remi @remi-php72 php-mysqlnd.x86_64 7.2.20-1.el7.remi @remi-php72 php-opcache.x86_64 7.2.20-1.el7.remi @remi-php72 php-pdo.x86_64 7.2.20-1.el7.remi @remi-php72 php-pear.noarch 1:1.10.9-3.el7.remi @remi-php72 php-pecl-apcu.x86_64 5.1.17-1.el7.remi.7.2 @remi-php72 php-pecl-apcu-debuginfo.x86_64 5.1.17-1.el7.remi.7.2 @remi-php72-debuginfo php-pecl-igbinary.x86_64 3.0.1-1.el7.remi.7.2 @remi-php72 php-pecl-igbinary-debuginfo.x86_64 3.0.1-1.el7.remi.7.2 @remi-php72-debuginfo php-pecl-imagick.x86_64 3.4.4-1.el7.remi.7.2 @remi-php72 php-pecl-imagick-debuginfo.x86_64 3.4.4-1.el7.remi.7.2 @remi-php72-debuginfo php-pecl-memcached.x86_64 3.1.3-1.el7.remi.7.2 @remi-php72 php-pecl-memcached-debuginfo.x86_64 3.1.3-1.el7.remi.7.2 @remi-php72-debuginfo php-pecl-msgpack.x86_64 2.0.3-1.el7.remi.7.2 @remi-php72 php-pecl-msgpack-debuginfo.x86_64 2.0.3-1.el7.remi.7.2 @remi-php72-debuginfo php-pecl-redis5.x86_64 5.0.0-1.el7.remi.7.2 @remi-php72 php-pecl-redis5-debuginfo.x86_64 5.0.0-1.el7.remi.7.2 @remi-php72-debuginfo php-pgsql.x86_64 7.2.20-1.el7.remi @remi-php72 php-php-gettext.noarch 1.0.12-1.el7 @epel php-process.x86_64 7.2.20-1.el7.remi @remi-php72 php-soap.x86_64 7.2.20-1.el7.remi @remi-php72 php-xml.x86_64 7.2.20-1.el7.remi @remi-php72 php72-php-common.x86_64 7.2.20-1.el7.remi @remi-safe php72-php-json.x86_64 7.2.20-1.el7.remi @remi-safe php72-php-pecl-igbinary.x86_64 3.0.1-1.el7.remi @remi-safe php72-php-pecl-msgpack.x86_64 2.0.3-1.el7.remi @remi-safe php72-runtime.x86_64 2.0-1.el7.remi @remi-safe
Testcase:
Backtrace:
I've checked
develop
branchThe text was updated successfully, but these errors were encountered: