Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

uint32_t overflow cause busy loop in 1.0.18 #62

Closed
m6w6 opened this issue Jan 20, 2020 · 7 comments
Closed

uint32_t overflow cause busy loop in 1.0.18 #62

m6w6 opened this issue Jan 20, 2020 · 7 comments

Comments

@m6w6
Copy link
Collaborator

m6w6 commented Jan 20, 2020

Imported from Launchpad using lp2gh.


We are using libmemcached 1.0.18 and observed occasionally very high CPU consumption.
After some debugging, the cause is located in libmemcached/purge.cc :

bool memcached_purge(memcached_instance_st* ptr)
{
    .............
    uint32_t no_msg= memcached_server_response_count(ptr) - 1;
   for (uint32_t x= 0; x < no_msg; x++)     // busy loop here 
   {
         .....
   }
}

During debugging, we found that memcached_server_response_count(ptr) sometimes return a large value which equals to (uint32_t)(0-1) .

In order to fix this issue, we took an quick & dirty solution, which refuses to decrement by 1 when 0. This patch is attached below.

But still, I haven't found the root cause, and I'm afraid the root cause would leads to more possible hidden issues. So hope someone else could catch and fix this bug.

@m6w6 m6w6 added the New label Jan 20, 2020
@m6w6
Copy link
Collaborator Author

m6w6 commented Jan 20, 2020


diff --git a/libmemcached/common.h b/libmemcached/common.h
index 897b096..05345fc 100644
--- a/libmemcached/common.h
+++ b/libmemcached/common.h
@@ -205,7 +205,11 @@ static inline void memcached_server_response_increment(memcached_instance_st* in
 }
 #endif

-#define memcached_server_response_decrement(A) (A)->cursor_active_--
+#define memcached_server_response_decrement(A) do {         \
+          if ((A)->cursor_active_ > 0) {                    \
+                      (A)->cursor_active_--;                \
+          }
+
 #define memcached_server_response_reset(A) (A)->cursor_active_=0

 #define memcached_instance_response_increment(A) (A)->cursor_active_++

@m6w6
Copy link
Collaborator Author

m6w6 commented Jan 20, 2020


@m6w6
Copy link
Collaborator Author

m6w6 commented Jan 20, 2020

  • Comment by: lp:~brandond:
  • Created at: 2015-05-16T04:27:59Z

memcached_server_response_count(ptr) returns 0 when ptr is NULL or ptr->cursor_active is NULL. Decrementing this and assigning to a uint32_t produces 2^32-1, so the error check if (no_msg > 0) only fails if ptr->cursor_active == 1. I suspect the right answer is to not use uint32_t here (before it was uint32_t it was int; not sure why it was converted to unsigned) and instead use int32_t. That change is here:

http://bazaar.launchpad.net/~tangent-trunk/libmemcached/1.2/revision/509/libmemcached/memcached_purge.c

For what it's worth, I've seen this bug tickled when calling set with a key containing a newline (\n) from PHP 5.5's Memcached class. The Memcached server doesn't like that (seems to be interpreting the second line as a new command) and errors out -- eventually triggering this loop as it attempts to unwind from the error.

@m6w6
Copy link
Collaborator Author

m6w6 commented Jan 20, 2020

  • Comment by: lp:~kenorb:
  • Created at: 2015-11-08T19:22:54Z

The same in here, PHP just crashes.
PHP 5.6.15 (cli) (built: Oct 31 2015 14:39:02)

Bt:

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libmemcached.11.dylib         	0x000000010968328d memcached_purge(memcached_instance_st*) + 61
1   libmemcached.11.dylib         	0x0000000109681c8d io_flush(memcached_instance_st*, bool, memcached_return_t&) + 45
2   libmemcached.11.dylib         	0x0000000109681720 _io_write(memcached_instance_st*, void const*, unsigned long, bool, unsigned long&) + 165
3   libmemcached.11.dylib         	0x0000000109681769 memcached_io_write(memcached_instance_st*, void const*, unsigned long, bool) + 20
4   libmemcached.11.dylib         	0x000000010967f667 __mget_by_key_real(memcached_st*, char const*, unsigned long, char const* const*, unsigned long const*, unsigned long, bool) + 1247
5   libmemcached.11.dylib         	0x000000010967fd40 memcached_mget_by_key + 20
6   memcached.so                  	0x000000010965b8d9 php_memc_get_impl + 637
7   php                           	0x000000010894b458 dtrace_execute_internal + 126
8   php                           	0x00000001089c436e zend_do_fcall_common_helper_SPEC + 1450
9   php                           	0x0000000108981b12 execute_ex + 78

@m6w6
Copy link
Collaborator Author

m6w6 commented Jan 20, 2020

  • Comment by: lp:~kenorb:
  • Created at: 2015-11-08T19:23:07Z

This PR probably fixes the issue:
https://github.com/timbunce/Memcached-libmemcached/pull/21/files

@m6w6
Copy link
Collaborator Author

m6w6 commented Jan 20, 2020


Just to be clear, "This PR probably fixes the issue" in the previous comment refers to the fact that the Memcached-libmemcached repo contains a copy of the libmemcached code. It's the changes in that PR that need to be applied here to the upstream libmemcached code. So "This PR probably fixes the issue" -> "The changes in this PR probably fix the issue if we applied them here".

There's also the patch provided by "bin (0x3fffffff)" in the first comment on this issue.

I've not applied the PR to Memcached-libmemcached yet because I'm waiting on some acknowledgement from the libmemcached team of which fix is likely to be applied here.

Any news?

@m6w6
Copy link
Collaborator Author

m6w6 commented Jan 20, 2020

  • Comment by: lp:~lawlielt:
  • Created at: 2017-12-08T06:50:38Z

is there anyone to fix this in next release?

@m6w6 m6w6 removed the New label Jan 20, 2020
@m6w6 m6w6 closed this as completed in 9351ece Jan 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant