
Fix race condition issues between the main thread and module threads #12817

Merged: 41 commits merged into redis:unstable on Jan 19, 2024

Conversation

sundb (Collaborator) commented Nov 29, 2023

Fix #12785 and other race condition issues.
See the following isolated comments.

The following report was obtained using the thread sanitizer:

```sh
make SANITIZER=thread
./runtest-moduleapi --config io-threads 4 --config io-threads-do-reads yes --accurate
```
  1. Fixed thread-safe issue in RM_UnblockClient()
    Related discussion: #12817 (comment)

    • When blocking a client in a module using RM_BlockClientOnKeys() or RM_BlockClientOnKeysWithFlags()
      with a timeout_callback, calling RM_UnblockClient() in module threads can lead to race conditions
      in updateStatsOnUnblock().

      • Introduced:
        Version: 6.2
        PR: #7491

      • Touch:
        server.stat_numcommands, cmd->latency_histogram, server.slowlog, and server.latency_events

      • Harm Level: High
        Potentially corrupts the memory data of cmd->latency_histogram, server.slowlog, and server.latency_events

      • Solution:
        Differentiate whether the call to moduleBlockedClientTimedOut() comes from the module or the main thread.
        Since we can't know whether RM_UnblockClient() comes from a module thread, we always assume it does and
        let updateStatsOnUnblock() update the unblock stats asynchronously (a sketch of this direction appears at the end of this comment).

    • When an error reply is issued in timeout_callback(), the ctx is not thread-safe, which eventually leads to race conditions in afterErrorReply().

  2. Made RM_Reply*() family API thread-safe
    Related discussion: #12817 (comment)
    Call chain: RM_Reply*() -> _addReplyToBufferOrList() -> touch server.current_client

  3. Made freeClient() thread-safe
    Fix #12785 ([CRASH] Redis crashing on 7.2.3 version)

    • Introduced:
      Version: 4.0
      Commit: 3fcf959

    • Harm Level: Moderate

      • Trigger assertion
        It happens when the module thread calls freeClient while the io-thread is in progress,
        which only triggers an assertion and doesn't cause any race conditions.

      • Touch server.current_client, server.stat_clients_type_memory, and clientMemUsageBucket->clients.
        It happens between the main thread and the module threads, and may cause data corruption.

        1. Erroneously resets server.current_client to NULL; theoretically this won't happen,
           because the module has already restored server.current_client to its old value before entering freeClient.
        2. Corrupts clientMemUsageBucket->clients in updateClientMemUsageAndBucket().
        3. Causes the server.stat_clients_type_memory statistics to be inaccurate.
    • Solution:

      • No longer counts memory usage on fake clients, to avoid updating server.stat_clients_type_memory in freeClient.
      • No longer resetting server.current_client in unlinkClient, because the fake client won't be evicted or disconnected in the middle of the process.
      • Assert io_threads_op == IO_THREADS_OP_IDLE only if c is not a fake client.
  4. Fixed free client args without GIL
    Related discussion: Fix race condition issues between the main thread and module threads #12817 (comment)
    When freeing retained strings in the module thread (refcount decrement), or using them in some way (refcount increment), we should do so while holding the GIL;
    otherwise, they might be freed simultaneously while the main thread is processing the unblocked client state (a sketch follows this list).

  5. Fix adding fake client to server.clients_pending_write
    It will incorrectly log the memory usage for the fake client.
    Related discussion: #12817 (comment)

    • Introduced:
      Version: 4.0
      Commit: 9b01b64

    • Harm Level: None
      Only results in a NOP

    • Solution:

      • Don't add fake clients into server.clients_pending_write
      • Add a c->conn assertion to updateClientMemUsageAndBucket() and updateClientMemoryUsage() to avoid the same
        issue in the future.
        So now it is the responsibility of the caller of both of them to avoid passing in a fake client.
  6. Fix calling RM_BlockedClientMeasureTimeStart() and RM_BlockedClientMeasureTimeEnd() without GIL
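
  The sketch referenced in (4) above — an editor's illustration of the locking discipline, not this PR's code; `bg_t`, its fields, and the elided work are hypothetical names:

```c
#include "redismodule.h"

/* Hypothetical per-job context a module might pass to its worker thread. */
typedef struct {
    RedisModuleCtx *ctx;             /* thread-safe context */
    RedisModuleBlockedClient *bc;    /* the blocked client */
    RedisModuleString *str;          /* a retained argv string */
} bg_t;

/* Sketch only: any refcount change on a retained string happens while
 * holding the GIL, so the main thread can't free the string concurrently
 * while it processes the unblocked client. */
void *bg_worker(void *arg) {
    bg_t *bg = arg;
    /* ... long-running work that doesn't need the GIL ... */

    RedisModule_ThreadSafeContextLock(bg->ctx);    /* take the GIL */
    RedisModule_FreeString(bg->ctx, bg->str);      /* refcount decr under GIL */
    RedisModule_ThreadSafeContextUnlock(bg->ctx);  /* drop the GIL */

    RedisModule_UnblockClient(bg->bc, NULL);
    return NULL;
}
```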

Other issue

  1. RM_Yield is not thread-safe, fixed via Make RM_Yield thread-safe #12905.

Summary

  1. Fix thread-safety issues for RM_UnblockClient(), freeClient() and RM_Yield, potentially preventing memory corruption, data disorder, or assertion failures.
  2. Updated docs and module tests to clarify module API users' responsibility for locking non-thread-safe APIs in multi-threading, such as RM_BlockedClientMeasureTimeStart/End(), RM_FreeString(), RM_RetainString(), and RM_HoldString().

About backporting to 7.2

  1. The implementation of (1) is not entirely satisfying; more eyes would be welcome.
  2. (2) and (3) can be safely backported.
  3. (4) and (6) just modify the module tests and update the documentation; no need for a backport.
  4. (5) is harmless; no need for a backport.
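
For (1), a rough sketch of what letting updateStatsOnUnblock() "asynchronously update" could look like — an editor's illustration only, not this PR's actual code; `pending_stats`, `pending_stats_queue`, and `queue_push()` are hypothetical names:

```c
/* Sketch: a module thread must not touch server.stat_numcommands, the
 * latency histogram, the slowlog or the latency events directly, so it
 * records the numbers and hands them to the main thread, which applies
 * them on its next beforeSleep() iteration. */
typedef struct {
    client *c;
    long long background_duration;
    int had_errors;
} pending_stats;

void updateStatsOnUnblockAsync(client *c, long long duration, int had_errors) {
    pending_stats *s = zmalloc(sizeof(*s));
    s->c = c;
    s->background_duration = duration;
    s->had_errors = had_errors;
    queue_push(pending_stats_queue, s);  /* hypothetical; drained by main thread */
}
```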

src/networking.c Outdated
@@ -42,7 +42,7 @@ static void setProtocolError(const char *errstr, client *c);
static void pauseClientsByClient(mstime_t end, int isPauseClientAll);
int postponeClientRead(client *c);
char *getClientSockname(client *c);
int ProcessingEventsWhileBlocked = 0; /* See processEventsWhileBlocked(). */
sundb (Collaborator, Author) commented Nov 29, 2023:

start version: 7.0.0
introduced: #9963
reason: main thread reads this without getting the GIL.

WARNING: ThreadSanitizer: data race (pid=1969717)
  Write of size 4 at 0x558d01163de0 by thread T14 (mutexes: write M64):
    #0 processEventsWhileBlocked /data/redis_fork/src/networking.c:4116 (redis-server+0xe795d)
    #1 RM_Yield /data/redis_fork/src/module.c:2410 (redis-server+0x20f4c3)
    #2 RM_Yield /data/redis_fork/src/module.c:2381 (redis-server+0x20f4c3)
    #3 bg_call_worker /data/redis_fork/tests/modules/blockedclient.c:116 (blockedclient.so+0x9ade)

  Previous read of size 4 at 0x558d01163de0 by main thread (mutexes: write M86, write M82, write M84):
    #0 afterSleep /data/redis_fork/src/server.c:1867 (redis-server+0x9f4ab)
    #1 aeProcessEvents /data/redis_fork/src/ae.c:408 (redis-server+0x971dc)
    #2 aeMain /data/redis_fork/src/ae.c:496 (redis-server+0x971dc)
    #3 main /data/redis_fork/src/server.c:7212 (redis-server+0x846d5)

sundb (Collaborator, Author):

Another more serious race condition.
start version: 7.0.0
introduced: #9963

Main thread (A):
Module thread (B):
(A)beforesleep(release gil) -> (A)aeApiPoll -> (B)acquire gil -> (B)RM_Yield() -> (B)processEventsWhileBlocked() -> (B)aeApiPoll

WARNING: ThreadSanitizer: data race (pid=2734015)
  Write of size 4 at 0x7f79092f3000 by thread T14 (mutexes: write M64):
    #0 aeApiPoll /data/redis_fork/src/ae_epoll.c:127 (redis-server+0x8c367)
    #1 aeProcessEvents /data/redis_fork/src/ae.c:399 (redis-server+0x8d8ab)
    #2 processEventsWhileBlocked /data/redis_fork/src/networking.c:4119 (redis-server+0xde74a)
    #3 RM_Yield /data/redis_fork/src/module.c:2410 (redis-server+0x1dc38a)
    #4 bg_call_worker /data/redis_fork/tests/modules/blockedclient.c:116 (blockedclient.so+0x9ade)

  Previous write of size 4 at 0x7f79092f3000 by main thread (mutexes: write M86, write M82, write M84):
    #0 aeApiPoll /data/redis_fork/src/ae_epoll.c:127 (redis-server+0x8c367)
    #1 aeProcessEvents /data/redis_fork/src/ae.c:399 (redis-server+0x8d8ab)
    #2 aeMain /data/redis_fork/src/ae.c:496 (redis-server+0x8ddf4)
    #3 main /data/redis_fork/src/server.c:7212 (redis-server+0xb7691)

Member:

so we have one mutex protecting the times in which we do actual work (our GIL), and another mutex for protecting times when we go to sleep (not to use epoll in 2 threads at the same time)?

if we documented that using the RM_Yield API from a thread has to be done while locked, and it also attempts to lock the AE mutex, then it can only actually run in a very narrow time window. i wonder if that's the right approach.
maybe instead we can block the module thread, and wait for the main thread to perform an AE cycle?

i wonder why at all we allowed it to be run from a thread. i don't remember the use case.
@MeirShpilraien maybe you remember something?

Member:

I discussed this with Meir.
Context: calling RM_Yield from a thread is meant for the case where you locked the GIL for a long period of time, don't intend to release it, and meanwhile want to reply with -BUSY.
In this case, it is likely that the main thread is already blocked trying to lock the GIL.
However, if server.hz is really low, and there are no clients with pending commands, we can imagine that the main thread's event loop is never gonna wake up, in which case RM_Yield will hang (despite wishing to use AE_DONT_WAIT).

The alternative solution we wanna propose is that instead of adding a new mutex to guard the AE mechanism, we'll change the code so that we process these events from the main thread, instead of the module thread (sketched below):

  1. the module sets server.busy_module_yield_flags and server.busy_module_yield_reply.
  2. the module sends a character on the pipe to cause the main thread's event loop to wake up.
  3. the module unlocks the GIL, waits for redis to reach the event loop once, and then re-locks the GIL.

p.s. if we agree on the direction, maybe we should extract this topic to another PR?
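
A rough sketch of that three-step flow (editor's illustration; `wake_main_thread_via_pipe()` and `wait_for_one_event_loop_cycle()` are hypothetical placeholders, not existing Redis functions):

```c
/* Sketch of the proposed RM_Yield flow from a module thread;
 * steps mirror the numbered list above. */
void yield_from_module_thread(void) {
    /* 1. announce the busy state so redis replies -BUSY meanwhile */
    server.busy_module_yield_flags |= BUSY_MODULE_YIELD_EVENTS;

    /* 2. write one byte to a pipe so the main thread's event loop wakes up */
    wake_main_thread_via_pipe();

    /* 3. release the GIL, let the main thread run one event-loop cycle,
     *    then take the GIL back */
    moduleReleaseGIL();
    wait_for_one_event_loop_cycle();
    moduleAcquireGIL();
}
```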

sundb (Collaborator, Author):

It seems to be working, I'll give it a shot.

sundb (Collaborator, Author):

@oranagra We're missing the fact that RM_Yield() can be called in the main thread, and if we want to proceed in that direction, we'll need to use old behavior in the main thread, and new direction in the module thread.

Member:

ok, let's do that.
this brings back the detection of which thread we're running from, which is something we weren't happy with, but still i guess that's the right way forward.

@@ -155,6 +153,9 @@ void *bg_call_worker(void *arg) {
RedisModule_Free(bg->argv);
RedisModule_Free(bg);

sundb (Collaborator, Author) commented Nov 29, 2023:

start version: 6.2.0
introduced: #8141
reason: release client argv outside of GIL

WARNING: ThreadSanitizer: data race (pid=91603)
  Read of size 4 at 0x0001027f5534 by main thread (mutexes: write M0, write M1, write M2, write M3):
    #0 decrRefCount object.c:393 (redis-server:arm64+0x1000741f0)
    #1 freeClientArgv networking.c:1391 (redis-server:arm64+0x10005f504)
    #2 resetClient networking.c:2061 (redis-server:arm64+0x100061bfc)
    #3 unblockClient blocked.c:212 (redis-server:arm64+0x10016aa80)
    #4 moduleHandleBlockedClients module.c:8316 (redis-server:arm64+0x10019d4e4)
    #5 blockedBeforeSleep blocked.c:758 (redis-server:arm64+0x10016c5c4)
    #6 beforeSleep server.c:1745 (redis-server:arm64+0x100023a8c)
    #7 aeProcessEvents ae.c:379 (redis-server:arm64+0x10000ee58)
    #8 aeMain ae.c:496 (redis-server:arm64+0x100010148)
    #9 main server.c:7213 (redis-server:arm64+0x10003e538)

  Previous write of size 4 at 0x0001027f5534 by thread T7:
    #0 decrRefCount object.c:407 (redis-server:arm64+0x100074358)
    #1 RM_FreeString module.c:2696 (redis-server:arm64+0x100188f20)
    #2 bg_call_worker <null>:49743940 (usercall.so:arm64+0x30ec)

Member:

is it because of auto-memory?
the stack trace here seems unrelated to the release of the argv array.

sundb (Collaborator, Author):

I updated the stack trace.

Member:

so what do we conclude here?
that if a module retains the argv strings, it must release them before unblocking the client, or alternatively with the GIL locked?
@MeirShpilraien PTAL

oranagra (Member) commented Dec 7, 2023:

actually the order doesn't matter; if they both decr at the same time it can mess up the refcount.
i suppose we must clone the strings before branching?

Collaborator:

I think we should mention it on RM_FreeString, RM_RetainString, and RM_HoldString. We should mention that those functions are not thread-safe and should only be called while the GIL is held. We should decide if we want to document the exception for RM_FreeString in case you know you are the only owner.

Member:

p.s. for some reason i thought that RM_ReplyWithString will also be an issue, but now i see it doesn't touch the refcount. (at least not in the current implementation)

Member:

@sundb this one is linked to number 4 at the top, which says:

Harm Level: None
Trigger assertion

first of all, an assertion isn't "None": it terminates the process and can cause data loss. maybe change it to "Low"?
secondly this can also mess up the refcount, or cause double free and other issues that can lead to memory corruption.
am i missing anything?

sundb (Collaborator, Author):

my intention was that it is only caused by the module's use of the API, not by the internal implementation, so I chose None.
let's change it to Low.

Member:

ohh, ok, maybe in addition to the "Harm Level", we can add some explicit "Trigger", to specify that it depends on some rare case, and people can easily rule it out and know they're safe.
or we can leave it as is; the scenario is there, it's just a little hard to follow if you're not deep into the details.

@@ -415,7 +415,8 @@ void _addReplyToBufferOrList(client *c, const char *s, size_t len) {
      * after the command's reply (specifically important during multi-exec). the exception is
      * the SUBSCRIBE command family, which (currently) have a push message instead of a proper reply.
      * The check for executing_client also avoids affecting push messages that are part of eviction. */
-    if (c == server.current_client && (c->flags & CLIENT_PUSHING) &&
+    // TODO: should forbid the ReplyWith* module family api from being called outside the lock?
+    if ((c->flags & CLIENT_PUSHING) && c == server.current_client &&
sundb (Collaborator, Author):

start version: 7.2.0
introduced: #12326
reason: RM_Reply* is not thread-safe

WARNING: ThreadSanitizer: data race (pid=1969717)
  Write of size 8 at 0x558d01162be0 by main thread (mutexes: write M86, write M82, write M84, write M64):
    #0 processCommandAndResetClient /data/redis_fork/src/networking.c:2466 (redis-server+0xe1db4)
    #1 processInputBuffer /data/redis_fork/src/networking.c:2575 (redis-server+0xe1db4)
    #2 readQueryFromClient /data/redis_fork/src/networking.c:2715 (redis-server+0xe28da)
    #3 callHandler /data/redis_fork/src/connhelpers.h:79 (redis-server+0x2943f2)
    #4 connSocketEventHandler /data/redis_fork/src/socket.c:298 (redis-server+0x2943f2)
    #5 aeProcessEvents /data/redis_fork/src/ae.c:436 (redis-server+0x97245)
    #6 aeMain /data/redis_fork/src/ae.c:496 (redis-server+0x97245)
    #7 main /data/redis_fork/src/server.c:7212 (redis-server+0x846d5)

  Previous read of size 8 at 0x558d01162be0 by thread T14:
    #0 _addReplyToBufferOrList /data/redis_fork/src/networking.c:418 (redis-server+0xdc0d6)
    #1 addReplyProto /data/redis_fork/src/networking.c:474 (redis-server+0xdd0a7)
    #2 RM_ReplyWithCallReply /data/redis_fork/src/module.c:3424 (redis-server+0x212732)
    #3 bg_call_worker /data/redis_fork/tests/modules/blockedclient.c:145 (blockedclient.so+0x9c0d)

Member:

like #12817 (comment), this is not about the use of RM_AddReply, it's about using the argv strings (changing their refcount).
p.s. how did you conclude it's related to the above mentioned PR?

sundb (Collaborator, Author):

they're not related.
this is because the main thread may modify server.current_client while the module thread reads it.
however, c->flags & CLIENT_PUSHING is always false for module threads, so this is harmless.

Member:

@sundb this one at the top comment (number 2), says:

Harm Level: Low

but if that's just an access to a variable and then ignoring what we read from it, isn't that "Harm Level: None"?

sundb (Collaborator, Author):

You are right, it should be None.

Comment on lines +1635 to +1639
/* Remove the contribution that this client gave to our
* incrementally computed memory usage. */
if (c->conn)
server.stat_clients_type_memory[c->last_memory_type] -=
c->last_memory_usage;
sundb (Collaborator, Author):

start version: 7.0.0
introduced: #8687
reason: touch server.stat_clients_type_memory without GIL

WARNING: ThreadSanitizer: data race (pid=90167)
  Write of size 8 at 0x0001006f41e0 by main thread (mutexes: write M0, write M1, write M2, write M3):
    #0 updateClientMemoryUsage server.c:956 (redis-server:arm64+0x10001f760)
    #1 clientsCron server.c:1116 (redis-server:arm64+0x1000201fc)
    #2 serverCron server.c:1451 (redis-server:arm64+0x100021b10)
    #3 processTimeEvents ae.c:331 (redis-server:arm64+0x1000100b8)
    #4 aeProcessEvents ae.c:466 (redis-server:arm64+0x10000f614)
    #5 aeMain ae.c:496 (redis-server:arm64+0x1000103bc)
    #6 main server.c:7212 (redis-server:arm64+0x10003e76c)

  Previous write of size 8 at 0x0001006f41e0 by thread T7:
    #0 freeClient networking.c:1684 (redis-server:arm64+0x10005f30c)
    #1 moduleFreeContext module.c:834 (redis-server:arm64+0x1001821ac)
    #2 RM_FreeThreadSafeContext module.c:8494 (redis-server:arm64+0x10019dcf8)
    #3 worker <null>:17631300 (blockedclient.so:arm64+0x828)

Member:

this one is resolved by modifying the freeClient code, right?

sundb (Collaborator, Author):

Yes, and updateClientMemoryUsage() and clientEvictionAllowed() no longer track memory for clients without a connection, nor consider them for eviction.
However, this has the side effect that server.stat_clients_type_memory[CLIENT_TYPE_NORMAL] will be lower than it was before this fix.

Member:

we can't / shouldn't evict them anyway.
if we tracked them, it was wrong to do that.
we can list this as a fix (not about thread race) in the top comment.

sundb (Collaborator, Author):

I can't quite remember how I reproduced it.
I remember using RM_Call to reproduce it, but I forget which command; it may not have been the client no-evict command below.

Config:

maxmemory-clients 1g

Command:

RedisModule_Call(ctx,"client","cc","no-evict","off");

Patch:

int clientEvictionAllowed(client *c) {
    serverAssert(c->conn);
    if (server.maxmemory_clients == 0 || c->flags & CLIENT_NO_EVICT) {
        return 0;
    }
    int type = getClientType(c);
    return (type == CLIENT_TYPE_NORMAL || type == CLIENT_TYPE_PUBSUB);
}

serverAssert(c->conn); will be triggered.

sundb (Collaborator, Author):

please also have a look at #12817 (comment) and the top comment (7).

Member:

i don't think using RM_Call to call the CLIENT command is valid.
specifically the ones manipulating the current client, like enabling tracking, and so on.
the user may want to operate on the calling client, not the fake client, but that's not currently supported, and i think we should just disallow or disregard this case.

sundb (Collaborator, Author) commented Nov 29, 2023:

RM_UnblockClient is also not thread-safe.
RM_UnblockClient() -> moduleBlockedClientTimedOut() -> updateStatsOnUnblock() (not thread-safe)
FYI @MeirShpilraien

WARNING: ThreadSanitizer: data race (pid=29808)
  Read of size 8 at 0x000104940040 by main thread (mutexes: write M0):
    #0 serverCron server.c:1376 (redis-server:arm64+0x1000214e4)
    #1 processTimeEvents ae.c:331 (redis-server:arm64+0x1000100b8)
    #2 aeProcessEvents ae.c:466 (redis-server:arm64+0x10000f614)
    #3 aeMain ae.c:496 (redis-server:arm64+0x1000103bc)
    #4 main server.c:7212 (redis-server:arm64+0x10003e76c)

  Previous write of size 8 at 0x000104940040 by thread T4:
    #0 updateStatsOnUnblock blocked.c:110 (redis-server:arm64+0x10016a564)
    #1 moduleBlockedClientTimedOut module.c:8375 (redis-server:arm64+0x10019cf8c)
    #2 RM_UnblockClient module.c:8205 (redis-server:arm64+0x10019cd98)
    #3 worker <null>:87148612 (blockonkeys.so:arm64+0x2740)

patch:

diff --git a/tests/modules/blockonkeys.c b/tests/modules/blockonkeys.c
index 94bb36123..b978f698e 100644
--- a/tests/modules/blockonkeys.c
+++ b/tests/modules/blockonkeys.c
@@ -4,6 +4,7 @@
 #include <strings.h>
 #include <assert.h>
 #include <unistd.h>
+#include <pthread.h>
 
 #define UNUSED(V) ((void) V)
 
@@ -442,6 +443,16 @@ int blockonkeys_popall_timeout_callback(RedisModuleCtx *ctx, RedisModuleString *
     return RedisModule_ReplyWithError(ctx, "ERR Timeout");
 }
 
+void *worker(void *arg) {
+    // Retrieve blocked client
+    RedisModuleBlockedClient *bc = (RedisModuleBlockedClient *)arg;
+
+    // Unblock client
+    RedisModule_UnblockClient(bc, NULL);
+
+    return NULL;
+}
+
 /* BLOCKONKEYS.POPALL key
  *
  * Blocks on an empty key for up to 3 seconds. When unblocked by a list
@@ -452,13 +463,20 @@ int blockonkeys_popall(RedisModuleCtx *ctx, RedisModuleString **argv, int argc)
         return RedisModule_WrongArity(ctx);
 
     RedisModuleKey *key = RedisModule_OpenKey(ctx, argv[1], REDISMODULE_READ);
+    RedisModuleBlockedClient *bc;
     if (RedisModule_KeyType(key) == REDISMODULE_KEYTYPE_EMPTY) {
-        RedisModule_BlockClientOnKeys(ctx, blockonkeys_popall_reply_callback,
+        bc = RedisModule_BlockClientOnKeys(ctx, blockonkeys_popall_reply_callback,
                                       blockonkeys_popall_timeout_callback,
                                       NULL, 3000, &argv[1], 1, NULL);
     } else {
         RedisModule_ReplyWithError(ctx, "ERR Key not empty");
+        return REDISMODULE_OK;
     }
+
+    pthread_t tid;
+    int res = pthread_create(&tid, NULL, worker, bc);
+    assert(res == 0);
+    
     RedisModule_CloseKey(key);
     return REDISMODULE_OK;
 }

command: blockonkeys.popall k

@oranagra oranagra self-requested a review December 1, 2023 19:48
tests/modules/blockedclient.c (outdated, resolved)
Add `moduleOwnsGIL()` to determine if the GIL is acquired by the current thread, instead of using `ProcessingEventsWhileBlocked`
src/module.c (outdated, resolved)
src/server.c (outdated, resolved)
Comment on lines +8307 to +8310
if (c && !clientHasModuleAuthInProgress(c)) {
int had_errors = c->deferred_reply_errors ? !!listLength(c->deferred_reply_errors) :
(server.stat_total_error_replies != prev_error_replies);
updateStatsOnUnblock(c, bc->background_duration, reply_us, had_errors);
sundb (Collaborator, Author):

following #12817 (comment),
we call updateStatsOnUnblock() here when coming from RM_UnblockClient().

Member:

does it mean that before this PR it was called twice?
maybe we should add a comment in moduleBlockedClientTimedOut, explaining the if statement by referring to this call.

sundb (Collaborator, Author) commented Dec 11, 2023:

Before this PR, moduleHandleBlockedClients() skipped the unblock stats update when the client was blocked on keys.

original code:

if (c && !clientHasModuleAuthInProgress(c) && !bc->blocked_on_keys) {
    updateStatsOnUnblock();
}

oranagra (Member):

i lost track (or never had it), with all the comment threads here.
can you close the ones that are resolved (and list them in the top comment), so we can focus on what's not resolved?
or maybe it's too early, and you need more time to figure things out?

sundb (Collaborator, Author) commented Dec 12, 2023:

introduced: 9b01b64
reason: adding fake client to server.clients_pending_write

src/redis-server 127.0.0.1:21111(updateClientMemUsageAndBucket+0x5b)[0x55bed1f9efcf]
src/redis-server 127.0.0.1:21111(writeToClient+0x21a)[0x55bed1fc2cd9]
src/redis-server 127.0.0.1:21111(handleClientsWithPendingWrites+0xb8)[0x55bed1fc2de0]
src/redis-server 127.0.0.1:21111(handleClientsWithPendingWritesUsingThreads+0x52)[0x55bed1fc9bda]
src/redis-server 127.0.0.1:21111(beforeSleep+0x271)[0x55bed1fa094e]
src/redis-server 127.0.0.1:21111(aeProcessEvents+0x9e)[0x55bed1f9671b]
src/redis-server 127.0.0.1:21111(aeMain+0x2e)[0x55bed1f96b03]
src/redis-server 127.0.0.1:21111(main+0xd21)[0x55bed1fb084f]
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7ffa6ba29d90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7ffa6ba29e40]
src/redis-server 127.0.0.1:21111(_start+0x25)[0x55bed1f908d5]

@@ -2484,7 +2487,7 @@ int processCommandAndResetClient(client *c) {
     commandProcessed(c);
     /* Update the client's memory to include output buffer growth following the
      * processed command. */
-    updateClientMemUsageAndBucket(c);
+    if (c->conn) updateClientMemUsageAndBucket(c);
sundb (Collaborator, Author) commented Dec 12, 2023:

@oranagra this is another place that records the memory usage of fake clients.
when unblocking a fake client that is blocked on keys, it triggers:

=== REDIS BUG REPORT START: Cut & paste starting from here ===
21845:M 12 Dec 2023 13:25:36.347 # === ASSERTION FAILED ===
21845:M 12 Dec 2023 13:25:36.347 # ==> server.c:1023 'io_threads_op == IO_THREADS_OP_IDLE && c->conn' is not true

------ STACK TRACE ------

Backtrace:
0   redis-server                        0x0000000102d73374 updateClientMemUsageAndBucket.cold.1 + 32
1   redis-server                        0x0000000102c425dc updateClientMemUsageAndBucket + 412
2   redis-server                        0x0000000102c6b604 processCommandAndResetClient + 60
3   redis-server                        0x0000000102ceed08 handleClientsBlockedOnKeys + 792
4   redis-server                        0x0000000102c48f60 processCommand + 2816
5   redis-server                        0x0000000102c6b864 processInputBuffer + 312
6   redis-server                        0x0000000102c635d4 readQueryFromClient + 1364
7   redis-server                        0x0000000102d36380 connSocketEventHandler + 220
8   redis-server                        0x0000000102c3b720 aeProcessEvents + 1100
9   redis-server                        0x0000000102c3b840 aeMain + 32
10  redis-server                        0x0000000102c4f998 main + 2036
11  dyld                                0x00000001908a5058 start + 2224

failed test:

    test "Blpop on async RM_Call fire and forget" {
        assert_equal {Blocked} [r do_rm_call_fire_and_forget blpop l 0]
        r lpush l a
        assert_equal {0} [r llen l]
    }

Member:

i think that's a bug, they're not "evictable", and the user (application) doesn't control them (and their amount).
we don't show them in CLIENT LIST, and i think we should not count their memory in that mechanism (maybe we should count them elsewhere)

@oranagra oranagra assigned banker and unassigned banker Dec 17, 2023
src/module.c (outdated, resolved)
src/server.c (outdated, resolved)
src/server.c (outdated, resolved)
sundb (Collaborator, Author) commented Dec 18, 2023:

@oranagra moduleOwnsGIL() isn't quite right: pthread_t may not be a numeric type, it may be a struct. Not sure if you saw my earlier use of __thread to modify ProcessingEventsWhileBlocked; __thread isn't in the C99 standard, and I'm not sure whether I should use it, but it would be a good choice.

oranagra (Member):

@sundb is the last comment in the context of the el_poll_mutex, or a new discussion?
i'd rather start discussions in threads so we can focus on each one.
p.s. i think maybe we can compromise on __thread and have certain capabilities of redis require it. but the problem with pthread_t is probably an issue.
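
For what it's worth, a sketch of how moduleOwnsGIL() could sidestep the pthread_t comparability problem with pthread_equal() — an editor's illustration; `gil_owner` and `gil_owner_valid` are hypothetical names:

```c
#include <pthread.h>

/* Sketch: record the owner when the GIL is taken and compare with
 * pthread_equal(), which is valid even when pthread_t is a struct.
 * The flag is only trusted by the thread that set it, so the "I own
 * the GIL" answer never races for the owning thread. */
static pthread_t gil_owner;
static int gil_owner_valid = 0;

void gilAcquiredByMe(void) { gil_owner = pthread_self(); gil_owner_valid = 1; }
void gilAboutToBeReleased(void) { gil_owner_valid = 0; }

int moduleOwnsGIL(void) {
    return gil_owner_valid && pthread_equal(gil_owner, pthread_self());
}
```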

…easureTimeEnd() within GIL"

This reverts commit d272368.
sundb (Collaborator, Author) commented Jan 10, 2024:

> maybe the timeout callback can somehow terminate the thread before it messes with the timer?

force-terminating threads is not good practice; are these tests misleading module developers?
and if the module thread is already in RM_UnblockClient(), will there be unpredictable behavior if we terminate it?

> or maybe we should have let the module track the time on its own, and then just report it to redis at the end?

But the purpose of this test is to reply to the user in the middle of blocking, not to wait for the thread to end before replying.

oranagra (Member):

> maybe the timeout callback can somehow terminate the thread before it messes with the timer?
>
> force-terminating threads is not good practice; are these tests misleading module developers? and if the module thread is already in RM_UnblockClient(), will there be unpredictable behavior if we terminate it?

i see that's what you ended up implementing.

> or maybe we should have let the module track the time on its own, and then just report it to redis at the end?
>
> But the purpose of this test is to reply to the user in the middle of blocking, not to wait for the thread to end before replying.

the replying can be done during.. but reporting how much time was used to process the request can be reported to redis at the end (when you unblock the client, and the total time in that timer is added to the command stats)
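
A sketch of that "report it at the end" idea using the existing measure-time API — an editor's illustration; `do_background_work()` is a hypothetical placeholder, and per this PR these calls need the caller's own locking if several threads share one bc:

```c
#include "redismodule.h"

/* Sketch: measure only around the real work in the worker thread;
 * redis folds the measured time into the command stats when the
 * client is unblocked. */
void *worker(void *arg) {
    RedisModuleBlockedClient *bc = arg;
    RedisModule_BlockedClientMeasureTimeStart(bc);
    do_background_work();                        /* hypothetical */
    RedisModule_BlockedClientMeasureTimeEnd(bc);
    RedisModule_UnblockClient(bc, NULL);
    return NULL;
}
```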

src/module.c (outdated, resolved)

sundb and others added 2 commits January 19, 2024 20:29
Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: oranagra <oran@redislabs.com>
@oranagra oranagra merged commit d064002 into redis:unstable Jan 19, 2024
13 checks passed
@oranagra oranagra added the release-notes label (indication that this issue needs to be mentioned in the release notes) Jan 19, 2024
@sundb sundb deleted the thread-safe-freeclient branch January 19, 2024 13:16
PhanSon95:

We won't release a new version for this PR, right?

oranagra (Member):

not yet. but we might backport some of these fixes to the next release of 7.2 (see the bottom of the top comment)

oranagra pushed a commit that referenced this pull request Jan 31, 2024
… same event loop (#13015)

When we use a timer to unblock a client in a module, if the timer
period and the block timeout are very close, they will unblock the
client in the same event loop, and it will trigger the assertion.
The reason is that in moduleBlockedClientTimedOut we protect
against re-processing, so we don't actually call updateStatsOnUnblock
(see #12817), and thus we are not able to reset c->duration.

The root cause is that unblockClientOnTimeout() didn't realize that bc had
been unblocked. We add a function to the module API to determine whether bc
is blocked, and then use it in unblockClientOnTimeout() to exit early.

There is the stack:
```
beforeSleep
blockedBeforeSleep
handleBlockedClientsTimeout
checkBlockedClientTimeout
unblockClientOnTimeout
unblockClient
resetClient
-- assertion, crash the server
'c->duration == 0' is not true
```
roggervalf pushed a commit to roggervalf/redis that referenced this pull request Feb 11, 2024
## Issues and solutions from redis#12817
1. Touching ProcessingEventsWhileBlocked and calling moduleCount() without the GIL in afterSleep()
    - Introduced:
       Version: 7.0.0
       PR: redis#9963

   - Harm Level: Very High
     If the module thread calls `RM_Yield()` before the main thread enters afterSleep(), and modifies `ProcessingEventsWhileBlocked` (+1), it will cause the main thread to not wait for the GIL, which can lead to all kinds of unforeseen problems, including memory data corruption.

   - Initial / Abandoned Solution:
      * Added the `__thread` specifier for ProcessingEventsWhileBlocked. `ProcessingEventsWhileBlocked` is used to protect against nested event processing, but event processing in the main thread and module threads should be completely independent and unaffected, so it is safer to use TLS.
      * Added a cached module count to keep track of the current number of modules, to avoid having to use `dictSize()`.
    
    - Related Warnings:
```
WARNING: ThreadSanitizer: data race (pid=1136)
  Write of size 4 at 0x0001045990c0 by thread T4 (mutexes: write M0):
    #0 processEventsWhileBlocked networking.c:4135 (redis-server:arm64+0x10006d124)
    redis#1 RM_Yield module.c:2410 (redis-server:arm64+0x10018b66c)
    redis#2 bg_call_worker <null>:83232836 (blockedclient.so:arm64+0x16a8)

  Previous read of size 4 at 0x0001045990c0 by main thread:
    #0 afterSleep server.c:1861 (redis-server:arm64+0x100024f98)
    redis#1 aeProcessEvents ae.c:408 (redis-server:arm64+0x10000fd64)
    redis#2 aeMain ae.c:496 (redis-server:arm64+0x100010f0c)
    redis#3 main server.c:7220 (redis-server:arm64+0x10003f38c)
```

2. aeApiPoll() is not thread-safe
   When using RM_Yield to handle events in a module thread, if the main thread has not yet entered `afterSleep()`, both the module thread and the main thread may touch `server.el` at the same time.

    - Introduced:
       Version: 7.0.0
       PR: redis#9963

   - Old / Abandoned Solution:
     Adding a new mutex to protect the timing between after beforeSleep() and before afterSleep().
     Defect: if the main thread enters the ae loop without any IO events, it will wait until the next timeout or until there is any event again, and the module thread will always hang until the main thread leaves the event loop.

    - Related Warnings:
```
SUMMARY: ThreadSanitizer: data race ae_kqueue.c:55 in addEventMask
==================
==================
WARNING: ThreadSanitizer: data race (pid=14682)
  Write of size 4 at 0x000100b54000 by thread T9 (mutexes: write M0):
    #0 aeApiPoll ae_kqueue.c:175 (redis-server:arm64+0x100010588)
    redis#1 aeProcessEvents ae.c:399 (redis-server:arm64+0x10000fb84)
    redis#2 processEventsWhileBlocked networking.c:4138 (redis-server:arm64+0x10006d3c4)
    redis#3 RM_Yield module.c:2410 (redis-server:arm64+0x10018b66c)
    redis#4 bg_call_worker <null>:16042052 (blockedclient.so:arm64+0x169c)

  Previous write of size 4 at 0x000100b54000 by main thread:
    #0 aeApiPoll ae_kqueue.c:175 (redis-server:arm64+0x100010588)
    redis#1 aeProcessEvents ae.c:399 (redis-server:arm64+0x10000fb84)
    redis#2 aeMain ae.c:496 (redis-server:arm64+0x100010da8)
    redis#3 main server.c:7238 (redis-server:arm64+0x10003f51c)
```

## The final fix as the comments:
redis#12817 (comment)
Optimized solution based on the above comment:

First, we add `module_gil_acquring` to indicate whether the main thread is currently in the acquiring-GIL state.

When the module thread starts to yield, there are two possibilities (we assume the caller keeps the GIL):
1. The main thread is in the middle of beforeSleep() and afterSleep(), that is, `module_gil_acquring` is not 1 now.
   At this point, the module thread will wake up the main thread through the pipe and leave the yield, waiting for the next yield when the main thread may already be in the acquiring-GIL state.

2. The main thread is in the acquiring-GIL state.
   The module thread releases the GIL, yielding the CPU to give the main thread an opportunity to start event processing, and then acquires the GIL again until the main thread releases it.
   This is the direction mentioned in redis#12817 (comment).

---------

Co-authored-by: Oran Agra <oran@redislabs.com>
roggervalf pushed a commit to roggervalf/redis that referenced this pull request Feb 11, 2024
…edis#12817)

Fix redis#12785 and other race condition issues.
See the following isolated comments.

The following report was obtained using the thread sanitizer:
```sh
make SANITIZER=thread
./runtest-moduleapi --config io-threads 4 --config io-threads-do-reads yes --accurate
```

1. Fixed thread-safe issue in RM_UnblockClient()
   Related discussion: redis#12817 (comment)
   * When blocking a client in a module using `RM_BlockClientOnKeys()` or `RM_BlockClientOnKeysWithFlags()` with a timeout_callback, calling RM_UnblockClient() in module threads can lead to race conditions in `updateStatsOnUnblock()`.

     - Introduced: 
        Version: 6.2
        PR: redis#7491

     - Touch:
       `server.stat_numcommands`, `cmd->latency_histogram`, `server.slowlog`, and `server.latency_events`

     - Harm Level: High
       Potentially corrupts the memory data of `cmd->latency_histogram`, `server.slowlog`, and `server.latency_events`

     - Solution:
       Differentiate whether the call to moduleBlockedClientTimedOut() comes from the module or the main thread.
       Since we can't know if RM_UnblockClient() comes from module threads, we always assume it does and let `updateStatsOnUnblock()` asynchronously update the unblock status.
     
   * When an error reply is issued in timeout_callback(), the ctx is not thread-safe, which eventually leads to race conditions in `afterErrorReply`.

     - Introduced:
        Version: 6.2
        PR: redis#8217

     - Touch:
       `server.stat_total_error_replies`, `server.errors`

     - Harm Level: High
       Potentially corrupts the memory data of `server.errors`

     - Solution:
       Make the ctx in `timeout_callback()` with `REDISMODULE_CTX_THREAD_SAFE`, and asynchronously reply errors to the client.

2. Made RM_Reply*() family API thread-safe
   Related discussion: redis#12817 (comment)
   Call chain: `RM_Reply*()` -> `_addReplyToBufferOrList()` -> touch server.current_client

    - Introduced:
       Version: 7.2.0
       PR: redis#12326

   - Harm Level: None
     Since the module fake client won't have the `CLIENT_PUSHING` flag, even if we touch server.current_client, we can still exit early on `c->flags & CLIENT_PUSHING`.

   - Solution:
      Check `c->flags & CLIENT_PUSHING` earlier.

3. Made freeClient() thread-safe
    Fix redis#12785

    - Introduced:
       Version: 4.0
       Commit: redis@3fcf959

    - Harm Level: Moderate
       * Trigger assertion
         It happens when the module thread calls freeClient while the io-thread is in progress, which only triggers an assertion and doesn't cause any race conditions.

       * Touch `server.current_client`, `server.stat_clients_type_memory`, and `clientMemUsageBucket->clients`.
         It happens between the main thread and the module threads, and may cause data corruption.
         1. Erroneously resets `server.current_client` to NULL; theoretically this won't happen, because the module has already restored `server.current_client` to its old value before entering freeClient.
         2. Corrupts `clientMemUsageBucket->clients` in updateClientMemUsageAndBucket().
         3. Causes the server.stat_clients_type_memory statistics to be inaccurate.

    - Solution:
       * No longer counts memory usage on fake clients, to avoid updating `server.stat_clients_type_memory` in freeClient.
       * No longer resetting `server.current_client` in unlinkClient, because the fake client won't be evicted or disconnected in the middle of the process.
       * Assert `io_threads_op == IO_THREADS_OP_IDLE` only if c is not a fake client.

4. Fixed free client args without GIL
   Related discussion: redis#12817 (comment)
   When freeing retained strings in the module thread (refcount decrement), or using them in some way (refcount increment), we should do so while holding the GIL; otherwise, they might be freed simultaneously while the main thread is processing the unblocked client state.

    - Introduced:
       Version: 6.2.0
       PR: redis#8141

   - Harm Level: Low
     Trigger assertion, double free, or memory leak.

   - Solution:
     Documenting that module API users need to ensure any access to these retained strings is done with the GIL locked.

5. Fix adding fake client to server.clients_pending_write
   It will incorrectly log the memory usage for the fake client.
   Related discussion: redis#12817 (comment)

    - Introduced:
       Version: 4.0
       Commit: redis@9b01b64

    - Harm Level: None
      Only results in a NOP

    - Solution:
       * Don't add fake clients into server.clients_pending_write
       * Add a c->conn assertion to updateClientMemUsageAndBucket() and updateClientMemoryUsage() to avoid the same issue in the future.
         So now it is the responsibility of the caller of both of them to avoid passing in a fake client.

6. Fix calling RM_BlockedClientMeasureTimeStart() and RM_BlockedClientMeasureTimeEnd() without GIL
    - Introduced:
       Version: 6.2
       PR: redis#7491

   - Harm Level: Low
     Causes inaccuracies in the command latency histogram and slow logs, but does not corrupt memory.

   - Solution:
     Module API users who know that non-thread-safe APIs will be used in multi-threading need to take responsibility for protecting them with their own locks instead of the GIL, as using the GIL is too expensive.

### Other issue
1. RM_Yield is not thread-safe, fixed via redis#12905.

### Summary
1. Fix thread-safety issues for `RM_UnblockClient()`, `freeClient()` and `RM_Yield`, potentially preventing memory corruption, data disorder, or assertion failures.
2. Updated docs and module tests to clarify module API users' responsibility for locking non-thread-safe APIs in multi-threading, such as RM_BlockedClientMeasureTimeStart/End(), RM_FreeString(), RM_RetainString(), and RM_HoldString().

### About backporting to 7.2
1. The implementation of (1) is not entirely satisfying; more eyes would be welcome.
2. (2) and (3) can be safely backported.
3. (4) and (6) just modify the module tests and update the documentation; no need for a backport.
4. (5) is harmless; no need for a backport.

---------

Co-authored-by: Oran Agra <oran@redislabs.com>