Sort out the mess around writable replicas and lookupKeyRead/Write #9572

zuiderkwast · 2021-09-30T16:14:33Z

Writable replicas no longer use the values of expired keys. Expired keys are
deleted when lookupKeyWrite() is used, even on a writable replica. Previously,
writable replicas could use the value of an expired key in write commands such
as INCR, SUNIONSTORE, etc.

This commit also sorts out the mess around the functions lookupKeyRead() and
lookupKeyWrite() so they now indicate what we intend to do with the key and
are not affected by the command calling them.

Multi-key commands like SUNIONSTORE, ZUNIONSTORE, COPY and SORT with the
store option now use lookupKeyRead() for the keys they're reading from (which will
not allow reading from logically expired keys).
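To make the read/write distinction above concrete, here is a minimal toy model in C. It is not the code in src/db.c: the `toy_key` type and the `toy_*` functions are hypothetical names made up for this sketch, and the only behaviour modelled is what the description states (read lookups never expose a logically expired key, write lookups delete it even on a writable replica). Leaving the expired key in place on a replica read is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical one-key "database": a value with an absolute expiry time. */
typedef struct {
    bool present;        /* key is physically stored */
    long long expire_ms; /* absolute expiry time in ms, -1 means no TTL */
    int value;
} toy_key;

/* A key is logically expired once the clock has passed its expiry time. */
static bool toy_is_expired(const toy_key *k, long long now_ms) {
    return k->present && k->expire_ms >= 0 && now_ms > k->expire_ms;
}

/* Read intent: never expose the value of a logically expired key.
 * Assumption of this sketch: only a master deletes the key here; a
 * replica keeps it stored and waits for the master to delete it. */
static const int *toy_lookup_read(toy_key *k, long long now_ms, bool is_replica) {
    if (!k->present) return NULL;
    if (toy_is_expired(k, now_ms)) {
        if (!is_replica) k->present = false;
        return NULL;
    }
    return &k->value;
}

/* Write intent: an expired key is deleted before the caller can use it,
 * even when the caller runs on a writable replica. */
static int *toy_lookup_write(toy_key *k, long long now_ms) {
    if (!k->present) return NULL;
    if (toy_is_expired(k, now_ms)) {
        k->present = false;
        return NULL;
    }
    return &k->value;
}
```

In this model, a command like SUNIONSTORE would call `toy_lookup_read` for each source key and `toy_lookup_write` for the destination, so an expired source key contributes nothing and an expired destination is removed before being overwritten.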

This commit also fixes a bug where PFCOUNT could return a value of an
expired key.

Test module commands have their readonly and write flags updated to correctly
reflect their lookups for reading or writing. Modules are not required to
correctly reflect this in their command flags, but this change is made for
consistency since the tests serve as usage examples.

Fixes #6842. Fixes #7475.

oranagra

i'd like to add quite a few tests.
the obvious ones are the bugs mentioned in #6842, but probably quite a few other tests that verify the intended behavior so it won't be accidentally broken in the future.

Writable replicas no longer use the data of expired keys. Expired keys are deleted when lookupKeyWrite() is used, even on a writable replica. This commit also sorts out the mess around the functions lookupKeyRead() and lookupKeyWrite() so they now indicate what we intend to do with the key and are not affected by the command itself. Multi-key commands like SUNIONSTORE, ZINTERSTORE, COPY and SORT with the store option now use lookupKeyRead() for the keys they're reading from, but with flags preserving the legacy behaviour (not touching keyspace hits/misses counters, etc.).

oranagra · 2021-10-17T08:26:54Z

@zuiderkwast i think you should avoid doing git rebase and push -f, just stick to merge and incremental commits.
it's hard to keep track of what's new and what was already reviewed, and since we're gonna squash-merge this anyway, there's no real need for modifying existing commits.

oranagra · 2021-10-26T09:39:01Z

@redis/core-team not sure if we need a major decision for this, but since it's a delicate subject, i'd love for you to review.

oranagra · 2021-11-25T10:10:38Z

there is a way to check if the key exists without deleting it (DEBUG OBJECT).
but i don't see why there's a race...
the Tcl code has after 100 so we know at least 100ms passed since we got the reply from PEXPIRE.
this means that when we execute SWAPDB, the check in it that tests the expiration time will surely find that it already expired.
maybe there's an off-by-one issue, i.e. > 100 vs >= 100, so when the test is super fast it fails.
if that's the case, we can sleep for 101, but i don't see any other explanation.
please look into it and then we'll merge.
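The suspected off-by-one can be sketched in C. This assumes the expiry check is a strict `now > when` comparison on a millisecond clock obtained by truncating microseconds; both assumptions, and the function names, are illustrative and not taken from the PR's diff.

```c
#include <stdbool.h>

/* Millisecond clock obtained by truncating a microsecond timestamp. */
static long long to_ms(long long usec) {
    return usec / 1000;
}

/* Returns true if a key whose TTL was set at set_usec with ttl_ms is seen
 * as expired after sleeping exactly sleep_ms.
 * Assumption: the key counts as expired only when now is strictly greater
 * than the deadline. */
static bool expired_after_sleep(long long set_usec, long long ttl_ms,
                                long long sleep_ms) {
    long long when_ms = to_ms(set_usec) + ttl_ms;
    long long now_ms = to_ms(set_usec + sleep_ms * 1000);
    return now_ms > when_ms;
}
```

Under these assumptions, a sleep of exactly 100 ms lands on `now_ms == when_ms`, which the strict comparison does not treat as expired, while a sleep of 101 ms does. That matches the proposal to sleep for 101.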

enjoy-binbin · 2021-12-09T06:47:39Z

tests/integration/replication-4.tcl

+            }
+        }
+
+        test {Replication of an expired key does not delete the expired key} {


https://github.com/enjoy-binbin/redis/runs/4440834343?check_suite_focus=true#step:4:5814

the test failed once on my CI (FreeBSD), looks like it was a timing issue.
I took a look, here are my thoughts:

the key expired before INCR was executed, because the execution time of wait_for_ofs_sync exceeded one second. (or maybe related to the KILL).

I didn't think of a good way:

- maybe we can use $slave debug sleep ? (but that has similar problems)
- add a retry

Check that k is locigally expired but is present in the replica.

also there was a typo (locigally)

Good catch! Thanks for looking into it.

I don't understand how debug sleep can help...?

Add a retry, yes I guess it can work. We can double the expire time at every retry, so starting with 1 second, then 2, 4, 8 etc. We can add a check before the first wait_for_ofs_sync that k didn't expire and if it did, we retry. WDYT?

oh i mean that maybe we can use debug sleep to replace the kill

as for retry. Doubling is a good idea. Maybe we can start from milliseconds, to reduce the running time.
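The retry idea discussed here (start with a short expire time and double it on every failed attempt) can be sketched as follows. The real test is Tcl; this C version only illustrates the control flow. `attempt_fn`, `retry_with_doubling` and `sync_fits_in_ttl` are hypothetical names, and the callback stands in for "set the TTL, run wait_for_ofs_sync, check that k did not expire too early".

```c
#include <stdbool.h>

/* Hypothetical callback: runs one attempt with the given TTL and returns
 * true if the key was still alive when the sync step finished. */
typedef bool (*attempt_fn)(long long ttl_ms, void *ctx);

/* Tries up to max_attempts times, doubling the TTL after each failure.
 * Returns the TTL that worked, or -1 if every attempt failed. */
static long long retry_with_doubling(attempt_fn attempt, void *ctx,
                                     long long first_ttl_ms, int max_attempts) {
    long long ttl_ms = first_ttl_ms;
    for (int i = 0; i < max_attempts; i++) {
        if (attempt(ttl_ms, ctx)) return ttl_ms;
        ttl_ms *= 2;
    }
    return -1;
}

/* Stand-in attempt for illustration: succeeds when the TTL is longer than
 * a simulated sync duration passed through ctx. */
static bool sync_fits_in_ttl(long long ttl_ms, void *ctx) {
    long long sync_ms = *(long long *)ctx;
    return ttl_ms > sync_ms;
}
```

Starting from a small TTL keeps the common case fast, and the doubling only costs extra time on slow CI machines where the sync step outlasts the first few TTLs.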

edit: #11548

AviAvni · 2022-02-06T13:21:28Z

src/db.c

+             * shall not be used in readonly commands. Modules are accepted so
+             * that we don't break old modules. */
+            client *c = server.in_eval ? server.lua_client : server.current_client;
+            serverAssert(!c || !c->cmd || (c->cmd->flags & (CMD_WRITE|CMD_MODULE)));


this assert assumes that opening a key for write can't happen while a non-write redis command is running
but in RedisGraph, when a SAVE command ends we delete the temporary keys we created during the save process
we need to be able to work around this assertion
this crashes redis in our tests
stack trace example:

------ STACK TRACE ------
Backtrace:
0 redis-server 0x000000010cad474a lookupKey.cold.1 + 26
1 redis-server 0x000000010c9f1902 lookupKey + 402
2 redis-server 0x000000010ca6b704 RM_OpenKey + 68
3 redisgraph.so 0x000000010d3391f2 _DeleteGraphMetaKeys + 194
4 redisgraph.so 0x000000010d339392 _ClearKeySpaceMetaKeys + 88
5 redisgraph.so 0x000000010d339597 _PersistenceEventHandler + 202
6 redis-server 0x000000010ca74b93 moduleFireServerEvent + 195
7 redis-server 0x000000010ca03fd1 rdbSave + 641
8 redis-server 0x000000010ca08eaa saveCommand + 218
9 redis-server 0x000000010c9cddd6 call + 278
10 redis-server 0x000000010c9cef08 processCommand + 2904

@zuiderkwast we didn't want to break modules, and assumed that c->cmd will be a module command, but with notifications and events, that could be done from other random contexts.

IIRC this assertion was just there in order to help us find native redis commands that are not flagged correctly.
it could have been sufficient to test that we're not on a writable-replica, but we thought that coverage for such a test will be low, and preferred to check our assumption on masters too.

As far as i can tell, our options now are:

- remove that assert completely (as was argued before, right?)
- make it run only on writable replicas
- find another way to exclude modules

please share your thoughts, and remind me what i forgot.

Just to be sure I understand what's happening: Is the RM_OpenKey triggered by something other than a module command, e.g. a keyspace notification or event, which is fired after some real command (SAVE) has executed? or before?

If that's the case, can we set c->cmd to NULL before firing the event? If we do, then this code will not appear to be part of any command, which I think is better than appearing as being run as part of some read-only command.

Another question: Can this ever happen on a readonly replica? If yes, then perhaps we should set force_delete_expired to false here if we're on a read-only replica to keep it consistent with its primary, rather than only bypassing the assert.

If we do, then I guess we can drop the assert entirely.

From what i know of RedisGraph, when this code happens on a replica, it should always be in a fork child, but it could still be from within a command.
i.e. when a SYNC command is received and triggers an immediate fork, the fork child process will have c->cmd still set on the stack.

What i don't like about nullifying c->cmd in the various event dispatches in module.c is that it's very far from the assertion.
i.e. we'll have to backup, nullify, and restore c->cmd and add some comments explaining that it's done to avoid an assertion on the other side of town.

Agree. This assert looks at things far away too...

I think we can remove the assertion and never set force_delete_expired on a read-only replica:

int force_delete_expired = flags & LOOKUP_WRITE && !(server.masterhost && server.repl_slave_ro);
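The condition above can be isolated into a small standalone function to make its truth table explicit. This is only a sketch: `TOY_LOOKUP_WRITE` is a made-up flag value, `is_replica` stands in for `server.masterhost` being set, and `replica_read_only` stands in for `server.repl_slave_ro`.

```c
#include <stdbool.h>

/* Hypothetical flag value standing in for LOOKUP_WRITE. */
#define TOY_LOOKUP_WRITE (1 << 0)

/* Forced deletion of an expired key happens only for write lookups, and
 * never on a read-only replica. Masters and writable replicas both qualify. */
static bool force_delete_expired(int flags, bool is_replica,
                                 bool replica_read_only) {
    return (flags & TOY_LOOKUP_WRITE) && !(is_replica && replica_read_only);
}
```

Note that the read-only setting alone does not disable the forced delete: on a master `is_replica` is false, so a write lookup still deletes the expired key regardless of `replica_read_only`.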

@zuiderkwast please make a PR.

There's an assertion added recently to make sure that non-write commands don't use lookupKeyWrite. It was initially meant to be used only on read-only replicas, but we thought it wouldn't have enough coverage, so we used it on the masters too. We now realize that in some cases this can cause issues for modules, so we remove the assert. Other than that, we also make sure not to force expireIfNeeded on read-only replicas, even if they somehow run a write command. See #9572 (comment)

RM_Yield was missing a call to protectClient to prevent redis from processing future commands of the yielding client. Adding tests that fail without this fix.

This would be complicated to solve since nested calls to RM_Call used to replace the current_client variable with the module temp client. It looks like it's no longer necessary to do that, since it was added back in redis#9890 to solve two issues, both already gone:

1. call to CONFIG SET maxmemory could trigger a module hook calling RM_Call. although this specific issue is gone, arguably other hooks like keyspace notification, can do the same.
2. an assertion in lookupKey that checks the current command of the current client, introduced in redis#9572 and removed in redis#10248

In the replica, the key expired before the master's `INCR` arrived, so INCR creates a new key in the replica and the test failed.

```
*** [err]: Replication of an expired key does not delete the expired key in tests/integration/replication-4.tcl
Expected '0' to be equal to '1' (context: type eval line 13 cmd {assert_equal 0 [$slave exists k]} proc ::test)
```

This test is very likely to report a false failure if the `wait_for_ofs_sync` takes longer than the expiration time, so give it a few more chances.

The test was introduced in redis#9572.

zuiderkwast requested a review from oranagra September 30, 2021 16:14

yoav-steinberg reviewed Oct 7, 2021

oranagra added this to Backlog in 7.0 via automation Oct 10, 2021

oranagra moved this from Backlog to In progress in 7.0 Oct 10, 2021

oranagra reviewed Oct 10, 2021

yoav-steinberg reviewed Oct 12, 2021

oranagra reviewed Oct 12, 2021

zuiderkwast and others added 9 commits October 14, 2021 08:35

Apply suggestions from code review

c002931

Co-authored-by: yoav-steinberg <yoav@monfort.co.il>

Fixup: Review suggestions

49b2c48

Fixup: Add assert in expireIfNeeded

710e9d9

Fixup: spelling typo

ef55207

Fixup: Modules test commands marked as readonly

068e7a1

Fixup: Make COPY use plain lookupKeyRead()

562b96f

Fix the assert in expireIfNeeded for eval and debug loadaof + stuff

3b1e307

Set current_client to AOF client during AOF loading. Add an assert forbidding the WRITE flag in lookupKeyReadWithFlags. Extra: Don't touch keys stats and LRU when determining ASK redirect.

Add test cases

6f01fb0

zuiderkwast force-pushed the writable-replicas branch from 86dd019 to 6f01fb0 Compare October 14, 2021 11:54

Add WRITE flag to all test module write commands

32a7045

zuiderkwast force-pushed the writable-replicas branch from 0c09d59 to 32a7045 Compare October 14, 2021 14:08

oranagra reviewed Oct 17, 2021

zuiderkwast added 3 commits October 21, 2021 20:53

Fix review comments

cca77ac

Merge remote-tracking branch 'redis/unstable' into writable-replicas

a439aa9

Attemt to fix failing test

aecc32c

oranagra reviewed Oct 24, 2021

Review comments

34d1119

yossigo reviewed Oct 25, 2021

oranagra approved these changes Oct 26, 2021

Sleep one extra millisecond in test case

34676f9

oranagra merged commit acf3495 into redis:unstable Nov 28, 2021

oranagra moved this from Awaits merge to Done in 7.0 Nov 28, 2021

zuiderkwast deleted the writable-replicas branch November 28, 2021 10:40

enjoy-binbin reviewed Dec 9, 2021

enjoy-binbin mentioned this pull request Jan 26, 2022

solve race in expiration test #10192

Merged

AviAvni reviewed Feb 6, 2022

oranagra moved this from Done to In progress in 7.0 Feb 7, 2022

zuiderkwast mentioned this pull request Feb 7, 2022

Remove assert and refuse delete expired on ro replicas #10248

Merged

oranagra moved this from In progress to Done in 7.0 Feb 7, 2022

This was referenced Feb 13, 2022

Make EXISTS consistent with GET for expiry on slaves #4765

Closed

Redis slaves, while not allowed to expire keys without master input, should reply to clients consistently with the key expire information. #187

Closed

oranagra mentioned this pull request Apr 12, 2022

Fix RM_Yield bug processing future commands of the current client. #10573

Merged

soloestoy mentioned this pull request Nov 11, 2022

Add CLIENT NO-TOUCH for clients to run commands without affecting LRU/LFU of keys #11483

Merged

enjoy-binbin mentioned this pull request Nov 27, 2022

Fix replication on expired key test timing issue, give it more chances #11548

Merged
