Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make tracking invalidation messages always after command's reply #9422

Merged
merged 12 commits into from Oct 7, 2021

Conversation

huangzhw
Copy link
Collaborator

@huangzhw huangzhw commented Aug 29, 2021

Tracking invalidation messages were sometimes sent in inconsistent order, before the command's reply rather than after.
In addition to that, they were sometimes embedded inside other commands responses, like MULTI-EXEC and MGET.
Fix #8206 and #8935

@oranagra oranagra added this to Backlog in 7.0 via automation Aug 29, 2021
@oranagra oranagra moved this from Backlog to In Review in 7.0 Aug 29, 2021
Copy link
Member

@oranagra oranagra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i would like to ask @madolson to take a look too.
and i would like to ask to perform some trivial benchmark to check the impact of this.

src/server.c Show resolved Hide resolved
tests/unit/tracking.tcl Outdated Show resolved Hide resolved
tests/unit/tracking.tcl Outdated Show resolved Hide resolved
@oranagra
Copy link
Member

@huangzhw what's the content of that last force-push? is it just a rebase from unstable or anything else?

p.s. we usually merge PRs with squash-merge, so it's more convenient to review if there are no force-pushes, and i'd rather merge unstable into the PR instead of a rebase.

@huangzhw
Copy link
Collaborator Author

I just rebase unstable as there are conflicts. Only the last commit is new.

Copy link
Member

@oranagra oranagra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
@madolson @yossigo @soloestoy does one of you want to run a second pair of eyes on this?
in theory there's a potential for a temp memory spike, since we flush that list quite frequently, i don't think that's a realistic concern.
There's also an additional lookupClientByID, but that's kinda trivial too i think.

@oranagra oranagra added release-notes indication that this issue needs to be mentioned in the release notes state:to-be-merged The PR should be merged soon, even if not yet ready, this is used so that it won't be forgotten approval-needed Waiting for core team approval to be merged labels Sep 19, 2021
@oranagra
Copy link
Member

oranagra commented Sep 19, 2021

triggered daily CI: https://github.com/redis/redis/actions/runs/1250884079 (failures are unrelated to this PR)

Copy link
Contributor

@madolson madolson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose I'm also not clear about the direction where we are deferring the sending of the invalidation, and not deferring the invalidation itself until outside of the command execution, which seems a little more straight forward of an implementation.

src/server.c Outdated Show resolved Hide resolved
src/tracking.c Outdated Show resolved Hide resolved
src/server.c Outdated Show resolved Hide resolved
src/tracking.c Outdated Show resolved Hide resolved
src/tracking.c Outdated Show resolved Hide resolved
@oranagra
Copy link
Member

I suppose I'm also not clear about the direction where we are deferring the sending of the invalidation, and not deferring the invalidation itself until outside of the command execution, which seems a little more straight forward of an implementation.

@madolson assuming i understand what you mean, and i remember the details:
deferring the invalidation (the decision of whatever to send a message or not), rather than just the message itself would cause us to make the wrong decisions (skip sending messages that we should have sent)

src/server.c Outdated Show resolved Hide resolved
src/tracking.c Outdated Show resolved Hide resolved
src/tracking.c Show resolved Hide resolved
src/server.c Outdated Show resolved Hide resolved
@oranagra oranagra added this to To Do in 6.2 Backport via automation Oct 4, 2021
Copy link
Member

@oranagra oranagra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is ready to be merged.
any last concerns?

Copy link
Contributor

@madolson madolson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no concerns, took another pass through and it LGTM.

@oranagra oranagra merged commit fd135f3 into redis:unstable Oct 7, 2021
@huangzhw huangzhw deleted the tracking branch October 7, 2021 12:44
@oranagra oranagra moved this from In Review to Done in 7.0 Oct 11, 2021
@oranagra
Copy link
Member

oranagra commented Nov 2, 2021

@huangzhw one of the new tests in this PR seems to have a rare timing issue:
https://github.com/redis/redis/runs/4068817198?check_suite_focus=true

*** [err]: Tracking invalidation message of eviction keys should be before response in tests/unit/tracking.tcl
Expected '0' to be equal to 'invalidate volatile-key' (context: type eval line 21 cmd {assert_equal $res {invalidate volatile-key}} proc ::test)

can you please take a look

@oranagra
Copy link
Member

oranagra commented Nov 2, 2021

interestingly, it looks like the failure is semi consistent now, happening a lot in one of the external tests (either clustered or not clustered)
https://github.com/redis/redis/runs/4078424493?check_suite_focus=true

@huangzhw
Copy link
Collaborator Author

huangzhw commented Nov 2, 2021

I have a question. In external test between every start_server is the data set clean?

@oranagra
Copy link
Member

oranagra commented Nov 2, 2021

yes, look for flushall in server.tcl

@oranagra
Copy link
Member

oranagra commented Nov 2, 2021

looking at the test code i think i may realized the problem.
the line that does set used [s used_memory] at the beginning of the test may get the wrong idea if there's some lazy eviction going on in the background.

so maybe we need to copy this code from lazyfree.tcl:

        # make the previous test is really done before sampling used_memory
        wait_for_condition 50 100 {
            [s lazyfree_pending_objects] == 0
        } else {
            fail "lazyfree isn't done"
        }

or maybe even promote that to a function in util.tcl to be shared between them and maybe others in the future...

@huangzhw
Copy link
Collaborator Author

huangzhw commented Nov 2, 2021

I check the code before this test in tracking.tcl. I think no tests cause this. I runned this test with --loop and external, I don't get failures. So I suspect other tests cause this.

@oranagra
Copy link
Member

oranagra commented Nov 2, 2021

yes, the fact it only happens in external mode, is also a good indication it depends on other tests.
i posted a hypothesis above, and a PR to attempt to fix it: #9722

@oranagra
Copy link
Member

oranagra commented Nov 2, 2021

Looks like my hypothesis was wrong https://github.com/redis/redis/actions/runs/1412769681
Still failing...
I guess we must add some prints and reproduce it so we know what's going on..

@oranagra
Copy link
Member

oranagra commented Nov 2, 2021

found the problem: #9726

@oranagra oranagra moved this from To Do to In progress in 6.2 Backport Apr 12, 2022
oranagra pushed a commit to oranagra/redis that referenced this pull request Apr 27, 2022
…is#9422)

Tracking invalidation messages were sometimes sent in inconsistent order,
before the command's reply rather than after.
In addition to that, they were sometimes embedded inside other commands
responses, like MULTI-EXEC and MGET.

(cherry picked from commit fd135f3)
This was referenced Apr 27, 2022
oranagra pushed a commit that referenced this pull request Apr 27, 2022
Tracking invalidation messages were sometimes sent in inconsistent order,
before the command's reply rather than after.
In addition to that, they were sometimes embedded inside other commands
responses, like MULTI-EXEC and MGET.

(cherry picked from commit fd135f3)
@oranagra oranagra moved this from In progress to Done in 6.2 Backport Apr 27, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approval-needed Waiting for core team approval to be merged release-notes indication that this issue needs to be mentioned in the release notes state:to-be-merged The PR should be merged soon, even if not yet ready, this is used so that it won't be forgotten
Projects
Archived in project
4 participants