feat: add validation of acl users #1743
Conversation
src/server/main_service.cc
Outdated
if (!acl::IsUserAllowedToInvokeCommand(*cntx, *cid)) {
  (*cntx)->SendError(absl::StrCat("User: ", cntx->authed_username,
                                  " does not have the credentials to execute this command"));
  return true;
}
Here's the catch, and it's actually a very dangerous bug. I added the Verify functions specifically for this, but forgot to add a comment here: any validation logic should only be placed in the Verify functions.
The reason is that verification is always performed on a const connection context pointer, so it can be done in parallel. Executing in parallel with a single context is not possible, so I use "stub" contexts just to transfer replies.
This means that if you use this context for verification, its acl_username won't be set under squashing.
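A minimal toy model of the bug (plain Python, not the actual Dragonfly code; the names `ConnectionContext`, `make_stub`, and `is_user_allowed` are stand-ins for the C++ types and for `acl::IsUserAllowedToInvokeCommand`):

```python
from dataclasses import dataclass

@dataclass
class ConnectionContext:
    authed_username: str = ""

def make_stub(parent: ConnectionContext) -> ConnectionContext:
    # Stub contexts in this sketch only carry reply state; auth fields
    # are left at their defaults, mirroring the squashing bug above.
    return ConnectionContext()

def is_user_allowed(cntx: ConnectionContext) -> bool:
    # Hypothetical stand-in for acl::IsUserAllowedToInvokeCommand.
    return cntx.authed_username == "vlad"

parent = ConnectionContext(authed_username="vlad")
stub = make_stub(parent)

print(is_user_allowed(parent))  # True: the original context has the username
print(is_user_allowed(stub))    # False: the stub lost acl_username -> spurious denial
```

The point is that validation against the stub denies a user who is in fact authorized, because only the original (parent) context carries `acl_username`.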
So, first:
Please add a test case with squashing. You can do the following:
1. Enable multi_exec_squash.
2. Execute a 10-command MULTI/EXEC transaction that spans multiple shards (important: use simple single-shard commands).
(I don't know why you actually use pytests for everything. They're easier to use, sure 🙂 But they don't run on every PR.)
Second, there are a few solutions:
1. Use VerifyCommandState to check allowance. It has docs in the header above it. It's fine for all cases except MULTI/EXEC, because with it commands are verified only when they are added to the multi/exec buffer. So if the ACL rights change by the moment EXEC is called, the verification is stale. Does it work differently in Redis?
2. See DispatchMonitor below. It checks conn_context.squashing_info.parent, or something like this, I don't remember 🙂 This way you can get a const pointer to the original context.
3. Copy acl_username into the stub context, but I'm not a fan of this... Better to let validation operate on the original context.
Sorry for the inconvenience with my squashing stuff 😅 😄
PS: If you choose (2), maybe extract the const ConnectionContext* and pass it to VerifyExecution 🙂 so all the validation is nicely grouped together.
Sorry for my late reply; I wanted to go over this in detail and understand what the exact issue is and what I had missed. Even though you explained it, I originally thought I had it covered when I first pushed this PR and added the error checking within InvokeCommand, but what I didn't take into account is, as you mentioned, the stub context. The good thing is that I learned a few things and now have a better understanding of our squashing/multi transaction mechanism.
So, a few things:
- We can't really rely only on VerifyCommandState or only on VerifyCommandExecution; we need both. The reason is that for MULTI transactions we need to check twice, as per your first point: a) when we issue the command, it could be the case that the user does not have the credentials, and therefore we can issue an error immediately; b) a MULTI command was issued and some admin changed the user's acl categories between the call to MULTI and the call to EXEC. For this particular case we need VerifyCommandExecution, which covers that corner case. The best part is that we now also have two different errors: for the first case, the error is that the user does not have the credentials, and for the latter, that the user's acl categories have changed in between. IMO this is far clearer for the user + this is what Redis does.
- There is an issue where we send multiple error responses; I will fix this in another PR (please see my comment on the pytest -- it will become clear what the problem is).
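The two-phase check described above can be sketched as a runnable toy model (plain Python; the function names mirror, but are not, the C++ `VerifyCommandState` / `VerifyCommandExecution`, and the ACL table is invented for illustration):

```python
# Toy ACL table: username -> set of commands the user may run.
acl = {"vlad": {"GET", "SET"}}

def verify_command_state(user: str, cmd: str):
    # Check (a): at queue time, when the command enters the MULTI buffer.
    if cmd not in acl.get(user, set()):
        return f"User: {user} does not have the credentials to execute this command"
    return None

def verify_command_execution(user: str, cmd: str):
    # Check (b): at EXEC time, after the ACLs may have changed.
    if cmd not in acl.get(user, set()):
        return "ACL rules changed between the MULTI and EXEC"
    return None

queued = []
assert verify_command_state("vlad", "SET") is None  # allowed when queued
queued.append(("vlad", "SET"))

acl["vlad"].discard("SET")  # an admin revokes the right mid-transaction

errors = [verify_command_execution(u, c) for u, c in queued]
print(errors)  # the EXEC-time check catches the revocation
```

This is why both checks and both error messages are needed: the queue-time error covers case (a), the EXEC-time error covers case (b).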
Copy acl_username into the stub context, but I'm not a fan of this... Better let validation operate on the original context
Neither am I; we should perform checks on the original context, and that's what my changes do.
(IDK why you actually use pytests for everything. They're easier to use, sure 🙂 But they don't run on every PR)
Because the most important thing for now is to have end-to-end tests; I don't want to change unit tests until I have something stable. I am planning to add them in the future, though.
P.S. I am not a big fan of the regression tests not being run per PR, but I understand the rationale for why we wouldn't want that. I think this will become problematic, especially when the team grows, since we will start having cascading failures (multiple people who pushed within a three-hour window can all break the regression tests), but for now this is a non-issue, so it's not worth discussing.
result = await async_client.execute_command("AUTH default nopass")
assert result == "OK"

# Vlad goes rogue starts giving admin stats to random users
🤣
namespace dfly::acl {

[[nodiscard]] bool IsUserAllowedToInvokeCommand(const ConnectionContext& cntx,
assert res == b"OK"

res = await client.execute_command("EXEC")
# TODO(we need to fix this, basically SQUASHED/MULTI transaction commands
@dranikpg This little code needs some polishing. I will address this in a separate PR. Basically, we need to squash the error messages into one....
optional<ErrorReply> Service::VerifyCommandExecution(const CommandId* cid,
                                                     const ConnectionContext* cntx) {
  // TODO: Move OOM check here
  return VerifyConnectionAclStatus(cid, cntx, "ACL rules changed between the MULTI and EXEC");
}
I forgot: VerifyCommandExecution is actually called for all commands, not just EXEC; I'm not sure why you use this error message here.
Once I implement acl deluser, I will brutally test this and polish.