Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add validation of acl users #1743

Merged
merged 32 commits into from
Aug 29, 2023
Merged

Conversation

kostasrim
Copy link
Contributor

  • add validation for categories
  • add tests

Once I implement acl deluser I will brutally test this and polish.

Comment on lines 990 to 994
if (!acl::IsUserAllowedToInvokeCommand(*cntx, *cid)) {
(*cntx)->SendError(absl::StrCat("User: ", cntx->authed_username,
" does not have the credentials to execute this command"));
return true;
}
Copy link
Contributor

@dranikpg dranikpg Aug 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's the catch and it's actually a very dangerous bug. I added specifically the Verify functions but forgot to add a comment here: Any validation logic should only be placed in verify functions.

The reason for is that verification is always performed on a const connection context pointer. Its possible to do this in parallel. Executing in parallel with one context is not possible, so I use just "stub" contexts to transfer replies.

This means that if you use this context for verification, it's acl_username won't be set for squshing.

So first:

Please add a test case with squashing. You can do the following: 1. enable multi_exec_squash 2. Execute a 10 command multi/exec transaction that spans multiple shards (important - use simple single shard commands).

(IDK why you actually use pytests for everything. They're easier to use, sure 🙂 But they don't run on every pr)

Second:

There are two solutions:

  1. Use VerifyCommandState to check allowance. It has docs in its header above it. It's fine for all cases except MULTI/EXEC, because with it, commands are verified only when added to the multi/exec buffer. So if at the moment EXEC is called the acl rights change, the verification is invalid. Does it work differently in Redis?

  2. See DispatchMonitor below. It checks conn_context.squashing_info.parent? Smth like this, I don't remember 🙂 This way you can get a const pointer to the original context.

  3. Copy acl_username into the stub context, but I'm not a fan of this... Better let validation operate on the original context

Sorry for the inconvenience with my squashing stuff 😅 😄

PS: If you choose (2) maybe please extract the const ConnectionContext* and pass it to VerifyExecution 🙂 So all validation is nicely grouped together

Copy link
Contributor Author

@kostasrim kostasrim Aug 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for my late reply, I wanted to go over in detail and understand what the exact issue is and what have I missed. Even though you explained, I originally thought that I had it covered when I first pushed this PR and added the error checking within the InvokeCommand but what I didn't take into account is as you mentioned the stub context. The good thing is that I learned a few things and I now have a better understanding of our squashing/multi transaction mechanism.

So a few things:

  1. We can't really rely only VerifyCommandState or on VerifyCommandExecution. We need both and the reason is that for multi transactions we need to check twice as per your first point. a) when we issue the command, it could be the case, that the user does not have the credentials and therefore we can issue an error immediately. b) a multi command was issued and some admin changed the user's acl categories between the call of multi and the call to exec. For this particular case we need to use VerifyCommandExecution which covers that corner case. Best part is that we now also have two different errors. For the first case, the error is that the user does not have the credentials and for the later that the user's acl categories have changed inbeteween. IMO this is far clearer for the user + this is what redis does.

  2. There is an issue that we send multiple error responses, I will fix this but on another PR (plz see my comment on the pytest -- it will become clear what the problem is).

Copy acl_username into the stub context, but I'm not a fan of this... Better let validation operate on the original context

Neither am I, we should perform checks on the OG context and that's what my changes do.

(IDK why you actually use pytests for everything. They're easier to use, sure 🙂 But they don't run on every pr)

Because the most important for now is to have to end to end, I don't want to change unit tests until I have something stable. I am planning to add them in the future though.

P.s. I am not a big fun of having the regression tests not being run per PR but I understand the rationale of why we wouldn't want that. I think this will be problematic especially when the team grows as we will start having cascading failures (multiple people who pushed in a 3 hour window can all break the regression tests) but for now this is a non-issue so it's not worth discussing

result = await async_client.execute_command("AUTH default nopass")
result == "ok"

# Vlad goes rogue starts giving admin stats to random users
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤣


namespace dfly::acl {

[[nodiscard]] bool IsUserAllowedToInvokeCommand(const ConnectionContext& cntx,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😆

dranikpg and others added 11 commits August 28, 2023 21:03
* fix: fix index loading

Signed-off-by: Vladislav <vladislav.oleshko@gmail.com>
Co-authored-by: Kostas Kyrimis <kostaskyrim@gmail.com>
…1744)

If an empty buffer is passed to the socket Recv function - it returns error 103.
Even if we returned success, this would lead to the endless loop since the parser
requires more data to parse the load.
Fixes #1680

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* feat(server): support multi eval in lock ahead mode

1. remove validation to allow multi eval only in global script mode
2. send error if there is a mode conflict when running eval inside multi
3. reset uniqe_keys_ when transaction finishes
Also, add sccache debug log in hope to understand
why we get 0 hits sometimes.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* fix: fix defrag stats

Signed-off-by: Vladislav <vladislav.oleshko@gmail.com>
1. If the first request sent to the connection is large (2kb or more)
   Dragonfly was closing the connection.
2. Changed server side error reporting according to memcache protocol:
   https://github.com/memcached/memcached/blob/master/doc/protocol.txt#L172
3. Fixed the wrong casting in DispatchCommand.
4. Remove practically unused code that translated opstatus to strings.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* feat(server): Support limiting the number of open connections.

* * Update helio after the small fix was merged to master
* Don't limit admin connections (and add a test case)

* Resolve CR comments
* fix: Run defrag on dbs > 0 as well

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
While loading rdb snapshot, if oom is reached a bad alloc exception is thrown. Now we
catch it and write warning to log and fali loader.

Signed-off-by: adi_holden <adi@dragonflydb.io>
* feat: implement CONFIG GET command

The command returns all the matched arguments and their current values.
In addition, this PR adds mutability semantics to each config - whether it can be
changed at runtime.

Fixes #1700

Signed-off-by: Roman Gershman <roman@dragonflydb.io>

---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
ashotland and others added 4 commits August 28, 2023 21:03
* fix: add master metric

the role metric is hard to work with as it generates two time serieses,
for each of the role label values.

Will remove the role metric once we update the production versions.

---------

Signed-off-by: ashotland <ari@dragonflydb.io>
@kostasrim kostasrim self-assigned this Aug 29, 2023
assert res == b"OK"

res = await client.execute_command("EXEC")
# TODO(we need to fix this, basiscally SQUASHED/MULTI transaction commands
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dranikpg This little code needs some polishing. I will address this in a separate PR. Basically, we need to squash the error messages into one....

@kostasrim kostasrim merged commit 7c43cbf into main Aug 29, 2023
10 checks passed
@kostasrim kostasrim deleted the acl_part_6_add_validation_of_users branch August 29, 2023 15:52
Comment on lines +803 to +807
optional<ErrorReply> Service::VerifyCommandExecution(const CommandId* cid,
const ConnectionContext* cntx) {
// TODO: Move OOM check here
return VerifyConnectionAclStatus(cid, cntx, "ACL rules changed between the MULTI and EXEC");
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot: Actually verifycommandexecution is called for all commands, not sure why you use this error here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants