sql: create a new "admin audit log" #59356

knz · 2021-01-23T18:55:50Z

This feature was requested by customers: "we want a trace of SQL
queries run by admin users".

A naive implementation of this request would memorize that a user has
the admin role at the moment the session was opened, and
subsequently log every statement performed within that session. This
approach would be relatively simple.

However, it has two shortcomings:

if the user is granted the admin role after they are logged in,
this would not work and fail to report some admin queries.
if the admin role is revoked after they are logged in, it would
over-report non-admin queries as if they were admin queries.

To overcome these shortcomings, this PR instead tracks when the admin
privilege is effectively used, i.e. the fact the user was an admin
is used to bypass some other, more natural check.

It then logs the query to the "SENSITIVE_ACCESS" channel after it has
executed.

This ensures events are produced for most "interesting" cases:

configuration changes, that have special case for admin
statements that check role options and auto-grant all options to admin
statements that bypass other regular privilege checks because the user is
in the admin role.
statements that are able to perform an operation because privilege
was explicitly granted to the admin role and the user belong to it.

The resulting feature is very close to what the customer requests,
but just so slightly different: in case where an admin user performs
a SQL query that it legitimately can perfom based on its own
privileges, and not that of the admin role, that query would not be
reported. Here an example where the feature would not report an event:

$ cockroach sql --user myuser  # NB: 'myuser' has admin role

CREATE DATABASE mydb; -- event reported: admin bypass for option CREATEDB

DROP DATABASE mydb; -- no event: myuser is the actual owner of mydb, no admin needed

Release note (security update): It is now possible to track all SQL
statements executed by users that have the admin role, when
the admin privileges are effectively used -- i.e. when the user
would not have otherwise access to the statement if they were not
an admin member. This can be achieved via the cluter setting
sql.log.admin_audit.enabled; this emits structured events
of type admin_query to the logging channel SENSITIVE_ACCESS.

This feature was requested by customers: "we want a trace of SQL queries run by admin users". A naive implementation of this request would memorize that a user has the `admin` role at the moment the session was opened, and subsequently log every statement performed within that session. This approach would be relatively simple. However, it has two shortcomings: - if the user is granted the `admin` role after they are logged in, this would not work and fail to report some admin queries. - if the `admin` role is revoked after they are logged in, it would over-report non-admin queries as if they were admin queries. To overcome these shortcomings, this PR instead tracks when the admin privilege is *effectively used*, i.e. the fact the user was an admin is used to bypass some other, more natural check. It then logs the query to the "SENSITIVE_ACCESS" channel after it has executed. This ensures events are produced for most "interesting" cases: - configuration changes, that have special case for `admin` - statements that check role options and auto-grant all options to `admin` - statements that bypass other regular privilege checks because the user is in the `admin` role. - statements that are able to perform an operation because privilege was explicitly granted to the `admin` role and the user belong to it. The resulting feature is very close to what the customer requests, but just so slightly different: in case where an admin user performs a SQL query that it legitimately can perfom based on its own privileges, and not that of the `admin` role, that query would not be reported. Here an example where the feature would *not* report an event: ``` $ cockroach sql --user myuser # NB: 'myuser' has admin role CREATE DATABASE mydb; -- event reported: admin bypass for option CREATEDB DROP DATABASE mydb; -- no event: myuser is the actual owner of mydb, no admin needed ``` Release note (security update): It is now possible to track all SQL statements executed by users that have the `admin` role, when the `admin` privileges are *effectively* used -- i.e. when the user would not have otherwise access to the statement if they were not an `admin` member. This can be achieved via the cluter setting `sql.log.admin_audit.enabled`; this emits structured events of type `admin_query` to the logging channel SENSITIVE_ACCESS.

cockroach-teamcity · 2021-01-23T18:55:56Z

This change is

knz · 2021-01-23T19:02:51Z

Hey @thtruo @piyush-singh, and @solongordon @ajwerner , we have a little situation on our hands.

To start, I invite you to read the PR description at the top. The context is explained in the linked issue.

The reason why we need to discuss is that we have a design trade-off:

the "simple" approach, described above and (currently) not used, is incomplete, and risks under-reporting;
the "precise" approach taken in this PR, which reports exactly, but may surprise the customer who requested the feature;
the "coarse" approach where we would re-check whether the user currently belongs to the admin role, anew upon every statement executed. I did not consider this approach because I believe it would incur an unacceptable performance overhead.

There are two questions that need answering:

from a PM perspective: what does the customer truly expect? Maybe the phrasing of the original issue was vague. Would they be satisfied with the approach proposed here, or do they want something else?
from the engineering perspective: is my assessment of design point (3) above accurate? That the overhead is unacceptable?
- If so, are we comfortable maintaining the infrastructure defined in this PR, which requires every privilege check inside the SQL layer to go through the API in authorization.go? If not, is there a better approach?

knz · 2021-01-25T18:26:47Z

cc @aaron-crl as you might have opinions too

ajwerner

the "coarse" approach where we would re-check whether the user currently belongs to the admin role, anew upon every statement executed. I did not consider this approach because I believe it would incur an unacceptable performance overhead.

Isn't this cheap-ish? We cache the membership in memory for each user as a mapping. In practice this should be a mutex lock and a hash map lookup which I suspect we are already doing for each table involved in each query. In short, I'm surprised to hear this is seen as expensive.

Reviewable status: complete! 0 of 0 LGTMs obtained

knz · 2021-01-27T15:20:35Z

@ajwerner Isn't this cheap-ish?

I don't know man, it doesn't seem like it. The main problem is that we can't use the "current txn" to determine the membership -- the txn may be in error / aborted at the time the logging occurs.

Even if we changed the code to pull the admin bool at the beginning of each statement, not at the end, we'd still have the same problem: if a statement is issued while the txn is aborted (e.g. ROLLBACK), and our requirement is "log every admin statement", we need a way to determine for that ROLLBACK whether the user is an admin or not.

That would suggest creating a sub-txn for every statement to pass to the HasAdminRole() function. Does that sound reasonable?

ajwerner

Even if we changed the code to pull the admin bool at the beginning of each statement, not at the end, we'd still have the same problem: if a statement is issued while the txn is aborted (e.g. ROLLBACK), and our requirement is "log every admin statement", we need a way to determine for that ROLLBACK whether the user is an admin or not.

Given the leasing protocol, it seems totally valid to query the admin bool only in the first statement of any given transaction and then maintain it for the duration of that transaction even if it enters the rollback state. If you want to be pedantic, you could imagine resetting that value in the case you revoke the admin role from yourself, or more generally, perhaps on any change to the role membership.

All in all, that doesn't seem to bad. We legitimately call HasAdminRole for every table for every statement during planning if I read the code correctly.

Reviewable status: complete! 0 of 0 LGTMs obtained

knz · 2021-01-27T16:02:27Z

it seems totally valid to query the admin bool only in the first statement of any given transaction

hm ok I like that, I'll try that next.

aaron-crl · 2021-02-08T17:51:54Z

For what it's worth, from a security log consumption perspective I would expect a "log of all admin" activities to contain all the actions my admins conducted during their a privileged session (including those that did not require administrative permissions).

This keeps with the spirit of auditing and avoids the trivial case where a query may be constructed in a way that masks its parameters though non-admin actions but is then executed with a single opaque admin call.

This is why in a unix environment we tend to collect the command history for users in sudo in addition to explicit sudo calls. Can they still potentially circumvent this auditing: yes, but it starts to look very irregular quickly.

ajwerner · 2021-02-08T19:00:40Z

For what it's worth, from a security log consumption perspective I would expect a "log of all admin" activities to contain all the actions my admins conducted during their a privileged session (including those that did not require administrative permissions).

This keeps with the spirit of auditing and avoids the trivial case where a query may be constructed in a way that masks its parameters though non-admin actions but is then executed with a single opaque admin call.

This is why in a unix environment we tend to collect the command history for users in sudo in addition to explicit sudo calls. Can they still potentially circumvent this auditing: yes, but it starts to look very irregular quickly.

Nice. Seems like we've converged on the idea that we want to audit all operations from all transactions where the user has access to the admin role.

knz · 2021-02-09T09:07:56Z

Ok thanks for explaining.
The new issue is that I am finding myself properly unable to work on this throughout this milestone due to other/higher priorities. We need to find someone to do the work if it's made to complete for v21.1.

@thtruo can you manage this please? This may involve you talking to someone on the new SQL Queries team.

knz · 2021-02-09T12:11:29Z

@solongordon @RichardJCai as you can see in the discussion above, my PR here was not taking the right approach but @ajwerner made a concrete counter-proposal.

I am going to close this PR now, but would be grateful if you could follow up. Cheers

RichardJCai · 2021-02-10T16:53:29Z

All in all, that doesn't seem to bad. We legitimately call HasAdminRole for every table for every statement during planning if I read the code correctly.

Reviewable status: complete! 0 of 0 LGTMs obtained

Hey, where do we call HasAdminRole for every table for every statement? I think I'm missing this.

ajwerner · 2021-02-10T18:52:28Z

All in all, that doesn't seem to bad. We legitimately call HasAdminRole for every table for every statement during planning if I read the code correctly.

Reviewable status: complete! 0 of 0 LGTMs obtained

Hey, where do we call HasAdminRole for every table for every statement? I think I'm missing this.

Well, every time we check privileges on a descriptor we do the role expansion:

cockroach/pkg/sql/authorization.go

Line 122 in ee72e00

    
           hasPriv, err := p.checkRolePredicate(ctx, user, func(role security.SQLUsername) bool {

knz added this to In progress in DB Server & Security via automation Jan 23, 2021

knz marked this pull request as draft January 25, 2021 08:35

knz mentioned this pull request Jan 25, 2021

logging: standardize the event log and bring structured details to external logs #57629

Closed

26 tasks

ajwerner reviewed Jan 26, 2021

View reviewed changes

ajwerner reviewed Jan 27, 2021

View reviewed changes

knz closed this Feb 9, 2021

DB Server & Security automation moved this from In progress to Done 21.1 Feb 9, 2021

knz mentioned this pull request Feb 9, 2021

sql: extend SQL audit logging to "operations run by admin users" #58334

Closed

RichardJCai mentioned this pull request Feb 17, 2021

sql: Extend SQL audit logging to log operations run by admin users #60708

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sql: create a new "admin audit log" #59356

sql: create a new "admin audit log" #59356

knz commented Jan 23, 2021

cockroach-teamcity commented Jan 23, 2021

knz commented Jan 23, 2021 •

edited

knz commented Jan 25, 2021

ajwerner left a comment

knz commented Jan 27, 2021

ajwerner left a comment

knz commented Jan 27, 2021

aaron-crl commented Feb 8, 2021

ajwerner commented Feb 8, 2021

knz commented Feb 9, 2021

knz commented Feb 9, 2021

RichardJCai commented Feb 10, 2021

ajwerner commented Feb 10, 2021

sql: create a new "admin audit log" #59356

sql: create a new "admin audit log" #59356

Conversation

knz commented Jan 23, 2021

cockroach-teamcity commented Jan 23, 2021

knz commented Jan 23, 2021 • edited

knz commented Jan 25, 2021

ajwerner left a comment

Choose a reason for hiding this comment

knz commented Jan 27, 2021

ajwerner left a comment

Choose a reason for hiding this comment

knz commented Jan 27, 2021

aaron-crl commented Feb 8, 2021

ajwerner commented Feb 8, 2021

knz commented Feb 9, 2021

knz commented Feb 9, 2021

RichardJCai commented Feb 10, 2021

ajwerner commented Feb 10, 2021

knz commented Jan 23, 2021 •

edited