Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql,builtins: remove HasAdminRole checks #116844

Merged
merged 5 commits into from Mar 19, 2024

Conversation

rafiss
Copy link
Collaborator

@rafiss rafiss commented Dec 20, 2023

These crdb_internal builtins are used for debugging, so should not
require admin access. Instead, we use the VIEWCLUSTERMETADATA or
REPAIRCLUSTERMETADATA privilege.

informs: #114384


sql: make admin, root, and node implicit owners of all objects

This lets us remove specific checks for the admin role in many places.

sql: remove unneeded calls to hasAdminRole

All these checks already have a privilege check in the same place, which
already implicitly checks for the admin role.

Release note: None

Copy link

blathers-crl bot commented Dec 20, 2023

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@rafiss rafiss force-pushed the remove-hasAdminRole branch 6 times, most recently from 6c9aabc to 0d731af Compare January 8, 2024 21:29
@rafiss rafiss force-pushed the remove-hasAdminRole branch 3 times, most recently from 55e0c12 to cb86402 Compare January 25, 2024 21:54
@rafiss rafiss marked this pull request as ready for review January 25, 2024 21:54
@rafiss rafiss requested review from a team as code owners January 25, 2024 21:54
@rafiss rafiss requested review from dt and fqazi and removed request for a team January 25, 2024 21:54
return nil, err
}
if !isAdmin {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does “view cluster metadata” really correspond to “read all data regardless of permissions”? I don’t have the written description of it in front of me but just by the name I assume it is eg SHOW RANGES or something not reading any kv?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my thinking was that technically, you can see arbitrary KVs using SHOW RANGES, so this PR is not a privilege escalation

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can leave out builtins like scan that can expose data from arbitrary key ranges. Barring that I would be okay with this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't see arbitrary KVs: you can see some keys if they were chosen as split keys, but you cannot see values. We already point out that indexed values end up in keys when we talk about domiciling or other data sensitivity questions, but values are generally only visible to those granted access and admins.

Copy link
Contributor

@maryliag maryliag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r1, 21 of 21 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @fqazi and @rafiss)


pkg/sql/sem/builtins/builtins.go line 7204 at r2 (raw file):

		),
	),
	"crdb_internal.reset_index_usage_stats": makeBuiltin(

this function is not a read only data, meaning is deleted all stats from the index usage table. I'm not sure we want non-admins to be able to do it.
cc @kevin-v-ngo for some feedback


pkg/sql/sem/builtins/builtins.go line 7230 at r2 (raw file):

		},
	),
	"crdb_internal.reset_sql_stats": makeBuiltin(

same comment as above.
@kevin-v-ngo are we okay with non-admins that have the role REPAIRCLUSTERMETADATA be able to reset sql stats?


pkg/sql/sem/builtins/builtins.go line 7256 at r2 (raw file):

		},
	),
	"crdb_internal.reset_activity_tables": makeBuiltin(

same as bove


pkg/sql/sem/builtins/builtins.go line 7282 at r2 (raw file):

		},
	),
	"crdb_internal.reset_insights_tables": makeBuiltin(

same as above


pkg/server/application_api/stmtdiag_test.go line 126 at r2 (raw file):

		"SELECT crdb_internal.request_statement_bundle('SELECT _', 0::FLOAT, 0::INTERVAL, 0::INTERVAL)",
	)
	require.Contains(t, err.Error(), "requesting statement bundle requires VIEWACTIVITY privilege")

did you replace the admin by another option, such as VIEWCLUSTERMETADATA? Just confirming if VIEWACTIVITY is now the only requirement or if there is any other role that would also allow this operation

Copy link
Collaborator Author

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt, @fqazi, @kevin-v-ngo, and @maryliag)


pkg/sql/sem/builtins/builtins.go line 7204 at r2 (raw file):

Previously, maryliag (Marylia Gutierrez) wrote…

this function is not a read only data, meaning is deleted all stats from the index usage table. I'm not sure we want non-admins to be able to do it.
cc @kevin-v-ngo for some feedback

since it's not read-only, that's why i made it use "REPAIRCLUSTERMETADATA" and not "VIEWCLUSTERMETADATA"

the reasoning is that when we are debugging cloud clusters, customers don't want us to use admin, and instead we should only use the minimum set of privileges

@dt
Copy link
Member

dt commented Jan 25, 2024

I have some discomfort with this directionally, and indeed wonder we should be doing the opposite and ensuing all crdb_internal functions require admin?

crdb_internal is explicitly a prefix where we put internal things, and hold them to wholly different quality/safety bars during development to make them more powerful and thus useful in debugging, and cheaper/faster to implement, again with the intent of having more of these tools at our disposal, to be more useful when it comes to debugging. Engineers, myself included, will readily identify a task that would be easier with an internal builtin and add it, even if it is wildly unsafe or violates normal auth checks. Right now we do that "safely" by throwing an admin check on it, but I worry that we could be in for some violated expectations, or at the very least a more burdensome vetting process, if every debugging builtin needs to consider what is the widest audience to which it can be granted safely, are its docs/safety checks/etc all calibrated to that audience, etc. In many ways, just saying "internal == admin" would minimize the likelihood a quick debugging tool ends up causing a security disclosure / TA.

There have, in the past, been crdb_internal builtins that can corrupt a cluster, violate all the norms of our privilege model, or otherwise significantly violate general exceptions around sql-exposed functionality. Making these easier for non-admins to call seems like it introduces a risk of unintended consequences: authors of internal tools generally assume that these are only things used by admins who know exactly what they're doing, but operators may be granting viewclustermetadata to only moderately trained DBAs or SREs on the assumption that the DB would keep them from doing anything too bad.

I also question why we want to make crdb_internal callable by more users? Ideally if non-admins are asking to call these functions, should we be directing them to call non-crdb_internal functions instead of making the internal functions available to non-admins?

@dt
Copy link
Member

dt commented Jan 25, 2024

when we are debugging cloud clusters, customers don't want us to use admin, and instead we should only use the minimum set of privileges

Could we add SHOW syntax for the builtins that we want to use as non-admins instead of opening up the builtins?

@rafiss
Copy link
Collaborator Author

rafiss commented Jan 26, 2024

I have some discomfort with this directionally, and indeed wonder we should be doing the opposite and ensuing all crdb_internal functions require admin?

well, this was already was discussed and litigated before. the conclusion where we landed with Abhinav/Duoc/Dikshant and other stakeholders was that we should move away from admin as much as possible (CRDB-17640). this was needed urgently enough that the decision we landed on was to replace all the existing admin checks with checks for one of two new privileges that were explicitly added to replace admin (VIEWCLUSTERMETADATA or REPAIRCLUSTERMETADATA ... or one of the other existing privileges if there actually was already a good fit). since VIEW/REPAIRCLUSTERMETADATA were added purely to replace admin checks, i don't feel that it's accurate to say we are loosening up these endpoints too much.

@dt
Copy link
Member

dt commented Jan 26, 2024

My worry above is that "Abhinav/Duoc/Dikshant and other stakeholders" might not have included "engineers who are adding debugging tools"? As mentioned above, implementation of low-effort/quick debugging is often not treated (for good reasons) the same as user-facing feature work.

existing privileges if there actually was already a good fit

This is what concerns me most here: if we have individual engineers hastily throwing together a quick debugging tool, relying on them to reach for the right privs in the right places increases the chance something going wrong. And when things go wrong in this way it often requires a security disclosure/TA. For functions in the "internal debugging tool" category, that do not get the more rigorous thought and care that goes into user-facing features and their carefully determined access controls, the simplicity of having a single authorization decision -- "is allowed to use potentially-all-powerful internal debug tools?" -- seems like the safest way to not accidentally violate access expectations.

If we really don't want "admin" to be that check, then perhaps "repairclustermetadata" could be the check but it should be the check, that every internal builtin uses, rather than asking them to make nuanced decisions (which could be wrong)?

@dikshant
Copy link

If we really don't want "admin" to be that check, then perhaps "repairclustermetadata" could be the check but it should be the check, that every internal builtin uses, rather than asking them to make nuanced decisions (which could be wrong)?

That is what we are suggesting we should do. If the concern is engineers might not know to put it under repairclustermetadata, couldn't we have a test that fails if someone uses admin and have the error message point them towards using repairclustermetadata?

If we need to modify a bunch of other built ins to use repairclustermetadata we can track that as a separate issue.

@dt
Copy link
Member

dt commented Jan 29, 2024

That is what we are suggesting we should do

This change as it stands isn't quite doing that -- making repairclustermetadata the check used by debugging builtins -- though: some things are being changed to check viewclustermetadata while others are checking repairclustermetadata and others are checking other privileges. Asking for nuanced determinations like this on a per-function (or per-flag within a function potentially) basis in debugging tools is what worries me; often there are non-obvious implications of powerful debugging tools, particularly when more than one author is working on them or re-using existing code that might be able to do things that aren't readily apparent (we already have an example, in this diff, where a debugging builtin was moved from admin to VIEWCLUSTERMETADATA even though it grants access to far more than just meta data).

So if we're suggesting we'll replace admin with repair 1:1, that seems reasonably simple/easy for engineers to follow, but only if we actually are doing it 1:1.

@rafiss rafiss force-pushed the remove-hasAdminRole branch 2 times, most recently from 4959589 to 55ae099 Compare March 15, 2024 19:32
@rafiss rafiss requested a review from a team as a code owner March 15, 2024 19:32
@rafiss rafiss requested review from DrewKimball and removed request for a team March 15, 2024 19:32
These crdb_internal builtins are used for debugging, so should not
require admin access. Instead, we use the VIEWCLUSTERMETADATA or
REPAIRCLUSTERMETADATA privilege.

Release note: None
All these checks already have a privilege check in the same place, which
already implicitly checks for the admin role.

Release note: None
@rafiss
Copy link
Collaborator Author

rafiss commented Mar 18, 2024

@dt this has been updated so the builtins all use REPAIRCLUSTERMETADATA. i also added a commit that aliases it to REPAIRCLUSTER. (the internal key cannot be changed)

@dt dt removed their request for review March 18, 2024 21:57
@rafiss
Copy link
Collaborator Author

rafiss commented Mar 19, 2024

tftr!

bors r+

@craig
Copy link
Contributor

craig bot commented Mar 19, 2024

Build failed:

@rafiss
Copy link
Collaborator Author

rafiss commented Mar 19, 2024

bors r-

Release note (sql change): The REPAIRCLUSTERMETADATA privilege now has
an alternative name: REPAIRCLUSTER. Both names can be used
interchangably.
@rafiss rafiss requested a review from a team as a code owner March 19, 2024 03:02
@rafiss
Copy link
Collaborator Author

rafiss commented Mar 19, 2024

I did a find/replace of REPAIRCLUSTERMETADATA/REPAIRCLUSTER in comments, variables names, and test expectations.

bors r+

@craig
Copy link
Contributor

craig bot commented Mar 19, 2024

@craig craig bot merged commit a16fcaf into cockroachdb:master Mar 19, 2024
21 of 22 checks passed
@rafiss rafiss deleted the remove-hasAdminRole branch March 19, 2024 14:21
stevendanna added a commit to stevendanna/cockroach that referenced this pull request Mar 28, 2024
In cockroachdb#116844 we added a CREATE USER statement to this tests setup. That
statement produces 2 schema change jobs that could race with the
coordination required for the assertions in this test to be valid.

Here we solve this by only running the AfterJobStateMachine callback
for the job we care about.

Fixes cockroachdb#121149

Release note: None
craig bot pushed a commit that referenced this pull request Mar 28, 2024
121250: jobs: deflake TestRegistryLifecycle r=dt a=stevendanna

In #116844 we added a CREATE USER statement to this tests setup. That statement produces 2 schema change jobs that could race with the coordination required for the assertions in this test to be valid.

Here we solve this by only running the AfterJobStateMachine callback for the job we care about.

Fixes #121149

Release note: None

Co-authored-by: Steven Danna <danna@cockroachlabs.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants