Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql,*: align settings defaults for UA #119221

Merged
merged 6 commits into from Feb 15, 2024
Merged

sql,*: align settings defaults for UA #119221

merged 6 commits into from Feb 15, 2024

Conversation

dt
Copy link
Member

@dt dt commented Feb 14, 2024

Several features are controlled for such that they can behave differently secondary tenants, with settings or knobs that can choose between restricting or altering the behavior to be more aligned with a multi-tenant hosted service like CC Serverless or a dedicated cluster, either in the system tenant today or in UA app tenants.

Previously, when multi-tenant hosted deployments were the only deployments in which secondary tenants were used, it made sense for such controls to assume it as their default, e.g. default to restricting a behavior that is running int a secondary tenant to run as if it is in a hosted MT service. However, as we move to unified architecture and running all workloads in secondary tenants, even in dedicated and self-hosted clusters, these defaults no longer make as much sense: the out-of-the-box default experience for an application using a UA cluster should be as close to one using a pre-UA cluster, and if the behavior needs to behave differently in a MT hosted/serverless environment, that environment can be responsible for reconfiguring or overriding it accordingly.

To that end, the changes in this CL flip the defaults on several controls of behaviors in secondary tenants, to align the out-of-box experience of a UA cluster with that of a non-UA cluster. When our hosted MT service deploys the version in which these default changes take effect, it will need to add explicit configuration overrides accordingly.

See individual commits for details and release notes.

Epic: none.

@dt dt requested a review from stevendanna February 14, 2024 19:19
@dt dt requested review from a team as code owners February 14, 2024 19:19
@dt dt requested review from herkolategan, DarrylWong and yuzefovich and removed request for a team February 14, 2024 19:19
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@dt dt requested a review from jaylim-crl February 14, 2024 19:31
This makes the out-of-the-box UA behavior match the pre-UA behavior.
For serverless / hosted MT deployments, we can configure the non-default
behavior accordingly.

Release note: none.
Epic: none.
…efault

Today non-UA clusters see non-redacted traces. With this change, UA
clusters will as well out of the box, making the out-of-the-box UA
behavior match the pre-UA behavior. For hosted MT / serverless
deployment, we can override this ourselves to redact traces shown
to MT tenants.

Release note: none.
Epic: none.
Prior to UA, zone config operations worked out of the box. With this
change a UA cluster will also allow them out of the box without any
modification. In a serverless/MT deployment we can override this to
restrict secondary tenants.

Release note: none.
Epic: none.
These didn't match their non-UA behavior of being default on.

Release note (enterprise change): clusters created to run PCR no longer automatically disable the spanconfig.range_coalescing settings.
Users who started using PCR on 23.1 or 23.2 may wish to manually reset these settings.

Epic: none.
…ault

Pre-UA this worked by default, and now in a UA cluster it will work by
default out of the box as well. For non-UA MT clusters where MR is not
supported, it can be overridden accordingly.

Release note: none.
Epic: none.
Other drop operations are gated only on safe updates, so this
one can be too.

Release note: none.
Epic: none.
Copy link
Collaborator

@stevendanna stevendanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, I think this is the right direction to take things. We'll want to make a list of overrides that CC should set.

The last commit regarding disable_drop_virtual_cluster gives me a little pause. I agree that DROP TENANT isn't different from DROP DATABASE in terms of the danger it represents. Our main concern at the time is that DROP TENANT is something we might instruct the user to run on a PCR destination cluster that would be pretty disastrous to run on the source cluster (combined with the concern that sql_safe_updates doesn't apply to someone who is driving their cluster through something that doesn't set that session var). But, you were very clever putting this last in the series of commits as that empty initial cluster config makes it much more tempting.

Comment on lines -93 to -96
// Disable range coalescing (as long as the problems related
// to range coalescing have not been solved yet).
"SET CLUSTER SETTING spanconfig.range_coalescing.system.enabled = false",
"SET CLUSTER SETTING spanconfig.range_coalescing.application.enabled = false",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might consider backporting this change to 23.2.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would feel better if we did, but we need to identify/write up a bug it is fixing first to do so.

@dt
Copy link
Member Author

dt commented Feb 15, 2024

DROP TENANT is something we might instruct the user to run on a PCR destination cluster that would be pretty disastrous to run on the source cluster

Sure, and DROP DATABASE is something that you might run in your staging or development cluster but which would be disastrous if run in production, right?

@dt
Copy link
Member Author

dt commented Feb 15, 2024

We'll want to make a list of overrides that CC should set.

I have this sitting my editor to copy to a Jira ticket when this merges:

After all serverless hosts are on 23.2, but before they go to 24.1, they and newly created clusters will need to:

// Flip the override for all tenants so they do not send manual splits/scatters.
"ALTER VIRTUAL CLUSTER ALL SET CLUSTER SETTING sql.virtual_cluster.feature_access.manual_range_scatter.enabled = false",
"ALTER VIRTUAL CLUSTER ALL SET CLUSTER SETTING sql.virtual_cluster.feature_access.manual_range_split.enabled = false",

// Enable trace redaction
"SET CLUSTER SETTING trace.redact_at_virtual_cluster_boundary.enabled = true",

// Disable unrestricted zone config changes in secondary tenants 
"SET CLUSTER SETTING sql.virtual_cluster.feature_access.zone_configs.enabled = false",
"SET CLUSTER SETTING sql.virtual_cluster.feature_access.zone_configs_unrestricted.enabled = false",

// Disable multi-region abstractions in secondary tenants (on non-MR hosts)
"SET CLUSTER SETTING sql.virtual_cluster.feature_access.multiregion.enabled = false",

@dt
Copy link
Member Author

dt commented Feb 15, 2024

TFTR!

bors r+

@jaylim-crl
Copy link
Collaborator

We'll want to make a list of overrides that CC should set.

I have this sitting my editor to copy to a Jira ticket when this merges:

After all serverless hosts are on 23.2, but before they go to 24.1, they and newly created clusters will need to:

// Flip the override for all tenants so they do not send manual splits/scatters.
"ALTER VIRTUAL CLUSTER ALL SET CLUSTER SETTING sql.virtual_cluster.feature_access.manual_range_scatter.enabled = false",
"ALTER VIRTUAL CLUSTER ALL SET CLUSTER SETTING sql.virtual_cluster.feature_access.manual_range_split.enabled = false",

// Enable trace redaction
"SET CLUSTER SETTING trace.redact_at_virtual_cluster_boundary.enabled = true",

// Disable unrestricted zone config changes in secondary tenants 
"SET CLUSTER SETTING sql.virtual_cluster.feature_access.zone_configs.enabled = false",
"SET CLUSTER SETTING sql.virtual_cluster.feature_access.zone_configs_unrestricted.enabled = false",

// Disable multi-region abstractions in secondary tenants (on non-MR hosts)
"SET CLUSTER SETTING sql.virtual_cluster.feature_access.multiregion.enabled = false",

Could we also set these settings now (prior to 23.2 upgrade), if we used the old names?

@dt
Copy link
Member Author

dt commented Feb 15, 2024

Could we also set these settings now (prior to 23.2 upgrade), if we used the old names?

Yep, any of them that exist now, if we set them using their current (23.1) names, that'll still be the storage key for them in 23.2 and 24.1, so no need to wait for the aliases.

@craig
Copy link
Contributor

craig bot commented Feb 15, 2024

Build succeeded:

@craig craig bot merged commit d398bfb into cockroachdb:master Feb 15, 2024
12 checks passed
@dt dt deleted the ua-defaults branch February 15, 2024 14:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants