Allow Cloud users to set web_idle_timeout and similar settings. #18829

Closed
programmerq opened this issue Nov 28, 2022 · 6 comments
Assignees
Labels
bug c-cro Internal Customer Reference c-puf Internal Customer Reference cloud Cloud support-load This issue generates support load

Comments

@programmerq
Contributor

Expected behavior:

It should be possible to update the web_idle_timeout option on a Cloud tenant.

Current behavior:

If you run tctl get cluster_networking_config on a Cloud instance, it shows that the current settings are controlled via a config file. If you try to change the object via tctl create -f cluster_networking_config.yaml, you get the following error:

ERROR: The cluster networking configuration resource is managed by static configuration. We recommend removing configuration from teleport.yaml, restarting the servers and trying this command again.

If you would still like to proceed, re-run the command with both --force and --confirm flags.

If you supply both --force (-f) and --confirm, the change is applied, but it only stays in effect until the pod is restarted.

Additionally, it is possible to remove the proxy_listener_mode: 1 setting, which is not ideal. Values that Cloud requires should be enforced so they cannot be changed.

client_idle_timeout and idle_timeout_message would also be useful options for users to control within the cluster_networking_config resource.

Bug details:

  • Teleport version: 10.3.8 (Cloud)
  • Recreation steps
  • Debug logs
@programmerq programmerq added bug cloud Cloud c-cro Internal Customer Reference labels Nov 28, 2022
@russjones
Contributor

@vitorenesduarte Let's see if we can follow our dynamic configuration method for web_idle_timeout. See the following RFDs for more details.

https://github.com/gravitational/teleport/blob/master/rfd/0016-dynamic-configuration.md
https://github.com/gravitational/teleport/blob/master/rfd/0028-cluster-config-resources.md

If you have questions, feel free to ping me and I can step you through what we want.

@vitorenesduarte
Contributor

vitorenesduarte commented Dec 5, 2022

Summary

For this we need:

Details

Currently the cluster_networking_config has the label teleport.dev/origin: config-file even if the initial auth configuration does not contain any cluster_networking_config field.

This seems to be due to this line:

cfg.Auth.NetworkingConfig, err = types.NewClusterNetworkingConfigFromConfigFile(types.ClusterNetworkingConfigSpecV2{

Teleport changes (part I)

cluster_networking_config should only have the label teleport.dev/origin: config-file when the initial auth configuration does indeed contain cluster_networking_config fields. Otherwise, the label should be teleport.dev/origin: defaults.

This will allow the cluster_networking_config to be updated without providing the --confirm flag (both by Cloud (see below) and by users afterwards).
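A minimal sketch of the label rule described above, not the actual Teleport change. The networkingFileConfig type and the originFor helper are hypothetical names introduced for illustration; only the label key and its two values come from this thread.

```go
package main

import "fmt"

// Label key and values as quoted in this thread.
const (
	originLabel      = "teleport.dev/origin"
	originConfigFile = "config-file"
	originDefaults   = "defaults"
)

// networkingFileConfig is a hypothetical stand-in for the networking fields
// that may appear in the auth configuration. A nil pointer means the field
// was not set in the file.
type networkingFileConfig struct {
	WebIdleTimeout    *string
	ProxyListenerMode *string
	TunnelStrategy    *string
}

// hasAnyField reports whether the config file set at least one field.
func (c networkingFileConfig) hasAnyField() bool {
	return c.WebIdleTimeout != nil || c.ProxyListenerMode != nil || c.TunnelStrategy != nil
}

// originFor returns "config-file" only when the file actually contains
// networking fields, and "defaults" otherwise.
func originFor(c networkingFileConfig) string {
	if c.hasAnyField() {
		return originConfigFile
	}
	return originDefaults
}

func main() {
	timeout := "30m"
	fmt.Println(originLabel, "=", originFor(networkingFileConfig{}))
	fmt.Println(originLabel, "=", originFor(networkingFileConfig{WebIdleTimeout: &timeout}))
}
```

With the label set to defaults, a later tctl create would no longer be treated as overriding static configuration, which is the behavior this section asks for.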

Cloud changes

When a tenant is being provisioned, Cloud's tenant operator can create a cluster_networking_config containing the proxy_listener_mode field and the tunnel_strategy field.

This process would be similar to what we do currently for the cluster_auth_preference: https://github.com/gravitational/cloud/blob/deef018bfbb2577ccc38c275081dc7fb6e159afa/pkg/tenantoperator/configmaps.go#L550-L566

Teleport changes (part II)

The above changes should allow users to set a custom cluster_networking_config (e.g. modify web_idle_timeout mentioned in the issue description).
However, not all changes are valid (e.g. changing proxy_listener_mode).

To prevent invalid changes, we can extend the existing cloud-only resource validation so only a subset of the fields can be modified:

func ValidateResource(res types.Resource) error {
	// All checks below are Cloud-specific.
	if !GetModules().Features().Cloud {
		return nil
	}
	switch r := res.(type) {
	case types.AuthPreference:
		switch r.GetSecondFactor() {
		case constants.SecondFactorOff, constants.SecondFactorOptional:
			return trace.BadParameter("cannot disable two-factor authentication on Cloud")
		}
	case types.SessionRecordingConfig:
		switch r.GetMode() {
		case types.RecordAtProxy, types.RecordAtProxySync:
			return trace.BadParameter("cannot set proxy recording mode on Cloud")
		}
		if !r.GetProxyChecksHostKeys() {
			return trace.BadParameter("cannot disable strict host key checking on Cloud")
		}
	}
	return nil
}
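One possible shape for the extension, as a self-contained sketch rather than the real implementation. The networkingSpec struct, the validateNetworkingUpdate function, and the isCloud parameter are assumptions made for illustration (the tunnel strategy is simplified to a string, and the listener mode to an int). The idea is to compare the proposed resource with the current one and reject changes to the restricted fields, while leaving fields such as web_idle_timeout free to change.

```go
package main

import (
	"errors"
	"fmt"
)

// networkingSpec is a hypothetical, trimmed-down view of
// cluster_networking_config used only for this sketch.
type networkingSpec struct {
	WebIdleTimeout    string
	ClientIdleTimeout string
	ProxyListenerMode int
	TunnelStrategy    string
}

// validateNetworkingUpdate rejects updates that modify fields a Cloud tenant
// must not change. Outside Cloud, every update is accepted.
func validateNetworkingUpdate(isCloud bool, current, proposed networkingSpec) error {
	if !isCloud {
		return nil
	}
	if proposed.ProxyListenerMode != current.ProxyListenerMode {
		return errors.New("cannot change proxy_listener_mode on Cloud")
	}
	if proposed.TunnelStrategy != current.TunnelStrategy {
		return errors.New("cannot change tunnel_strategy on Cloud")
	}
	return nil
}

func main() {
	current := networkingSpec{WebIdleTimeout: "30m", ProxyListenerMode: 1, TunnelStrategy: "agent_mesh"}

	// Changing only the idle timeout is allowed.
	allowed := current
	allowed.WebIdleTimeout = "2h"
	fmt.Println("timeout change:", validateNetworkingUpdate(true, current, allowed))

	// Changing the listener mode is rejected on Cloud.
	rejected := current
	rejected.ProxyListenerMode = 0
	fmt.Println("listener mode change:", validateNetworkingUpdate(true, current, rejected))
}
```

Comparing against the current value, instead of hard-coding an expected value, keeps the check valid if the restricted fields are later set to something else for a tenant.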

@dboslee
Contributor

dboslee commented Jun 7, 2023

There are fields in the cluster networking config that we MUST prevent Cloud tenants from updating, as changes to these fields could break networking and thus the customer's access to their Teleport cluster. These fields include:

  • ProxyListenerMode
  • TunnelStrategy

However, we don't want to lock these to a specific value, as we want the ability to update them on our end if necessary.

There are additional fields that I think we should prevent customers from updating, as they could increase load on our infrastructure and alter the way we expect agents to reconnect and heartbeat. These fields don't carry the same risk of a customer breaking their Teleport cluster. They include:

  • KeepAliveInterval
  • KeepAliveCountMax

As a general rule, for anything a customer could configure that alters the way Teleport handles networking or agent behavior, we should assess the impact that changing the value could have on a Cloud tenant.
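The policy above could be sketched as a table of protected fields plus a check that depends on who is making the change. This is only an illustration: the blockedChanges function, the byOperator flag, the map-based representation of the config, and the reason strings are all assumptions, and only the four field names come from this comment.

```go
package main

import (
	"fmt"
	"sort"
)

// protectedFields lists the fields named above, with a paraphrased reason.
var protectedFields = map[string]string{
	"ProxyListenerMode": "could break tenant networking",
	"TunnelStrategy":    "could break tenant networking",
	"KeepAliveInterval": "could increase infrastructure load",
	"KeepAliveCountMax": "could increase infrastructure load",
}

// blockedChanges returns the sorted names of protected fields whose values
// differ between current and proposed. Changes made by the Cloud operator
// are never blocked, so the fields are not locked to a fixed value.
func blockedChanges(byOperator bool, current, proposed map[string]string) []string {
	if byOperator {
		return nil
	}
	var blocked []string
	for name := range protectedFields {
		if current[name] != proposed[name] {
			blocked = append(blocked, name)
		}
	}
	sort.Strings(blocked)
	return blocked
}

func main() {
	current := map[string]string{"KeepAliveInterval": "5m", "WebIdleTimeout": "30m"}
	proposed := map[string]string{"KeepAliveInterval": "1s", "WebIdleTimeout": "1h"}

	for _, name := range blockedChanges(false, current, proposed) {
		fmt.Printf("tenant may not change %s: %s\n", name, protectedFields[name])
	}
	fmt.Println("blocked for operator:", len(blockedChanges(true, current, proposed)))
}
```

Keeping the list in one table makes it easier to review each new networking field for its impact on a Cloud tenant before exposing it.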

@r0mant
Collaborator

r0mant commented Aug 18, 2023

Looks like this wasn't fully addressed because the config is still being reverted after pods restart. @lxea

@r0mant r0mant reopened this Aug 18, 2023
@ravicious ravicious added the support-load This issue generates support load label Dec 4, 2023
@TeleLos TeleLos added the c-puf Internal Customer Reference label Feb 9, 2024
@TeleLos
Contributor

TeleLos commented Feb 9, 2024

c-puf is looking to have a longer timeout for SSH nodes and desktop clients.

@r0mant
Collaborator

r0mant commented Feb 13, 2024

This was implemented a while ago.


8 participants