Teleport 16 Test Plan #42118
Desktop Access @probakowski @ibeckermayer
Binaries / OS compatibility
Verify that our software runs on the minimum supported OS versions as per
Windows @ravicious
Azure offers virtual machines with the Windows 10 2016 LTSB image. This image runs on Windows 10
macOS @camscale
Linux @camscale
Machine ID @timothyb89
With an SSH node registered to the Teleport cluster:
With a Postgres DB registered to the Teleport cluster:
With a Kubernetes cluster registered to the Teleport cluster:
With an HTTP application registered to the Teleport cluster:
Host users creation @atburke
Host users creation docs
CA rotations @fspmarshall
Proxy Peering
SSH Connection Resumption @fspmarshall
Verify that SSH works, and that resumable SSH is not interrupted across a Teleport Cloud tenant upgrade.
Verify that SSH works, and that resumable SSH is not interrupted across a control plane restart (of either the root or the leaf cluster).
EC2 Discovery @marcoandredinis
Azure Discovery @marcoandredinis
GCP Discovery @lxea
IP Pinning @AntonAM
Add a role with
Assist @jakule
Assist is not supported by
IGS @smallinsky
Teleport SAML Identity Provider @flyinghermit
Verify SAML IdP service provider resource management. Docs:
Manage Service Provider (SP)
SAML service provider catalog
Resources
Arguable if this is something we need to address, but it seemed better to "document" it anyway.
Nitpick: the theme picker text is bugged
Performance Test Results
Cloud
Load Tests
Soak Tests
Origin: us-east-1 Target: us-east-1
tsh bench ssh --duration=30m root@node-agents-5b8c8bb49-zzh6r-09 /busybox/ls -lah /
* Requests originated: 17998
* Requests failed: 0
Histogram
Percentile Response Duration
---------- -----------------
25 241 ms
50 250 ms
75 262 ms
90 305 ms
95 393 ms
99 1286 ms
100 4959 ms
Origin: us-west-2 Target: us-east-1
tsh bench ssh --duration=30m root@node-agents-5b8c8bb49-zzh6r-09 /busybox/ls -lah /
* Requests originated: 17992
* Requests failed: 0
Histogram
Percentile Response Duration
---------- -----------------
25 879 ms
50 890 ms
75 905 ms
90 952 ms
95 1196 ms
99 1795 ms
100 2997 ms
etcd
Postgres
Firestore
Database Access load test (PostgreSQL and MySQL)
Setup
Same as previous test but in EKS with a single node group:
Teleport cluster (all deployed on the EKS cluster):
Databases:
Note: Databases were configured using discovery running inside the database agent.
MySQL
10 connections/second (90th percentile: 80 ms)
50 connections/second (90th percentile: 467 ms)
PostgreSQL
10 connections/second (90th percentile: 93 ms)
50 connections/second (90th percentile: 499 ms)
Database Access resources count test
Setup
This is a one-time manual setup:
500 databases per agent, 50k keepalives
5k unique
20 databases per agent, 10k keepalives
1k unique
Manual Testing Plan
Below are the items that should be manually tested with each release of Teleport.
These tests should be run on both a fresh installation of the version to be released
as well as an upgrade of the previous version of Teleport.
Adding nodes to a cluster @lxea
Labels @lxea
Trusted Clusters @bl-nero
RBAC @bl-nero
Make sure that invalid and valid attempts are reflected in audit log. Do this with both Teleport and Agentless nodes.
Verify that custom PAM environment variables are available as expected. @atburke
Users @codingllama
With every user combination, try to login and signup with invalid second
factor, invalid password to see how the system reacts.
WebAuthn in the release tsh binary is implemented using libfido2 for Linux/macOS. Ask for a statically built pre-release binary for realistic tests. (tsh fido2 diag should work in our binary.)
WebAuthn in the Windows build is implemented using webauthn.dll. (tsh webauthn diag with security key selected in dialog should work.)
Touch ID requires a signed tsh; ask for a signed pre-release binary so you may run the tests.
Windows WebAuthn requires Windows 10 19H1 and a device capable of Windows Hello.
Adding Users OTP
Adding Users WebAuthn
Adding Users via platform authenticator
Managing MFA devices
tsh mfa add
tsh mfa add
tsh mfa add
tsh mfa ls
tsh mfa rm
tsh mfa rm
second_factor: on in auth_service, should fail
Login Password Only (upgraded password-only user)
Login with MFA
tsh mfa add
Login OIDC
Login SAML
Login GitHub
Deleting Users
Backends @rosstimothy
Session Recording @capnspacehook
Enhanced Session Recording @jakule
disk, command and network events are being logged.
enhanced_recording role option.
Auditd @jakule
teleport/lib/auditd/common.go
Lines 25 to 34 in 7744f72
Audit Log @rosstimothy
Audit log with dynamodb
Audit log with Firestore
Failed login attempts are recorded
Interactive sessions have the correct Server ID
server_id is the ID of the node in "session_recording: node" mode
server_id is the ID of the node in "session_recording: proxy" mode
forwarded_by is the ID of the proxy in "session_recording: proxy" mode
Node/Proxy ID may be found at /var/lib/teleport/host_uuid on the corresponding machine.
Node IDs may also be queried via tctl nodes ls.
Exec commands are recorded
scp commands are recorded
Subsystem results are recorded
Subsystem testing may be achieved using both Recording Proxy mode and OpenSSH integration.
Assuming the proxy is proxy.example.com:3023 and node1 is a node running OpenSSH/sshd, you may use the following command to trigger a subsystem audit log:
sftp -o "ProxyCommand ssh -o 'ForwardAgent yes' -p 3023 %r@proxy.example.com -s proxy:%h:%p" root@node1
External Audit Storage @nklaassen
External Audit Storage must be tested on an Enterprise Cloud tenant.
Instructions for deploying a custom release to a cloud staging tenant: https://github.com/gravitational/teleport.e/blob/master/dev-deploy.md
tsh play <session-id> works
Interact with a cluster using tsh @capnspacehook
These commands should ideally be tested for recording and non-recording modes as they are implemented in different ways.
Interact with a cluster using ssh @timothyb89
Make sure to test both recording and regular proxy modes.
Verify proxy jump functionality @atburke
Log into leaf cluster via root, shut down the root proxy and verify proxy jump works.
Interact with a cluster using the Web UI @capnspacehook
X11 Forwarding @Joerger
xeyes and xclip:
apt install x11-apps xclip
xeyes. Then brew install xclip.
ssh_service.x11.enabled = yes
tsh ssh -X user@node xeyes
tsh ssh -X root@node xeyes
tsh ssh -Y server01 "echo Hello World | xclip -sel c && xclip -sel c -o" should print "Hello World"
tsh ssh -X server01 "echo Hello World | xclip -sel c && xclip -sel c -o" should fail with "BadAccess" X error
User accounting @atburke
/var/run/utmp on Linux.
/var/log/wtmp on Linux.
/var/log/btmp on Linux.
Combinations @Joerger
For some manual testing, many combinations need to be tested. For example, for
interactive sessions the 12 combinations are below.
Add an agentless Node in a local cluster.
Add a Teleport Node in a local cluster.
Add an agentless Node in a remote (leaf) cluster.
Add a Teleport Node in a remote (leaf) cluster.
Teleport with EKS/GKE @AntonAM
Teleport with multiple Kubernetes clusters @tigrato
Note: you can use GKE or EKS or minikube to run Kubernetes clusters.
Minikube is the only caveat - it's not reachable publicly so don't run a proxy there.
tsh login, check that tsh kube ls has your cluster
kubectl get nodes, kubectl exec -it $SOME_POD -- sh
tsh login, check that tsh kube ls has your cluster
kubectl get nodes, kubectl exec -it $SOME_POD -- sh
tsh login, check that tsh kube ls has your cluster
kubectl get nodes, kubectl exec -it $SOME_POD -- sh
tsh login, check that tsh kube ls has both clusters
tsh kube login
kubectl get nodes, kubectl exec -it $SOME_POD -- sh on the new cluster
tsh login, check that tsh kube ls has all clusters
name and labels
Step 2 login value matching the rows name column
name or labels in the search bar works
name column
Kubernetes exec via WebSockets/SPDY @AntonAM
To control the use of websockets on the kubectl side, the environment variable KUBECTL_REMOTE_COMMAND_WEBSOCKETS can be used:
KUBECTL_REMOTE_COMMAND_WEBSOCKETS=true kubectl -v 8 exec -n namespace podName -- /bin/bash --version
With the -v 8 logging level you should be able to see X-Stream-Protocol-Version: v5.channel.k8s.io in case kubectl is connected over websockets to Teleport.
To run the tests you'll need kubectl version 1.29 or later, a Kubernetes cluster at v1.29 or earlier (doesn't support websockets stream protocol v5) and a cluster at v1.30 (supports it by default), and you need to access each of them both through a kube agent and through kubeconfig.
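The kubectl -v 8 output is long, so a short script can help pick out the stream protocol header. The following is a minimal sketch, not part of the test plan: the function names are hypothetical, and it assumes the header appears in the log in the "X-Stream-Protocol-Version: <value>" form quoted above.

```python
import re

# Header name and value quoted in the test plan text.
_HEADER_RE = re.compile(r"X-Stream-Protocol-Version:\s*(\S+)")
WEBSOCKET_V5 = "v5.channel.k8s.io"


def stream_protocols(log_text):
    """Return every X-Stream-Protocol-Version value found in the log text."""
    return _HEADER_RE.findall(log_text)


def saw_websocket_v5(log_text):
    """True if any logged header value is the v5 channel protocol."""
    return WEBSOCKET_V5 in stream_protocols(log_text)
```

One way to use it is to save the output of the kubectl command to a file (stderr included, since that is where verbose logs go) and pass the file contents to saw_websocket_v5.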
KUBECTL_REMOTE_COMMAND_WEBSOCKETS=false
KUBECTL_REMOTE_COMMAND_WEBSOCKETS=true
(X-Stream-Protocol-Version: v5.channel.k8s.io)
(X-Stream-Protocol-Version: v5.channel.k8s.io)
Kubernetes auto-discovery @AntonAM
tctl create.
tctl create -f.
tctl rm.
Kubernetes Secret Storage @AntonAM
Statefulset
Kubernetes Pod RBAC @AntonAM
kubernetes_resources:
{"kind":"pod","name":"*","namespace":"*"} - must allow access to every pod.
{"kind":"pod","name":"<somename>","namespace":"*"} - must allow access to pod <somename> in every namespace.
{"kind":"pod","name":"*","namespace":"<somenamespace>"} - must allow access to any pod in <somenamespace> namespace.
* wildcards - <some-name>-* and regex for name and namespace fields.
go-client.
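To reason about which pods a given kubernetes_resources entry should match before testing against a live cluster, a small model of the wildcard rules can help. This is a simplified sketch and not Teleport's matcher: the function names are hypothetical, it covers only the * wildcard cases listed above by way of Python's fnmatch, and it does not cover the regex form.

```python
from fnmatch import fnmatchcase


def rule_matches(rule, kind, name, namespace):
    """Check one kubernetes_resources entry against a concrete resource."""
    return (
        rule["kind"] == kind
        and fnmatchcase(name, rule["name"])
        and fnmatchcase(namespace, rule["namespace"])
    )


def is_allowed(rules, kind, name, namespace):
    """A resource is allowed if any entry in the list matches it."""
    return any(rule_matches(rule, kind, name, namespace) for rule in rules)


# Example entries in the same shape as the ones in the test plan.
every_pod = [{"kind": "pod", "name": "*", "namespace": "*"}]
prefixed = [{"kind": "pod", "name": "nginx-*", "namespace": "default"}]
```

With these definitions, is_allowed(prefixed, "pod", "nginx-1", "default") is true, while the same pod name in another namespace is not matched.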
kubernetes_resources:
kubernetes_groups that denies exec into a pod
search_as_roles is not allowed.
Teleport with FIPS mode @bl-nero
ACME @bl-nero
Migrations @tigrato
SSH should work for both main and old clusters
SSH should work
Command Templates
When interacting with a cluster, the following command templates are useful:
OpenSSH
Teleport
Teleport with SSO Providers
GitHub External SSO @capnspacehook
tctl sso family of commands @Tener
For help with setting up sso connectors, check out the [Quick GitHub/SAML/OIDC Setup Tips]
tctl sso configure helps to construct a valid connector definition:
tctl sso configure github ... creates valid connector definitions
tctl sso configure oidc ... creates valid connector definitions
tctl sso configure saml ... creates valid connector definitions
tctl sso test tests a provided connector definition, which can be loaded from file or piped in with tctl sso configure or tctl get --with-secrets. Valid connectors are accepted, invalid are rejected with sensible error messages.
tctl sso test.
SSO login on remote host @atburke
tsh should be running on a remote host (e.g. over an SSH session) and use the local browser to complete an SSO login. Run
tsh login --callback <remote.host>:<port> --bind-addr localhost:<port> --auth <auth>
on the remote host. Note that the --callback URL must be able to resolve to the --bind-addr over HTTPS.
Teleport Plugins @EdwardDowling
Teleport Operator @hugoShaka
teleport-cluster Helm chart and the operator enabled
AWS Node Joining @hugoShaka
Docs
ec2:DescribeInstances permissions for local account:
TELEPORT_TEST_EC2=1 go test ./integration -run TestEC2NodeJoin
TELEPORT_TEST_EC2=1 go test ./integration -run TestIAMNodeJoin
Kubernetes Node Joining @hugoShaka
Azure Node Joining @marcoandredinis
Docs
GCP Node Joining @marcoandredinis
Docs
Cloud Labels @atburke
and with tag foo:bar. Verify that a node running on the instance has label aws/foo=bar.
foo:bar. Verify that a node running on the instance has label azure/foo=bar.
and with label foo:bar and tag baz:quux. Verify that a node running on the instance has labels gcp/label/foo=bar and gcp/tag/baz=quux.
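The label names expected in the Cloud Labels checks follow a simple prefix pattern. The helper below is a hypothetical sketch, not Teleport code. It writes down only the mapping visible in the examples above, so that expected values for other tag sets can be worked out by hand before checking the node.

```python
def expected_labels(provider, tags=None, labels=None):
    """Map cloud tags/labels to the node label names shown in the test plan.

    Only the three providers from the examples are covered; `labels` is
    used for GCP only, where labels and tags get separate prefixes.
    """
    tags = tags or {}
    labels = labels or {}
    if provider in ("aws", "azure"):
        return {f"{provider}/{key}": value for key, value in tags.items()}
    if provider == "gcp":
        result = {f"gcp/label/{key}": value for key, value in labels.items()}
        result.update({f"gcp/tag/{key}": value for key, value in tags.items()})
        return result
    raise ValueError(f"unknown provider: {provider}")
```

For example, expected_labels("gcp", tags={"baz": "quux"}, labels={"foo": "bar"}) gives the two GCP labels named in the checklist item.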
Passwordless @codingllama
This feature has additional build requirements, so it should be tested with a pre-release build (eg: https://cdn.teleport.dev/tsh-v16.0.0-alpha.2.pkg).
This section complements "Users -> Managing MFA devices". tsh binaries for each operating system (Linux, macOS and Windows) must be tested separately for FIDO2 items.
Diagnostics
Commands should pass all tests.
tsh fido2 diag (macOS/Linux)
tsh touchid diag (macOS only)
tsh webauthnwin diag (Windows only)
Registration
(tsh mfa add, choose WEBAUTHN and passwordless)
(tsh mfa add, choose TOUCHID)
(tsh mfa add, choose WEBAUTHN and passwordless)
Login
(tsh login --auth=passwordless)
(tsh login --auth=passwordless)
tsh login --auth=passwordless --mfa-mode=cross-platform uses FIDO2
tsh login --auth=passwordless --mfa-mode=platform uses platform authenticator
tsh login --auth=passwordless --mfa-mode=auto prefers platform authenticator
the same device)
(auth_service.authentication.passwordless = false)
(auth_service.authentication.connector_name = passwordless)
(tsh login --auth=local)
Touch ID support commands
tsh touchid ls works
tsh touchid rm works (careful, may lock you out!)
Device Trust @codingllama
Device Trust requires Teleport Enterprise.
This feature has additional build requirements, so it should be tested with a pre-release build (eg: https://cdn.teleport.dev/teleport-ent-v16.0.0-alpha.2-linux-amd64-bin.tar.gz).
Client-side enrollment requires a signed tsh for macOS, make sure to use the tsh binary from tsh.app.
Additionally, Device Trust Web requires Teleport Connect to be installed (device authentication for the Web is handled by Connect).
A simple formula for testing device authorization is:
Inventory management
(tctl devices add)
(tctl devices add --enroll)
(tctl devices ls)
(tctl devices rm)
(tctl devices rm)
(tctl devices enroll)
(tctl devices enroll)
Device enrollment
Enroll/authn device on macOS (tsh device enroll)
Enroll/authn device on Windows (tsh device enroll)
Enroll/authn device on Linux (tsh device enroll)
Linux users need read/write permissions to /dev/tpmrm0. The simplest way is to assign yourself to the tss group. See https://goteleport.com/docs/access-controls/device-trust/device-management/#troubleshooting.
Verify device extensions on TLS certificate
Note that different accesses have different certificates (Database, Kube,
etc).
Verify device extensions on SSH certificate
Device authentication
tsh or Connect
Web UI (requires Connect)
Confirm that it works by failing first. Most protocols can be tested using
device_trust.mode="required". App Access and Desktop Access require a custom
role (see [enforcing device trust][enforcing-device-trust]).
[enforcing-device-trust]: https://goteleport.com/docs/access-controls/device-trust/enforcing-device-trust/#app-access-support
Device authorization
device_trust.mode other than "off" or "" not allowed (OSS)
device_trust.mode="off" doesn't impede access (Enterprise and OSS)
device_trust.mode="optional" doesn't impede access, but issues device
extensions on login
device_trust.mode="required" enforces enrolled devices
device_trust.mode="required" is enforced by processes and not only by
Auth APIs
Testing this requires issuing a certificate without device extensions
(mode="off"), then changing the cluster configuration to mode="required" and
attempting to access a process directly, without a login attempt.
Role-based authz enforces enrolled devices
(device_trust.mode="optional" and role.spec.options.device_trust_mode="required")
Device authorization works correctly for both require_session_mfa=false
and require_session_mfa=true
Device authorization applies to Trusted Clusters
(root with mode="optional" and leaf with mode="required")
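The authorization cases above can be summarised as a small truth table. The sketch below is a simplified model for planning test cases, not Teleport's enforcement logic. The function names are hypothetical, and it assumes only what the list states: "off" and "optional" do not impede access, "required" needs device extensions, and a role-level "required" applies even when the cluster mode is "optional".

```python
REQUIRED = "required"


def device_required(cluster_mode, role_modes=()):
    """A trusted device is needed if the cluster or any held role requires it."""
    return cluster_mode == REQUIRED or REQUIRED in list(role_modes)


def access_allowed(cluster_mode, has_device_extensions, role_modes=()):
    """Model of the expected outcome for one access attempt."""
    if device_required(cluster_mode, role_modes):
        # Certificates without device extensions should be rejected.
        return has_device_extensions
    return True
```

The process-level enforcement case corresponds to access_allowed("required", False) returning False for a certificate issued while the mode was "off".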
Device audit (see lib/events/codes.go)
Web Authentication Confirmed" events
Corresponding "Device Authenticated" events have both
web_authentication=true and web_session_id set.
data (for certificates with device extensions)
Binary support
tsh for macOS gives a sane error message for tsh device enroll attempts.
Device support commands
tsh device collect (macOS)
tsh device asset-tag (macOS)
tsh device collect (Windows)
tsh device asset-tag (Windows)
tsh device collect (Linux)
tsh device asset-tag (Linux)
Hardware Key Support @Joerger
Hardware Key Support is an Enterprise feature and is not available for OSS.
You will need a YubiKey 4.3+ to test this feature.
This feature has additional build requirements, so it should be tested with a pre-release build (eg: https://cdn.teleport.dev/teleport-ent-v16.0.0-alpha.2-linux-amd64-bin.tar.gz).
Server Access
This test should be carried out on Linux, macOS, and Windows.
Set auth_service.authentication.require_session_mfa: hardware_key_touch in your cluster auth settings and log in.
tsh login
tsh ssh
tsh proxy db --tunnel
HSM Support @nklaassen
Docs
Run the full test suite with each HSM/KMS:
Moderated session @rosstimothy
Create two Teleport users, a moderator and a user. Configure Teleport roles to require that the moderator moderate the user's sessions. Use TELEPORT_HOME to tsh login as the user in one terminal, and the moderator in another.
Ensure the default terminationPolicy of terminate has not been changed.
For each of the following cases, create a moderated session with the user using tsh ssh and join this session with the moderator using tsh join --role moderator:
Ctrl+C in the user terminal disconnects the moderator as the session has ended.
Ctrl+C in the moderator terminal disconnects the moderator and terminates the user's session as the session no longer has a moderator.
t in the moderator terminal terminates the session for all participants.
Performance @rosstimothy @fspmarshall @espadolini
Scaling Test
Scale up the number of nodes/clusters a few times for each configuration below.
Perform reverse tunnel node scaling tests for all backend configurations:
Perform the following additional scaling tests on DynamoDB:
Soak Test
Run 30 minute soak test directly against direct and tunnel nodes
and via label based matching. Tests should be run against a Cloud
tenant.
Concurrent Session Test
Run a concurrent session test that will spawn 5 interactive sessions per node in the cluster:
Robustness
resources which do not require a moderated session and in async recording
mode from an already issued certificate.
which require a moderated session and in async recording mode from an already
issued certificate.
are restarted.
Teleport with Cloud Providers
AWS @camscale
GCP @marcoandredinis
IBM @hugoShaka
Application Access @gabrielcorado
debug_app: true works.
name.rootProxyPublicAddr as well as publicAddr.
name.rootProxyPublicAddr.
app.session.start and app.session.chunk events are created in the Audit Log.
app.session.chunk points to a 5 minute session archive with multiple app.session.request events inside.
tsh play <chunk-id> can fetch and print a session chunk archive.
tsh apps login.
tsh commands.
tsh aws
tsh aws --endpoint-url (this is a hidden flag)
tsh apps login.
tsh az commands.
tsh proxy az and az commands.
tsh apps login.
tsh gcloud commands.
tsh gsutil commands.
tsh proxy gcloud and gcloud/gsutil commands.
tctl create.
tctl create -f.
tctl rm.
Add Application links to documentation.
Database Access @greedy52
(select pg_sleep(10) followed by ctrl-c is a good query to test.)
assume_role_arn: "" and external_id: "<id>"
Azure single-server MySQL and Postgres (EOL Sep 2024 and Mar 2025, let's skip)
assume_role_arn: "" and external_id: "<id>"
assume_role_arn: "" and external_id: "<id>"
Azure single-server MySQL and Postgres (EOL Sep 2024 and Mar 2025, let's skip)
Verify all supported modes: keep, best_effort_drop
AWS RDS Postgres. (e2e test)
AWS RDS MySQL. (e2e test)
AWS RDS MariaDB. (e2e test)
db.session.start is emitted when you connect.
db.session.end is emitted when you disconnect.
db.session.query is emitted when you execute a SQL query.
tsh db ls shows only databases matching role's db_labels.
db_users.
db_names. @Tener
db.session.start is emitted when connection attempt is denied.
db_names. @GavinFrazar
db.session.query is emitted when command fails due to permissions.
db_names. @gabrielcorado
tsh db connect.
tctl create.
tctl create -f.
tctl rm.
Please configure discovery in Discovery Service instead of Database Service.
Can detect and register RDS instances. (e2e test)
assume_role_arn and external_id is set.
Can detect and register Redshift clusters. (e2e test)
name, description, type, and labels
Step 2 login value matching the rows name column
labels
TLS Routing @greedy52
v2 configuration starts only a single listener for proxy service, in contrast with v1 configuration.
Given configuration: @GavinFrazar
*:3080 for proxy service. Given the configuration above, 3022 and 3025 will be opened for other services.
v1, there should be additional ports 3023 and 3024.
multiplex mode auth_service.proxy_listener_mode: "multiplex" @GavinFrazar
web_proxy_addr == tunnel_addr
tsh db connect works through proxy running in multiplex mode
tsh proxy db with a GUI client. @GavinFrazar
multiplex mode
ssh -o "ForwardAgent yes" -o "ProxyCommand tsh proxy ssh" user@host.example.com
ssh -o "ForwardAgent yes" -o "ProxyCommand tsh proxy ssh --user=%r --cluster=leaf-cluster %h:%p" user@node.foo.com
tsh ssh access through proxy running in multiplex mode
multiplex mode, using tsh
multiplex mode behind L7 load balancer
tsh login and tctl @GavinFrazar
tsh ssh and tsh config @GavinFrazar
tsh proxy db and tsh db connect @GavinFrazar
tsh proxy app and tsh aws @GavinFrazar
tsh proxy kube @GavinFrazar