Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Predictable option combining #43

Open
vitorenesduarte opened this issue Nov 14, 2022 · 7 comments
Open

Predictable option combining #43

vitorenesduarte opened this issue Nov 14, 2022 · 7 comments

Comments

@vitorenesduarte
Copy link

vitorenesduarte commented Nov 14, 2022

In the end of this text there's a table containing all Teleport RBAC options:

  • the 1st & 2nd columns ("Option name" and "Description") were copied from the docs
  • the 3rd & 4rd columns ("Multi-role behavior" and "Default") were inferred from the docs & code
  • the 5th, 6th & 7th columns ("Predicate's option name", "Predicate's multi-policy behavior" and ""Predicate's default") contain proposals for what we could do in predicate
  • the 8th column ("Backwards-compatible") indicates whether the proposal is backwards compatible (and why it's not if that's the case)
  • the 9th column ("References") contains code references for the 3rd & 4rd column
  • the 10th column ("Notes") contains additional notes

As it can be seen in the 3rd column, there's no consistent rule applied when combining options belonging to different roles (e.g. both and and or are used for booleans).

Option combining should select the least permissive alternative, but that's not always the case (see e.g. forward_agent and port_forwarding).

Predictable combining rules

Ideally we should always apply the same rule (e.g. always and).
However, this is not possible as options can be booleans, numbers, and strings (at least).

So maybe we can try to apply:

  1. and for booleans
  2. min for everything else

1. and for booleans

If an option is a boolean, then false should represent the least permissive option.
This ensures that when we and two of these, we get the least permissive alternative.
For this, the option verb should be something permissive (e.g. allow_*).

2. min for everything else

This should work if the option values can be totally ordered.

This is already the case for numbers (used e.g. in max_session_ttl and max_connections).
In this case, the smallest number should represent the least permissive option.

For strings, we have to totally order them in increasing permissiveness.
For example, for the lock option we have two possible values (strict and best_effort).
Here we can use min if strict < best_effort.

Least permissive by default

We want that options are combined in a way that the least permissive option is selected.

Does this mean that we should also have, by default, the least permissive option?

This doesn't happen today with:

  • port_forwarding, ssh_file_copy, desktop_clipboard:
    • default: true
    • least permissive: false
  • disconnect_expired_cert:
    • default: false
    • least permissive: true
  • lock, recording_session.ssh:
    • default: best_effort
    • least permissive: strict
  • enhanced_recording:
    • default: {command, network}
    • least permissive: {command, network, disk}

Are these defaults not the least permissive option to make Teleport more usable?

With predicate, we could change these defaults, and maybe have a backwards-compatible migration if role-to-policy translation (#50) is aware of this.

Least permissive durations

For TTLs and timeouts, the least permissive default would be something like 1s.
So it seems that we would need sane defaults that are not too permissive but still allow the product to be usable at the same time.

Unrestricted defaults today

We have some options with unrestricted defaults:

  • client_idle_timeout, max_sessions, max_connections and max_kubernetes_connections: the defaults seem to be 0, so no restriction is set

Should this be changed?

Option naming

Prefixes

Should option names be prefixed by the protocol (e.g. ssh or desktop) they apply to for more clarity?

Meaning

Should boolean option names mean permissive actions so that logical and always gives the least permissive?

For example, today record_session.desktop: true is restrictive.
So to combine this with record_session.desktop: false in a way that we get the least permissive option, we have to do a logical or.

If we want to always do a logical and for boolean options, then true should be the least permissive.
For this, in the example above, the option name should instead be something like desktop.allow_session_unrecorded.

The same applies to pin_source_ip which could instead be named ssh.allow_source_ip_unpinned.

Non-deterministic option combining

If cert extensions from different roles set different values for the same key, the value that is selected seems non-deterministic (see R23).

Should we just error in this case?

Also, from R23, it seems that these cert extensions could override the cert extensions set by Teleport itself.

Teleport RBAC options

Option name Description Multi-role behavior Default Predicate's option name Predicate's multi-policy behavior Predicate's default Backwards compatible References Notes
max_session_ttl Max. time to live (TTL) of a user's SSH certificates min 30h session_ttl min 30h ✔️ R1, R2
client_idle_timeout Forcefully terminate active SSH sessions after an idle interval min 0 client_idle_timeout min ✔️ R3 0 means no restriction
max_sessions Total number of session channels which can be established across a single SSH connection via Teleport min 0 ssh.max_sessions_per_connection min ✔️ R4 0 means no restriction
max_connections Enterprise-only limit on how many concurrent sessions can be started via Teleport min 0 ssh.max_connections min ✔️ R5 0 means no restriction
max_kubernetes_connections Defines the maximum number of concurrent Kubernetes sessions per user min 0 kubernetes.max_connections min ✔️ R6 0 means no restriction
lock Locking mode (strict or best_effort) min where strict < best_effort best_effort locking_mode min where strict < best_effort strict ✔️ R7, R8
record_session.ssh Defines the Session recording mode min where strict < best_effort best_effort ssh.session_recording_mode min where strict < best_effort strict ✔️ R1, R9
cert_format Not in the documentation. min where oldssh < standard standard ssh.cert_format min where standard < oldssh standard ❌ (why: combining rule) R1, R10
record_session.desktop Defines the Session recording mode logical or true desktop.allow_session_unrecorded logical and false ✔️ R1, R11
forward_agent Allow SSH agent forwarding logical or false ssh.allow_agent_forwarding logical and false ❌ (why: combining rule) R12
port_forwarding Allow TCP port forwarding logical or true ssh.allow_port_forwarding logical and false ❌ (why: combining rule) R1, R13
permit_x11_forwarding Allow users to enable X11 forwarding with OpenSSH clients and servers logical or false ssh.allow_x11_forwarding logical and false ❌ (why: combining rule) R14
ssh_file_copy Allow SCP/SFTP logical and true ssh.allow_file_copy logical and false ✔️ R1, R15
disconnect_expired_cert Forcefully terminate active SSH sessions when a client certificate expires logical or false ssh.allow_expired_cert logical and false ✔️ R16
pin_source_ip Enable source IP pinning for SSH certificates. Note: IP pinning is currently in Preview mode logical or false ssh.allow_source_ip_unpinned logical and false ✔️ R17
create_host_user Allow users to be automatically created on a host logical and (within only the for roles matching a Node) false ssh.allow_host_user_creation logical and false ✔️ R18 Seems to be the only option that applies only if there's a "Node" match. See Slack thread.
desktop_clipboard Allow clipboard sharing for desktop sessions logical and true desktop.allow_clipboard logical and false ✔️ R1, R19
desktop_directory_sharing Not in the documentation. logical and true desktop.allow_directory_sharing logical and false ✔️ R1, R20
enhanced_recording Indicates which events should be recorded by the BFP-based session recorder set union {command, network} ssh.session_recording_bpf_events set union {command, network, disk} ✔️ R1, R21
cert_extensions Specifies extensions to be included in SSH certificates map union + non-determinism on key-clash {} ssh.cert_extensions map union + error on key-clash {} ✔️ R22, R23

Teleport RBAC options (TODO)

Option name Description Multi-role behavior Default Predicate's option name Predicate's multi-policy behavior Predicate's default Backwards compatible References Notes
require_session_mfa Enforce per-session MFA or PIV-hardware key restrictions on user login sessions (no, yes, hardware_key, hardware_key_touch) For per-session MFA, Logical "OR" i.e. evaluates to "yes" if at least one role requires session MFA
request_access Enterprise-only Access Request strategy (optional, always or reason)
request_prompt Prompt for the Access Request "reason" field

Undocumented options

Above we have two options from api/types/types.pb.go that are not documented:

  • desktop_directory_sharing
  • cert_format

Questions

  • max_session_ttl: This does not apply only to a specific protocol (e.g. ssh or desktop), right?
  • client_idle_timeout: This does not apply only to a specific protocol (e.g. ssh or desktop), right?
  • lock: This does not apply only to a specific protocol (e.g. ssh or desktop), right?
  • cert_extensions: What if cert extensions from different roles set different values for the same key? Which value is selected?
  • require_session_mfa: How can it be a logical "OR" if we have values like hardware_keyand hardware_key_touch?
@klizhentas
Copy link
Collaborator

klizhentas commented Nov 14, 2022

Write tests with new heuristics and types and review with @nklaassen to make sure it all makes sense.

  • Success criteria: implement all of the options tests in predicate and show Nic to see if he is OK with this or confused.

@nklaassen
Copy link

nklaassen commented Nov 15, 2022

I think we have to figure out which questions we want policies to answer, how we will have to define the policies in order to answer those questions, and what the interface for asking those questions will be.

The first big question we currently answer with roles is: Can identity_a in state_b access resource_c with options(x, y, and z)?

Here identity_a is often a user, state_b captures MFA state, resource_c is some labeled resource, and options(x, y, and z) includes the host login, db login, AWS role, etc. I'll refer to all of these together as the context of the access question.

The reason I'm breaking this out is that for this question there is a simpler formulaic answer, access is defined by two simple questions:

  1. Does any 1 single role deny access with this context? If so, deny.
  2. Does any 1 single role allow access with this context? If so, allow.
    Or, in pseudo-code:
allow = !(role_a.deny(context) || role_b.deny(context) || ...) && (role_a.allow(context) || role_b.allow(context) || ...)

For answering this access question, there is no reason to combine individual options across roles, each role can be considered atomically. I think this could be extended to policies so that policies can be considered atomically without combining options across policies

Things get more complicated, and you may need to combine options across roles/policies, when you start asking other questions. Questions like:

  • Does identity_a require per-session MFA when accessing resource_b?
  • Should identity_a be allowed to do port-forwarding while accessing resource_b?
  • What should be the maximum TTL of a new certificate for identity_a?
  • Should new certificates for identity_a be subject to IP-pinning?

I think we probably need to enumerate all of these questions and ask ourselves

  1. When will this question be asked? E.g. when checking access, when deciding to prompt for MFA or not, when generating a new cert, when enumerating all possible logins for display in the UI?
  2. What context will be available when this question is asked?
  3. Will all policies be combined, or only ones "matching" a certain resource?
  4. Will answering this question require an independent predicate/Z3 expression from an allow/deny access expression, or can this question be transformed to an option on the access question?
  5. Ultimately, are access policies the right tool to answer this question or would be a better fit for roles, or login rules, or some other tool?

I think once we answer those ^ we'll know what context we have and whether the logic all needs to be combined into a single expression, and we can better answer the specific logical combination rules, because they really depend on the context of the question.

@klizhentas
Copy link
Collaborator

klizhentas commented Nov 16, 2022

For answering this access question, there is no reason to combine individual options across roles, each role can be considered atomically. I think this could be extended to policies so that policies can be considered atomically without combining options across policies

I don't think that's the case actually. Consider two policies (predicate does not use roles):

Policy A:

options: 
   max_session_ttl < 1h
 allow:
     access_node: logins == root

policy b:

options: 
   max_session_ttl < 10m
 allow:
     access_node: logins == root

If you only consider policy a, Alice would be allowed to get a cert with logins root for 1 hour. That is not the case today, because Alice will be only allowed to get a cert with root for 10 minutes, as limited by the option.

Our goal today is to stick to backwards compatibility and use the safest default. In this case < 10m is the safest default that is applied across the policy set.

Regarding all your other questions, can you please present them with specific examples we use today. I'm not sure I understand all other questions well enough to answer them.

@klizhentas
Copy link
Collaborator

It is possible to answer these questions with predicate in Z3 and without it in Golang it today, because all the parameters are known.

  • Does identity_a require per-session MFA when accessing resource_b?
  • Should identity_a be allowed to do port-forwarding while accessing resource_b?
  • What should be the maximum TTL of a new certificate for identity_a?
  • Should new certificates for identity_a be subject to IP-pinning?

Do you have any specific policy examples that you are not clear about? If you write those down in RBAC we will show a scenario in predicate that matches this behavior.

@nklaassen
Copy link

Please disregard my last comment, I was misunderstanding the cases when these options needed to be combined and thought this was a much more general problem if options may need to combine in different ways depending on the context. Essentially all my questions are answered by the fact that these options are can all basically be considered statically at the time user certificates are generated and most of these are baked into the certificate.

@nklaassen
Copy link

Some answers to current questions:

This implies that forward_agent: true is the least permissive. Is that correct?

That's not correct, unfortunately we currently choose the most permissive option here

This implies that port_forwarding: true is the least permissive. Is that correct?

same as agent forwarding

max_sessions: what's the behavior here?

Currently its min but 0 is ignored as the default value, so min > 0

enhanced_recording: what's the behavior here?

It's a union, which is the least permissive, you're right

permit_x11_forwarding: what's the behavior here?

logical OR, unfortunately we currently choose the most permissive option here

request_access: what's the behavior here?

basically its min ordered by reason, always, optional

request_prompt: what's the behavior here?

one non-empty prompt is chosen arbitrarily

max_connections: what's the behavior here?

Currently its min but 0 is ignored as the default value, so min > 0

cert_extensions: what's the behavior here?

union

@vitorenesduarte
Copy link
Author

Thanks @nklaassen. I've updated the issue description with your answers + other observations/questions. These include:

  • least-permissive defaults vs usability
  • option naming
  • non-deterministic combining in cert extensions

The information in the table now contains:

  • option name (for Teleport RBAC & suggestion for predicate)
  • default value (for Teleport RBAC & suggestion for predicate)
  • combining rule (for Teleport RBAC & suggestion for predicate)
  • references to the Go code performing the option combining
  • additional notes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants