Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add restricted session #7099

Merged
merged 2 commits into from
Jul 16, 2021
Merged

Conversation

eyakubovich
Copy link
Contributor

Adds the ability to block network traffic on SSH sessions.
The deny/allow lists of IPs are specified in teleport.yaml file.
Supports both IPv4 and IPv6 communication.

This feature currently relies on enhanced recording for
cgroup management so that needs to be enabled as well.

-- Design rationale:
This patch uses Linux Security Module (LSM) hooks, specifically
security_socket_connect and security_socket_sendmsg, to control
egress traffic. The LSM provides two advantages over socket filtering
program types.

  • It's executed early enough that the task information is available.
    This makes it easy to report PID, COMM, etc.
  • It becomes a model for extending restrictions beyond networking.

The set of enforced cgroups is stored in a BPF hash map and the
deny/allow lists are stored in BPF trie maps. An IP address is
first checked against the deny list. If found, it's checked for
an override in the allow list.

IPv4 addresses are additionally registered in IPv6 trie (as mapped)
to account for dual stacks. However it is unclear if this is sufficient
as 4-to-6 transition methods utilize a multitude of translation and
tunneling methods.

Copy link
Contributor

@klizhentas klizhentas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have a couple of notes on api versioning and resource interfaces. Haven't reviewed the actual BPF implementation, but otherwise API, UX looks good.

}

// SetNetworkRestrictions updates the network restrictions
func (c *Client) SetNetworkRestrictions(ctx context.Context, nr *types.NetworkRestrictions) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For all setters/getters lets create an interface wrapper similar to https://github.com/gravitational/teleport/blob/master/api/types/role.go#L34 and pass it here. Let's change the API to use:

https://github.com/gravitational/teleport/blob/master/api/client/client.go#L964

This lets us to convert V3/V4 internally without changing all the methods in case of simple version updates.


// NewNetworkRestrictions creates a new NetworkRestrictions with the given name.
func NewNetworkRestrictions() *NetworkRestrictions{
return &NetworkRestrictions{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be V4 resource (V is the version of the API, not the version of the resource). @russjones we should explain it somewhere, I see other folks being confused as well.

V in resources is a version of the API. We released roles V4, therefore all new resources will be created with V4 version. Not ideal, but exists historic reasons.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem, will fix. That said, sorry, this makes little sense. If the whole API version is bumped, wouldn't you have to rename all existing resources to the new version suffix by that logic?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally - yes, we'd bump the version API for all resources. However this would have required renaming every time we updated a single resource, so by convention unchanged resources of V3 are also V4 if the latest API version is V4.

Confusing, I know, we should improve in the future or at least explain cc @russjones @Joerger

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just version the resources, not the whole API? If you're versioning the whole API, I suppose it should be reflected in the name of the service (https://github.com/gravitational/teleport/blob/master/api/client/proto/authservice.proto#L889).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry to add to the confusion here, but from my understanding the Version on the resource corresponds to the Resource version, not the API version.

Additionally, the API version is going to follow lock step with the Teleport version, so having the resource version tied to API version won't make much sense going forward.

@klizhentas I think that this may have warped from its original conception over time, partially due to the API changes I've been working on. Changes had to be made to fix version validation consistency issues. The Version and Version suffix are now always equal, except during version to version migrations (RoleV3 -> RoleV4 is an ongoing migration). Also, some new resources we've added are V1 or V2, such as ClusterAuditConfigV2 and ClusterNetworkingConfigV2.

I agree that we should come up with and document a clear guideline around resource versions, they were also unclear to new contributors such as myself. For now, NetworkRestrictions could be made either V4 or V1 without any drawbacks.

string CIDR = 1 [ (gogoproto.jsontag) = "cidr" ];
}

message NetworkRestrictionsSpec {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SpecV4

@eyakubovich eyakubovich force-pushed the ey/add-access-ctl branch 2 times, most recently from f652e3c to cbcab03 Compare July 2, 2021 21:23
return nil
}

func (r *NetworkRestrictionsV4) GetKind() string { return r.Kind }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Convention in the code base is to not put put these on a single line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

char LICENSE[] SEC("license") = "Dual BSD/GPL";

// Keeps the set of cgroups which are enforced
BPF_HASH(restricted_cgroups, u64, u8, 8192);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update BPF_HASH, BPF_LPM_TRIE, and BPF_RING_BUF sizes to be constants?

@@ -338,6 +338,7 @@ func GetCheckerForBuiltinRole(clusterName string, recConfig types.SessionRecordi
types.NewRule(types.KindSessionRecordingConfig, services.RO()),
types.NewRule(types.KindClusterAuthPreference, services.RO()),
types.NewRule(types.KindSemaphore, services.RW()),
types.NewRule(types.KindNetworkRestrictions, services.RW()),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does the node need RW permissions to this resource? If my reading is correct, it's a singleton resource created using tctl commands and nodes only need to read from it. So services.RO() should be sufficent?


// Parse will parse the enhanced session recording configuration.
func (r *RestrictedSession) Parse() (*restricted.Config, error) {
enabled, _ := apiutils.ParseBool(r.Enabled)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the error being dropped?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great question, don't remember. I'll fix.

var hdr auditEventHeader
err := binary.Read(buf, binary.LittleEndian, &hdr)
if err != nil {
return auditEventHeader{}, fmt.Errorf("Corrupt event header: %v", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trace.BadParameter

// not in CIDR format, try as a plain IP
ip := net.ParseIP(cidr)
if ip == nil {
return nil, fmt.Errorf("%q is not an IP nor CIDR", cidr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trace.BadParameter

if err = resetNetworkRestrictions(ctx, client); err != nil {
return trace.Wrap(err)
}
fmt.Printf("network restrictions have been reset to defaults\n")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default is allow all right? So maybe network restrictions have been reset to defaults (allow all) just to the user understand the implications of what was done.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct, will update

ev.hdr.cgroup = cg;
ev.hdr.pid = (u32) (bpf_get_current_pid_tgid() >> 32);
ev.hdr.type = BLOCKED_IPV6;
bpf_get_current_comm(&ev.hdr.task, sizeof(ev.hdr.task));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: Indentation issue.

ev.hdr.cgroup = cg;
ev.hdr.pid = (u32) (bpf_get_current_pid_tgid() >> 32);
ev.hdr.type = BLOCKED_IPV4;
bpf_get_current_comm(&ev.hdr.task, sizeof(ev.hdr.task));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: Indentation issue.

Comment on lines +67 to +92
// newNetworkAuditEvent creates events.SessionNetwork, filling in common fields
// from the SessionContext
func newNetworkAuditEvent(ctx *bpf.SessionContext, hdr *auditEventHeader) events.SessionNetwork {
return events.SessionNetwork{
Metadata: events.Metadata{
Type: api.SessionNetworkEvent,
Code: api.SessionNetworkCode,
},
ServerMetadata: events.ServerMetadata{
ServerID: ctx.ServerID,
ServerNamespace: ctx.Namespace,
},
SessionMetadata: events.SessionMetadata{
SessionID: ctx.SessionID,
},
UserMetadata: events.UserMetadata{
User: ctx.User,
Login: ctx.Login,
},
BPFMetadata: events.BPFMetadata{
CgroupID: hdr.CGroupID,
Program: bpf.ConvertString(unsafe.Pointer(&hdr.Command)),
PID: uint64(hdr.PID),
},
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@klizhentas You're reviewed the event structure?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This reuses the existing event but added a Restriction field to indicate if it was allowed or denied. Enhanced Recording reports everything as ALLOWED and Restricted Session as DENIED. This causes an oddity in the UI. A denied action will end up with two audit log entries that look identical on the dashboard. One coming from the Enhanced Recording and one from Restricted Session. Furthermore, if you look at JSON details, the first one will show up as Restriction=ALLOWED and the second one as Restriction=DENIED 🤷‍♂️

Comment on lines 2252 to 2257
// Allow lists the addresses that should be allowed.
repeated AddressCondition Allow = 5
[ (gogoproto.nullable) = false, (gogoproto.jsontag) = "allow" ];
// Deny lists the addresses that should be denied even if they're allowed by Allow condition.
repeated AddressCondition Deny = 6
[ (gogoproto.nullable) = false, (gogoproto.jsontag) = "deny" ];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason these are 5 and 6 instead of 1 and 2?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. There was a previous version with more members. Should be 1 and 2 now.

@@ -0,0 +1,2043 @@
syntax = "proto3";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this file was accidentally checked in.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😱

@eyakubovich eyakubovich force-pushed the ey/add-access-ctl branch 2 times, most recently from c235438 to 888153b Compare July 15, 2021 17:32
@eyakubovich eyakubovich force-pushed the ey/add-access-ctl branch 3 times, most recently from e639d96 to 459e9a4 Compare July 16, 2021 22:27
Adds the ability to block network traffic on SSH sessions.
The deny/allow lists of IPs are specified in teleport.yaml file.
Supports both IPv4 and IPv6 communication.

This feature currently relies on enhanced recording for
cgroup management so that needs to be enabled as well.

-- Design rationale:
This patch uses Linux Security Module (LSM) hooks, specifically
security_socket_connect and security_socket_sendmsg, to control
egress traffic. The LSM provides two advantages over socket filtering
program types.
- It's executed early enough that the task information is available.
  This makes it easy to report PID, COMM, etc.
- It becomes a model for extending restrictions beyond networking.

The set of enforced cgroups is stored in a BPF hash map and the
deny/allow lists are stored in BPF trie maps. An IP address is
first checked against the allow list. If found, it's checked for
an override in the deny list. The policy is default deny. However,
the absence of the NetworkRestrictions API object is allow all.

IPv4 addresses are additionally registered in IPv6 trie (as mapped)
to account for dual stacks. However it is unclear if this is sufficient
as 4-to-6 transition methods utilize a multitude of translation and
tunneling methods.
@russjones russjones merged commit 67c0eb3 into gravitational:master Jul 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants