data/aws/vpc: Restrict ICMP ingress to our security groups #2412

wking · 2019-09-26T04:33:15Z

And in the UPI CloudFormation templates too.

We've allowed ICMP ingress for OpenStack since 6f76298 (coreos/tectonic-installer#1), which did not motivate the ICMP ingress. Allowing ICMP ingress for AWS dates back to b620c16 (coreos/tectonic-installer#264). The master rule was restricted to the VPC in e7bd29a (coreos/tectonic-installer#2147). And the worker rules was restricted to the VPC in e131a74 (#1550) in-cluster ICMP ingress.

There are reasons to allow in-cluster ICMP, including Path MTU Discovery (PMTUD). Folks also use ping to [troubleshoot connectivity][4]. Restricting this to in-cluster security groups will avoid exposing ICMP ports to siblings living in shared VPCs, as we move towards allowing the installer to launch clusters in a pre-existing VPC. It might also block ICMP ingress from our load balancers, where we probably want PMTUD and possibly other ICMP calls. I'm not sure if there's a convenient way to allow access from the load-balancers while excluding it from sibling clusters that share the same VPC, but this commit is my attempt to get that.

CC @cuppett, @squeed

openshift-ci-robot · 2019-09-26T04:33:31Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: wking

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [wking]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

abhinavdahiya · 2019-09-26T07:00:39Z

personally i think ICMP is okay in the virtual network, it is an important debugging tool and would block ping from a bastion not part of the cluster.

would like to see @cuppett and @squeed 's opinion.

squeed · 2019-09-26T11:41:32Z

I'd rather not block ICMP. Blocking it for non-Internet-exposed hosts has essentially no security benefit, and comes at great cost of usability. As you identified, PMTUD (which is critical, as AWS has non-1500-MTU) is one casualty of blocking ICMP.

If, nevertheless, end-users demand this, then we should still allow the types+codes specified in http://shouldiblockicmp.com/ throughout the VPC.

It might also block ICMP ingress from our load balancers...

Fortunately, AWS is clever enough and auto-adds a SG always allowing ICMP between the VPC and a xLB.

wking · 2019-09-26T14:36:05Z

Fortunately, AWS is clever enough and auto-adds a SG always allowing ICMP between the VPC and a xLB.

This was my concern. With this in place, why would we not want this PR? To ping from a bastion, you could put your bastion in the worker group. I see no need to support ping, PMTUD, etc. between machines in our cluster and unrelated machines in the same VPC. They should be coming in through LBs like external users (although they could use internal LBs too).

And in the UPI CloudFormation templates too. We've allowed ICMP ingress for OpenStack since 6f76298 (OpenStack prototype, 2017-02-16, coreos/tectonic-installer#1), which did not motivate the ICMP ingress. Allowing ICMP ingress for AWS dates back to b620c16 (modules/aws: tighten security groups, 2017-04-19, coreos/tectonic-installer#264). The master rule was restricted to the VPC in e7bd29a (modules/aws/vpc - Better security for master nodes, 2017-10-16, coreos/tectonic-installer#2147). And the worker rules was restricted to the VPC in e131a74 (aws: fix ICMP ACL, 2019-04-08, openshift#1550), before which a typo had blocked all ICMP ingress. There are reasons to allow in-cluster ICMP, including Path MTU Discovery (PMTUD) [1,2,3]. Folks also use ping to troubleshoot connectivity [4]. Restricting this to in-cluster security groups will avoid exposing ICMP ports to siblings living in shared VPCs, as we move towards allowing the installer to launch clusters in a pre-existing VPC. It might also block ICMP ingress from our load balancers, where we probably want PMTUD and possibly other ICMP calls. I'm not sure if there's a convenient way to allow access from the load-balancers while excluding it from sibling clusters that share the same VPC, but this commit is my attempt to get that. [1]: http://shouldiblockicmp.com/ [2]: https://tools.ietf.org/html/rfc1191 [3]: https://tools.ietf.org/html/rfc8201 [4]: https://bugzilla.redhat.com/show_bug.cgi?id=1689857#c2

wking · 2019-09-26T16:13:42Z

Fortunately, AWS is clever enough and auto-adds a SG always allowing ICMP between the VPC and a xLB.

Do you have a doc reference for that? The classic, application and network LB docs make it sound like you need to set these up manually.

I've pushed a213842 -> aff01b9 to fix the tf-fmt issue.

wking · 2019-09-26T16:16:57Z

Hmm, here is Terraform saying that security_group is only valid for application load balancers and classic load balancers. So maybe our network load balancers get some auto-added magic, but we should check to see if the Kubernetes AWS cloud provider has security-group magic for its classic LBs.

abhinavdahiya · 2019-09-26T16:41:00Z

To ping from a bastion, you could put your bastion in the worker group.

i don't think the users should be required to add the bastian to worker security group, that has more privilege access... than just ping.

wking · 2019-09-26T16:58:02Z

i don't think the users should be required to add the bastian to worker security group, that has more privilege access... than just ping.

They could certainly create a new security group to allow pings. And really, I think the best test for node connectivity is oc get nodes through the Kubernetes API, falling back to SSH access. I don't see a need to support out-of-cluster pings.

jstuever · 2019-09-26T23:03:29Z

Looks like you are allowing workers to masters and masters to workers... but not masters to masters or workers to workers?

cuppett · 2019-10-01T18:18:17Z

Restricting ping-of-deaths to be an intra-cluster/SDN issue only by default has my vote. This will be a thing when we have multiple clusters in the same VPC (either UPI today or the feature tomorrow).

We don't need ping from ELBs. If allowed, I'd do it just WorkerSecurityGroup <-> WorkerSecurityGroup and Master<->Master and Master -> Worker. I wouldn't allow Worker->Master for same ping-of-death from workload killing control plane reasons. (Not sure if you can control via network policies similar kinds of things within the SDN as well for our control plane operators.)

openshift-ci-robot · 2019-10-04T15:07:03Z

@wking: The following tests failed, say /retest to rerun them all:

Test name	Commit	Details	Rerun command
ci/prow/e2e-aws	`aff01b9`	link	`/test e2e-aws`
ci/prow/e2e-aws-scaleup-rhel7	`aff01b9`	link	`/test e2e-aws-scaleup-rhel7`
ci/prow/e2e-aws-disruptive	`aff01b9`	link	`/test e2e-aws-disruptive`

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

jstuever · 2019-10-11T20:37:15Z

I might be confused here... I see masters can ping workers, and workers can ping masters. Is it somehow inherent that masters can ping masters and workers can ping workers?

IMHO it's better to implement this to improve cluster isolation even if it doesn't fix the internal cluster ping ddos issue mentioned above.

abhinavdahiya · 2019-11-06T17:46:29Z

So i think we are dropping improving isolation wrt ICMP in a network, and keeping things as-is. Unless we see reports of users requesting that.

/close

openshift-ci-robot · 2019-11-06T17:46:30Z

@abhinavdahiya: Closed this PR.

In response to this:

So i think we are dropping improving isolation wrt ICMP in a network, and keeping things as-is. Unless we see reports of users requesting that.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

openshift-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Sep 26, 2019

openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 26, 2019

openshift-ci-robot requested review from jstuever and patrickdillon September 26, 2019 04:34

wking force-pushed the awc-icmp-ingress-restriction branch from 018c446 to a213842 Compare September 26, 2019 04:34

wking force-pushed the awc-icmp-ingress-restriction branch from a213842 to aff01b9 Compare September 26, 2019 16:13

openshift-ci-robot closed this Nov 6, 2019

wking deleted the awc-icmp-ingress-restriction branch November 7, 2019 02:16

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

data/aws/vpc: Restrict ICMP ingress to our security groups #2412

data/aws/vpc: Restrict ICMP ingress to our security groups #2412

wking commented Sep 26, 2019

openshift-ci-robot commented Sep 26, 2019

abhinavdahiya commented Sep 26, 2019

squeed commented Sep 26, 2019

wking commented Sep 26, 2019

wking commented Sep 26, 2019

wking commented Sep 26, 2019

abhinavdahiya commented Sep 26, 2019

wking commented Sep 26, 2019

jstuever commented Sep 26, 2019

cuppett commented Oct 1, 2019

openshift-ci-robot commented Oct 4, 2019

jstuever commented Oct 11, 2019

abhinavdahiya commented Nov 6, 2019

openshift-ci-robot commented Nov 6, 2019

data/aws/vpc: Restrict ICMP ingress to our security groups #2412

data/aws/vpc: Restrict ICMP ingress to our security groups #2412

Conversation

wking commented Sep 26, 2019

openshift-ci-robot commented Sep 26, 2019

abhinavdahiya commented Sep 26, 2019

squeed commented Sep 26, 2019

wking commented Sep 26, 2019

wking commented Sep 26, 2019

wking commented Sep 26, 2019

abhinavdahiya commented Sep 26, 2019

wking commented Sep 26, 2019

jstuever commented Sep 26, 2019

cuppett commented Oct 1, 2019

openshift-ci-robot commented Oct 4, 2019

jstuever commented Oct 11, 2019

abhinavdahiya commented Nov 6, 2019

openshift-ci-robot commented Nov 6, 2019