Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restrict Azure NSG rules to allow external access only to load balancer IP #54177

Merged

Conversation

itowlson
Copy link
Contributor

@itowlson itowlson commented Oct 18, 2017

What this PR does / why we need it: On Azure, we create NSG (Network Security Group) rules on the vnet to allow external clients to access services exposed as type LoadBalancer. At the moment, these rules have a destination of Any, which means that they will permit requests on the opened port to any IP within the vnet. This PR restricts the security rules so that they admit external access only to the load balancer IP.

Which issue this PR fixes: None in upstream - reported as Azure/acs-engine#1619

Special notes for your reviewer: None

Release note:

Azure NSG rules for services exposed via external load balancer 
now limit the destination IP address to the relevant front end load 
balancer IP.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Oct 18, 2017
@k8s-ci-robot
Copy link
Contributor

Hi @itowlson. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Oct 18, 2017
@itowlson
Copy link
Contributor Author

/assign @colemickens

@colemickens
Copy link
Contributor

/sig azure

cc: @jdumars @brendandburns @anhowe to possibly suggest an Azure employee to review.

If no one from Azure takes a look soon, I can do a closer review - change looks okay from a quick skim though.

@seanknox
Copy link

seanknox commented Oct 19, 2017

/assign @jackfrancis

Hey Jack, can you give this a review?

@k8s-ci-robot
Copy link
Contributor

@seanknox: GitHub didn't allow me to assign the following users: jackfrancis.

Note that only kubernetes members can be assigned.

In response to this:

/assign @jackfrancis

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Copy link

@seanknox seanknox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for me, though want to have @jackfrancis sign off too. @jdumars this is a fix to shore up expected cloud provider behavior that we'll want to land in v1.8.2 if possible.

@seanknox
Copy link

/ok-to-test

@k8s-ci-robot k8s-ci-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Oct 19, 2017
@@ -299,7 +299,7 @@ func TestReconcileSecurityGroupNewServiceAddsPort(t *testing.T) {

sg := getTestSecurityGroup()

sg, _, err := az.reconcileSecurityGroup(sg, testClusterName, &svc1, true)
sg, _, err := az.reconcileSecurityGroup(sg, testClusterName, &svc1, to.StringPtr("13.70.140.150"), true)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this an acceptable use of a hard-coded IP?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's in unit tests, so it is never actually used as an IP address. But I will change it to something meaningless / invalid - the tests will still work.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would use an rfc 1918 address.

if err != nil {
return nil, err
}
sg, sgNeedsUpdate, err := az.reconcileSecurityGroup(sg, clusterName, service, lbIP, true /* wantLb */)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does locking down to the lbIP affect the delivery of the external IP to the node? The LB is set to "enableFloatingIP": true, and I believe this drops the external IP address on the box.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tested the following scenarios:

  • Single node so the request is guaranteed to land on the right node
  • Multiple nodes so kubeproxy has to forward the request to the right node
  • Initiating a request from within the cluster to exercise the Node->LB->Node route

All of these are working, though I've only tested interactively.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From our conversation I think this addresses the routing cases that you wanted to check - let me know if I'm still not covering the bases!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also check to ensure that the source IP preservation still works with this:

https://kubernetes.io/docs/tutorials/services/source-ip/

Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran a test using the source-ip-app tool from the page you linked. Without the externalTrafficPolicy: Local setting, the tool reported client_address=10.244.0.1. With the externalTrafficPolicy: Local setting, the tool reported client_address=167.220.242.83. So I think source IP preservation is still okay. Thanks for flagging it!

(I'm still not sure how to automate this testing though. It feels like our testing matrix is getting more complex by the day and ad hoc hand testing won't cut it forever...)

@lachie83
Copy link
Member

Hey @colemickens I would like to understand if changing from ANY to the LB destination IP in the NSG rule will break anything. Can you shed some light on why it was originally set to ANY?

@colemickens
Copy link
Contributor

@lachie83 Embarrassingly, I think it was just a result of not thinking about it very closely. I would feel more comfortable if it was a reference to the PIP rather than the PIP's current address, but I can't imagine any reason the PIP's IP would ever be released/renewed and be different. It seems like overall this change is safe and appropriate.

The only concern is backward compatibility, but even then I can't think of much. Just a possibility that manual action may be required to re-write the rules, for example, if the reconciliation code doesn't look at every field in terms of making sure all necessary rules are in place.

@jdumars
Copy link
Member

jdumars commented Oct 24, 2017

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 24, 2017
@lachie83
Copy link
Member

@colemickens thanks for responding and don't be embarrassed at all. I just wanted to know if there was a reason why so that we didn't break anything. This needs some real use-case testing for both north-south (off to on-cluster) traffic paths as well as east/west given how this provider works.

if sgNeedsUpdate {
glog.V(3).Infof("ensure(%s): sg(%s) - updating", serviceName, *sg.Name)
// azure-sdk-for-go introduced contraint validation which breaks the updating here if we don't set these
// to nil. This is a workaround until https://github.com/Azure/go-autorest/issues/112 is fixed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this has been fixed? Can you verify we're on the updated SDK and then remove this workaround?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 (there's a number of places with this workaround, they should all be removable.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed that we are on azure-sdk-for-go 10.0.4-beta and go-autorest 8.0.0. According to Azure/go-autorest#112 (comment), the fix was merged prior to in go-autorest 8.0.0-beta. Will remove the workaround.

@k8s-github-robot k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 31, 2017
@feiskyer
Copy link
Member

/retest

@rjtsdl
Copy link

rjtsdl commented Nov 5, 2017

I personally prefer to bind NSG rule with PIP, rather than PIP's current IP address.

@rjtsdl
Copy link

rjtsdl commented Nov 8, 2017

@itowlson Ivan, are we still pushing this PR? If it cannot make the timeline of 1.9, let's hold on this for now. As we have quite a big change in the azure cloud provider folder.

@brendandburns
Copy link
Contributor

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 8, 2017
@jdumars
Copy link
Member

jdumars commented Nov 8, 2017

/approve

@k8s-github-robot
Copy link

/test all

Tests are more than 96 hours old. Re-running tests.

@jdumars
Copy link
Member

jdumars commented Nov 8, 2017

/approve no-issue

@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: brendandburns, itowlson, jdumars

Associated issue requirement bypassed by: jdumars

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 8, 2017
@k8s-github-robot
Copy link

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot
Copy link

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit 93ef57a into kubernetes:master Nov 9, 2017
@kevinkim9264
Copy link
Contributor

Will this patch be back-ported? or be released in next minor release by any chance?

@jdumars
Copy link
Member

jdumars commented Nov 9, 2017

@kevinkim9264 I added the candidate label, and we can see if it gets CP'd by someone.

@kevinkim9264
Copy link
Contributor

@jdumars thank you!

@k8s-cherrypick-bot
Copy link

Removing label cherrypick-candidate because no release milestone was set. This is an invalid state and thus this PR is not being considered for cherry-pick to any release branch. Please add an appropriate release milestone and then re-add the label.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet