Restrict Azure NSG rules to allow external access only to load balancer IP #54177
Hi @itowlson. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/assign @colemickens
/sig azure cc: @jdumars @brendandburns @anhowe to possibly suggest an Azure employee to review. If no one from Azure takes a look soon, I can do a closer review - change looks okay from a quick skim though.
/assign @jackfrancis Hey Jack, can you give this a review?
@seanknox: GitHub didn't allow me to assign the following users: jackfrancis. Note that only kubernetes members can be assigned. In response to this:
LGTM for me, though want to have @jackfrancis sign off too. @jdumars this is a fix to shore up expected cloud provider behavior that we'll want to land in v1.8.2 if possible.
/ok-to-test
@@ -299,7 +299,7 @@ func TestReconcileSecurityGroupNewServiceAddsPort(t *testing.T) {

 	sg := getTestSecurityGroup()

-	sg, _, err := az.reconcileSecurityGroup(sg, testClusterName, &svc1, true)
+	sg, _, err := az.reconcileSecurityGroup(sg, testClusterName, &svc1, to.StringPtr("13.70.140.150"), true)
Is this an acceptable use of a hard-coded IP?
It's in unit tests, so it is never actually used as an IP address. But I will change it to something meaningless / invalid - the tests will still work.
I would use an RFC 1918 address.
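To make the suggestion concrete, here is a minimal sketch of how a test fixture address could be checked against the three RFC 1918 private ranges. The helper name `isRFC1918` is hypothetical and is not part of the PR; only the standard `net` package is used.

```go
package main

import (
	"fmt"
	"net"
)

// rfc1918Blocks lists the three private IPv4 ranges defined by RFC 1918.
var rfc1918Blocks = []string{"10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16"}

// isRFC1918 (hypothetical helper) reports whether addr parses as an IP
// address that falls inside one of the RFC 1918 ranges.
func isRFC1918(addr string) bool {
	ip := net.ParseIP(addr)
	if ip == nil {
		return false // not a valid IP literal
	}
	for _, cidr := range rfc1918Blocks {
		_, block, err := net.ParseCIDR(cidr)
		if err != nil {
			continue // cannot happen for the constants above
		}
		if block.Contains(ip) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isRFC1918("13.70.140.150")) // false: the address used in the test is public
	fmt.Println(isRFC1918("10.0.0.4"))      // true: a private address suitable for a fixture
}
```

A private address in the fixture makes it obvious to a reader that the value is a placeholder and not a real endpoint.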
if err != nil {
	return nil, err
}
sg, sgNeedsUpdate, err := az.reconcileSecurityGroup(sg, clusterName, service, lbIP, true /* wantLb */)
Does locking down to the lbIP affect the delivery of the external IP to the node? The LB is set to "enableFloatingIP": true, and I believe this drops the external IP address on the box.
I've tested the following scenarios:
- Single node so the request is guaranteed to land on the right node
- Multiple nodes so kubeproxy has to forward the request to the right node
- Initiating a request from within the cluster to exercise the Node->LB->Node route
All of these are working, though I've only tested interactively.
From our conversation I think this addresses the routing cases that you wanted to check - let me know if I'm still not covering the bases!
Can you also check to ensure that the source IP preservation still works with this:
https://kubernetes.io/docs/tutorials/services/source-ip/
Thanks!
I ran a test using the `source-ip-app` tool from the page you linked. Without the `externalTrafficPolicy: Local` setting, the tool reported `client_address=10.244.0.1`. With the `externalTrafficPolicy: Local` setting, the tool reported `client_address=167.220.242.83`. So I think source IP preservation is still okay. Thanks for flagging it!
(I'm still not sure how to automate this testing though. It feels like our testing matrix is getting more complex by the day and ad hoc hand testing won't cut it forever...)
Hey @colemickens I would like to understand if changing from ANY to the LB destination IP in the NSG rule will break anything. Can you shed some light on why it was originally set to ANY?
@lachie83 Embarrassingly, I think it was just a result of not thinking about it very closely. I would feel more comfortable if it was a reference to the PIP rather than the PIP's current address, but I can't imagine any reason the PIP's IP would ever be released/renewed and be different. It seems like overall this change is safe and appropriate. The only concern is backward compatibility, but even then I can't think of much. Just a possibility that manual action may be required to re-write the rules, for example, if the reconciliation code doesn't look at every field in terms of making sure all necessary rules are in place.
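The backward-compatibility concern above can be illustrated with a small sketch. It assumes a simplified, hypothetical `rule` type and `reconcile` function (these are not the provider's actual types) and shows that an existing rule is only rewritten if the comparison includes the destination field.

```go
package main

import "fmt"

// rule is a hypothetical, simplified stand-in for an NSG security rule.
type rule struct {
	Name        string
	Port        int
	Destination string // "*" is assumed here to mean "Any"
}

// reconcile returns the rule set after applying the desired rules, plus a flag
// saying whether anything changed. Rules are matched by name and then compared
// on every field, so a rule that differs only in Destination is replaced.
func reconcile(existing, desired []rule) ([]rule, bool) {
	out := append([]rule(nil), existing...) // copy so the input is not mutated
	dirty := false
	for _, want := range desired {
		found := false
		for i, have := range out {
			if have.Name != want.Name {
				continue
			}
			found = true
			if have != want { // struct equality compares all fields
				out[i] = want
				dirty = true
			}
			break
		}
		if !found {
			out = append(out, want)
			dirty = true
		}
	}
	return out, dirty
}

func main() {
	existing := []rule{{Name: "svc-TCP-80", Port: 80, Destination: "*"}}
	desired := []rule{{Name: "svc-TCP-80", Port: 80, Destination: "203.0.113.10"}}

	updated, dirty := reconcile(existing, desired)
	fmt.Println(updated[0].Destination, dirty) // 203.0.113.10 true

	_, dirty = reconcile(updated, desired)
	fmt.Println(dirty) // false: a second pass finds nothing to change
}
```

If the comparison looked only at the name and port, the old rule with the wildcard destination would be treated as already correct, which is the case where manual action would be needed.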
/lgtm
@colemickens thanks for responding and don't be embarrassed at all. I just wanted to know if there was a reason why so that we didn't break anything. This needs some real use-case testing for both north-south (off to on-cluster) traffic paths as well as east/west given how this provider works.
if sgNeedsUpdate {
	glog.V(3).Infof("ensure(%s): sg(%s) - updating", serviceName, *sg.Name)
	// azure-sdk-for-go introduced contraint validation which breaks the updating here if we don't set these
	// to nil. This is a workaround until https://github.com/Azure/go-autorest/issues/112 is fixed
I think this has been fixed? Can you verify we're on the updated SDK and then remove this workaround?
+1 (there are a number of places with this workaround; they should all be removable.)
Confirmed that we are on azure-sdk-for-go 10.0.4-beta and go-autorest 8.0.0. According to Azure/go-autorest#112 (comment), the fix was merged prior to go-autorest 8.0.0-beta. Will remove the workaround.
/retest
I personally prefer to bind the NSG rule to the PIP, rather than to the PIP's current IP address.
@itowlson Ivan, are we still pushing this PR? If it cannot make the 1.9 timeline, let's hold off on this for now, as we have quite a big change in the azure cloud provider folder.
/lgtm
/approve
/test all Tests are more than 96 hours old. Re-running tests.
/approve no-issue
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: brendandburns, itowlson, jdumars Associated issue requirement bypassed by: jdumars The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing
/test all [submit-queue is verifying that this PR is safe to merge]
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.
Will this patch be back-ported, or released in the next minor release by any chance?
@kevinkim9264 I added the candidate label, and we can see if it gets CP'd by someone.
@jdumars thank you!
Removing label
What this PR does / why we need it: On Azure, we create NSG (Network Security Group) rules on the vnet to allow external clients to access services exposed as type LoadBalancer. At the moment, these rules have a destination of Any, which means that they will permit requests on the opened port to any IP within the vnet. This PR restricts the security rules so that they admit external access only to the load balancer IP.
Which issue this PR fixes: None in upstream - reported as Azure/acs-engine#1619
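The effect of the change described under "What this PR does" can be sketched as follows. This is an illustration under stated assumptions, not the provider code: the `inboundRule` type, `newInboundRule`, and `admits` are hypothetical, the literal `"*"` is assumed to stand for the Any destination, and both addresses are placeholders.

```go
package main

import "fmt"

// anyAddress stands in for the "Any" destination; the literal "*" is an
// assumption of this sketch.
const anyAddress = "*"

// inboundRule is a hypothetical, simplified model of an NSG allow rule.
type inboundRule struct {
	DestinationPort   int
	DestinationPrefix string // anyAddress or a single IP
}

// newInboundRule builds the allow rule for a service port. A nil lbIP
// reproduces the old behaviour (destination Any); a non-nil lbIP restricts
// the rule to that address.
func newInboundRule(port int, lbIP *string) inboundRule {
	dest := anyAddress
	if lbIP != nil {
		dest = *lbIP
	}
	return inboundRule{DestinationPort: port, DestinationPrefix: dest}
}

// admits reports whether the rule allows traffic addressed to dstIP on port.
func (r inboundRule) admits(dstIP string, port int) bool {
	if r.DestinationPort != port {
		return false
	}
	return r.DestinationPrefix == anyAddress || r.DestinationPrefix == dstIP
}

func main() {
	lbIP := "203.0.113.10" // documentation-range placeholder for the load balancer IP
	nodeIP := "10.240.0.4" // placeholder for some other address on the vnet

	before := newInboundRule(80, nil)
	after := newInboundRule(80, &lbIP)

	fmt.Println(before.admits(nodeIP, 80), before.admits(lbIP, 80)) // true true
	fmt.Println(after.admits(nodeIP, 80), after.admits(lbIP, 80))   // false true
}
```

With the destination set to the load balancer IP, the opened port no longer admits requests addressed to other IPs in the vnet.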
Special notes for your reviewer: None
Release note: