Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OCPBUGS-28637: Update cluster-wide proxy Risks and Mitigations for Windows nodes. #1610

Merged
merged 1 commit into from Apr 25, 2024

Conversation

jrvaldes
Copy link
Contributor

This change updates the Risks and Mitigations section in the cluster-wide proxy enhancements to set expectations on the scope of the operator's responsibilities alongside recommendations to enable traffic through proxy by default for Powershell sessions.

@openshift-ci-robot openshift-ci-robot added jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Apr 22, 2024
@openshift-ci-robot
Copy link

@jrvaldes: This pull request references Jira Issue OCPBUGS-28637, which is invalid:

  • expected the bug to target the "4.16.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

This change updates the Risks and Mitigations section in the cluster-wide proxy enhancements to set expectations on the scope of the operator's responsibilities alongside recommendations to enable traffic through proxy by default for Powershell sessions.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@jrvaldes
Copy link
Contributor Author

/jira refresh

@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Apr 22, 2024
@openshift-ci-robot
Copy link

@jrvaldes: This pull request references Jira Issue OCPBUGS-28637, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.16.0) matches configured target version for branch (4.16.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @rrasouli

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

* Maintain normal functionality in non-proxied clusters

### Non-Goals

* First-class support/enablement of proxy utilization for user-provided applications
* *ingress* and reverse proxy settings are out of scope
* Monitor cert expiration dates or automatically replace expired CAs in the cluster's trust bundle
* Windows workloads created in Windows nodes with cluster-wide proxy enabled do not inherit proxy settings from the
node. This is default behavior on Linux Nodes side as well.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the default behavior for Linux Nodes as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed.

@@ -110,6 +116,24 @@ Although cluster infra resources already do a best effort validation on the user
a user could provide non-functional proxy settings/certs. This would be propagated to their Windows nodes and workloads,
taking down existing application connectivity and preventing new Windows nodes from being bootstrapped.

To enable global HTTP proxy by default on Windows nodes with cluster-wide proxy for all PowerShell's sessions,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is doing a very good job at sharing the mitigation, but not the risk.
For what purpose would they do this? Please add if this has any impact on the functionality as a Node in the cluster.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not following here.

This is a mitigation for a non-goal, and the impact is indeed the traffic of the Powershell session going via proxy. The latter does not introduce any additional risk.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jrvaldes this section is for risk and mitigation, the risk needs to be explained first followed by its mitigation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the mitigation implies the risk but it might be useful to call it out directly:

If users use Windows images with PowerShell as the default shell, then there is a risk that outbound traffic from the PowerShell CLI wouldn't go through the cluster-wide proxy by default. Although this does not affect WMCO/OpenShift's view of the Node, this is different on-instance behavior than an admin would see if they were using CMD Prompt. To mitigate this...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed.

Powershell sessions that reads the proxy environment variables maintained by WMCO and populate the `DefaultWebProxy` property:
```powershell
Set-Content -Path $PROFILE.AllUsersCurrentHost -Value '$proxyValue=[Environment]::GetEnvironmentVariable("HTTP_PROXY", "Process")' -Force
Add-Content -Path $PROFILE.AllUsersCurrentHost -Value '[System.Net.WebRequest]::DefaultWebProxy = New-Object System.Net.Webproxy("$proxyValue")' -Force
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we also want to provide powershell guidance for HTTPS/custom certs, we can point to using this constructor when creating the System.Net.Webproxy object

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed.

where `PROXY_URL` is the URI of the cluster-wide proxy.

Run the following commands in the Windows Node with cluster-wide proxy enabled to create a default profile for the
Powershell sessions that reads the proxy environment variables maintained by WMCO and populate the `DefaultWebProxy` property:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should mention the example commands below are only for HTTP_PROXY and this needs to be done for all proxy variables.

@@ -59,13 +59,19 @@ requests to off-cluster services, and those that use a
+ [Additional certificate authorities](https://docs.openshift.com/container-platform/4.12//rest_api/config_apis/proxy-config-openshift-io-v1.html#spec-trustedca) required to validate the proxy's certificate
* Configure the proxy settings in WMCO-managed components on Windows nodes (kubelet, containerd runtime)
* React to changes to the cluster-wide proxy settings during WMCO runtime
* Synchronize environment variables `NO_PROXY`, `HTTP_PROXY`, and `HTTPS_PROXY` on Windows nodes with cluster-wide
proxy settings in both shells _CMD_ and _PowerShell_
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel that this point conflicts with the PowerShell non-goals: PowerShell's shell sessions do not inherit proxy settings by default on Windows nodes. Additionally, do we specifically do anything for it to be synchronized in both shells?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-worded it for more clarity, but it does not. This refers to the environment variables.

The PowerShell non-goals, cover enabling proxy by default in Powershell session, where the environment variables should exist, as per goals.

* Windows workloads created in Windows nodes with cluster-wide proxy enabled do not inherit proxy settings from the
node. This is the default behavior on Linux Nodes side as well.
* PowerShell's shell sessions do not inherit proxy settings by default on Windows nodes with cluster-wide proxy enabled
(See [Risks and Mitigations](#risks-and-mitigations) section for recommendations)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This link is broken for me, can you double-check?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mansikulkarni96
Copy link
Member

/approve

@jrvaldes
Copy link
Contributor Author

jrvaldes commented Apr 24, 2024

/cc @mburke5678 @rrasouli

for docs and QE

@jrvaldes
Copy link
Contributor Author

/cc @mandre

@openshift-ci openshift-ci bot requested a review from mandre April 24, 2024 16:44
Copy link
Contributor

@saifshaikh48 saifshaikh48 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Apr 24, 2024
@mburke5678
Copy link

/lgtm

@@ -110,6 +116,29 @@ Although cluster infra resources already do a best effort validation on the user
a user could provide non-functional proxy settings/certs. This would be propagated to their Windows nodes and workloads,
taking down existing application connectivity and preventing new Windows nodes from being bootstrapped.

In case users use Windows images with PowerShell as the default shell, then there is a risk that outbound traffic
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please clarify if Windows images refers to a VM image, or container image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, I replaced images with nodes to avoid confusion as the proxy configuration for Windows Workloads is out of the scope for this.

@sebsoto
Copy link
Contributor

sebsoto commented Apr 24, 2024

LGTM other than the above comment

This change updates the Risks and Mitigations section in the
cluster-wide proxy enhancements to set expectations on the
scope of the operator's responsibilities alongside
recommendations to enable traffic through proxy by default
for Powershell sessions.
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Apr 24, 2024
@sebsoto
Copy link
Contributor

sebsoto commented Apr 24, 2024

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Apr 24, 2024
@mandre
Copy link
Member

mandre commented Apr 25, 2024

Sorry, I'm not the most qualified when it comes to windows containers. Would it be possible to add your team lead (@sebsoto ?) as an approver of this repo?
/unassign mandre

@jrvaldes
Copy link
Contributor Author

jrvaldes commented Apr 25, 2024

Sorry, I'm not the most qualified when it comes to windows containers. Would it be possible to add your team lead (@sebsoto ?) as an approver of this repo? /unassign mandre

@mandre Thanks for the recommendation; I think that is a good idea.

In the interim, the team lead LGTM'ed already. Could that suffice for you to \approve?

@jrvaldes
Copy link
Contributor Author

/cc @mrunalp

@openshift-ci openshift-ci bot requested a review from mrunalp April 25, 2024 12:16
Copy link
Contributor

openshift-ci bot commented Apr 25, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mansikulkarni96, mrunalp

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 25, 2024
Copy link
Contributor

openshift-ci bot commented Apr 25, 2024

@jrvaldes: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-merge-bot openshift-merge-bot bot merged commit 70969ab into openshift:master Apr 25, 2024
2 checks passed
@openshift-ci-robot
Copy link

@jrvaldes: Jira Issue OCPBUGS-28637: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-28637 has been moved to the MODIFIED state.

In response to this:

This change updates the Risks and Mitigations section in the cluster-wide proxy enhancements to set expectations on the scope of the operator's responsibilities alongside recommendations to enable traffic through proxy by default for Powershell sessions.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants