
Need proxy support in air-gapped environment #4585

Open

hanlins opened this issue May 7, 2021 · 30 comments
Labels

  • help wanted: Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.
  • kind/documentation: Categorizes issue or PR as related to documentation.
  • kind/feature: Categorizes issue or PR as related to a new feature.
  • lifecycle/frozen: Indicates that an issue or PR should not be auto-closed due to staleness.
  • priority/backlog: Higher priority than priority/awaiting-more-evidence.
  • triage/accepted: Indicates an issue or PR is ready to be actively worked on.

Comments

@hanlins
Member

hanlins commented May 7, 2021

User Story

As an operator, I would like to add proxy setting configurations to capi providers for the air-gapped environments.

Detailed Description

In an air-gapped environment, Cluster API provider pods cannot talk to the infrastructure provider directly. In this scenario, a proxy server is typically deployed to provide connectivity and to audit the traffic that crosses the firewall. It would be ideal to have a mechanism to plumb the proxy server configuration into the Cluster API provider pods, so that they can communicate with the infrastructure.

Anything else you would like to add:
One approach I can think of is to have something like this:

HTTP_PROXY=xxx clusterctl init

The implementation should be similar to kubernetes/kubernetes#84559.
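To make the idea concrete, here is a minimal Python sketch of the kind of `${VAR:=default}` variable substitution that clusterctl performs on provider manifests. The regex, helper name, and proxy URL are hypothetical illustrations, not clusterctl's actual implementation.

```python
import os
import re

# Matches ${VAR} or ${VAR:=default} placeholders in a manifest.
PATTERN = re.compile(r"\$\{(\w+)(?::=([^}]*))?\}")

def substitute(manifest: str, env=os.environ) -> str:
    """Replace placeholders with values from the given environment,
    falling back to the inline default when the variable is unset."""
    def repl(m):
        name, default = m.group(1), m.group(2)
        return env.get(name, default if default is not None else "")
    return PATTERN.sub(repl, manifest)

# Hypothetical proxy endpoint, as would be exported before `clusterctl init`.
print(substitute('value: ${HTTP_PROXY:=""}',
                 {"HTTP_PROXY": "http://proxy.internal:3128"}))
# prints: value: http://proxy.internal:3128
```

Under this model, `HTTP_PROXY=xxx clusterctl init` only works if the provider manifests actually contain such placeholders for the proxy variables.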


/kind feature

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label May 7, 2021
@enxebre
Member

enxebre commented May 10, 2021

For such a scenario we would also want the ability to configure https_proxy and no_proxy.

We'd need to flesh out the details here: define and agree on what an air-gapped environment is, and exactly which scenarios and behaviour we want to support end to end. For example, would this be a one-shot thing, or would we want CAPI components to watch a "proxy config" and react to changes there?
I think this will probably deserve a proposal covering all the details.

@fabriziopandini
Member

fabriziopandini commented May 10, 2021

@hanlins I'm starting to think about this use case, and my main concern is that adding proxy settings can't be achieved by simple variable substitution, which is the only templating solution clusterctl supports as of today.
The only two options I can see here are:

  • to rely on different templating solutions injected in the clusterctl library
  • use mutating web hooks

Also, the ongoing work on ManagedCluster might provide some help here, but this is still TBD.
If this can help, I'm happy to chat about it.

@vincepri
Member

vincepri commented Jul 6, 2021

/milestone Next

@k8s-ci-robot k8s-ci-robot added this to the Next milestone Jul 6, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 4, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Nov 3, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@joejulian
Contributor

/reopen

We just encountered a customer that needs this, too.

It could be done through templating in cmd/clusterctl/client/repository.NewComponents with an option that contains the values for https_proxy, http_proxy, and no_proxy.

@k8s-ci-robot
Contributor

@joejulian: You can't reopen an issue/PR unless you authored it or you are a collaborator.


@dlipovetsky
Contributor

/reopen

@k8s-ci-robot
Contributor

@dlipovetsky: Reopened this issue.


@k8s-ci-robot k8s-ci-robot reopened this Feb 1, 2022
@dlipovetsky
Contributor

/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. labels Feb 2, 2022
@sbueringer
Member

/assign @ykakarap
Can you please assess whether it would be possible to extend clusterctl to inject HTTP proxy env vars into the provider manifests?

@fabriziopandini
Member

/milestone v1.2

@k8s-ci-robot k8s-ci-robot modified the milestones: Next, v1.2 Feb 11, 2022
@faiq

faiq commented Feb 14, 2022

Hey, I left a message on the #cluster-api Slack channel to no avail. :( Is it possible to get involved with the effort here? What criteria are we going to use to assess whether this is possible? I'd love to see this feature happen, so please let me know where I can help.

@ykakarap
Contributor

Catching up on the issue. Will get back soon. :)

@faiq I will take a look at this and post my findings here.

@fabriziopandini fabriziopandini added the triage/accepted Indicates an issue or PR is ready to be actively worked on. label Jul 29, 2022
@fabriziopandini fabriziopandini removed this from the v1.2 milestone Jul 29, 2022
@fabriziopandini fabriziopandini removed the triage/accepted Indicates an issue or PR is ready to be actively worked on. label Jul 29, 2022
@fabriziopandini
Member

/triage accepted
/unassign @ykakarap

@joejulian could you share how you fixed this problem (as per #4585 (comment)) so someone can pick up the work in CAPI?
/help

@k8s-ci-robot
Contributor

@fabriziopandini:
This request has been marked as needing help from a contributor.

Guidelines

Please ensure that the issue body includes answers to the following questions:

  • Why are we solving this issue?
  • To address this issue, are there any code changes? If there are code changes, what needs to be done in the code and what places can the assignee treat as reference points?
  • Does this issue have zero to low barrier of entry?
  • How can the assignee reach out to you for help?

For more details on the requirements of such an issue, please see here and ensure that they are met.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.


@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. labels Oct 3, 2022
@faiq

faiq commented Oct 3, 2022

@fabriziopandini we modify the core-components.yaml file with this kustomization overlay:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: NA
spec:
  template:
    spec:
      containers:
        - name: manager
          env:
            - name: HTTP_PROXY
              value: ${HTTP_PROXY:=""}
            - name: HTTPS_PROXY
              value: ${HTTPS_PROXY:=""}
            - name: NO_PROXY
              value: ${NO_PROXY:=""}
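To illustrate the effect of the overlay above, here is a hedged Python sketch of the same injection done programmatically, e.g. what a mutating webhook or a clusterctl extension might do to each provider Deployment. The proxy endpoints, NO_PROXY value, and helper name are made up for the example, not CAPI code.

```python
import copy

# Hypothetical proxy settings to inject into every container.
PROXY_ENV = [
    {"name": "HTTP_PROXY", "value": "http://proxy.internal:3128"},
    {"name": "HTTPS_PROXY", "value": "http://proxy.internal:3128"},
    {"name": "NO_PROXY", "value": "localhost,127.0.0.1,10.96.0.0/12"},
]

def inject_proxy_env(deployment: dict) -> dict:
    """Return a copy of a Deployment (as a plain dict) with the proxy env
    vars appended to each container, skipping names that already exist."""
    patched = copy.deepcopy(deployment)
    for c in patched["spec"]["template"]["spec"]["containers"]:
        env = c.setdefault("env", [])
        existing = {e["name"] for e in env}
        env.extend(e for e in PROXY_ENV if e["name"] not in existing)
    return patched

deploy = {"spec": {"template": {"spec": {"containers": [{"name": "manager"}]}}}}
patched = inject_proxy_env(deploy)
print([e["name"] for e in patched["spec"]["template"]["spec"]["containers"][0]["env"]])
# prints: ['HTTP_PROXY', 'HTTPS_PROXY', 'NO_PROXY']
```

The skip-if-present check matters: it lets users override individual values in their own overlays without the injector clobbering them.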

@dlipovetsky
Contributor

Sounds like we have at least 3 options, in order from "least work required" to "most work required" for our users:

  1. Include these env variables in the manifest for the core provider.
  2. Document how to add these env variables by patching the manifest, e.g. with kustomize.
  3. Document how to use a mutating webhook to set these env variables.

(In every case, users need to include information like the Pod and Service CIDRs in the NO_PROXY variable, along with fixed values like localhost, etc.)
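As a sketch of that parenthetical, here is a hypothetical helper that assembles a NO_PROXY value from the cluster's Pod and Service CIDRs plus the fixed entries. The function name, defaults, and CIDRs are assumptions for illustration, not anything CAPI ships.

```python
# Fixed entries that in-cluster traffic should always bypass the proxy for.
DEFAULT_NO_PROXY = ["localhost", "127.0.0.1", ".svc", ".cluster.local"]

def build_no_proxy(pod_cidr: str, service_cidr: str, extra=()) -> str:
    """Join the fixed entries, cluster CIDRs, and any user-supplied extras
    into a comma-separated NO_PROXY value, de-duplicated in order."""
    entries = DEFAULT_NO_PROXY + [pod_cidr, service_cidr, *extra]
    seen, out = set(), []
    for e in entries:
        if e and e not in seen:
            seen.add(e)
            out.append(e)
    return ",".join(out)

# Example CIDRs; substitute the values from your cluster configuration.
print(build_no_proxy("192.168.0.0/16", "10.96.0.0/12"))
# prints: localhost,127.0.0.1,.svc,.cluster.local,192.168.0.0/16,10.96.0.0/12
```

Whichever of the three options is chosen, some computation like this has to happen per cluster, since the CIDRs are not fixed.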

@joejulian
Contributor

@fabriziopandini I don't remember what we did (and I don't work there anymore so I can't go back and check).

@joejulian
Contributor

Sounds like we have at least 3 options, in order from "least work required" to "most work required" from our users:

1. Include these env variables in the manifest for the core provider.

2. Document how to add these env variables by patching the manifest, e.g. with kustomize.

3. Document how to use a mutating webhook to set these env variables.

(In every case, users need to include information like the Pod and Service CIDRs in the NO_PROXY variable, along with fixed values like localhost, etc.)

I think it's obvious I support 1. :)

  • Re option 2: it seems odd that we'd build this entire toolset around templating but require kustomize for this one bit.
  • Re option 3: how would the webhook be installed without the proxy config?

@fabriziopandini
Member

I agree that adding the env vars to the manifest is the simplest way forward; my only concern is that in the past we got push-back on this type of change from folks using GitOps and trying to use the yaml files directly (in fact, #3881 asks to remove all the variables we currently have).

@joejulian
Contributor

I've never been a fan of adding the complexity of templating to cluster-api à la ClusterClass, but the community felt the return was worth it. Embracing that change, I'm no longer sure where the distinction lies between one form of templating and another. Is there a way to solve this that's more in line with ClusterClass, maybe?

@sbueringer
Member

sbueringer commented Oct 5, 2022

Q: 1. Include these env variables in the manifest for the core provider.

In air-gapped environment, cluster API provider pods might be deployed in air-gapped environment, and thus cannot talk to the infrastructure provider directly.

Just for my understanding: for which connections do we need the HTTP proxy configuration?

  1. communication from CAPI to infra provider APIs (AWS,Azure,...)
  2. communication from CAPI to workload clusters
  3. both

I'm just a bit confused because the original ask was for the infra provider, but core CAPI does not access it. And having it consistently across infra providers would require agreement with the infra providers (maybe an addition to the contract).

@chrischdi
Member

  4. communication from workload clusters to endpoints (registry, internet, ...)

@sbueringer
Member

sbueringer commented Oct 6, 2022

4. communication from workload clusters to endpoints (registry, internet, ...)

That should probably be communication from the controllers / mgmt cluster to the registry/internet?

I think the issue is about setting the proxy for CAPI providers/controllers only (based on the issue description).

But based on the title it could be proxy support in general.

@joejulian
Contributor

I don't think you can add generalized proxy support. There's no standard.

@enxebre
Member

enxebre commented Jun 30, 2023

Sounds like we have at least 3 options, in order from "least work required" to "most work required" for our users:

  1. Include these env variables in the manifest for the core provider.
  2. Document how to add these env variables by patching the manifest, e.g. with kustomize.
  3. Document how to use a mutating webhook to set these env variables.

(In every case, users need to include information like the Pod and Service CIDRs in the NO_PROXY variable, along with fixed values like localhost, etc.)

Agreed, at minimum we could provide some guidance docs

/kind documentation

@k8s-ci-robot k8s-ci-robot added the kind/documentation Categorizes issue or PR as related to documentation. label Jun 30, 2023
@fabriziopandini
Member

/priority backlog

@k8s-ci-robot k8s-ci-robot added the priority/backlog Higher priority than priority/awaiting-more-evidence. label Apr 12, 2024