[Feature] Disable network policy on the existing AKS Cluster (to allow migration to overlay) #3845

Open
denniszielke opened this issue Aug 9, 2023 · 47 comments

Comments

@denniszielke

denniszielke commented Aug 9, 2023

Tentative GA Date

July 2024

Is your feature request related to a problem? Please describe.
I want to upgrade an existing AKS cluster that uses Calico network policy to overlay, but that is not supported while network policy is enabled. That means I cannot follow the upgrade path. #3720

Describe the solution you'd like
I want an ARM API that allows deactivating network policy. Similar to #3084, but here to make the migration possible.

Describe alternatives you've considered
AFAIK, no alternatives.

This feature has been included in the v20240207 release and can be followed in the release tracker.

@denniszielke denniszielke added the feature-request Requested Features label Aug 9, 2023
@olsenme olsenme added this to In Progress (Development) in Azure Kubernetes Service Roadmap (Public) Aug 11, 2023
@olsenme
Contributor

olsenme commented Aug 11, 2023

In progress

@VincentS

Any update on this ?

@chasewilson
Contributor

> Any update on this ?

Still in progress here. Aiming at having this out in the coming months.

@cderocco5

cderocco5 commented Nov 10, 2023

Is there an estimated date when this will be complete? I have a 500-node AKS cluster and I want to disable Azure NPM to avoid putting pressure on the apiserver.

@Santhyrama

I have many workloads running on the cluster and IPs seem to be running out. Azure overlay is a good option for that, but when can we expect the feature to disable network policy on an existing CNI cluster?

@kelvin-ko

Is there any update on this feature request? We are looking for this feature so we can move multiple clusters (a few tens, actually) to CNI overlay before they hit IP exhaustion.

@Shert

Shert commented Jan 8, 2024

I'm also very interested in this new feature. I think IP exhaustion is a common problem.

@chasewilson
Contributor

@kelvin-ko @Shert We're aiming for the end of this month but could slip to next depending on a few factors out of our control.

@arsnyder16

@chasewilson How does this feature relate to enabling a network policy on an existing cluster? We currently have no network policy on a cluster but want to enable calico.

@robogatikov

> @chasewilson How does this feature relate to enabling a network policy on an existing cluster? We currently have no network policy on a cluster but want to enable calico.

@arsnyder16, yes, after this feature is complete, enabling Calico network policy on an existing cluster will be allowed.

@brianereynolds

Hi @chasewilson, will this feature allow me to change the network policy from Calico to Azure Network Policy Manager? I want to switch my existing clusters to use long-term support, but I can't (currently) do this as they are configured to use Calico.

@jblaaa-codes

Is there an update to the release schedule?

@robogatikov

> Hi @chasewilson, will this feature allow me to change the network policy from Calico to Azure Network Policy Manager? I want to switch my existing clusters to use long-term support, but I can't (currently) do this as they are configured to use Calico.

Once this feature is rolled out, you will be able to do it in 2 steps:

  1. Uninstall Calico Network Policy Manager (az aks update -n <cluster-name> -g <resource-group> --network-policy none)
  2. Install Azure Network Policy Manager (az aks update -n <cluster-name> -g <resource-group> --network-policy azure)
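
A minimal sketch of the two steps above as a script, assuming the `--network-policy` values quoted in this thread (`none`, then `azure`). The function name `switch_network_policy`, the `run` helper, the `DRY_RUN` switch, and the default names `my-rg` and `my-aks` are hypothetical and not part of the Azure CLI. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch only: switch the network policy engine in two steps (remove, then install).
# DRY_RUN, my-rg and my-aks are hypothetical; DRY_RUN=1 (the default) prints instead of running.
set -eu

run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

switch_network_policy() {
  rg="$1"; cluster="$2"; target="$3"
  # Step 1: uninstall the current policy engine.
  run az aks update -g "$rg" -n "$cluster" --network-policy none
  # Step 2: install the target policy engine (for example "azure").
  run az aks update -g "$rg" -n "$cluster" --network-policy "$target"
}

switch_network_policy "${RG:-my-rg}" "${CLUSTER:-my-aks}" "${TARGET:-azure}"
```

Setting `DRY_RUN=0` would execute the commands for real; each `az aks update` can take a long time on a running cluster, as reported later in this thread.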

@robogatikov

> Is there an update to the release schedule?

What @chasewilson said in his comment still holds true (end of January - beginning of March)

@CrunchyBlue

Do we have a release schedule when this might make it to Azure GovCloud?

@zensonic

zensonic commented Feb 23, 2024

So I tried this in west europe on a cluster and got the following. Am I misreading that this should be part of v20240207?

image

Fair enough

image

But still

image

@PixelRobots
Collaborator

I don't believe this feature has rolled out currently.

Keep an eye here for the announcement.

@zensonic

zensonic commented Feb 23, 2024

Ohh, my mistake. It just said

image

At the top of this, so I assumed. I will wait, and I have subscribed.

@fgarcia-cnb

I just tested updating network policy "in-place" on a cluster in West Central US and it worked great! The release tracker was updated today, so maybe it was just released. Same command @zensonic ran.

It didn't work in westus2, which makes sense since the "currently in operation" column in the release tracker shows it running an old version (as in @zensonic's West Europe case).

@PixelRobots
Collaborator

Yeah it looks like it is still rolling out. So once all regions are updated it should be fully out.

@amitmavgupta

It looks good. I tested upgrades from:

  • Kubenet with netpol set to "calico" to Azure CNI Overlay with netpol set to "none"
  • Azure CNI Overlay with netpol set to "none" to Azure CNI powered by Cilium with netpol set to "cilium"

@chasewilson
Contributor

I tested the following on a kubenet cluster with Calico for policies:

az aks update -g <resource-group> -n <cluster-name> --network-policy none
az aks update -g <resource-group> -n <cluster-name> --network-plugin azure --network-plugin-mode overlay

So far everything worked. However, when I tried to re-enable Calico for network policies using

az aks update -g <resource-group> -n <cluster-name> --network-policy calico

the cluster ended up in a failed state, complaining about:

plugin type="calico" failed (add): no podCidr for node
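
The error above mentions a missing podCidr on a node. As a diagnostic sketch only (the thread does not confirm the root cause), the helper below lists nodes whose `spec.podCIDR` is unset. The function name `nodes_without_podcidr` is hypothetical; the `kubectl` invocation in the comment relies on standard `custom-columns` output, where an unset field is printed as `<none>`.

```shell
#!/bin/sh
# Sketch only: print the names of nodes that have no spec.podCIDR.
# Input: "NAME PODCIDR" rows on stdin, one node per line.
nodes_without_podcidr() {
  awk '$2 == "<none>" || $2 == "" { print $1 }'
}

# Usage against a live cluster (requires kubectl access to it):
#   kubectl get nodes --no-headers \
#     -o custom-columns=NAME:.metadata.name,PODCIDR:.spec.podCIDR \
#     | nodes_without_podcidr
```

An empty result would mean every node reports a pod CIDR; a non-empty result only narrows the search and is worth including in a support ticket.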

@wedaly

@wedaly

wedaly commented Feb 26, 2024

hi all, we're still in the process of enabling this feature and will update this thread when it's ready.

@cderocco5

cderocco5 commented Feb 26, 2024

Is only the adding of a network policy still being tested? Are we able to remove a network policy now with the following?
az aks update -g <resource-group> -n <cluster-name> --network-policy none

@zensonic

zensonic commented Feb 27, 2024 via email

@fgarcia-cnb

Still doesn't work for me in westus2 either, even though the release tracker says it's complete. westcentralus does work.

@fgarcia-cnb

just started working in westus2

@zensonic

I just upgraded the cluster to 1.27.9 - still no luck in westeurope... Still on track with early march?

@chasewilson
Contributor

@zensonic, thanks for checking in. Yes, we are still on track. A second toggle rollout was required and it's in the process of rolling out. Unfortunately it doesn't have the visibility of the release tracker, but it should be out soon.

@terraboops

Thanks for your transparency and work on this!

The release tracker shows this as updated everywhere now; is that correct? If not, what's the best way to confirm the release of this? :)
https://releases.aks.azure.com/webpage/index.html#tabus

@wedaly

wedaly commented Mar 12, 2024

hi all, thanks for your patience waiting for this feature! The code and configuration changes have now reached every region. We're doing a final review of the documentation and will publish that soon as well.

@PixelRobots
Collaborator

Whilst we wait for the official docs, check out my blog post. https://pixelrobots.co.uk/2024/03/first-look-changing-or-disabling-your-network-policy-provider-on-aks/

@rgarcia89

rgarcia89 commented Mar 13, 2024

I just migrated a cluster in Germany West Central from kubenet with Calico as the policy engine to Azure CNI overlay with Calico as the policy engine 🎉

@tsiv-at-nnit-com

tsiv-at-nnit-com commented Mar 13, 2024

Just to chip in: it is working nicely for us as well. We are switching from Calico network policy to Azure on running clusters tracked in Terraform state, without destroying the clusters:

  1. terraform state rm $kubernetesobject
  2. az aks update -g $rg -n $aksname --network-policy none
  3. az aks update -g $rg -n $aksname --network-policy azure
  4. terraform import $kubernetesobject $azureaksresourceid
  5. terraform plan # LTS code with new network policy set to azure
  6. terraform apply # LTS code with new network policy set to azure

It takes us around 2.5 hours on the clusters we run. It flips the node agents and networks in the process. It behaves like a couple of patching rounds/aks upgrades

az aks update can be resumed if it times out btw. We experienced a timeout, but it could be mended by a rerun. Thanks to the very nice PG in MS for this feature!
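
The rerun-after-timeout behaviour described above can be sketched as a small retry wrapper. This is an illustration based on that single report, not an official recommendation: the function name `retry` and the attempt count and delay in the usage comment are hypothetical, and `$rg` and `$aksname` are the placeholders from the steps above.

```shell
#!/bin/sh
# Sketch only: retry MAX DELAY CMD...
# Reruns CMD until it succeeds or MAX attempts have been made,
# sleeping DELAY seconds between attempts.
retry() {
  max="$1"; delay="$2"; shift 2
  attempt=1
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Usage (hypothetical values: up to 3 attempts, 60 seconds apart):
#   retry 3 60 az aks update -g "$rg" -n "$aksname" --network-policy none
```

If the command keeps failing after a few attempts, stopping and opening a support ticket is probably safer than retrying indefinitely.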

@chasewilson
Contributor

Thank you all for your patience and feedback! A huge shout out to @wedaly and @robogatikov for the work they put into this feature and into getting it out!

@rgarcia89

@tsiv-at-nnit-com I am using Terraform to deploy clusters too. For me, it worked to just migrate the clusters to azure and afterwards update their definition in the main.auto.tfvars file from kubenet to azure. When I then ran tf apply, the change was detected, but since it matched the defined state, it just reported no changes.

@chasewilson chasewilson moved this from In Progress (Development) to Public Preview (Shipped & Improving) in Azure Kubernetes Service Roadmap (Public) Mar 13, 2024
@robogatikov

The official documentation on uninstalling Network Policy engine (Azure NPM or Calico) is here: https://learn.microsoft.com/en-us/azure/aks/use-network-policies

@robogatikov

Also don't hesitate to open a support ticket if you run into any issues (like upgrade request timeout for @tsiv-at-nnit-com) so we can troubleshoot.

@davem-git

This is the network policy, right, not the network plugin? Is there a way to remove the plugin as well?

@zensonic

> This is the network policy, right, not the network plugin? Is there a way to remove the plugin as well?

It is for the policy. We wanted to get to Azure network policy because we want long-term support. We were on Calico policy, but of course MS cannot support anything but their own stuff long term (more or less; the world is not black and white).

Network plugin change is a reprovision as of now. Until a moment ago so was network policy change 😊

@chasewilson chasewilson moved this from Public Preview (Shipped & Improving) to Generally Available (Done) in Azure Kubernetes Service Roadmap (Public) Mar 19, 2024
@gp-sharma

Hello Everyone,

I have been trying to remove the network policy from an AKS cluster. For me, the command itself throws an error:

$ az aks update --resource-group my_rg_name --name my_cluster_name --network-policy none
ERROR: unrecognized arguments: --network-policy none

Examples from AI knowledge base:
az aks update --resource-group MyResourceGroup --name MyManagedCluster --load-balancer-managed-outbound-ip-count 2
Update a kubernetes cluster with standard SKU load balancer to use two AKS created IPs for the load balancer outbound connection usage.

az aks update --resource-group MyResourceGroup --name MyManagedCluster --api-server-authorized-ip-ranges 0.0.0.0/32
Restrict apiserver traffic in a kubernetes cluster to agentpool nodes.

az version
Show the versions of Azure CLI modules and extensions in JSON format by default or format configured by --output (autogenerated)

https://docs.microsoft.com/en-US/cli/azure/aks#az_aks_update
Read more about the command in reference docs

Could someone help me understand the issue?

@amitmavgupta

@gp-sharma are you disabling this on an AKS cluster that was created with Kubenet, BYOCNI, or some other plugin? Can you add that here?

Assuming that you are on aks-preview extension version 0.5.166 or higher?

@gp-sharma

gp-sharma commented Apr 5, 2024

@amitmavgupta we have used "kubenet" network plugin while creating aks cluster.

aks-preview version is 2.0.0b8.
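
A minimal sketch of the version check asked about above, assuming GNU `sort -V` is available for version ordering. The function name `version_at_least` is hypothetical; `az extension show` in the usage comment is the standard Azure CLI command for reading an installed extension's version. The threshold 0.5.166 is the one quoted in this thread.

```shell
#!/bin/sh
# Sketch only: version_at_least HAVE WANT
# Succeeds when HAVE sorts at or after WANT under GNU "sort -V" ordering.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Usage (requires the Azure CLI with the aks-preview extension installed):
#   have="$(az extension show --name aks-preview --query version -o tsv)"
#   if version_at_least "$have" 0.5.166; then
#     echo "aks-preview $have meets the minimum"
#   else
#     echo "aks-preview $have is older than 0.5.166"
#   fi
```

Under this ordering 2.0.0b8 sorts after 0.5.166, so the extension version reported above passes the check and does not by itself explain the "unrecognized arguments" error.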

@amitmavgupta

amitmavgupta commented Apr 8, 2024

@tnn-simon

When will this change be released in a stable API? Looking forward to managing this through my IaC pipelines.

@chasewilson
Contributor

> When will this change be released in a stable API? Looking forward to managing this through my IaC pipelines.

@tnn-simon we're aiming for a July GA, thanks for the interest!

@amitmavgupta thanks for the help in the comments!

@chasewilson chasewilson moved this from Generally Available (Done) to Public Preview (Shipped & Improving) in Azure Kubernetes Service Roadmap (Public) Apr 30, 2024