Azure Container Apps does NOT work with VNet integration and Azure Firewall as specified in the doc #227

Closed
1 of 3 tasks
doanduyhai opened this issue May 23, 2022 · 38 comments
Labels: In progress (Solution/feature is being worked on), issue, Networking (Related to ACA networking)

Comments

@doanduyhai

Please provide us with the following information:

This issue is a: (mark with an x)

  • bug report -> please search issues before submitting
  • documentation issue or request
  • regression (a behavior that used to work and stopped in a new release)

Issue description

Region: westeurope

Creating an Azure Container Apps environment using an internal VNet and an Azure Firewall does not work, even though I followed all the documentation here:

Steps to reproduce

  1. Create a VNet with a 10.1.0.0/16 CIDR address space
  2. Create a subnet K8SControlPlaneSubnet01 in the VNet with a 10.1.2.0/23 address space
  3. Create a subnet K8SClusterSubnet01 in the VNet with a 10.1.4.0/27 address space
  4. Create a Standard Azure firewall + Firewall policies
  5. Create the following Network rules in the firewall policy
| Source Address | Protocol | Port | Destination Address/Service Tags |
|---|---|---|---|
| 10.1.0.0/16 | UDP | 1194 | AzureCloud.WestEurope |
| 10.1.0.0/16 | TCP | 9000 | AzureCloud.WestEurope |
| 10.1.0.0/16 | TCP | 443 | AzureMonitor |
| 10.1.0.0/16 | TCP | 443 | AzureCloud.WestEurope |
  6. Create the following Application rules in the firewall policy
| Name | Source Address | Protocol | FQDNs/Service Tags |
|---|---|---|---|
| hcp | 10.1.0.0/16 | Https:443 | *.hcp.westeurope.azmk8s.io |
| mcr | 10.1.0.0/16 | Https:443 | mcr.microsoft.com, *.data.mcr.microsoft.com |
| azure | 10.1.0.0/16 | Https:443 | management.azure.com, login.microsoftonline.com, packages.microsoft.com, acs-mirror.azureedge.net, dc.services.visualstudio.com, graph.microsoft.com |
| azure-monitor | 10.1.0.0/16 | Https:443 | *.ods.opinsights.azure.com, *.oms.opinsights.azure.com, *.monitoring.azure.com |
| ubuntu | 10.1.0.0/16 | Https:443, Http:80 | security.ubuntu.com, azure.archive.ubuntu.com, changelogs.ubuntu.com, motd.ubuntu.com |
| azure-policy | 10.1.0.0/16 | Https:443 | data.policy.core.windows.net, store.policy.core.windows.net, dc.services.visualstudio.com |
| aks | 10.1.0.0/16 | Https:443 | AzureKubernetesService |
  7. Create a custom route table with the following route:
    0.0.0.0/0 --> <azure_firewall_private_ip_address>
  8. NO NSG IS APPLIED TO THE SUBNETS, TO SIMPLIFY DEBUGGING
  9. Assign the custom route table to the subnets K8SControlPlaneSubnet01 & K8SClusterSubnet01
  10. Create a log analytics workspace
  11. Create an Azure Container Apps Environment with the following ARM Template
{
    "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {},
    "variables": {},
    "resources": [
        {
            "type": "Microsoft.App/managedEnvironments",
            "apiVersion": "2022-03-01",
            "name": "poc-container-apps-env",
            "location": "westeurope",
            "properties": {
                "vnetConfiguration": {
                    "internal": true,
                    "infrastructureSubnetId": "/subscriptions/xxx/resourceGroups/poc-azure-container-apps/providers/Microsoft.Network/virtualNetworks/poc-azure-container-apps-westeurope-vnet01/subnets/K8SControlPlaneSubnet01",
                    "runtimeSubnetId": "/subscriptions/xxx/resourceGroups/poc-azure-container-apps/providers/Microsoft.Network/virtualNetworks/poc-azure-container-apps-westeurope-vnet01/subnets/K8SClusterSubnet01",
                    "dockerBridgeCidr": "10.2.0.1/16",
                    "platformReservedCidr": "10.0.0.0/16",
                    "platformReservedDnsIP": "10.0.0.2"
                },
                "appLogsConfiguration": {
                    "destination": "log-analytics",
                    "logAnalyticsConfiguration": {
                        "customerId": "xxx",
                        "sharedKey": "xxx"
                    }
                }
            }
        }
    ]
}
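
Before deploying, the address plan from the steps and template above can be sanity-checked offline. The following is a minimal sketch in Python, assuming (not confirmed by this thread) that the subnets should sit inside the VNet and that the Docker bridge and platform-reserved ranges should not overlap it. `check_address_plan` is a hypothetical helper name, and the check says nothing about the UDR problem discussed in this issue.

```python
import ipaddress

# Values copied from the reproduction steps and the ARM template above.
vnet = ipaddress.ip_network("10.1.0.0/16")
subnets = {
    "K8SControlPlaneSubnet01": ipaddress.ip_network("10.1.2.0/23"),
    "K8SClusterSubnet01": ipaddress.ip_network("10.1.4.0/27"),
}
# strict=False because "10.2.0.1/16" has host bits set.
docker_bridge = ipaddress.ip_network("10.2.0.1/16", strict=False)
platform_reserved = ipaddress.ip_network("10.0.0.0/16")
platform_dns_ip = ipaddress.ip_address("10.0.0.2")


def check_address_plan():
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []

    # Every subnet should be contained in the VNet address space.
    for name, net in subnets.items():
        if not net.subnet_of(vnet):
            problems.append(f"{name} ({net}) is outside the VNet {vnet}")

    # Subnets should not overlap each other.
    items = list(subnets.items())
    for i, (name_a, net_a) in enumerate(items):
        for name_b, net_b in items[i + 1:]:
            if net_a.overlaps(net_b):
                problems.append(f"{name_a} overlaps {name_b}")

    # Assumption: reserved ranges should not overlap the VNet.
    for label, reserved in (
        ("dockerBridgeCidr", docker_bridge),
        ("platformReservedCidr", platform_reserved),
    ):
        if reserved.overlaps(vnet):
            problems.append(f"{label} ({reserved}) overlaps the VNet {vnet}")

    # The reserved DNS IP should fall inside the platform-reserved range.
    if platform_dns_ip not in platform_reserved:
        problems.append("platformReservedDnsIP is outside platformReservedCidr")

    return problems


print(check_address_plan() or "no address-plan problems found")
```

With the values from this report the list comes back empty, which suggests the failure is not caused by a simple range overlap.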

Deployment information:

  • Start time: 5/22/2022, 9:53:34 PM
  • Correlation ID: 7c08a0ab-f48b-4411-b6a0-96a55e56112b

Expected behavior

The Container Apps Environment should be created

Actual behavior

The deployment fails with a cryptic error:

{"code":"DeploymentFailed","message":"At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/DeployOperations for usage details.","details":[{"code":"OperationFailed","message":"Managed environment failed to initialize"}]}
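
The nested `details` array is where the only specific message lives. A minimal Python sketch for flattening such a payload, assuming it keeps the `code` / `message` / `details` shape shown above (`flatten_error` is a hypothetical helper name, not part of any Azure SDK):

```python
import json

# Error payload copied verbatim from the failed deployment above.
raw = (
    '{"code":"DeploymentFailed","message":"At least one resource deployment '
    'operation failed. Please list deployment operations for details. Please see '
    'https://aka.ms/DeployOperations for usage details.","details":'
    '[{"code":"OperationFailed","message":"Managed environment failed to initialize"}]}'
)


def flatten_error(err, depth=0):
    """Yield (depth, code, message) for an error and all of its nested details."""
    yield depth, err.get("code"), err.get("message")
    for child in err.get("details", []):
        yield from flatten_error(child, depth + 1)


rows = list(flatten_error(json.loads(raw)))
for depth, code, message in rows:
    # Indent nested errors so the innermost cause stands out.
    print("  " * depth + f"{code}: {message}")
```

Here the innermost entry is still only "Managed environment failed to initialize", so the payload alone does not identify the failing component.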

Screenshots

Additional context

The deployment always fails at the step that creates the kubernetes load balancer. Sometimes it also fails when creating the kubernetes-internal load balancer.

I suspect some asymmetric routing is involved, because the load balancers created in the MC_xxx resource groups have public IPs and their traffic is force-tunnelled through Azure Firewall.

@ghost ghost added the Needs: triage 🔍 Pending a first pass to read, tag, and assign label May 23, 2022
@torosent torosent added investigating currently looking into the issue and removed Needs: triage 🔍 Pending a first pass to read, tag, and assign labels May 25, 2022
@JennyLawrance

JennyLawrance commented Jun 1, 2022

@doanduyhai, Did you allow both inbound and outbound communication within the control plane subnet?

https://github.com/microsoft/azure-container-apps/wiki/Lock-down-VNET-with-Network-Security-Groups-and-Firewall

@doanduyhai
Author

Hello @JennyLawrance

As said in the steps to reproduce, there is no NSG used in my settings

So the subnet-to-subnet communication is all open within the VNet. Consequently, the control plane subnet (named K8SControlPlaneSubnet01 in my settings) is open for inbound/outbound communication

@chinadragon0515
Member

@doanduyhai I will investigate it

@doanduyhai
Author

Thanks @chinadragon0515

Do not hesitate to ping me if you need any further detail to reproduce the issue

@chinadragon0515
Member

@doanduyhai I have two questions:

  1. Does your firewall share the same VNet as Container Apps (in a different subnet), or does it use a different VNet?
  2. Could you enable the firewall's diagnostic logs and check whether any traffic was blocked?

In the meantime, I am trying to reproduce the setup where the firewall shares the same VNet as Container Apps.

@doanduyhai
Author

@chinadragon0515

  1. The firewall is in the same VNet, in its own subnet (AzureFirewallSubnet). No NSG
  2. When querying the denied traffic from the firewall logs, I get no results, i.e. no blocked traffic over 1 h during the test:
AzureDiagnostics
| where msg_s contains "Denied"
| project TimeGenerated, msg_s
| order by TimeGenerated desc 
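
One thing worth double-checking with this filter: the firewall log line quoted later in this thread reads "Action: Deny.", which does not contain the substring "Denied". The sketch below uses Python rather than KQL to show the difference on that sample. Whether every log category in this setup uses the same wording is an assumption to verify against the real logs; the second sample message and the `action_of` helper are hypothetical.

```python
import re

samples = [
    # Copied from a later comment in this thread.
    "HTTPS request from 10.0.0.4:51038. Action: Deny. "
    "Reason: SNI TLS extension was missing.",
    # Hypothetical allowed request, included only for contrast.
    "HTTPS request from 10.1.2.4:40000 to mcr.microsoft.com:443. Action: Allow.",
]

ACTION_RE = re.compile(r"Action:\s*(\w+)")


def action_of(message):
    """Return the action word that follows 'Action:', or None if absent."""
    match = ACTION_RE.search(message)
    return match.group(1) if match else None


# Substring test equivalent to filtering on "Denied".
naive_hits = [m for m in samples if "Denied" in m]
# Test on the parsed action field instead.
action_hits = [m for m in samples if action_of(m) == "Deny"]

print(len(naive_hits), len(action_hits))
```

If the real messages follow the same pattern, filtering on "Deny" instead of "Denied" would be the safer choice.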

@chinadragon0515
Member

@doanduyhai I have found the issue and we are discussing the fix. I will update here when the fix is deployed. Thanks.

@doanduyhai
Author

@chinadragon0515 Great news! Can you tell us more about the issue? Network integration? Missing firewall rules?

@oramoss

oramoss commented Jun 7, 2022

This has also been affecting us for the last few days... glad to have found this thread.

As mentioned, can we have some knowledge of the issue and potential ETA for resolution please?

@joshuadmatthews

I am attempting to join container apps to an existing vnet, no firewall or nsgs at all at this point. Would this issue be the cause of my "Managed environment failed to initialize" error?

@chinadragon0515
Member

We are actively working on the issue, but we do not have a target date to share. I will update here when we have a date to share.

This issue only occurs when a UDR is used; if no UDR is used, there is no issue.

@oramoss

oramoss commented Jun 9, 2022

We seem to have a similar issue, and we have no UDRs involved on the target VNet/subnets in question.

@doanduyhai
Author

@chinadragon0515 Is the issue related to asymmetric routing or something else?

@joshuadmatthews

joshuadmatthews commented Jun 22, 2022

@chinadragon0515 I am seeing the firewall blocking my DAPR component's requests to the providers when a UDR attempts to route them there.

HTTPS request from 10.0.0.4:51038. Action: Deny. Reason: SNI TLS extension was missing.

This gets picked up if I try to add an Azure Service Bus PubSub component to DAPR. Service Bus is exposed on the VNet with a Private Endpoint, and a related Private DNS Zone for it is attached to both the hub and spoke VNets. I have application rules in my firewall that should allow traffic from the control plane subnet to the private endpoint FQDN, but it seems the firewall is denying it before it gets to the custom rules. If I turn off the UDR, this routes correctly to the private endpoint, which I verified by the fact that all public networking is turned off on the Service Bus and it still succeeds without the UDR.

Is this issue related to what you are working on?

@chinadragon0515
Member

@oramoss If no UDR is used, then it is not related to this issue. Please open a ticket and we can investigate what could be wrong.

@chinadragon0515
Member

@doanduyhai It is not related to asymmetric routing. When a UDR for 0.0.0.0/0 is used, it changes the node outbound IP addresses, which then causes the issue.

For now, using custom user-defined routes (UDRs) or ExpressRoutes, other than UDRs for selected destinations that you own, is not yet supported for Container App Environments with VNets.

We are working with a partner team to resolve this. I will update here when we have a target date for the fix.

@chinadragon0515
Member

@joshuadmatthews

The error you see is not the same as the common UDR error. Can you open a ticket with the detailed configuration information, so I can investigate whether it is the same issue or a different one?

For now, using custom user-defined routes (UDRs) or ExpressRoutes, other than UDRs for selected destinations that you own, is not yet supported for Container App Environments with VNets.

@graemefoster

Also see #255 for similar issue.

@jagiraud

@chinadragon0515 can you share any estimate when this could be solved? Weeks, months?

@linhht

linhht commented Aug 11, 2022

I got the same issue when deploying ContainerApp behind FW (goal is to control outbound traffic of the app)

[image: screenshot not preserved]

@chinadragon0515
Member

@jagiraud The issue is being actively worked on and should be resolved in the next couple of months. I will share here when we have a more accurate date. Sorry for the inconvenience and thanks for your patience.

@SonOfBytes

@chinadragon0515 - we are validating Container Apps for use in a Hub and Spoke architecture (Spoke Subscription peered to Hub Subscription with Firewall) using Enterprise Scale patterns.

Once this UDR issue is addressed is it expected that we will be able to use Container Apps to serve container hosted services across the International CORP network (internal private azure network - no internet egress/ingress) from Container Apps Service? We want to make sure there is no dependency on the solution needing to go internet bound and public endpoints.

Asking because we cannot validate this because of the UDR issue. Thank you!

@mustafazeya

Earlier we used to create UDR with our own managed CIDRs and it was working just fine. Now I see that a managed route table is getting created with aks-agentpool prefix and getting attached to the ACA subnet. How are we now supposed to route traffic to firewall??

@mustafazeya

> Earlier we used to create UDR with our own managed CIDRs and it was working just fine. Now I see that a managed route table is getting created with aks-agentpool prefix and getting attached to the ACA subnet. How are we now supposed to route traffic to firewall??

This is fixed now!

@ToonVanhoutte

@chinadragon0515: Any more precise timeline you can share? Facing this issue. Thank you

@chinadragon0515
Member

We are still working on it and have no target date to share yet. I will update here when we have an ETA.

@mthoger

mthoger commented Nov 8, 2022

I'm puzzled how you can GA a service with VNet integration and not support UDRs.
You push hub-and-spoke topologies, and enterprises will always force traffic through a central monitoring and security point and not allow direct traffic to the internet.

Every time you GA VNet integration for an Azure service (Azure Firewall, App Service, Functions and so on), force tunneling is not supported.

I have a hard time understanding the use cases for this where you don't need access to on-premises data resources in a secure manner.

Anyway, I'm looking forward to testing this service when you have an ETA.

@dmexs

dmexs commented Nov 8, 2022

@mthoger come check out the networking channel on discord. Another user was able to help me with a couple service tag routes to get my UDR working.

@alhardy

alhardy commented Dec 7, 2022

Any update on this?

@chinadragon0515
Member

chinadragon0515 commented Dec 14, 2022

We are working on a solution now.

For now, if you want to use a UDR, you can add a route like this one so that Azure cloud traffic bypasses the firewall; that works too.

MicrosoftTeams-image
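
The screenshot is not preserved here, so the exact route is unknown. As a generic illustration only: Azure picks the route for a destination by longest-prefix match, so a more specific route takes precedence over a 0.0.0.0/0 route that points at the firewall. The Python sketch below models that selection. The prefixes and next-hop labels are hypothetical stand-ins, not the values from the screenshot, and `next_hop` is a made-up helper name.

```python
import ipaddress

# Hypothetical route table: prefixes and next hops are illustrative only.
routes = [
    ("0.0.0.0/0", "VirtualAppliance (firewall)"),
    # Stand-in for a more specific bypass prefix (documentation range).
    ("203.0.113.0/24", "Internet"),
]


def next_hop(destination, table):
    """Return the next hop of the longest matching prefix, or None."""
    dest = ipaddress.ip_address(destination)
    matching = []
    for prefix, hop in table:
        network = ipaddress.ip_network(prefix)
        if dest in network:
            matching.append((network.prefixlen, hop))
    if not matching:
        return None
    # The most specific (longest) prefix wins.
    return max(matching, key=lambda entry: entry[0])[1]


print(next_hop("203.0.113.10", routes))  # inside the specific prefix
print(next_hop("198.51.100.7", routes))  # only the default route matches
```

In this model, traffic to the specific prefix leaves directly while everything else still goes through the firewall, which is the general idea behind a bypass route.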

@TheIronRock95

Is there an update on this, something to share on the possible ETA for the definitive solution?

@iwalt

iwalt commented Feb 7, 2023

@TheIronRock95 the ACA roadmap is now public - ETA looks to be end of March 2023.

@chinadragon0515
Member

We have implemented a new network architecture in ACA and announced its public preview today.
With the new architecture, there is no limitation on UDR, NAT gateway and so on.

For more detail, you can refer to this announcement.
https://azure.microsoft.com/en-us/updates/public-preview-azure-container-apps-offers-new-plan-and-pricing-structure/

@KrylixZA

> We have new network architecture implemented in ACA and announcement public preview today, With new architecture, there is no limit on UDR/NAT gateway and so on.
>
> For more detail, you can refer to this announcement. https://azure.microsoft.com/en-us/updates/public-preview-azure-container-apps-offers-new-plan-and-pricing-structure/

Any potential ETA on when this may become GA?

@torosent
Member

Summer 2023

@anrub

anrub commented Sep 8, 2023

We used this UDR configuration very actively on all (>100) of our Container App Environments. These environments are now "Consumption only", which means UDR is not supported there, although it was supported in the past.

Is it guaranteed that the original container app setup, as described in this issue, keeps working?

I think the UDR feature was just dropped from the original release, as all "old" cappenvs are now automatically "Consumption Only".

@torosent
Member

> We used this UDR configuration very actively on all (>100) of our Container App Environments. These environments are now "Consumption only", which means UDR is not supported there, although it was supported in the past.
>
> Is it guaranteed that the original container app setup, as described in this issue, keeps working?
>
> I think the UDR feature was just dropped from the original release, as all "old" cappenvs are now automatically "Consumption Only".

@anrub, a UDR with target 0.0.0.0/0 was never supported in Consumption Only. There was a workaround that some users used to get past the limitation. We haven't changed anything in the Consumption Only plan architecture, as it's GA, so this workaround will still work. To use the supported UDR with 0.0.0.0/0, please use Workload Profiles.

@chinadragon0515
Member

UDR is supported with workload profile environments; refer to https://learn.microsoft.com/en-us/azure/container-apps/workload-profiles-overview
