Description
Summary
When deploying a Flex Consumption Function App that is configured with both VNet Integration (outbound) and a Private Endpoint (inbound), the deployment consistently fails.
The code package uploads successfully, but the deployment hangs at [Kudu-RemoveWorkersStep] and then fails with "The request was canceled due to the configured HttpClient.Timeout of 100 seconds elapsing." The exception is raised in FunctionsSyncManager.TrySyncTriggersAsync, that is, while the host is trying to sync the new triggers with the Azure management plane.
This happens even when the app's subnet is correctly configured with a NAT Gateway for internet access and a Microsoft.Storage service endpoint.
The only workaround found so far is to temporarily delete the Function App's Private Endpoint, which strongly suggests a platform-level networking bug specific to this topology.
Steps to Reproduce
- Create a Flex Consumption Function App.
- Create a VNet with an apps subnet.
- Configure VNet Integration (outbound) on the Function App, linking it to the apps subnet.
- Configure an inbound Private Endpoint on the Function App, connecting it to a subnet in the same VNet.
- Create a publicly accessible Storage Account for AzureWebJobsStorage.
- On the apps subnet, configure the following:
  - Add a Microsoft.Storage Service Endpoint.
  - Attach a NAT Gateway (with a Public IP) to provide outbound internet access.
- Attempt to deploy any function code using the Azure CLI (from a local file or SAS URL):
az functionapp deployment source config-zip -g <rg> -n <app-name> --src "my-package.zip"
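Before deploying, it may help to confirm that the apps subnet really has both prerequisites from the step above. The following is a minimal sketch, assuming the subnet JSON comes from `az network vnet subnet show -g <rg> --vnet-name <vnet> -n <subnet> -o json` and uses the usual `serviceEndpoints` and `natGateway` properties. The helper name `check_apps_subnet` and the sample data are hypothetical and only illustrate the check.

```python
import json


def check_apps_subnet(subnet: dict) -> list[str]:
    """Return the prerequisites that appear to be missing from a subnet description."""
    problems = []
    # serviceEndpoints is expected to be a list of objects with a "service" key.
    services = {ep.get("service") for ep in subnet.get("serviceEndpoints") or []}
    if "Microsoft.Storage" not in services:
        problems.append("missing Microsoft.Storage service endpoint")
    # natGateway is expected to be an object holding the gateway's resource id.
    if not (subnet.get("natGateway") or {}).get("id"):
        problems.append("no NAT Gateway attached")
    return problems


# Illustrative sample only; the resource id is a placeholder, not real output.
sample = json.loads("""
{
  "name": "apps",
  "serviceEndpoints": [{"service": "Microsoft.Storage"}],
  "natGateway": {"id": "<nat-gateway-resource-id>"}
}
""")

print(check_apps_subnet(sample))  # an empty list means both prerequisites are present
```

An empty result here only shows that the subnet is configured as described in the steps. It does not rule out the deployment failure reported in this issue.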
Expected Behavior
The deployment completes successfully. The new function code is deployed, and the new triggers (e.g., A, B, D) are correctly synced and visible in the Azure Portal.
Actual Behavior
The Azure CLI command hangs for several minutes after printing "Deployment endpoint responded with status code 202", then fails with a "partially successful" message.
The deployment logs show the package upload completes, but the sync times out:
...
{"log_time": "2025-10-27T13:58:25.9867565Z", "id": "...", "message": "[Kudu-UploadPackageStep] completed. Uploaded package to storage successfully.", "type": 0},
{"log_time": "2025-10-27T13:58:26.1048234Z", "id": "...", "message": "[Kudu-RemoveWorkersStep] starting.", "type": 0},
{"log_time": "2025-10-27T14:03:30.1222084Z", "id": "...", "message": "Deployment was successful with Error: The request was canceled due to the configured HttpClient.Timeout of 100 seconds elapsing.", "type": 1}
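To quantify how long the deployment stalls, the gap between the two relevant log entries can be computed directly. This is a small sketch that uses the timestamps and messages from the log excerpt above. The helper names (`parse_log_time`, `stall_seconds`) are hypothetical, and the entries are re-typed as Python dicts rather than read from a real log file.

```python
from datetime import datetime, timezone


def parse_log_time(value: str) -> datetime:
    """Parse a log_time value; it has 7 fractional digits, datetime keeps 6."""
    base, _, frac = value.rstrip("Z").partition(".")
    micro = (frac + "000000")[:6]  # pad or trim to microseconds
    parsed = datetime.strptime(f"{base}.{micro}", "%Y-%m-%dT%H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc)


def stall_seconds(entries: list[dict]) -> float:
    """Seconds between the RemoveWorkersStep start and the timeout message."""
    start = next(e for e in entries if "[Kudu-RemoveWorkersStep] starting" in e["message"])
    end = next(e for e in entries if "HttpClient.Timeout" in e["message"])
    delta = parse_log_time(end["log_time"]) - parse_log_time(start["log_time"])
    return delta.total_seconds()


# Entries copied from the deployment log shown in this issue.
entries = [
    {"log_time": "2025-10-27T13:58:26.1048234Z",
     "message": "[Kudu-RemoveWorkersStep] starting.", "type": 0},
    {"log_time": "2025-10-27T14:03:30.1222084Z",
     "message": "Deployment was successful with Error: The request was canceled due to "
                "the configured HttpClient.Timeout of 100 seconds elapsing.", "type": 1},
]

gap = stall_seconds(entries)
print(f"Stalled for {gap:.1f} s")  # prints "Stalled for 304.0 s"
```

The stall is about 304 seconds, roughly three times the 100-second timeout. That would be consistent with the sync call being attempted more than once, although the logs shown here do not confirm it.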
The app is left in a broken state:
- The Azure Portal shows the old functions (A, B, C).
- The admin API (/admin/functions) shows the new functions (A, B, D).
- The synctriggers endpoint fails, and the app's internal logs show this exception:
System.Threading.Tasks.TaskCanceledException: The request was canceled due to the configured HttpClient.Timeout of 100 seconds elapsing.
at Microsoft.Azure.WebJobs.Script.WebHost.Management.FunctionsSyncManager.TrySyncTriggersAsync
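The mismatch between the two views can be expressed as a simple set comparison. The sketch below assumes the `/admin/functions` response is a JSON array of objects that each carry a `name` field, and that the portal's list has been noted by hand. The helper names and the sample payload are hypothetical and mirror the A/B/C versus A/B/D example above.

```python
import json


def function_names(admin_payload: str) -> set[str]:
    """Extract function names from an /admin/functions JSON response body."""
    return {item["name"] for item in json.loads(admin_payload)}


def sync_drift(portal: set[str], host: set[str]) -> dict[str, list[str]]:
    """Describe how the portal's trigger list differs from what the host has loaded."""
    return {
        "stale_in_portal": sorted(portal - host),      # shown in the portal, no longer deployed
        "missing_from_portal": sorted(host - portal),  # deployed, but never synced
    }


# Sample data mirroring the state described in this issue.
portal_functions = {"A", "B", "C"}
admin_response = '[{"name": "A"}, {"name": "B"}, {"name": "D"}]'

drift = sync_drift(portal_functions, function_names(admin_response))
print(drift)  # {'stale_in_portal': ['C'], 'missing_from_portal': ['D']}
```

Any non-empty list in the result indicates that the trigger sync did not complete, which matches the TrySyncTriggersAsync timeout in the exception above.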
Workaround
The only effective workaround is to temporarily delete the Private Endpoint on the Function App.
- Delete the Private Endpoint.
- Temporarily allow public access to the app (via Access Restrictions).
- Re-run the deployment.
- The deployment succeeds in seconds.
- Re-add the Private Endpoint and re-enable Access Restrictions.
This strongly indicates the Private Endpoint is causing a routing conflict that breaks the FunctionsSyncManager's outbound call to the public Azure management plane, even when a NAT Gateway is present.
