Unable to perform slot swap for Premium Function with network restricted storage account #8448

Closed
jarrodd07 opened this issue Jun 9, 2022 · 21 comments

@jarrodd07

Cannot perform a slot swap operation on a Function App in an Elastic Premium App Service Plan when the underlying storage account has network restrictions.
When a slot swap is performed a 500 Internal Server Error is returned with the error message "There was an unexpected error swapping slots 'staging' and 'production' for site ''. Please try to cancel your swap operation. ExtendedCode: 04093".
In the Diagnose and solve problems blade there is a further error: "Swap failed. Details: System.ServiceModel.Web.WebFaultException`1[Microsoft.Web.Hosting.Administration.ErrorEntity]: Bad Request (Fault Detail is equal to Code: BadRequest, ExtendedCode: Storage access failed. Storage volume is currently in R/O mode, Message: 04234)."

The storage account has firewall rules enabled restricting access to certain IPs and the function app connects via a private endpoint using vnet integration.

When I disable the network restrictions the slot swap operation succeeds.

Check for a solution in the Azure portal

No solution in the Azure Portal.

Investigative information

Please provide the following:

  • Timestamp: 2022-06-08T03:12:00Z
  • Function App version: 4
  • Function App name:
  • Function name(s) (as appropriate):
  • Request ID of swap operation (unsure if this is the same as Invocation ID): 74fe4110-28cd-4fd0-8262-d5b1b39a1ad8
  • Invocation ID:
  • Region: Australia East

Repro steps

Provide the steps required to reproduce the problem:

  1. Create function app with deployment slot inside Elastic Premium App Service Plan with network restricted storage account (setup is similar to this sample template but with a deployment slot)
  2. Perform slot swap

Expected behavior

Slot swap should succeed.

Actual behavior

Slot swap fails with error "There was an unexpected error swapping slots 'staging' and 'production' for site ''. Please try to cancel your swap operation. ExtendedCode: 04093".
In the Diagnose and solve problems blade there is a further error: "Swap failed. Details: System.ServiceModel.Web.WebFaultException`1[Microsoft.Web.Hosting.Administration.ErrorEntity]: Bad Request (Fault Detail is equal to Code: BadRequest, ExtendedCode: Storage access failed. Storage volume is currently in R/O mode, Message: 04234)."

Known workarounds

Slot swap succeeds when network restrictions are removed from the storage account.

Related information

Provide any related information

  • Programming language used: C# .Net 6
  • Links to source
  • Bindings used
@ghost ghost assigned liliankasem Jun 9, 2022
@lemonahmas

@liliankasem Hi Lilian, please kindly go through chat content on Teams. This Github issue is related to a support ticket and please do let me know if anything is needed. Cheers.

@liliankasem
Member

liliankasem commented Jun 13, 2022

Looks like this is a known issue and a bug fix has been completed. It will be in the next release, but there isn't an ETA on that right now. This is an underlying platform issue and not a Functions-specific issue, so there isn't anything we can do here but wait for the release. I can keep this open to track that the bug fix addresses the issue and let you know when the platform release is out.

The only available workaround would be to disable the firewall while swapping.
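
For anyone scripting that workaround, here is a minimal sketch in Python that wraps the Azure CLI. It is an illustration, not an official procedure. It assumes the storage account and the function app live in the same resource group, that the firewall's default action was Deny beforehand, and that `az` is installed and logged in. The function names are hypothetical.

```python
import subprocess


def build_workaround_commands(storage_account, resource_group, app_name,
                              source_slot="staging", target_slot="production"):
    """Build the three az CLI commands: open firewall, swap, restore firewall."""
    # Assumption: storage account and function app share one resource group.
    open_fw = ["az", "storage", "account", "update",
               "--name", storage_account,
               "--resource-group", resource_group,
               "--default-action", "Allow"]
    swap = ["az", "functionapp", "deployment", "slot", "swap",
            "--name", app_name,
            "--resource-group", resource_group,
            "--slot", source_slot,
            "--target-slot", target_slot]
    # Assumption: the default action was Deny before the swap.
    close_fw = ["az", "storage", "account", "update",
                "--name", storage_account,
                "--resource-group", resource_group,
                "--default-action", "Deny"]
    return open_fw, swap, close_fw


def swap_with_firewall_open(storage_account, resource_group, app_name,
                            source_slot="staging", target_slot="production",
                            run=subprocess.run):
    """Run the swap with the storage firewall temporarily set to Allow."""
    open_fw, swap, close_fw = build_workaround_commands(
        storage_account, resource_group, app_name, source_slot, target_slot)
    run(open_fw, check=True)
    try:
        run(swap, check=True)
    finally:
        # Restore the firewall even if the swap itself fails.
        run(close_fw, check=True)
```

The `try`/`finally` is there so the storage account is not left publicly reachable when the swap errors out. Note that the account is exposed for the duration of the swap, which is why this is only a stopgap.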

@jarrodd07
Author

Perfect, thanks for that @liliankasem

@lsuarez5280

lsuarez5280 commented Jun 14, 2022

@liliankasem Is there any public location to track release milestones for defects like this so I can keep an eye out? This issue has been plaguing me for some time as well and I didn't note a milestone attached.

@nzthiago
Member

This was caused by an internal platform component, and I’ll keep the issue open to notify when the component fix has been fully released. Unfortunately, the ETA for a full roll out is within the next 3 to 4 months.

@nzthiago
Member

The ETA remains as per above, between September and October.

@lsuarez5280

@nzthiago @liliankasem This seems to have gotten worse recently. The workaround of disabling the storage account firewall settings to allow public access, performing the swap, and restoring it after the swap no longer functions. Is there an ETA on this deployment? There certainly appear to have been changes to the host stack with new access restriction preview features so I imagine something has changed recently.

@lindenle

Same issue here. A more specific ETA would be greatly appreciated.

@nzthiago
Member

The rollout is well under way and in many regions already, but not 100% done yet. The ETA for full rollout I shared above still applies, to complete in October. Sorry to hear you are running into issues, the behavior you are describing is not expected and can be due to other reasons. I suggest raising a support call if you can't swap even if no firewall settings are in place, and I can follow up if you share the support number.

@nzthiago
Member

nzthiago commented Oct 5, 2022

@lsuarez5280 @lindenle - apologies, I have checked in with our support team and there is an issue that has recently emerged that could also be the cause of what you're facing. Our documentation states that this scenario requires customers to set WEBSITE_CONTENTOVERVNET = 1. It looks like it was working for some customers without this setting because their app ended up accessing the storage over the public internet, thereby eliminating any security the VNet provides. Please raise a support call if it starts working after setting that app setting, or if you hit other issues, as they can help investigate further.
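
A quick way to check whether an app already has that setting is sketched below. It assumes the input is the JSON printed by `az functionapp config appsettings list`, which is a list of objects with `name` and `value` fields. The helper names are hypothetical, and the check covers only this one setting, not the rest of the networking configuration.

```python
import json


def settings_from_cli_json(raw):
    """Turn the JSON list of {name, value, ...} objects into a plain dict."""
    return {item["name"]: item["value"] for item in json.loads(raw)}


def has_content_over_vnet(settings):
    """Return True when WEBSITE_CONTENTOVERVNET is present and set to 1."""
    value = settings.get("WEBSITE_CONTENTOVERVNET", "")
    return str(value).strip() == "1"
```

Saving the CLI output to a file and passing its contents through `settings_from_cli_json` and then `has_content_over_vnet` gives a simple yes/no answer for each slot.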

@lsuarez5280

Thank you for the follow-up, @nzthiago. I'm going to work through some testing this evening to find out exactly what state we're in. Unfortunately, the BICEP deployments in question somehow were able to produce the same managed ID for both the site and deployment slot, which is negatively impacting other areas, so I'll be working through redeployment tonight to attempt to remediate this issue and then testing the deployment slot swap after the failed portions of IaC are rebuilt.

These issues have occurred in spite of using WEBSITE_CONTENTOVERVNET throughout this process, so I'll check in after testing this scenario once more.

@lsuarez5280

@nzthiago I've attempted any number of firewall and network configurations to see if I could successfully complete a swap. However, all of them have reported the same response below when investigating the issue in the support blade.

Swap failed. Details: System.ServiceModel.Web.WebFaultException`1[Microsoft.Web.Hosting.Administration.ErrorEntity]: Bad Request (Fault Detail is equal to Code: BadRequest, ExtendedCode: Storage access failed. Authentication for [redacted].file.core.windows.net failed., Message: 04234).

I've opened support ticket 2210060010000492 in response as requested.

@nzthiago
Member

nzthiago commented Oct 7, 2022

@lsuarez5280 I've been keeping track of that support ticket, I believe they have given you an update and hopefully it's fixed. This is separate from the slot swap operation failing that this GitHub issue was raised for so keeping it open until we have confirmation that that is working too.

@lsuarez5280

lsuarez5280 commented Oct 12, 2022

Thanks for the follow-up @nzthiago. At this point we're back to previous state and awaiting the deployment of this fix to my scale unit. The previous workaround is allowing swaps to function in the interim. Apologies for the slow response. It seems Mondays do in fact happen. 😅

@liliankasem liliankasem assigned nzthiago and unassigned liliankasem Oct 26, 2022
@kapil-07

@nzthiago I'm facing the same issue. Could you please confirm whether this is currently fixed for deployments in the westeurope region?

@nzthiago
Member

Hi everyone, we are in the final testing phase to ensure we have addressed this completely, and we will work on documenting the steps so you can complete the swap without issues when storage is network restricted. Hoping to have an update in the next couple of weeks.

@shun-jiang

@nzthiago is it fixed in WestUS region? We have the swap issue after moving the storage to vNet.

@ankit-sheth

ankit-sheth commented Nov 28, 2022

@nzthiago: Is this issue resolved? Are we now able to swap slots under the private VNet?
I am still getting the same error. Please update/verify.

@nzthiago
Member

nzthiago commented Nov 28, 2022

Hi everyone, apologies for the delay in updating this issue, we went through an extra load of testing before sharing.
The fix is deployed but we had to introduce a new app setting that you should set on your production slot (or the swap slot if you're swapping between two subslots) called WEBSITE_OVERRIDE_STICKY_DIAGNOSTICS_SETTINGS and set it to 0 (zero). I.e.,

WEBSITE_OVERRIDE_STICKY_DIAGNOSTICS_SETTINGS=0 

This will allow you to swap the slots when the storage account is network restricted. Here is our documentation on app settings. This should not have any impact on your Azure Monitor related diagnostics settings configuration and is related to the legacy Application Log Settings configuration, which was preventing Premium Functions slot swaps from occurring.

Next steps on our side are:

  • We will add a work item to our backlog for this setting to be defaulted for Premium Functions so you won't have to add it. There is currently no ETA for this, so the above is the current final solution.
  • We will add the app setting to our App Settings list documentation
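
As an illustration of applying the app setting described above, here is a minimal Python sketch that builds the matching Azure CLI command. The app and resource group names are placeholders and the helper name is hypothetical. Omitting the slot targets the production slot, and passing one covers the case of swapping between two subslots.

```python
def build_set_override_command(app_name, resource_group, slot=None):
    """Build the az CLI command that sets the override app setting to 0."""
    cmd = ["az", "functionapp", "config", "appsettings", "set",
           "--name", app_name,
           "--resource-group", resource_group,
           "--settings", "WEBSITE_OVERRIDE_STICKY_DIAGNOSTICS_SETTINGS=0"]
    if slot is not None:
        # Only needed when the target is a named slot, not production.
        cmd += ["--slot", slot]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where `az` is installed and logged in.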

@ankit-sheth

ankit-sheth commented Nov 29, 2022

Thanks @nzthiago this really helps.

One more request for help, related to deploying code to a VNet-enabled function app.
I'm having difficulty deploying the zip through az or curl to a private endpoint enabled function app (.scm). It returns 403 when I try to deploy from my Jenkins VM (a VM outside the VNet).

I have tried applying WEBSITE_CONTENTOVERVNET: 1 and WEBSITE_DNS_SERVER: value, as suggested above, but it does not work.
Is there any other way to manage this from outside the VNet? Or any settings that can help make it work?

@nzthiago
Member

@ankit-sheth It's likely the networking restrictions are preventing you from deploying from outside the VNet. I'd recommend creating a support call if you need some hands-on help with that specific scenario.
Closing this issue for now and recommend creating a new issue or support tickets as needed if you have any trouble with VNet and Premium Functions.

@Azure Azure locked as resolved and limited conversation to collaborators Dec 30, 2022