v3 Azure Functions running old code after successful Bitbucket CI/CD deployment #5663

Open
dasbdavis opened this issue Feb 18, 2020 · 70 comments

Comments

@dasbdavis

We've got Bitbucket continuous deployment set up for a couple of our v3 Azure Functions (Azure Function App -> Platform Features -> Container settings -> CI/CD (Bitbucket)). No build provider selected, as the Bitbucket option doesn't seem to allow for it. Trigger branch is master. Function app is running from package. All are HTTP trigger functions. Nothing fancy.

Whenever we commit to master, it does indeed trigger a deployment-- which completes successfully. I can see the commit that triggered the deployment and all of that. The problem is, the function is still (usually) running old code after this deployment. I've tried restarting the function app, no success. I've tried completely stopping the app, waiting for a bit, and restarting, no success. Sometimes even manual deployment from Visual Studio doesn't work.

Once I remove the CI/CD pipe from Bitbucket, however, things go back to normal as far as manual deployments from VS go.

I've been able to reproduce this effect several times. Please let me know if you need any additional information.

@ankitkumarr
Contributor

@dasbdavis, do you have an app setting WEBSITE_RUN_FROM_PACKAGE or similar in your function app (Run From Package)? If so, would you mind removing that and then trying the CI/CD pipeline?

I think what's possible is that your function app is typically deployed using Run From Package, which means that the site assumes that the content is deployed at /home/data/SitePackages. But I don't think the CI/CD webhook deployment puts the artifact there, so your site may end up using a stale deployment.

If the above doesn't work, would you mind sharing your function app name, so I can look to see if anything seems fishy?

@dasbdavis
Author

Sorry for the delay-- I didn't see that you'd responded. I'll try this as soon as I can and let you know.

@mattmelton

I've had a similar issue with Python dynamically loading old gRPC protobuf files that are no longer compatible with new code.

With WEBSITE_RUN_FROM_PACKAGE=0, the deployment performs an "in-place sync". Unfortunately this has a tendency to leave old files around, especially locked files or run-time generated files (e.g. *.pyc). In our case the old files are slurped at runtime, causing version mismatch errors on method dispatch.

WEBSITE_RUN_FROM_PACKAGE=1 resolves the issue but means we can't set or rotate the host/functions keys programmatically.

I believe an A/B deployment into a separate directory, rather than in-place sync, would solve this issue.
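
One way to check for the leftover bytecode described above is to look for .pyc files whose source file is gone. The following is only a minimal diagnostic sketch, not something from this thread: the function name `find_orphaned_pyc` is hypothetical, and it assumes you can run Python against a copy of the deployed directory.

```python
import importlib.util
from pathlib import Path


def find_orphaned_pyc(root):
    """Return .pyc files under root whose source .py file no longer exists."""
    orphaned = []
    for pyc in sorted(Path(root).rglob("*.pyc")):
        try:
            # PEP 3147 layout: pkg/__pycache__/mod.cpython-XY.pyc -> pkg/mod.py
            source = Path(importlib.util.source_from_cache(str(pyc)))
        except ValueError:
            # Legacy layout: pkg/mod.pyc sits next to pkg/mod.py
            source = pyc.with_suffix(".py")
        if not source.exists():
            orphaned.append(pyc)
    return orphaned
```

Any path it returns is a candidate for a file that an in-place sync left behind; whether deleting it is safe depends on the app.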

@lopezbertoni

lopezbertoni commented Aug 6, 2020

@ankitkumarr Any insight if this was ever fixed? We're having the same issues running under Consumption Plan and Premium Plan.
Our CI/CD is pushing the artifact to Azure (WEBSITE_RUN_FROM_PACKAGE set to 1) and after that we're doing an Azure Functions restart as suggested by MSFT support, but that didn't help either.
Would deploying to a slot and doing a hot swap help?
Please advise. If you need more data I'll be happy to provide it.

@ankitkumarr
Contributor

@lopezbertoni, would you mind elaborating on your scenario? How are you pushing the artifact to Azure? What's the publishing process, and what issue are you seeing exactly?

@lopezbertoni

@ankitkumarr

  1. Push to an Azure Function using Azure DevOps. Steps in the build pipeline are:
     • dotnet restore
     • dotnet build
     • dotnet publish
     • Publish Artifact
  2. Release the artifact with the following steps in Azure DevOps

The issue is that we deploy the Azure Function, check the logs in Application Insights, and see that log statements that were completely removed from the code are still being executed.

We then stopped/started the Azure Function from the portal and this issue persisted. Eventually we stopped the Azure Function for around 5 mins and then started it again and the deployed code started executing fine.

This was deployed to a Premium Service Plan.

Please advise on how to fix this or if there's a workaround other than manually stopping/starting each processor every time we deploy.

@ankitkumarr
Contributor

@lopezbertoni, thanks for all the info. A couple more questions that'd help me narrow down the cause --

  1. What OS is your function app on? (Windows / Linux)
  2. Can you share your function app name, and a time period when you did this deployment for me to look at the logs? If you prefer to share the name privately, you can follow these steps.

@lopezbertoni

@ankitkumarr

  1. Windows function app
  2. assessment-events-processor-qa deployed to Central US. This was deployed on August 5th.
     Here's some build information if it helps.
     {
       "version": "1.0.0.1349",
       "commitHash": "f9b666fea571120eb9c09732519acebe9e9b0deb",
       "versionDate": "2020.8.5.1",
       "branchName": "staging"
     }

@ghost ghost added the no-recent-activity label Aug 14, 2020
@ghost

ghost commented Aug 14, 2020

This issue has been automatically marked as stale because it has been marked as requiring author feedback but has not had any activity for 4 days. It will be closed if no further activity occurs within 3 days of this comment.

@arek-avanade

@ankitkumarr Any update on this? I experienced the same problem today. Deploying from DevOps with WEBSITE_RUN_FROM_PACKAGE=1, the function seems to be running old code after a successful deployment. There was no recent change in our deployment scripts and everything seemed to be working fine until today, although maybe the problem was there before, just unnoticed.

@ghost ghost removed the no-recent-activity label Aug 17, 2020
@mattmelton

I believe my issues were due to this Kudu bug: projectkudu/kudu#2972.

We've worked around the issues by moving to container functions. Previously I saw code that triggered "impossible" exceptions, i.e. exceptions in lines of code that didn't exist in that release.

@SeppeDev

Today we encountered the same problem. No changes came through after redeployments. I restarted the function app and such, but it had no effect (I didn't wait as long as @lopezbertoni suggested). We use Azure DevOps, and in the release task there, we had our "Deployment method" set to "Auto-detect". It worked perfectly fine before, but now that I changed it explicitly to "Zip Deploy", our code change came through. I'm not entirely sure whether this is a fix for the problem or just a coincidence, but I thought I'd share.

@lopezbertoni

@SeppeDev Just to follow up / help. When we did a quick restart it didn't work. When we did a quick stop/start it didn't work.
It picked up the new code once we stopped, waited for about 5 mins and started again.
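
The stop, wait, start sequence described here could be scripted roughly as follows. This is a sketch under assumptions rather than a confirmed fix: it shells out to the Azure CLI (`az functionapp stop` / `az functionapp start`), the function name `cold_restart` and the 300-second default are illustrative, and other commenters report that waiting did not help them.

```python
import subprocess
import time


def cold_restart(app_name, resource_group, wait_seconds=300,
                 run=subprocess.run, sleep=time.sleep):
    """Stop the function app, wait, then start it again via the Azure CLI."""
    target = ["--name", app_name, "--resource-group", resource_group]
    # check=True makes a failing az command raise instead of being ignored
    run(["az", "functionapp", "stop", *target], check=True)
    sleep(wait_seconds)  # about 5 minutes by default, per the comment above
    run(["az", "functionapp", "start", *target], check=True)
```

`run` and `sleep` are parameters only so the sequence can be exercised without touching a real app; in a pipeline you would call `cold_restart("<app name>", "<resource group>")` after the deploy step.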

@SeppeDev

@lopezbertoni, ok thanks. We didn't wait 5 minutes after stopping it, just did a quick restart and a quick stop and start, so what fixed it for us is probably the change in the Release in DevOps. Thanks.

@lopezbertoni

lopezbertoni commented Aug 22, 2020

@ankitkumarr
This happened again with our Production deployments from last night. Yay for no deploy Fridays 😀.
We deployed 3 times and it didn't update. Eventually they did after several restarts.
All of these processors were deployed to a premium plan.

@ghost ghost added the no-recent-activity label Aug 26, 2020
@ghost

ghost commented Aug 26, 2020

This issue has been automatically marked as stale because it has been marked as requiring author feedback but has not had any activity for 4 days. It will be closed if no further activity occurs within 3 days of this comment.

@lopezbertoni

@ankitkumarr Any update on this? Any workaround at least? We're running production with 10+ Functions and every deploy we need to stop, wait for 5 mins and start the functions to ensure the latest code is running until we know this is reliable.
Would slot deployment help?

@ghost ghost removed the no-recent-activity label Aug 26, 2020
@ankitkumarr
Contributor

@lopezbertoni, yes apologies for the delay. I will take a look at this as soon as I can.

@marc-perreaut

marc-perreaut commented Jun 7, 2021

I am experiencing the same issue (old code version still running despite successful deployment) since June 4th and feel stuck:

  • Restarting the function has no effect
  • Redeploying the function has no effect
  • Deploying the code in a new slot has no effect

The issue is random, but its ongoing occurrence is tough to deal with.

I am happy to get a working workaround and any update on this issue.

@akakaule

akakaule commented Jun 10, 2021

I'm also experiencing the same issue. I have a Service Bus triggered function running .NET Core code where some of the invocations have executed old code. It seems that only a few of the executions used the old code. I would really like to have a status update on this issue. The old code that is being executed is from a deployment earlier than May 3rd, so from a really old deployment. When I download the assembly from the bin folder, everything looks good.

The runtime is v3 and runs on the consumption plan (Windows). The Azure Function App task from an Azure DevOps YAML pipeline is used for deployment, with deploymentMethod not specified (auto).

We have WEBSITE_RUN_FROM_PACKAGE = 1.

@tippesi

tippesi commented Jul 19, 2021

We have the same problem mentioned by @akakaule . After deployment last Friday it seems like sometimes old code is being executed and sometimes the newly deployed code. Today we tried to deploy logs to find the error on our side, but the logs are only visible in some runs. In these cases the function behaves the same way it did before the deployment. We can also add that it doesn't seem to be related to an old instance which is still running. Additionally, it seems to be related only to updated functions in our function app. New functions we added always work with the new code. The old/updated ones sometimes seem to run the old code. Also only our production environment has the issue, our other environments work with the new code as intended. We also have slots for deployment in place.

@tippesi

tippesi commented Jul 19, 2021

After making sure that all our slots ran the same code (by redeploying), it now works again. We assume the traffic is somehow split between both slots, even though just the production slot should have been used. In the image below is our current setup, which doesn't seem to work correctly. Of course this defeats the purpose of using a slot swap.
[Image: MicrosoftTeams-image]

@eric-winkler

Hi Folks,

Over the past couple of weeks, I've been seeing a Queue trigger being executed intermittently by a version of my function app that was deployed sometime prior to July 02, despite there being dozens of new versions deployed to the function app since that time.

Trawling through the logs, I've identified that it is a specific host/instance that is running the old code;
HostInstanceId: 49db15dc-0012-41ed-b558-4fe7aced0fdf
Cloud_RoleInstance: 5AF3CA72-637562946012875197
It appears every invocation from this instance (and only this instance) is running an implementation from an old deployment

So far, the following approaches have been unsuccessful in killing this old rogue instance;

  • fresh deployments
  • creating, swapping and deleting slots
  • stopping/starting and restarting the function app
  • Setting the scale-out limit to 1, and back up to 200

I'm using;

  • v3 runtime
  • Linux consumption plan
  • deployed via github actions (Azure/functions-action@v1)
  • WEBSITE_RUN_FROM_PACKAGE = [uri]

@tomhundley

I'm having the same issue. Function app on Linux. Premium plan. Deploy from Azure DevOps. Restart. Old code runs for about 5 more minutes. New code magically starts.

Function app name: medchron-carnotaurus-eus-dev-v1-0

This is a problem. Please advise. Thanks.

@simon-tarr

simon-tarr commented Aug 6, 2021

Also experiencing the same issue. Function app on Linux, deployed via Github Actions. Initial deploy of code worked fine. Subsequent pushes with updates to the function....function app continues to run the old code.

I've poked around the files at https://<func_name>.scm.azurewebsites.net/DebugConsole and can see the new code, so our deployment from Github has clearly worked successfully (also verified by no deployment errors). Yet in the Azure portal the old code is still visible in Functions -> Function Name -> Code + Test and our web app isn't executing the new functionality which was in the most recent deployment.

This is very bizarre behaviour and needs a fix ASAP :)

@mattchatterley

Also seeing this frequently (and anecdotally if we do stuff out of hours, that might just be perception). Repeatedly deploying seems to eventually solve it, but very frustrating.

@SGirousse

Hello,
In our project we have also been facing it really frequently lately (we discovered the issue because of some incompatibility with our database updates, but maybe it was already there before and we simply never noticed it).
Is there any roadmap for fixing this issue?

@mattchatterley

We worked around this in the end by ensuring the .zip has a unique name (e.g. deploy-107, deploy-108) every time we use az deploy, and this seems to be a viable workaround.
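
A unique package name per build could be produced along these lines. This is a minimal sketch, assuming the build output is in a local folder and that the CI system supplies a build number; `build_unique_package` and its parameters are hypothetical names, and the `deploy-<id>` pattern simply mirrors the example above. A later reply in this thread reports that unique names alone did not help in their setup.

```python
import shutil
from pathlib import Path


def build_unique_package(source_dir, output_dir, build_id):
    """Zip source_dir into output_dir/deploy-<build_id>.zip and return its path."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    # make_archive appends the ".zip" suffix to the base name itself
    base_name = output_dir / f"deploy-{build_id}"
    archive = shutil.make_archive(str(base_name), "zip", root_dir=str(source_dir))
    return Path(archive)
```

Keep `output_dir` outside `source_dir` so the archive does not end up packaging itself.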

@SGirousse

> We worked around this in the end by ensuring the .zip has a unique name (e.g. deploy-107, deploy-108) every time we use az deploy, and this seems to be a viable workaround.

The deployed package already has a unique name in our case (it contains the Azure DevOps build ID). It looks like:

  - task: AzureFunctionApp@1
    displayName: "Deploy the Azure Function"
    inputs:
      azureSubscription: myserviceconnxxx
      appType: functionAppLinux
      appName: azfnxxxx
      package: "$(System.DefaultWorkingDirectory)/build$(Build.BuildId).zip"
      appSettings: # ...

So either I misunderstood, or it is not a viable solution in our case.

@tavaneftekhar

I'm having the same issue. Using VS Code to push updates to the function app. Is there a solution to ensure old instances are not run weeks later?

@percy2112

I'm having the same issue. Function app on Linux. Premium plan. Deploy from Azure DevOps. Restart. Waited for 1 hour, but it did not help.

@edgBR

edgBR commented Jul 6, 2022

We are having the same issue here with a Function App deployed using docker containers.

@obiii can add more details.

@dojo87

dojo87 commented Oct 11, 2022

We might have the same issue.

  • Node app
  • Windows Azure Function
  • run from package
  • on dev and qa Y1 service plan, on stage EP1 and production EP2
  • slot swap deployment from Azure DevOps

On dev/qa the code was replaced with new code (call it "B"). On stage and prod it was still running the old code ("A") after deployment. The funny thing is that we discovered a bug in the new code, so Production running the old code was a life saver, but possibly a ticking bomb.

After we fixed the bug, we deployed to dev/qa without issues ("C" code) and on Production when slot deployment was happening (before swap) - just for a moment we saw logs from the erroneous "B" version. Then it got to "C". Magic.

@SGirousse

> We might have the same issue.
>
>   • Node app
>   • Windows Azure Function
>   • run from package
>   • on dev and qa Y1 service plan, on stage EP1 and production EP2
>   • slot swap deployment from Azure DevOps
>
> On dev/qa the code was replaced with new code (call it "B"). On stage and prod it was still running the old code ("A") after deployment. The funny thing is that we discovered a bug in the new code, so Production running the old code was a life saver, but possibly a ticking bomb.
>
> After we fixed the bug, we deployed to dev/qa without issues ("C" code) and on Production when slot deployment was happening (before swap) - just for a moment we saw logs from the erroneous "B" version. Then it got to "C". Magic.

Please be careful, as there is also some dark magic: the previous version may get reused in the following days or weeks.
It happened to us and, judging by the comments in this thread, to other people as well.

@emilytweeden

We have the same issue. Our Azure Function app is connected to Github for redeployment, and even after disconnecting it and reconnecting it to the Github branch it is still running code from a month ago.

@edgBR

edgBR commented Oct 13, 2022

The title says v3, but we are actually having the same problem with v4.

This issue has been opened for more than 2 years. Honestly it starts to feel embarrassing.

I have used AWS Lambda before and never had problems like this, neither with Docker nor with regular functions running Python code.

Our team is considering moving to nuclio or another function-as-a-service framework, as we feel that AZF is not mature enough. I would like to know if anyone else is on the same track.

BR
E

@timo-schuerg

Same problem running here, also on v4. I changed my function app from standard functions to durable functions. Deployment was successful, can see the new files in wwwroot, but portal does not show the new endpoints and I can't call the starter function.

@fpollet-altanova

Same issue in 2023, files are not updated in ~/site/wwwroot/ after successful deployment...

@TrendafilGechev

Same issue here. I CAN see the newly deployed files in Function App -> Code + Test. However, when a method executes and logs are collected from App Insights it behaves as if nothing was deployed and App Insights shows old logs which are removed from latest codebase.

@sls19050

Same issue in March 2023: the old code was running after a successful deployment (from Azure DevOps) to my v4 function app. I am going to try deploying twice so that the "old code" is updated. Hopefully deploying twice will resolve this issue. Do we have an ETA for when this will be resolved, or should I just deploy twice as a best practice?

@alexsorokoletov

Having the same exact problem with v4, both with Linux and Windows hosts.
Can't CI/CD from Github to Azure so that it would work every time (or even every 2nd time).

It is very frustrating.

@ursaciuc-adrian

We are encountering this issue with v4 Functions (dotnet-isolated in Windows).
We have a standard release pipeline in Azure DevOps with an Azure Function deployment task that automatically sets WEBSITE_RUN_FROM_PACKAGE = 1 as the default value.
The pipeline deploys multiple functions, and in 95% of cases, everything works as expected. However, on specific occasions, despite every log indicating a successful deployment, we still observe the presence of old DLL files for one of the functions when checking through Kudu. All other functions are deployed correctly.

@cuzzlor

cuzzlor commented Aug 10, 2023

Plllease MSFT will you take this srsly? This situation is really sucky.

@drweb86

drweb86 commented Feb 9, 2024

Have similar situation here. We have 2 slots, one with 100% traffic, one with 0 traffic. During swap with some random chance, previous version is used instead of new one.

@ursaciuc-adrian

We had a support ticket opened with Microsoft and wasted more than a month with them trying to figure out a solution. In the end, they said that they didn't know what was happening, and no resolution was given.

One way to mitigate this issue, or at least get alerted when it happens, is to create an endpoint that returns the deployed version, then call that endpoint as part of post-deployment validation and compare the result with the expected version.
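
The post-deployment check could look something like the sketch below. Everything specific in it is an assumption for illustration: the endpoint URL is whatever you expose, the `commitHash` field name is borrowed from the build-info JSON posted earlier in this thread, and `fetch_version` / `wait_for_version` are hypothetical helper names.

```python
import json
import time
import urllib.request


def fetch_version(url, timeout=10):
    """GET a JSON document such as {"commitHash": "..."} from the version endpoint."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def wait_for_version(url, expected_commit, attempts=10, delay_seconds=30,
                     fetch=fetch_version, sleep=time.sleep):
    """Return True once the endpoint reports expected_commit, False if it never does."""
    for attempt in range(attempts):
        try:
            if fetch(url).get("commitHash") == expected_commit:
                return True
        except (OSError, ValueError):
            pass  # endpoint unreachable or not returning valid JSON yet
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False
```

A pipeline step can fail the release when `wait_for_version(...)` returns False. Since several commenters saw only some invocations run old code, one matching response may not prove that every instance was updated, so repeating the check a few times is probably worthwhile.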

@drweb86

drweb86 commented Feb 9, 2024

Here are my findings.

Terms
Slot - the ability to handle a certain % of requests with a specific set of instances.
Instance - a VM image with our application build. There can be several worker processes executing the app on an instance.
Worker process - a particular dotnet process that executes the service.

Upscale/downscale
When Azure sees that a slot cannot handle the volume of requests, it will upscale, i.e. it
a) can increase the number of worker processes in an instance
b) can spawn another instance.
Upscaling is capped by limits. These limits are 1 instance max and 1 worker max for our service.

When Azure does not see requests coming to the app for some time, it will downscale it, reducing worker processes and instances.

Our usage for slots - for zero downtime via swapping. We have 2 slots: staging and production.

Publishing.
A. A new build is published to the staging slot. Published means that the minimum number of instances is launched.
In our configuration, staging slot instances serve 0% of requests and the production slot handles 100% of requests.
B. Slot swap. Slot swap means that the staging instances are associated with the production slot. The production slot instances become staging instances and are terminated once they finish the requests they were processing.
The swap will not happen if any staging instance failed with an exception on start.

Azure bug.
Termination of the former production instances happens 1-3 minutes after switching to the new production instances.
During termination, Azure restarts the instance with the old code (BUG) and then terminates it.
I.e. it reanimates the instance it's going to kill and then kills it.

@amalayverma

amalayverma commented Oct 9, 2024

I was also facing this issue for a long time, but I have now fixed it. I think the issue is with Azure/functions-action@v1, which I was using earlier. Somehow it does not handle the deployment properly. It does the .zip deployment and updates the package reference in scm correctly, but somehow does not update the content of the file.
To fix this, I zipped the package explicitly and then deployed it using either

  - name: Deploy to App Service
    run: |
      az functionapp deployment source config-zip -g ${{ env.AZURE_RESOURCE_GROUP_NAME }} -n ${{ env.AZURE_FUNC_NAME }} --src '${{ env.ARTIFACT_DOWNLOAD_DIR }}/${{ env.ARTIFACT_NAME }}.zip'

OR

  - name: Deploy to App Service
    uses: azure/webapps-deploy@v2
    with:
      app-name: ${{ env.AZURE_FUNC_NAME }}
      package: '${{ env.ARTIFACT_DOWNLOAD_DIR }}/${{ env.ARTIFACT_NAME }}.zip'

Refer For Complete Workflow
