Always On Azure Functions #967

Open
KirkMunro opened this issue Sep 25, 2018 · 14 comments

Comments

@KirkMunro
commented Sep 25, 2018

I have a question/issue with the Always On functionality of Azure Functions. The documentation for Always On states the following:

Indicates that your web app needs to be loaded at all times. By default, web apps are unloaded after they have been idle. It is recommended that you enable this option when you have continuous WebJobs running on the web app.

The way that is written, and the way that Always On has been described, it sounds like the Azure Function (web app) should always be loaded and ready to go; however, in practice I've noticed the following behavior that makes me feel something isn't right with how Always On is implemented:

  1. When I deploy an update to an Azure Function, that update is not deployed and made ready until I invoke it once. If I'm using Always On, I feel I should rightly expect that any update will be loaded and ready to go immediately after it is deployed, so that the responsiveness of Azure Functions remains optimal. It's so bad that I've considered writing my own logic to invoke an endpoint after I update just so that the package is truly deployed and made ready, but that isn't work I should have to do myself with Azure Functions that are using an App Service Plan that supports Always On. This, in my mind, is a design issue with Azure Functions that should be corrected.

  2. If my Azure Function endpoints are not busy for a period of time (not sure how long, but I've noticed this after they are idle for a bit), and then I invoke one of the Azure Functions, it takes longer than it does when it is "hot". Yet this Azure Function package is supposed to be Always On, which means it should always be "hot", should it not? The current behavior makes me feel like the package still enters a cold state from time to time and has to be warmed back up again, which again is contradictory to how Always On is supposed to function.

Am I missing something? Or is Always On not truly functioning like Always On should?
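
The post-deployment "invoke an endpoint once" workaround mentioned in item 1 could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names `warm_up` and `http_status`, the retry count, and the delay are hypothetical and are not part of Azure Functions or any deployment tooling.

```python
import time
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 30.0) -> int:
    """Return the HTTP status code for a GET request to `url`."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def warm_up(
    url: str,
    attempts: int = 5,
    delay_seconds: float = 2.0,
    fetch: Callable[[str], int] = http_status,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `url` until it answers with a 2xx status or attempts run out."""
    for attempt in range(1, attempts + 1):
        try:
            if 200 <= fetch(url) < 300:
                return True
        except OSError:
            # urllib's URLError/HTTPError and socket timeouts are OSError
            # subclasses; treat them as "host not ready yet" and retry.
            pass
        if attempt < attempts:
            sleep(delay_seconds)
    return False
```

A deployment script could call `warm_up(...)` with the URL of one of the app's HTTP-triggered functions right after publishing, so that the first slow request is not paid by a real caller.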

@ConnorMcMahon

commented Sep 26, 2018

This is not expected behavior; a dedicated (App Service plan) Azure Function app should not experience cold starts.

Could you provide us with an application name and, ideally, a UTC timestamp of a bad cold start you experienced with that application after it was idle? If you don't feel comfortable sharing your application privately, just provide this information and we can help diagnose this problem.

@KirkMunro

Author

commented Sep 26, 2018

Thanks @ConnorMcMahon. I just reproduced the behavior I'm seeing for the second item identified above.

Here is the data you need to have a closer look:

UTC execution time: 2018-09-26 11:53:01.886
Execution id:       2f203813-8956-4221-b45a-f32d697c000b
Region:             East US 2

That specific invocation (which I performed in a test deployment of one of my function apps) was launched almost a day after the previous invocation and took 705.8432ms. Immediately after that, I invoked the function a bunch more times with the same logic, and the duration ranged between 20.3958ms and 74.2701ms. Also note that this is a simplified example.

Here's another example, this time showing an invocation that does a bit more work, where the function took 22 seconds (!) after it had been sitting for a while, only to then run repeatedly with runtimes in the range of 2-3 seconds:

UTC execution time: 2018-09-19 17:57:40
Execution id:       e2cc86bf-38ce-4750-abb2-3f47d1e619bd
Region:             East US 2

As for after deployment, here is an invocation that I did after a deployment, which took 11.3 seconds to run, followed by the same invocation taking in the range of 2-3 seconds:

UTC execution time: 2018-09-26 13:16:29
Execution id:       aed283e0-baef-4a92-bff7-e0e397ceb663
Region:             East US 2

In an Always On environment, I would expect a deployment to result in the function app being deployed to a worker in the background, with ongoing requests being sent to the existing deployment that is always on, and then once the new deployment has finished, any requests from that point on would go to the new deployment, so that performance remains optimal. That does not appear to be what is happening. It doesn't seem to matter if I wait after deployment either...I still take the initial hit with it taking longer to get going once I do a deployment.

If you can look into these examples, and share some guidance about how I can use Always On functions with a reasonable expected duration regardless of deployments, scaling, etc., I would appreciate it.
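
The first-call versus follow-up comparison described in this comment could be reproduced with a small timing helper along these lines. It is a sketch, not a diagnostic tool from the Azure Functions team: `compare_first_call` and its parameters are hypothetical names, and `invoke` stands in for whatever code actually calls the function endpoint.

```python
import statistics
import time
from typing import Callable, Dict


def compare_first_call(
    invoke: Callable[[], object],
    repeats: int = 10,
    clock: Callable[[], float] = time.perf_counter,
) -> Dict[str, float]:
    """Time one initial call, then `repeats` follow-up calls, in milliseconds."""

    def timed_ms() -> float:
        start = clock()
        invoke()
        return (clock() - start) * 1000.0

    first_ms = timed_ms()
    warm_ms = [timed_ms() for _ in range(repeats)]
    median_warm = statistics.median(warm_ms)
    return {
        "first_ms": first_ms,
        "warm_median_ms": median_warm,
        "warm_max_ms": max(warm_ms),
        # How many times slower the first call was than a typical warm call.
        "slowdown": first_ms / median_warm if median_warm > 0 else float("inf"),
    }
```

Running it once after the app has been idle and once right after a deployment would give comparable numbers for the cases listed above.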

@123Jun321

commented Oct 15, 2018

I have the same problem, but now I wonder whether this is normal behavior. Can anyone explain this?

@KirkMunro

Author

commented Oct 15, 2018

@ConnorMcMahon Have you made any progress with your investigation into this issue?

@ConnorMcMahon

commented Oct 15, 2018

@KirkMunro

The first one could be something as simple as the dependencies in your function not yet having been JITed by the .NET Framework. A way to test this would be to have a timer trigger run occasionally, using the same dependencies and doing some dummy work. Unfortunately, there is not much Always On can do to help in that case.

As for the deployment case, after a deployment, your function host is stopped. The next time it gets pinged by always-on (I forget the exact details, but I believe it is once a minute), the function application should start back up. In the case that you gave me, since you tested immediately after deployment, AlwaysOn had not yet pinged the function app into starting again. If you are seeing this behavior even after waiting several minutes, please give me that time range and I will investigate then.

I will post another update soon after I have finished investigating the second execution you listed.

@KirkMunro

Author

commented Oct 15, 2018

@ConnorMcMahon Just to call it out, the last execution was after a deployment; however, as indicated above, if I deploy an update to an always on function app, I would expect the deployment to be completed and initialized while requests that come in during the deployment are run against the existing web app deployment, and then once the new deployment is available, switch over to it and tear down the old deployment. There definitely seems to be room for improvement when it comes to function app execution time after an update.

That aside, that one was actually the least concerning to me (although I still think it should have a more optimal deployment process to maintain high availability and fast runtime). The other two are more concerning.

@ConnorMcMahon

commented Oct 15, 2018

@KirkMunro I reread your original message and edited my answer correspondingly. In terms of that more optimal deployment cycle, that is something that we support via slots.

@ConnorMcMahon

commented Oct 15, 2018

Looking at the second execution, none of the execution time is time spent by the functions runtime. We start executing the function code at 17:57:41.91 and finish executing at 17:58:04. The fact that this took 10 times as long as subsequent executions is very concerning.

I realize in my previous answer I assumed you were using C#. Do you mind sharing if your function executed is using C# or Node, and how many dependencies you are using?

This is not exactly a cold-start (where the functions runtime is not up and running). My leading theory is that this is a "luke-warm" start. If you have lots of dependencies in your function code, and it has been a long time since executing that function, then all of those dependencies may have to be reloaded into memory. This is something that we could optimize for in the functions runtime. However, a 20 second difference in execution time is definitely something that is concerning.

Have you seen similar behavior in the first execution of running your function locally?

@jeffhollan

Member

commented Nov 6, 2018

Pinging this issue to see if it was sorted out. @KirkMunro any updates?

@KirkMunro

Author

commented Nov 6, 2018

Thanks for drawing my attention back to this @jeffhollan.

To respond to @ConnorMcMahon's comments:

  1. Last I checked, slots were still in preview, not supported, which made me hesitant to use them. Has that changed?

  2. Regarding the number of dependencies, I wouldn't say there are a lot. 5 of the DLL dependencies added were included to work around this not being available yet. Other than that the PS 5 reference assemblies and Newtonsoft.Json NuGet packages were referenced. I think the big issue with the "luke-warm" start though is in this case PowerShell was being invoked from the Azure function runtime, and with that invocation you would have a lot of dlls loaded into memory as Azure PowerShell modules or AWS modules were loaded into the PowerShell session. Subsequent invocations would have those binaries loaded already, so they would run much faster.

As for whether or not I've seen similar behavior in the first execution of running my function locally, yes, I've noticed on first run it can take some time.

@ConnorMcMahon

commented Nov 6, 2018

@KirkMunro, that makes a lot of sense why you would see that performance hit then. Have you considered running a timer trigger that activates a dummy PowerShell script that would load all of those dependencies? You could figure out how frequently to run the timer by seeing how often you hit this "luke-warm" start after startup, as I imagine those dependencies don't stay in memory forever. This unfortunately probably wouldn't work too well with the deployment case, but it should help the other cases.
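
The idea of a periodic dummy job that loads dependencies ahead of real requests can be illustrated with a Python analogy. This is not the PowerShell or C# timer trigger discussed in the thread; `preload`, `keep_warm`, and the module names passed to them are hypothetical, and an actual function app would run the loading step from its own timer-triggered function rather than from a loop.

```python
import importlib
import time
from typing import Callable, Dict, Iterable, List


def preload(module_names: Iterable[str]) -> Dict[str, float]:
    """Import each module and return how long each import took, in seconds."""
    timings: Dict[str, float] = {}
    for name in module_names:
        start = time.perf_counter()
        # Returns the cached module if it has already been imported,
        # so repeat calls are cheap while the first one pays the load cost.
        importlib.import_module(name)
        timings[name] = time.perf_counter() - start
    return timings


def keep_warm(
    module_names: List[str],
    interval_seconds: float,
    rounds: int,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Dict[str, float]]:
    """Run `preload` `rounds` times, pausing `interval_seconds` between runs."""
    history: List[Dict[str, float]] = []
    for i in range(rounds):
        history.append(preload(module_names))
        if i < rounds - 1:
            sleep(interval_seconds)
    return history
```

Comparing the first entry of the returned history with later ones gives a rough sense of how much of a slow first invocation is dependency loading, which can help in choosing how often such a job would need to run.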

@KirkMunro

Author

commented Nov 6, 2018

I heard suggestions about a timer trigger approach for the free tier of Azure Functions to keep the runtime alive/ready, and now with AlwaysOn functions. I have to say though, that feels like a hack that shouldn't be necessary if you're using AlwaysOn. From the outside, the name AlwaysOn is overselling it.

Regardless, that work would have to be done by someone else, as I have left the role I was in when I was working on this project. I'll still keep working on Azure Functions in the meantime (but not the specific ones associated with this issue as I am no longer involved with that code), and I'll be looking at v2 of the runtime as well as the upcoming official PowerShell support to help this technology mature.

@ConnorMcMahon

commented Nov 7, 2018

@KirkMunro, AlwaysOn is actually a feature of Azure App Service that Functions on a dedicated plan get for free. It is not a specialized feature for Functions; it simply pings the app to keep it alive. Trying to JIT user code dependencies feels like it is outside of the purview of the feature.

With that said, JITing the code of your function manually does feel against the spirit of serverless, and maybe there is a space for us to make this easier.

@ColbyTresness

Collaborator

commented Mar 13, 2019

@ConnorMcMahon is this issue being left open to track anything?

@ColbyTresness added this to the Active Questions milestone Mar 13, 2019
