Cold start of Azure Function with large npm module trees is slow in Consumption plan #298

Closed
arafato opened this Issue Apr 25, 2016 · 127 comments


arafato commented Apr 25, 2016

Hi,

I've noticed that the initial call to an Azure Function (i.e. the first call a couple of minutes after the last one) takes a significant amount of time (up to 15 secs). Is there any way to pre-warm a function other than calling it up front? Will the startup performance be improved in the near future? I'm using it on the Dynamic plan. Thanks!

Best regards
--Oliver.

Contributor

davidebbo commented Apr 25, 2016

Currently, there is a short idle timeout on Function apps, so if they don't get requests for 5 minutes the next request causes a cold start which takes longer. We are looking into increasing this timeout, which will make the situation much less common.

@fabiocav fabiocav modified the milestones: Current, Next Jul 19, 2016

FlavorScape commented Sep 5, 2016

I've seen delays of up to 2 minutes, or even timeouts, with an Azure Function app and node.js on a dynamic hosting plan. This is terrible! Running Functions app version 0.5.

arafato commented Sep 5, 2016

Since upgrading to 0.5, delays have increased from 15 secs to as much as 50 secs on my side.

arafato commented Sep 5, 2016

I'm currently mitigating the cold-start issue with an additional pre-warming Azure function that periodically calls the other Azure function like so.

let http = require('http');

module.exports = function (context, myTimer) {
    http.get('http://your.azure-function.net', (res) => {
        context.log(`Successfully called endpoint. Got response: ${res.statusCode}`);
        context.done();
    }).on('error', (e) => {
        context.log(`Error: ${e.message}`);
        context.done();
    });
};

This function is called every 4 minutes through a timer trigger with the following CRON expression: 0 */4 * * * *
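
For reference, the schedule lives in the function's function.json; a minimal timer binding for the code above might look like this (a sketch; only the schedule and the myTimer parameter name come from the comment):

{
    "disabled": false,
    "bindings": [
        {
            "type": "timerTrigger",
            "direction": "in",
            "name": "myTimer",
            "schedule": "0 */4 * * * *"
        }
    ]
}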

vance commented Sep 5, 2016

I tried this. I have a timer calling every 4 minutes. I still get delays of up to 5 MINUTES. Here's the log of it. Fine, fine, fine, TERRIBLE...

[screenshot: 2016-09-05 at 12:12:35 am]

vance commented Sep 5, 2016

So, my classic plan app is running Function App v0.4, and it seems that the debugging messages may work better there? I see errors that don't show up in the dynamic app (the dynamic app just hangs for long periods). Could faults in one function affect the performance of the whole app? Sometimes I even get "BAD GATEWAY".

Contributor

davidebbo commented Sep 6, 2016

This is strange. To make sure that this is indeed related to 0.5 and not some other factors, can you try going back to 0.4 (instructions here)?

peterjuras commented Sep 6, 2016

For me personally, it is not only about the cold start/idle issue; as I wrote in #529, I think it would be good to have some transparency about the scenarios that are targeted by Azure Functions.

If I understand things correctly, the start-up time of a function is not only incurred on the cold start of a single container, but also affects new instances when the dynamic plan wants to scale out.

This means that if I have an HTTP-triggered API function and one instance of that function is already running (let's say an execution takes 100ms), the platform could wait until the execution finishes and then serve the new request, leading to a maximum execution time of 200ms (which is kind of fine).

But instead, the dynamic plan could decide to scale out, and now we have the cold start penalty again. This means that instead of a maximum execution time of 200ms, this request will go above and beyond 5 seconds.

Could you explain how you are handling these cases?

Contributor

davidebbo commented Sep 6, 2016

@peterjuras let's keep this issue focused on the potential cold start regression which is causing cold starts of over 1 minute for some users. We can use a separate thread for a more general discussion of how the system works.

For users running into these extreme cold starts, here are some questions that would help the investigation:

  1. Does 0.4 vs 0.5 make a difference?
  2. What functions are in the Function App? Is it just the one JS function, or are there others? In particular, do you have any F# functions (there may be an issue there)?
  3. Can you share the name of a Function App where you saw this, and the UTC time of one specific slow cold start?
vance commented Sep 8, 2016

So I did a long-running test again where I ping every 3 minutes. The response times are all under 1,000 ms unless one of the scheduled jobs fails to fire. Then, after six minutes elapse, the response time goes to a whopping 150,000-280,000 ms. I will try reverting to 0.4 and see what happens.

Contributor

davidebbo commented Sep 8, 2016

@vance so you mean you are measuring HTTP functions, but the app happens to also have a scheduled function? What's the function's schedule? Where does the 'six minute lapse' come from?

vance commented Sep 9, 2016

I mean I have a timer-triggered function making an HTTP request to my other function. It is set to 3 minutes. However, sometimes it does not fire, and six minutes elapse. The next time it runs, the execution time is extreme. Sometimes it's so slow that it creates a ripple effect, going from 280,000 ms to 140,000 to 30,000 or so, before it finally returns to under 1,000 ms for the next 10-20 calls.

Contributor

davidebbo commented Sep 9, 2016

That's very odd behavior. If you can share your Function App name (either directly or indirectly), and the UTC time when you saw this, we can hopefully analyse the situation.

vance commented Sep 9, 2016

Sure, here's a dummy app in the same region: fscape-dynamic3. Any function endpoint shows the extreme times in the logs, I think.

The screenshot is dated

05 Sep 2016 19:12:35 GMT

so it should be around this time.

vance commented Sep 9, 2016

So I reverted to v0.4: similar results. Also, I don't know if it's related, but sometimes the "pinged" services even time out, or even issue a CORS error, while other requests work. I tried just having a local script run a query every 10 minutes. See that result in the second image. No server changes happened at this time.

[image 1]

[image 2]

dennis-yemelyanov commented Sep 10, 2016

I'm seeing similar symptoms. Although this might be an unrelated issue, in my case most of the time seems to be spent loading the azure-storage library. Any ideas if this is normal or how this can be fixed?

context.log('requiring azure');
const azure = require('azure-storage');
context.log('required azure'); 

2016-09-10T21:24:46.626 requiring azure
2016-09-10T21:27:04.641 required azure
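
An aside on the snippet above: Node caches modules after the first require, so on a warm host the cost is paid once per process. A common mitigation (a sketch, not from this thread) is to load expensive modules lazily, so host startup isn't blocked and subsequent invocations reuse the cached module:

let azure; // module-scoped: survives across invocations on a warm host

module.exports = function (context, req) {
    if (!azure) {
        context.log('loading azure-storage');
        azure = require('azure-storage'); // slow only on the first call per process
        context.log('loaded azure-storage');
    }
    // ... use azure here ...
    context.done();
};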

vance commented Sep 10, 2016

Actually, this might explain a lot of the delay. I think it may be multiplied if you hit a bunch of functions simultaneously when coming out of sleep. Maybe it pegs the I/O for the VM. I do know that this dependency seems rather large...

I have no idea how this virtualization is engineered, but I wonder if there's some way to snapshot these static resources as globals in a shared memory location that's immutable. Just a thought, don't know if that's even possible.

dennis-yemelyanov commented Sep 11, 2016

Yeah, keeping it in memory might not be possible in some scenarios (at least not within the host process, which might restart etc.). I think the main question though is why even fresh initialization takes so long. azure-storage indeed has tons of dependencies (I was shocked when I first saw it; way too many for a library that effectively is just a simple REST API wrapper). But when I run the same code locally (in a new node process) it executes instantly without any issues, which makes me think it could be an Azure runtime issue...

vance commented Sep 12, 2016

I wonder if it's the network or the storage being used. Maybe when these instances come back from idle, the data has to be loaded over the network, since they are not bound to any particular hardware (or VM)? I did not see these long startup times when hitting a Classic Service Plan app (with AlwaysOn turned off).

arafato commented Sep 13, 2016

Good point, @vance. Deleting an "average" node_modules folder on a dynamic plan takes well over 5 minutes in many cases. IOPS seems to be very low...

vance commented Sep 27, 2016

Is there any way to have the container manager, or whatever manages the app pool, cache the resources on the container disk, so they at least come back to life without going over the network? At least for the first instance.

securityvoid commented Mar 1, 2017

Okay, I'm having some issues with the Webpack technique and I'm wondering what other people are doing.

I've mentioned this before, but I created a build-script. It goes through each one of the index.js for each function and extracts every require. These then go into a single, central javascript file like @1N50MN14 mentioned earlier.

e.g.
global.deps = {
    package1: require('package1'),
    package2: require('package2'),
}

The final file is WebPacked and uglified, each index.js is set to include this file, and the references for each library are updated to reference the appropriate require.

e.g.
var myvar = global.deps['package1'];

The problem I'm running into is that the final JavaScript file, while only 700kb, is large enough that if I add it to a VERY simple test function, the cold start takes longer than 5 minutes. Since I'm on a consumption plan, you can't set the timeout above 5 minutes.

For example, the following will take over 5 minutes to run:

'use strict';
require('D:\\home\\site\\repository\\azure.deps.js');

module.exports = function (context) {
    context.res = {
        status: 200,
        headers: {
            "Content-Type": "application/json"
        },
        body: { "status": "alive" }
    };
    context.done();
};

Now, in the above example I don't even need the require, so it can be taken out; and if I drop the require, it of course runs really quickly. I'm sure that if I ever got it cold started, the warm function would have no issue with the 5 minutes. However, with the 5-minute timeout it just dies before it can ever get warm.

What I don't understand is I don't think my dependency list is too terribly long:

  "dependencies": {
    "crypto": "0.0.3",
    "dotenv": "^4.0.0",
    "fs": "0.0.1-security",
    "https": "^1.0.0",
    "jsonwebtoken": "^7.3.0",
    "jwk-to-pem": "^1.2.6",
    "lodash": "^4.17.4",
    "path": "^0.12.7",
    "q": "^1.4.1",
    "xml2js": "^0.4.17",
    "xmlbuilder": "^8.2.2"
  }

As a result it seems like there should be some way to get this to work; and in fact, if anyone is getting anything to run, I feel like I should be able to get my code to run.

1N50MN14 commented Mar 1, 2017

@securityvoid 700k of uglified code sounds like a lot; are you sure you're doing it right? path, fs, crypto, and https all sound like native node modules, which you don't need to include in your bundle. Also, did you make sure your node_modules folder is either in .gitignore or completely deleted? That has been the reason cold start was too slow for me. To be on the safe side, double-check in the portal that node_modules does not exist in your bundle directory (or other directories where you don't need it) and that your git integration is properly in sync.

My entry file (named index.src.js) looks something like this:

global.deps = {
  Promise: require('bluebird'),
  jwt: require('jsonwebtoken'),
  uuid: require('uuid')
}

Then I have a shell script (that would be a .bat on Windows, I guess?) that I use to bundle and minify, which looks like this:

rm index.bundle.min.js
rm index.bundle.js
npm install
browserify --node index.src.js > index.bundle.js
uglify -s index.bundle.js -o index.bundle.min.js 
rm -rf ./node_modules

And in my function code I use the deps like so:

require('../SDK/index.bundle.min')

var Promise = global.deps.Promise;
var uuid = global.deps.uuid;

module.exports = function(context)
{
  //code
}

Hope some of this helps ;)

securityvoid commented Mar 1, 2017

@1N50MN14 I'll go back and double-check all the dependencies to be sure, but I think I have it all set up right. I can tell you that I'm gitignoring node_modules, and that the code runs fine locally. As a result it should be right.

Since I'm doing this as a deploy script, my code actually has the "normal" requires prior to the script running that does all the updates.

Only the optimized code ever makes it into the wwwroot (no node_modules).

securityvoid commented Mar 1, 2017

@1N50MN14 I went back and did a print-out of the JSON stats from Webpack to see everything that was being packed up. It looks like the ~700k is correct. A good 3/4 of it is associated with the jsonwebtoken package. That package does do a lot, so it sort of makes sense.

@davidebbo Are there plans on how to handle this? I can effectively undo all the Webpacking, but then I'm still back to the slow response times.

ricklove commented Mar 1, 2017

@securityvoid

With my technique, I have around a 10-second cold start (which is equivalent to the average C# cold start from my tests). Some of my files are 3MB, which is huge, but I don't run uglify and it doesn't seem to matter. (It seems that the cold-start issue with Azure Functions has to do with latency, not bandwidth. The number of file dependencies is the problem; large files don't slow it down much.)

Some observations:

  • I don't manually do anything to prune dependencies.
  • I leave everything in the original node_modules, place everything in its final position and then run webpack. This pulls in the node_modules dependencies so that it is no longer needed at all. (In my deployment folder, I have no node_modules and I have no project.json.)
  • I do all this on my machine before I deploy.
  • It takes around a minute for my build tool to run webpack on everything.
  • When I deploy, it takes a few minutes for everything to update from the time I git push.
  • Cold start time is around 10 seconds
    • I have run my cold-start measurement tests from a separate Azure Function and concluded that JavaScript, C#, and F# Functions all have equivalent (within 20%) cold start times (assuming a webpack pre-built solution like this).
  • There are probably some points of improvement, but I have obtained sufficient performance for production use.

Folder structure:

  • root
    • node_modules
    • project.json
    • src (Typescript)
    • lib (Generated from src)
    • .deployments (points to deployments)
    • deployments
      • no project.json
      • [function-folder]
        • function.json
        • index.js (requires build.js)
        • build.source.js (points to entry point in lib folder)
        • build.js
          • built using webpack
          • contains all dependencies from lib and node_modules
          • largest is around 3MB (I don't use uglify)

Process:

  • Run tsc to transpile from the src folder into a lib folder in my project.
    • The code in this folder still points to the original node_modules for external require.
  • Generate a function-folder for each endpoint (this can be done manually).
    • This includes:
      • function.json
      • index.js
      • build.source.js
  • Run webpack against each build.source.js which produces a build.js
    • This now has all the code from the lib folder and all dependencies from node_modules.
  • Git push the project with the deployments folder to the production branch
    • no node_modules folder is included
    • lib folder is not needed (but I include it because my CI tests run against it)
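
An illustrative webpack configuration for this kind of per-function bundling (a sketch using the folder names above, not ricklove's actual config):

// webpack.config.js - bundles one function's entry point (build.source.js)
// into a single build.js that inlines all node_modules dependencies.
const path = require('path');

module.exports = {
    entry: './deployments/function-folder/build.source.js',
    target: 'node', // keep Node built-ins (fs, path, ...) as runtime requires
    output: {
        path: path.resolve(__dirname, 'deployments', 'function-folder'),
        filename: 'build.js',
        libraryTarget: 'commonjs2' // preserve module.exports for the Functions host
    }
};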
securityvoid commented Mar 1, 2017

@ricklove Thanks for sharing. If you have 3MB files, the 700k definitely isn't why my stuff is so slow. I'm going to try without uglifying and see if there is a difference.

BTW, are you using the standard consumption plan?

ricklove commented Mar 1, 2017

@securityvoid Yes, I'm using the consumption plan.

For your case, I suspect there is something in your deployments folder with many files that is getting loaded during cold start.

Also, I just extended my cold-start timing tests so I can directly compare a JavaScript run with no dependencies against a 3MB function that uses azure-storage.

My test runs every 12 minutes, alternating which function it calls (from an external function app). So it's a good test from the perspective of a client: it measures the actual turnaround time from initial call to client receipt of the response (the monitor page of the function itself only shows ~2 seconds, while this test shows ~10 seconds).

I'll give a report tomorrow.

securityvoid commented Mar 1, 2017

@ricklove I would think that if it was something generic in my deployments folder, then the function would not behave differently whether I had that require or not.

This times out every time:

'use strict';
require('D:\\home\\site\\repository\\azure.deps.js');

module.exports = function (context) {
    context.res = {
        status: 200,
        headers: {
            "Content-Type": "application/json"
        },
        body: { "status": "alive" }
    };
    context.done();
};

This runs every time:

'use strict';

module.exports = function (context) {
    context.res = {
        status: 200,
        headers: {
            "Content-Type": "application/json"
        },
        body: { "status": "alive" }
    };
    context.done();
};

It doesn't make any sense for me to have issues with the 700k file when you have a 3MB file, but it seems like the file itself is somehow the issue. I suppose the alternative is that the file is just the last little bit that pushes it over the 5-minute mark.

In any case, removing uglify did not really help.

securityvoid commented Mar 1, 2017

@ricklove Arg! I think I just noticed my own typo.

If you notice, the file path in the require says "repository". That is the non-webpacked version of the file. The webpacked version is in the wwwroot.

Since the file existed, it didn't error out, and in fact would work... given enough time.

I'm redeploying now to confirm that solves things.

phillipharding commented Jun 8, 2017

@davidebbo @1N50MN14 I've made the Browserify technique work without the use of global...

The shared/library module is as follows...

/**
 * ./library/shared.js -->
 * export your dependencies here
 */
module.exports = {
	request: require('request'),
	requestPromise: require('request-promise'),
	common: require("./common"),
};

And the Function code file is...

/**
 * ./functionname/index.js -->
 */
var shared = require("../library/shared.min");

function functionHandler(context, req) {
	context.log('FuncImpersonate01 HTTP trigger function received a request.');
	shared.common.logRequestHeaders(context, req);

	context.res = {
		status: 200,
		headers: { "Content-Type": "application/json" },
		body: {
			Success: true,
		},
	};
	context.done();
};
module.exports = functionHandler;

The Build/Browserify commands are...

browserify --node ./library/shared.js --standalone shared -o ./library/shared.bundle.js
uglifyjs ./library/shared.bundle.js -o ./library/shared.min.js
Contributor

davidebbo commented Jun 8, 2017

@phillipharding thanks for sharing!

As an aside, note that we've added support to automatically run funcpack on Kudu git deployments. To enable it, simply set an Azure App Setting: SCM_USE_FUNCPACK=1. In theory this should just work, but it's still new, so feedback is welcome (https://github.com/Azure/azure-functions-pack is a good place for it).
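
For anyone trying this from the Azure CLI, one way to set that app setting (a sketch; the app and resource group names are placeholders):

az functionapp config appsettings set -n <app-name> -g <resource-group> --settings SCM_USE_FUNCPACK=1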

1N50MN14 commented Jun 13, 2017

@phillipharding Awesome, thanks for sharing this! It seems that browserify's --standalone did the trick!

bakatz commented Jan 8, 2018

I've been running into this issue for some time now and it's very frustrating, as I'm seeing 60-second startup times on a regular basis. I'll give azure-functions-pack a shot at some point; it seems like a decent workaround for the short term, but is there anything on the roadmap to address the root cause (i.e. the underlying file system being prohibitively slow)?

Specifically, I'm using the Slack Events API, and if Slack determines that a request to my API takes more than 5 seconds to complete (which happens regularly whenever whatever orchestrates Azure Functions decides to spin down our API), it'll automatically be retried over and over. This behavior in combination with Functions' filesystem issues means I can't really work with Slack's API without a hack that ignores "retry" messages based on a certain request header value, which isn't ideal for various reasons.

Thanks
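
A sketch of the header-based retry filter described above, assuming Slack's X-Slack-Retry-Num header (the comment doesn't name the header, so this is an assumption):

module.exports = function (context, req) {
    // Slack sets X-Slack-Retry-Num on redelivery attempts (assumed header);
    // Node lowercases incoming header names.
    if (req.headers && req.headers['x-slack-retry-num']) {
        context.res = { status: 200, body: 'retry ignored' };
        return context.done();
    }
    // ... normal event handling ...
    context.res = { status: 200, body: 'ok' };
    context.done();
};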

adamdriscoll commented Jan 16, 2018

I'm having the exact same issue with the Shopify Webhook API. It will retry if the request doesn't finish in 5 seconds. I usually get about 3-4 requests before one returns in time for it to stop trying.

bladerunner41 commented Jan 23, 2018

Microsoft needs to fix this ASAP. Node.js runs fine on AWS Lambda. Disappointing when you see articles like this: seven reasons why so many Node.js devs are excited about Microsoft Azure

tony-gutierrez commented Feb 7, 2018

Forget the webpack flag; a huge number of modules won't compress properly with it. So basically there's still no solution, and it's a pretty useless product if you want serverless node.

bakatz commented Feb 8, 2018

@tony-gutierrez agree - this is something I would have expected Azure Functions to launch with, not something that still isn't fixed a full year after the original issue was filed. 🤔

tony-gutierrez commented Feb 8, 2018

We started with Functions, trying to use the consumption plan. Not even close to working. Then we ran the app service plan for a while, and at least it worked. Then we realized we basically had none of the benefits of a serverless host, and all the horrid things function apps incur (like no access to console.log, super slow storage, not easy to dev locally, etc.). We quickly reworked it into a normal node app and deployed it to a VM and an App Service. The perf increase is pretty major on both.

bhosale-ajay commented Feb 15, 2018

We used webpack to bundle the npm module dependencies into a single file to reduce the cold start time; refer to this medium article.

Contributor

davidebbo commented Feb 15, 2018

See also the newly announced Run-From-Zip feature, which also helps with cold start.
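
(Run-From-Zip is driven by an app setting that points at a zip of the site content; around the time of this comment the setting was named WEBSITE_RUN_FROM_ZIP, e.g. WEBSITE_RUN_FROM_ZIP=<url-to-zip>. It has since been renamed WEBSITE_RUN_FROM_PACKAGE; check the current docs for the exact name and accepted values.)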

Contributor

davidebbo commented Jun 13, 2018

Closing this issue as Run-From-Zip is working well enough for most people (and some others are using funcpack). It doesn't mean that there can't be more optimizations, but we can always track specific work separately, as this issue is too open-ended.

@davidebbo davidebbo closed this Jun 13, 2018
