Node 4.2 large memory spikes and timeouts on Heroku #3370

Closed
AndrewBarba opened this Issue Oct 14, 2015 · 42 comments

@AndrewBarba

Below is a screenshot of the past 12 hours, which have been a total disaster for us, to say the least. After updating to Node 4.2 (from 0.10) in production we immediately exceeded all memory quotas and experienced a high volume of timeouts (even with no load and memory under the 1GB limit).

First, I apologize if this is not the place for this. I am happy to move discussion somewhere else and we will help diagnose whatever you guys need. We did this same parade with Node 0.12 and had to downgrade to 0.10.

Second, and I guess the real question here: is Heroku's 512MB of RAM simply not enough to run Node 4.x? If that is the case, cool, but the memory constraints definitely need to be made clearer.

Timeline:

  • Tue, Oct. 13th, at 2pm EST we deployed Node 4.2.0 with cluster enabled running 2 worker processes. Immediately hit the 512MB memory limit, as seen in the picture below.
  • Tue, Oct. 13th, at 2:15pm EST we removed cluster completely. Hit memory limits 30 min later.
  • Tue, Oct. 13th, at 4:00pm EST we saw Node 4.2.1 was released and deployed 4.2.1. Continued to hit memory limits.
  • Tue, Oct. 13th, at 6:00pm EST we doubled memory to 1GB.

In general you can see the memory is all over the place; maybe that is expected with newer versions of V8...

screen shot 2015-10-14 at 12 08 03 pm

Although I don't have a screenshot of it, you can see in the first part of the graph, running Node 0.10, that memory stays almost perfectly flat at 256MB of RAM. Under any load, that was consistent.

For reference, here is a load test we did in a dev environment running Node 4.2.1, with cluster forked to 4 processes and about 5k requests per minute. It also immediately hit the higher 1GB memory limit. We then dropped this down to 2 forked processes with the same result.

screen shot 2015-10-14 at 12 26 15 pm

@mscdex mscdex added the memory label Oct 14, 2015

@bnoordhuis (Member) commented Oct 14, 2015

What happens when you start with (for example) node --max_old_space_size=256? The GC is lazy (lazier than in v0.10) and the default upper limit is about 1.5 GB, so you'll need to cap it if your machine has less memory.
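For Heroku specifically, the flag goes wherever the process is launched. A minimal Procfile sketch, with server.js standing in for the real entry point:

web: node --max_old_space_size=256 server.js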

@AndrewBarba commented Oct 14, 2015

Okay, just tried running with --max_old_space_size=256 using cluster and 4 processes. Hit the limit immediately, so dropping down to 1 process again. Is the 256MB per process?

2015-10-14T16:46:29.457954+00:00 heroku[web.1]: source=web.1 dyno=heroku.18327029.28d659a4-2e6e-4ee5-9fb9-18dc6d99e1de sample#memory_total=578.31MB sample#memory_rss=445.58MB sample#memory_cache=0.00MB sample#memory_swap=132.73MB sample#memory_pgpgin=338334pages sample#memory_pgpgout=225287pages
@bnoordhuis (Member) commented Oct 14, 2015

Is the 256MB per process?

Yes.

@AndrewBarba commented Oct 14, 2015

Dropping down to 1 process, --max_old_space_size=256 and --max_old_space_size=128 both returned VERY consistent results. Huge difference. Here is running at 128:
screen shot 2015-10-14 at 1 01 38 pm
And here is running at --max_old_space_size=512:
screen shot 2015-10-14 at 12 59 35 pm

@jasnell (Member) commented Oct 14, 2015

@bnoordhuis ... looking at this, we definitely want to make sure this is documented better.

@AndrewBarba commented Oct 14, 2015

Yes, very eye-opening for me at least. Fingers crossed, but I think you guys just saved us from having to revert everything again. Really appreciate the quick response. Going to leave this open through end of day while we do a few more tests on our end.

@bnoordhuis (Member) commented Oct 14, 2015

@jasnell I don't disagree, but where would you put it? Maybe it's time we add a FAQ.

@mscdex (Contributor) commented Oct 14, 2015

FWIW I just added a FAQ to the wiki here, and it is linked to from the main wiki page. Perhaps this could also be linked to somewhere on nodejs.org and other places?

@jbergstroem (Member) commented Oct 15, 2015

Perhaps dig through a few old issues that share the same characteristics, so Google/SEO/etc. can point people in the right direction?

@hunterloftis commented Oct 15, 2015

@friism I recommend --max_old_space_size when this occasionally pops up; the biggest downside is that when you specify that flag and then exceed the limit, your app will shut down hard. Heroku allows you to burst to 5x the memory limit, so it's not always best to crash an app instead of letting it temporarily exceed soft limits.

This is probably documentation we should provide as well, in a 'Memory management with Node on Heroku' article.

@AndrewBarba commented Oct 15, 2015

@hunterloftis From what we've seen, the second we hit a memory limit everything pretty much starts timing out. I'd actually rather have a hard kill at that point so we can get a fresh node up that can respond to requests again.

You mention that if we specify --max_old_space_size the node will shut down immediately if we hit that limit, but doesn't that flag have to do with V8's heap and when it should start cleaning things up? I don't see how that relates to Heroku's memory limit and when a node is killed (other than exceeding Heroku's limit, of course). In the tests above where we passed --max_old_space_size=128, the memory on the node was around 220MB.

@hunterloftis commented Oct 15, 2015

@AndrewBarba Yeah, it looks like your app is particularly hard-hit by lazy collection, and in that case the lesser evil is shutting down a process.

I'd be interested in hearing from @bnoordhuis on this but, from what I've seen, if your app requires more space than max_old_space_size allows (i.e., you're actually storing > X MB in objects, vs. storing < X MB in objects and needing more frequent sweeps to clear out things you don't store anymore)... then you'll get 'allocation failed' errors (hard shutdowns).

Also keep in mind that 'old space' is just one part of the application's memory footprint, so you can expect the whole app to take more than max_old_space_size as a whole.
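One quick way to see that gap is to compare V8's heap figures with the resident set size the OS reports:

node -e "console.log(process.memoryUsage())"   # rss is typically well above heapTotal/heapUsed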

@trevnorris (Contributor) commented Oct 15, 2015

Might try playing with flags like --gc_interval and/or --always_compact. They will slow down the application, but if there's a memory crunch then it's worth giving them a go.

@hunterloftis commented Oct 15, 2015

For a kitchen sink example with arbitrary values:

node --gc_global --optimize_for_size --max_old_space_size=960 --use_idle_notification --always_compact --max_executable_size=64 --gc_interval=100 --expose_gc server.js

@ahdinosaur ahdinosaur referenced this issue in mmckegg/loop-drop-app Oct 16, 2015

Closed

seems to start lagging after being used for a while #32

@bnoordhuis (Member) commented Oct 17, 2015

@hunterloftis Yes, --max_old_space_size is a hard limit. --use_idle_notification is currently a no-op, by the way.

@claudiorodriguez (Member) commented Oct 20, 2015

Just weighing in to share my experience: I had this happen to me with 0.12.x and 4.0.x. Before I migrated everything to 4.0, Heroku Support recommended --max_old_space_size and it worked like a charm. I still get some instances of the error every now and then, but the affected dyno shuts down and no great harm is done, just a couple dozen timed-out requests once a week or so (out of thousands per minute).
Also, migrating to 4.0.x cut memory usage down from an average of about 500MB to 270MB per dyno.

@AndrewBarba commented Oct 20, 2015

@fansworld-claudio That's great to hear. What size dyno are you running and what did you end up setting max_old_space_size to? Also, have you used this setting with cluster?

@claudiorodriguez (Member) commented Oct 20, 2015

@AndrewBarba At first I had the Standard-2X (1GB) dynos and max_old at 960, then after a couple weeks of stability I scaled down to Standard-1X (512MB) and max_old at 480 - it has been stable for a couple of months. YMMV though; this is a REST API that does mostly I/O with Redis and Mongo.

@apechimp (Contributor) commented Oct 30, 2015

It's kind of a pain to set this for all your Heroku apps (and remember to vary it based on dyno size), so I made heroku-node, which wraps node and sets max_old_space_size based on the values of $WEB_MEMORY and $WEB_CONCURRENCY, which Heroku sets based on the size of dyno you've chosen.

@AndrewBarba commented Oct 31, 2015

@apechimp I just gave that a go and this showed in the console immediately:

2015-10-31T14:51:14.352577+00:00 app[web.1]: <--- Last few GCs --->
2015-10-31T14:51:14.352577+00:00 app[web.1]:
2015-10-31T14:51:14.352579+00:00 app[web.1]:    14856 ms: Mark-sweep 48.0 (86.0) -> 48.0 (86.0) MB, 93.5 / 0 ms [allocation failure] [GC in old space requested].
2015-10-31T14:51:14.352580+00:00 app[web.1]:    14967 ms: Mark-sweep 48.0 (86.0) -> 48.0 (86.0) MB, 111.4 / 0 ms [allocation failure] [GC in old space requested].
2015-10-31T14:51:14.352581+00:00 app[web.1]:    15055 ms: Mark-sweep 48.0 (86.0) -> 48.0 (86.0) MB, 87.9 / 0 ms [last resort gc].
2015-10-31T14:51:14.352582+00:00 app[web.1]:    15178 ms: Mark-sweep 48.0 (86.0) -> 48.0 (86.0) MB, 122.7 / 0 ms [last resort gc].
2015-10-31T14:51:14.352582+00:00 app[web.1]:
2015-10-31T14:51:14.352583+00:00 app[web.1]:
2015-10-31T14:51:14.352583+00:00 app[web.1]: <--- JS stacktrace --->
2015-10-31T14:51:14.352584+00:00 app[web.1]:
2015-10-31T14:51:14.352585+00:00 app[web.1]: ==== JS stack trace =========================================

Looks like you are dividing WEB_MEMORY by some factor of concurrency, but WEB_MEMORY, as explained by the article you reference, is the recommended memory for each process. As is, running this on a Performance-M dyno would give you:

WEB_CONCURRENCY = 5;
WEB_MEMORY = 512;
WEB_MEMORY / (2 * WEB_CONCURRENCY) = 51.2MB

51.2MB of RAM is not enough to even start up. I think you can just simplify to WEB_MEMORY / 2 or even WEB_MEMORY / 1.5. For us 256 seems to be the sweet spot, so we'll stick with / 2.
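A minimal sketch of what that simplification could look like as a launch wrapper (the script and the /2 factor are assumptions for illustration, not how heroku-node necessarily implements it):

#!/bin/sh
# Hypothetical wrapper: cap V8's old space at half of Heroku's per-process WEB_MEMORY hint.
WEB_MEMORY="${WEB_MEMORY:-512}"
exec node --max_old_space_size=$(( WEB_MEMORY / 2 )) "$@"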

@apechimp (Contributor) commented Oct 31, 2015

@AndrewBarba, mea culpa. That's fixed as of heroku-node@1.0.1.

@MayhemYDG MayhemYDG referenced this issue in MayhemYDG/iltorb Nov 4, 2015

Closed

High memory usage #3

@piranna (Contributor) commented Nov 19, 2015

Couldn't the garbage collector also be executed when getting an out-of-memory exception? This would help in a more generic way on systems with really low memory constraints; for example, when executing npm on NodeOS it starts killing processes on QEMU with the default memory settings (128MB), and the same would happen on a Raspberry Pi without swap...

@ChALkeR (Member) commented Nov 19, 2015

@piranna What exact exception do you mean? If you have allocation failures in mind, catching all of those in some generic way is a terrible idea — many things could go wrong from that. If you are speaking about the system OOM killer, intercepting it requires specific settings at the system level.

@piranna (Contributor) commented Nov 19, 2015

@piranna What exact exception do you mean?

The error I get is an Out of Memory exception, thrown at the process level:
pantallazo-qemu
It happens when calling malloc() and the system doesn't have enough free memory, or a contiguous free chunk big enough for the requested size. I propose that when creating a new JavaScript object, if reserving the memory fails, the exception is caught in the C++ layer, the garbage collector is run, and the reservation is tried again. Probably this is more of a V8 issue...

@ChALkeR (Member) commented Nov 19, 2015

@piranna That screenshot you posted shows the OOM killer being triggered. It's not interceptable by default.

Also, it's not necessarily caused by the node process. Your process could allocate some memory at the beginning, then do nothing and still be killed by the OOM killer later (while doing nothing).

@piranna (Contributor) commented Nov 19, 2015

@piranna That screenshot you posted shows the OOM killer being triggered. It's not interceptable by default.

What a shame, I thought it would be a good feature :-(

Also, it's not necessarily caused by the node process. Your process could allocate some memory at the beginning, then do nothing and still be killed by the OOM killer later (while doing nothing).

Yeah, I know it kills random processes after that. In fact, in that screenshot the error was caused by slap, but nsh stopped working too :-/

@joanniclaborde commented Nov 20, 2015

For what it's worth, here's the solution we're trying now (I sure wish I had found this conversation before!):

if [ ! "$WEB_MEMORY" = "" ]; then
  if [ $WEB_MEMORY -le 512 ]; then
    NODE_FLAGS="--max_semi_space_size=2 --max_old_space_size=256 --max_executable_size=192"
  elif [ $WEB_MEMORY -le 768 ]; then
    NODE_FLAGS="--max_semi_space_size=8 --max_old_space_size=512 --max_executable_size=384"
  elif [ $WEB_MEMORY -le 1024 ]; then
    NODE_FLAGS="--max_semi_space_size=16 --max_old_space_size=1024 --max_executable_size=512"
  fi
fi

node $NODE_FLAGS "$@"

I'm getting the values for those flags from the V8 defaults, and it seems to be working great so far (Heroku / 512MB).
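Assuming the snippet above is saved as an executable wrapper script, it could be wired in via the Procfile like so (the path bin/node-wrapped is illustrative):

web: bin/node-wrapped server.js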

@spalger spalger referenced this issue in elastic/kibana Dec 8, 2015

Open

Lets talk about v8's memory settings #5595

@piranna (Contributor) commented Dec 11, 2015

the default upper limit is about 1.5 GB

Could it be possible to lower it when the sum of physical memory + swap is smaller than this? It doesn't make sense to have such a big limit if it's impossible to reach it... This would help on memory-constrained systems.

@piranna (Contributor) commented Dec 11, 2015

@piranna see #3370 (comment) and #3370 (comment)

Yes, I've read them; what I'm asking for is to calculate and set these values automatically on start, instead of setting them with flags.
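As a rough illustration of what such an automatic cap could look like from the outside, a launcher sketch that derives the old-space limit from physical memory (Linux-only, and the halving factor is an arbitrary assumption):

#!/bin/sh
# Hypothetical launcher: derive --max_old_space_size from physical RAM (reads Linux /proc).
TOTAL_KB=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
OLD_SPACE_MB=$(( TOTAL_KB / 1024 / 2 ))   # use half of RAM, leaving headroom for the rest of the process
exec node --max_old_space_size="$OLD_SPACE_MB" "$@"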

@joanniclaborde commented Dec 11, 2015

Or even a single flag could be a nice option: node --physical_memory=$WEB_MEMORY

@joanniclaborde commented Dec 11, 2015

I also added custom settings for modulus.io's 396MB servos:

if [ $1 -le 396 ]; then
  NODE_FLAGS="--max_semi_space_size=1 --max_old_space_size=198 --max_executable_size=148"
elif ...

@piranna (Contributor) commented Dec 11, 2015

Or even a single flag could be a nice option: node --physical_memory=$WEB_MEMORY

Isn't the max_old_space_size flag equivalent to this? Maybe it could be extended to accept "auto", to calculate it from the current system memory...

@TylerBrock commented Jan 12, 2016

Because setting max-old-space-size just crashes your application when the limit is reached, it isn't very helpful on its own. I made this: https://github.com/HustleInc/regiment, which will seamlessly create and preemptively replace workers so that you never hit max-old-space-size.

@hunterloftis hunterloftis referenced this issue in heroku/heroku-buildpack-nodejs Jan 13, 2016

Open

Alias node with flags for dyno-specific memory limits #287

@TylerBrock commented Jan 14, 2016

Follow-up: this is pure win. I'm running several 2X dynos in production for a very leaky app with Regiment.middleware.MemoryFootprint(750), and I've had no dropped requests, no memory overuse, and my aggregate memory usage hovers around 700-750MB/dyno.

pure_memory_win

You do not want to just set --max-semi-space-size, max-old-space-size, max-executable-size, etc., because when those limits are reached your node process simply crashes and any currently running requests are cut short.

@abienkowski commented Feb 3, 2016

Thank you for the detailed analysis.

Just spent all day debugging a critical infrastructure performance issue, and finally found that one of the services had been moved to v4.2.8. After rolling back to v0.12 everything returned to normal.

@ChALkeR ChALkeR added the performance label Feb 16, 2016

@4kochi 4kochi referenced this issue in litixsoft/log4js-node-mongodb Apr 13, 2016

Closed

Memory Leak #3

@bjfletcher commented May 3, 2016

@TylerBrock you mentioned that:

setting max-old-space-size just crashes your application when the limit is reached

I've been experimenting and noticed that if --max-old-space-size is used without also adjusting --max_semi_space_size then, yes, the application will get killed. If --max_semi_space_size is set to 1, then usually it doesn't get killed.

If anyone's interested, I've written more about the experiment here. If you have any ideas on how I can improve the experiment and my understanding, I'd really appreciate it.
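For example, combining the two flags with illustrative values (server.js is a stand-in for the real entry point):

node --max_old_space_size=256 --max_semi_space_size=1 server.js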

@davibe commented Jul 12, 2016

I am testing --max-old-space-size=128 and I can confirm that when the application exceeds 128MB of actually used memory, the process gets killed.
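A crude way to reproduce that hard kill locally (the allocation loop is purely illustrative):

# Keeps allocating ~4-8MB arrays until the 128MB old-space cap is hit and the process aborts.
node --max_old_space_size=128 -e "var a = []; while (true) a.push(new Array(1e6).fill(0))"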

@damianmr commented Jul 12, 2016

Hello, the info in this issue was very useful. I made a module based on @joanniclaborde's suggestion.
It is here: https://github.com/damianmr/heroku-node-settings

The good thing is that the node process never reaches its maximum available memory, so it never crashes on Heroku.

We have been using it in a real project with thousands of requests per minute and have had no crashes whatsoever.

I encourage you to try it out and report any problems you find.

@joanniclaborde commented Jul 12, 2016

Nice! We've been using those same settings for a few months now, and it's running fine. Good idea to turn that code into a module, @damianmr!

@jvrsgsty jvrsgsty referenced this issue in cobyism/ghost-on-heroku Aug 30, 2016

Closed

Upgrade Node to 4.2 & memory management? #75

@neoziro neoziro referenced this issue in argos-ci/argos Mar 11, 2017

Merged

chore: fix V8 memory on heroku #173

@izakp izakp referenced this issue in artsy/force Mar 23, 2017

Merged

set node vm GC options #1091

@saschanaz saschanaz referenced this issue in kripken/emscripten Dec 23, 2017

Merged

Use emsdk master branch #5972

@Shahor Shahor referenced this issue in nodejs/help May 4, 2018

Closed

On `use_idle_notification` usage #1250
