CPU spike for 2500 watched files in v0.8.8 #71

Closed
nemtsov opened this Issue Aug 24, 2012 · 44 comments


nemtsov commented Aug 24, 2012

When watching a large number of files (a bit more than 2500 in my specific case) with supervisor -e 'js|mu' app, the CPU spikes to 60% on a 2.2 GHz MBP. I repeated the same test with forever --spinSleepTime 10 --watch --watchDirectory . start app.js and the CPU remained at roughly 0%.

Collaborator

iangreenleaf commented Aug 24, 2012

What OS are you using?

nemtsov commented Aug 24, 2012

OS X 10.8.1 (on a MacBook Pro Early 2011, 2.2 GHz Intel Core i7, 8 GB 1333 MHz DDR3, 750 GB SATA)

Collaborator

iangreenleaf commented Aug 24, 2012

Thanks. It's probably the file-watching code that's to blame, since it is recursing through and watching every individual file. I'm still wanting to revisit that code - I've just been dragging my feet to see what shakes out of watchFile and watch in core.
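
To illustrate why per-file watching can get expensive, here is a minimal sketch of the pattern described above: walk the tree and register one fs.watchFile poller per file. It is not supervisor's actual code; the helper names (listFiles, watchEveryFile) and the 100 ms interval are assumptions made for the example.

```javascript
const fs = require('fs');
const path = require('path');

// Recursively collect every regular file under `root`.
function listFiles(root) {
  const files = [];
  for (const entry of fs.readdirSync(root, { withFileTypes: true })) {
    const full = path.join(root, entry.name);
    if (entry.isDirectory()) files.push(...listFiles(full));
    else if (entry.isFile()) files.push(full);
  }
  return files;
}

// Register one stat-polling watcher per file. Every watcher re-stats its
// file on each interval, so the work grows linearly with the file count.
// `interval` (in ms) is a hypothetical default, not supervisor's setting.
function watchEveryFile(root, onChange, interval = 100) {
  const files = listFiles(root);
  for (const file of files) {
    fs.watchFile(file, { interval }, (curr, prev) => {
      if (curr.mtimeMs !== prev.mtimeMs) onChange(file);
    });
  }
  // Return the number of pollers and a function that removes them again.
  return {
    count: files.length,
    stop: () => files.forEach((file) => fs.unwatchFile(file)),
  };
}
```

With a few thousand files, `count` is also the number of independent pollers running in the process, which is the likely source of the load reported here.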

DTrejo commented Sep 16, 2012

The same thing happened to me with run.js. I was suspicious but let it go, and instead fixed my ignoring of files listed in .gitignore, most notably the node_modules folder, which generally has tons of files in it.

Would be healthy to figure out what changed in core that makes it less efficient than before, I think.

Specifying a limited set of files/dirs to monitor via -w helps; it got CPU usage down to ~20% from ~60%.

@iangreenleaf, regarding "I've just been dragging my feet to see what shakes out of watchFile": if you point me in the right direction, I can have a look.

Collaborator

iangreenleaf commented Oct 8, 2012

I look into the watch vs. watchFile stuff periodically, and always end up hitting some sticking point or area of uncertainty. The requirements for The One Solution To Rule Them All are:

  • Cross-platform compatible
  • Uses inotify and other OS hooks when available (falls back to polling otherwise)
  • File change notification can provide the name of the file
  • Watches directories recursively (or else doesn't have performance problems)
  • Is not deprecated in Node core

I've also been wanting to check in with the core team to see what their future plans are for the fs module - if they're planning to keep both of those in the API, or if one is on the way out. The docs at least look better now than the last time I checked.

Collaborator

isaacs commented Oct 8, 2012

If you want inotify etc, then use fs.watch. If you want filename on notifications, then you're not going to get that (on all platforms) by calling fs.watch() on a directory. File-watching is one part of node where we've just had to accept that true platform independence is not really achievable.
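
As a rough sketch of what that means for calling code: the fs.watch listener receives (eventType, filename), and the filename argument may be missing on some platforms, so a caller has to handle both cases. The helper names below (describeEvent, watchDirectory) are made up for the example.

```javascript
const fs = require('fs');

// Normalize an fs.watch callback so callers always get an explicit
// `known` flag telling them whether the platform reported a filename.
function describeEvent(eventType, filename) {
  const known = filename != null;
  return { eventType, filename: known ? String(filename) : null, known };
}

// Watch a directory with the OS-level mechanism behind fs.watch and
// pass normalized events to `onEvent`. Returns the fs.FSWatcher so the
// caller can close() it.
function watchDirectory(dir, onEvent) {
  return fs.watch(dir, (eventType, filename) => {
    onEvent(describeEvent(eventType, filename));
  });
}
```

Any feature that needs the name (such as filtering by extension) has to decide what to do when `known` is false.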

Collaborator

isaacs commented Oct 8, 2012

Re: future plans, fs is very stable. You can rely on it.

Collaborator

iangreenleaf commented Oct 8, 2012

If you want inotify etc, then use fs.watch. If you want filename on notifications, then you're not going to get that (on all platforms) by calling fs.watch() on a directory. File-watching is one part of node where we've just had to accept that true platform independence is not really achievable.

I suspected that might be the answer. So by my reckoning, moving to fs.watch would solve the performance issues but would leave Mac users without the -e option (AFAIK that's the only feature that truly depends on the filename).

I'm also wondering if filename support could be patched on inelegantly in JS land by manually checking mtimes or something - that kind of solution would be good enough for the scale we're using it. Probably worth looking to see if someone has implemented this somewhere.
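
One way the mtime idea could look, as a sketch only: keep a snapshot of mtimes for a directory, and when fs.watch fires without a filename, diff a fresh snapshot against the old one. The function names (snapshot, changedNames, watchWithNames) are hypothetical, and this handles a single directory level rather than a recursive tree.

```javascript
const fs = require('fs');
const path = require('path');

// Record the current mtime of every entry directly inside `dir`.
function snapshot(dir) {
  const mtimes = new Map();
  for (const name of fs.readdirSync(dir)) {
    try {
      mtimes.set(name, fs.statSync(path.join(dir, name)).mtimeMs);
    } catch (err) {
      // The entry vanished between readdir and stat; skip it.
    }
  }
  return mtimes;
}

// Compare two snapshots: first the names that were added or modified,
// then the names that were removed.
function changedNames(before, after) {
  const changed = [];
  for (const [name, mtime] of after) {
    if (before.get(name) !== mtime) changed.push(name);
  }
  for (const name of before.keys()) {
    if (!after.has(name)) changed.push(name);
  }
  return changed;
}

// Watch a directory; when the event carries no filename, recover the
// changed names by diffing mtimes against the previous snapshot.
function watchWithNames(dir, onChange) {
  let last = snapshot(dir);
  return fs.watch(dir, (eventType, filename) => {
    const current = snapshot(dir);
    const names = filename ? [String(filename)] : changedNames(last, current);
    last = current;
    names.forEach((name) => onChange(name));
  });
}
```

The stat calls only happen when an event arrives, so the cost is tied to change frequency rather than to a polling interval.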

up does something similar to what's suggested in their watch implementation. They pre-process the files to watch, and then use fs.watch on each file. That implementation doesn't seem to cause any performance issues on OSX. If you're pre-processing the list of files to watch, it seems like you'd still be able to filter out by --extension or --ignore folders as it currently does.
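
A sketch of that pre-processing approach, under the assumption that filtering happens before any watcher is created: walk the tree once, skip ignored names, keep only matching extensions, then call fs.watch on each remaining file. The option names (extensions, ignore) and helper names are invented for the example and do not mirror supervisor's or up's internals.

```javascript
const fs = require('fs');
const path = require('path');

// Walk `root`, skipping ignored entry names and keeping only files whose
// extension is in `extensions`. Both options are hypothetical names.
function collectWatchTargets(root, { extensions = ['js'], ignore = ['node_modules'] } = {}) {
  const targets = [];
  for (const entry of fs.readdirSync(root, { withFileTypes: true })) {
    if (ignore.includes(entry.name)) continue;
    const full = path.join(root, entry.name);
    if (entry.isDirectory()) {
      targets.push(...collectWatchTargets(full, { extensions, ignore }));
    } else if (extensions.includes(path.extname(entry.name).slice(1))) {
      targets.push(full);
    }
  }
  return targets;
}

// Attach one fs.watch watcher per pre-filtered file. Because each watcher
// is bound to a known path, the callback does not depend on the filename
// argument that fs.watch may or may not provide.
function watchTargets(targets, onChange) {
  const watchers = targets.map((file) => fs.watch(file, () => onChange(file)));
  // Return a function that closes every watcher.
  return () => watchers.forEach((watcher) => watcher.close());
}
```

Usage might look like `watchTargets(collectWatchTargets('.', { extensions: ['js', 'mu'] }), restart)`, where `restart` is whatever the caller wants to run on a change.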

Collaborator

iangreenleaf commented Oct 9, 2012

Interesting. I ran into limitations before with watch having a bound on the number of files it could handle (see #34). I'm testing again and not seeing that limitation any more, so perhaps that option is worth revisiting.

On my fork I switched it to just use watch() and it's been running smoothly without any cpu overhead. Granted, I'm only running on OSX, so I'm oblivious to what any of the issues on Windows would be.

I tried @bmharris's fork and got some improvement on OS X 10.7.5. For some reason the CPU usage was showing some really weird values (130%!):

PID   COMMAND      %CPU  TIME     #TH  #WQ  #POR #MRE RPRVT  RSHRD  RSIZE  VPRVT  VSIZE  PGRP PPID STATE    UID
3245  node         130.9 00:13.76 6/3  0    42   143  14M+   12M    23M+   95M    3014M  3245 311  running  501

With @bmharris's fork, this went down to around 30%:

PID   COMMAND      %CPU  TIME     #TH  #WQ  #POR #MRE RPRVT  RSHRD  RSIZE  VPRVT  VSIZE  PGRP PPID STATE    UID
3369  node         33.6 00:22.58 6     0    42   145  20M    12M    29M    92M    3007M  3369 311  sleeping 501  11372   385    175      86       5832606+ 333      111483

This is still a lot, though, for ~350 files

$ find . -name '*.js' | wc
     356     356   21662
Collaborator

iangreenleaf commented Oct 23, 2012

Thanks for the benchmark. I agree, that's better but still not ideal. I'm planning to start putting together a new major release that's 0.8 only, so I'll use that opportunity to work on this.

@ef4 ef4 referenced this issue Oct 29, 2012

Closed

CPU usage #76

chakrit commented Jan 14, 2013

Any further update / workaround for this issue?

I tried @bmharris's fork and the spikes were a bit lower, but still significant.

chakrit commented Jan 14, 2013

For anyone else having this issue, I have had more success with Python's watchdog, which has a very similar feature set to node-supervisor.

$ easy_install pip
$ pip install watchdog

It'll install a watchmedo command for you, check watchmedo auto-restart --help for docs.

I have it aliased as supervise so it is just one command away:

$ alias supervise='watchmedo auto-restart -d . -p "*.node;*.js;*.coffee"'
$ supervise node server.js

... which works perfectly without any noticeable CPU spike at all.

It also works great with build tools like cake, make, or grunt directly. Example of running grunt lint under supervise:

$ supervise grunt lint # re-lint on every file change.

Lu79 commented Feb 20, 2013

@chakrit watchdog is a stable tool. No more CPU overload. Many thanks.

watchmedo needed just one more parameter to scan my subdirectories, --recursive, so my alias looks like this:

alias supervise='watchmedo auto-restart -d . --recursive -p "*.js;*.node;*.coffee"'

Awesome tip @chakrit really made a difference for me!
/cc @Lu79

Wow am I glad I found this thread. Great tips from both @thanpolas and @chakrit. I experienced this problem today on Win7 x32 with Node 0.8. For any Windows users battling this same problem, install Python, easy_install (setuptools), and pip as suggested above. Next create a blank file at %Windir%\System32\supervise.bat. Put the following command in that batch file:

watchmedo auto-restart -d . --recursive -p "*.js;*.node;*.coffee" %1 %2

EDIT: ^^ Incorrect code here. See my comment below.

Then to actually supervise your app you would run the same command that @chakrit listed above:

supervise node server.js

After doing some further testing, I found that the actual code needed was something like this:

watchmedo auto-restart -d . --recursive --signal 1 --kill-after 0.1 -p "*.js;*.node;*.coffee" %1 %2

Note that this kills your process uncleanly, so if you need your node processes to shut down cleanly, either remove the --kill-after arg or bump it up to a higher threshold.

I was having trouble with high CPU usage on Windows with even a small app (< 10 files). I forced supervisor to always use fs.watch, and CPU usage dropped to nothing, with the added bonus of file updates being detected instantaneously. I haven't tested this with very large projects, so YMMV, but I think it would be great to use watch instead of watchFile (or at least start with watch and fall back to watchFile, as mentioned in the docs at http://nodejs.org/api/all.html#all_availability).

With a cross-os fs.watch introduced in node 0.10.0, supervisor should really switch to this or at least wrap it in a version check.
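
The "start with watch, fall back to watchFile" idea could be sketched roughly as follows. This is not supervisor's implementation: watchWithFallback, the 1000 ms interval, and the injectable `watch` option (there only so the fallback path can be exercised) are all assumptions for the example. It also only covers errors thrown synchronously by fs.watch; errors emitted later on the watcher would need an 'error' handler.

```javascript
const fs = require('fs');

// Try the event-based fs.watch first; if it throws, fall back to
// stat polling with fs.watchFile. `interval` and `watch` are
// hypothetical options introduced for this sketch.
function watchWithFallback(file, onChange, { interval = 1000, watch = fs.watch } = {}) {
  try {
    const watcher = watch(file, () => onChange(file));
    return { mode: 'watch', stop: () => watcher.close() };
  } catch (err) {
    const listener = (curr, prev) => {
      if (curr.mtimeMs !== prev.mtimeMs) onChange(file);
    };
    fs.watchFile(file, { interval }, listener);
    return { mode: 'watchFile', stop: () => fs.unwatchFile(file, listener) };
  }
}
```

The returned `mode` makes it easy to log which mechanism ended up in use on a given platform.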

wmayner commented Apr 8, 2013

+1, having this issue as well.

+1, having this issue on a very small app as well

nrako commented Apr 11, 2013

https://github.com/remy/nodemon as a replacement works like a charm for me. And it also supports --debug.

zumoshi commented Apr 17, 2013

+1, having this issue

Collaborator

iangreenleaf commented Apr 17, 2013

Folks having this issue: are you seeing it with small numbers of files, or a large number of files like the OP? Obviously I would like to get this fixed at some point, but if you're having trouble with a large number of files, the best workaround for the moment will be to narrow down the watched files to those you're actually working on. You can use some combination of the --watch and --ignore flags to do this.

Would it be worth ignoring node_modules by default and requiring an opt-in or a specific override flag? My suspicion is that this is the problem for most people.


Collaborator

iangreenleaf commented Apr 17, 2013

@jcummins Maybe. It would be fine in the majority of cases and would probably take care of most complaints, though I'm loath to jump to conclusions about what people do or do not want to watch. I would certainly prefer to just get it performant on OSX (I've always watched my entire node_modules on Linux and never had problems), but I haven't had time to work on upgrading this package.

@nrako, great tip for https://github.com/remy/nodemon. The CPU spikes dropped.

ktmud added a commit to ktmud/node-supervisor that referenced this issue Apr 24, 2013

Contributor

ktmud commented Apr 24, 2013

Does passing a longer poll interval help?

supervisor -p 2000 app.js

ktmud added a commit to ktmud/node-supervisor that referenced this issue Apr 24, 2013

ktmud added a commit to ktmud/node-supervisor that referenced this issue Apr 24, 2013

chakrit commented Apr 24, 2013

Does that mean it'll take longer for supervisor to pick up changes?

1 second poll interval seems reasonable if it really fixes the problem. It can still be overridden if needed.
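
The trade-off being discussed can be put into rough numbers. The sketch below assumes one stat call per watched file per interval, which is how stat polling behaves in general; the 100 ms figure is purely illustrative and is not claimed to be supervisor's default.

```javascript
// Estimate how many stat calls per second a polling watcher issues,
// assuming one stat per file per interval.
function statCallsPerSecond(fileCount, intervalMs) {
  return fileCount * (1000 / intervalMs);
}

// With polling, a change can go unnoticed for up to one full interval.
function worstCaseLatencyMs(intervalMs) {
  return intervalMs;
}

// 2500 files, as in the original report:
console.log(statCallsPerSecond(2500, 100));  // 25000 (hypothetical 100 ms interval)
console.log(statCallsPerSecond(2500, 2000)); // 1250  (the -p 2000 suggested above)
console.log(worstCaseLatencyMs(2000));       // 2000
```

So a longer interval does mean changes are picked up later, by at most the interval itself, in exchange for proportionally fewer stat calls.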

Looking at node.js for a "big deal" project, I was scared for a moment when I saw crazy runaway CPU usage. Luckily for me, it turned out just to be supervisor being overzealous on two fronts: it was watching everything under the sun, and it was apparently doing so by polling too frequently.

For me, this all goes away by:

  • collecting the relevant files to watch under one subdirectory, /modules, bar the entry point
  • changing the polling interval to be a few seconds

So my supervisor invoke line looks like:

supervisor -n -p 2500 -w server.coffee,modules ./server.coffee

...and life is good again ^_^. I'm not sure what this says about the defaults, but I'd say the common use case suggests the default should at least be to ignore node_modules, and perhaps to poll only every few seconds or so if polling is required. Both of these could be mentioned early enough in the docs so as to not gotcha anyone.

At the very least, the potential big CPU usage spike should probably be mentioned somewhere early on in the readme?

In my use case, specifying a limited set of files to watch brought the CPU usage down.

Collaborator

iangreenleaf commented Jun 15, 2013

This problem should be alleviated in supervisor 0.5.3. If anyone is still seeing very bad performance, please let me know.

We are still having this problem.

VirtualBox
Host: Windows 7
Guest: Ubuntu 12.04

Shared drive, using the vboxsf filesystem

Using supervisor, CPU is at 100% and it takes almost a minute for it to start.
Starting without it: less than 4 seconds

supervisor public/app.js

Collaborator

iangreenleaf commented Aug 14, 2013

v0.5.4 offers a temporary --force-watch flag that changes the file watching mechanism. Those of you having trouble with this issue, please give that a try and report back on how it works.

chakrit commented Oct 11, 2013

@iangreenleaf I've been using supervisor (npm latest) again these past few months and things seem a lot better now. I no longer run into the issue. I'm on OS X, btw.

Instead I'm seeing some weird restart behavior (it seems app termination is not clean, i.e. ports are left open), but that'll be a separate issue once I can reproduce it reliably.

natevw commented Nov 27, 2013

Even with the "--force-watch" flag I'm consistently seeing around 110% CPU usage with node-supervisor. (sup.run(["--force-watch", "-e", "js|coffee|jade|styl|css", "app"]);)

Looks like, including node_modules, there are over 3K .js files:

$ find . -name '*.js' | wc -l
    3224

Is "--force-watch" being ignored due to the "-e" option? But it always has a default, so how do I test without it?

natevw commented Nov 27, 2013

Ah, I see in the README that "--force-watch" may be a Windows-specific fix? I am on OS X 10.9 (Mavericks).

I've found that sup.run(["-i", "node_modules", "-e", "js|coffee|jade|styl|css", "app"]); (i.e. ignoring the node_modules folder with its proliferation of files) does indeed fix the CPU usage issue as a workaround.

paul-sh added a commit to hatalski/truecron that referenced this issue Aug 7, 2014

@paul-sh paul-sh referenced this issue in hatalski/truecron Aug 7, 2014

Merged

Added --force-watch parameter to supervisor. #31

@nemtsov nemtsov closed this Oct 8, 2014

I was having the same problem with significant CPU usage when using supervisor. It was a minor problem in dev and a major one when running multiple Docker instances (of the same site), since each new Docker container consumed a bit of the CPU, which added up quite quickly to a substantial part of the CPU being used just to 'watch'.

I'm trying with nodemon at the moment

truh commented Nov 25, 2015

Supervisor uses up 130% CPU according to htop.

The same application run directly by node drops to 0% CPU usage after 2 seconds of startup time.
