Skip to content
This repository has been archived by the owner on Nov 4, 2022. It is now read-only.

gh is awful slow at startup: profile & debug #372

Closed
henvic opened this issue Mar 8, 2015 · 9 comments
Closed

gh is awful slow at startup: profile & debug #372

henvic opened this issue Mar 8, 2015 · 9 comments

Comments

@henvic
Copy link
Member

henvic commented Mar 8, 2015

I'm very unsatisfied by the time gh takes to start up. Compare it with something like git.

Run git and gh, then compare how slower gh is.

I believe we can make gh just as fast by using profiling tools to verify what are the bottlenecks and solve them.

@tomzx
Copy link
Contributor

tomzx commented Sep 18, 2015

I've done a quick profiling and made the results available @ https://gist.github.com/tomzx/ed21d61eba999245f369.

I simply ran node --prof --log-timer-events and used https://v8.googlecode.com/svn/branches/bleeding_edge/tools/profviz/profviz.html to analyze the results.

I don't have any expertise in using the v8 profiler, but hopefully what I will say is correct.

An empty js script takes about 200 ticks to execute while gh with no argument takes 1200 ticks (6x slower).

time and strace were used to track real/user/system times as well as to try to figure out where time spent waiting is located.

# time node empty.js

real    0m0.097s
user    0m0.081s
sys     0m0.015s

# time node bin/gh.js

real    0m1.299s
user    0m0.522s
sys     0m0.089s

# strace -c node empty.js
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 25.07    0.000473          22        22           munmap
 18.02    0.000340           5        63           mmap
 12.51    0.000236          24        10         1 futex
  8.80    0.000166           7        23           brk
  7.21    0.000136           5        28           mprotect
  6.20    0.000117          29         4           clone
  4.13    0.000078           8        10           open
  2.97    0.000056           2        34           rt_sigaction
  2.23    0.000042           5         9         9 access
  2.07    0.000039           3        13           fstat
  1.80    0.000034           4         9           read
  1.75    0.000033           3        10           close
  1.64    0.000031          31         1           readlink
  1.22    0.000023           1        17         8 ioctl
  0.90    0.000017           4         4           lstat
  0.69    0.000013           7         2           pipe2
  0.58    0.000011          11         1           stat
  0.26    0.000005           5         1           write
  0.26    0.000005           5         1           poll
  0.26    0.000005           5         1           epoll_create1
  0.21    0.000004           1         3           rt_sigprocmask
  0.21    0.000004           4         1           execve
  0.16    0.000003           2         2           getrlimit
  0.16    0.000003           3         1           eventfd2
  0.11    0.000002           2         1           getcwd
  0.11    0.000002           2         1           arch_prctl
  0.11    0.000002           2         1           setrlimit
  0.11    0.000002           2         1           set_tid_address
  0.11    0.000002           2         1           clock_getres
  0.11    0.000002           2         1           set_robust_list
  0.05    0.000001           1         1           getuid
------ ----------- ----------- --------- --------- ----------------
100.00    0.001887                   277        18 total

# strace -c node bin/gh.js
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 27.40    0.019537          13      1536       698 stat
 11.18    0.007974          18       449        16 open
  9.13    0.006510          13       487           lstat
  9.02    0.006432          14       466           close
  7.21    0.005143          16       328           read
  6.96    0.004961          35       143         1 futex
  4.21    0.003003          10       307           fstat
  4.08    0.002907          12       244           pread
  4.02    0.002869         261        11           clone
  3.33    0.002373          38        62           munmap
  2.75    0.001959          31        64           brk
  2.74    0.001953         109        18           madvise
  1.84    0.001309          12       105           mmap
  0.70    0.000500          14        36         9 ioctl
  0.70    0.000496          19        26         2 epoll_ctl
  0.62    0.000439          12        36           mprotect
  0.53    0.000381          27        14           epoll_wait
  0.53    0.000380          42         9           write
  0.53    0.000375           9        40           rt_sigaction
  0.37    0.000261          26        10           socketpair
  0.35    0.000253          17        15           rt_sigprocmask
  0.28    0.000201         101         2           wait4
  0.27    0.000192         192         1           rename
  0.22    0.000160          23         7           pipe2
  0.15    0.000107          36         3           epoll_create1
  0.15    0.000104          26         4           getdents
  0.10    0.000073          37         2           pwrite
  0.10    0.000072           8         9         9 access
  0.08    0.000057          29         2           openat
  0.07    0.000050          25         2           shutdown
  0.07    0.000050          17         3           eventfd2
  0.06    0.000046          23         2         2 mkdir
  0.06    0.000042          42         1           readlink
  0.05    0.000036          18         2           dup3
  0.04    0.000026           9         3           getcwd
  0.02    0.000013          13         1           clock_getres
  0.02    0.000012           6         2           getrlimit
  0.01    0.000010           5         2           uname
  0.01    0.000009           9         1           poll
  0.01    0.000006           6         1           getuid
  0.01    0.000006           6         1           setrlimit
  0.01    0.000005           5         1           arch_prctl
  0.01    0.000005           5         1           set_tid_address
  0.01    0.000005           5         1           set_robust_list
  0.00    0.000000           0         1           execve
------ ----------- ----------- --------- --------- ----------------
100.00    0.071302                  4461       737 total

Based on time, running gh.js is about 400ms of js execution time. On top of that, add about 800ms of wait.

Adding a process.exit() just after the list of requires in lib/cmd.js gives me the following for time

# time bin/gh.js

real    0m0.499s
user    0m0.446s
sys     0m0.072s

About half the application time is spent loading the dependencies.

I've also dumped a strace -T node bin/gh.js and looked at the time spent in each system call, this one was the longest I could find (close to the end of the execution):

epoll_wait(5, {{EPOLLIN|EPOLLHUP, {u32=10, u64=10}}}, 1024, -1) = 1 <0.715576>

If I'm not mistaken, it seems to be waiting on a couple of sockets before dying. As you can see, the wait time is around 700ms (while my previous wait time estimate is of 800ms).

Thus, if I were to dig in, I'd look for anything that does open up a socket and if there's any way to trim the requires in lib/cmd.js (possibly not have them all load up upfront but on a need-to basis).

@henvic
Copy link
Member Author

henvic commented Sep 18, 2015

wow! Thanks very much.

I am still in the beginning of my learning curve on profiling node apps. A few months ago I have delivered a talk (Writing CLI tools with Node) at my workplace's front-end meetup (Liferay) and such details would add a lot of information to it :)

By the way, I have created a small npm tool (require-time) that you can use to tell what dependencies are heavy weight. It's very unscientific, but I found it to be very useful for CLI tools. Watch this require-time screencast to see it.

I think the way to go is to load some heavy dependencies just when needed. 3a3c4c4, c088b3e, 5debb99 should be related to this. One of the things I gotta to do is remove the handlebars dependency because it's both heavy and kind of out-of-place (the best template engine for a CLI tool is, well, process.stdout/stderr :)).

I am glad @aredridel even used it on the npmjs website tests! henvic/require-time#1 (comment)

@tomzx
Copy link
Contributor

tomzx commented Sep 18, 2015

After playing a little bit more, it seems the culprit is the insight package. If I noop all of its logic and remove the require('insight') from tracker.js, the time drops from 1.244s to 0.247s of realtime.

@protoEvangelion
Copy link
Member

protoEvangelion commented Nov 18, 2017

@tomzx Thanks for all the "insight" here! (pun intended) @henvic Is it necessary for us to collect usage stats? Are we even utilizing those?

@henvic
Copy link
Member Author

henvic commented Nov 18, 2017

@protoEvangelion, it's nice because we can track what is really being used. If you can somehow make it works without adding such delay, then great. Go for it. If you are not able to do so, I am leaving out the removal decision up to you (I don't use gh anymore (I just never remember about its commands and do things on the web interface or manually instead lol), so I don't feel I have a right to say something on this point).�

Related: yeoman/insight#34

Example of stats we get:

gh pr --browser --user...**** --number ******** (180)
gh pr ******** --user ...*** --comment ******** (109)
gh is --list --assignee ******** --label ******** (104)
gh pr (72)
gh pr ******** (59)
gh pr -s ******** (50)
gh is --browser --user...**** --number ******** (48)
gh pr -s ******** -b ******** (44)
gh is --new -t ********...-L ******** -m ******** (37)
gh pr --list (36)

@protoEvangelion
Copy link
Member

Because the app is stable and we have already collected the stats that we needed in order to analyze trends and high demand commands (which by far was gh pr ...), I would imagine people would appreciate the increased snappiness.

After taking out the insight package, it really didn't save as much time as I would have hoped for.

BEFORE taking it out, when running time node bin/gh, you are looking at about 50-80 ms execution time.

AFTER taking it out, that drops to around 45 ms execution time.

Since the mind can perceive at least as fast as 13 ms increments, it seems beneficial to remove this package as it will increase the speed of all gh commands, even if it is a very small gain.

@protoEvangelion
Copy link
Member

Implemented here: 75d06f6

@protoEvangelion
Copy link
Member

@all-contributors please add tomzx for infra

@allcontributors
Copy link
Contributor

@protoEvangelion

I've put up a pull request to add @tomzx! 🎉

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants