Atom startup perf #9720

Closed
samccone opened this Issue Nov 20, 2015 · 7 comments

Question: Why do I see a blank screen for multiple seconds when booting a new Atom session?

TLDR:

Most of the startup time is spent requiring modules and blocked in Node.js execution before the initial paint in the browser is ever fired, resulting in a slow-feeling startup experience for end users. By bundling all of the deps into one file, there could be a sizeable perf savings.

[screenshot: 2015-11-20 10:04 am]


Over the past week a few people have asked me to take a look at why Atom's startup takes multiple seconds.

[screenshot: 2015-11-20 10:07 am]

[screenshot: 2015-11-20 10:12 am]

Since Atom is a hybrid Node + Electron (web) app, I thought it would make for an interesting perf analysis to see how Node.js perf impacts the dependent web-render side of things... and it ended up being exactly what I expected: quite interesting.

Getting a look at what is happening inside of Atom requires a few special tools. However, since Atom has built-in dev tools (Cmd-Option-I), it is easy to dump all of the profiling info right out of Atom and import the traces into the newest version of DevTools (inside Canary).

Tools

[screenshot: 2015-11-18 8:22 pm]
[screenshot: 2015-11-18 8:22 pm]

  • Press shift several times until you get the extra special experiments
    [screenshot: 2015-11-18 8:22 pm]

The investigation

Once our tools were set up, it was pretty simple to reproduce the test case: open up Atom (in my case I built Atom off of the tip of tree, bcbb8c0), open up the built-in inspector, and then hit Cmd-R to trigger a reload of Atom. Now we are cooking with our timeline and profile dumps -- on to looking at what the timeline is showing us.

[screenshot: 2015-11-20 9:00 am]

Looking at the initial chunk before load is called, it looks like we are being bogged down simply requiring all of the modules that Atom uses. Before we do any work at all, we spend over 70ms at initial boot just in require calls.

[screenshot: 2015-11-20 9:02 am]

Once the initial window.onload fires, we get kicked into another 167ms of module loads before we have even started setting up our new Atom env (which does all of the painting).

[screenshot: 2015-11-20 9:07 am]

And then at around ~500ms we see the initial non-blank paint happen. At this point we have a pretty good picture of where the time is being spent on startup in the most ideal of situations (where a user does not have a pile of plugins or custom themes to further slow it down).

[screenshot: 2015-11-20 9:13 am]

Looking into the work that happens inside of setup window prior to painting, we can break the 200+ ms down into specific segments of work that could perhaps happen after the initial paint, thus prioritizing drawing the UI to the user before loading additional modules and doing blocking work.

[screenshot: 2015-11-20 9:33 am]
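The "paint first, do the rest later" idea can be expressed generically (this is a sketch, not Atom's code; `setImmediate` stands in for a post-paint hook such as `requestAnimationFrame` in a renderer process):

```javascript
// Sketch: run only paint-critical setup synchronously, then schedule
// the remaining work so the first frame is not blocked behind it.
const phases = [];

function setupCritical() { phases.push('critical'); } // needed before paint
function setupDeferred() { phases.push('deferred'); } // can wait until after

setupCritical();             // blocks startup: keep this phase minimal
setImmediate(() => {         // everything else after the current turn
  setupDeferred();
  console.log(phases.join(' -> ')); // critical -> deferred
});
```

The trade-off is that anything in the deferred phase is briefly unavailable after the window appears, which is exactly the "progressive experience" question raised below.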

So with all of this information, it is clear that we are paying a significant cost due to the number of modules being required and the time spent inside of the require load dance before even trying to run any JS (one fun downfall of putting all of the requires at the top of all of your JS files). The profile deep dive raises some questions: how much of this require work can be pushed off to happen async after the initial paint? Would it be possible to precompile all of the JS used in Atom into a single blob to avoid the fixed require cost? How much of a progressive experience is acceptable for plugins? Can we load them in after the fact, or will this be too strange?

Overall this profile shows that there is nothing Atom is doing wrong; there are just some interesting performance costs that most developers will never run into or care about, given that the difference between a 200ms and a 500ms startup is not something to rework the world around. However, for a text editor where users are constantly opening new windows, it can become quite noticeable and annoying.

Welp, hopefully this is helpful!

Thanks again to @paulirish and all of the awesome folks at github and atom :)

@izuzak izuzak added the performance label Nov 20, 2015


izuzak (Member) commented Nov 20, 2015

Thanks for this, @samccone -- just going to cc a few folks from the Atom team for more visibility.

@nathansobo @as-cii @maxbrunsfeld @atom/core



abdonrd commented Nov 20, 2015

+1

And awesome issue explanation!



benogle (Contributor) commented Nov 20, 2015

Nice! What version did you do the analysis on?

Would it be possible to precompile all of the JS used in atom into a single blob to avoid the fixed require cost?

If I understand correctly this should be better in 1.3.0-beta due to a new NativeCompileCache.

Refs #9572



samccone commented Nov 20, 2015

@benogle I did the tests on bcbb8c0

Which I think contains the work done in #9318



nathansobo (Contributor) commented Nov 20, 2015

@as-cii had explained to me at some point that he was seeing much slower compile times with the native cache when dev tools are open. He's on a plane back to Europe, but I'm sure he'll have more to say in a couple of days. My understanding from what he tells me is that the real-world require cost is actually about 30ms but can't be measured when the profiler is enabled. If he can confirm this is true, then producing concatenated bundles does seem promising for reducing some of the fixed overhead associated with calls to require.



as-cii (Member) commented Nov 23, 2015

Thanks @samccone for the write-up and for your investigation.

So, as @nathansobo already mentioned, from an earlier investigation we found that the actual time we spend compiling modules when DevTools are not open is about 30ms. Please note that here we're referring to v8 compilation, i.e. wrapping files within a function (Module.wrap) and executing them through runInThisContext (e.g. ScriptCompiler::CompileUnbound, unbound_script->BindToCurrentContext(), bound_script->Run()).

That said, I believe we all agree that there's some other kind of cost when calling require. Measuring how large this cost is, though, seems a bit trickier to me, as it's not really evident if we just look at the profiler (as already pointed out, the JS profiler seems to have a weird impact on those metrics). As stated in #9572, we think that some of this time is spent accessing the file system for reading, statting and resolving files.

Lazily requiring modules could be an option, but I would employ such a technique only for expensive modules (as we have already done in the past, e.g. atom/spell-check#98): the problem I see with this optimization strategy is that it only works well when requiring and executing a module happen at two very distinct moments in time, and I believe this doesn't hold true for the initial pile of require statements we have before instantiating an AtomEnvironment.

We're already working towards avoiding loading packages that don't need to be activated right away in #9687, so I feel like that should let us save a good portion of those ~150ms spent in package land.

With that in mind, I guess this means that bundling our core source files into one might be a solution: once again, I feel like it's something we just have to experiment with to really understand how much we can save out of those initial ~250ms. If anything, I believe this would allow us to remove some of the noise that those require calls introduce in the profiler, and maybe get better insights on how to make that code path faster.

Moreover, at the summit I think we understood what it means to instrument the native code that gets executed during Layout and Recalculate Styles, so we will probably be able to understand what makes them slow during startup and hopefully fix those too (or, at least, have a more detailed conversation with the Blink team).



lock bot commented Mar 30, 2018

This issue has been automatically locked since there has not been any recent activity after it was closed. If you can still reproduce this issue in Safe Mode then please open a new issue and fill out the entire issue template to ensure that we have enough information to address your issue. Thanks!


@lock lock bot locked and limited conversation to collaborators Mar 30, 2018
