Research Performance Bottlenecks #465

Closed
2 tasks
Snugug opened this issue Sep 29, 2016 · 6 comments

@Snugug (Member) commented Sep 29, 2016

First step in #463

Feature: Performance Research
  As a Core Developer
  I want to understand the performance bottlenecks in Punchcard
  So that I can roadmap where I need to optimize Punchcard

  Scenario: New Content
    Given that I am entering a new piece of content
      And that content has references
     When I load the page
     Then I should have an understanding of the memory usage to load that page
      And I should have an understanding of the heap profile to load that page
      And I should have an understanding of the CPU usage to load that page
      And I should have an understanding of blocking/non-blocking processes to load that page
      And I should have an understanding of garbage collection needs to load that page
      And I should be able to identify what aspects need performance improvements

  Scenario: Content Revision
    Given that I am editing a content revision
      And that content has references
     When I load the page
     Then I should have an understanding of the memory usage to load that page
      And I should have an understanding of the heap profile to load that page
      And I should have an understanding of the CPU usage to load that page
      And I should have an understanding of blocking/non-blocking processes to load that page
      And I should have an understanding of garbage collection needs to load that page
      And I should be able to identify what aspects need performance improvements

Performance Research

  • New Content
  • Content Revision
@Snugug Snugug self-assigned this Sep 29, 2016
@Snugug Snugug added this to the Sprint 22: 10-12 milestone Sep 29, 2016
@Snugug (Member, Author) commented Oct 4, 2016

It doesn't appear that I can get measurements as precise as I'd like for this research, but I do have some numbers from a production instance for a few endpoints, which I think are good places to start:


/content/:type/add

  • Total - 1632ms
  • `SELECT FROM "sessions"` - 68.9ms
  • `SELECT FROM "users"` - 70.3ms
  • `SELECT FROM "content-type--foo"` - ~40ms

The particular content type I'm looking at here has a bunch of references and, as can be seen from the chart, those reference queries seem to fire sequentially (I think this can be improved). There are also two queries to the same referenced content type, which we should deduplicate as well.

Overall, DB time took 17% of total request time, or 277.44ms, so there is quite a lot going on outside the database that we can optimize as well, but I haven't dug into that yet.

Response Breakdown


/content/:type/:id/:revision/edit

  • Total - 1322ms
  • `SELECT FROM "sessions"` - 71.6ms
  • `SELECT FROM "users"` - 72.2ms
  • `SELECT FROM "content-type--foo"` - ~43ms
  • `WITH` - 74.0ms

While the individual queries appear to be slightly slower here, they seem to be more parallelized, which contributes greatly to the faster overall response. I'd like to parallelize these queries everywhere.
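
For reference, a minimal sketch of what that parallelization could look like, assuming a knex-style query builder; the function and table parameters here are illustrative, not Punchcard's actual code:

```js
'use strict';

// Hypothetical sketch: fire the independent lookups together and wait on all
// of them, instead of running each query back to back. `database` is assumed
// to be a knex-style query builder.
const loadAddPage = (database, sessionId, userId, type) => {
  return Promise.all([
    database('sessions').where('id', sessionId).first(),
    database('users').where('id', userId).first(),
    database(`content-type--${type}`).select('*'),
  ]).then(results => {
    // Total DB wait becomes roughly the slowest query (~70ms above) rather
    // than the sum of all three.
    return {
      session: results[0],
      user: results[1],
      content: results[2],
    };
  });
};
```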

Overall, DB time took 25% of total request time, or 330.5ms, so there is quite a lot going on outside the database that we can optimize as well, but I haven't dug into that yet.

Response Breakdown


Overall, we can optimize how we make DB calls to get a little extra perf out of them, but in both cases we've got almost a full second (or more) of churn in our actual application logic. I'm going to investigate ways to improve this. Times are mostly consistent over multiple runs, with slight variation that doesn't come from the DB side, which leads me to believe there is application churn somewhere that we can greatly reduce.

@Snugug (Member, Author) commented Oct 4, 2016

Did some more digging and got a CPU profile, with a flame chart, for a Punchcard implementation (same as above). It's pretty telling and gives me a good place to start digging in.

The flame chart was created using N|Solid, which also gave memory, heap, and async-process data. Most of that turns out to be nowhere near as informative as the flame chart, which really points to the places to dig into. I've attached the annotated flame chart as well as the actual CPU profile.

I'm going to synthesize this a little more tomorrow to come up with an action plan.

flame graph

CPU Profile ZIP

@Snugug (Member, Author) commented Oct 5, 2016

Okay, I believe I've tracked down what is causing the majority of the slowdown here:

  • DB Query
    • `punchcard/lib/utils.js` - `fill`
  • HTML Generation - Content Types
  • Scripts Generation - Content Types
    • `plugins.forEach.type.attributes(annon).inputs.forEach.resolvedPlugins.map.children.forEach.inp.options.forEach`
    • `exports.singleItem.exports.splitPop.exports.getPlugins.dir.forEach`
      • `spawnSync`, `execSync`
    • `content-types/lib/form/scripts.js` - `rendered`

For DB queries, I believe we can improve performance by doing a shallow search for the types referenced, mapping over that to get all of the identifiers and values, and then re-merging. For types that are referenced multiple times, this will reduce DB queries, and it may also let us shrink the functions involved enough that the compiler can inline them. Storing identifiers directly in the database would, I think, also improve performance for large numbers of referenced items, since we could grab them directly instead of calculating them.
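
A rough, hypothetical sketch of that reshuffle, again assuming a knex-style `database` handle; the shapes and names are illustrative and not the current `fill` implementation:

```js
'use strict';

// Hypothetical sketch of the proposed fill rework: collect every referenced
// type once, run one query per type, then merge the results back.
const fillReferences = (database, references) => {
  // references: [{ type: 'content-type--foo', id: 'abc' }, ...]
  // 1. Shallow pass: group the requested ids by referenced type.
  const byType = new Map();
  references.forEach(ref => {
    if (!byType.has(ref.type)) {
      byType.set(ref.type, new Set());
    }
    byType.get(ref.type).add(ref.id);
  });

  // 2. One query per referenced type, issued in parallel, instead of one
  //    query per individual reference.
  const lookups = Array.from(byType.entries()).map(entry => {
    const type = entry[0];
    const ids = Array.from(entry[1]);

    return database(type).whereIn('id', ids).then(rows => [type, rows]);
  });

  // 3. Re-merge: attach the resolved rows back onto the original references.
  return Promise.all(lookups).then(results => {
    const resolved = new Map(results);

    return references.map(ref => {
      const rows = resolved.get(ref.type) || [];
      const match = rows.find(row => row.id === ref.id);

      return Object.assign({}, ref, { value: match });
    });
  });
};
```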

For HTML generation, there's a lot of work being done in loops, and in practice there are a fair number of attributes that need to be processed. I think we can get some performance gains by parallelizing those sync tasks; I'm looking at parallel.js (old, but it just got new maintainers last month, so active development is back on). The flame-chart flares for HTML generation are also around Nunjucks rendering. A second place for improvement would be to do the rendering all at once instead of rendering as we go, since Nunjucks appears to have quite a few steps involved in each render. This may be harder to do given the context needed for each render.
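
A smaller, related tweak (not the full render-all-at-once change, and not the actual content-types code) would be to cache compiled Nunjucks templates so the parse/compile steps only run once per template string; the names below are hypothetical:

```js
'use strict';

// Illustrative sketch only: memoize compiled Nunjucks templates so repeated
// attribute renders skip parsing and compilation.
const nunjucks = require('nunjucks');

const env = new nunjucks.Environment(null, { autoescape: false });
const compiled = new Map();

const renderAttribute = (templateString, context) => {
  if (!compiled.has(templateString)) {
    compiled.set(templateString, nunjucks.compile(templateString, env));
  }

  // Only the final render step runs per attribute; parsing is already done.
  return compiled.get(templateString).render(context);
};
```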

Finally, for script generation, I'm not sure what can be done in the Content Types module itself to improve performance, beyond maybe parallelizing the sync tasks again as in the HTML generation step. I think a much easier and more effective improvement would be to cache the rendered scripts in Punchcard itself. This may prove challenging given the IDs and repeatables, but a good chunk can likely be cached between renders. The same may be true for HTML.
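
A hypothetical shape for that cache (not an existing Punchcard API), keyed on the content type's definition so a changed type naturally misses the cache; `generate` stands in for whatever currently builds the scripts:

```js
'use strict';

// Hypothetical sketch: memoize rendered form scripts per content type so
// repeat visits to the add/edit pages skip the script-generation step.
// Cache invalidation around IDs and repeatables is the hard part, so this
// only covers the stable portion of the output.
const crypto = require('crypto');

const scriptCache = new Map();

const cachedScripts = (type, generate) => {
  // Key on the content type's name plus a hash of its attribute config.
  const key = `${type.id}:${crypto
    .createHash('sha1')
    .update(JSON.stringify(type.attributes))
    .digest('hex')}`;

  if (!scriptCache.has(key)) {
    scriptCache.set(key, generate(type));
  }

  return scriptCache.get(key);
};
```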

@Snugug (Member, Author) commented Oct 5, 2016

@scottnath @joshblack Here's my breakdown of places we can improve performance; any thoughts? If these sound like a good way forward, the next step would be for me to write stories to add to the Performance Optimization Epic.

@Snugug (Member, Author) commented Oct 7, 2016

@scottnath @joshblack Lots of stories written to get started on this. OK to close?

@scottnath (Contributor) commented:
Well done deep dive. Closing.
