Research Performance Bottlenecks #465
It doesn't appear that I'm able to get measurements as precise as I'd like for this research, but I do have some numbers from a production instance for some endpoints that I think are good places to start:
The particular content type I'm looking at here has a bunch of references and, as can be seen from the chart, they seem to fire sequentially (I think this can be improved). There are also two queries to the same referenced content type, so we should reduce that as well. Overall, DB time took 17% of total request time, or 277.44ms, so there is quite a lot going on that we can optimize, but I haven't dug into that yet.
While the actual queries appear to be slightly slower here, they seem to be more parallelized, which contributes greatly to the increased speed. I'd like to parallelize these queries all the time. Overall, DB time took 25% of total request time, or 330.5ms, so there is quite a lot going on that we can optimize, but I haven't dug into that yet.

Overall, we can optimize how we make DB calls to get a little extra perf out of them, but in both cases we've got almost a full second of churn (or more) in our actual application logic. I'm going to investigate ways to improve this. Times are mostly consistent over multiple runs, with slight variation, although not from the DB side, which leads me to believe there is app churn somewhere we can greatly reduce.
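For reference, a minimal sketch of the parallelization I have in mind. The names here (`getReference`, `ids`) are placeholders, not Punchcard APIs; the point is just the shape of the change from sequential `await`s to `Promise.all`:

```javascript
// Sequential: each query waits for the previous one to finish,
// so total time is the sum of all query times.
async function fetchReferencesSequential(ids, getReference) {
  const results = [];
  for (const id of ids) {
    results.push(await getReference(id)); // placeholder query function
  }
  return results;
}

// Parallel: all queries are issued at once and awaited together,
// so total time is roughly the slowest single query.
async function fetchReferencesParallel(ids, getReference) {
  return Promise.all(ids.map(id => getReference(id)));
}
```

Result order is preserved by `Promise.all`, so callers that index into the results don't need to change.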
Did some more digging and got a CPU profile for a Punchcard implementation (same as above) with a flame chart. It's pretty telling and gives me a good place to start digging in. The flame chart was created using N|Solid, which also gave memory/heap/async processes and the like. Most of that turns out not to be anywhere near as informative as the flame chart, which really shows where to dig in. I've attached the annotated flame chart, as well as the actual CPU profile. I'm going to synthesize this a little more tomorrow to come up with an action plan.
Okay, I believe I've tracked down what is causing the majority of the slowdown here:
For DB Queries, I believe we can improve performance by doing a shallow search for referenced types, then mapping over that to get all of the identifiers and values, then re-merging. For types that are referenced multiple times, this will reduce DB queries, and potentially allow us to reduce the size of the functions so they can hopefully get inlined by the compiler. Adding identifiers directly to the database would, I think, also improve performance here for large numbers of referenced items, since we could grab them directly instead of calculating them.

For HTML Generation, there's a lot of work being done in loops, and in practice there are a fair number of attributes that need to be processed. I think we can get some performance increases by parallelizing those sync tasks. I'm looking at parallel.js (old, but it just got new maintainers last month, so active development is back on). The flame-chart flares in HTML generation are also around Nunjucks rendering. I think a second place for improvement would be to do the rendering all at once instead of rendering as we go, as Nunjucks appears to have quite a few steps involved in rendering. This may be harder to do given the context needed for each render.

Finally, for Script Generation, I'm not sure what can be done in the actual Content Types module to improve performance, other than maybe parallelizing the sync tasks again, as in the HTML Generation step. I think a much easier and more effective improvement would be to cache the rendered scripts in Punchcard itself. This may prove challenging given the IDs and repeatables, but a good chunk can likely be cached between renders. The same may be true for HTML.
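A rough sketch of the shallow-search-then-merge idea for DB queries. All names here (`entries`, `referencedType`, `fetchTypeValues`) are hypothetical placeholders, not Punchcard internals; the sketch just shows how duplicate queries to the same referenced type collapse into one:

```javascript
// 1. Shallow pass collects the unique referenced types.
// 2. One query per unique type runs (in parallel).
// 3. Results are re-merged onto each referencing entry.
async function resolveReferences(entries, fetchTypeValues) {
  // Step 1: shallow search for the set of referenced types.
  const types = [...new Set(entries.map(entry => entry.referencedType))];

  // Step 2: one parallel query per unique type instead of one per entry,
  // so two entries referencing the same type share a single query.
  const fetched = await Promise.all(
    types.map(async type => [type, await fetchTypeValues(type)])
  );
  const byType = new Map(fetched);

  // Step 3: re-merge the fetched values back onto the entries.
  return entries.map(entry => ({
    ...entry,
    values: byType.get(entry.referencedType),
  }));
}
```

With the two-queries-to-the-same-type case from the production numbers above, this would cut those to one, and the per-type queries still run in parallel.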
@scottnath @joshblack My breakdown of places we can improve performance; any thoughts? If these sound like a good step forward, the next step would be for me to write stories to add to the Performance Optimization Epic.
@scottnath @joshblack Lots of stories written to get started on this. OK to close?
Well-done deep dive. Closing.
First step in #463
Performance Research