Research Performance Bottlenecks #465
It doesn't appear that I'm able to get measurements as precise as I'd like for this research, but I do have some numbers from a production instance for some endpoints that I think are good places to start:
The particular content type I'm looking at here has a bunch of references and, as can be seen from the chart, they seem to fire sequentially (I think this can be improved). There are also two queries to the same referenced content type, so we should reduce that as well. Overall, DB time took 17% of total request time, or 277.44ms, so there is quite a lot going on that we can optimize, but I haven't dug into that yet.
While the actual queries appear to be slightly slower here, they seem to be more parallelized, which contributes greatly to the increased speed. I'd like to parallelize these queries all the time. Overall, DB time took 25% of total request time, or 330.5ms, so there is quite a lot going on that we can optimize, but I haven't dug into that yet.

Overall, we can optimize how we make DB calls to get a little extra perf out of them, but in both cases we've got almost a full second of churn (or more) in our actual application logic. I'm going to investigate ways to improve this. Times are mostly consistent over multiple runs, with slight variation, although not from the DB side, which leads me to believe there is app churn somewhere we can greatly reduce.
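For reference, a minimal sketch of the parallelization I have in mind. The names here (`getReference`, `ids`) are placeholders, not Punchcard APIs; the point is just the shape of the change from sequential `await`s to `Promise.all`:

```javascript
// Sequential: each query waits for the previous one to finish,
// so total time is the sum of all query times.
async function fetchReferencesSequential(ids, getReference) {
  const results = [];
  for (const id of ids) {
    results.push(await getReference(id)); // placeholder query function
  }
  return results;
}

// Parallel: all queries are issued at once and awaited together,
// so total time is roughly the slowest single query.
async function fetchReferencesParallel(ids, getReference) {
  return Promise.all(ids.map(id => getReference(id)));
}
```

Result order is preserved by `Promise.all`, so callers that index into the results don't need to change.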
Did some more digging and got a CPU profile for a Punchcard implementation (same as above) with a flame chart. It's pretty telling and gives me a good place to start digging in. The flame chart was created using N|Solid, which also gave memory/heap/async processes and the like. Most of that turns out not to be anywhere near as informative as the flame chart, which really shows where to dig in. I've attached the annotated flame chart, as well as the actual CPU profile. I'm going to synthesize this a little more tomorrow to come up with an action plan.
Okay, I believe I've tracked down what is causing the majority of the slowdown here:
For DB Queries, I believe we can improve performance by doing a shallow search for referenced types, then mapping over that to get all of the identifiers and values, then re-merging. For types that are referenced multiple times, this will reduce DB queries, and potentially allow us to reduce the size of the functions so they can hopefully get inlined by the compiler. Adding identifiers directly to the database would, I think, also improve performance here for large numbers of referenced items, since we could grab them directly instead of calculating them.

For HTML Generation, there's a lot of work being done in loops, and in practice there are a fair number of attributes that need to be processed. I think we can get some performance increases by parallelizing those sync tasks. I'm looking at parallel.js (old, but it just got new maintainers last month, so active development is back on). The flame-chart flares in HTML generation are also around Nunjucks rendering. I think a second place for improvement would be to do the rendering all at once instead of rendering as we go, as Nunjucks appears to have quite a few steps involved in rendering. This may be harder to do given the context needed for each render.

Finally, for Script Generation, I'm not sure what can be done in the actual Content Types module to improve performance, other than maybe parallelizing the sync tasks again, as in the HTML Generation step. I think a much easier and more effective improvement would be to cache the rendered scripts in Punchcard itself. This may prove challenging given the IDs and repeatables, but a good chunk can likely be cached between renders. The same may be true for HTML.
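A rough sketch of the shallow-search-then-merge idea for DB queries. All names here (`entries`, `referencedType`, `fetchTypeValues`) are hypothetical placeholders, not Punchcard internals; the sketch just shows how duplicate queries to the same referenced type collapse into one:

```javascript
// 1. Shallow pass collects the unique referenced types.
// 2. One query per unique type runs (in parallel).
// 3. Results are re-merged onto each referencing entry.
async function resolveReferences(entries, fetchTypeValues) {
  // Step 1: shallow search for the set of referenced types.
  const types = [...new Set(entries.map(entry => entry.referencedType))];

  // Step 2: one parallel query per unique type instead of one per entry,
  // so two entries referencing the same type share a single query.
  const fetched = await Promise.all(
    types.map(async type => [type, await fetchTypeValues(type)])
  );
  const byType = new Map(fetched);

  // Step 3: re-merge the fetched values back onto the entries.
  return entries.map(entry => ({
    ...entry,
    values: byType.get(entry.referencedType),
  }));
}
```

With the two-queries-to-the-same-type case from the production numbers above, this would cut those to one, and the per-type queries still run in parallel.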
@scottnath @joshblack My breakdown of places we can improve performance; any thoughts? If these sound like a good step forward, the next step would be for me to write stories to add to the Performance Optimization Epic.
@scottnath @joshblack Lots of stories written to get started on this. OK to close?
Well-done deep dive. Closing.
First step in #463
Performance Research