
Improve testing and tracking of performance critical components #1688

Closed

Tyriar opened this issue Sep 14, 2018 · 7 comments
Labels
area/performance · help wanted · type/debt

Comments

@Tyriar
Member

Tyriar commented Sep 14, 2018

Rendering performance has regressed over 100% since 3.3.0 (#1677), so we should improve how we test and track this. I'd love to hear from people on how we could go about doing this in a good way, but this is what I think we want (a rough sketch of what such a suite could look like follows the list):

  • A benchmark test suite which prints numbers for things like
    • Filled line render time
    • Empty line render time
    • Full viewport render time
    • Buffer write/read
    • Fill buffer
    • etc.
  • Eventually a dashboard or some way to track this periodically
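
As a very rough illustration (the case names, iteration counts and APIs below are made up for this sketch, not part of the issue), a Node-side harness could time operations with the high-resolution timer and print the numbers:

// Minimal benchmark harness sketch - illustrative only, not the eventual xterm-benchmark code.
// Times a function repeatedly with Node's high resolution timer and prints simple stats.
const { performance } = require('perf_hooks');

function bench(name, fn, iterations = 20) {
  const samples = [];
  for (let i = 0; i < iterations; i++) {
    const start = performance.now();
    fn();
    samples.push(performance.now() - start);
  }
  samples.sort((a, b) => a - b);
  const median = samples[Math.floor(samples.length / 2)];
  console.log(
    `${name}: median ${median.toFixed(3)} ms, ` +
    `min ${samples[0].toFixed(3)} ms, max ${samples[samples.length - 1].toFixed(3)} ms`
  );
}

// Example case: "fill buffer" with fake data; real cases would drive the xterm.js buffer/renderer.
bench('fill buffer (fake)', () => {
  const lines = [];
  for (let i = 0; i < 1000; i++) {
    lines.push('x'.repeat(80));
  }
});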
@jerch
Member

jerch commented Sep 15, 2018

We need an APM for this, let's stall xterm.js development for 3 years and make the performance tooling first (or gather some bucks and at least 20 highly skilled C++/JS developers to get the job done). 😆

There are a few profiling tools that will cover parts of your list with reliable results (esp. components that can be tested in Node.js with a high-res synchronous timer); as soon as the browser engine gets involved we are stuck with the nerfed timer due to Spectre. Since in the end all that counts is the user-perceived performance, the latter is still testable by doing "full runs" with typical actions (like my current ls benchmark) and comparing the numbers from the integrated profilers. Those numbers are less reliable though and contain noise; imho Chrome and Firefox use a statistical approach to get them, peeking into the JS callstack periodically. This testing could be done in a Selenium env, and maybe Electron allows additional interaction there. Last but not least, Chrome exposes many debug switches that would help with tracing tasks.

Since you wrote this issue from the canvas perf regression perspective - imho this is even more tricky to test in a reliable manner, since it heavily relies on system specifics like the OS and installed GPU and might even be driver version dependent. Under such circumstances a "once and for all" optimal solution does not exist.

TL;DR

  • We could test core components in isolation with standard Node.js tooling.
  • We could test the end user experience with typical actions in Selenium envs or with Electron and the built-in profilers.
  • No clue how to get reliable numbers for GPU driven stuff, lol.

Edit: This might come in handy - https://github.com/ChromeDevTools/timeline-viewer. It even has a compare mode.

Edit2: For in-browser tests we can use https://github.com/paulirish/automated-chrome-profiling. With this we can run test cases in Chrome and grab the profiling data. From there it's only a small step to some dashboard thingy tracking changes over time. To get something like this running, we will need decent cloud storage (the profile data tends to get really big).
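
For reference, a rough sketch of grabbing timeline data over the DevTools protocol with chrome-remote-interface (the library automated-chrome-profiling builds on) could look like the following; the port, demo URL and trace categories are assumptions, and Chrome has to be started with --remote-debugging-port beforehand:

// Sketch only - assumes Chrome is running with --remote-debugging-port=9222
// and the xterm.js demo is served at http://localhost:3000.
const CDP = require('chrome-remote-interface');
const fs = require('fs');

async function captureTimeline() {
  const client = await CDP();
  const { Page, Tracing } = client;
  const events = [];

  // Trace chunks arrive as dataCollected events after Tracing.end().
  Tracing.dataCollected(({ value }) => events.push(...value));

  await Page.enable();
  await Tracing.start({ categories: 'devtools.timeline,disabled-by-default-devtools.timeline' });
  await Page.navigate({ url: 'http://localhost:3000' });
  await Page.loadEventFired();
  // ...drive the terminal here, e.g. write a big chunk of data into it...
  await Tracing.end();
  await Tracing.tracingComplete();

  // The dump can be loaded into chrome://tracing or the timeline-viewer linked above.
  fs.writeFileSync('trace.json', JSON.stringify({ traceEvents: events }));
  await client.close();
}

captureTimeline().catch(console.error);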

@jerch
Member

jerch commented Sep 18, 2018

Here is a proof-of-concept perf tool that gets the timeline data from Chrome: https://github.com/jerch/perf-test. To run it, edit the options in example.js to your needs, start the xterm.js demo and run the example. It talks to Chrome via the debugging protocol; I was not able to get the data with the webdriver (the timeline data was removed from the Selenium chromedriver several versions ago).

@Tyriar
Member Author

Tyriar commented Sep 26, 2018

Current plan:

  1. Improve chrome-timeline https://github.com/jerch/chrome-timeline/issues
  2. Create xtermjs/xterm-benchmark which integrates with chrome-timeline and adds features such as creating a baseline to compare against and cleaning up baselines
  3. Create some reasonably reliable/consistent benchmarks
  4. Use xterm-benchmark when we're testing perf changes in PRs/versions
  5. Integrate with CI to run a baseline on the PR base branch against the PR change and comment on the PR (only when a benchmark label is present, to reduce noise?)

@jerch
Member

jerch commented Oct 11, 2018

@Tyriar
chrome-timeline should now work from npm. There are a few changes:

  • timeline now returns summaries for traces
  • by default trace data is not written to disk; this can be changed via tracingStartOptions or tracingEndOptions

@jerch
Member

jerch commented Oct 11, 2018

@Tyriar

Off-topic: I already found a rather big perf regression in the parser, remember those numbers here: #1399 (comment) - print has dropped to 50 MB/s 😱. Others also dropped, but only slightly. Not sure yet what causes it; imho there were only small fixes done to the code after those numbers were taken.

Which leads to a more on-topic question: I have those benchmark data files and scripts from the parser, which I also used to get the numbers here #1731 (comment) - I think we can use those for some first systematic perf regression testing. But where do we put them? Into xterm-benchmark? Into some subfolder in xterm.js for now, until we get xterm-benchmark properly set up and integrated? What are your plans for xterm-benchmark?

To get the ball rolling, a few ideas from my side:

  • similar layout to test cases:
    Imho the perf case layout should work similarly to test cases - meaning there are perf cases in perf files, with special symbols from xterm-benchmark to make writing those perfs easier (pretty much like mocha/jasmine do it). A made-up sketch of such a file follows this list.
  • provide a cmdline interface:
    Maybe less important for a start, but xterm-benchmark itself could provide a cmdline interface to run those perf files and do its magic (like tracking stats over several branches and reporting regressions).
  • data storage:
    To aggregate data and spot regressions, xterm-benchmark would also need some persistent storage to compare the data with previous runs. The easiest way is imho the common pattern of creating a dedicated subfolder in the source repo. Not sure yet how to save those data efficiently; maybe some JSON files will do. Don't want to pull in DB stuff from the beginning.
  • no test case mixing:
    Doing chrome-timeline, my initial goal was a nice integration with mocha test cases - well, that's a bad idea, since the debug settings are likely to skew the performance numbers. At least worth a note to not mix test and perf cases (unless someone really wants to test the debugging performance lol).
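
To make the first idea a bit more concrete, a perf case file might read roughly like this - all names here (perfContext, perfCase, before, the import paths) are invented for illustration, not an existing xterm-benchmark API:

// Hypothetical perf case file - names and imports are placeholders.
const { perfContext, perfCase, before } = require('xterm-benchmark');

perfContext('Terminal.write throughput', () => {
  let terminal;

  before(() => {
    // Assumption: the core terminal can be constructed and driven headlessly from Node.
    const { Terminal } = require('../xterm.js/lib/Terminal');
    terminal = new Terminal({ cols: 80, rows: 25 });
  });

  perfCase('print plain ASCII', () => {
    const line = 'x'.repeat(80) + '\r\n';
    for (let i = 0; i < 10000; i++) {
      terminal.write(line);
    }
  });
});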

@jerch
Member

jerch commented Nov 2, 2018

Made some progress: https://github.com/xtermjs/xterm-benchmark

  • basic working cli
  • mocha-like perf case file creation with preparation and cleanup functions
  • extensible perf case classes via mixins
  • baseline creation
  • eval runs against a given baseline with automated tests
  • configurable tolerance settings for eval runs (a rough sketch of this idea follows the list)
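
As a rough illustration of the baseline/eval/tolerance idea (not the actual xterm-benchmark implementation), an eval run could compare the current results against a stored baseline like this:

// Illustrative baseline comparison with a configurable tolerance (25% by default).
// Assumes baseline.json was written by a previous baseline run.
const fs = require('fs');

function compareToBaseline(baselinePath, current, tolerance = 0.25) {
  const baseline = JSON.parse(fs.readFileSync(baselinePath, 'utf8'));
  const failures = [];
  for (const [name, ms] of Object.entries(current)) {
    const base = baseline[name];
    if (base !== undefined && ms > base * (1 + tolerance)) {
      failures.push(`${name}: ${ms.toFixed(1)} ms vs baseline ${base.toFixed(1)} ms`);
    }
  }
  return failures; // non-empty means a regression beyond the tolerance
}

// Example: fail the run if any case regressed by more than 25%.
const failures = compareToBaseline('baseline.json', { 'print plain ASCII': 13.2 });
if (failures.length) {
  console.error('Perf regressions:\n' + failures.join('\n'));
  process.exit(1);
}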

There are a few early xterm.js tests. Those are currently hardlinked against an existing xterm.js (just check out the repo next to the xterm.js repo folder).
Run the CLI with:

#> cd xterm-benchmark
#> npm install
#> node lib/cli.js --help

For more, see https://github.com/xtermjs/xterm-benchmark/blob/master/README.md.
It's still pre-alpha, so don't expect everything to work as intended.
Enjoy 😸

@Tyriar
Member Author

Tyriar commented Oct 7, 2019

We've done lots of work on this and can now run benchmarks via npm scripts.

Tyriar closed this as completed Oct 7, 2019