
Improve testing and tracking of performance critical components #1688

Closed

Tyriar opened this issue Sep 14, 2018 · 7 comments
Labels
area/performance · help wanted · type/debt

Comments

@Tyriar
Member

Tyriar commented Sep 14, 2018

Rendering performance has regressed over 100% since 3.3.0 (#1677), so we should improve how we test and track this. I'd love to hear from people on how we could go about doing this in a good way, but this is what I think we want (a rough sketch of what such a suite could look like follows the list):

  • A benchmark test suite which prints numbers for things like
    • Filled line render time
    • Empty line render time
    • Full viewport render time
    • Buffer write/read
    • Fill buffer
    • etc.
  • Eventually a dashboard or some way to track this periodically
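
As a very rough illustration (the case names, iteration counts and APIs below are made up for this sketch, not part of the issue), a Node-side harness could time operations with the high-resolution timer and print the numbers:

// Minimal benchmark harness sketch - illustrative only, not the eventual xterm-benchmark code.
// Times a function repeatedly with Node's high resolution timer and prints simple stats.
const { performance } = require('perf_hooks');

function bench(name, fn, iterations = 20) {
  const samples = [];
  for (let i = 0; i < iterations; i++) {
    const start = performance.now();
    fn();
    samples.push(performance.now() - start);
  }
  samples.sort((a, b) => a - b);
  const median = samples[Math.floor(samples.length / 2)];
  console.log(
    `${name}: median ${median.toFixed(3)} ms, ` +
    `min ${samples[0].toFixed(3)} ms, max ${samples[samples.length - 1].toFixed(3)} ms`
  );
}

// Example case: "fill buffer" with fake data; real cases would drive the xterm.js buffer/renderer.
bench('fill buffer (fake)', () => {
  const lines = [];
  for (let i = 0; i < 1000; i++) {
    lines.push('x'.repeat(80));
  }
});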
@jerch
Member

jerch commented Sep 15, 2018

We need an APM for this, let's stall xterm.js development for 3 years and make the performance tooling first (or gather some bucks and at least 20 highly skilled C++/JS developers to get the job done). 😆

There are a few profiling tools that will cover parts of your list with reliable results (esp. components that can be tested in Node.js with a high-res synchronous timer); as soon as the browser engine gets involved we are stuck with the nerfed timer due to Spectre. Since in the end all that counts is the user-perceived performance, the latter is still testable by doing "full runs" with typical actions (like my current ls benchmark) and comparing the numbers from the integrated profilers. Those numbers are less reliable though and contain noise; imho Chrome and Firefox use a statistical approach to get them, peeking into the JS callstack periodically. This testing could be done in a Selenium env, and maybe Electron allows additional interaction there. Last but not least, Chrome exposes many debug switches that would help with tracing tasks.

Since you wrote this issue from the canvas perf regression perspective - imho this is even more tricky to test in a reliable manner, since it heavily relies on system specifics like the OS and installed GPU and might even be driver version dependent. Under such circumstances a "once and for all" optimal solution does not exist.

TL;DR

  • We could test core components in isolation with standard Node.js tooling.
  • We could test the end user experience with typical actions in Selenium envs or with Electron and the built-in profilers.
  • No clue how to get reliable numbers for GPU driven stuff, lol.

Edit: This might come in handy - https://github.com/ChromeDevTools/timeline-viewer. It even has a compare mode.

Edit2: For in-browser tests we can use https://github.com/paulirish/automated-chrome-profiling. With this we can run test cases in Chrome and grab the profiling data. From there it's only a small step to some dashboard thingy tracking changes over time. To get something like this running, we will need decent cloud storage (the profile data tends to get really big).
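
For reference, a rough sketch of grabbing timeline data over the DevTools protocol with chrome-remote-interface (the library automated-chrome-profiling builds on) could look like the following; the port, demo URL and trace categories are assumptions, and Chrome has to be started with --remote-debugging-port beforehand:

// Sketch only - assumes Chrome is running with --remote-debugging-port=9222
// and the xterm.js demo is served at http://localhost:3000.
const CDP = require('chrome-remote-interface');
const fs = require('fs');

async function captureTimeline() {
  const client = await CDP();
  const { Page, Tracing } = client;
  const events = [];

  // Trace chunks arrive as dataCollected events after Tracing.end().
  Tracing.dataCollected(({ value }) => events.push(...value));

  await Page.enable();
  await Tracing.start({ categories: 'devtools.timeline,disabled-by-default-devtools.timeline' });
  await Page.navigate({ url: 'http://localhost:3000' });
  await Page.loadEventFired();
  // ...drive the terminal here, e.g. write a big chunk of data into it...
  await Tracing.end();
  await Tracing.tracingComplete();

  // The dump can be loaded into chrome://tracing or the timeline-viewer linked above.
  fs.writeFileSync('trace.json', JSON.stringify({ traceEvents: events }));
  await client.close();
}

captureTimeline().catch(console.error);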

@jerch
Member

jerch commented Sep 18, 2018

Here is a proof-of-concept perf tool that gets the timeline data from Chrome: https://github.com/jerch/perf-test. To run it, edit the options in example.js to your needs, start the xterm.js demo and run the example. It talks to Chrome via the debugging protocol; I was not able to get the data with the webdriver (the timeline data was removed from the Selenium chromedriver several versions ago).

@Tyriar
Member Author

Tyriar commented Sep 26, 2018

Current plan:

  1. Improve chrome-timeline https://github.com/jerch/chrome-timeline/issues
  2. Create xtermjs/xterm-benchmark which integrates with chrome-timeline and adds features such as creating a baseline to compare against and cleaning up baselines
  3. Create some reasonably reliable/consistent benchmarks
  4. Use xterm-benchmark when we're testing perf changes in PRs/versions
  5. Integrate with CI to run a baseline on the PR base branch against the PR change and comment on the PR (only when a benchmark label is present, to reduce noise?)

@jerch
Member

jerch commented Oct 11, 2018

@Tyriar
chrome-timeline should now work from npm. There are a few changes:

  • timeline now returns summaries for traces
  • by default trace data is not written to disk; this can be changed via tracingStartOptions or tracingEndOptions

@jerch
Member

jerch commented Oct 11, 2018

@Tyriar

Off-topic: I already found a rather big perf regression in the parser, remember those numbers here: #1399 (comment) - print has dropped to 50 MB/s 😱. Others also dropped, but only slightly. Not sure yet what causes it; imho there were only small fixes done to the code after those numbers were taken.

Which leads to a more on-topic question: I have those benchmark data files and scripts from the parser, which I also used to get the numbers here #1731 (comment) - I think we can use those for some first systematic perf regression testing. But where do we put them? Into xterm-benchmark? Into some subfolder in xterm.js for now, until we get xterm-benchmark properly set up and integrated? What are your plans for xterm-benchmark?

To get the ball rolling, a few ideas from my side:

  • similar layout to test cases:
    Imho the perf case layout should work similarly to test cases - meaning there are perf cases in perf files, with special symbols from xterm-benchmark to make writing those perfs easier (pretty much like mocha/jasmine do it). A made-up sketch of such a file follows this list.
  • provide a cmdline interface:
    Maybe less important for a start, but xterm-benchmark itself could provide a cmdline interface to run those perf files and do its magic (like tracking stats over several branches and reporting regressions).
  • data storage:
    To aggregate data and spot regressions, xterm-benchmark would also need some persistent storage to compare the data with previous runs. The easiest way is imho the common pattern of creating a dedicated subfolder in the source repo. Not sure yet how to save those data efficiently; maybe some JSON files will do. Don't want to pull in DB stuff from the beginning.
  • no test case mixing:
    Doing chrome-timeline, my initial goal was a nice integration with mocha test cases - well, that's a bad idea, since the debug settings are likely to skew the performance numbers. At least worth a note to not mix test and perf cases (unless someone really wants to test the debugging performance lol).
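
To make the first idea a bit more concrete, a perf case file might read roughly like this - all names here (perfContext, perfCase, before, the import paths) are invented for illustration, not an existing xterm-benchmark API:

// Hypothetical perf case file - names and imports are placeholders.
const { perfContext, perfCase, before } = require('xterm-benchmark');

perfContext('Terminal.write throughput', () => {
  let terminal;

  before(() => {
    // Assumption: the core terminal can be constructed and driven headlessly from Node.
    const { Terminal } = require('../xterm.js/lib/Terminal');
    terminal = new Terminal({ cols: 80, rows: 25 });
  });

  perfCase('print plain ASCII', () => {
    const line = 'x'.repeat(80) + '\r\n';
    for (let i = 0; i < 10000; i++) {
      terminal.write(line);
    }
  });
});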

@jerch
Member

jerch commented Nov 2, 2018

Made some progress: https://github.com/xtermjs/xterm-benchmark

  • basic working cli
  • mocha-like perf case file creation with preparation and cleanup functions
  • extensible perf case classes via mixins
  • baseline creation
  • eval runs against a given baseline with automated tests
  • configurable tolerance settings for eval runs (a rough sketch of this idea follows the list)
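
As a rough illustration of the baseline/eval/tolerance idea (not the actual xterm-benchmark implementation), an eval run could compare the current results against a stored baseline like this:

// Illustrative baseline comparison with a configurable tolerance (25% by default).
// Assumes baseline.json was written by a previous baseline run.
const fs = require('fs');

function compareToBaseline(baselinePath, current, tolerance = 0.25) {
  const baseline = JSON.parse(fs.readFileSync(baselinePath, 'utf8'));
  const failures = [];
  for (const [name, ms] of Object.entries(current)) {
    const base = baseline[name];
    if (base !== undefined && ms > base * (1 + tolerance)) {
      failures.push(`${name}: ${ms.toFixed(1)} ms vs baseline ${base.toFixed(1)} ms`);
    }
  }
  return failures; // non-empty means a regression beyond the tolerance
}

// Example: fail the run if any case regressed by more than 25%.
const failures = compareToBaseline('baseline.json', { 'print plain ASCII': 13.2 });
if (failures.length) {
  console.error('Perf regressions:\n' + failures.join('\n'));
  process.exit(1);
}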

There are a few early xterm.js tests. Those are currently hardlinked against an existing xterm.js (just check out the repo next to the xterm.js repo folder).
Run the CLI with:

#> cd xterm-benchmark
#> npm install
#> node lib/cli.js --help

For more, see https://github.com/xtermjs/xterm-benchmark/blob/master/README.md.
It's still pre-alpha, so don't expect everything to work as intended.
Enjoy 😸

@Tyriar
Member Author

Tyriar commented Oct 7, 2019

We've done lots of work on this and can now run benchmarks via npm scripts.

Tyriar closed this as completed Oct 7, 2019