Add doc to run benchmark against expected UI changes #125

Merged 1 commit on Nov 2, 2022
docs/source/benchmarks/index.md: 20 additions, 0 deletions

@@ -51,6 +51,26 @@ It represents the execution time distribution for the reference JupyterLab version
- _start-debug_: Time to start the debugger
- _close_: Time to close the notebook (and display the text editor)

## Understanding the tests

As part of the tests, screenshots are generated after various actions and are expected to match the existing screenshots stored in the repository. The tests fail when the screenshots do not match, and in that case the benchmark report is not generated. The existing screenshots can be updated with the `-u` flag, as was done when running the tests against the reference above.

Using the flag `-u` also sets the values from the current run as the baseline values (`expected`) in the benchmark report. These values are stored in `tests/tests-out/lab-expected-benchmark.json`. The values from the next run without `-u` are marked as the `actual` values in the benchmark report.
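
For example, the two kinds of runs could look like the following. This is a minimal sketch: the command name `jlpm run test` is an assumption (use the Playwright-based test command shown earlier in this guide), but `-u` is Playwright's shorthand for `--update-snapshots`.

```bash
# Sketch only (assumed command name; adapt to the command used in this repo).
# A run with -u updates the expected screenshots and writes the baseline
# (`expected`) values to tests/tests-out/lab-expected-benchmark.json:
jlpm run test -u

# A later run without -u compares against the stored screenshots and records
# its timings as the `actual` values in the benchmark report:
jlpm run test
```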

### Generating a benchmark report when screenshot mismatches are expected

Sometimes it is necessary to run the benchmark tests against two versions between which UI changes are expected. In this case, the tests against the challenger will fail because the screenshots do not match, and the benchmark report will not be generated. Simply using `-u` to set the challenger's screenshots as the expected ones would be a mistake here, since it would also set the challenger's values as the baseline in the benchmark report.

In this scenario, do the following:

1. Run the tests against the reference with `-u`: this sets the expected screenshots and stores the baseline values.
2. Save the file `tests/tests-out/lab-expected-benchmark.json` created by the previous step to a temporary location.
3. Run a small sample of the tests against the challenger with `-u`: this resets the expected screenshots and creates new baseline values in `lab-expected-benchmark.json`.
4. Overwrite the `lab-expected-benchmark.json` created in step 3 with the copy saved in step 2 (the baseline from the reference run).
5. Run the tests against the challenger without `-u`.

With this sequence, the challenger tests do not fail, because the expected screenshots are those from step 3, generated by the smaller sample of the challenger itself. At the same time, the baseline values from the reference are kept as the `expected` values in the benchmark report, because they overwrote the ones created by the challenger sample.
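
The whole sequence can be scripted roughly as follows. This is a hedged sketch: the test command (`jlpm run test`), the temporary file path, and the way the reference and challenger versions or the sample size are selected are assumptions, so adapt them to the commands and environment variables used in the rest of this guide.

```bash
# Step 1: run the tests against the reference with -u
# (sets the expected screenshots and the baseline values).
jlpm run test -u

# Step 2: save the baseline file produced by step 1.
cp tests/tests-out/lab-expected-benchmark.json /tmp/reference-benchmark.json

# Step 3: switch the tests to the challenger (version selection is
# environment-specific) and run a small sample with -u so the expected
# screenshots come from the challenger itself.
jlpm run test -u   # e.g. with a reduced number of samples

# Step 4: restore the reference baseline over the file created in step 3.
cp /tmp/reference-benchmark.json tests/tests-out/lab-expected-benchmark.json

# Step 5: run the full tests against the challenger without -u.
jlpm run test
```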

## Test notebooks

The available notebook definitions are located in the [/src/notebooks](https://github.com/jupyterlab/benchmarks/tree/master/tests/generators/) folder (some have special requirements - see below):