Add doc to run benchmark against expected UI changes #125

Merged 1 commit on Nov 2, 2022
docs/source/benchmarks/index.md: 20 additions, 0 deletions

@@ -51,6 +51,26 @@ It represents the execution time distribution for the reference JupyterLab version
- _start-debug_: Time to start the debugger
- _close_: Time to close the notebook (and display the text editor)

## Understanding the tests

As part of the tests, screenshots are generated after various actions and are expected to match the existing screenshots stored in the repository. The tests fail when the screenshots do not match, and in that case the benchmark report is not generated. The existing screenshots can be updated with the `-u` flag, as was done when running the tests against the reference above.

Using the flag `-u` also sets the values from the current run as the baseline values (`expected`) in the benchmark report. These values are stored in `tests/tests-out/lab-expected-benchmark.json`. The values from the next run without `-u` are marked as the `actual` values in the benchmark report.
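
For example, the two kinds of runs could look like the following. This is a minimal sketch: the command name `jlpm run test` is an assumption (use the Playwright-based test command shown earlier in this guide), but `-u` is Playwright's shorthand for `--update-snapshots`.

```bash
# Sketch only (assumed command name; adapt to the command used in this repo).
# A run with -u updates the expected screenshots and writes the baseline
# (`expected`) values to tests/tests-out/lab-expected-benchmark.json:
jlpm run test -u

# A later run without -u compares against the stored screenshots and records
# its timings as the `actual` values in the benchmark report:
jlpm run test
```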

### Generating a benchmark report when screenshot mismatches are expected

Sometimes it is necessary to run the benchmark tests against two versions between which UI changes are expected. In this case, the tests against the challenger will fail because the screenshots do not match, and the benchmark report will not be generated. Simply using `-u` to set the challenger's screenshots as the expected ones would be a mistake here, since it would also set the challenger's values as the baseline in the benchmark report.

In this scenario, do the following:

1. Run the tests against the reference with `-u`: this sets the expected screenshots and stores the baseline values.
2. Save the file `tests/tests-out/lab-expected-benchmark.json` created by the previous step to a temporary location.
3. Run a small sample of the tests against the challenger with `-u`: this resets the expected screenshots and creates new baseline values in `lab-expected-benchmark.json`.
4. Overwrite the `lab-expected-benchmark.json` created in step 3 with the copy saved in step 2 (the baseline from the reference run).
5. Run the tests against the challenger without `-u`.

With this sequence, the challenger tests do not fail, because the expected screenshots are those from step 3, generated by the smaller sample of the challenger itself. At the same time, the baseline values from the reference are kept as the `expected` values in the benchmark report, because they overwrote the ones created by the challenger sample.
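
The whole sequence can be scripted roughly as follows. This is a hedged sketch: the test command (`jlpm run test`), the temporary file path, and the way the reference and challenger versions or the sample size are selected are assumptions, so adapt them to the commands and environment variables used in the rest of this guide.

```bash
# Step 1: run the tests against the reference with -u
# (sets the expected screenshots and the baseline values).
jlpm run test -u

# Step 2: save the baseline file produced by step 1.
cp tests/tests-out/lab-expected-benchmark.json /tmp/reference-benchmark.json

# Step 3: switch the tests to the challenger (version selection is
# environment-specific) and run a small sample with -u so the expected
# screenshots come from the challenger itself.
jlpm run test -u   # e.g. with a reduced number of samples

# Step 4: restore the reference baseline over the file created in step 3.
cp /tmp/reference-benchmark.json tests/tests-out/lab-expected-benchmark.json

# Step 5: run the full tests against the challenger without -u.
jlpm run test
```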

## Test notebooks

The available notebook definitions are located in the [/src/notebooks](https://github.com/jupyterlab/benchmarks/tree/master/tests/generators/) folder (some have special requirements - see below):