Tabular View for Comparing Baseline and Test Builds for Selected Platforms & Metrics #37
Comments
Tabular View to Display Passing % of Tests for Different Targets & Platforms
@smlambert had an excellent suggestion: we could leverage this tabular/matrix view design to display non-performance results as well, such as functional and system tests. I've put together a sample view showing one way we could display the aggregate results of all Jenkins runs for the various targets currently supported by OpenJDK that are mentioned here: https://github.com/AdoptOpenJDK/openjdk-tests. This view is just one option, and it should certainly be updated to meet the test team's specific requirements. Each result cell could show the percentage of tests that passed for a specific platform and target.
Summary of all Runs for Different Targets & Platforms
The tabular view above shows the results of various build lists run on different platforms. From the results above, we can identify the following issues easily:
…ed platforms and metrics - Closes adoptium#37 - Current filters: benchmark, platform, cell color - Allows comparison between different jdk versions and types - Cell on click redirects to Perf Compare - Perf Compare changed to fill in values from URL on load - Added sdkResource to parser, will be added as field to database
…ed platforms and metrics - Closes adoptium#37 - Current filters: benchmark, platform, cell color - Allows comparison between different jdk versions and types - Cell on click redirects to Perf Compare - Perf Compare changed to fill in values from URL on load - Added sdkResource to parser, will be added as field to database Signed-off-by: Awsaf Arefin Sakif <awsaf.sakif@ibm.com>
…ed platforms and metrics - Closes AdoptOpenJDK#37 - Current filters: benchmark, platform, cell color - Allows comparison between different jdk versions and types - Cell on click redirects to Perf Compare - Perf Compare changed to fill in values from URL on load - Added sdkResource to parser, will be added as field to database Signed-off-by: Awsaf Arefin Sakif <awsaf.sakif@ibm.com>
…ed platforms and metrics - Closes AdoptOpenJDK#37 - Current filters: benchmark, platform, cell color - Allows comparison between different jdk versions and types - Cell on click redirects to Perf Compare - Perf Compare changed to fill in values from URL on load - Added sdkResource to parser, will be added as field to database Co-authored-by: Piyush Gupta piyush286@gmail.com Signed-off-by: Awsaf Arefin Sakif <awsaf.sakif@ibm.com>
…ed platforms and metrics - Closes AdoptOpenJDK#37 - Current filters: benchmark, platform, cell color - Allows comparison between different jdk versions and types - Cell on click redirects to Perf Compare - Perf Compare changed to fill in values from URL on load - Added sdkResource to parser, will be added as field to database Co-authored-by: Piyush Gupta <piyush286@gmail.com> Signed-off-by: Awsaf Arefin Sakif <awsaf.sakif@ibm.com>
…ed platforms and metrics - Closes AdoptOpenJDK#37 - Current filters: benchmark, platform, cell color - Allows comparison between different jdk versions and types - Cell on click redirects to Perf Compare - Perf Compare changed to fill in values from URL on load - Added sdkResource to parser, will be added as field to database Co-Authored-By: Piyush Gupta <piyush286@gmail.com> Signed-off-by: Awsaf Arefin Sakif <awsaf.sakif@ibm.com>
…ed platforms and metrics - Closes AdoptOpenJDK#37 - Current filters: benchmark, platform, cell color - Allows comparison between different jdk versions and types - Cell on click redirects to Perf Compare - Perf Compare changed to fill in values from URL on load - Added sdkResource to parser, will be added as field to database - Warning sign appears if total CI exceeds percentage difference Co-Authored-By: Piyush Gupta <piyush286@gmail.com> Signed-off-by: Awsaf Arefin Sakif <awsaf.sakif@ibm.com>
Related to adoptium#136 adoptium/aqa-tests#1144 adoptium#37
Tabular View Changes
- Enabled the setting of JDK date (i.e. benchmarkProduct) to be dynamic instead of expecting the launch agents such as PerfNext or TestKitGen to set it.
  ○ JDK date is used on Tabular View to show the data of the latest baseline and test builds before that JDK date.
- Updated the Tabular View query for fetching unique build names, SDK resources and build servers, the options that are displayed for choosing the desired baseline or test builds for comparison.
  ○ Query didn't have the correct SDK resource location.
- Resolved the issue of Tabular View incorrectly setting states for dropdown options.
- Fixed the Tabular View query for getting the filtered data by updating the SDK resource location.
Benchmark Parser Changes
- Enabled TRSS to parse some benchmarks, such as Liberty, under the Adopt openjdk-tests repo. This design is to be extended further in future PRs to allow parsing of other benchmarks as well.
- Simplified perf parser regexes to get various benchmark info such as benchmark name, variant and JDK info.
  ○ Removed some constraints so that all info can be parsed without being affected by Jenkins timestamps.
- Updated the Java version regex for Open builds.
- Added an extra regex check for parent builds to prevent TRSS from considering a perf build from Adopt as a test build. Example of an Adopt perf pipeline name (https://ci.adoptopenjdk.net/view/Test_perf/): Test_openjdk8_j9_sanity.perf_x86-64_linux.
- Enabled parsing for ODM 300 Ruleset.
Signed-off-by: Piyush Gupta <piyush286@gmail.com>
Related to adoptium#136 adoptium/aqa-tests#1144 adoptium#37
Tabular View Changes
- Enabled the setting of JDK date (i.e. benchmarkProduct) to be dynamic instead of expecting the launch agents such as PerfNext or TestKitGen to set it.
  ○ JDK date is used on Tabular View to show the data of the latest baseline and test builds before that JDK date.
- Updated the Tabular View query for fetching unique build names, SDK resources and build servers, the options that are displayed for choosing the desired baseline or test builds for comparison.
  ○ Query didn't have the correct SDK resource location.
- Resolved the issue of Tabular View incorrectly setting states for dropdown options.
- Fixed the Tabular View query for getting the filtered data by updating the SDK resource location.
Benchmark Parser Changes
- Enabled TRSS to parse some benchmarks, such as Liberty, under the Adopt openjdk-tests repo. This design is to be extended further in future PRs to allow parsing of other benchmarks as well.
- Simplified perf parser regexes to get various benchmark info such as benchmark name, variant and JDK info.
  ○ Removed some constraints so that all info can be parsed without being affected by Jenkins timestamps.
- Updated the Java version regex for Open builds.
- Enabled parsing for ODM 300 Ruleset.
Signed-off-by: Piyush Gupta <piyush286@gmail.com>
Related to adoptium#136 adoptium/aqa-tests#1144 adoptium#37
Tabular View Changes
- Enabled the setting of JDK date (i.e. benchmarkProduct) to be dynamic instead of expecting the launch agents such as PerfNext or TestKitGen to set it.
  ○ JDK date is used on Tabular View to show the data of the latest baseline and test builds before that JDK date.
- Updated the Tabular View query for fetching unique build names, SDK resources and build servers, the options that are displayed for choosing the desired baseline or test builds for comparison.
  ○ Query didn't have the correct SDK resource location.
- Resolved the issue of Tabular View incorrectly setting states for dropdown options.
- Fixed the Tabular View query for getting the filtered data by updating the SDK resource location.
Benchmark Parser Changes
- Enabled TRSS to parse some benchmarks, such as Liberty, under the Adopt openjdk-tests repo. This design is to be extended further in future PRs to allow parsing of other benchmarks as well.
- Simplified perf parser regexes to get various benchmark info such as benchmark name, variant and JDK info.
  ○ Removed some constraints so that all info can be parsed without being affected by Jenkins timestamps.
- Updated the Java version regex to capture all kinds of JDK builds (IBM J9, Open J9, HotSpot, OpenJDK).
- Moved `javaVersion` to a higher common level in the data structure.
- Updated Perf graph widgets to use `jdkDate` instead of `jdkBuildDateUnixTime`.
- Removed `jdkBuildDateUnixTime` since it's redundant now as we're storing the jdk
- Renamed `benchmarkProduct` to `jdkDate` to reflect the data that it actually stores, and updated the code in Data Manager, Perf Compare and Tabular View.
- Enabled parsing for ODM 300 Ruleset.
Signed-off-by: Piyush Gupta <piyush286@gmail.com>
Background About Benchmarking
For benchmarking, we always launch several iterations of a benchmark with a specific build to get performance results for various metrics such as throughput and startup time. On their own, these raw numbers are not very useful, since they can change when the benchmark is run on another platform, when the machine state isn't identical, or when the configs are slightly different. Hence, we always use a baseline to gauge the performance of a newer test build.
When comparing baseline and test builds, it's important to look at the performance gap using a relative number (Build 1 Score / Build 2 Score) rather than an absolute number (Build 1 Score - Build 2 Score). The absolute difference doesn't mean much by itself: it can change from one setup to another, and its range can vary significantly.
We usually use this formula for the comparison:
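As a rough illustration of the relative comparison described above, the following is a minimal sketch and not necessarily the exact formula used by the team. The function name `relative_score_percent` and the `higher_is_better` flag are assumptions introduced here; the flag inverts the ratio for metrics such as startup time, where a lower score is the better one.

```python
def relative_score_percent(baseline: float, test: float,
                           higher_is_better: bool = True) -> float:
    """Express the test build's score as a percentage of the baseline.

    Hypothetical helper: 100.0 means parity with the baseline, values below
    100.0 suggest a regression and values above 100.0 an improvement.
    """
    if baseline <= 0 or test <= 0:
        raise ValueError("scores must be positive")
    if higher_is_better:
        # e.g. throughput: a larger test score is better
        return test * 100.0 / baseline
    # e.g. startup time: a smaller test score is better, so invert the ratio
    return baseline * 100.0 / test


# Throughput example (made-up numbers): the test build reaches 95% of baseline.
throughput_pct = relative_score_percent(baseline=2000.0, test=1900.0)

# Startup time example (made-up numbers): the test build is slower.
startup_pct = relative_score_percent(baseline=10.0, test=12.5,
                                     higher_is_better=False)
```

Because the result is a ratio, it stays comparable across platforms and benchmarks whose raw scores differ by orders of magnitude.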
Details about the Proposed Feature
Test Result Summary (TRS) should be able to create and show tabular views for comparing a baseline build and a test build. In each view, a result cell should show the relative comparison between the baseline and test builds, as a percentage, for one specific metric and platform. These result cells should be painted with different colors to classify the performance according to the table shown below.
Color Scheme for Result Cells
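To make the idea of color-classified cells concrete, here is a small sketch of how a relative score could be mapped to a cell color. The band boundaries and color names below are placeholders chosen only for illustration; they are assumptions and do not reproduce the actual color scheme for result cells.

```python
# Hypothetical bands: (lower bound in percent, color), highest band first.
# These values are placeholders, NOT the thresholds from the proposed scheme.
HYPOTHETICAL_BANDS = [
    (98.0, "green"),
    (95.0, "yellow"),
    (0.0, "red"),
]


def cell_color(percent: float, bands=HYPOTHETICAL_BANDS) -> str:
    """Return the color of the first band whose lower bound the score meets."""
    for lower_bound, color in bands:
        if percent >= lower_bound:
            return color
    # Scores below every lower bound fall into the last (worst) band.
    return bands[-1][1]


colors = [cell_color(p) for p in (101.2, 96.4, 88.0)]
```

Keeping the bands in a plain list makes it easy to swap in whatever thresholds the team settles on without touching the lookup logic.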
These tabular views would be extremely helpful in finding regressions. I'm going to show their benefits with two examples.
Example 1
SPEC Benchmarks
The tabular view above shows the results of all the SPEC benchmarks run on different platforms. From the results above, we can identify the following regressions easily:
Example 2
Micro Benchmarks
The tabular view above shows the results of all the micro benchmarks run on different platforms. From the results above, we can identify the following regressions easily:
Requirements for Tabular Views
Basic requirements of this tabular comparison view:
Advanced Requirements for Tabular Views
Assigned Contributors
My team would work on adding this functionality.