diff --git a/MLPerf_Results_Messaging_Guidelines.adoc b/MLPerf_Results_Messaging_Guidelines.adoc
index f36f1cb..ccc22df 100644
--- a/MLPerf_Results_Messaging_Guidelines.adoc
+++ b/MLPerf_Results_Messaging_Guidelines.adoc
@@ -12,6 +12,10 @@ If you used an MLPerf™ benchmark to obtain a result for your product or servic
 
 Since your results have not gone through MLCommons review and, therefore, have not been verified by MLCommons, you must indicate your results are not verified by using the term “unverified” next to each result score and by using the following language when publishing or otherwise discussing your results: “_Result not verified by MLCommons Association._” You can include this statement in a footnote, as described in Section 3 below.
 
+If the components (e.g. HW) that substantially determine ML performance of an "unverified" score also have a verified official score (e.g. same HW with a different submission SW stack), it is required to state the official submission score of the closest available system in any public comparisons.
+
+Note that benchmark reference code implementations often prioritize readability and should not be mistaken for performance optimized implementations (e.g. reference code performance should not be equated with "out-of-box" performance).
+
 == Use of MLPerf™ Benchmark for MLCommons Reviewed and Verified Results
 
 If you used an MLPerf benchmark to obtain a result for your product or service, you submitted your result for MLCommons review, and your result was verified through such review, you may indicate your results are verified when publishing or otherwise discussing your results, by indicating your results are “verified” or “official” or by otherwise following the examples below for verified results. You may also choose to use this language: “_Result verified by MLCommons Association._” You can include this statement in a footnote, as described in Section 3 below.
@@ -77,8 +81,8 @@ MLPerf results may not be compared against non-MLPerf results. For example, an M
 
 == When comparing MLPerf results, you must identify any submission differences
 
-When comparing results the main text, table, or figure must clearly identify any difference in version, division, category, verified or unverified status, scenario or chip count (count of the compute devices executing the largest number of ops, which could be processors or accelerators). When comparing Open and Closed division results, any ways in which the Open result would not qualify as a Closed result must be identified.
-
+When comparing results the main text, table, or figure must clearly identify any difference in version, division, category, verified or unverified status, scenario or chip count (count of the compute devices executing the largest number of ops, which could be processors or accelerators) or submitter name. When comparing Open and Closed division results, any ways in which the Open result would not qualify as a Closed result must be identified. When making comparisons, submissions must not be portrayed as representing the performance available from a submitter unless the submission was by that submitter - e.g. a submission by SuperServers Inc that happened to use an accelerator from AccelCorp Ltd must not be portrayed as representing AccelCorp's performance.
+
 **Example for Non-MLCommons Reviewed Result**:
 
 ____
@@ -95,6 +99,14 @@ SmartAI Corp achieved a score of 0.6 on the MLPerf™ Image Classification bench
 **Required Footnote**: “[1]Result verified by MLCommons Association. MLPerf™ name and logo are trademarks of MLCommons Association in the United States and other countries. All rights reserved. Unauthorized use strictly prohibited. See www.mlcommons.org for more information.”
 ____
 
+or
+
+____
+SmartAI Corp achieved a score of 1.2 on the MLPerf™ NLP benchmark using a SmartCluster with 8 chips in the Available category of Closed Division which is faster than the result of 7.2 achieved by LessSmartAI Corp with 16 chips from HardwareVendorX in the Available on-premise category of Closed Division.[1]
+
+**Required Footnote**: “[1]Result verified by MLCommons Association. MLPerf™ name and logo are trademarks of MLCommons Association in the United States and other countries. All rights reserved. Unauthorized use strictly prohibited. See www.mlcommons.org for more information.”
+____
+
 Furthermore, a comparison of an unverified result with a verified result must include the following statement in a footnote: “_Unverified results have not been through an MLPerf™ review and may use measurement methodologies and/or workload implementations that are inconsistent with the MLPerf™ specification for verified results._”
 
 **Example (applicable to Non-MLCommons Reviewed Result)**: