Closed
Labels
P1 Launch 2023: High priority issues for October 2023 AlgoPerf Launch
🚀 Launch Blocker: Issues that are blocking launch of benchmark
Description
Currently the scoring code is not consistent with the scoring procedure.
The scoring procedure for a submission:
- For each workload, look at 5 trials.
- If >=3 trials have reached the target, use the best trial out of the 5 trials in terms of submission time.
- If <3 trials have reached the target, the submission did not reach the target on that workload.
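The per-workload rule above can be sketched as follows. This is a minimal illustration, not the code in scoring.py: the function name `workload_score`, the use of `None` for a trial that missed the target, and `math.inf` for "did not reach the target" are all assumptions, as is reading "best" as the smallest time to target.

```python
import math
from typing import Optional, Sequence

# Threshold taken from the procedure described above.
MIN_SUCCESSFUL_TRIALS = 3


def workload_score(trial_times: Sequence[Optional[float]]) -> float:
    """Hypothetical helper: score one workload from its trials.

    trial_times holds one entry per trial: the time at which the trial
    reached the target, or None if it never did (assumed representation).
    """
    successful = [t for t in trial_times if t is not None]
    if len(successful) < MIN_SUCCESSFUL_TRIALS:
        # Fewer than 3 successful trials: the workload counts as not reached.
        return math.inf
    # Assumption: the "best" trial is the one with the smallest time.
    return min(successful)
```

With three successful trials, `workload_score([100.0, None, 80.0, 120.0, None])` returns `80.0`; with only two, the result is `math.inf`.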
In scoring.py we have to make the following modifications:
- Print a warning if fewer than 5 trials are found for a workload submission.
- Check that at least 3 of the 5 trials have reached the target, and only then use the best trial to score the submission (bug fix).
- Print a warning if fewer than 8 workloads are found.
- Make sure the percentage of workloads that reached their targets is calculated over all 8 workloads, not over the number of workloads in the submission directory. Currently, if the user scores a submission over an experiment with n workloads, n is used as the denominator.
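A rough sketch of how these checks could fit together is shown below. It does not reflect the actual structure of scoring.py: the function name `fraction_of_targets_reached`, the dictionary layout, and the choice of `warnings.warn` over a plain `print` are assumptions made for illustration.

```python
import warnings
from typing import Dict, Optional, Sequence

# Counts taken from the issue text.
NUM_WORKLOADS = 8
NUM_TRIALS = 5
MIN_SUCCESSFUL_TRIALS = 3


def fraction_of_targets_reached(
    results: Dict[str, Sequence[Optional[float]]]
) -> float:
    """Hypothetical helper: fraction of all 8 workloads that hit the target.

    results maps a workload name to its per-trial times, with None for a
    trial that did not reach the target (assumed representation).
    """
    if len(results) < NUM_WORKLOADS:
        warnings.warn(
            f"Found {len(results)} workloads, expected {NUM_WORKLOADS}."
        )

    reached = 0
    for workload, trial_times in results.items():
        if len(trial_times) < NUM_TRIALS:
            warnings.warn(
                f"Found {len(trial_times)} trials for {workload}, "
                f"expected {NUM_TRIALS}."
            )
        successes = sum(t is not None for t in trial_times)
        if successes >= MIN_SUCCESSFUL_TRIALS:
            reached += 1

    # Divide by the fixed workload count, not by len(results).
    return reached / NUM_WORKLOADS
```

Dividing by the fixed constant means a submission scored on a single workload that reaches its target yields 1/8 rather than 1/1.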