Aggregate Perf Results From Multiple Benchmark Iterations #73
Design for Aggregating Perf Data

For perf, we'll have two kinds of builds: parent and child builds. We'll do similar aggregation for both of them, with minor differences.

Child Build

For each child build, we'll aggregate the data of all iterations run inside that Jenkins build. We aggregate only the good data and ignore iterations whose values are null.

Raw Data for Child:

Aggregated Data Structure for Child:
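As an illustration of the aggregation rule described above (keep only the good data and ignore iterations whose values are null), here is a minimal sketch in TypeScript; all type, field, and function names are hypothetical rather than the actual TRSS schema:

```typescript
interface ChildIterationData {
  iteration: number;
  value: number | null; // null when the iteration produced no usable result
}

interface ChildAggregateData {
  mean: number;
  median: number;
  ci: number; // relative half-width of a rough 95% confidence interval
  validIterations: number;
}

// Aggregate only the valid data points; null iterations are ignored.
function aggregateChild(raw: ChildIterationData[]): ChildAggregateData | null {
  const values = raw
    .map(r => r.value)
    .filter((v): v is number => v !== null);
  if (values.length === 0) return null;

  const mean = values.reduce((a, b) => a + b, 0) / values.length;
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  const median =
    sorted.length % 2 === 0 ? (sorted[mid - 1] + sorted[mid]) / 2 : sorted[mid];

  // Normal approximation for the CI; a real implementation might use the
  // t-distribution for small iteration counts.
  const variance =
    values.reduce((a, b) => a + (b - mean) ** 2, 0) /
    (values.length - 1 || 1);
  const ci =
    values.length > 1 ? (1.96 * Math.sqrt(variance / values.length)) / mean : 0;

  return { mean, median, ci, validIterations: values.length };
}
```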
Parent Build

For each parent build, we'll aggregate the "raw" (not the "aggregated") data of all child builds that were launched by that parent build. Each child may carry a different weight if it has a different number of valid data points than the other child jobs, so we'll use a weighted average to get the most accurate results.

Raw Data for Parent:

Aggregated Data for Parent:
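A minimal sketch of the parent-level weighted average described above, again with hypothetical names. Pooling every child's raw values and taking the mean is equivalent to weighting each child's mean by its number of valid data points:

```typescript
// Hypothetical shape; validIterations counts a child's non-null data points.
type ChildAggregate = { mean: number; validIterations: number };

function aggregateParentMean(children: ChildAggregate[]): number | null {
  const totalPoints = children.reduce((sum, c) => sum + c.validIterations, 0);
  if (totalPoints === 0) return null;

  // Children with more valid data points contribute proportionally more,
  // which is the weighted average described in this issue.
  const weightedSum = children.reduce(
    (sum, c) => sum + c.mean * c.validIterations,
    0
  );
  return weightedSum / totalPoints;
}
```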
[Screenshot: Sample Parent Job]

[Screenshot: Sample Child Job]
@llxia It's useful to weight all children equally when we are interleaving, so that two interleaved builds for baseline and test give each iteration similar weight, since similar factors would affect the same iteration in both. But you're right: it's more accurate to take weighted averages so that we divide by the valid number of data points. @sophiaxu0424 Could you please update your changes? Thanks! I'll create another issue for updating Dashboard & Perf Compare.
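To make the difference concrete (with invented numbers): if child A has 10 valid iterations with mean 100 and child B has 2 valid iterations with mean 130, an equal-weight average gives (100 + 130) / 2 = 115, while the weighted average gives (10 × 100 + 2 × 130) / 12 = 105, matching the mean of the pooled raw data points.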
Closing this since all related work to this issue has been completed.
Problem Description
Currently, we don't aggregate numbers across multiple benchmark iterations when each Jenkins build is stored in the database. As a result, results such as the average, median, and confidence interval have to be calculated every time Perf Compare is used to compare two builds. This design is undesirable for the following reasons:
1) It takes time to generate Perf Reports through Perf Compare.
2) Aggregated results are not stored, so they must be regenerated every time they are needed, even though they don't change.
3) It requires more CPU time and puts unnecessary pressure on the database.
The proposed changes below should resolve these issues and significantly improve the speed of retrieving results, which is needed for views such as Dashboard (#28) and Tabular View (#37).
Proposed Changes

Aggregate the results of all benchmark iterations when each Jenkins build is stored, and save the aggregated data in the testResults collection.
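As a rough illustration of what such a stored document might contain, here is a hedged sketch; every field name below is hypothetical and may not match the actual testResults schema in TRSS:

```typescript
// A hypothetical example of aggregated perf data stored with a build in the
// testResults collection. Field names are illustrative, not the real schema.
const exampleTestResult = {
  buildName: "Perf_sample_build",       // hypothetical build name
  aggregateInfo: [
    {
      benchmarkName: "SampleBenchmark", // hypothetical benchmark
      metrics: [
        {
          name: "throughput",           // hypothetical metric name
          // Precomputed statistics, so Perf Compare can read them directly
          // instead of recalculating on every comparison.
          statValues: {
            mean: 105,
            median: 103,
            ci: 0.02,                   // relative 95% CI half-width
            validIterations: 12,
          },
        },
      ],
    },
  ],
};
```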
Assigned Contributors

Sophia (@sophiaxu0424) from my team will work on this feature.