🚀 Feature

Have the option to `merge` a chosen selection of runs (e.g. obtained manually or by a filter). For merged runs:

- A new hash will be created for the merged row.
- Numerical metrics will be transformed to mean ± standard deviation.
- Metrics with more than one unique value for which aggregation is meaningless (e.g. date) will be set to `None`.
- Metrics with a single unique value will remain constant.
- An option is included to 'unmerge' the runs (i.e. expand and restore them to their previous state).
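To make the requested semantics concrete, here is a minimal sketch of how the merge rules above could work. This is purely illustrative: `merge_runs`, the `hash` key, and the run-dict shape are assumptions, not part of Aim's actual API.

```python
# Hypothetical sketch of the proposed merge semantics.
import hashlib
import statistics

def merge_runs(runs: list[dict]) -> dict:
    """Collapse a selection of run dicts into one merged row."""
    # A new hash identifies the merged row, derived from the member hashes.
    joined = "".join(sorted(r["hash"] for r in runs))
    merged = {"hash": hashlib.sha1(joined.encode()).hexdigest()[:8]}
    keys = {k for r in runs for k in r if k != "hash"}
    for key in keys:
        values = [r.get(key) for r in runs]
        if all(isinstance(v, (int, float)) and not isinstance(v, bool)
               for v in values):
            # Numerical metrics become mean ± standard deviation.
            mean = statistics.mean(values)
            std = statistics.stdev(values) if len(values) > 1 else 0.0
            merged[key] = f"{mean:.4f} ± {std:.4f}"
        elif len(set(values)) == 1:
            # Metrics with a single unique value stay constant.
            merged[key] = values[0]
        else:
            # Aggregation is meaningless (e.g. dates) -> None.
            merged[key] = None
    return merged
```

Unmerging would then just mean keeping the original run dicts around and re-expanding them, since the merge is a pure view over the selection.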
Motivation
In most AI/ML research papers, result metrics are reported as the mean and standard deviation over repeated runs with identical configuration to account for randomness in training.
Currently, I either have to:

1. Incorporate 'repetitions' into my experiment code. This is clumsy and frustrating: it seems sensible for my code/model to focus on a single training and evaluation pipeline, while my experiment tracker is responsible for aggregating the results across those runs (almost by definition).
2. Export the experiment runs to a .csv and manually aggregate them to compute means ± standard deviations.
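For reference, the manual workaround in option 2 looks roughly like the following. The column names (`mae_benchmark`, `mse_benchmark`) and values are illustrative assumptions, not real exported data.

```python
# Illustrative version of the manual workaround: export runs to a .csv,
# then compute mean ± std per metric by hand (here via the stdlib).
import csv
import io
import statistics

# Stand-in for a CSV exported from the tracker; values are made up.
exported = io.StringIO(
    "run_hash,mae_benchmark,mse_benchmark\n"
    "a1b2,1.00,2.00\n"
    "c3d4,1.20,2.40\n"
    "e5f6,1.10,2.20\n"
)
rows = list(csv.DictReader(exported))
for metric in ("mae_benchmark", "mse_benchmark"):
    values = [float(r[metric]) for r in rows]
    print(f"{metric}: {statistics.mean(values):.3f} "
          f"± {statistics.stdev(values):.3f}")
```

This is exactly the kind of repetitive bookkeeping that the proposed merge feature would absorb into the GUI.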
Admittedly, neither of these takes very long, but both are frustrating and feel like they could be incorporated into the Aim GUI relatively easily, giving a nice quality-of-life improvement.
For example, in the above, I'd like to collapse the selected runs so that the `mae_benchmark` and `mse_benchmark` metrics are aggregated, with the mean and standard deviation reported.