Update scaling experiment with K=1. #8

liehe · 2019-03-11T21:02:20Z

Update the experiment which benchmarks linear scaling rule and baseline for K=1.
Delete the cost / throughput figures in the benchmark task results.
Add speedup plots to the benchmark task results.

martinjaggi · 2019-03-11T21:25:28Z

thanks a lot!

instead of time-to-accuracy, can you plot the relative time speedup?
and can you also push the data behind the plot? this way it could be easier to use for different purposes or if we would change plot styles etc
the cost legend says time instead of cost

liehe · 2019-03-11T22:23:58Z

@martinjaggi

The figure here https://github.com/mlbench/mlbench-docs/blob/update_scaling_exp/images/val_tta.png is the relative time speedup to reach certain accuracies. Is this the one what you want? Or is the relative time speedup similar to throughput?
For the moment, I keep the raw data in my personal repository here. I will update this zip file later.
I will include the cost plot with K=1.

martinjaggi · 2019-03-11T22:33:28Z

indeed val_tta.png is the one, thanks. let's have more vertical space for clearer lines, also show the perfect line, and limit to 2 accuracies (see code comment)

martinjaggi · 2019-03-11T22:30:20Z

benchmark-tasks.rst

@@ -101,6 +101,7 @@ Image classification is one of the most important problems in computer vision an
 #. **Training Algorithm**
    We use standard synchronous SGD as the optimizer (that is distributed mini-batch SGD with synchronous all-reduce communication after each mini-batch).

+    - Model: Resnet 20


martinjaggi · 2019-03-11T22:32:08Z

benchmark-tasks.rst

-    :align: center
-
-* The second figure shows speedups of time-to-accuracy for Top-1 accuracy in 70%, 75%, 80%, 85%, 90%, 91%. Note that the 0 speedup means specified accuracy is not reached within the predefined maximum epochs. The linear scaling rule does not outperform baseline for accuracy <= 85%. However, in order to reach 90%+ accuracy,  using linear scaling is much better than the baseline.
+* The first figure shows speedups of time-to-accuracy for Top-1 accuracy in 70%, 75%, 80%, 85%, 90%, 91%. Note that the 0 speedup means specified accuracy is not reached within the predefined maximum epochs. The same accuracy is (relatively) much slower to reach as the number of machines grows. This is know as the issues of large-batch training. The linear scaling rule does not outperform baseline for accuracy <= 85%. However, to reach 90% or higher accuracy, using linear scaling is obviously better than the baseline.


let's maybe just focus on two accuracy levels (easy vs hard). but comment on how initial stepsize is chosen (and if failing at larger K can be avoided by tuning it). precise reproducible stepsizes then need to be written into the benchmark task description

negar-foroutan

Thanks Lie. I think we can just keep the plots for accuracy levels 91% and 70% and explain the resullts for the levels in between. We should also explain how we chose the initial learning rate.

martinjaggi · 2019-03-12T11:07:15Z

ok. also, instead of all raw data please put the data just for the main plot (as a csv say). @Panaetius what do you think?

Panaetius · 2019-03-12T13:51:55Z

@martinjaggi Makes sense. Maybe we should make a repository for just raw data. Not so much to present to the public (i.e. maybe just a small link somewhere in the docs) but for re-use so we don't need to run each experiment again every time we need the data.

martinjaggi · 2019-03-12T20:55:09Z

yes. but the plots which are part of the official task results should even have their own csv / pickle for easier reference and export

…

On Tue, Mar 12, 2019 at 2:52 PM Ralf Grubenmann ***@***.***> wrote: @martinjaggi <https://github.com/martinjaggi> Makes sense. Maybe we should make a repository for just raw data. Not so much to present to the public (i.e. maybe just a small link somewhere in the docs) but for re-use so we don't need to run each experiment again every time we need the data. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#8 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AEaGR-rqeOvoD8Hlhzmkfb1QRMnpXEhgks5vV7EVgaJpZM4bpjlO> .

martinjaggi · 2019-04-09T09:51:02Z

hasn't been merged yet. might need to make a small script to generate the official results (time to acc etc) from the raw results. needs to be automatic as it's part of the official benchmark

Update scaling experiment with K=1.

ea21e74

martinjaggi requested a review from negar-foroutan March 11, 2019 21:25

minor.

248a655

martinjaggi reviewed Mar 11, 2019

View reviewed changes

negar-foroutan reviewed Mar 12, 2019

View reviewed changes

This was referenced Apr 9, 2019

add script to compute official results from raw results (time to acc for example) mlbench/mlbench-benchmarks#18

Closed

clean up task definitions and results #9

Closed

Panaetius closed this Jun 15, 2020

Panaetius deleted the update_scaling_exp branch June 15, 2020 14:36

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update scaling experiment with K=1. #8

Update scaling experiment with K=1. #8

liehe commented Mar 11, 2019

martinjaggi commented Mar 11, 2019

liehe commented Mar 11, 2019

martinjaggi commented Mar 11, 2019

martinjaggi Mar 11, 2019

martinjaggi Mar 11, 2019

negar-foroutan left a comment

martinjaggi commented Mar 12, 2019

Panaetius commented Mar 12, 2019

martinjaggi commented Mar 12, 2019 via email

martinjaggi commented Apr 9, 2019

Update scaling experiment with K=1. #8

Update scaling experiment with K=1. #8

Conversation

liehe commented Mar 11, 2019

martinjaggi commented Mar 11, 2019

liehe commented Mar 11, 2019

martinjaggi commented Mar 11, 2019

martinjaggi Mar 11, 2019

Choose a reason for hiding this comment

martinjaggi Mar 11, 2019

Choose a reason for hiding this comment

negar-foroutan left a comment

Choose a reason for hiding this comment

martinjaggi commented Mar 12, 2019

Panaetius commented Mar 12, 2019

martinjaggi commented Mar 12, 2019 via email

martinjaggi commented Apr 9, 2019