
Update scaling experiment with K=1. #8

Closed
wants to merge 2 commits

Conversation

@liehe (Member) commented Mar 11, 2019

  • Update the experiment which benchmarks linear scaling rule and baseline for K=1.
  • Delete the cost / throughput figures in the benchmark task results.
  • Add speedup plots to the benchmark task results.

@martinjaggi (Member)

thanks a lot!

  • instead of time-to-accuracy, can you plot the relative time speedup?
  • and can you also push the data behind the plot? That way it would be easier to reuse for different purposes, or if we change plot styles, etc.
  • the cost legend says time instead of cost

@liehe (Member, Author) commented Mar 11, 2019

@martinjaggi

  • The figure here https://github.com/mlbench/mlbench-docs/blob/update_scaling_exp/images/val_tta.png is the relative time speedup to reach certain accuracies. Is this the one you want? Or is the relative time speedup similar to throughput?
  • For the moment, I keep the raw data in my personal repository here. I will update this zip file later.
  • I will include the cost plot with K=1.

@martinjaggi (Member)

indeed val_tta.png is the one, thanks. let's have more vertical space for clearer lines, also show the perfect (ideal) scaling line, and limit to 2 accuracies (see code comment)

@@ -101,6 +101,7 @@ Image classification is one of the most important problems in computer vision an
#. **Training Algorithm**
We use standard synchronous SGD as the optimizer (that is distributed mini-batch SGD with synchronous all-reduce communication after each mini-batch).

- Model: Resnet 20
Member
add k=1

:align: center

* The second figure shows speedups of time-to-accuracy for Top-1 accuracy at 70%, 75%, 80%, 85%, 90%, and 91%. Note that a speedup of 0 means the specified accuracy is not reached within the predefined maximum number of epochs. The linear scaling rule does not outperform the baseline for accuracy <= 85%. However, in order to reach 90%+ accuracy, using linear scaling is much better than the baseline.
* The first figure shows speedups of time-to-accuracy for Top-1 accuracy at 70%, 75%, 80%, 85%, 90%, and 91%. Note that a speedup of 0 means the specified accuracy is not reached within the predefined maximum number of epochs. The same accuracy is (relatively) much slower to reach as the number of machines grows; this is known as the large-batch training problem. The linear scaling rule does not outperform the baseline for accuracy <= 85%. However, to reach 90% or higher accuracy, using linear scaling is clearly better than the baseline.
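For concreteness, the speedup plotted here can be computed as in the following minimal sketch. It assumes each run is available as a list of (elapsed seconds, Top-1 accuracy) pairs; the data layout and function names are hypothetical, not the actual mlbench code.

```python
def time_to_accuracy(curve, target):
    """Return the first elapsed time at which the run reaches `target`
    Top-1 accuracy, or None if it never does within the run."""
    for elapsed, acc in curve:
        if acc >= target:
            return elapsed
    return None

def speedup(baseline_curve, scaled_curve, target):
    """Speedup of the scaled run over the baseline for one accuracy level.
    Returns 0 when either run never reaches the target, matching the
    convention in the text (0 speedup = accuracy not reached)."""
    t_base = time_to_accuracy(baseline_curve, target)
    t_scaled = time_to_accuracy(scaled_curve, target)
    if t_base is None or t_scaled is None:
        return 0.0
    return t_base / t_scaled
```

For example, if the baseline reaches 91% Top-1 after 300 s and the scaled run after 150 s, the plotted speedup at the 91% level is 2.0.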
Member
let's maybe just focus on two accuracy levels (easy vs hard). but comment on how the initial stepsize is chosen (and whether failing at larger K can be avoided by tuning it). precise, reproducible stepsizes then need to be written into the benchmark task description
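For reference, the linear scaling rule under discussion sets the learning rate proportional to the number of workers K, usually combined with a gradual warmup to avoid early divergence at large K. A minimal sketch follows; the base learning rate and warmup length are illustrative assumptions, not the benchmark's actual values.

```python
BASE_LR = 0.1        # learning rate tuned for K = 1 worker (assumed value)
WARMUP_EPOCHS = 5    # length of the linear warmup phase (assumed value)

def learning_rate(epoch, k):
    """Learning rate for k workers at a given (possibly fractional) epoch."""
    target_lr = BASE_LR * k  # linear scaling rule: lr grows with batch size
    if epoch < WARMUP_EPOCHS:
        # ramp linearly from BASE_LR up to the scaled target learning rate
        return BASE_LR + (target_lr - BASE_LR) * epoch / WARMUP_EPOCHS
    return target_lr
```

With K = 8 this starts at 0.1 and reaches 0.8 after the warmup; with K = 1 it reduces to the constant baseline stepsize.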

@negar-foroutan (Contributor) left a comment

Thanks Lie. I think we can just keep the plots for accuracy levels 91% and 70% and explain the results for the levels in between. We should also explain how we chose the initial learning rate.

@martinjaggi (Member)

ok. also, instead of all raw data please put the data just for the main plot (as a csv say). @Panaetius what do you think?

@Panaetius (Member)

@martinjaggi Makes sense. Maybe we should make a repository for just raw data. Not so much to present to the public (i.e. maybe just a small link somewhere in the docs) but for re-use so we don't need to run each experiment again every time we need the data.

@martinjaggi (Member) commented Mar 12, 2019 via email

@martinjaggi (Member)

hasn't been merged yet. might need to make a small script to generate the official results (time to acc etc) from the raw results. needs to be automatic as it's part of the official benchmark
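A script along these lines could generate the official time-to-accuracy results from the raw runs. This is only a sketch: it assumes each raw run is a JSON list of `{"time": ..., "top1": ...}` records and writes one CSV row per run — the file names, JSON layout, and accuracy levels are assumptions, not the actual mlbench format.

```python
import csv
import json

ACCURACY_LEVELS = [0.70, 0.91]  # "easy vs hard", as suggested in the review

def first_time_at(records, target):
    """First elapsed time at which the run reaches the target Top-1
    accuracy, or None if the target is never reached."""
    for r in records:
        if r["top1"] >= target:
            return r["time"]
    return None

def write_time_to_accuracy(raw_paths, out_path):
    """Read raw per-run metric files and emit a time-to-accuracy CSV."""
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["run"] + [f"tta@{a:.0%}" for a in ACCURACY_LEVELS])
        for path in raw_paths:
            with open(path) as f:
                records = json.load(f)
            writer.writerow([path] + [first_time_at(records, a)
                                      for a in ACCURACY_LEVELS])
```

Running it as part of CI would keep the published numbers automatic and reproducible, as requested above.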
