CoCoA - A Framework for Communication-Efficient Distributed Optimization

New! ProxCoCoA+ provides support for L1-regularized objectives. See paper and code.

We've added support for faster additive updates with CoCoA+. See more information here.

This code compares five distributed algorithms for training machine learning models using Apache Spark; a minimal sketch of the update pattern they share is given after the list. The implemented algorithms are:

  • CoCoA+
  • CoCoA
  • mini-batch stochastic dual coordinate ascent (mini-batch SDCA)
  • stochastic subgradient descent with local updates (local SGD)
  • mini-batch stochastic subgradient descent (mini-batch SGD)
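
To make the communication pattern concrete, below is a minimal sketch of one CoCoA-style outer round on a Spark RDD. This is an illustration only: the function and variable names, the data layout (one block of examples and one block of dual variables per partition), and the localSolver signature are all hypothetical, not the actual classes in src/main/scala, and the CoCoA+ subproblem scaling parameter is omitted. Each partition runs an approximate local dual solver (e.g. a few passes of SDCA) on its own coordinates, only the resulting d-dimensional change in the primal vector w is communicated, and the per-partition updates are combined by averaging (gamma = 1/K, CoCoA) or by adding (gamma = 1, CoCoA+).

import breeze.linalg.DenseVector
import org.apache.spark.rdd.RDD

// Illustrative sketch of one CoCoA/CoCoA+ outer round; all names are hypothetical.
// `data` and `alpha` are assumed to have the same partitioning, with exactly one
// element per partition: a block of (index, (features, label)) examples and the
// corresponding local dual variables. `localSolver` stands in for any approximate
// local dual method (e.g. a few epochs of SDCA) and returns (deltaAlpha_k, deltaW_k).
def outerRound(
    data: RDD[Array[(Int, (DenseVector[Double], Double))]],
    alpha: RDD[DenseVector[Double]],
    w: DenseVector[Double],
    gamma: Double, // 1.0 / numPartitions for CoCoA (averaging), 1.0 for CoCoA+ (adding)
    localSolver: (Array[(Int, (DenseVector[Double], Double))], DenseVector[Double], DenseVector[Double])
      => (DenseVector[Double], DenseVector[Double])
  ): (RDD[DenseVector[Double]], DenseVector[Double]) = {

  // Solve the local subproblems: w is shipped to the workers inside the closure,
  // and only the small (deltaAlpha_k, deltaW_k) pairs travel back.
  val updates = data.zip(alpha)
    .map { case (block, localAlpha) => localSolver(block, localAlpha, w) }
    .cache()

  // Apply the scaled dual updates locally; combine the primal updates globally.
  val newAlpha = alpha.zip(updates).map { case (a, (dAlpha, _)) => a + dAlpha * gamma }
  val deltaW   = updates.map(_._2).reduce(_ + _)

  (newAlpha, w + deltaW * gamma)
}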

The present code trains a standard SVM (hinge loss, l2-regularized) using SDCA as the local solver, and reports training and test error as well as the duality gap certificate for the primal-dual methods. The code can easily be adapted to use other internal solvers or to solve other objectives.
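
For reference, the objective being solved and the reported certificate can be written as follows, in standard SDCA notation for n examples (x_i, y_i) with y_i in {-1, +1} (this is the textbook formulation, not notation copied from the code):

P(w) = \frac{\lambda}{2}\lVert w\rVert^2 + \frac{1}{n}\sum_{i=1}^{n}\max\{0,\; 1 - y_i\, w^\top x_i\}

D(\alpha) = -\frac{\lambda}{2}\lVert w(\alpha)\rVert^2 + \frac{1}{n}\sum_{i=1}^{n}\alpha_i,
\qquad w(\alpha) = \frac{1}{\lambda n}\sum_{i=1}^{n}\alpha_i y_i x_i,\quad \alpha_i\in[0,1]

\mathrm{gap}(\alpha) = P\big(w(\alpha)\big) - D(\alpha) \;\ge\; 0

Since the gap is always nonnegative and vanishes only at the optimum, it serves as the accuracy certificate mentioned above for the primal-dual methods.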

Getting Started

How to run the code locally:

sbt/sbt assembly
./run-demo-local.sh

(For the sbt script to run, make sure you have downloaded CoCoA into a directory whose path contains no spaces.)

References

The CoCoA+ and CoCoA algorithmic frameworks are described in more detail in the following papers:

Ma, C., Smith, V., Jaggi, M., Jordan, M. I., Richtárik, P., & Takáč, M. Adding vs. Averaging in Distributed Primal-Dual Optimization. ICML 2015 - International Conference on Machine Learning.

Jaggi, M., Smith, V., Takáč, M., Terhorst, J., Krishnan, S., Hofmann, T., & Jordan, M. I. Communication-Efficient Distributed Dual Coordinate Ascent. NIPS 2014 - Advances in Neural Information Processing Systems 27 (pp. 3068–3076).

Smith, V., Forte, S., Jordan, M. I., & Jaggi, M. L1-Regularized Distributed Optimization: A Communication-Efficient Primal-Dual Framework.