[Paracel logo] · [Build status: https://travis-ci.org/douban/paracel.png] · [Join the chat at https://gitter.im/douban/paracel] · [Coverity Scan build badge]

# Paracel Overview

Paracel is a distributed computational framework designed for many machine learning problems: Logistic Regression, SVD, Matrix Factorization (BFGS, SGD, ALS, CG), LDA, Lasso, and more.

First, Paracel partitions both the massive dataset and the massive parameter space. Unlike MapReduce-like systems, Paracel offers a simple communication model that lets you work with a global, distributed key-value store, called the parameter server.

With Paracel, you build algorithms following one rule: pull parameters before learning and push local updates after learning. Compared to MPI, it is a much simpler model, and moving from serial to parallel code is almost painless.
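
To make the rule concrete, here is a minimal single-process sketch of the pull/push loop. The `ParamServer` class and its `pull`/`push` methods are hypothetical stand-ins for Paracel's distributed key-value store, not its actual API; see the tutorial linked at the end of this page for the real interface.

```cpp
// Minimal sketch of "pull parameters before learning, push updates after learning".
// ParamServer is a hypothetical in-memory stand-in for the parameter server.
#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>

class ParamServer {
 public:
  std::vector<double> pull(const std::string& key) { return store_[key]; }
  void push(const std::string& key, const std::vector<double>& delta) {
    auto& w = store_[key];
    if (w.size() < delta.size()) w.resize(delta.size(), 0.0);
    for (size_t i = 0; i < delta.size(); ++i) w[i] += delta[i];  // server-side merge
  }
 private:
  std::unordered_map<std::string, std::vector<double>> store_;
};

int main() {
  ParamServer ps;
  ps.push("w", {0.0, 0.0});                        // initialize global parameters

  // One worker step: pull parameters, compute a local update, push it back.
  std::vector<double> w = ps.pull("w");            // pull before learning
  std::vector<double> delta = {-0.1 * w[0] + 0.5,  // pretend gradient computed
                               -0.1 * w[1] + 0.3}; // from local data
  ps.push("w", delta);                             // push after learning

  for (double x : ps.pull("w")) std::cout << x << " ";
  std::cout << "\n";
  return 0;
}
```

In a real Paracel job, each worker runs this loop concurrently against the shared parameter server, which merges the incoming updates.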

Second, Paracel tackles the 'last reducer' (straggler) problem of iterative tasks. It uses bounded staleness to find a sweet spot between the 'improvement per iteration' curve and the 'iterations per second' curve, while a global scheduler coordinates the asynchronous workers. Researchers at CMU have shown that this method is a generalization of the BSP/Pregel model.
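
As a rough illustration (a conceptual sketch, not Paracel's scheduler code), bounded staleness means a worker may start its next iteration only if it is at most s iterations ahead of the slowest worker; s = 0 degenerates to lock-step BSP, while larger s trades per-iteration improvement for iteration throughput.

```cpp
#include <algorithm>
#include <iostream>
#include <vector>

// True if the worker at iteration `my_iter` may start its next iteration
// under staleness bound `s`, given every worker's current iteration count.
bool may_proceed(int my_iter, const std::vector<int>& all_iters, int s) {
  int slowest = *std::min_element(all_iters.begin(), all_iters.end());
  return my_iter - slowest <= s;  // s == 0 degenerates to lock-step BSP
}

int main() {
  std::vector<int> iters = {7, 5, 6};              // current iteration of each worker
  std::cout << may_proceed(7, iters, 1) << "\n";   // 0: worker is 2 ahead, must wait
  std::cout << may_proceed(6, iters, 1) << "\n";   // 1: only 1 ahead, may continue
  return 0;
}
```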

Another advantage of Paracel is fault tolerance, which MPI does not provide.

Paracel can also be used for scientific computing and for building graph algorithms. You can load input from a distributed file system and construct a graph or a sparse/dense matrix from it.
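
As a toy illustration of that workflow (plain C++ with hypothetical helpers rather than Paracel's loader API, which additionally handles distributed-file-system access and partitioning), the snippet below parses edge-list lines into a sparse adjacency structure:

```cpp
#include <iostream>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Maps each source node to its outgoing (destination, weight) edges.
using SparseGraph =
    std::map<std::string, std::vector<std::pair<std::string, double>>>;

// Parse "src dst weight" lines into a sparse adjacency structure.
SparseGraph build_graph(const std::vector<std::string>& lines) {
  SparseGraph g;
  for (const auto& line : lines) {
    std::istringstream iss(line);
    std::string src, dst;
    double w;
    if (iss >> src >> dst >> w) g[src].emplace_back(dst, w);
  }
  return g;
}

int main() {
  SparseGraph g = build_graph({"a b 1.0", "a c 0.5", "b c 2.0"});
  std::cout << "out-degree of a: " << g["a"].size() << "\n";  // prints 2
  return 0;
}
```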

Paracel was originally motivated by Jeff Dean's 2013 talk at Stanford. You can find more details in his paper "Large Scale Distributed Deep Networks".

More documentation can be found below:

- Project Homepage
- Quick Install
- 20-Minute Tutorial
- API Reference Page