Rudra Distributed Learning Platform {#mainpage}

Rudra is a distributed framework for large-scale machine learning using deep neural networks. It takes training data and a model configuration as input and outputs the parameters of the trained model.

Detailed documentation can be found at the project wiki.

Installation

Dependencies:

  1. X10 (version 2.5.4 or higher)
  2. g++ (4.4.7 or higher) or xlC (12.1 or higher)
  3. (optional - for cuDNN learner) rudra-cudnnlearner package
  4. (optional - for Theano learner) rudra-dist package and Theano plus Theano prerequisites

The default version of Rudra uses a proprietary IBM learner implementation with cuDNN. There is also an example Theano learner, whose source code is available at rudra-dist, and a mock learner is included for unit testing. Other learners can be supported by implementing the learner API in include/NativeLearner.h. The make variable RUDRA_LEARNER chooses between learner implementations, e.g. basic, theano, mock; setting RUDRA_LEARNER=xxx requires the build to link against a learner implementation at lib/librudralearner-xxx.so.
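For example, to select the mock learner by setting the make variable directly (an illustrative invocation; the named targets shown below are the documented way to build), the build would then link against lib/librudralearner-mock.so:

$ make RUDRA_LEARNER=mock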

To build the default (cuDNN) version of Rudra, simply run:

$ source rudra.profile
$ make

To build Rudra with a mock learner (for testing purposes):

$ make rudra-mock

The make variable X10RTIMPL chooses the implementation of X10RT. You can use whichever X10RT implementations are supported on your platform, e.g. sockets, pami, mpi; the default is mpi. (Note: mpi does not currently work on the IBM-internal DCC system for POWER nodes.)

For example, to build the default version of Rudra with X10RT for PAMI, run:

$ make rudra-cudnn X10RTIMPL=pami

To build Rudra with the Theano learner and MPI, run:

$ make rudra-theano X10RTIMPL=mpi

Building Individual Components

To build librudra:

$ cd cpp && make

To build the Rudra X10 application:

$ cd x10 && make

Note:

  1. rudra.profile sets the environment variables needed for building and running Rudra. Amongst other things, it sets the $RUDRA_HOME environment variable. In some cases, you may need to modify rudra.profile to point to your local Python installation (see the sketch below).
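As a rough sketch of the kind of lines involved (illustrative only; the paths below are placeholders, and the actual rudra.profile in the repository is authoritative):

export RUDRA_HOME=/path/to/rudra              # set by rudra.profile
export PATH=/path/to/your/python/bin:$PATH    # placeholder: point this at your local Python installation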

Verifying the Build (Theano version)

One process is reserved for testing, and the remainder are used as learners. (To turn off testing and use all processes for learning, pass the command line argument -noTest.)

With MPI or PAMI, the number of places equals the number of processes. For sockets, set the number of places with the environment variable X10_NPLACES.
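For example, with an MPI build, launching four processes gives four places (an illustrative command; your MPI launcher and its flags may differ):

$ mpirun -np 4 ./rudra-theano -f examples/theano-mnist.cfg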

Try running with mlp.py:

$ make rudra-theano X10RTIMPL=sockets
$ export X10_NPLACES=2
$ ./rudra-theano -f examples/theano-mnist.cfg -ll 0 -lr 0 -lt 0 -lu 0 

Log level 0 (TRACING) prints the maximum amount of information. If you don't need that much detail, omit the -l* flags.
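For example, to repeat the run above with default logging and with every place used as a learner (dropping the -l* flags and adding -noTest as described earlier):

$ ./rudra-theano -f examples/theano-mnist.cfg -noTest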
