Skip to content

ischlag/distributed-tensorflow-example

master
Switch branches/tags

Name already in use

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?
Code

Latest commit

 

Git stats

Files

Permalink
Failed to load latest commit information.
Type
Name
Latest commit message
Commit time
 
 
 
 

Distributed Tensorflow 1.2 Example (DEPRECATED)

Using data parallelism with shared model parameters while updating parameters asynchronous. See comment for some changes to make the parameter updates synchronous (not sure if the synchronous part is implemented correctly though).

Trains a simple sigmoid Neural Network on MNIST for 20 epochs on three machines using one parameter server. The goal was not to achieve high accuracy but to get to know tensorflow.

Run it like this:

First, change the hardcoded host names with your own and run the following commands on the respective machines.

pc-01$ python example.py --job_name="ps" --task_index=0 
pc-02$ python example.py --job_name="worker" --task_index=0 
pc-03$ python example.py --job_name="worker" --task_index=1 
pc-04$ python example.py --job_name="worker" --task_index=2 

Thanks to snowsquizy for updating the script to TensorFlow 1.2.

About

An example of data parallelism and async updates of parameter in tensorflow.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages