
Tensorflow Prism

TFPrism is a library that transforms your TensorFlow graph to perform data parallelism for training automatically. To run your single-CPU TensorFlow code on a cluster, all you need to do is pass your training op and feed_dict through the library.
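Conceptually, data parallelism means each worker computes gradients on its own shard of the batch, and the averaged gradient is applied as a single update. A minimal pure-Python sketch of that idea (illustrative only, not the tfprism API):

```python
def shard(batch, num_workers):
    """Split a batch into num_workers roughly equal shards."""
    return [batch[i::num_workers] for i in range(num_workers)]

def parallel_step(weights, batch, grad_fn, lr=0.9, num_workers=2):
    """One synchronous data-parallel step: per-shard gradients,
    averaged, then applied as a single SGD update."""
    grads = [grad_fn(weights, s) for s in shard(batch, num_workers)]
    return weights - lr * sum(grads) / len(grads)
```

tfprism automates the graph-rewriting equivalent of this: replicating the forward/backward pass per worker and routing the shards for you.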

Example code

import tensorflow as tf
import tfprism

# loss, init_op and batches are defined as in ordinary single-machine code.
train_step = tf.train.GradientDescentOptimizer(0.9).minimize(loss)

with tf.Session('grpc://mycluster.example.com:5600') as sess:
    # Replicate the training graph across all worker tasks; returns the
    # distributed train op plus a node copier for rewriting feed dicts.
    train_step, node_copier = tfprism.distribute_graph_on_all_tasks(train_step, sess)
    sess.run(init_op)

    for batch in batches:
        sess.run(
            train_step,
            feed_dict=node_copier.mangle_feed_dict(batch))
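The README doesn't show what mangle_feed_dict does internally; a plausible illustration (hypothetical names and key scheme, not tfprism's actual implementation) is that it shards each fed value across the graph replicas and rekeys the entries per replica:

```python
def mangle_feed_dict(feed_dict, num_replicas):
    """Hypothetical sketch: split each fed value evenly across replicas
    and rekey it per replica so each graph copy receives its own shard."""
    mangled = {}
    for name, value in feed_dict.items():
        per_replica = len(value) // num_replicas
        for i in range(num_replicas):
            mangled["replica_%d/%s" % (i, name)] = \
                value[i * per_replica:(i + 1) * per_replica]
    return mangled
```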

Installation

pip install .

Training server / cluster management

The example code above assumes a TensorFlow cluster is already running a set of worker tasks and parameter server tasks, named "/job:worker" and "/job:ps" respectively. Setting this up can be a bit tiresome; if all you want is to quickly get a cluster up and running and parallelize your code, you can use the cluster management tool shipped with tfprism.
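If you do set the cluster up by hand, the expected job layout looks like the following cluster spec (addresses are placeholders; with stock TensorFlow 1.x you would pass this dict to tf.train.ClusterSpec and start a tf.train.Server with the matching job_name and task_index on each node):

```python
# Placeholder addresses; the job names match what tfprism expects.
cluster_spec = {
    "worker": ["server1:5600", "server2:5600"],  # "/job:worker" tasks
    "ps": ["server3:5600"],                      # "/job:ps" tasks
}
```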

To install the cluster management tools, run

apt install parallel
pip install .[server]

on each node in your cluster. Once you have done so, you can run

tfprism cluster start server1,server2,...serverN

to start your cluster. You must be able to SSH without a password (using public-key authentication) to every server listed. After that, you can connect TensorFlow to the cluster at grpc://server1:5600.
