
# Distributed TensorFlow on Spark

First presented at the 2016 Spark Summit East: [Slide deck](http://www.slideshare.net/arimoinc/distributed-tensorflow-scaling-googles-deep-learning-library-on-spark-58527889), [Presentation video](https://www.youtube.com/watch?v=-QtcP3yRqyM), [Blog post](https://arimo.com/machine-learning/deep-learning/2016/arimo-distributed-tensorflow-on-spark/)

## TensorSpark productionalized in yarn-cluster mode

This latest version contains modifications and improvements that are mostly relevant to anyone taking TensorSpark to production in yarn-cluster mode (tested on a Hortonworks distribution, HDP 2.4, with CPU machines). For other deployment modes and machine types, the earlier version as of [Commit #62](https://github.com/adatao/tensorspark/tree/2eae6732709884f08e800efa24653340f2f7997b) might still be a better option.

### Summary of changes since [Commit #62](https://github.com/adatao/tensorspark/tree/2eae6732709884f08e800efa24653340f2f7997b)

There are a few minor improvements (see the commits for details) and the following two major changes:

- `tensorspark.py`: reads the test set from HDFS, which avoids the need to put the test set on local disk. The training and test sets are stored at the same location on HDFS.
- `parameterwebsocketclient.py`: finds the machine that hosts the Spark driver in yarn-cluster mode (either way, some configuration is still required here).
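
The first change can be pictured with a short sketch. It is not the code in `tensorspark.py`; it only illustrates reading a CSV test set straight from HDFS with PySpark's `SparkContext.textFile`. The names `parse_csv_line` and `load_testset`, the example path, and the row layout (label first, features after) are assumptions made for illustration.

```python
def parse_csv_line(line):
    """Split one CSV row into (label, features).

    Assumption: the first column is the label and the remaining columns
    are numeric features. Adjust this to the real file layout.
    """
    fields = line.strip().split(",")
    label = int(float(fields[0]))
    features = [float(value) for value in fields[1:]]
    return label, features


def load_testset(sc, hdfs_path):
    """Return an RDD of (label, features) read directly from HDFS.

    `sc` is an existing pyspark SparkContext. `hdfs_path` is a hypothetical
    location such as "hdfs:///path/to/test.csv"; nothing is copied to the
    local disk of the driver or the executors.
    """
    return sc.textFile(hdfs_path).map(parse_csv_line)
```

Because the parser is a plain function, it can be checked locally without a cluster before it is mapped over the RDD.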

### To run

    zip pyfiles.zip ./parameterwebsocketclient.py ./parameterservermodel.py ./mnistcnn.py ./mnistdnn.py ./moleculardnn.py ./higgsdnn.py
    spark-submit \
      --master yarn \
      --deploy-mode cluster \
      --queue default \
      --num-executors 3 \
      --driver-memory 20g \
      --executor-memory 60g \
      --executor-cores 8 \
      --py-files ./pyfiles.zip \
      ./tensorspark.py
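
For scripted deployments, the two steps above can also be driven from Python. The following is a minimal sketch using only the standard library; the helper names `build_pyfiles_zip` and `spark_submit_command` are hypothetical and not part of this repository, and the option values simply mirror the command above.

```python
import zipfile

# Helper modules shipped to the executors, as listed in the zip command above.
PY_FILES = (
    "parameterwebsocketclient.py",
    "parameterservermodel.py",
    "mnistcnn.py",
    "mnistdnn.py",
    "moleculardnn.py",
    "higgsdnn.py",
)


def build_pyfiles_zip(zip_path="pyfiles.zip", files=PY_FILES):
    """Bundle the helper modules into one archive (hypothetical helper).

    Must be run from the directory that contains the listed files.
    """
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in files:
            archive.write(name)
    return zip_path


def spark_submit_command(zip_path="./pyfiles.zip", entry_point="./tensorspark.py"):
    """Return the spark-submit invocation as an argument list."""
    return [
        "spark-submit",
        "--master", "yarn",
        "--deploy-mode", "cluster",
        "--queue", "default",
        "--num-executors", "3",
        "--driver-memory", "20g",
        "--executor-memory", "60g",
        "--executor-cores", "8",
        "--py-files", zip_path,
        entry_point,
    ]
```

On a machine with `spark-submit` on the `PATH`, calling `build_pyfiles_zip()` and then passing `spark_submit_command()` to `subprocess.check_call` would reproduce the manual steps. Keeping the command as a list avoids shell quoting issues when the resource settings are adjusted for a different cluster.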

Partial project layout:

- `tensorspark/gpu_install.sh` - script to build TensorFlow from source with GPU support on AWS
- `tensorspark/simple_websocket_*.py` - simple Tornado websocket example
- `tensorspark/parameterservermodel.py` - "abstract" model class that implements all the methods TensorSpark requires
- `tensorspark/*dnn.py` - fully connected models for specific datasets
- `tensorspark/mnistcnn.py` - convolutional model for MNIST
- `tensorspark/parameterwebsocketclient.py` - Spark worker code
- `tensorspark/tensorspark.py` - entry point and Spark driver code
