# ML Computation load distribution frameworks

- Apache Spark MLlib - Apache Spark's scalable machine learning library, available in Java, Scala, Python and R.
- Apache Beam - A unified programming model for batch and streaming data processing. https://beam.apache.org/
- BigDL - Deep learning framework on top of Spark/Hadoop that distributes data and computation across an HDFS cluster.
- Dask - Distributed parallel processing framework for Pandas and NumPy computations. (Video)
- DEAP - A novel evolutionary computation framework for rapid prototyping and testing of ideas. It seeks to make algorithms explicit and data structures transparent, and works in harmony with parallelisation mechanisms such as multiprocessing and SCOOP.
- Hadoop Open Platform-as-a-service (HOPS) - A multi-tenant open-source framework with a RESTful API for data science on Hadoop. It supports Spark and TensorFlow/Keras, is Python-first, and provides many additional features.
- Horovod - Uber's distributed training framework for TensorFlow, Keras, and PyTorch.
- NumPyWren - Scientific computing framework built on top of PyWren to enable NumPy-like distributed computations.
- PyWren - Answers the question of the "cloud button" for Python function execution: a framework that abstracts AWS Lambda so data scientists can execute any Python function. (Video)
- Ray - A flexible, high-performance distributed execution framework for machine learning. (Video)
- Vespa - An engine for low-latency computation over large data sets. https://vespa.ai