Multiverso is a parameter-server-based framework for training machine learning models on big data across many machines. It is currently a standard C++ library that provides a set of friendly programming interfaces, and it has been extended to support calls from Python and Lua programs. With these easy-to-use APIs, machine learning researchers and practitioners do not need to worry about routine system issues such as distributed model storage and operation, inter-process and inter-thread communication, and multi-threading management. Instead, they can focus on the core machine learning logic: data, model, and training.
For more details, please view our website http://www.dmtk.io.
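To make the parameter server idea concrete, here is a minimal, self-contained sketch in plain Python. It does not use Multiverso or its API: `ToyParameterServer`, `worker`, and `train` are hypothetical names invented for this illustration, and threads stand in for the separate machines a real deployment would use. It only shows the general pattern, in which workers pull the current parameters, compute an update on their own data shard, and push the delta back to shared storage.

```python
import threading


class ToyParameterServer:
    """Hypothetical stand-in for a parameter server (not Multiverso's API).

    Holds a shared parameter vector; workers pull copies and push deltas.
    """

    def __init__(self, size):
        self._params = [0.0] * size
        self._lock = threading.Lock()

    def get(self):
        # Pull: return a snapshot of the current parameters.
        with self._lock:
            return list(self._params)

    def add(self, delta):
        # Push: apply a worker's delta to the shared parameters.
        with self._lock:
            for i, d in enumerate(delta):
                self._params[i] += d


def worker(server, shard, lr=0.1, epochs=200):
    # Fit y = w * x with SGD, using only this worker's shard of the data.
    for _ in range(epochs):
        for x, y in shard:
            (w,) = server.get()               # pull the latest weight
            grad = 2.0 * (w * x - y) * x      # gradient of the squared error
            server.add([-lr * grad])          # push the update


def train(num_workers=2):
    # Synthetic data generated from y = 3 * x, split across the workers.
    data = [(x, 3.0 * x) for x in (0.5, 1.0, 1.5, 2.0)]
    shards = [data[i::num_workers] for i in range(num_workers)]
    server = ToyParameterServer(size=1)
    threads = [
        threading.Thread(target=worker, args=(server, shard))
        for shard in shards
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return server.get()[0]


if __name__ == "__main__":
    print(train())
```

Calling `train()` should return a weight close to 3.0, the slope used to generate the toy data. In a real framework the storage, locking, and communication shown here are what the library handles for you, so that user code is left with only the model update.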
Linux (Tested on Ubuntu 14.04)
    sudo apt-get install libopenmpi-dev openmpi-bin build-essential cmake git
    git clone https://github.com/Microsoft/multiverso.git --recursive && cd multiverso
    mkdir build && cd build
    cmake .. && make && sudo make install
Windows: open Multiverso.sln with Visual Studio 2013 and build the solution.
Current distributed systems based on Multiverso:
- lightLDA: Scalable, fast, lightweight system for large scale topic modeling
- distributed_word_embedding: Distributed system for word embedding
- distributed_word_embedding(deprecated): Distributed system for word embedding
- distributed_skipgram_mixture(deprecated): Distributed skipgram mixture for multi-sense word embedding