Random Forests in MPyC

This project implements machine learning on private data: a model can be trained and used while the underlying data stays secret. It uses the MPyC library to run a secure multi-party computation (MPC) that trains a forest of decision trees with an algorithm similar to the C4.5 machine learning algorithm.
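For readers unfamiliar with C4.5, the following is a minimal cleartext sketch of its split criterion, the information gain ratio. It is not this project's secure implementation and does not use MPyC; the function names (`entropy`, `gain_ratio`, `best_attribute`) and the toy data are hypothetical and serve only to illustrate the kind of decision a tree learner makes at each node.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def gain_ratio(rows, labels, attribute):
    """Information gain ratio of splitting on column index `attribute`."""
    total = len(rows)
    # Group the class labels by the value each row has for this attribute.
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    # Weighted entropy of the labels that remains after the split.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy([row[attribute] for row in rows])
    return gain / split_info if split_info > 0 else 0.0


def best_attribute(rows, labels):
    """Column index with the highest gain ratio (hypothetical helper)."""
    return max(range(len(rows[0])), key=lambda a: gain_ratio(rows, labels, a))


if __name__ == "__main__":
    # Toy data: column 0 determines the label, column 1 is noise.
    rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
    labels = [0, 0, 1, 1]
    print(best_attribute(rows, labels))
```

In an MPC setting the same kind of comparison has to be carried out on secret-shared values, which is one reason the secure version is considerably slower than a cleartext learner.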

Installation

Install Python 3.7, then invoke:

pip install -r requirements.txt

Usage

The spect.py and balance.py files contain examples of how to specify a dataset and to train a random forest on this data. These examples can be run as follows:

python spect.py
python balance.py

Please keep in mind that these computations are much slower than their non-MPC counterparts.

Tests

Run the tests by invoking:

pytest

Run tests in watch mode:

ptw [-c]

(The -c flag causes the screen to be cleared before each run.)

Profiling

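To profile the spect.py example, install snakeviz, record a profile with cProfile, and open the result in the viewer:

pip install snakeviz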
python -m cProfile -o spect.stats spect.py
snakeviz spect.stats

Thanks

This algorithm was developed as part of the SODA project. Many thanks to Mark Abspoel, Daniel Escudero and Nikolaj Volgushev for designing the decision tree algorithm for MPC (see chapter 6 of this SODA document). Many thanks to Berry Schoenmakers, who developed MPyC and helped us throughout the implementation of this algorithm.
