Infrastructure for defining and running large data workflows against multiple backends.
This framework allows users to define data analysis workflows in familiar frontend languages and then execute them on multiple data storage and processing backends (including privacy-preserving backend services that support secure multi-party computation).
Conclave requires a Python 3.5 environment and was tested on Ubuntu (14.04+). See
requirements.txt for other dependencies.
Consider using pyenv (https://github.com/pyenv/pyenv) to manage the Python version,
rather than changing python3 in several places.
Install the dependencies with pip install -r requirements.txt.
The library comes with a number of tests::
Note that the benchmarks under
benchmarks/ assume that party 1 is reachable at
ca-spark-node-0, party 2 at
cb-spark-node-0, and party 3 at
cc-spark-node-0. You can modify your
/etc/hosts file to map IP addresses to hostnames. To map the hosts above to 127.0.0.1 (for a local run), include the following entry in that file:
127.0.0.1 ca-spark-node-0 cb-spark-node-0 cc-spark-node-0
Most likely your /etc/hosts file already has an entry that maps 127.0.0.1 to localhost.
In that case, append the node hostnames to the end of that existing entry.
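Before running the benchmarks, it can help to confirm that all three party hostnames actually appear in the hosts file. The following is a minimal sketch of such a check. It is not part of Conclave: the helper names parse_hosts and missing_party_hosts are hypothetical, and the hostname list simply repeats the three names given above.

```python
# Hypothetical helpers (not part of Conclave) for checking a hosts file.
# The hostnames below are the ones the benchmarks are stated to assume.
PARTY_HOSTS = ["ca-spark-node-0", "cb-spark-node-0", "cc-spark-node-0"]


def parse_hosts(text):
    """Return a dict mapping each hostname in hosts-file text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            # Keep the first entry seen for each name.
            mapping.setdefault(name, ip)
    return mapping


def missing_party_hosts(text, party_hosts=PARTY_HOSTS):
    """Return the party hostnames that have no entry in the hosts-file text."""
    mapping = parse_hosts(text)
    return [host for host in party_hosts if host not in mapping]
```

To use it, read the contents of /etc/hosts into a string and pass that string to missing_party_hosts. An empty list means every party hostname has an entry.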
You can also modify the party addresses inside
CodeGenConfig by updating the
This is experimental software and does not guarantee security or correctness.