# MC<sup>2</sup>: Multiparty Collaboration & Coopetition
MC<sup>2</sup> contains a series of subprojects in the RISE Lab, all pertaining to multiparty collaboration and coopetition. This tutorial covers Federated XGBoost, an extension of the existing XGBoost gradient boosting framework that enables its use in the federated setting. This is particularly important for use cases that require low-bandwidth training across multiple parties.

You can find the codebase here: https://github.com/mc2-project/mc2

## Dataset
### Allstate Claim Prediction Dataset
This dataset is used in the original XGBoost paper and is taken from a Kaggle competition.
The goal of the competition is to predict insurance claim payments given multiple datapoints about the insured vehicle.
Further information can be found [here](https://www.kaggle.com/c/ClaimPredictionChallenge).
We propose a use case in which an insurance company has multiple departments, each specializing in different makes of cars, and these departments are unable to share data with one another.
As such, a sample of the original Allstate Claim Prediction dataset is partitioned here into four groups, each of which represents one of these departments.
In the following exercises, you will represent one such department, and your task will be to use the information provided to predict whether a new insurance claim will be greater than 0 or equal to 0 (a binary classification problem).
You will then collaborate with the other departments, using Federated XGBoost to collectively train a model without any department revealing its data to the others.
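
As a rough illustration of the labeling rule described above, the sketch below turns claim payments into binary labels: 1 if the payment is greater than 0, and 0 otherwise. The function name `binarize_claims` and the sample values are hypothetical; they are not taken from the MC<sup>2</sup> codebase or from the dataset itself.

```python
def binarize_claims(claim_amounts):
    """Map each claim payment to 1 if it is greater than 0, else 0."""
    return [1 if amount > 0 else 0 for amount in claim_amounts]


# Hypothetical claim payments for one department (not real dataset values).
sample_claims = [0.0, 125.5, 0.0, 3400.0]
labels = binarize_claims(sample_claims)
print(labels)
```

The exercise notebooks may already provide labels in this form; the sketch only makes the prediction target concrete.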

## Federate

To simulate a federation, please get into groups of 3 or 4. Choose one member of the team to act as the aggregator. The aggregator will have a `party_id` of 1. Assign every other member of the federation a unique `party_id` from 2 to 4. Keep your `party_id` handy for the rest of this tutorial.
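
The assignment rule above can be sketched in a few lines of Python. This is only an illustration: the helper `assign_party_ids` and the member names are hypothetical and are not part of the tutorial notebooks.

```python
def assign_party_ids(members, aggregator):
    """Give the aggregator party_id 1 and every other member 2, 3, ... in order."""
    if not 3 <= len(members) <= 4:
        raise ValueError("a federation in this tutorial has 3 or 4 members")
    if aggregator not in members:
        raise ValueError("the aggregator must be a member of the federation")
    party_ids = {aggregator: 1}
    others = [member for member in members if member != aggregator]
    for party_id, member in enumerate(others, start=2):
        party_ids[member] = party_id
    return party_ids


# Hypothetical team of four with "carol" acting as the aggregator.
team = ["alice", "bob", "carol", "dave"]
print(assign_party_ids(team, aggregator="carol"))
```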

## Table of Contents
Now that you've formed your federation, it's time to get started. Depending on your role (aggregator or non-aggregator member), you will be working in different notebooks. This tutorial consists of a setup phase followed by three exercises:

  1. Setup [[Aggregator](./setup-aggregator.ipynb), [Member](./setup-member.ipynb)]
  2. Single Party XGBoost on Data Subset [[Aggregator](./exercise1-aggregator.ipynb), [Member](./exercise1-member.ipynb)]
  3. Multiparty XGBoost with Centralized Training [[Aggregator](./exercise2-aggregator.ipynb), [Member](./exercise2-member.ipynb)]
  4. [Multiparty XGBoost with Federated Training](Exercise%203.ipynb)

Let's start with Setup [[Aggregator](./setup-aggregator.ipynb), [Member](./setup-member.ipynb)].