# Secure Statistics
This notebook describes the Secure Statistics workload, which is part of a benchmark for MPC frameworks.

The workload is based on a literature review identifying several use cases where secure statistics were used.

The statistics calculated here are Sum, Mean, Min, Max and 

In [11]:
import numpy as np

## Data Generation
- N input parties load data

- Data is horizontally split, so each party's rows can simply be appended at the end of the file

- Data is generated as random integers from 0 to 9999 for each column, with 3 columns and 1000 rows per party

- Data could also represent 3000 participants who each provide one input per column

In [12]:
party1=np.random.randint(0,10000,(1000,3))
party2=np.random.randint(0,10000,(1000,3))
party3=np.random.randint(0,10000,(1000,3))

#append the parties together
all_parties=np.concatenate((party1,party2,party3),axis=0)
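
Because the generator above is unseeded, the outputs printed in the following cells change on every run. The sketch below shows one way to make the data reproducible, assuming a fixed seed is acceptable for the benchmark. `SEED`, `generate_party_data`, and `all_parties_seeded` are hypothetical names that are not part of the original workload.

```python
import numpy as np

SEED = 42             # hypothetical value; any fixed integer works
N_PARTIES = 3
ROWS_PER_PARTY = 1000
N_COLUMNS = 3

def generate_party_data(seed=SEED):
    """Return one (ROWS_PER_PARTY, N_COLUMNS) integer array per party."""
    rng = np.random.default_rng(seed)
    # integers() excludes the upper bound, like np.random.randint
    return [rng.integers(0, 10000, size=(ROWS_PER_PARTY, N_COLUMNS))
            for _ in range(N_PARTIES)]

parties = generate_party_data()
# Horizontal split: stacking the parties' rows yields the full dataset
all_parties_seeded = np.concatenate(parties, axis=0)
print(all_parties_seeded.shape)
```

Calling `generate_party_data()` again with the same seed returns identical arrays, which makes results comparable across frameworks.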

### Secure Sum
- We calculate the sum of each column 
- Could be used for voting or aggregation of net worth or CO2 emissions
- Only the sum should be public
- Can be used to calculate overall costs for different companies and measure savings in total

In [13]:
securesum=np.sum(all_parties,axis=0)
print(securesum)

[14847377 15124730 15014620]
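
The cell above computes the sum in the clear and serves as the reference result. As an illustration of how the sum can stay the only public value, the sketch below simulates additive secret sharing in plain NumPy. It is not the protocol of any particular MPC framework: the modulus `MOD`, the helper `make_shares`, and the use of NumPy's non-cryptographic generator are assumptions made for illustration only.

```python
import numpy as np

MOD = 2**32  # assumed share modulus; must exceed the largest possible column sum

def make_shares(secret, n_parties, rng):
    """Split an integer vector into n_parties additive shares modulo MOD."""
    # n_parties - 1 shares are uniformly random (not cryptographically secure here)
    shares = rng.integers(0, MOD, size=(n_parties - 1,) + secret.shape, dtype=np.int64)
    # the last share is chosen so that all shares add up to the secret modulo MOD
    last = (secret - shares.sum(axis=0)) % MOD
    return np.concatenate([shares, last[None, :]], axis=0)

rng = np.random.default_rng(0)
parties = [rng.integers(0, 10000, size=(1000, 3)) for _ in range(3)]

# Each party sums its own rows locally, then secret-shares the result.
local_sums = [p.sum(axis=0) for p in parties]
shares = np.stack([make_shares(s, 3, rng) for s in local_sums])  # [sender, receiver, column]

# Each receiver adds the shares it holds; only these partial sums are published.
partials = shares.sum(axis=0) % MOD
secure_sum = partials.sum(axis=0) % MOD
print(secure_sum)
```

A single share is uniformly distributed and reveals nothing about a party's local sum on its own. Only the combination of all partial sums reconstructs the total.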


### Secure Mean

- We calculate the mean of each column
- Calculate average life expectancy in a medical use case or average income in a wage gap scenario
- Only the mean should be public

In [14]:
securemean=np.round(np.mean(all_parties,axis=0),0)
print(securemean)

[4949. 5042. 5005.]
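
If the total number of rows is public, the mean follows from a secure sum with one division on revealed values, so no secure division is needed. The sketch below illustrates this under that assumption. The plain sums stand in for the output of a secure-sum protocol, and `total`, `count`, and `mean_from_sum` are hypothetical names.

```python
import numpy as np

rng = np.random.default_rng(0)
parties = [rng.integers(0, 10000, size=(1000, 3)) for _ in range(3)]

# Stand-ins for the values a secure-sum protocol would reveal.
total = sum(p.sum(axis=0) for p in parties)   # per-column sums
count = sum(p.shape[0] for p in parties)      # total number of rows, assumed public

# Same rounding as in the reference cell above.
mean_from_sum = np.round(total / count, 0)
reference = np.round(np.mean(np.concatenate(parties), axis=0), 0)
print(np.allclose(mean_from_sum, reference))
```

Note that with a public row count, publishing the mean reveals exactly as much as publishing the sum.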


### Secure Max
- We calculate the max of each column
- Identify the highest bid in an auction
- Identify outliers in a dataset for preprocessing
- Only the max should be public and should not be traceable to a specific party (although in a 3-party setting, a party that does not hold the max knows it belongs to one of the other two)

In [15]:
securemax = np.max(all_parties, axis=0)
print(securemax)

[9995 9999 9997]
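
Secure comparisons are typically among the more expensive MPC operations, so it can help to reduce the data locally first: the maximum over all rows equals the maximum over each party's local maximum. The sketch below illustrates this decomposition in plain NumPy. Whether the benchmark applies it is an assumption, and `local_max` and `global_max` are hypothetical names.

```python
import numpy as np

rng = np.random.default_rng(0)
parties = [rng.integers(0, 10000, size=(1000, 3)) for _ in range(3)]

# Step 1 (local, in the clear): each party reduces its own rows.
local_max = np.stack([p.max(axis=0) for p in parties])  # shape (n_parties, n_columns)

# Step 2 (would run inside the MPC protocol): compare one candidate per party.
# Plain NumPy stands in for the secure comparison here.
global_max = local_max.max(axis=0)

print(np.array_equal(global_max, np.concatenate(parties).max(axis=0)))
```

Only one candidate per party and column enters the secure step, and the local maxima themselves must stay secret-shared so that the result is not traceable to a party. The same reduction works for the minimum with `min` in place of `max`.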


### Secure Min
- We calculate the min of each column
- Identify the lowest bid in an auction
- Identify outliers in a dataset for preprocessing
- Only the min should be public and should not be traceable to a specific party (although in a 3-party setting, a party that does not hold the min knows it belongs to one of the other two)

In [16]:
securemin = np.min(all_parties, axis=0)
print(securemin)

[ 2 10  0]
