# Query Limiting

Aims: This notebook demonstrates query limiting on the SCN side. It covers:

- A new addition to the SCN-side helper library: *privacy_sentinel*
    - privacy_sentinel contains the logic that safe objects use to vet their own execution.
    - privacy_sentinel vets execution at the computation layer, but the same sentinel logic could be used by the layers below to vet incoming jobs.
- One statistical function that has been modified to accommodate the query limiting logic

We need query limiting over entire federations.

## 1. The Privacy Sentinel MVP

- Contains generic logic to block queries on samples smaller than N.
- Until an appropriate value for N is determined, N is a constant for all query types.

### 1.1 Get Data

In [1]:
import os

from sail_safe_functions_test.helper_sail_safe_functions.data_frame_federated_local import DataFrameFederatedLocal
from sail_safe_functions_test.helper_sail_safe_functions.series_federated_local import SeriesFederatedLocal

DATA_PATH = "../../sail-safe-functions-test/sail_safe_functions_test/data_sail_safe_functions"

list_name_file_csv = ["bmc1.csv", "bwh1.csv", "mgh1.csv"]
id_column_0 = "PD-L1 level before treatment"


dataframe = DataFrameFederatedLocal()
for name_file_csv in list_name_file_csv:
    path_file_csv = os.path.join(DATA_PATH, "data_csv_investor_demo", name_file_csv)
    dataframe.add_csv(path_file_csv)

one_sample_big = dataframe[id_column_0]

In [2]:
one_sample_big

<sail_safe_functions_test.helper_sail_safe_functions.series_federated_local.SeriesFederatedLocal at 0x7fbc5c372430>

In [3]:
class PrivacySentinel:
    """
    A helper library to be used on the SCN side to vet query privacy.
    """

    @staticmethod
    def query_limit_local_n_precompute(list_list_precompute, n: int = 10) -> bool:
        # Check that the combined federated sample is large enough.
        # The last element of each precompute is taken as that site's sample count.
        length = 0
        for list_precompute in list_list_precompute:
            length += list_precompute[-1]
        return PrivacySentinel.query_limit_local_n(length, n)

    @staticmethod
    def query_limit_local_n(num_samples: int, n: int = 50) -> bool:
        """
        Blocks a query if the number of samples is under a threshold n.

        :param: num_samples: The number of samples
        :type: Integer
        :param: n: The threshold limit on number of samples (default=50)
        :type: Integer
        :return: Whether the number of samples is above the threshold
        :type: Boolean
        """
        if num_samples < n:
            raise NameError("Too few samples")
        return True

# 10 samples is below the default threshold of 50, so this call raises.
print(PrivacySentinel.query_limit_local_n(10))
# PrivacySentinel.query_limit_local_n(9)

NameError: Too few samples
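
The call above vets a single count. As a minimal sketch of how the same threshold applies to a whole federation, the example below sums per-site counts before checking them. It is self-contained and does not use the notebook's classes: `check_federated_sample_size` and `TooFewSamplesError` are hypothetical names introduced here, and the `[sum, count]` layout of each precompute is an assumption based on how the sentinel reads the last element.

```python
from typing import Sequence


class TooFewSamplesError(Exception):
    """Hypothetical exception for blocked queries (not part of the notebook's code)."""


def check_federated_sample_size(
    list_list_precompute: Sequence[Sequence[float]], n: int = 10
) -> bool:
    # Assumption: the last element of each per-site precompute is that site's sample count.
    total = sum(precompute[-1] for precompute in list_list_precompute)
    if total < n:
        raise TooFewSamplesError(f"Too few samples: {total} < {n}")
    return True


# Three sites with 4, 3 and 5 samples: 12 in total, which passes for n=10.
precomputes_ok = [[40.0, 4], [27.0, 3], [55.0, 5]]
result_ok = check_federated_sample_size(precomputes_ok)

# Two sites with 4 and 3 samples: 7 in total, which is blocked for n=10.
precomputes_small = [[40.0, 4], [27.0, 3]]
try:
    check_federated_sample_size(precomputes_small)
    blocked = False
except TooFewSamplesError:
    blocked = True
```

A dedicated exception class is one possible design choice here: `NameError` normally signals an unresolved name in Python, so a caller that catches it could confuse a blocked query with a genuine bug.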

In [4]:
from typing import List
from sail_safe_functions.statistics.mean_precompute import MeanPrecompute

class MeanAggregateSAFE:
    """
    Aggregates data for computing the mean
    """
    @staticmethod
    def run(list_list_precompute: List[List[float]]) -> float:
        # Vet the combined sample size before any statistic is computed.
        PrivacySentinel.query_limit_local_n_precompute(list_list_precompute)

        sum_x_0 = 0
        degrees_of_freedom_0 = 0

        for list_precompute in list_list_precompute:
            sum_x_0 += list_precompute[0]
            degrees_of_freedom_0 += list_precompute[1]

        sample_mean_0 = sum_x_0 / degrees_of_freedom_0

        # if degrees_of_freedom < 20:
        #     raise Exception()
        return sample_mean_0

In [5]:
list_list_precompute = []
for series in one_sample_big.dict_series.values():
    list_list_precompute.append(MeanPrecompute.run(series))
mean_statistic = MeanAggregateSAFE.run(list_list_precompute)
mean_statistic


85.03455555555556
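
To make the precompute/aggregate split concrete, here is a rough sketch on plain Python lists. The notebook does not show what `MeanPrecompute.run` returns, so the `[sum, count]` shape used by the hypothetical `mean_precompute` below is an assumption inferred from how `MeanAggregateSAFE.run` reads its input. The sketch illustrates that the federated mean matches the mean of the pooled data while only per-site sums and counts leave each site.

```python
from typing import List


def mean_precompute(values: List[float]) -> List[float]:
    # Hypothetical stand-in for MeanPrecompute.run: returns [sum, count] for one site.
    return [float(sum(values)), len(values)]


def mean_aggregate(list_list_precompute: List[List[float]], n: int = 10) -> float:
    # Vet the combined sample size before any statistic is released.
    total_count = sum(p[1] for p in list_list_precompute)
    if total_count < n:
        raise ValueError("Too few samples")
    total_sum = sum(p[0] for p in list_list_precompute)
    return total_sum / total_count


# Made-up values for three sites (12 samples in total).
site_values = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0], [8.0, 9.0, 10.0, 11.0, 12.0]]
precomputes = [mean_precompute(v) for v in site_values]
federated_mean = mean_aggregate(precomputes)

# Reference value: the mean computed over the pooled data.
pooled = [x for v in site_values for x in v]
pooled_mean = sum(pooled) / len(pooled)
```

With fewer than `n` samples in total, `mean_aggregate` raises before computing anything, which mirrors where the sentinel call sits in `MeanAggregateSAFE.run`.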

In [6]:
one_sample_big

<sail_safe_functions_test.helper_sail_safe_functions.series_federated_local.SeriesFederatedLocal at 0x7fbc5c372430>