# Query Limiting

Aims: This notebook demonstrates query limiting on the SCN side. It includes:

- A new addition to the SCN-side helper library: *privacy_sentinel*
    - privacy_sentinel contains the logic that safe objects use to vet their execution
    - privacy_sentinel vets execution at the computation layer, but the sentinel logic could also be used by lower layers to vet incoming jobs.
- This demo contains one statistical function which has been modified to accommodate query limiting logic

## 1. The Privacy Sentinel MVP

- Contains generic logic to block queries on samples containing fewer than N items.
- Until an appropriate value for N is determined, N is a constant value for all query types.

In [36]:
class PrivacySentinel:
    """
    A helper library to be used on the SCN side to vet query privacy.
    """

    @staticmethod
    def query_limit_n(num_samples: int, n: int = 10) -> bool:
        """
        Blocks a query if the number of samples is under a threshold n.

        :param num_samples: The number of samples in the query
        :type num_samples: int
        :param n: The minimum number of samples allowed (default=10)
        :type n: int
        :return: True if the number of samples is at least n
        :rtype: bool
        :raises NameError: If the number of samples is below n
        """
        if num_samples < n:
            raise NameError("Too few samples")
        return True

print(PrivacySentinel.query_limit_n(10))
PrivacySentinel.query_limit_n(9)

True


NameError: Too few samples
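Because the sentinel raises rather than returning `False`, a blocked query stops the calling cell. A minimal sketch of how a caller might handle this, assuming a hypothetical wrapper `is_query_allowed` (not part of privacy_sentinel); the sentinel is repeated so the block runs on its own:

```python
class PrivacySentinel:
    """Copy of the sentinel defined above, repeated so this block runs alone."""

    @staticmethod
    def query_limit_n(num_samples: int, n: int = 10) -> bool:
        if num_samples < n:
            raise NameError("Too few samples")
        return True


def is_query_allowed(num_samples: int, n: int = 10) -> bool:
    """Hypothetical helper: report a blocked query as False instead of raising."""
    try:
        return PrivacySentinel.query_limit_n(num_samples, n)
    except NameError:
        # The sentinel signals a blocked query by raising NameError.
        return False


print(is_query_allowed(10))  # True
print(is_query_allowed(9))   # False
```

This keeps the sentinel itself strict while letting the calling layer decide whether a blocked query should abort the job or simply be skipped.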

## 2. Applying the Privacy Sentinel to a Mean Function

The precompute step matters more here than the aggregation step because it is the step that interacts directly with the data, which is why I chose it for limiting mean queries. There are two architectural choices for how we implement sentinel checks:

- We add sentinel calls internally to safe functions
    - This is architecturally simple, but individual sentinel calls in functions must be implemented and maintained
- We add sentinel calls to metadata, to be run before a safe function is initialised
    - This is more complicated, as it requires some pre-execution step to be defined for safe functions.
    - Metadata will still need to be unique for each function
    - It's not clear whether this is more or less complex to maintain


I have included something that accommodates both so that we can decide.
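For the metadata option, one possible shape of the pre-execution step is a decorator that stores the threshold as metadata on the function and runs the check before the function body. This is only a sketch under assumptions: `sentinel_checked`, `sentinel_metadata` and `mean_precompute` are hypothetical names, and `query_limit_n` stands in for `PrivacySentinel.query_limit_n` so the block runs on its own.

```python
from functools import wraps
from typing import Callable, List, Sequence


def query_limit_n(num_samples: int, n: int = 10) -> bool:
    """Stand-in for PrivacySentinel.query_limit_n so the sketch runs alone."""
    if num_samples < n:
        raise NameError("Too few samples")
    return True


def sentinel_checked(min_samples: int = 10) -> Callable:
    """Hypothetical decorator: attach metadata and run the check before the call."""

    def decorator(func: Callable) -> Callable:
        @wraps(func)
        def wrapper(sample: Sequence[float]):
            # Pre-execution step: vet the query using the function's own metadata.
            query_limit_n(len(sample), wrapper.sentinel_metadata["min_samples"])
            return func(sample)

        # Per-function metadata; each safe function declares its own threshold.
        wrapper.sentinel_metadata = {"min_samples": min_samples}
        return wrapper

    return decorator


@sentinel_checked(min_samples=10)
def mean_precompute(sample: Sequence[float]) -> List[float]:
    """Simplified precompute: returns [sum, sample count]."""
    return [sum(sample), len(sample)]


print(mean_precompute(list(range(1, 11))))  # [55, 10]
```

The function body stays free of sentinel code, at the cost of a shared pre-execution mechanism that every safe function has to opt into.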

### 2.1 Running Inside the SAFE function 

In [37]:
from typing import List, Tuple

import numpy as np
import pandas as pd


class MeanPrecomputeSAFE:
    """
    Precomputes data for computing the mean
    """

    @staticmethod
    def run(
        sample_0_dataframe: pd.Series,
    ) -> List[float]:  # matches the current return value: [sum, sample count]

        #SENTINEL CODE
        PrivacySentinel.query_limit_n(len(sample_0_dataframe))
        #!SENTINEL CODE
        
        sample_0 = sample_0_dataframe.to_numpy()

        sum_x_0 = np.sum(sample_0)
        sample_0_degrees_of_freedom = len(sample_0)

        list_precompute = [sum_x_0, sample_0_degrees_of_freedom]
        # list_safe = [False, False, False, False, False, False ]
        return list_precompute  # , list_safe

sample = pd.Series(np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]))
print(sample.mean())
print(MeanPrecomputeSAFE.run(sample))

sample = pd.Series(np.array([1, 2, 3, 4, 5, 6, 7, 8, 9]))
print(MeanPrecomputeSAFE.run(sample))

5.5
[55, 10]


NameError: Too few samples

### 2.2 Running Outside the SAFE function 

In [38]:
class MeanPrecompute:
    """
    Precomputes data for computing the mean
    """

    @staticmethod
    def run(
        sample_0_dataframe: pd.Series,
    ) -> List[float]:  # matches the current return value: [sum, sample count]
        
        sample_0 = sample_0_dataframe.to_numpy()

        sum_x_0 = np.sum(sample_0)
        sample_0_degrees_of_freedom = len(sample_0)

        list_precompute = [sum_x_0, sample_0_degrees_of_freedom]
        # list_safe = [False, False, False, False, False, False ]
        return list_precompute  # , list_safe

sample = pd.Series(np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]))
print(sample.mean())

if PrivacySentinel.query_limit_n(len(sample)):
    print(MeanPrecompute.run(sample))

sample = pd.Series(np.array([1, 2, 3, 4, 5, 6, 7, 8, 9]))
if PrivacySentinel.query_limit_n(len(sample)):
    print(MeanPrecompute.run(sample))

5.5
[55, 10]


NameError: Too few samples