## Sensitivity

Sensitivity is the maximum amount that a query's result changes when one individual is removed from the database.
#### Let's look more into this!

Let's take our function that generates parallel databases from the previous notebook.

In [1]:
!pip install ipynb
from ipynb.fs.full.a_Generate_Parallel_Databases import create_db_and_parallels
import torch



In [2]:
db, pdbs = create_db_and_parallels(5000)
pdbs

[tensor([1, 0, 1,  ..., 1, 1, 1], dtype=torch.uint8),
 tensor([1, 0, 1,  ..., 1, 1, 1], dtype=torch.uint8),
 tensor([1, 1, 1,  ..., 1, 1, 1], dtype=torch.uint8),
 tensor([1, 1, 0,  ..., 1, 1, 1], dtype=torch.uint8),
 tensor([1, 1, 0,  ..., 1, 1, 1], dtype=torch.uint8),
 tensor([1, 1, 0,  ..., 1, 1, 1], dtype=torch.uint8),
 tensor([1, 1, 0,  ..., 1, 1, 1], dtype=torch.uint8),
 tensor([1, 1, 0,  ..., 1, 1, 1], dtype=torch.uint8),
 tensor([1, 1, 0,  ..., 1, 1, 1], dtype=torch.uint8),
 tensor([1, 1, 0,  ..., 1, 1, 1], dtype=torch.uint8),
 tensor([1, 1, 0,  ..., 1, 1, 1], dtype=torch.uint8),
 tensor([1, 1, 0,  ..., 1, 1, 1], dtype=torch.uint8),
 tensor([1, 1, 0,  ..., 1, 1, 1], dtype=torch.uint8),
 tensor([1, 1, 0,  ..., 1, 1, 1], dtype=torch.uint8),
 tensor([1, 1, 0,  ..., 1, 1, 1], dtype=torch.uint8),
 tensor([1, 1, 0,  ..., 1, 1, 1], dtype=torch.uint8),
 tensor([1, 1, 0,  ..., 1, 1, 1], dtype=torch.uint8),
 tensor([1, 1, 0,  ..., 1, 1, 1], dtype=torch.uint8),
 tensor([1, 1, 0,  ..., 1, 1

In [3]:
pdbs[0].shape

torch.Size([4999])

For the sake of this example, we create a database of 5,000 entries and 5,000 parallel databases, each containing 4,999 entries (the original database with one entry removed).
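As a reminder of what the parallel databases look like, here is a minimal sketch using plain Python lists. `make_parallel_dbs` is a hypothetical stand-in for `create_db_and_parallels`, whose actual implementation lives in the previous notebook; the sketch assumes each parallel database is the original with exactly one entry left out.

```python
def make_parallel_dbs(db):
    # Hypothetical helper: one parallel database per entry,
    # each with that single entry removed.
    return [db[:i] + db[i + 1:] for i in range(len(db))]

db = [1, 0, 1, 1, 0]          # toy binary database with 5 entries
pdbs = make_parallel_dbs(db)

print(len(pdbs))              # one parallel database per entry
print(len(pdbs[0]))           # each is one entry shorter than db
```

A database with n entries therefore yields n parallel databases of n - 1 entries each, which matches the `torch.Size([4999])` shape shown above.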

Now let's create a simple query for this database. This query just takes the mean of the items in the database.

In [4]:
def query(db):
    return db.float().mean()

In [5]:
full_db_result = query(db)

In [6]:
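# track the largest change in the query result across all parallel databases
sensitivity = 0
for pdb in pdbs:
    pdb_result = query(pdb)

    # how far this parallel result is from the full-database result
    db_distance = torch.abs(pdb_result - full_db_result)

    if db_distance > sensitivity:
        sensitivity = db_distance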

In [7]:
sensitivity

tensor(0.0001)

##### What do we notice here?

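Empirically, removing one item from the database changes the result of our mean query by at most about 0.0001. Since our data is binary (each value is 0 or 1), removing one item changes the sum by at most 1, so the mean of a database with n entries changes by at most 1/(n - 1). For n = 5000 and a roughly even split of 0s and 1s, the observed change is about 1/(2n) = 0.0001. That means our query is sensitive by about 0.0001 to the removal of one item from the original database.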
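To see how the choice of query affects sensitivity, the sketch below compares a sum query and a mean query on the same binary data. It is an illustration under assumptions, not the notebook's code: it uses plain Python lists instead of tensors so it runs without the previous notebook, and `sum_query`, `mean_query`, and `empirical_sensitivity` are hypothetical names.

```python
import random

random.seed(0)
n = 1000
db = [random.randint(0, 1) for _ in range(n)]  # random binary database

def sum_query(values):
    return sum(values)

def mean_query(values):
    return sum(values) / len(values)

def empirical_sensitivity(query, db):
    # Largest change in the query result when any single entry is removed.
    full = query(db)
    return max(abs(query(db[:i] + db[i + 1:]) - full) for i in range(len(db)))

sum_sens = empirical_sensitivity(sum_query, db)
mean_sens = empirical_sensitivity(mean_query, db)

print("sum sensitivity: ", sum_sens)
print("mean sensitivity:", mean_sens)
print("1 / (n - 1):     ", 1 / (n - 1))
```

For binary data, the mean's sensitivity should land between 1/(2(n - 1)) and 1/(n - 1), depending on the balance of 0s and 1s, while the sum's sensitivity never exceeds 1.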

## Generalizable Sensitivity Function

Now, let's create a single sensitivity function that combines all the things that we have done so far.

In [8]:
def sensitivity(query, n_entries):
    """Empirically estimate how much `query` changes when one entry is removed."""
    # initialize the database and the parallel databases
    db, pdbs = create_db_and_parallels(n_entries)
    print(db.shape)
    print(len(pdbs))

    # run the query over the full database
    full_db_result = query(db)

    # track the largest change seen across all parallel databases
    # (named max_distance so it does not shadow the function name)
    max_distance = 0
    for pdb in pdbs:
        pdb_result = query(pdb)

        db_distance = torch.abs(pdb_result - full_db_result)

        if db_distance > max_distance:
            max_distance = db_distance

    return max_distance
    

In [9]:
sensitivity(query, 1000)

torch.Size([1000])
1000


tensor(0.0005)
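The two runs above (5,000 and 1,000 entries) suggest that the mean query's sensitivity shrinks as the database grows. The following sketch checks that trend on smaller databases. It is a plain-Python approximation under assumptions: `empirical_sensitivity` is a hypothetical list-based counterpart of `sensitivity()`, and the database is generated with Python's `random` module rather than `create_db_and_parallels`.

```python
import random

def mean_query(values):
    return sum(values) / len(values)

def empirical_sensitivity(query, n_entries, seed=0):
    # Build a random binary database, then measure the largest change
    # in the query result when any single entry is removed.
    rng = random.Random(seed)
    db = [rng.randint(0, 1) for _ in range(n_entries)]
    full = query(db)
    return max(
        abs(query(db[:i] + db[i + 1:]) - full) for i in range(n_entries)
    )

results = {n: empirical_sensitivity(mean_query, n) for n in (100, 1000)}
for n, value in results.items():
    print(n, value)
```

Each individual has less influence on the mean of a larger database, so the value for 1,000 entries should be smaller than the value for 100 entries.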

### Assumptions

One of our assumptions here is that in the database that we create using the **create_db_and_parallels()** function, each of the values (1 or 0) represents a person.

So removing one value removes exactly one person, because no two values in the database refer to the same person.

We care about sensitivity to people and not necessarily to values.
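To illustrate why this distinction matters, consider a case where the assumption does not hold and one person contributes several values. The records and names below are hypothetical and exist only for this sketch.

```python
# Hypothetical records: (person_id, value). Person "a" contributes three rows.
records = [("a", 1), ("a", 1), ("a", 1), ("b", 1), ("c", 0)]

def sum_query(rows):
    return sum(value for _, value in rows)

full = sum_query(records)

# Sensitivity to values: remove one row at a time.
row_sensitivity = max(
    abs(full - sum_query(records[:i] + records[i + 1:]))
    for i in range(len(records))
)

# Sensitivity to people: remove every row belonging to one person.
people = {person_id for person_id, _ in records}
person_sensitivity = max(
    abs(full - sum_query([r for r in records if r[0] != person_id]))
    for person_id in people
)

print(row_sensitivity)     # 1
print(person_sensitivity)  # 3
```

Removing a single row changes the sum by at most 1, but removing person "a" changes it by 3, so measuring sensitivity per value would understate the influence of that person.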

### What are we really calculating here?

We are really trying to measure how much the output of the query depends on the information from each individual person in the database, as opposed to aggregate information that many people contribute to.

##### In the next notebook we will calculate the L1 sensitivity of a function...