<b>Generate parallel databases</b>

<p>Key to the definition of differential privacy is the ability to ask the question "When querying a database, if I removed someone from the database, would the output of the query be any different?" To check this, we must construct what we term "parallel databases", which are simply copies of the database with one entry removed.</p>
<p>In this first project, I am going to create a list of every parallel database of the one currently stored in the "db" variable.</p>
<p>Then, I am going to create a function that both:</p>
<ul>
    <li>creates the initial database (db)</li>
    <li>creates all parallel databases</li>
</ul>
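<p>To make the idea concrete, here is a minimal sketch that uses a plain Python list rather than the tensor used in the rest of this notebook. The four-entry database and the variable names are made up for illustration.</p>

```python
# A tiny, made-up database of four binary entries (illustrative only).
toy_db = [1, 0, 1, 1]

# One parallel database per entry: a copy with that single entry removed.
toy_parallel_dbs = [toy_db[:i] + toy_db[i + 1:] for i in range(len(toy_db))]

for i, parallel_db in enumerate(toy_parallel_dbs):
    print(f"entry {i} removed: {parallel_db}")
```

<p>A database with n entries therefore has n parallel databases, each holding n - 1 entries.</p>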

In [1]:
import torch

# The number of entries in our database
num_entries = 5000

# Each entry is a random bit: True where the uniform sample exceeds 0.5
db = torch.rand(num_entries) > 0.5
db

tensor([1, 0, 1,  ..., 1, 0, 0], dtype=torch.uint8)

In [7]:
remove_index = 2
db[0:5]

tensor([1, 0, 1, 0, 1], dtype=torch.uint8)

In [9]:
def get_parallel_db(db, remove_index):
    # Return a copy of db without the entry at remove_index
    return torch.cat((db[0:remove_index],
                      db[remove_index + 1:]))

In [11]:
get_parallel_db(db, 3).shape

torch.Size([4999])

In [12]:
get_parallel_db(db, 11111).shape

torch.Size([5000])
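
<p>The call above does not fail: slicing clamps to the length of the tensor, so an index past the end removes nothing and all 5000 entries come back. If that silent behaviour is unwanted, a bounds check can be added. The sketch below is one possible variant; the name <code>get_parallel_db_checked</code> and the choice of <code>IndexError</code> are assumptions, not part of the original notebook.</p>

```python
import torch


def get_parallel_db_checked(db, remove_index):
    # Hypothetical variant of get_parallel_db that rejects indices outside
    # the database instead of silently returning every entry.
    if not 0 <= remove_index < len(db):
        raise IndexError(
            f"remove_index {remove_index} is out of range for {len(db)} entries"
        )
    return torch.cat((db[:remove_index], db[remove_index + 1:]))


small_db = torch.rand(10) > 0.5
print(get_parallel_db_checked(small_db, 3).shape)  # one entry fewer than small_db

try:
    get_parallel_db_checked(small_db, 11111)
except IndexError as err:
    print("rejected:", err)
```

<p>Negative indices are rejected as well, because the slicing used here would not treat them as a removal.</p>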

In [16]:
def get_parallel_dbs(db):
    # Build one parallel database per entry, each with that entry removed
    parallel_dbs = list()

    for i in range(len(db)):
        pdb = get_parallel_db(db, i)
        parallel_dbs.append(pdb)

    return parallel_dbs

In [18]:
pdbs = get_parallel_dbs(db)

In [19]:
pdbs

[tensor([0, 1, 0,  ..., 1, 0, 0], dtype=torch.uint8),
 tensor([1, 1, 0,  ..., 1, 0, 0], dtype=torch.uint8),
 tensor([1, 0, 0,  ..., 1, 0, 0], dtype=torch.uint8),
 tensor([1, 0, 1,  ..., 1, 0, 0], dtype=torch.uint8),
 tensor([1, 0, 1,  ..., 1, 0, 0], dtype=torch.uint8),
 tensor([1, 0, 1,  ..., 1, 0, 0], dtype=torch.uint8),
 tensor([1, 0, 1,  ..., 1, 0, 0], dtype=torch.uint8),
 tensor([1, 0, 1,  ..., 1, 0, 0], dtype=torch.uint8),
 tensor([1, 0, 1,  ..., 1, 0, 0], dtype=torch.uint8),
 tensor([1, 0, 1,  ..., 1, 0, 0], dtype=torch.uint8),
 tensor([1, 0, 1,  ..., 1, 0, 0], dtype=torch.uint8),
 tensor([1, 0, 1,  ..., 1, 0, 0], dtype=torch.uint8),
 tensor([1, 0, 1,  ..., 1, 0, 0], dtype=torch.uint8),
 tensor([1, 0, 1,  ..., 1, 0, 0], dtype=torch.uint8),
 tensor([1, 0, 1,  ..., 1, 0, 0], dtype=torch.uint8),
 tensor([1, 0, 1,  ..., 1, 0, 0], dtype=torch.uint8),
 tensor([1, 0, 1,  ..., 1, 0, 0], dtype=torch.uint8),
 tensor([1, 0, 1,  ..., 1, 0, 0], dtype=torch.uint8),
 tensor([1, 0, 1,  ..., 1, 0

In [20]:
def create_db_and_parallels(num_entries):
    # Create a random binary database and all of its parallel databases
    db = torch.rand(num_entries) > 0.5
    pdbs = get_parallel_dbs(db)

    return db, pdbs
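
<p>With both the database and its parallel databases available, the question from the introduction can be asked directly: does a query's output change when one entry is removed? The sketch below assumes a sum query, chosen here purely for illustration; <code>query</code> and <code>max_query_change</code> are hypothetical names that do not appear in the notebook. The helper functions are repeated so that the block runs on its own.</p>

```python
import torch


def get_parallel_db(db, remove_index):
    return torch.cat((db[0:remove_index], db[remove_index + 1:]))


def create_db_and_parallels(num_entries):
    db = torch.rand(num_entries) > 0.5
    pdbs = [get_parallel_db(db, i) for i in range(len(db))]
    return db, pdbs


def query(db):
    # Assumed example query: count the entries that are set.
    return db.float().sum()


def max_query_change(db, pdbs):
    # Largest absolute change in the query output caused by removing one entry.
    full_result = query(db)
    return max(abs(full_result - query(pdb)).item() for pdb in pdbs)


db, pdbs = create_db_and_parallels(20)
print(max_query_change(db, pdbs))
```

<p>For a sum over binary entries the change can never exceed 1, because removing one entry takes at most 1 away from the total.</p>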

In [21]:
db, pdbs = create_db_and_parallels(20)

In [22]:
db

tensor([0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1],
       dtype=torch.uint8)

In [23]:
pdbs

[tensor([1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1],
        dtype=torch.uint8),
 tensor([0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1],
        dtype=torch.uint8),
 tensor([0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1],
        dtype=torch.uint8),
 tensor([0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1],
        dtype=torch.uint8),
 tensor([0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1],
        dtype=torch.uint8),
 tensor([0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1],
        dtype=torch.uint8),
 tensor([0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1],
        dtype=torch.uint8),
 tensor([0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1],
        dtype=torch.uint8),
 tensor([0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1],
        dtype=torch.uint8),
 tensor([0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1],
        dtype=torch.uint8),
 tensor([0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1,