# Creating a Simple Database

We're going to create a simple database of 0s and 1s. Each entry corresponds to one person in the database and records whether that person has a certain property (1) or not (0).

In [1]:
import torch

# number of entries in database
num_entries = 5000

db = torch.rand(num_entries) > 0.5
db

tensor([0, 0, 0,  ..., 0, 1, 0], dtype=torch.uint8)

## Project: Generate Parallel Databases

The intuition behind Differential Privacy can be captured by a simple question:

> Does removing any one person from the database change the output of the same query?

To check this, we will create parallel databases: for a database with N entries, we build N databases of N-1 entries each, where each one omits a different entry.
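
As a small illustration of what "parallel databases" means, here is a minimal sketch on a toy database of four entries. It uses plain Python lists rather than tensors so the result is easy to read; the names `toy_db`, `drop_entry`, and `toy_pdbs` are hypothetical and are not part of the notebook's own code.

```python
# Hypothetical toy database: four people, 1 = has the property, 0 = does not.
toy_db = [1, 0, 1, 1]

def drop_entry(db, remove_index):
    """Return a copy of db without the entry at remove_index."""
    return db[:remove_index] + db[remove_index + 1:]

# One parallel database per person in the original database.
toy_pdbs = [drop_entry(toy_db, i) for i in range(len(toy_db))]

for i, pdb in enumerate(toy_pdbs):
    print(f"without person {i}: {pdb}")
```

Four entries give four parallel databases of three entries each. Note that two of them can have identical contents (here, removing person 2 or person 3 leaves the same values), even though each one omits a different person.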

Objective of the project: create a function that
- Creates the initial database (`db`)
- Creates the parallel databases (`pdbs`, a list with one tensor per removed entry)

In [2]:
db = torch.rand(num_entries) > 0.5
db

tensor([1, 1, 1,  ..., 0, 0, 1], dtype=torch.uint8)

In [6]:
def get_parallel_db(db, remove_index):
    # Join the entries before and after remove_index, dropping that one entry
    return torch.cat((db[:remove_index], db[remove_index+1:]))

In [13]:
res = get_parallel_db(db, 2)
print(res)
print(res.shape)

tensor([1, 1, 0,  ..., 0, 0, 1], dtype=torch.uint8)
torch.Size([4999])


In [14]:
def get_parallel_dbs(db):
    # Build one parallel database per entry, each omitting a different index
    return [get_parallel_db(db, i) for i in range(len(db))]

In [15]:
get_parallel_dbs(db)

[tensor([1, 1, 0,  ..., 0, 0, 1], dtype=torch.uint8),
 tensor([1, 1, 0,  ..., 0, 0, 1], dtype=torch.uint8),
 tensor([1, 1, 0,  ..., 0, 0, 1], dtype=torch.uint8),
 tensor([1, 1, 1,  ..., 0, 0, 1], dtype=torch.uint8),
 tensor([1, 1, 1,  ..., 0, 0, 1], dtype=torch.uint8),
 tensor([1, 1, 1,  ..., 0, 0, 1], dtype=torch.uint8),
 tensor([1, 1, 1,  ..., 0, 0, 1], dtype=torch.uint8),
 tensor([1, 1, 1,  ..., 0, 0, 1], dtype=torch.uint8),
 tensor([1, 1, 1,  ..., 0, 0, 1], dtype=torch.uint8),
 tensor([1, 1, 1,  ..., 0, 0, 1], dtype=torch.uint8),
 tensor([1, 1, 1,  ..., 0, 0, 1], dtype=torch.uint8),
 tensor([1, 1, 1,  ..., 0, 0, 1], dtype=torch.uint8),
 tensor([1, 1, 1,  ..., 0, 0, 1], dtype=torch.uint8),
 tensor([1, 1, 1,  ..., 0, 0, 1], dtype=torch.uint8),
 tensor([1, 1, 1,  ..., 0, 0, 1], dtype=torch.uint8),
 tensor([1, 1, 1,  ..., 0, 0, 1], dtype=torch.uint8),
 tensor([1, 1, 1,  ..., 0, 0, 1], dtype=torch.uint8),
 tensor([1, 1, 1,  ..., 0, 0, 1], dtype=torch.uint8),
 tensor([1, 1, 1,  ..., 0, 0

In [16]:
def create_db_and_parallel_dbs(num_entries):
    # Random database of 0s and 1s, plus its full list of parallel databases
    db = torch.rand(num_entries) > 0.5
    pdbs = get_parallel_dbs(db)

    return db, pdbs

In [17]:
db, pdbs = create_db_and_parallel_dbs(2000)

In [19]:
print("Shape of db:", db.shape)
print("Length of pdbs:", len(pdbs))
print("pdbs[0]:", pdbs[0])
print("Shape of pdbs[0]:", pdbs[0].shape)

Shape of db: torch.Size([2000])
Length of pdbs: 2000
pdbs[0]: tensor([0, 1, 1,  ..., 0, 1, 0], dtype=torch.uint8)
Shape of pdbs[0]: torch.Size([1999])
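
With the parallel databases in hand, the question posed at the start of this project can be checked directly: run the same query on the full database and on every parallel database, then compare the results. The following is a minimal sketch under stated assumptions. It uses a toy list instead of a tensor, and `sum_query` is a hypothetical example query chosen for illustration, not one defined in this notebook.

```python
def sum_query(db):
    """Hypothetical example query: count the entries equal to 1."""
    return sum(db)

# Toy database and its parallel databases (each omits one entry).
toy_db = [1, 0, 1, 1]
toy_pdbs = [toy_db[:i] + toy_db[i + 1:] for i in range(len(toy_db))]

full_result = sum_query(toy_db)

# Largest change in the query result caused by removing any single person.
max_change = max(abs(full_result - sum_query(pdb)) for pdb in toy_pdbs)

print("query on full database:", full_result)
print("largest change after removing one person:", max_change)
```

For this toy database the full query returns 3 and the largest change is 1, so removing a single person can change the output of a sum query.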
