# How do we actually use epsilon and delta?

# What is a privacy budget?
The privacy budget is how much epsilon (and delta) leakage we allow across our whole analysis.
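
One way to picture a budget is as a running total that every query draws from. The sketch below assumes basic sequential composition, where the epsilons of successive queries simply add up; the class `PrivacyBudget` and its method names are hypothetical and are not taken from the course material.

```python
class PrivacyBudget:
    """Hypothetical helper: tracks epsilon spent under basic sequential composition."""

    def __init__(self, total_epsilon):
        self.total_epsilon = total_epsilon
        self.spent = 0.0

    def spend(self, epsilon):
        # Basic composition: the epsilons of successive queries add up.
        if self.spent + epsilon > self.total_epsilon + 1e-12:
            raise ValueError("privacy budget exceeded")
        self.spent += epsilon
        return self.total_epsilon - self.spent


budget = PrivacyBudget(total_epsilon=1.0)
budget.spend(0.5)    # first query
budget.spend(0.25)   # second query
remaining = budget.total_epsilon - budget.spent
```

Once the remaining budget is too small for the next query, the analysis has to stop or use a smaller epsilon (more noise) per query.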

# Two types of noise we can add:
Gaussian
Laplacian
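
To get a feel for how the two differ, the sketch below draws both kinds of noise with the same variance and compares how often large values occur. It uses only the standard library; the variable names and the threshold of 4 are arbitrary choices for illustration.

```python
import random

random.seed(0)
n = 100_000

# Laplace(0, b) as the difference of two exponentials with mean b; its variance is 2 * b**2.
b = 1.0
laplace = [random.expovariate(1.0 / b) - random.expovariate(1.0 / b) for _ in range(n)]

# Gaussian with the same variance: sigma**2 = 2 * b**2.
sigma = (2.0 * b * b) ** 0.5
gaussian = [random.gauss(0.0, sigma) for _ in range(n)]


def tail_fraction(samples, threshold):
    # Share of samples whose magnitude exceeds the threshold.
    return sum(abs(s) > threshold for s in samples) / len(samples)


laplace_tail = tail_fraction(laplace, 4.0)
gaussian_tail = tail_fraction(gaussian, 4.0)
```

At equal variance the Laplace samples land beyond the threshold more often than the Gaussian ones, which reflects the heavier tails of the Laplace distribution.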

# How much noise should we add?


# Create a Differentially Private Query Project

What is the Laplace mechanism?

Link: https://stats.stackexchange.com/questions/223494/what-is-meant-by-laplace-noise?noredirect=1
The Laplace mechanism. Given any function $f: \mathbb{N}^{|\mathcal{X}|} \to \mathbb{R}^k$, the Laplace mechanism is defined as $M_L(x, f(\cdot), \epsilon) = f(x) + (Y_1, \ldots, Y_k)$, where the $Y_i$ are i.i.d. random variables drawn from $\mathrm{Lap}(\Delta f / \epsilon)$.
To generate the noise $Y$, a common choice is a Laplace distribution with zero mean and scale parameter $\Delta f / \epsilon$.
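
The definition above can be sketched in a few lines of plain Python. This is a minimal illustration, not the course's implementation: the names `laplace_noise`, `laplace_mechanism` and `histogram` are made up here, and the example query is a two-bin histogram, whose sensitivity is assumed to be 1 because adding or removing one record changes one bin by 1.

```python
import random


def laplace_noise(scale):
    # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)


def laplace_mechanism(x, f, sensitivity, epsilon):
    """Return f(x) plus independent Laplace(sensitivity / epsilon) noise per coordinate."""
    scale = sensitivity / epsilon
    return [value + laplace_noise(scale) for value in f(x)]


def histogram(db):
    # k = 2 outputs: the number of False entries and the number of True entries.
    ones = sum(db)
    return [len(db) - ones, ones]


random.seed(0)
db = [random.random() > 0.5 for _ in range(1000)]
noisy = laplace_mechanism(db, histogram, sensitivity=1.0, epsilon=0.5)
```

Each coordinate of the answer gets its own independent draw, matching the $(Y_1, \ldots, Y_k)$ term in the definition.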

Create Laplace random variables:
https://www.johndcook.com/blog/2018/03/13/generating-laplace-random-variables/

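Formula from scratch to generate Laplace values in Python:

    from math import log
    from random import random

    def exp_sample(mean):
        # Exponential sample by inverse CDF; 1 - random() lies in (0, 1], so log never sees 0.
        return -mean * log(1.0 - random())

    def laplace(scale):
        # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
        e1 = exp_sample(scale)
        e2 = exp_sample(scale)
        return e1 - e2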
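
A quick sanity check of this sampler, restated here so the block runs on its own: for a Laplace distribution with zero mean and scale b, the average of the samples should be near 0 and the average absolute value should be near b. The sample size and the scale of 2 are arbitrary choices.

```python
from math import log
from random import random, seed


def exp_sample(mean):
    # 1 - random() lies in (0, 1], so log is always defined.
    return -mean * log(1.0 - random())


def laplace(scale):
    return exp_sample(scale) - exp_sample(scale)


seed(0)
scale = 2.0
samples = [laplace(scale) for _ in range(100_000)]

sample_mean = sum(samples) / len(samples)
mean_abs = sum(abs(s) for s in samples) / len(samples)   # should be close to `scale`
```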

# Ques:
Two types of noise we can add:
Gaussian
Laplacian

Why is Laplacian noise better than Gaussian noise?

# Ans
Gogulaanand R  
with Laplacian noise we are guaranteed that the delta is 0

Gogulaanand R 
there is no extra unaccounted privacy leak other than the epsilon

Nirupama Singh  
@Gogulaanand R Why is delta zero? If I accept that delta is zero, does that mean the privacy leak will be smaller?

Tyler Yang 
@Nirupama Singh I can't explain why the delta is zero, but the mathematical proof can be found in the book _The Algorithmic Foundations of Differential Privacy_. The authors also explain why Gaussian noise *does* have a delta.

When delta is zero, there is no chance of an accidental information leak bigger than epsilon; a bigger delta means a higher probability of leaking more information than you would expect from the epsilon alone.

Nirupama Singh 
Thanks @Tyler Yang

Prakhar Tripathi
@Nirupama Singh Laplacian noise (also called biexponential) has the pdf $f(x) = \frac{1}{2b} e^{-|x - \mu| / b}$. Nonlinear estimators can provide a much more accurate estimate of the mean of a stationary Laplacian random variable than the linear average [6]. ... This implies that nonlinear filters should be better at removing uniform noise than Gaussian noise.
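
For reference, the guarantee this thread is discussing can be written out. This is the standard $(\epsilon, \delta)$ definition from the book mentioned in the thread; the symbols $M$ (the mechanism), $x, y$ (two neighbouring databases) and $S$ (any set of outputs) are notation chosen here.

$$
\Pr[M(x) \in S] \le e^{\epsilon} \, \Pr[M(y) \in S] + \delta
$$

With $\delta = 0$ the $e^{\epsilon}$ bound has to hold for every set of outputs $S$, which is the sense in which nothing leaks beyond epsilon. With $\delta > 0$ the bound is allowed an additive slack of $\delta$, which is the extra leak the thread refers to.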



# Ques
In Laplacian noise, beta = (sensitivity of the query) / (the epsilon that we want to achieve).
Why is beta (the scale parameter) chosen this way?

# Ans
Ateniola Oluwatobi Victor
@Nirupama Singh, that is just how the formula for beta is defined for Laplacian noise. You can apply the formula correctly when using differential privacy without knowing its proof.
If you want to know the proof behind the formula, and behind many other concepts in differential privacy, I suggest you read the book by Cynthia Dwork and Aaron Roth: The Algorithmic Foundations of Differential Privacy. The book is quite technical, and you might not understand some of the mathematical concepts unless you have a mathematics background.
https://www.cis.upenn.edu/~aaroth/Papers/privacybook.pdf
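
A sketch of why this scale works, for a scalar query $f$ with sensitivity $\Delta f$ and two neighbouring databases $x$ and $y$. The notation $p_x$ for the density of the noisy answer on database $x$ is an assumption made here; the full proof is in the book linked above.

$$
p_x(z) = \frac{1}{2\beta} \exp\left(-\frac{|z - f(x)|}{\beta}\right), \qquad \beta = \frac{\Delta f}{\epsilon}
$$

$$
\frac{p_x(z)}{p_y(z)} = \exp\left(\frac{|z - f(y)| - |z - f(x)|}{\beta}\right) \le \exp\left(\frac{|f(x) - f(y)|}{\beta}\right) \le \exp\left(\frac{\Delta f}{\beta}\right) = e^{\epsilon}
$$

The first inequality is the triangle inequality and the second is the definition of sensitivity. Choosing $\beta = \Delta f / \epsilon$ makes the worst-case ratio exactly $e^{\epsilon}$: a smaller beta would break the epsilon bound, and a larger one would add more noise than needed.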

# Ques
Nirupama Singh
In the video, it is stated that delta is always zero for Laplacian noise.
Why is delta zero for Laplacian noise?

# Ans
Gogulaanand R 
For your question about why delta is zero, see this answer by @Rishi S Rao
https://secureprivataischolar.slack.com/archives/CJCJJQ42W/p1562689042007400?thread_ts=1562687359.006900&cid=CJCJJQ42W
Rishi S Rao
This is quite an intense proof if you are not from a math background, but if you want to go through it, it is in the 3rd chapter of the text by Cynthia Dwork.
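
Short of reading the full proof, the difference can be checked numerically. The sketch below compares, over a grid of possible outputs, how much more likely an output is under one database than under a neighbouring one. The epsilon, the Gaussian sigma and the grid are arbitrary values picked for illustration, and the function names are made up here.

```python
import math


def laplace_pdf(z, mu, b):
    return math.exp(-abs(z - mu) / b) / (2.0 * b)


def gaussian_pdf(z, mu, sigma):
    return math.exp(-((z - mu) ** 2) / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))


epsilon = 0.5
sensitivity = 1.0
b = sensitivity / epsilon   # Laplace scale: sensitivity / epsilon
sigma = 2.0                 # arbitrary Gaussian scale, for illustration only

# True answers on two neighbouring databases differ by at most the sensitivity.
f_x, f_y = 0.0, sensitivity

grid = [i / 10.0 for i in range(-300, 301)]   # candidate outputs z from -30 to 30
laplace_ratio = max(laplace_pdf(z, f_x, b) / laplace_pdf(z, f_y, b) for z in grid)
gaussian_ratio = max(gaussian_pdf(z, f_x, sigma) / gaussian_pdf(z, f_y, sigma) for z in grid)
```

The Laplace ratio never exceeds `math.exp(epsilon)` no matter how far out the grid goes, so no delta is needed. The Gaussian ratio keeps growing in the tail, so no single epsilon covers every output, and delta accounts for those rare tail outputs.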

# Ques:
Nirupama Singh
Does "Gaussian distribution" mean the same as "normal distribution"?

# Ans:
Venkata Rathnam Muralidharan 
Yes. Both are the same. The standard normal distribution is the special case where mean = 0 and variance = 1.

Aarthi Alagammai 
Yes, the normalized Gaussian distribution is the standard normal distribution.

David Fernando Jurado Blanco 
Within the realm of probability theory, they're the same.

Deepak Sharma 
Yes, they are. In statistical theory this is very important, because about 95% of the data lies within 2 standard deviations of the mean, assuming that your mean is at the center of the distribution. You can then create hypotheses for your arguments.

Prakhar Tripathi 
@Nirupama Singh Yes, within probability theory they are the same.
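
The 95% figure mentioned in the thread can be checked with a short simulation. This sketch uses `random.gauss` from the standard library; the mean of 10 and standard deviation of 3 are arbitrary.

```python
import random

random.seed(0)
mu, sigma = 10.0, 3.0
samples = [random.gauss(mu, sigma) for _ in range(100_000)]

# Fraction of samples within 2 standard deviations of the mean.
within_2sd = sum(abs(s - mu) <= 2 * sigma for s in samples) / len(samples)

# Subtracting the mean and dividing by sigma gives the standard normal (mean 0, variance 1).
standardized = [(s - mu) / sigma for s in samples]
standardized_mean = sum(standardized) / len(standardized)
```

The fraction comes out close to 0.95 regardless of which mean and standard deviation are chosen.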

# Project: Create a Differentially Private Query

Create a query function that sums over the database and adds just the right amount of noise so that it satisfies an epsilon constraint. Write a query for both "sum" and "mean". Ensure that you use the correct sensitivity measure for each.

In [1]:

epsilon = 0.0001

In [2]:
import numpy as np

In [3]:

import torch

# the number of entries in our database
num_entries = 5000

db = torch.rand(num_entries) > 0.5
db

tensor([0, 1, 1,  ..., 1, 0, 0], dtype=torch.uint8)

In [4]:

def get_parallel_db(db, remove_index):

    return torch.cat((db[0:remove_index], 
                      db[remove_index+1:]))

In [5]:

def get_parallel_dbs(db):

    parallel_dbs = list()

    for i in range(len(db)):
        pdb = get_parallel_db(db, i)
        parallel_dbs.append(pdb)
    
    return parallel_dbs

In [6]:
def create_db_and_parallels(num_entries):
    
    db = torch.rand(num_entries) > 0.5
    pdbs = get_parallel_dbs(db)
    
    return db, pdbs

In [7]:

db, pdbs = create_db_and_parallels(100)

In [8]:
def sum_query(db):
    return db.sum()

In [9]:
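def laplacian_mechanism(db, query, sensitivity):
    
    # Laplace scale: sensitivity of the query divided by the (global) epsilon
    beta = sensitivity / epsilon
    # draw a single zero-mean Laplace sample with scale beta
    noise = torch.tensor(np.random.laplace(0, beta, 1))
    
    return query(db) + noise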

In [10]:

def mean_query(db):
    return torch.mean(db.float())
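
The sensitivities the project asks for can be estimated empirically by comparing each query on the full database with the same query on every parallel database. The sketch below is a plain-Python version that uses lists instead of torch tensors so that it runs without dependencies; `parallel_dbs` and `empirical_sensitivity` are illustrative names, and the two queries are list-based stand-ins for the notebook's functions.

```python
import random


def parallel_dbs(db):
    # Every database obtained by removing exactly one entry.
    return [db[:i] + db[i + 1:] for i in range(len(db))]


def empirical_sensitivity(query, db):
    # Largest change in the query result when any single entry is removed.
    full = query(db)
    return max(abs(full - query(pdb)) for pdb in parallel_dbs(db))


def sum_query(db):
    return sum(db)


def mean_query(db):
    return sum(db) / len(db)


random.seed(0)
db = [1 if random.random() > 0.5 else 0 for _ in range(100)]

sum_sens = empirical_sensitivity(sum_query, db)
mean_sens = empirical_sensitivity(mean_query, db)
```

For 0/1 entries the sum changes by at most 1 when one entry is removed, and for a database of 100 entries the mean changes by at most 1/100. The empirical value for the mean on one particular database is usually a little below that worst case.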

In [11]:
laplacian_mechanism(db, sum_query, 1)

tensor([4857.2294], dtype=torch.float64)

In [12]:

laplacian_mechanism(db, mean_query, 1/100)

tensor([35.8954], dtype=torch.float64)
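
The outputs above look far from the true answers because epsilon is very small. The sketch below, using only the standard library, shows how the Laplace scale beta = sensitivity / epsilon and the typical size of the noise change with epsilon; the list of epsilon values is an arbitrary choice, and `laplace_noise` is an illustrative helper rather than part of the notebook.

```python
import random


def laplace_noise(scale):
    # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)


random.seed(0)
sensitivity = 1.0   # sum query over 0/1 entries
trials = 20_000

results = {}
for epsilon in (0.0001, 0.01, 1.0):
    beta = sensitivity / epsilon
    mean_abs_noise = sum(abs(laplace_noise(beta)) for _ in range(trials)) / trials
    results[epsilon] = (beta, mean_abs_noise)
```

With epsilon = 0.0001 the scale for the sum query is 10000, so the noise dwarfs a true sum that can be at most 100 for this database. For the mean query the scale is (1/100) / 0.0001 = 100, which is why a value such as 35.8954 can appear even though the true mean lies between 0 and 1. A larger epsilon gives more accurate answers at the cost of weaker privacy.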