<a href="https://colab.research.google.com/github/souravs17031999/private-ai/blob/master/adding_laplacian_noise_global_diff_privacy.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Project - VI
## Implementing global differential privacy and understanding Laplacian noise

In [0]:
import torch
import numpy as np

In [0]:
db = torch.rand(100) > 0.5

In [0]:
def sum_query(db):
  return torch.sum(db.float())


Global differential privacy adds noise to the output of a query function applied to the database, not to the individual entries.
This model applies when individuals trust the curator to add noise to the query output before releasing it, so that the individual data points themselves are never exposed.
The function `M` below represents the randomized algorithm from Cynthia Dwork's original definition of differential privacy, that is, the mechanism whose output must preserve privacy.

In [0]:
def M(db, query, noise):
  # Randomized mechanism: the true query result plus random noise
  return query(db) + noise

In [11]:
sum_query(db)

tensor(60.)

How much noise should we add so that the epsilon-delta constraint from the formal definition of differential privacy is satisfied?

The answer depends on four things:

* Type of noise added (Gaussian or Laplacian)
* Sensitivity of query function
* Desired epsilon
* Desired delta
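
The sensitivity in the second item is the largest amount the query output can change when one individual is removed from the database. The sketch below estimates it empirically for a single database by deleting each entry in turn. The names `empirical_sensitivity` and `db_np` are hypothetical, introduced only for illustration, and the database is a made-up NumPy array of 60 ones and 40 zeros. Because it inspects only one database, the result is a lower bound on the worst-case sensitivity, not the exact value.

```python
import numpy as np

def empirical_sensitivity(query, db):
    """Largest change in query(db) when any single entry is removed."""
    full_result = query(db)
    max_change = 0.0
    for i in range(len(db)):
        neighbor = np.delete(db, i)  # database with entry i removed
        max_change = max(max_change, abs(query(neighbor) - full_result))
    return float(max_change)

# Hypothetical binary database: 60 ones and 40 zeros
db_np = np.array([1.0] * 60 + [0.0] * 40)

sum_sensitivity = empirical_sensitivity(np.sum, db_np)
mean_sensitivity = empirical_sensitivity(np.mean, db_np)
print(sum_sensitivity, mean_sensitivity)
```

For this database the sum estimate is 1 and the mean estimate is about 0.006, which stays below the bound of 1/100 that holds for a mean over 100 binary entries.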

b is the scale parameter of the Laplace noise, where b = sensitivity(query) / epsilon (the code below calls it `beta`).
In other words, if we set b to this value, the privacy leakage is at most epsilon. A further advantage of Laplace noise is that it guarantees this with delta == 0. There are some tunings that reach a very low epsilon at the cost of a non-zero delta, but we'll ignore them for now.
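
To make the epsilon bound concrete, the sketch below compares the Laplace densities of the same noisy output under two neighboring databases whose true sums differ by the sensitivity. The values 60 and 59 are illustrative, and the helper `laplace_pdf` is an assumption introduced for this example. With b = sensitivity / epsilon, the density ratio should never exceed exp(epsilon), and no delta term is needed.

```python
import math

def laplace_pdf(x, mu, b):
    """Density of a Laplace distribution with location mu and scale b."""
    return math.exp(-abs(x - mu) / b) / (2 * b)

epsilon = 0.5
sensitivity = 1
b = sensitivity / epsilon  # scale of the Laplace noise

# True sums on two neighboring databases (illustrative values)
true_sum, neighbor_sum = 60.0, 59.0

# Check the density ratio over a range of possible noisy outputs
outputs = [50 + 0.25 * i for i in range(81)]  # 50.0 ... 70.0
ratios = [laplace_pdf(y, true_sum, b) / laplace_pdf(y, neighbor_sum, b)
          for y in outputs]

print(max(ratios), math.exp(epsilon))
```

The largest ratio occurs for outputs at or above the larger true sum, where it equals exp(epsilon) up to floating-point rounding.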

The Laplace distribution is shown below:
![normal vs laplace distribution](https://www.vosesoftware.com/riskwiki/images/image15_632.gif)
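
A Laplace distribution has a sharper peak and heavier tails than a normal distribution with the same variance. The sketch below checks the tail behavior by sampling, assuming NumPy's `default_rng` generator and an arbitrary seed and sample size. The variable names are chosen only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
n_samples = 100_000

sigma = 1.0             # standard deviation shared by both distributions
b = sigma / np.sqrt(2)  # Laplace variance is 2 * b**2, so this matches sigma

normal_samples = rng.normal(0.0, sigma, n_samples)
laplace_samples = rng.laplace(0.0, b, n_samples)

# Fraction of samples more than three standard deviations from the center
normal_tail = np.mean(np.abs(normal_samples) > 3 * sigma)
laplace_tail = np.mean(np.abs(laplace_samples) > 3 * sigma)

print(normal_tail, laplace_tail)
```

The Laplace samples should land in the tails noticeably more often, which is why a single draw of Laplace noise can occasionally be large.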

In [0]:
epsilon = 0.5


In [0]:
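def laplace_m(db, query, sensitivity):
  # Scale of the Laplace noise; reads the global `epsilon` set above
  beta = sensitivity / epsilon
  # Draw one sample of zero-centered Laplace noise
  noise = torch.tensor(np.random.laplace(0, beta, 1))
  return query(db) + noise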

In [32]:
laplace_m(db, sum_query, 1)

tensor([60.4653], dtype=torch.float64)

In [0]:
def mean_query(db):
  return torch.mean(db.float())

In [31]:
laplace_m(db, mean_query, 1/100)

tensor([0.6157], dtype=torch.float64)

In [33]:
mean_query(db)

tensor(0.6000)
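
The mean query receives far less absolute noise than the sum query because its sensitivity is 100 times smaller. The arithmetic below is a sketch using the values from this notebook (`epsilon = 0.5`, 100 entries, a true sum of 60). It suggests that the noise scale relative to the true answer is the same for both queries. The variable names are introduced here for illustration.

```python
epsilon = 0.5
n = 100                    # number of entries in the database
true_sum = 60.0            # value returned by sum_query(db) above
true_mean = true_sum / n   # value returned by mean_query(db) above

b_sum = 1 / epsilon         # Laplace scale for the sum query (sensitivity 1)
b_mean = (1 / n) / epsilon  # Laplace scale for the mean query (sensitivity 1/100)

# Noise scale as a fraction of the true answer
relative_sum = b_sum / true_sum
relative_mean = b_mean / true_mean

print(b_sum, b_mean, relative_sum, relative_mean)
```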

Now let's see what happens when we decrease epsilon to a very small value, which means a negligible leakage of information.

In [0]:
epsilon = 0.0001

In [39]:
laplace_m(db, sum_query, 1)

tensor([1627.9181], dtype=torch.float64)

In [40]:
laplace_m(db, mean_query, 1/100)

tensor([153.6952], dtype=torch.float64)

As we can see, decreasing epsilon increases the beta parameter, so far more noise is added and the output becomes much less accurate, which is exactly what the results above show.
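
A single noisy draw can be misleading, so the sketch below averages the absolute error over many draws for several epsilon values. It is a NumPy-only illustration under stated assumptions: `laplace_mechanism` is a hypothetical variant that takes `epsilon` explicitly instead of reading a global, the true sum of 60 is taken from this notebook, and the seed, sample count, and the extra value `epsilon = 5.0` are arbitrary choices.

```python
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon, rng, size=1):
    """Add Laplace noise with scale sensitivity / epsilon to true_value."""
    scale = sensitivity / epsilon
    return true_value + rng.laplace(0.0, scale, size)

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
true_sum = 60.0                 # result of sum_query(db) in this notebook

mean_abs_error = {}
for eps in [0.0001, 0.5, 5.0]:
    noisy = laplace_mechanism(true_sum, sensitivity=1, epsilon=eps,
                              rng=rng, size=10_000)
    mean_abs_error[eps] = float(np.mean(np.abs(noisy - true_sum)))

for eps, err in mean_abs_error.items():
    print(f"epsilon={eps}: mean absolute error ~ {err:.3f}")
```

The average absolute error should track the scale sensitivity / epsilon, so it shrinks as epsilon grows and the privacy guarantee weakens.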