## Differentially Private Mean
---

This tutorial shows one way to call the `dp_mean()` function. The data samples are drawn at random from a Gaussian distribution, then made non-negative and clipped to the range $[0, 1]$. The output of `dp_mean()` is compared with the non-differentially private sample mean: $\bar{x}=\frac{1}{n}\sum_{i=1}^{n}x_i$.

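The following parameters can be adjusted with the sliders:

- Epsilon: the privacy parameter $\epsilon$ passed to `dp_mean()` (slider range 0.01 to 3); smaller values mean stronger privacy
- Delta: the privacy parameter $\delta$ passed to `dp_mean()` (slider range 0.01 to 0.5)
- Sample_size: the number of samples $n$ drawn for the data vector (slider range 100 to 10000)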
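
As a rough sketch of why these three parameters matter, suppose the noise is added by the classical Gaussian mechanism. This is an assumption made for illustration; the tutorial does not state which mechanism `dp_mean()` uses internally. With every sample clipped to $[0, 1]$, changing one sample changes the mean by at most $1/n$, so the released value and its noise scale would be

$$
\tilde{x} = \bar{x} + z, \qquad z \sim \mathcal{N}(0, \sigma^2), \qquad \sigma = \frac{1}{n\,\epsilon}\sqrt{2\ln\!\left(\frac{1.25}{\delta}\right)}
$$

Under this assumption (the standard bound is usually stated for $0 < \epsilon < 1$), the noise shrinks as $n$ or $\epsilon$ grows, and grows slowly as $\delta$ shrinks.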

In [2]:
from ipywidgets import interact
from IPython.display import display
import numpy as np
import dp_stats as dps

# This tutorial gives an example of using the dp_mean() function
# The true sample mean and differentially private mean of the data vector will be displayed for comparison


# This function will allow the outputs of the means to be interactive
def show_mean(Epsilon=1.0, Delta = 0.1, Sample_size = 100):
    # generate a sample data vector
    data_ = np.random.normal(loc = 0, scale = 1.0, size = Sample_size)
    
    # restrict the data vector to be non-negative and within the range [0, 1]
    data_ = abs(data_)
    data_ = data_.clip(min = 0, max = 1.0)

    # find the non-differentially private mean of the generated data
    mean_control = np.mean(data_)
    
    # find the differentially private mean of the generated data
    # dp_mean( data_vect, epsilon=1.0, delta=0.1 )
    mean_dp = dps.dp_mean(data_, epsilon = Epsilon, delta = Delta)
    
    # output the control and differentially private mean
    control_txt = 'Non-private Mean: {}'.format(round(mean_control, 4))
    display(control_txt)
    dp_txt = 'Differentially Private Mean: {}'.format(round(float(mean_dp), 4))
    display(dp_txt)

interact(show_mean, Epsilon=(0.01,3,0.01), Delta=(0.01,0.5,0.01), Sample_size=(100,10000,500))

'Non-private Mean: 0.6688'

'Differentially Private Mean: 0.6635'

<function __main__.show_mean>

The outputs show that the differentially private mean tends to move closer to the true sample mean in two cases: when the sample size grows at a fixed privacy level, or when the privacy guarantee is relaxed (larger Epsilon) at a fixed sample size.
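
The trend can be checked numerically without the sliders. The sketch below is a minimal illustration that does not call `dp_stats`: it uses a hypothetical stand-in, `gaussian_mechanism_mean()`, which assumes the classical Gaussian mechanism with sensitivity $1/n$ for data in $[0, 1]$. That mechanism, and the helper `average_error()`, are assumptions introduced here, not the library's documented implementation.

```python
import numpy as np


def gaussian_mechanism_mean(data, epsilon, delta, rng):
    """Hypothetical stand-in for a DP mean (NOT the dp_stats implementation).

    Assumes every entry of `data` lies in [0, 1], so changing one entry
    moves the mean by at most 1/n.
    """
    n = len(data)
    sensitivity = 1.0 / n
    # Classical Gaussian-mechanism noise scale (an assumption for this sketch).
    sigma = (sensitivity / epsilon) * np.sqrt(2.0 * np.log(1.25 / delta))
    return np.mean(data) + rng.normal(0.0, sigma)


def average_error(epsilon, delta, sample_size, trials=500, seed=0):
    """Average |private mean - true mean| over repeated random draws."""
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(trials):
        # Same preprocessing as the tutorial: absolute value, then clip to [0, 1].
        data = np.abs(rng.normal(0.0, 1.0, size=sample_size)).clip(0.0, 1.0)
        private = gaussian_mechanism_mean(data, epsilon, delta, rng)
        errors.append(abs(private - np.mean(data)))
    return float(np.mean(errors))


if __name__ == "__main__":
    # Fixed privacy level, growing sample size.
    for n in (100, 1000, 10000):
        print("n =", n, "avg error =", average_error(1.0, 0.1, n))
    # Fixed sample size, growing epsilon (weaker privacy).
    for eps in (0.1, 0.5, 2.0):
        print("epsilon =", eps, "avg error =", average_error(eps, 0.1, 100))
```

Averaging over many trials matters here: a single draw, like the single pair of outputs shown above, can land close to the true mean by chance even when the noise scale is large.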
