# In-class practice
## The Laplace Mechanism
Material courtesy of Joseph Near, University of Vermont

## Instructions

The first half of this notebook contains code to read in and preprocess the example dataset. The second half contains questions for you to answer by writing code and describing your solutions.

First, download the example dataset and ensure that all cells in this notebook execute without error.

## Preamble: Read in and preprocess the Adult dataset

Download the dataset from [this link](https://jnear.github.io/cs295-data-privacy/homework/adult_with_pii.csv) and place it in the same directory as this notebook.
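If you would rather fetch the file from code than from the browser, something like the following should work. This is only a sketch: the helper name `fetch_dataset` and its `dest` parameter are made up for illustration and are not part of the course material.

```python
from pathlib import Path
from urllib.request import urlretrieve

# URL taken from the download link above.
DATA_URL = "https://jnear.github.io/cs295-data-privacy/homework/adult_with_pii.csv"

def fetch_dataset(url=DATA_URL, dest="adult_with_pii.csv"):
    """Download the CSV to `dest` unless a file with that name already exists.

    `fetch_dataset` is a hypothetical helper, not part of the notebook.
    """
    path = Path(dest)
    if not path.exists():
        # urlretrieve copies the resource at `url` into the local file.
        urlretrieve(url, str(path))
    return path
```

Calling `fetch_dataset()` once before the `pd.read_csv("adult_with_pii.csv")` cell leaves the file in the notebook's working directory, and later calls do nothing because the file is already there.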

The dataset is based on census data. I have added the columns `Name`, `DOB`, `SSN`, and `Zip` to represent personally identifiable information (PII). The values in these columns are made up.

In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
# Matplotlib 3.6+ renamed this style to 'seaborn-v0_8-whitegrid'; use that name if this line fails
plt.style.use('seaborn-whitegrid')
import pandas as pd
import numpy as np

def your_code_here():
    return 1

adult_data = pd.read_csv("adult_with_pii.csv")
adult_data.head()

Unnamed: 0,Name,DOB,SSN,Zip,Age,Workclass,fnlwgt,Education,Education-Num,Martial Status,Occupation,Relationship,Race,Sex,Capital Gain,Capital Loss,Hours per week,Country,Target
0,Karrie Trusslove,9/7/1967,732-14-6110,64152,39,State-gov,77516,Bachelors,13,Never-married,Adm-clerical,Not-in-family,White,Male,2174,0,40,United-States,<=50K
1,Brandise Tripony,6/7/1988,150-19-2766,61523,50,Self-emp-not-inc,83311,Bachelors,13,Married-civ-spouse,Exec-managerial,Husband,White,Male,0,0,13,United-States,<=50K
2,Brenn McNeely,8/6/1991,725-59-9860,95668,38,Private,215646,HS-grad,9,Divorced,Handlers-cleaners,Not-in-family,White,Male,0,0,40,United-States,<=50K
3,Dorry Poter,4/6/2009,659-57-4974,25503,53,Private,234721,11th,7,Married-civ-spouse,Handlers-cleaners,Husband,Black,Male,0,0,40,United-States,<=50K
4,Dick Honnan,9/16/1951,220-93-3811,75387,28,Private,338409,Bachelors,13,Married-civ-spouse,Prof-specialty,Wife,Black,Female,0,0,40,Cuba,<=50K


In [2]:
# Remove the direct identifiers (Name, SSN); DOB and Zip are kept
adult_anon = adult_data.drop(columns=['Name', 'SSN'])
adult_anon.head()

Unnamed: 0,DOB,Zip,Age,Workclass,fnlwgt,Education,Education-Num,Martial Status,Occupation,Relationship,Race,Sex,Capital Gain,Capital Loss,Hours per week,Country,Target
0,9/7/1967,64152,39,State-gov,77516,Bachelors,13,Never-married,Adm-clerical,Not-in-family,White,Male,2174,0,40,United-States,<=50K
1,6/7/1988,61523,50,Self-emp-not-inc,83311,Bachelors,13,Married-civ-spouse,Exec-managerial,Husband,White,Male,0,0,13,United-States,<=50K
2,8/6/1991,95668,38,Private,215646,HS-grad,9,Divorced,Handlers-cleaners,Not-in-family,White,Male,0,0,40,United-States,<=50K
3,4/6/2009,25503,53,Private,234721,11th,7,Married-civ-spouse,Handlers-cleaners,Husband,Black,Male,0,0,40,United-States,<=50K
4,9/16/1951,75387,28,Private,338409,Bachelors,13,Married-civ-spouse,Prof-specialty,Wife,Black,Female,0,0,40,Cuba,<=50K


In [3]:
# PII only
pii = adult_data[['Name', 'DOB', 'SSN', 'Zip']]
pii.head()

Unnamed: 0,Name,DOB,SSN,Zip
0,Karrie Trusslove,9/7/1967,732-14-6110,64152
1,Brandise Tripony,6/7/1988,150-19-2766,61523
2,Brenn McNeely,8/6/1991,725-59-9860,95668
3,Dorry Poter,4/6/2009,659-57-4974,25503
4,Dick Honnan,9/16/1951,220-93-3811,75387


## END PREAMBLE
-------------

### Question 1

Define a function `laplace_mech` that implements the Laplace Mechanism for a query whose result is a single real number. Take the sensitivity and `epsilon` as arguments, so that your implementation works for queries of any sensitivity and for any positive value of `epsilon`.

*Hint*: use `np.random.laplace`.
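For reference, here is the standard definition of the Laplace mechanism. The symbols $f$ (the query), $x$ (the dataset), and $s$ (the sensitivity of $f$) are notation assumed here, not taken from the notebook:

$$
F(x) = f(x) + \mathrm{Lap}\!\left(\frac{s}{\epsilon}\right)
$$

Here $\mathrm{Lap}(b)$ denotes a sample from the Laplace distribution centered at 0 with scale $b$, and $F$ satisfies $\epsilon$-differential privacy. The scale $b = s/\epsilon$ corresponds to the `scale` argument of `np.random.laplace`.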

In [21]:
def laplace_mech(query, sensitivity, epsilon):
    return your_code_here()

In [4]:
laplace_mech(48273, 1, 1)

48273.62614440791
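One possible way to fill in the stub is sketched below. It is not an official solution: the name `laplace_mech_sketch` and the input check are additions made for illustration. It adds Laplace noise with scale `sensitivity / epsilon` to the query result, following the hint.

```python
import numpy as np

def laplace_mech_sketch(query, sensitivity, epsilon):
    """Add Laplace noise with scale sensitivity / epsilon to a numeric query result.

    Hypothetical name; same argument order as the `laplace_mech` stub.
    """
    if epsilon <= 0:
        # The noise scale sensitivity / epsilon is only defined for epsilon > 0.
        raise ValueError("epsilon must be positive")
    scale = sensitivity / epsilon
    # np.random.laplace draws a single float when no size is given.
    return query + np.random.laplace(loc=0.0, scale=scale)

noisy = laplace_mech_sketch(48273, 1, 1)
```

With `sensitivity = 1` and `epsilon = 1` the noise scale is 1, so `noisy` typically lands within a few units of 48273 and differs on every run.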

### Question 2

Use your implementation of `laplace_mech` to produce a differentially private answer to your query from the last question, with `epsilon = 0.1`.

In [11]:
your_code_here()

10683.322660193706
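As one possible illustration, the sketch below applies the mechanism to a hypothetical counting query (people aged 40 or older) on a five-row toy DataFrame that stands in for `adult_anon`. The toy data, the query, and the name `laplace_mech_sketch` are assumptions made so the example runs on its own. A counting query has sensitivity 1, because adding or removing one person changes the count by at most 1.

```python
import numpy as np
import pandas as pd

def laplace_mech_sketch(query, sensitivity, epsilon):
    # Hypothetical stand-in for your own laplace_mech from Question 1.
    return query + np.random.laplace(loc=0.0, scale=sensitivity / epsilon)

# Toy stand-in for adult_anon: only the Age column, five rows.
toy = pd.DataFrame({"Age": [39, 50, 38, 53, 28]})

# Counting query: how many people are 40 or older? (50 and 53, so 2)
true_count = int((toy["Age"] >= 40).sum())

# Sensitivity of a count is 1; epsilon = 0.1 gives a noise scale of 1 / 0.1 = 10.
noisy_count = laplace_mech_sketch(true_count, sensitivity=1, epsilon=0.1)
```

With `epsilon = 0.1` the expected absolute error is 10, which swamps a count of 2 but is small relative to counts in the thousands on the full dataset.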