# CS295B F19: Homework 6
## Differentially Private Machine Learning

## Instructions

Before you start, download the example dataset and ensure that all cells in this notebook execute without error. If you have trouble getting the notebook to run, please post a question on Piazza.

To ensure that the notebook runs, I've defined a function `your_code_here()` that simply returns the number `1`. Whenever you see a call to this function, you should replace it with code you have written. Please make sure all cells of your notebook run without error before submitting the assignment. If you have not completed all the questions, leave calls to `your_code_here()` in place or insert dummy values so that the cell does not throw an error when it runs.

To help you arrive at the correct solution, I have left the values computed by my solution in the uploaded version of this notebook. You can refer to these example results by viewing the notebook on GitHub. If you re-run a cell after downloading the notebook, its results will disappear (because the notebook no longer contains the code that generated them). Your solutions should produce results similar to the ones in the uploaded notebook; they will not be identical, because the mechanisms add random noise.

When answering non-code questions, feel free to use a comment, or put the cell in Markdown mode and use Markdown.

The assignment is due by 5:00pm on Friday, November 8. When you have finished your assignment, submit it via Gradescope under the assignment "Homework 6." For questions on grading and submitting assignments, refer to the course webpage or email the instructor.

The dataset files you'll need are available here:

- [`adult_processed_x`](https://github.com/jnear/cs295-data-privacy/blob/master/slides/adult_processed_x.npy)
- [`adult_processed_y`](https://github.com/jnear/cs295-data-privacy/blob/master/slides/adult_processed_y.npy)

## Collaboration Statement

In the cell below, write your collaboration statement. This statement should describe all collaborations, even high-level ones (e.g. "I discussed my general approach for answering question 3 with Josh"). High-level collaborations of this kind are allowed as long as they are described; copying of answers or code is not allowed.

In [1]:
# Write your collaboration statement here

--------------------

In [23]:
%matplotlib inline
import matplotlib.pyplot as plt
plt.style.use('seaborn-whitegrid')  # on Matplotlib 3.6+, this style is named 'seaborn-v0_8-whitegrid'
import pandas as pd
import numpy as np

# Some useful utilities

def laplace_mech(v, sensitivity, epsilon):
    return v + np.random.laplace(loc=0, scale=sensitivity / epsilon)

def gaussian_mech(v, sensitivity, epsilon, delta):
    return v + np.random.normal(loc=0, scale=sensitivity * np.sqrt(2*np.log(1.25/delta)) / epsilon)

def pct_error(orig, priv):
    return np.abs(orig - priv)/orig * 100.0

def z_clip(xs, b):
    return [min(x, b) for x in xs]

def clip(xs, upper, lower):
    return [max(min(x, upper), lower) for x in xs]

def gaussian_mech_vec(v, sensitivity, epsilon, delta):
    return v + np.random.normal(loc=0, scale=sensitivity * np.sqrt(2*np.log(1.25/delta)) / epsilon, size=len(v))

def L2_clip(v, b):
    norm = np.linalg.norm(v, ord=2)
    
    if norm > b:
        return b * (v / norm)
    else:
        return v

def your_code_here():
    return 1

def test(msg, value, expected):
    if value == expected:
        print(f"{msg}: {value}, as expected")
    else:
        print(f"{msg}: OH NO! Got {value}, but expected {expected}.")
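
As a quick illustration of the clipping helper above, here is a minimal sketch. It repeats the `L2_clip` definition so that it runs on its own, and the input vectors are made-up examples. A vector whose L2 norm exceeds `b` is rescaled to have norm `b`, while a shorter vector is returned unchanged.

```python
import numpy as np

def L2_clip(v, b):
    # Same definition as in the utilities cell, repeated for self-containment
    norm = np.linalg.norm(v, ord=2)
    if norm > b:
        return b * (v / norm)
    else:
        return v

long_vec = np.array([3.0, 4.0])    # L2 norm is 5, above the bound
short_vec = np.array([0.3, 0.4])   # L2 norm is 0.5, below the bound

clipped_long = L2_clip(long_vec, 1.0)
clipped_short = L2_clip(short_vec, 1.0)

print(clipped_long, np.linalg.norm(clipped_long))
print(clipped_short, np.linalg.norm(clipped_short))
```

Clipping a vector this way bounds its L2 norm by `b`, which is the kind of bound needed to state an L2 sensitivity before adding Gaussian noise.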

In [13]:
X = np.load('adult_processed_x.npy')
y = np.load('adult_processed_y.npy')

training_size = int(X.shape[0] * 0.8)

X_train = X[:training_size]
X_test = X[training_size:]

y_train = y[:training_size]
y_test = y[training_size:]

In [18]:
# Prediction: take a model (theta) and an example (xi) and return its predicted label.
# Also works when xi is a matrix of examples (one per row), as in accuracy() below.
def predict(theta, xi):
    label = np.sign(xi @ theta)
    return label

# The loss function measures how good our model is. The training goal is to minimize the loss.
# This is the logistic loss function.
def loss(theta, xi, yi):
    exponent = - yi * (xi.dot(theta))
    return np.log(1 + np.exp(exponent))

# This is the gradient of the logistic loss
# The gradient is a vector that indicates the rate of change of the loss in each direction
def gradient(theta, xi, yi):
    exponent = yi * (xi.dot(theta))
    return - (yi*xi) / (1+np.exp(exponent))

def accuracy(theta):
    return np.sum(predict(theta, X_test) == y_test)/X_test.shape[0]

# Simple gradient descent algorithm
def avg_grad(theta, X, y):
    grads = [gradient(theta, xi, yi) for xi, yi in zip(X, y)]
    return np.mean(grads, axis=0)

def gradient_descent(iterations):
    theta = np.zeros(X_train.shape[1])

    for i in range(iterations):
        theta = theta - avg_grad(theta, X_train, y_train)

    return theta
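
One way to convince yourself that `gradient` really is the gradient of `loss` is a finite-difference check. The sketch below repeats both definitions so that it runs on its own. The helper `numerical_gradient` and the random test inputs are assumptions introduced for illustration and are not part of the assignment.

```python
import numpy as np

def loss(theta, xi, yi):
    exponent = - yi * (xi.dot(theta))
    return np.log(1 + np.exp(exponent))

def gradient(theta, xi, yi):
    exponent = yi * (xi.dot(theta))
    return - (yi*xi) / (1+np.exp(exponent))

def numerical_gradient(theta, xi, yi, h=1e-6):
    # Hypothetical helper: central finite differences, one coordinate at a time
    grad = np.zeros_like(theta)
    for j in range(len(theta)):
        step = np.zeros_like(theta)
        step[j] = h
        grad[j] = (loss(theta + step, xi, yi) - loss(theta - step, xi, yi)) / (2 * h)
    return grad

# Made-up model, example, and label (labels are +1 or -1)
rng = np.random.default_rng(0)
theta = rng.normal(size=5)
xi = rng.normal(size=5)
yi = -1.0

analytic = gradient(theta, xi, yi)
numeric = numerical_gradient(theta, xi, yi)
max_diff = np.max(np.abs(analytic - numeric))
print(f"Largest difference between analytic and numerical gradient: {max_diff}")
```

The two gradients should agree up to small numerical error; a large difference would indicate a sign or algebra mistake in one of the two functions.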

In [19]:
theta = gradient_descent(10)
accuracy(theta)

0.77874834144183991

----------------

## Question 1 (30 points)

Implement a function `dp_gradient_descent` that performs differentially private gradient descent by adding noise to the gradient at each iteration. Your function should take additional arguments $\epsilon$ and $\delta$, and should have an **overall** privacy cost of $(\epsilon, \delta)$-differential privacy. You may target *either* bounded or unbounded differential privacy.

**Note**: this is a major difference from the function defined in the notes, which bounds $\epsilon$ *per-iteration*. Your solution should bound the *total* privacy cost of training.

*Hint*: Use `gaussian_mech_vec`, defined above, to add noise.

*Hint*: Use advanced composition to bound the total privacy cost. Start with the total privacy cost of $k$-fold adaptive composition under advanced composition, then solve for $\epsilon_i$, the privacy cost per iteration. Use this result to set the per-iteration value of `epsilon`, and similarly for `delta`.
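
As a starting point for the second hint, the sketch below computes the total privacy cost of $k$-fold adaptive composition in the forward direction only. It uses the advanced composition bound as stated by Dwork and Roth: $k$ mechanisms that each satisfy $(\epsilon_i, \delta_i)$-differential privacy together satisfy $(\epsilon', k\delta_i + \delta')$-differential privacy, where $\epsilon' = \epsilon_i \sqrt{2k \ln(1/\delta')} + k\epsilon_i(e^{\epsilon_i} - 1)$. The function name `advanced_composition` and its parameters are assumptions made for illustration. The course notes may state a simplified form of this bound, so check which version you are expected to use.

```python
import numpy as np

def advanced_composition(k, epsilon_i, delta_i, delta_prime):
    # Hypothetical helper: total (epsilon, delta) for k-fold adaptive composition
    # of (epsilon_i, delta_i)-DP mechanisms, using the Dwork-Roth bound
    eps_total = (epsilon_i * np.sqrt(2 * k * np.log(1 / delta_prime))
                 + k * epsilon_i * (np.exp(epsilon_i) - 1))
    delta_total = k * delta_i + delta_prime
    return eps_total, delta_total

# Made-up parameters for illustration
k = 100
epsilon_i = 0.01

eps_adv, delta_adv = advanced_composition(k, epsilon_i, delta_i=1e-7, delta_prime=1e-5)
eps_seq = k * epsilon_i  # sequential composition, for comparison

print(f"Advanced composition:   epsilon = {eps_adv}, delta = {delta_adv}")
print(f"Sequential composition: epsilon = {eps_seq}")
```

For small $k$, the advanced bound can be looser than sequential composition (which gives $k\epsilon_i$), so it is worth comparing the two for the number of iterations you plan to run.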

In [31]:
def dp_gradient_descent(iterations, epsilon, delta):
    return your_code_here()

delta = 1e-5

for epsilon in [0.01, 0.1, 1.0]:
    theta = dp_gradient_descent(10, epsilon, delta)
    acc = accuracy(theta)
    print(f"Epsilon = {epsilon}, final accuracy: {acc}")

Epsilon = 0.01, final accuracy: 0.7335249889429456
Epsilon = 0.1, final accuracy: 0.7389429455992923
Epsilon = 1.0, final accuracy: 0.7674701459531181


## Question 2 (10 points)

In 2-5 sentences, argue that your implementation of `dp_gradient_descent` satisfies $(\epsilon, \delta)$-differential privacy.

## Question 3 (40 points)

Implement a function `zcdp_gradient_descent` that performs differentially private gradient descent by adding noise to the gradient at each iteration. Your function should take an additional argument $\rho$, and should have an **overall** privacy cost of $\rho$-zero-concentrated differential privacy (zCDP). You will also have to implement `gaussian_mech_zCDP_vec`, the vector-valued Gaussian mechanism for zCDP.

In [39]:
def gaussian_mech_zCDP_vec(vec, sensitivity, rho):
    return your_code_here()

def zcdp_gradient_descent(iterations, rho):
    return your_code_here()

for rho in [0.00001, 0.0001, 0.001]:
    theta = zcdp_gradient_descent(10, rho)
    acc = accuracy(theta)
    print(f"rho = {rho}, final accuracy: {acc}")

rho = 1e-05, final accuracy: 0.7838345864661654
rho = 0.0001, final accuracy: 0.7754312251216275
rho = 0.001, final accuracy: 0.7820654577620522


## Question 4 (20 points)

Implement a function `convert_zCDP_eps_delta` that converts a $\rho$-zCDP privacy bound to an $(\epsilon, \delta)$-differential privacy bound, given a target $\delta$.

In [40]:
def convert_zCDP_eps_delta(rho, delta):
    return your_code_here()

delta = 1e-5

for rho in [0.00001, 0.0001, 0.001]:
    theta = zcdp_gradient_descent(10, rho)
    acc = accuracy(theta)
    epsilon = convert_zCDP_eps_delta(rho, delta)
    print(f"rho = {rho}, epsilon = {epsilon}, final accuracy: {acc}")

rho = 1e-05, epsilon = 0.021469660262893472, final accuracy: 0.7472357363998231
rho = 0.0001, epsilon = 0.06796140424415112, final accuracy: 0.7778637770897833
rho = 0.001, epsilon = 0.21559660262893474, final accuracy: 0.7789694825298541
