# HW4A: The Local Model

### Bhaven Patel
### 4/16/2019

I worked with Anthony Rentsch, Lipika Ramaswamy, and Karina Huang on this homework.

My code can be found on my [Github](https://github.com/bhavenp/cs208/blob/master/homework/HW4/HW4_Bhaven_Patel.ipynb).

## Problem 1: Learning Conjunctions in the SQ Model

### Centralized Version of SQ Model

For the centralized version of the SQ model, I chose to calculate $p_j = P[x[j]=0 \,\wedge\, y=1]$ for $j=1,...,d$. Laplace noise is then added to each $p_j$ with scale $\dfrac{GS}{\tilde\epsilon}$, where $GS=\dfrac{1}{n}$ is the global sensitivity of the mean and $\tilde\epsilon= \dfrac{\epsilon}{d}$ is the per-query share of the total privacy-loss parameter $\epsilon$. Thus, every $p_j$ has a differentially private release $\hat p_j$:
$$
\hat p_j = p_j + Lap\left(\dfrac{1}{n\tilde\epsilon}\right) = p_j + Lap\left(\dfrac{d}{n\epsilon}\right)
$$
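
As a quick numeric check of the noise scale, here is a minimal sketch. The helper name `laplaceScale` and the value $n=1000$ are illustrative assumptions, not part of the assignment.

```r
# Hypothetical helper: Laplace scale for each of the d mean queries when the
# total budget totEpsilon is split evenly across them (basic composition).
laplaceScale <- function(n, d, totEpsilon) {
    sensitivity <- 1 / n           # GS of a mean of {0,1} indicators
    epsTilde <- totEpsilon / d     # per-query privacy-loss parameter
    sensitivity / epsTilde         # algebraically equal to d / (n * totEpsilon)
}

# Assumed example values: n = 1000 rows, d = 10 features, total epsilon = 0.5
scaleExample <- laplaceScale(n = 1000, d = 10, totEpsilon = 0.5)
print(scaleExample)
```

The scale grows linearly in $d$, so splitting the budget across more features makes each released probability noisier.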

Below are the helper functions we generally use.

In [1]:
rm(list=ls())		# Remove any objects in memory
set.seed(123)

# Random draw from Laplace distribution
#
# mu numeric, center of the distribution
# b numeric, spread
# size integer, number of draws
# 
# return Random draws from Laplace distribution
# example:
# 
# rlap(size=1000)

rlap = function(mu=0, b=1, size=1) {
    p <- runif(size) - 0.5
    draws <- mu - b * sgn(p) * log(1 - 2 * abs(p))
    return(draws)
}

# Sign function
# 
# Function to determine what the sign of the passed values should be.
#
# x numeric, value or vector of values
# return The sign of passed values
# example:
#
# sgn(rnorm(10))

sgn <- function(x) {
    return(ifelse(x < 0, -1, 1))
}
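
One way to sanity-check the sampler is to compare sample moments with the known Laplace moments (mean `mu`, variance `2 * b^2`). The sketch below repeats compact copies of the two helpers so that it runs on its own; the seed, the sample size, and `b = 2` are arbitrary choices.

```r
set.seed(1)

# Compact copies of the helpers above so this block is self-contained
sgn <- function(x) ifelse(x < 0, -1, 1)
rlap <- function(mu = 0, b = 1, size = 1) {
    p <- runif(size) - 0.5
    mu - b * sgn(p) * log(1 - 2 * abs(p))
}

draws <- rlap(mu = 0, b = 2, size = 1e5)
sampleMean <- mean(draws)   # should be close to mu = 0
sampleVar <- var(draws)     # should be close to 2 * b^2 = 8
print(c(mean = sampleMean, variance = sampleVar))
```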

In [18]:
##function to create the matrix that holds an indicator if x_j==0 & y==1
createConjunctionMat <- function(xData, yData){
    #create matrix to hold indicator if x_j==0 & y==1
    result_matrix = matrix(0, nrow=nrow(xData), ncol=ncol(xData));
    for(i in 1:nrow(xData)){
        if(yData[i] == 1){ #only need to consider row if y=1
            result_matrix[i, ] <- (xData[i, ] == 0); #check if x_j == 0
        }
    }
    return(result_matrix);
}

#function to calculate DP-releases for each probability
probRelease <- function(xMat, epsilon=1.0){
    probs <- colMeans(xMat); #calculate true probabilities
    sensitivity <- 1 / nrow(xMat); #sensitivity is 1/n
    scale <- sensitivity / epsilon;
    dpProbs <- probs + rlap(mu=0, b=scale, size=length(probs)); #add laplace noise to the true probabilities

    return(list(release=dpProbs, true=probs));
}

#function that ties together the different parts for doing a DP release of the probabilities for the centralized model
## xData: matrix of {0,1}
## yData: vector of {0,1}, same length as number of rows in xData
## epsilon: total privacy-loss parameter. This will get split up by the number of columns in xData that 
##          we must release probabilities for
## returns a vector of the indices corresponding to the columns of xData that predict yData well
centrlDP_SQAlg <- function(xData, yData, totEpsilon=1.0, threshold=1e-4){
    pMatrix <- createConjunctionMat(xData=xData, yData=yData); #create conjunction matrix
    print(totEpsilon/ncol(xData))
    dpRelease <- probRelease(pMatrix, epsilon = totEpsilon/ncol(xData) ); #get DP release of probabilities
#     cat(dpRelease$true);
#     cat(dpRelease$release);
    indices <- which(dpRelease$release < threshold); #get indices with probability less than threshold
    return(indices);
}
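
To see why small values of $p_j$ point to the variables in the conjunction, the sketch below builds synthetic data in which `y` is exactly `x1 AND x3` and computes the true (noise-free) $p_j$. The data, the seed, and the names `xSim`, `ySim`, `indicatorMat`, and `pTrue` are assumptions for illustration only. It also computes the indicator in one vectorized step instead of a row loop.

```r
set.seed(42)
n <- 2000; d <- 5

# Synthetic data (an assumption for illustration): y is the conjunction x1 AND x3
xSim <- matrix(rbinom(n * d, size = 1, prob = 0.5), nrow = n, ncol = d)
ySim <- as.integer(xSim[, 1] == 1 & xSim[, 3] == 1)

# Vectorized indicator of (x_j == 0 AND y == 1); the length-n vector is
# recycled down each column, so row i is multiplied by (ySim[i] == 1)
indicatorMat <- (xSim == 0) * (ySim == 1)
pTrue <- colMeans(indicatorMat)
print(round(pTrue, 3))
```

Columns 1 and 3 have $p_j = 0$ exactly, because `y = 1` forces both to be 1, while the remaining columns have $p_j$ near $1/8$. After noise is added, the released values for columns 1 and 3 are centered at 0, so they fall on either side of a small positive threshold.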

In [19]:
#generate matrix of indicators
# pMatrix <- createConjunctionMat(xData=mydata[,0:10], yData=mydata[['y']]);
# dpRelease <- probRelease(pMatrix, epsilon = 0.5);
# dpRelease$true
# dpRelease$release

#read in the test data
mydata <- read.csv('../../data/hw4testdata.csv');
# mydata[0:10, ]

#get set of features to use as predictors
centrlDP_SQAlg(xData = mydata[, 1:10], yData = mydata[['y']], totEpsilon = 0.5)

[1] 0.05


### Local Model

#### describe local model implementation

In [None]:
#randomized response for a single value x, which is expected to be one of the two entries of 'values'
localRelease <- function(x, values=c(-1,1), epsilon){
    draw <- runif(n=1, min=0, max=1);
    cutoff <- 1/(1+exp(epsilon));
    if(draw < cutoff){ #flip our value with probability 1/(1+exp(epsilon))
        return(values[ !values %in% x]); #(!values %in% x) selects the entry of 'values' that differs from 'x'
    }else{ #draw >= cutoff, so we return the true value
        return(x);
    }
}

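#rescale randomized-response releases in {-1,1} so that their mean is an unbiased estimate of the true mean
correction <- function(release, epsilon){
    inflation <- (exp(epsilon) + 1)/(exp(epsilon) - 1) #inverse of the shrinkage (e^eps - 1)/(e^eps + 1) caused by flipping
    expectation <- mean(release * inflation)
    return(expectation)
}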
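
The inflation factor works because a value in $\{-1,1\}$ that is flipped with probability $\frac{1}{1+e^\epsilon}$ has expectation equal to the true value times $\frac{e^\epsilon-1}{e^\epsilon+1}$. A minimal simulation of this, using a vectorized flip rather than calling `localRelease` row by row (the data, seed, and variable names are assumptions for illustration):

```r
set.seed(7)
epsilon <- 1

# Assumed private data: values in {-1, 1} with true mean near 0.4
truth <- sample(c(-1, 1), size = 20000, replace = TRUE, prob = c(0.3, 0.7))

# Randomized response: flip each value with probability 1/(1 + exp(epsilon))
flip <- runif(length(truth)) < 1 / (1 + exp(epsilon))
released <- ifelse(flip, -truth, truth)

# Same inflation factor as in correction()
inflation <- (exp(epsilon) + 1) / (exp(epsilon) - 1)
estimate <- mean(released * inflation)
print(c(true = mean(truth), raw = mean(released), corrected = estimate))
```

The raw mean of the releases is shrunk toward 0, and the corrected estimate should land near the true mean.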





#function that ties together the different parts for doing a DP release of the probabilities for a local model
## xData: matrix of {0,1}
## yData: vector of {0,1}, same length as number of rows in xData
## epsilon: total privacy-loss parameter. This will get split up by the number of columns in xData that 
##          we must release probabilities for
## returns a vector of the indices corresponding to the columns of xData that predict yData well
localDP_SQAlg <- function(xData, yData, totEpsilon=1.0, threshold=1e-4){
    pMatrix <- createConjunctionMat(xData=xData, yData=yData); #create conjunction matrix

    dpRelease <- probRelease(pMatrix, epsilon = totEpsilon/ncol(xData) ); #get DP release of probabilities
#     cat(dpRelease$true);
#     cat(dpRelease$release);
    indices <- which(dpRelease$release < threshold); #get indices with probability less than threshold
    return(indices);
}
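
As written, `localDP_SQAlg` still calls the centralized `probRelease`. One way the release step could be moved to the local model is sketched below: each entry of the indicator matrix is mapped to $\{-1,1\}$, perturbed by randomized response with $\tilde\epsilon=\epsilon/d$, and the column means are then corrected and mapped back to probabilities. This is a hypothetical sketch under those assumptions, not the finished implementation; `localSQSketch` and the synthetic data are illustrative names and values.

```r
set.seed(11)

# Hypothetical sketch of a local-model release; all names here are illustrative
localSQSketch <- function(xData, yData, totEpsilon = 1.0) {
    xData <- as.matrix(xData)
    epsTilde <- totEpsilon / ncol(xData)          # per-column share of the budget
    indicators <- (xData == 0) * (yData == 1)     # n x d matrix of {0,1}
    signed <- 2 * indicators - 1                  # map {0,1} to {-1,1}
    # Each user flips each of their own entries with probability 1/(1+exp(epsTilde))
    flip <- matrix(runif(length(signed)) < 1 / (1 + exp(epsTilde)),
                   nrow = nrow(signed))
    released <- ifelse(flip, -signed, signed)
    inflation <- (exp(epsTilde) + 1) / (exp(epsTilde) - 1)
    meanEst <- colMeans(released * inflation)     # unbiased for the mean on {-1,1}
    (meanEst + 1) / 2                             # back to the probability scale
}

# Synthetic check (assumed data): y = x1 AND x3, with a deliberately large
# epsilon so that almost no entries are flipped
n <- 2000; d <- 5
xSim <- matrix(rbinom(n * d, size = 1, prob = 0.5), nrow = n)
ySim <- as.integer(xSim[, 1] == 1 & xSim[, 3] == 1)
pTrueLocal <- colMeans((xSim == 0) * (ySim == 1))
pHatLocal <- localSQSketch(xSim, ySim, totEpsilon = 50)
print(round(cbind(true = pTrueLocal, estimate = pHatLocal), 3))
```

With a realistic budget such as $\tilde\epsilon = 0.5/10 = 0.05$, the inflation factor is roughly 40, so the local estimates are far noisier than the centralized ones unless $n$ is very large.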

In [20]:
matrix(0, nrow = 2, ncol = 2)

     [,1] [,2]
[1,]    0    0
[2,]    0    0
