# Final project for Project 2 in IN-STK5000
## Alva Hørlyk, Tobias Opsahl and Ece Cetinoglu

This notebook is contains what will be the final version of Project 2. Since we have changed a bit under the way, the parts for the 1st and 2nd deadline is a bit different, and we therefore deliver the whole project in this notebook. We go through the exercises starting on the 1st deadline, and ending on the last one. 

# General information

First of all, let us introduce our policy. We will look at treaments, and not vaccines.

## Utility
To calculate the utility, we look at each person, their symptoms, treaments and outcomes. For each person we calculate the reward, and the utility is simply the sum of the rewards. The reward is calculated by looking at each symptom. 
If the symptom was there prior to the treatment given, and not afterwards, a positive weight is added. If the symptom was not there before the treatment, but was there afterwards, a negative weight is added, times a penalty. Finally, if any treatment was used (it is not the case that no treatments where used), a treatment-cost is subtracted from the reward.

## Policy overview

We will provide a short overview of the policy here. The actual code should be pretty well documented, so you can find more details in the doc-strings. 

The policy takes the amount of treatments as an argument in the **constructor**, which we assume to be 3 for compatibility with the **simulator.py**. The method **observe()** takes in a poluation (features), actions and outcomes, and fit the models used to predict the outcomes. Note that one would need the population, the actions and the outcomes from treating the poluation with the actions (by using **population.treat()**), before calling the method, since they are required as arguments. To chose the actions for the initial call, one can use **initialize_data()**, to pick actions randomly, and getting the corresponding outcomes. For each treatment, for each symptom, we fit a logistic regression model. Given a treatment and a symtpom, the model takes in some columns from the features as input, and returns the probability for that symptom to be present after the given treatment. 
The method **feature_select()** will chose the columns we want to fit the model on. This is the given symptom, age, income, gender, and the comorbidities. We do not select the other symptoms, because we assume they are not realated. We do not use the genes, becuase after our analysis they seem to be unrealated to the response, and adds a lot of noise.
**get_reward()** calculates the individual reward for each person in features, and **get_utility()** is the sum of the rewards. This was explained in the introductions and there are more details in the code. 
**get_action()** will chose a suitable treatment for each person in features. Note that **observe()** should have been called first. We use the fitted models to predict the probability of symptoms after a treatment, for each treatment. We then use the probability for symptoms as a response to **get_reward()**, to calculate the expected reward. For each person, we chose the treatment with the highest expected reward. If all of the treatments gives negative expected rewards, we chose no treatment. 

The help functions **add_feature_names()**, **add_action_names()** and **add_outcome_names()** sumply converts features, actions and outcomes to a pandas dataframe, with suitable column-names. 

**treatments_given()** checks how many persons recieved at least one treatment. 

We also make a class **ZeroModel**, to only predict zero if the response is only zero. This is because **LogisticRegression** does not work when all of the responses are zero.

Let's do the imports. Note that we renamed aux.py to aux_file.py to work on windows.

In [5]:
import numpy as np
import pandas as pd
from aux_file import symptom_names
import simulator
from IPython import embed
from sklearn.linear_model import LogisticRegression
from sklearn import preprocessing
np.warnings.filterwarnings('ignore', category=np.VisibleDeprecationWarning)

Finally, let's look at the actual code for the policy. We also include the RandomPolicy, since it is used in the initialization of the data.

In [6]:
class Policy:
    def __init__(self, n_actions, action_set, threshold=0.5):
        """ 
        In:
            n_actions (int): the number of actions
            action_set (list): the set of actions
            threshold (float): Threshold for caterogizing from logistic regression
        """
        self.n_actions = n_actions
        self.action_set = action_set
        self.threshold = threshold
    
    def initialize_data(self, n_population):
        """
        This function should be called before using the rest of the methods.
        This function simply makes a RandomPolicy of the same length as the
        population will be. This data is then used to fit the models later. 
        In: 
            n_population (int): Size of population, the amount of persons.
        Out:
            features (np.array): The population, generated by simulator.py.
            actions (np.array): The actions chosen by RandomPolicy on features.
            outcomes (np.array): The outcomes when features are treated with actions.
        These are also stored as class variables. 
        """
        population = simulator.Population(128, 3, 3)
        treatment_policy = RandomPolicy(3, list(range(3))) 
        self.n_population = n_population
        self.features = population.generate(self.n_population)
        self.actions = treatment_policy.get_action(self.features)
        self.outcomes = population.treat(list(range(n_population)), self.actions)
        return self.features, self.actions, self.outcomes
    
    def feature_select(self, X, symptom_index=1):
        """
        Chooses some columns in X.
        0 Covid-Recovered
        1 Covid-Positive
        2 No-Taste/Smell
        3 Fever
        4 Headache
        5 Pneumonia
        6 Stomach
        7 Myocarditis
        8 Blood-Clots
        9 Death
        10 Age
        11 Gender
        12 Income
        141 Asthma
        142 Obesity
        143 Smoking
        144 Diabetes
        145 Heart disease
        146 Hypertension
        """
        N = X[:, [symptom_index, 10, 11, 12, 141, 142, 143, 144, 143, 144, 145, 146]]
        return N
        
    def get_reward(self, features, actions, outcome, penalty=1.5, treatment_cost=0.1):
        """
        This method calculates the reward, which in a way is the utility of 
        a single indidual, for each row in the arguments.
        We calculate this by the following:
            For each person, if that person experience some symptom before the,
        treatment, but not after, a positive weight is added to the reward. If
        they did not experience any symptoms before, but do after the action, 
        then a negative weight is added, times a penalty (so new sick 
        individuals can be punished harder than curing sick persons is rewarded).
        The weight corresponds to the severity of the symptoms. 
        Finally, if we apply any action at all, a treatment_cost is subtracted, 
        representing the cost of a treatment. 
        
        In:
            features (np.array): The population.
            actions (np.array): The actions.
            outcome (np.array): The outcome when population is treated with outcome.
            penalty (float): How much worse it is to give a new symptom, than 
                to cure one. If 0, the reward overlooks if a person gets a new
                symptom. When penalty -> \infty, the reward only prevents new
                symptoms, and do not care about the cured patients.
            treatment_cost (float): The cost of a treatment. If one of the 
                columns in actions is non-zero, treatment_cost is subtracted
                from the reward. If not, then no treatment is used, and 
                we do not subtract it. 
        Out:
            rewards (np.array): Array of rewards, corresponding to the persons
                in features (and actions and outcome).
                
        """
        rewards = np.zeros(len(outcome))
        weights = [0, 0.2, 0.1, 0.1, 0.1, 0.5, 0.2, 0.5, 1.0, 100.0]
        threshold = self.threshold
        for t in range(len(features)):
            utility = 0
            for i in range(1, len(weights)): # i loops over the sypmtom indecies
                if features[t, i] == 1 and outcome[t, i-1] < threshold:
                    utility += weights[i]
                if features[t, i] == 0 and outcome[t, i-1] >= threshold:
                    utility -= weights[i] * penalty
            if (np.sum(actions[t, :]) > 0): # Some action were used
                utility = utility - treatment_cost # The treatment is not free
            rewards[t] = utility 
        return rewards
        
    def observe(self, features, actions, outcomes):
        """
        This functions takes in a population, the action used on them, and the
        outcome from doing so. Then it will update the model. 
        The model is updated accordingly:
            For each treatment (which should be 3), for each symptoms (which is
        9, Covid-Recovered is overlooked), a logistic regression method is
        fitted. The models are stored as class variable. If all of the responses
        are 0, then a model constantly predicting 0 is used. The models are
        fitted for each action for each symptoms with the features and action
        as input, and the post_symptom (symptom in outcomes) as response. 
        IN OTHER WORDS: Each model predicts wether a certain treatment will 
        for a certain symptom will continue to be there, after the treatment.
            NOTE: To start the method, one could get the inputs by calling
        initialize_data(), to get a random starting point to fit the data on.
        
        In: 
            features (t*|X| array): The population
            actions (t*|A| array): The actions the population is treated with
            outcomes (t*|Y| array): The outcomes from treating the population
                with the actions.
        """
        self.features = features
        self.actions = actions
        self.outcomes = outcomes
        symptom_indecies = [1, 2, 3, 4, 5, 6, 7, 8, 9] # Indecies for symptoms
        models = []
        for treatment in range(self.n_actions): # for each treatment i
            indecies = self.actions[:, treatment] == 1 # treament i is used
            for symptom_index in symptom_indecies: # for each symptom
                feat = self.features[indecies]
                out = self.outcomes[indecies]
                x_data = self.feature_select(feat, symptom_index)
                y_data = out[:, symptom_index]
                logistic_model = LogisticRegression()
                scaler = preprocessing.StandardScaler().fit(x_data)
                x_scaled = scaler.transform(x_data)
                if sum(y_data) != 0: 
                    logistic_model.fit(x_scaled, y_data)
                    # model = logistic_model.fit(x_scaled, y_data)
                else: # If all y_data is 0, we just predict 0. LogisticRegression would crash
                    logistic_model = ZeroModel("Name")
                models.append(logistic_model)
        self.models1 = models[:9] # Treatments / n_actions should be 3
        self.models2 = models[9:18]
        self.models3 = models[18:]
        
    def get_utility(self, features, actions, outcome, penalty=1.5, treatment_cost=0.1):
        """ 
        Return the empirical utility for a population. This is defined by the
        sum of the rewards, so se Policy.get_reward() for explaination of how
        we define our utility.
        
        Args:
            features (t*|X| array)
            actions (t*|A| array)
            outcomes (t*|Y| array)
            penalty (float): penalty for introducing new symptoms.
            treatment_cost (float): Cost for using a treatment
        Out:
            utility (float): Empirical utility of the policy on this data.
        """
        utility = sum(self.get_reward(features, actions, outcome, penalty, treatment_cost))
        return utility
        
    def get_action(self, features):
        """
        Get actions for one or more people. observe() should already have been
        called, so the model is fitted. 
        The actions are chosen as follows:
            For each person in the dataset, we use the models to extimate the 
        probability for a treatment introducing or removing a symptom, for
        each symptom. Then we have a the probability for the post_symptoms, 
        which behaves as the "expected response". For each person, we then
        see which treatment gives the largest expected reward. If all the 
        expected rewards is below zero, we do not treat them, if at least one
        is positive, we chose the one that is largest.
        In: 
            features (t*|X| array): Population to be found an action on
        Out: 
            actions (t*|A| array): Actions chosen
        """
        symptom_indecies = [1, 2, 3, 4, 5, 6, 7, 8, 9]
        post_symptoms1 = np.zeros((len(features), len(symptom_indecies)))
        post_symptoms2 = np.zeros((len(features), len(symptom_indecies)))
        post_symptoms3 = np.zeros((len(features), len(symptom_indecies)))
        
        for symptom_index in symptom_indecies: 
            x_data = self.feature_select(features, symptom_index)
            scaler = preprocessing.StandardScaler().fit(x_data)
            x_scaled = scaler.transform(x_data)
            pred1 = self.models1[symptom_index - 1].predict_proba(x_scaled)[:, 1]
            pred2 = self.models2[symptom_index - 1].predict_proba(x_scaled)[:, 1]
            pred3 = self.models3[symptom_index - 1].predict_proba(x_scaled)[:, 1]
            post_symptoms1[:, symptom_index-1] = pred1
            post_symptoms2[:, symptom_index-1] = pred2
            post_symptoms3[:, symptom_index-1] = pred3
        
        mock_actions = np.ones((self.n_population, 3)) # Represent an actions has been done
        rewards1 = self.get_reward(features, mock_actions, post_symptoms1)
        rewards2 = self.get_reward(features, mock_actions, post_symptoms2)
        rewards3 = self.get_reward(features, mock_actions, post_symptoms3)
        
        actions = np.zeros([n_population, self.n_actions]) # Initialize
        for t in range(n_population):
            # print(f"1: {pred1[t]} 2: {pred2[t]} 3: {pred3[t]}")
            if np.max(np.asarray([rewards1[t], rewards2[t], rewards3[t]])) < 0:
                # All the treatments have expected utility less than zero
                actions[t, 0] = 0 # Do nothing
            # If at least one expected reward is bigger than 0, we chose the biggest
            elif rewards1[t] >= rewards2[t] and rewards1[t] >= rewards3[t]:
                actions[t, 0] = 1
            elif rewards2[t] >= rewards1[t] and rewards2[t] >= rewards3[t]:
                actions[t, 1] = 1
            elif rewards3[t] >= rewards1[t] and rewards3[t] >= rewards2[t]:
                actions[t, 2] = 1
        # embed()
        return actions
    
    def get_arguments(self):
        return self.features, self.actions, self.outcomes

class RandomPolicy(Policy):
    """ This is a purely random policy!"""

    def get_utility(self, features, action, outcome):
        """Here the utiliy is defined in terms of the outcomes obtained only, ignoring both the treatment and the previous condition.
        """
        actions = self.get_action(features)
        utility = 0
        utility -= 0.2 * sum(outcome[:,symptom_names['Covid-Positive']])
        utility -= 0.1 * sum(outcome[:,symptom_names['Taste']])
        utility -= 0.1 * sum(outcome[:,symptom_names['Fever']])
        utility -= 0.1 * sum(outcome[:,symptom_names['Headache']])
        utility -= 0.5 * sum(outcome[:,symptom_names['Pneumonia']])
        utility -= 0.2 * sum(outcome[:,symptom_names['Stomach']])
        utility -= 0.5 * sum(outcome[:,symptom_names['Myocarditis']])
        utility -= 1.0 * sum(outcome[:,symptom_names['Blood-Clots']])
        utility -= 100.0 * sum(outcome[:,symptom_names['Death']])
        return utility
    
    def get_action(self, features):
        """Get a completely random set of actions, but only one for each individual.
        If there is more than one individual, feature has dimensions t*x matrix, otherwise it is an x-size array.
        
        It assumes a finite set of actions.
        Returns:
        A t*|A| array of actions
        """

        n_people = features.shape[0]
        ##print("Acting for ", n_people, "people");
        actions = np.zeros([n_people, self.n_actions])
        for t in range(features.shape[0]):
            action = np.random.choice(self.action_set)
            if (action >= 0):
                actions[t,action] = 1
            
        return actions

Here are the help functions:

In [7]:
def add_feature_names(X):
    """
    Convert a population / features / X to a pandas dataframe with suitable names.
    """
    features_data = pd.DataFrame(X)
    features = []
    features += ["Covid-Recovered", "Covid-Positive", "No-Taste/Smell", "Fever", 
                 "Headache", "Pneumonia", "Stomach", "Myocarditis", 
                 "Blood-Clots", "Death"]
    features += ["Age", "Gender", "Income"]
    features += ["Genome" + str(i) for i in range(1, 129)]
    features += ["Asthma", "Obesity", "Smoking", "Diabetes", 
                 "Heart disease", "Hypertension"]
    features += ["Vaccination status" + str(i) for i in range(1, 4)]
    features_data.columns = features
    return features_data
    
def add_action_names(actions):
    """
    Convert np.array of actions to a pandas dataframe with suitable names.
    """
    df = pd.DataFrame(actions)
    names = ["Treatment" + str(i) for i in range(1, actions.shape[1] + 1)]
    df.columns = names
    return df

def add_outcome_names(outcomes):
    """
    Convert a np.array of outcomes / post_symptoms to a pandas dataframe with
    suitable names. 
    """
    df = pd.DataFrame(outcomes)
    columns = ["Covid-Recovered", "Covid-Positive", "No-Taste/Smell", "Fever", 
                  "Headache", "Pneumonia", "Stomach", "Myocarditis", 
                  "Blood-Clots", "Death"]
    for i in range(len(columns)):
        columns[i] = "Post_" + columns[i]
    df.columns = columns
    return df

def treatments_given(actions):
    """
    Given a set of actions, this function will return how many the patient who
    recieved at least one action, in other words, do not have 0 for every
    action column.
    """
    actions = np.asmatrix(actions)
    s = 0
    for i in range(len(actions)):
        if np.sum(actions[i, :]) > 0:
            s += 1
    return s

class ZeroModel:
    """
    This class is simply made for always prediction 0. LogisticRegression
    does not work for only 0 inputs in the response, so we do this instead.
    """
    def __init__(self, name="name"):
        self.name = name
    
    def predict(self, array):
        """
        0 or 1 predictions, which should always be 0.
        """
        return np.zeros(len(array))
    
    def predict_proba(self, array):
        """
        Probability predictions for "yes" and "no" for each input, 0 and 1.
        """
        prob = np.zeros((len(array), 2))
        prob[:, 1] = np.ones(len(array))
        return prob

# 1st Deadline

We will now explore ways to add privacy to the data and the models

## a)
One simple way to protect the private information of the individuals is to just hide direct identifiers. However this is generally insufficient as attackers may have other identifying information. This information, combined with the information in the database, can reveal identities. 
<br>
Another method is k-anonymization, where k-1 people are indistinguishable from each other (with respect to quasi-identifiers) in the database. Columns with personal information, like name and date of birth are removed, and the rest of the information is generalized. For instance can a variable like age be categorical with different age-groups. Even though k-anonymization is an improvement from simply removing direct identifiers, an attacker with enough imformation can still infer something about the individuals.
<br>
If we assume that an attacker can have a lot of side-information, it is better to use differential privacy. For instance, we can use the Laplace mechanism, where Laplace distributed noise is added in the model. How much noise we add determines how private the result is. We can randomly chose a fraction of the data and add noise to it. This way, even if the data was publicly available, one would not be certain if the it really was true. The goal would obviously add a fraction of noise that makes the data private enough, but do not lower the predictions significantly. 
<br>
In this task the policy is released and can be used by the public. Then the data have to be anonymized before $\pi(a|x)$ is obtained. This can be done using a local privacy model, where independent Laplace noise $\omega_{i}$ is added to each individual. We have $y_{i}=x_{i}+\omega_{i}$ and use it to get $a=n^{-1}\sum_{i=1}^{n}y_{i}$.

## b) 
Here we assume that the analysts can be trusted with private information, so only the result made available for the public have to be privatized. Then we can use a centralized privacy model. We obtain $\pi(a|x)$ with $a=n^{-1}\sum_{i=1}^{n}x_{i}+\omega$. We do not need to privatize the data, just a bit of the decisions of the model we fitted on it. 

In other words, we can add noise to the actions after fitting the model. Without changing any of the observations in the population, we can fit a model that decides actions, and then add a bit of noise to the actions. This is so it one could not figure out personal data based on the action we chose, which might happen if the model picks ups a simple pattern in the data. 

For the actual implementation, we use both approaches. First, we add noise to the data itself and fit a model. Then we try to fit a model first, then add noise to the results. 
For the first approach, we implement functions for adding noise to our data. For the binary data, the function randomize() choses a ratio "1-theta" from a column, and changes it with a coinflip (50-50 chance of 0 and 1). For the continious variables, we replace the coinflip with the same data pluss an addition drawn from the Laplace distribution.

We then loop over all the desired columns in our population and add noise one by one. The desired columns are Age, Income, Gender and the co-morbidities. We do not want to change the pre- or post-sympoms, because this will drasticly change the utility. For example, let's say that our noise happend to remove a lot of the positive post-symptoms. Then this would artificially make our model look way better than it actually would be. The utility is calculated directly on the pre-symptom, action and post-symptom, so we do not add noise here. The model is then fitted on the privatized data. 

For the second approach, we simply fit the model on the data, then sends the outcome columns to the functions to add noise. We will do this two ways. First, we simply shuffle a persons actions with a "1-theta" probability. This may change the treatment a treated person is given, but it will not change a non-treated person, since all of their action-columns will correspond to 0. Therefore, we secondly draw 1, 2, 3 or no-treatment, with a 1-theta probability, for each person. 

## c) 

Let us now try to implement a policy, and see how the utility is affected by the privacy.

In [8]:
def privatize_actions_shuffle(A, theta):
    """
    Adds noise to the actions chosen by the model. This is done by shuffling 
    the actions given to each person, with a probability 1 - theta. 
    """
    A1 = A.copy()
    coins = np.random.choice([True, False], p=(theta, (1-theta)), size=A.shape[0])
    for i in range(A1.shape[0]):
        if not coins[i]:
            np.random.shuffle(A1[i, :])
    return A1

def privatize_actions_draw(A, theta):
    """
    Adds noise to the actions chosen by the model. This is done by choicing a 
    new action with a probability 1-theta.
    """
    A1 = A.copy()
    coins = np.random.choice([True, False], p=(theta, (1-theta)), size=A.shape[0])
    for i in range(A1.shape[0]):
        if not coins[i]:
            A1[i, :] = np.zeros(A.shape[1])
            coin = np.random.randint(A.shape[1]+1)
            if coin != A.shape[1]:
                A1[i, coin] = 1
    return A1
    
def randomize(a, theta):
    """
    Randomize a single column. Simply add a coin-toss to 1- theta amount of the data
    """
    coins = np.random.choice([True, False], p=(theta, (1-theta)), size=a.size)
    noise = np.random.choice([0, 1], size=a.size)
    response = np.array(a)
    response[~coins] = noise[~coins]
    return response 
    
def randomize_cont(a, theta, decay=1):
    coins = np.random.choice([True, False], p=(theta, (1-theta)), size=a.size)
    noise = np.random.laplace(0, decay, a.size)
    response = np.array(a)
    response[~coins] = response[~coins] + noise[~coins]
    return response
    
def privatize(X, theta):
    """
    Adds noice to the data, column by column. The continious and discreet 
    columns are treated differently. 
    """
    df = add_feature_names(X).copy()
    df["Age"] = randomize_cont(df["Age"], theta)
    df["Income"] = randomize_cont(df["Income"], theta)
    columns = ["Gender", "Asthma", "Obesity", "Smoking", 
               "Diabetes", "Heart disease", "Hypertension"]
    for column in columns:
        df[column] = randomize(df[column], theta)
    symptoms = ["Covid-Recovered", "Covid-Positive", "No-Taste/Smell", "Fever", 
                "Headache", "Pneumonia", "Stomach", "Myocarditis", "Blood-Clots", "Death"]
    for columns in symptoms: # Shuffle symptoms
        df[column] = randomize(df[column], theta)
    return np.asarray(df)

## d)
Now lets test:

In [19]:
np.random.seed(57)
n_genes = 128
n_vaccines = 3
n_treatments = 3
n_population = 10000
population = simulator.Population(n_genes, n_vaccines, n_treatments)
np.random.seed(57)
X = population.generate(n_population) # Population
treatment_policy = Policy(n_treatments, list(range(n_treatments)))
np.random.seed(57)
features, actions, outcomes = treatment_policy.initialize_data(n_population)
treatment_policy.observe(features, actions, outcomes)
A = treatment_policy.get_action(X) # Actions 
U = population.treat(list(range(n_population)), A)

In [20]:
thetas = [1, 0.99, 0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0]
utility_list1 = np.zeros(len(thetas) + 1)
utility_list2 = np.zeros(len(thetas) + 1)
utility_list3 = np.zeros(len(thetas) + 1)
utility_list1[0] = treatment_policy.get_utility(X, A, U)
utility_list2[0] = treatment_policy.get_utility(X, A, U)
utility_list3[0] = treatment_policy.get_utility(X, A, U)
for i in range(len(thetas)):
    np.random.seed(57)
    X_noise = privatize(X, thetas[i])
    A_noise1 = treatment_policy.get_action(X_noise)
    A_noise2 = privatize_actions_shuffle(A, thetas[i])
    A_noise3 = privatize_actions_draw(A, thetas[i])

    U1 = population.treat(list(range(n_population)), A_noise1)
    U2 = population.treat(list(range(n_population)), A_noise2)
    U3 = population.treat(list(range(n_population)), A_noise3)

    utility_list1[i+1] = treatment_policy.get_utility(X, A_noise1, U1)
    utility_list2[i+1] = treatment_policy.get_utility(X, A_noise2, U2)
    utility_list3[i+1] = treatment_policy.get_utility(X, A_noise3, U3)


In [21]:
utility_list1

array([2311.7 , 2208.  , 2813.15, 1600.45, 2493.5 , 1699.65, 1997.5 ,
       1742.2 , 2550.55, 1946.  , 1848.7 , 1964.15, 2000.75, 2014.8 ])

In [22]:
utility_list2

array([2311.7 ,  803.8 , 2399.75, 2400.05, 2667.4 , 2086.6 , 1676.65,
       1918.4 ,  978.3 ,  130.75, -384.2 ,  329.9 , -507.65,  -85.8 ])

In [23]:
utility_list3

array([ 2311.7 ,  2208.85,  2484.55,  2210.3 ,  2238.1 ,   538.85,
         859.1 ,   -52.05,   411.1 , -2216.  , -2847.  , -2859.8 ,
       -2346.85, -3889.5 ])