# IEMS 351 HW 5
## Solving logistic regression problems using gradient methods 
Please put iems351_hw5_ex1.csv, iems351_report_hw5.ipynb, and iems351_tools_hw5.py in the same folder.

Please do the following to finish HW 5:

(1) Finish the implementation of gradient methods in iems351_tools_hw5.py

Note: The update rule of the gradient method with the diminishing step size rule is
$$
x^{k+1} = x^k - \frac{\alpha}{k+1} \nabla f(x^k) \quad k = 0,1,2,\ldots
$$
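
As a rough illustration of this update rule only (it is not the required implementation in the tools file), the sketch below applies it to a simple quadratic. The helper name `diminishing_step_gradient` and the toy objective are assumptions made for this example.

```python
import numpy as np


def diminishing_step_gradient(grad, x0, alpha, max_iteration):
    """Hypothetical helper: x^{k+1} = x^k - alpha / (k + 1) * grad(x^k)."""
    x = np.asarray(x0, dtype=float).copy()
    for k in range(max_iteration):
        # step size shrinks as alpha, alpha/2, alpha/3, ...
        x = x - alpha / (k + 1) * grad(x)
    return x


# Toy objective f(x) = 0.5 * ||x - c||^2 with gradient x - c (assumed for illustration)
c = np.array([1.0, -2.0])
x_final = diminishing_step_gradient(lambda x: x - c, np.zeros(2), alpha=1.0, max_iteration=100)
print(x_final)
```

With `alpha=1.0` the first step on this particular quadratic already lands on the minimizer `c`; for other objectives the progress per iteration depends strongly on the choice of `alpha`.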

(2) Finish the exercises in iems351_report_hw5.ipynb 

(3) Save your finished iems351_report_hw5.ipynb with all the outputs as a .HTML file  

(4) Submit iems351_tools_hw5.py, iems351_report_hw5.ipynb, and the .HTML file to Canvas 

In [None]:
# import packages
import numpy as np
import pandas as pd
from iems351_tools_hw5 import gradient_method

## Exercise 1: Finish the implementation of logistic regression models
## The logistic regression problem with labels ($+1$ or $-1$) is formulated as follows:
$$
\boxed{\min_{x} \sum_{i=1}^N \log \left(1 + \exp\left(-y^{(i)} \cdot x^\top z^{(i)}\right) \right)}
$$
Requirement 1.1: Please do not use loops to compute the objective function value or the gradient. Instead, use numpy matrix multiplication directly.
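
For reference, differentiating the objective term by term gives a gradient that can be written as a single matrix product. The notation here is an assumption introduced for this note: $\sigma(t) = 1/(1+e^{-t})$ is the sigmoid, $Z$ is the $N \times n$ matrix whose $i$-th row is $z^{(i)\top}$, $y$ is the vector of labels, and $\odot$ denotes the elementwise product.
$$
\nabla f(x) = -\sum_{i=1}^N y^{(i)} \, \sigma\left(-y^{(i)} \cdot x^\top z^{(i)}\right) z^{(i)} = Z^\top \left( -y \odot \sigma\left(-y \odot (Zx)\right) \right)
$$
The first equality uses $\frac{d}{dt} \log(1 + e^{-t}) = -\sigma(-t)$ together with the chain rule.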

In [None]:
# build a logistic regression model


# sigmoid function
def sigmoid(t):
    """
    :param t: float or a numpy array

    :return: float or a numpy array
    """
    return 1 / (1 + np.exp(-t))


# objective function
def logistic_obj(x, Z, y):
    """
    :param x: numpy array (n,)
    :param Z: numpy array (N,n)
    :param y: numpy array (N,)

    :return: float
    """
    linear_term = y * (Z @ x)
    return np.sum(np.log(1 + np.exp(-linear_term)))  # obj func


# gradient function
def logistic_grad(x, Z, y):
    """
    :param x: numpy array (n,)  
    :param Z: numpy array (N,n)
    :param y: numpy array (N,)

    :return: numpy array (n,)
    """
    linear_term = y * (Z @ x)
    gradient = Z.T @ (-y * sigmoid(-linear_term))
    return gradient


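def logistic_classification_model(x, z, threshold=0.5):
    """Return the predicted label (+1 or -1) for a single feature vector z of shape (n,)."""
    prob = sigmoid(x @ z)
    if prob > threshold:
        return 1
    else:
        return -1


def logistic_prob(x, Z):
    """Return the predicted probability of label +1 for each row of Z, shape (N,)."""
    prob = sigmoid(Z @ x)
    return prob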


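def prediction_accuracy(x, Z, y, threshold=0.5):
    """
    :param x: numpy array (n,)
    :param Z: numpy array (N,n)
    :param y: numpy array (N,) with labels +1 or -1
    :param threshold: float, probability cutoff for predicting +1

    :return: float, fraction of correctly classified samples
    """
    # predicted labels: +1 if the predicted probability exceeds the threshold, else -1
    predict_labels = np.where(sigmoid(Z @ x) > threshold, 1, -1)
    # a prediction is correct when it has the same sign as the true label
    return float(np.mean(predict_labels * y > 0))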


logistic_model = {
    "classification_model": logistic_classification_model,
    "prob": logistic_prob,
    "obj": logistic_obj,
    "grad": logistic_grad,
    "prediction_accuracy": prediction_accuracy,
}

## Exercise 2: Run the following toy example
Requirement 2.1: The accuracy of the logistic regression model trained by either gradient method should be above 0.75.

In [None]:
ex1_file_name = "iems351_hw5_ex1.csv"
ex1_df = pd.read_csv(ex1_file_name)
y_train = ex1_df["y"].to_numpy()
Z_train = ex1_df[["z1", "z2"]].to_numpy()
# Set up algorithmic parameters
alg_param_const = {
    "alpha": 1e-2,
    "freq_print_obj": 50,
    "freq_print_accuracy": 100,
    "max_iteration": 2000,
    "flag_constant_step_size": True,
    "flag_diminishing_step_size": False,
}

alg_param_diminishing = {
    "alpha": 1,
    "freq_print_obj": 100,
    "freq_print_accuracy": 200,
    "max_iteration": 4000,
    "flag_constant_step_size": False,
    "flag_diminishing_step_size": True,
}

In [None]:
# train the model using a gradient method with constant step size
x_init = np.zeros(2)
x_train = gradient_method(x_init, Z_train, y_train, logistic_model, alg_param_const)

In [None]:
# train the model using a gradient method with diminishing step size
x_train = gradient_method(x_init, Z_train, y_train, logistic_model, alg_param_diminishing)

## Exercise 3: Find a proper step size parameter for the spam email example
Hint: The step size must be very small in this example; otherwise, the objective function value may become inf.
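
One way an inf can arise is overflow in `np.exp` once the margins $y^{(i)} \cdot x^\top z^{(i)}$ become large and negative. The sketch below shows one possible numerically stable evaluation of the same objective using `np.logaddexp`. The function name `stable_logistic_obj` and the demo data are hypothetical and not part of the provided files.

```python
import numpy as np


def stable_logistic_obj(x, Z, y):
    """Hypothetical variant of the objective: sum_i log(1 + exp(-y_i * z_i^T x))."""
    margins = y * (Z @ x)
    # logaddexp(0, -m) computes log(exp(0) + exp(-m)) without overflowing for large |m|
    return np.sum(np.logaddexp(0.0, -margins))


# Tiny made-up data set with one very negative margin
Z_demo = np.array([[1000.0, 0.0], [0.0, 1.0]])
y_demo = np.array([-1.0, 1.0])
x_demo = np.array([1.0, 1.0])
print(stable_logistic_obj(x_demo, Z_demo, y_demo))
```

This only keeps the evaluation finite. It does not stop the iterates from diverging when the step size is too large, so a small step size is still needed.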

Requirement 3.1: The accuracy of the logistic regression model trained by the gradient method with constant step size rule is above 0.7

Requirement 3.2: The accuracy of the logistic regression model trained by the gradient method with diminishing step size rule is above 0.4

In [None]:
from ucimlrepo import fetch_ucirepo

# fetch dataset
spambase = fetch_ucirepo(id=94)

# data (as pandas dataframes)
Z = spambase.data.features
y = spambase.data.targets

# metadata
# print(spambase.metadata)

# variable information
# print(spambase.variables)
# Convert Z to a numpy array and append a column of ones for the intercept term
N = len(y)
Z_np = Z.to_numpy()
Z_np = np.concatenate((Z_np, np.ones((N, 1))), axis=1)
print(Z_np)
# Convert y to numpy array and convert {+1, 0} into {+1, -1}
y_np = y.to_numpy() * 2 - 1
y_np = np.squeeze(y_np)
print(y_np)

In [None]:
# Set up algorithmic parameters
# ======================================================================================
# Write your choice of step size
alg_param_const = {
    "alpha": 1e-8,  # very small step size
    "freq_print_obj": 100,
    "freq_print_accuracy": 200,
    "max_iteration": 1001,
    "flag_constant_step_size": True,
    "flag_diminishing_step_size": False,
}

alg_param_diminishing = {
    "alpha": 1e-5,  # larger initial step size for diminishing
    "freq_print_obj": 100,
    "freq_print_accuracy": 200,
    "max_iteration": 5001,
    "flag_constant_step_size": False,
    "flag_diminishing_step_size": True,
}
# ======================================================================================

In [None]:
x_init = np.ones(58) * 0.001
x = gradient_method(x_init, Z_np, y_np, logistic_model, alg_param_const)

In [None]:
x = gradient_method(x_init, Z_np, y_np, logistic_model, alg_param_diminishing)

## Exercise 4: Data normalization 
Here, we normalize the data by dividing each feature column by its mean.

Requirement 4.1: The accuracy of the logistic regression model trained by the gradient method with constant step size rule is above 0.9

Requirement 4.2: The accuracy of the logistic regression model trained by the gradient method with diminishing step size rule is above 0.9
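
The mean scaling used in this exercise can be sketched on a toy matrix. The array `Z_toy` is made up for illustration, and the sketch assumes that no column has zero mean, since the division would otherwise fail.

```python
import numpy as np

# Made-up feature matrix: two features on very different scales plus a column of ones
Z_toy = np.array([[1.0, 200.0, 1.0],
                  [3.0, 600.0, 1.0]])

col_mean = np.mean(Z_toy, axis=0)  # one mean per column
Z_toy_scaled = Z_toy / col_mean    # broadcasting divides each column by its own mean

print(col_mean)
print(Z_toy_scaled)
```

After scaling, every column has mean 1, so the features are on comparable scales and a single step size is more likely to suit all coordinates.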

In [None]:
# normalize the data
scaling_factor = np.mean(Z_np, axis=0)
print(scaling_factor)
Z_scale = Z_np / scaling_factor
print(Z_scale)

In [None]:
# Set up algorithmic parameters
# ======================================================================================
# Write your choice of step size
alg_param_const = {
    "alpha": 1e-4,
    "freq_print_obj": 100,
    "freq_print_accuracy": 200,
    "max_iteration": 1001,
    "flag_constant_step_size": True,
    "flag_diminishing_step_size": False,
}

alg_param_diminishing = {
    "alpha": 1e-3,
    "freq_print_obj": 100,
    "freq_print_accuracy": 200,
    "max_iteration": 1001,
    "flag_constant_step_size": False,
    "flag_diminishing_step_size": True,
}
# ======================================================================================

In [None]:
x_init = np.ones(58) * 0.001
x = gradient_method(x_init, Z_scale, y_np, logistic_model, alg_param_const)

In [None]:
x = gradient_method(x_init, Z_scale, y_np, logistic_model, alg_param_diminishing)