# Regularized Proximal Descent Based Model Framework

In [18]:
%reload_ext autoreload
%autoreload 2

In [19]:
#Importing Needed Libraries
import framework
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split

**Load Your Data**

The framework assumes that the user has already done all pre-processing. Load your data as a pandas DataFrame, then split it into samples and labels.
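What counts as "pre-processed" is not spelled out here, so the following is only a minimal sanity check one might run before handing data to the framework. It assumes that missing values are not handled downstream. The helper `check_ready` and the toy DataFrame are hypothetical and are not part of the `framework` module.

```python
import pandas as pd

def check_ready(data: pd.DataFrame, label_column: str) -> None:
    """Hypothetical helper: raise if the label column is absent or any value is missing."""
    if label_column not in data.columns:
        raise ValueError(f"label column {label_column!r} not found")
    missing = data.isna().sum()
    missing = missing[missing > 0]
    if not missing.empty:
        raise ValueError(f"missing values in columns: {list(missing.index)}")

# Toy data for illustration only (not the heart.csv file used in this notebook)
toy = pd.DataFrame({
    "Sex": ["M", "F", "M"],
    "ST_Slope": ["Up", "Flat", "Up"],
    "HeartDisease": [0, 1, 0],
})

check_ready(toy, "HeartDisease")
toy_labels = toy["HeartDisease"]
toy_samples = toy.drop(columns=["HeartDisease"])
```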

In [20]:
#Load your Data 
#file_path = ''
#data = pd.read_csv(file_path)
#labels = data['']
#samples = data.drop([''], axis = 1)

In [21]:
#Load your Data -- Example
file_path = '/Users/nicolascutrona/Downloads/heart.csv'
data = pd.read_csv(file_path)
data = data.drop(['Age', 'RestingBP', 'Cholesterol', 'FastingBS', 'MaxHR', 'Oldpeak'], axis = 1)
labels = data['HeartDisease']
samples = data.drop(['HeartDisease'], axis = 1)

In [22]:
#Split your Data
X,y = samples, labels
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

X_train = X_train.reset_index(drop=True)
X_test = X_test.reset_index(drop=True)
y_train = y_train.reset_index(drop=True)
y_test = y_test.reset_index(drop=True)

**Hyper-Parameter Selection**

Select the hyper-parameters for model learning:

*Convergence Constant*: A small predefined constant used as the threshold on the change in model loss per iteration when checking convergence

*Max Iterations*: The maximum number of iterations for model learning; too few iterations may result in non-convergence

*Learning Rate*: A predefined learning rate (line search is a work in progress)

*Lambda Term*: The penalty value (lambda) for the L1 penalty applied to the class-specific attribute weights (cross-validation for parameter selection is a work in progress)
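To illustrate how these four values typically interact in a proximal descent method with an L1 penalty, here is a minimal sketch. It is not the code inside `framework`: the function names are hypothetical, and it assumes the standard soft-thresholding proximal operator, which shrinks values toward zero. The source does not state which reference point the framework shrinks toward.

```python
import numpy as np

def soft_threshold(w, threshold):
    """Proximal operator of threshold * ||w||_1: element-wise shrinkage toward zero."""
    return np.sign(w) * np.maximum(np.abs(w) - threshold, 0.0)

def proximal_step(weights, gradient, learning_rate, lambda_term):
    """One proximal gradient step: a gradient step on the smooth loss, then L1 shrinkage."""
    stepped = weights - learning_rate * gradient
    return soft_threshold(stepped, learning_rate * lambda_term)

def has_converged(previous_loss, current_loss, convergence):
    """Assumed stopping rule: the loss changed by less than the convergence constant."""
    return abs(previous_loss - current_loss) < convergence

# Illustrative values only (not taken from the notebook output)
weights = np.array([[1.0, 0.5], [-0.2, 0.001]])
gradient = np.array([[0.1, -0.3], [0.0, 0.05]])
updated = proximal_step(weights, gradient, learning_rate=0.01, lambda_term=0.01)
```

Under this sketch, a larger `lambda_term` widens the shrinkage band, so more weights are set exactly to zero, while `learning_rate` scales both the gradient step and the shrinkage.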

In [23]:
#Hyper-Parameter Selections (Current Version)
convergence = 1e-3
max_iterations = 1000
learning_rate = 0.01
lambda_term = 0.01

**Model Framework Output**

Run the RPNB Framework as shown below.

The output will contain the following:

*Iteration*: The iteration number of model learning

*Posterior Cache First Sample*: The posterior distribution over classes for the first sample

*Weight Matrix*: The current weight matrix corresponding to class-specific attributes during model learning

*Gradient Weight Matrix Norm*: The norm of the gradient with respect to the weight matrix. This should move toward zero as model learning progresses and the optimizer approaches a minimum

*Model Loss*: Model loss at the current iteration

*Converged*: Boolean convergence check variable
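The quantities listed above can be made concrete with a small sketch. The exact weighted form used by the framework is not shown in this notebook, so this assumes a common formulation of attribute-weighted Naive Bayes in which each weight acts as an exponent on a class-conditional likelihood. The function name `weighted_posterior` and all numbers are hypothetical.

```python
import numpy as np

def weighted_posterior(log_prior, log_likelihood, weights):
    """Posterior over classes for one sample.

    log_prior:      shape (n_classes,)
    log_likelihood: shape (n_classes, n_attributes), log P(x_j | class)
    weights:        shape (n_classes, n_attributes), class-specific attribute weights
    """
    scores = log_prior + np.sum(weights * log_likelihood, axis=1)
    scores -= scores.max()          # stabilize the exponentials
    probs = np.exp(scores)
    return probs / probs.sum()

# Two classes, two attributes; illustrative numbers only
log_prior = np.log(np.array([0.5, 0.5]))
log_likelihood = np.log(np.array([[0.8, 0.5], [0.2, 0.5]]))
weights = np.ones((2, 2))           # all-ones weights reduce to plain Naive Bayes

posterior = weighted_posterior(log_prior, log_likelihood, weights)

# A gradient matrix norm, taken here to be the Frobenius norm (an assumption)
example_gradient = np.array([[3.0, 0.0], [0.0, 4.0]])
grad_norm = np.linalg.norm(example_gradient)
```

With all weights equal to one, the posterior is the ordinary Naive Bayes posterior.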

In [24]:
#Run Model Framework
model = framework.Framework(X_train, y_train, X_test, y_test, max_iterations, convergence, learning_rate, lambda_term)
print("Regularized Naive Bayes Training Accuracy:", model.train_accuracy)
print("Regularized Naive Bayes Testing Accuracy:", model.test_accuracy)

_Computing Model Parameters_...

_Optimizing_...

Iteration: 1
Posterior Cache First Sample: [0.6191644799274885, 0.3808355200725114]
Weight Matrix: [[1.00244014 1.00726185 0.99493537 0.98020189 1.01018955]
 [1.00005816 0.96995034 1.00267566 0.98634255 1.01653524]]
Gradient Weight Matrix Norm: 9.970591
Model Loss: 71.78376145774368
Converged: False
Iteration: 2
Posterior Cache First Sample: [0.6056181303050305, 0.39438186969496936]
Weight Matrix: [[1.00321696 1.00964744 0.98682636 0.96036424 1.01449915]
 [1.00175442 0.94735003 1.00836155 0.97587393 1.03306285]]
Gradient Weight Matrix Norm: 9.940957
Model Loss: 71.51628963942471
Converged: False
Iteration: 3
Posterior Cache First Sample: [0.5959450266703864, 0.40405497332961365]
Weight Matrix: [[1.00373226 1.01079209 0.97836256 0.94154431 1.01653053]
 [1.00371415 0.92823852 1.01441694 0.96625514 1.04755975]]
Gradient Weight Matrix Norm: 9.911146
Model Loss: 71.29971456177354
Converged: False
Iteration: 4
Posterior Cache First Sample: [0

**Vanilla Naive Bayes Comparison**

In [25]:
#Hyper-Parameter Selections (Current Version)
convergence = 1e-3
max_iterations = 0  # zero optimization iterations, used here for the vanilla Naive Bayes baseline
learning_rate = 0.01
lambda_term = 0.01

In [26]:
#Vanilla Naive Bayes
model = framework.Framework(X_train, y_train, X_test, y_test, max_iterations, convergence, learning_rate, lambda_term)
print("Vanilla Naive Bayes Training Accuracy:", model.train_accuracy)
print("Vanilla Naive Bayes Testing Accuracy:", model.test_accuracy)

_Computing Model Parameters_...

Vanilla Naive Bayes Training Accuracy: 0.8471544715447155
Vanilla Naive Bayes Testing Accuracy: 0.834983498349835
