
Code from here was adapted from: https://www.marktechpost.com/2021/04/08/logistic-regression-with-keras/

https://keras.io/api/optimizers/

https://www.tensorflow.org/guide/keras/train_and_evaluate

Hands-on Machine Learning: Aurelien Geron

Benjamin Katz

In [10]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow import keras

In [11]:
import sklearn
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from tensorflow.keras import backend as K
iris = datasets.load_iris()
X = np.array(iris["data"][:, 2:]) # petal length, petal width
y = np.array((iris["target"] == 0).astype(int)) # 1 if Iris-Setosa (target 0), else 0


In [12]:
#Separate out the testing data
#Get a random permutation of the data's indexes.
#This allows testing on a random sample as opposed to just the beginning
#or the end of the data set
shuffled_indexes = np.random.permutation(len(X))
#separate out 20 percent for testing
test_size = int(len(X)*.2)
#get the testing and training indexes
test_indexes = shuffled_indexes[:test_size]
train_indexes = shuffled_indexes[test_size:]
#select the rows of each set with NumPy index arrays;
#the labels are stored as floats
X_train = X[train_indexes]
y_train = y[train_indexes].astype(float)
X_test = X[test_indexes]
y_test = y[test_indexes].astype(float)
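
The split above relies on an unseeded shuffle, so every run produces a different test set. A minimal sketch of a repeatable version follows. It assumes NumPy's `default_rng` generator and uses stand-in arrays (`X_demo`, `y_demo`) with the same shape as the two-feature iris data. The names and the seed value are illustrative and not part of the original notebook.

```python
import numpy as np

# Stand-in data shaped like the two-feature iris arrays (150 rows).
X_demo = np.arange(300, dtype=float).reshape(150, 2)
y_demo = (np.arange(150) < 50).astype(float)

# A seeded generator makes the shuffle, and therefore the split, repeatable.
rng = np.random.default_rng(seed=42)  # the seed value is arbitrary
shuffled = rng.permutation(len(X_demo))

# Hold out 20 percent of the rows for testing.
test_size = int(len(X_demo) * 0.2)
test_idx, train_idx = shuffled[:test_size], shuffled[test_size:]

# Index arrays select the rows of each set in one step.
X_train_demo, y_train_demo = X_demo[train_idx], y_demo[train_idx]
X_test_demo, y_test_demo = X_demo[test_idx], y_demo[test_idx]

print(X_train_demo.shape, X_test_demo.shape)
```

With a fixed seed, the reported test score can be compared between runs without the split itself changing.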


The following method of finding the F1 score was inspired by 
https://neptune.ai/blog/implementing-the-macro-f1-score-in-keras

In [17]:
#The F1 score is 2 times precision times recall, divided by precision plus recall
#for further discussion see part one
def custom_f1(y_true, y_pred):
    #Recall is the true positives over all the actual positives
    def recall_m(y_true, y_pred):
        #True positives are the positives it correctly predicted
        TP = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
        Positives = K.sum(K.round(K.clip(y_true, 0, 1)))
        #epsilon is added to the denominator to ensure we do not divide by 0
        recall = TP / (Positives+K.epsilon())
        return recall

    #Precision is the true positives over the predicted positives, see part one for further discussion
    def precision_m(y_true, y_pred):
        #True positives are the positives it correctly predicted
        TP = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
        Pred_Positives = K.sum(K.round(K.clip(y_pred, 0, 1)))

        precision = TP / (Pred_Positives+K.epsilon())
        return precision

    precision, recall = precision_m(y_true, y_pred), recall_m(y_true, y_pred)

    return 2*((precision*recall)/(precision+recall+K.epsilon()))
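
To check the metric by hand, the same arithmetic can be written in plain NumPy. This is only a sketch: `f1_numpy` and the sample arrays are made up for illustration, and `eps` stands in for `K.epsilon()`. Keras also averages a function metric over batches, so the value it reports during training may differ slightly from a single whole-dataset computation like this one.

```python
import numpy as np

def f1_numpy(y_true, y_prob, eps=1e-7):
    """F1 score from 0/1 labels and predicted probabilities."""
    y_true = np.asarray(y_true, dtype=float)
    # Round the clipped probabilities to hard 0/1 predictions.
    y_pred = np.round(np.clip(np.asarray(y_prob, dtype=float), 0, 1))
    tp = np.sum(y_true * y_pred)              # true positives
    precision = tp / (np.sum(y_pred) + eps)   # TP / predicted positives
    recall = tp / (np.sum(y_true) + eps)      # TP / actual positives
    return 2 * precision * recall / (precision + recall + eps)

# Hypothetical labels and sigmoid outputs.
labels = [1, 1, 1, 0, 0, 0]
probs = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1]

# Hard predictions are 1, 1, 0, 1, 0, 0: two true positives out of three
# predicted positives and three actual positives.
score = f1_numpy(labels, probs)
print(round(score, 3))  # 0.667
```

Here precision and recall are both 2/3, so the F1 score is also 2/3.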

In [16]:
number_of_classes = 1
number_of_features = 2
model = Sequential()
#Use a single Dense unit with a sigmoid activation on the two input features, which is logistic regression
#The l2 regularizer adds 0.01 times the sum of the squared weights to the loss. Adding this regularizer makes the model converge slower
#and the score does not always stay perfect once it reaches 100 percent
#this makes sense since the goal of regularization is to avoid overfitting, so a perfect training score is not the only objective
model.add(Dense(number_of_classes, activation='sigmoid', input_dim=number_of_features, kernel_regularizer='l2'))
#Use the Adam optimizer with a learning rate of 5, which was decided using trial and error
opt = keras.optimizers.Adam(learning_rate=5)
#Compile the model with binary cross-entropy loss, tracking the custom F1 score instead of accuracy
model.compile(optimizer=opt, loss='binary_crossentropy', metrics=[custom_f1])
#actually perform the gradient descent over 40 epochs on the training data
model.fit(X_train, y_train, epochs=40)
#evaluate on the test data; the second value returned is custom_f1, not accuracy
print("Test loss, test accuracy: " +str(model.evaluate(X_test,y_test)))

Epoch 1/40
Epoch 2/40
Epoch 3/40
Epoch 4/40
Epoch 5/40
Epoch 6/40
Epoch 7/40
Epoch 8/40
Epoch 9/40
Epoch 10/40
Epoch 11/40
Epoch 12/40
Epoch 13/40
Epoch 14/40
Epoch 15/40
Epoch 16/40
Epoch 17/40
Epoch 18/40
Epoch 19/40
Epoch 20/40
Epoch 21/40
Epoch 22/40
Epoch 23/40
Epoch 24/40
Epoch 25/40
Epoch 26/40
Epoch 27/40
Epoch 28/40
Epoch 29/40
Epoch 30/40
Epoch 31/40
Epoch 32/40
Epoch 33/40
Epoch 34/40
Epoch 35/40
Epoch 36/40
Epoch 37/40
Epoch 38/40
Epoch 39/40
Epoch 40/40
Test loss, test accuracy: [0.0808037742972374, 1.0]
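
As a rough illustration of what the regularized model optimizes, the sketch below computes the sigmoid output, the binary cross-entropy, and the L2 penalty in NumPy. The weights, bias, and sample rows are hypothetical values chosen for the example, not values learned by the model above. The factor 0.01 is the default used by the `'l2'` string shortcut, and the penalty covers only the kernel weights, not the bias.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def regularized_loss(X, y, w, b, l2=0.01, eps=1e-7):
    """Binary cross-entropy plus an L2 penalty on the weights w."""
    # Predicted probabilities, clipped so the logarithms stay finite.
    p = np.clip(sigmoid(X @ w + b), eps, 1 - eps)
    bce = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
    # The penalty is l2 times the sum of squared weights (bias excluded).
    penalty = l2 * np.sum(w ** 2)
    return bce + penalty, bce, penalty

# Hypothetical petal length/width rows and labels (1 = positive class).
X_demo = np.array([[1.4, 0.2], [4.7, 1.4]])
y_demo = np.array([1.0, 0.0])

# Hypothetical parameters, chosen only for illustration.
w_demo = np.array([-1.0, -2.0])
b_demo = 4.0

total, bce, penalty = regularized_loss(X_demo, y_demo, w_demo, b_demo)
print(total, bce, penalty)
```

With these weights the penalty is 0.01 * (1 + 4), or about 0.05, and it grows as the weights grow. That is why the regularized model trades some fit on the training data for smaller weights.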
