In [7]:

""" 
The logistic regression function takes a dataset as a numpy array of shape (n, 3) along with the label for each 
entry. It initializes the weight vector and then, for each iteration up to the 'max_iteration' parameter, 
initializes a gradient vector and accumulates it over the entire dataset, for a total cost of 
O(max_iteration * len(dataset)) per-sample gradient evaluations. After each pass over the data, the gradient vector 
is multiplied by the learning rate (learning_rate) and subtracted from the weight vector to optimize it. Repeating 
this pass max_iteration times produces a weight vector whose dot product with a sample in X, plugged into the 
sigmoid function, gives a probability that is likely to classify the sample correctly. The downside is that the 
function is slow: with 1561 samples, the per-sample gradient is computed anywhere from 156,100 to 1,561,000 times, 
which corresponds to 100 to 1,000 full passes. It takes minutes to run, but the training accuracy ultimately 
converges towards 98% (about 96% on the test set).

"""

if __name__ == '__main__':
    test_logistic_regression()




logistic regression testing...
max iteration testcase0: Train accuracy: 0.834721, Test accuracy: 0.827830
max iteration testcase1: Train accuracy: 0.924407, Test accuracy: 0.900943
max iteration testcase2: Train accuracy: 0.966047, Test accuracy: 0.941038
max iteration testcase3: Train accuracy: 0.973735, Test accuracy: 0.950472
learning rate testcase0: Train accuracy: 0.966047, Test accuracy: 0.941038
learning rate testcase1: Train accuracy: 0.973735, Test accuracy: 0.950472
learning rate testcase2: Train accuracy: 0.978860, Test accuracy: 0.962264
logistic regression test done.
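
The batch update described in the docstring above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual implementation: it assumes labels in {-1, +1}, the mean cross-entropy error, and a leading column of ones in X, and the names `sigmoid`, `batch_gradient_descent`, and `accuracy` are hypothetical.

```python
import numpy as np

def sigmoid(s):
    """Logistic function 1 / (1 + e^(-s))."""
    return 1.0 / (1.0 + np.exp(-s))

def batch_gradient_descent(X, y, max_iteration=1000, learning_rate=0.1):
    """Full-batch gradient descent for logistic regression.

    X: (n, d) array whose first column is assumed to be all ones.
    y: (n,) array of labels, assumed to be in {-1, +1}.
    """
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(max_iteration):
        # Gradient of the mean cross-entropy error over the whole dataset:
        # -(1/n) * sum_i y_i * x_i * sigmoid(-y_i * w.x_i)
        margins = y * (X @ w)
        gradient = -(X * (y * sigmoid(-margins))[:, None]).mean(axis=0)
        # Step against the gradient, scaled by the learning rate.
        w -= learning_rate * gradient
    return w

def accuracy(X, y, w):
    """Fraction of samples whose predicted label matches y."""
    predictions = np.where(sigmoid(X @ w) >= 0.5, 1, -1)
    return np.mean(predictions == y)
```

Each iteration touches all n rows of X, which is where the O(max_iteration * len(dataset)) cost comes from. Vectorizing the pass with numpy, as here, keeps the same operation count but usually avoids most of the Python-level loop overhead.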


In [10]:
"""
This function follows the same update rule as regular gradient descent, but instead of traversing the entire 
dataset on every iteration, it uses one randomly chosen point per iteration. For each of the N random points drawn 
from the dataset X, it calculates the gradient at that point, multiplies it by the learning rate, and subtracts the 
product from the weight vector to optimize it. The algorithm is very efficient, needing only O(N) single-point 
gradient evaluations, but it can perform poorly when few iterations are run: the first test case reached only 64% 
accuracy in one run (SGD results can vary from run to run, and the run recorded below reached 90.5% training 
accuracy). With more iterations it converges towards 98% training accuracy.

"""

if __name__ == '__main__':
    test_logistic_regression()
    

logistic regression SGD testing...
max iteration testcase0: Train accuracy: 0.905189, Test accuracy: 0.879717
max iteration testcase1: Train accuracy: 0.918642, Test accuracy: 0.889151
max iteration testcase2: Train accuracy: 0.921845, Test accuracy: 0.893868
max iteration testcase3: Train accuracy: 0.977578, Test accuracy: 0.966981
learning rate testcase0: Train accuracy: 0.956438, Test accuracy: 0.941038
learning rate testcase1: Train accuracy: 0.970532, Test accuracy: 0.948113
learning rate testcase2: Train accuracy: 0.978219, Test accuracy: 0.959906
logistic regression SGD test done.
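
The single-point update described in the SGD docstring above might look like the sketch below. As before, this is an illustration under assumptions rather than the notebook's code: labels are assumed to be in {-1, +1}, and the function name and the `seed` parameter are hypothetical.

```python
import numpy as np

def sgd_logistic_regression(X, y, max_iteration=1000, learning_rate=0.1, seed=0):
    """Stochastic gradient descent for logistic regression.

    X: (n, d) array whose first column is assumed to be all ones.
    y: (n,) array of labels, assumed to be in {-1, +1}.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(max_iteration):
        # Pick one sample uniformly at random.
        i = rng.integers(n)
        # Gradient of the cross-entropy error at that single point.
        margin = y[i] * (X[i] @ w)
        gradient = -y[i] * X[i] / (1.0 + np.exp(margin))
        w -= learning_rate * gradient
    return w
```

Each iteration costs one gradient evaluation instead of n, so the total work depends only on max_iteration. The price is a noisier path to the optimum, which is consistent with the weaker accuracy in the short test cases.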


In [8]:
"""
The third-order function takes the data samples of shape (n, 3) and transforms them into a Z space of shape 
(n, 10). To perform this, a z_x numpy array is initialized, and each entry in z_x is a list of 10 real numbers 
produced by the following mapping: 
[1, x1, x2] -> [1, x1, x2, (x1)^2, (x1)(x2), (x2)^2, (x1)^3, (x1)^2 (x2), (x1)(x2)^2, (x2)^3].
Why? Because by transforming the inputs with a polynomial of degree 3 we hope to get a feature space that is 
linearly separable or close to it. However, compared with the plain logistic regression results, the third-order 
transform helps mainly in the first two max iteration test cases; after those it is only slightly better in both 
training and test accuracy.

"""
if __name__ == '__main__':
    test_thirdorder_logistic_regression()

3rd order logistic regression testing...
max iteration testcase0: Train accuracy: 0.924407, Test accuracy: 0.898585
max iteration testcase1: Train accuracy: 0.958360, Test accuracy: 0.941038
max iteration testcase2: Train accuracy: 0.970532, Test accuracy: 0.948113
max iteration testcase3: Train accuracy: 0.975016, Test accuracy: 0.955189
learning rate testcase0: Train accuracy: 0.970532, Test accuracy: 0.948113
learning rate testcase1: Train accuracy: 0.975016, Test accuracy: 0.955189
learning rate testcase2: Train accuracy: 0.978219, Test accuracy: 0.964623
3rd order logistic regression test done.
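
The mapping from [1, x1, x2] to the ten third-order features can be written in a few lines of numpy. This is a sketch only: the function name `third_order_transform` is hypothetical, and it assumes the first column of X is the constant 1 as in the docstring above.

```python
import numpy as np

def third_order_transform(X):
    """Map rows [1, x1, x2] of X (shape (n, 3)) to third-order features (shape (n, 10))."""
    x1, x2 = X[:, 1], X[:, 2]
    return np.column_stack([
        np.ones(len(X)), x1, x2,                  # degree 0 and 1
        x1 ** 2, x1 * x2, x2 ** 2,                # degree 2
        x1 ** 3, x1 ** 2 * x2, x1 * x2 ** 2, x2 ** 3,  # degree 3
    ])

# Example: the sample [1, 2, 3] maps to
# [1, 2, 3, 4, 6, 9, 8, 12, 18, 27]
Z = third_order_transform(np.array([[1.0, 2.0, 3.0]]))
```

Building the columns with whole-array operations avoids filling z_x one entry at a time, although either approach produces the same (n, 10) array.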


In [11]:
"""
3rd Order with SGD: This function performs the same computation as the regular third-order function. The only 
difference is that, once the x feature space is transformed into the z feature space, the optimal weights are found 
using the SGD algorithm rather than regular gradient descent.

"""
if __name__ == '__main__':
    test_thirdorder_logistic_regression()

3rd order logistic regression SGD testing...
max iteration testcase0: Train accuracy: 0.782832, Test accuracy: 0.794811
max iteration testcase1: Train accuracy: 0.930814, Test accuracy: 0.900943
max iteration testcase2: Train accuracy: 0.949391, Test accuracy: 0.919811
max iteration testcase3: Train accuracy: 0.973735, Test accuracy: 0.950472
learning rate testcase0: Train accuracy: 0.970532, Test accuracy: 0.950472
learning rate testcase1: Train accuracy: 0.975657, Test accuracy: 0.952830
learning rate testcase2: Train accuracy: 0.977578, Test accuracy: 0.952830
3rd order logistic regression SGD test done.
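
Because the transform and the optimizer are independent, the third-order SGD variant amounts to composing the two steps. The sketch below makes that explicit by passing the trainer in as a parameter. All names here (`third_order_features`, `sgd_trainer`, `fit_third_order`, `predict_third_order`) are hypothetical, and labels are again assumed to be in {-1, +1}.

```python
import numpy as np

def third_order_features(X):
    """Monomials x1^a * x2^b with a + b <= 3, ordered by total degree."""
    x1, x2 = X[:, 1], X[:, 2]
    columns = [x1 ** (degree - j) * x2 ** j
               for degree in range(4)
               for j in range(degree + 1)]
    return np.column_stack(columns)

def sgd_trainer(Z, y, max_iteration=2000, learning_rate=0.1, seed=0):
    """Plain SGD on the cross-entropy error; labels assumed in {-1, +1}."""
    rng = np.random.default_rng(seed)
    w = np.zeros(Z.shape[1])
    for _ in range(max_iteration):
        i = rng.integers(len(Z))
        margin = y[i] * (Z[i] @ w)
        w += learning_rate * y[i] * Z[i] / (1.0 + np.exp(margin))
    return w

def fit_third_order(X, y, trainer=sgd_trainer):
    """Transform to z space first, then hand the result to any trainer."""
    return trainer(third_order_features(X), y)

def predict_third_order(X, w):
    """Classify samples by the sign of the score in z space."""
    return np.where(third_order_features(X) @ w >= 0, 1, -1)
```

Swapping `trainer` for a full-batch routine would give the regular third-order variant with no other changes, which is the sense in which the two functions differ only in the optimizer.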
