## Experimenting against the Iris dataset - Modified ICQ Method

Runs the modified ICQ classifier against the Iris dataset, validating it with stratified 10-fold cross-validation repeated over several random seeds. At the end of each 10-fold cross-validation, it prints the average score and F1-score for the classifier. After all seeds have run, the average score and F1-score across seeds are also printed.
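As a rough illustration of the validation scheme, the sketch below builds stratified folds in pure Python for each seed. It is not the code behind `executeIrisOneVsRest`, whose internals are not shown in this notebook; the function name `stratified_folds` and the round-robin fold assignment are assumptions made for the example.

```python
import random
from collections import Counter, defaultdict


def stratified_folds(labels, n_splits=10, seed=0):
    """Split sample indices into n_splits test folds, preserving class ratios.

    Hypothetical helper, for illustration only.
    """
    rng = random.Random(seed)
    indices_by_class = defaultdict(list)
    for index, label in enumerate(labels):
        indices_by_class[label].append(index)

    folds = [[] for _ in range(n_splits)]
    for class_indices in indices_by_class.values():
        rng.shuffle(class_indices)  # the seed controls which samples share a fold
        for position, index in enumerate(class_indices):
            folds[position % n_splits].append(index)
    return folds


# Iris has 150 samples: 50 for each of its 3 classes.
iris_like_labels = [0] * 50 + [1] * 50 + [2] * 50

for seed in range(10, 70, 5):  # the same 12 seeds used in the experiments
    folds = stratified_folds(iris_like_labels, n_splits=10, seed=seed)
    per_class = Counter(iris_like_labels[i] for i in folds[0])
    print(seed, len(folds[0]), dict(per_class))
```

With 150 samples and 10 folds, every test fold holds 15 samples, 5 from each class, regardless of the seed. Only the membership of each fold changes between seeds.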

In [None]:
import sys
import os
sys.path.append(os.path.abspath('../../'))
sys.path.append(os.path.abspath('../../models'))
sys.path.append(os.path.abspath('../../helpers'))

In [5]:
import numpy as np
from helpers.utils import print_metrics, executeIrisOneVsRest
from helpers.icq_methods import create_and_execute_classifier_new_approach

In [6]:
import warnings
from sklearn.exceptions import UndefinedMetricWarning

# Ignore the UndefinedMetricWarning raised by sklearn.metrics.classification_report
warnings.simplefilter(action='ignore', category=UndefinedMetricWarning)

## Sigma_Q param = [1, 1, 1, 0]
We first run this test over different seeds using the same Sigma_Q as the original method, but with the new approach (weights on the U operator and inputs on the rho env).

In [3]:
%%time
scores = []
f1scores = []
for i_random_state in range(10, 70, 5):
    curr_scores, curr_f1scores = executeIrisOneVsRest(random_seed=i_random_state,
                                                classifier_function=create_and_execute_classifier_new_approach,
                                                sigma_q_weights=[1, 1, 1, 0],
                                                max_iter=2000,
                                                print_each_fold_metric=False,
                                                print_avg_metric=True)
    scores.append(np.mean(curr_scores))
    f1scores.append(np.mean(curr_f1scores))

AVG: Scores = 0.9199999999999999 F1-Scores = 0.9176347726347727
AVG: Scores = 0.9533333333333335 F1-Scores = 0.9525757575757575
AVG: Scores = 0.9333333333333333 F1-Scores = 0.9305977355977355
AVG: Scores = 0.9533333333333334 F1-Scores = 0.9528619528619527
AVG: Scores = 0.9133333333333333 F1-Scores = 0.9102009102009101
AVG: Scores = 0.9000000000000001 F1-Scores = 0.8968098568098568
AVG: Scores = 0.9066666666666666 F1-Scores = 0.9045130795130796
AVG: Scores = 0.9266666666666665 F1-Scores = 0.9261279461279461
AVG: Scores = 0.9400000000000001 F1-Scores = 0.9377526177526176
AVG: Scores = 0.9199999999999999 F1-Scores = 0.9173761423761423
AVG: Scores = 0.9400000000000001 F1-Scores = 0.9386868686868688
AVG: Scores = 0.9133333333333333 F1-Scores = 0.909040589040589
Wall time: 7h 36min 54s
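The seed loop above is repeated for every configuration in this notebook. One possible way to factor it out is sketched below. The helper name `run_over_seeds` and the stand-in `fake_evaluate` are hypothetical; the only assumption taken from the notebook is that the evaluator accepts `random_seed` and returns a pair of per-fold score lists.

```python
from statistics import mean


def run_over_seeds(evaluate, seeds, **evaluate_kwargs):
    """Call evaluate(random_seed=..., **kwargs) per seed; return per-seed averages."""
    scores, f1scores = [], []
    for seed in seeds:
        fold_scores, fold_f1scores = evaluate(random_seed=seed, **evaluate_kwargs)
        scores.append(mean(fold_scores))
        f1scores.append(mean(fold_f1scores))
    return scores, f1scores


def fake_evaluate(random_seed, **kwargs):
    """Stand-in evaluator returning constant per-fold metrics (made-up values)."""
    return [0.9] * 10, [0.8] * 10


scores, f1scores = run_over_seeds(fake_evaluate, range(10, 70, 5), max_iter=2000)
print(len(scores), round(scores[0], 2), round(f1scores[0], 2))
```

In the notebook, `evaluate` would be `executeIrisOneVsRest`, with `classifier_function`, `sigma_q_weights`, and the remaining arguments passed through as keyword arguments.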


In [4]:
print_metrics(scores, f1scores)

Scores: [0.9199999999999999, 0.9533333333333335, 0.9333333333333333, 0.9533333333333334, 0.9133333333333333, 0.9000000000000001, 0.9066666666666666, 0.9266666666666665, 0.9400000000000001, 0.9199999999999999, 0.9400000000000001, 0.9133333333333333]
Best score: 0.9533333333333335
F1-Scores: [0.9176347726347727, 0.9525757575757575, 0.9305977355977355, 0.9528619528619527, 0.9102009102009101, 0.8968098568098568, 0.9045130795130796, 0.9261279461279461, 0.9377526177526176, 0.9173761423761423, 0.9386868686868688, 0.909040589040589]
Max F1-Score: 0.9528619528619527


In [6]:
print("Avg score:", np.mean(scores))
print("Avg F1-Score:", np.mean(f1scores))

Avg score: 0.9266666666666666
Avg F1-Score: 0.9245148524315191
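The average alone hides how much the result moves from seed to seed. As a small sketch, the per-seed values printed above (rounded here to 4 decimals) can be summarized with their spread as well. The `summarize` helper is hypothetical and not part of `helpers.utils`.

```python
from statistics import mean, stdev

# Per-seed averages copied from the output above, rounded to 4 decimals.
scores = [0.9200, 0.9533, 0.9333, 0.9533, 0.9133, 0.9000,
          0.9067, 0.9267, 0.9400, 0.9200, 0.9400, 0.9133]
f1scores = [0.9176, 0.9526, 0.9306, 0.9529, 0.9102, 0.8968,
            0.9045, 0.9261, 0.9378, 0.9174, 0.9387, 0.9090]


def summarize(values):
    """Return (mean, sample standard deviation, min, max) of a list of metrics."""
    return mean(values), stdev(values), min(values), max(values)


for name, values in (("Score", scores), ("F1-Score", f1scores)):
    avg, spread, lowest, highest = summarize(values)
    print(f"{name}: {avg:.4f} +/- {spread:.4f} (min {lowest:.4f}, max {highest:.4f})")
```

Reporting the spread makes it easier to judge whether a difference between two configurations is larger than the variation caused by the seed alone.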


## Different Sigma Q
In the [ICQ-Training-Estimator-New-Research Notebook](ICQ-Training-Estimator-New-Research.ipynb), our first study varies the Sigma_Q params, trying values other than the original [1, 1, 1, 0]. The best result there is [1, 0, 12, 1], with a highest accuracy of 0.89167, so we don't expect an improvement here, but we still run the test.

## Integer params

In [7]:
%%time
scores = []
f1scores = []
for i_random_state in range(10, 70, 5):
    curr_scores, curr_f1scores = executeIrisOneVsRest(random_seed=i_random_state,
                                                classifier_function=create_and_execute_classifier_new_approach,
                                                sigma_q_weights=[1, 0, 12, 1],
                                                max_iter=2000,
                                                print_each_fold_metric=False,
                                                print_avg_metric=True)
    scores.append(np.mean(curr_scores))
    f1scores.append(np.mean(curr_f1scores))

AVG: Scores = 0.8666666666666666 F1-Scores = 0.8592725792725794
AVG: Scores = 0.8733333333333334 F1-Scores = 0.8599074999075
AVG: Scores = 0.8866666666666667 F1-Scores = 0.8853703703703703
AVG: Scores = 0.8866666666666667 F1-Scores = 0.88493265993266
AVG: Scores = 0.8400000000000001 F1-Scores = 0.8186015836015835
AVG: Scores = 0.8399999999999999 F1-Scores = 0.8346381396381396
AVG: Scores = 0.8933333333333333 F1-Scores = 0.8837782587782588
AVG: Scores = 0.8533333333333333 F1-Scores = 0.8457372257372258
AVG: Scores = 0.8666666666666666 F1-Scores = 0.8603415103415104
AVG: Scores = 0.8666666666666668 F1-Scores = 0.861099086099086
AVG: Scores = 0.8466666666666667 F1-Scores = 0.8303705553705554
AVG: Scores = 0.8666666666666666 F1-Scores = 0.8612319162319162
Wall time: 4h 39min 55s


In [8]:
print_metrics(scores, f1scores)

Scores: [0.8666666666666666, 0.8733333333333334, 0.8866666666666667, 0.8866666666666667, 0.8400000000000001, 0.8399999999999999, 0.8933333333333333, 0.8533333333333333, 0.8666666666666666, 0.8666666666666668, 0.8466666666666667, 0.8666666666666666]
Best score: 0.8933333333333333
F1-Scores: [0.8592725792725794, 0.8599074999075, 0.8853703703703703, 0.88493265993266, 0.8186015836015835, 0.8346381396381396, 0.8837782587782588, 0.8457372257372258, 0.8603415103415104, 0.861099086099086, 0.8303705553705554, 0.8612319162319162]
Max F1-Score: 0.8853703703703703
Avg score: 0.8655555555555555
Avg F1-Score: 0.8571067821067819


## Float params

In [9]:
%%time
scores = []
f1scores = []
for i_random_state in range(10, 70, 5):
    curr_scores, curr_f1scores = executeIrisOneVsRest(random_seed=i_random_state,
                                                classifier_function=create_and_execute_classifier_new_approach,
                                                sigma_q_weights=[0.7, 0.0, 0.4, 0.1],
                                                max_iter=2000,
                                                print_each_fold_metric=False,
                                                print_avg_metric=True)
    scores.append(np.mean(curr_scores))
    f1scores.append(np.mean(curr_f1scores))

AVG: Scores = 0.8533333333333333 F1-Scores = 0.8420803270803271
AVG: Scores = 0.8466666666666669 F1-Scores = 0.8334428534428534
AVG: Scores = 0.8799999999999999 F1-Scores = 0.8736748436748437
AVG: Scores = 0.8533333333333333 F1-Scores = 0.8252068302068303
AVG: Scores = 0.86 F1-Scores = 0.8442809042809044
AVG: Scores = 0.8933333333333333 F1-Scores = 0.8887963887963887
AVG: Scores = 0.8533333333333335 F1-Scores = 0.8473990823990825
AVG: Scores = 0.8933333333333333 F1-Scores = 0.8879968179968181
AVG: Scores = 0.8800000000000001 F1-Scores = 0.8712086062086062
AVG: Scores = 0.8400000000000001 F1-Scores = 0.8244192844192846
AVG: Scores = 0.8466666666666667 F1-Scores = 0.830122655122655
AVG: Scores = 0.8666666666666668 F1-Scores = 0.8529292929292929
Wall time: 4h 48min 29s


In [10]:
print_metrics(scores, f1scores)

Scores: [0.8533333333333333, 0.8466666666666669, 0.8799999999999999, 0.8533333333333333, 0.86, 0.8933333333333333, 0.8533333333333335, 0.8933333333333333, 0.8800000000000001, 0.8400000000000001, 0.8466666666666667, 0.8666666666666668]
Best score: 0.8933333333333333
F1-Scores: [0.8420803270803271, 0.8334428534428534, 0.8736748436748437, 0.8252068302068303, 0.8442809042809044, 0.8887963887963887, 0.8473990823990825, 0.8879968179968181, 0.8712086062086062, 0.8244192844192846, 0.830122655122655, 0.8529292929292929]
Max F1-Score: 0.8887963887963887
Avg score: 0.8638888888888889
Avg F1-Score: 0.8517964905464904


## Search results for Learning Rate

## Small subset of learning rates
The first result we got from the search is learning_rate = 0.001, which differs from the article's best value of 0.01. Let's see how our model deals with that.

In [None]:
%%time
scores = []
f1scores = []
for i_random_state in range(10, 70, 5):
    curr_scores, curr_f1scores = executeIrisOneVsRest(random_seed=i_random_state,
                                                classifier_function=create_and_execute_classifier_new_approach,
                                                sigma_q_weights=[1, 1, 1, 0],
                                                max_iter=2000,
                                                print_each_fold_metric=False,
                                                print_avg_metric=True,
                                                learning_rate=0.001)
    scores.append(np.mean(curr_scores))
    f1scores.append(np.mean(curr_f1scores))

AVG: Scores = 0.9066666666666666 F1-Scores = 0.9056228956228957
AVG: Scores = 0.9533333333333334 F1-Scores = 0.9524410774410775
AVG: Scores = 0.9400000000000001 F1-Scores = 0.9386868686868686
AVG: Scores = 0.9200000000000002 F1-Scores = 0.919040404040404
AVG: Scores = 0.8866666666666667 F1-Scores = 0.8801394901394902
AVG: Scores = 0.9400000000000001 F1-Scores = 0.9391750841750841
AVG: Scores = 0.9533333333333334 F1-Scores = 0.9524410774410773
AVG: Scores = 0.9400000000000001 F1-Scores = 0.9382491582491582
AVG: Scores = 0.96 F1-Scores = 0.9588720538720539
AVG: Scores = 0.9533333333333334 F1-Scores = 0.9533164983164983
AVG: Scores = 0.9333333333333333 F1-Scores = 0.9305977355977355
AVG: Scores = 0.9466666666666667 F1-Scores = 0.9463299663299664
Wall time: 4h 1min 10s


In [None]:
print_metrics(scores, f1scores)

Scores: [0.9066666666666666, 0.9533333333333334, 0.9400000000000001, 0.9200000000000002, 0.8866666666666667, 0.9400000000000001, 0.9533333333333334, 0.9400000000000001, 0.96, 0.9533333333333334, 0.9333333333333333, 0.9466666666666667]
Best score: 0.96
F1-Scores: [0.9056228956228957, 0.9524410774410775, 0.9386868686868686, 0.919040404040404, 0.8801394901394902, 0.9391750841750841, 0.9524410774410773, 0.9382491582491582, 0.9588720538720539, 0.9533164983164983, 0.9305977355977355, 0.9463299663299664]
Max F1-Score: 0.9588720538720539


In [None]:
print("Avg score:", np.mean(scores))
print("Avg F1-Score:", np.mean(f1scores))

Avg score: 0.9361111111111112
Avg F1-Score: 0.9345760258260257


It seems we managed to improve our model a bit more: our first result had an average score of 0.9267 and now we get 0.9361, almost 1 percentage point higher. Let's keep researching!
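Because both runs use the same 12 seeds, the comparison can also be made seed by seed. The sketch below uses the per-seed scores printed in this notebook, rounded to 4 decimals; the `paired_comparison` helper and its tolerance are assumptions made for the example.

```python
from statistics import mean

seeds = list(range(10, 70, 5))

# Per-seed average scores, rounded to 4 decimals from the outputs in this notebook.
first_run = [0.9200, 0.9533, 0.9333, 0.9533, 0.9133, 0.9000,
             0.9067, 0.9267, 0.9400, 0.9200, 0.9400, 0.9133]
lr_0001 = [0.9067, 0.9533, 0.9400, 0.9200, 0.8867, 0.9400,
           0.9533, 0.9400, 0.9600, 0.9533, 0.9333, 0.9467]


def paired_comparison(before, after, tolerance=1e-6):
    """Count seeds where `after` is better, equal (within tolerance), or worse."""
    differences = [a - b for a, b in zip(after, before)]
    wins = sum(d > tolerance for d in differences)
    losses = sum(d < -tolerance for d in differences)
    ties = len(differences) - wins - losses
    return wins, ties, losses, mean(differences)


wins, ties, losses, mean_difference = paired_comparison(first_run, lr_0001)
print(f"better on {wins} seeds, equal on {ties}, worse on {losses}")
print(f"mean difference: {mean_difference:+.4f}")
```

A gain that shows up on most seeds is more convincing than one driven by a single favorable split, so the per-seed counts are a useful complement to the averages.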

## Biggest subset of learning rates
After our larger search in the [ICQ-Training-Estimator-New-Research Notebook](ICQ-Training-Estimator-New-Research.ipynb), we concluded that a learning rate of 0.0008 was the best one. That is quite close to the 0.001 we tried before, so we expect little to no improvement in our tests.

In [None]:
%%time
scores = []
f1scores = []
for i_random_state in range(10, 70, 5):
    curr_scores, curr_f1scores = executeIrisOneVsRest(random_seed=i_random_state,
                                                classifier_function=create_and_execute_classifier_new_approach,
                                                sigma_q_weights=[1, 1, 1, 0],
                                                max_iter=2000,
                                                print_each_fold_metric=False,
                                                print_avg_metric=True,
                                                learning_rate=0.0008)
    scores.append(np.mean(curr_scores))
    f1scores.append(np.mean(curr_f1scores))

AVG: Scores = 0.9200000000000002 F1-Scores = 0.9188047138047137
AVG: Scores = 0.9533333333333334 F1-Scores = 0.9533346283346283
AVG: Scores = 0.9533333333333334 F1-Scores = 0.9527104377104377
AVG: Scores = 0.9266666666666665 F1-Scores = 0.9238637288637289
AVG: Scores = 0.8866666666666667 F1-Scores = 0.8805603655603657
AVG: Scores = 0.9466666666666667 F1-Scores = 0.9458417508417508
AVG: Scores = 0.9466666666666667 F1-Scores = 0.9457070707070706
AVG: Scores = 0.9400000000000001 F1-Scores = 0.9391750841750841
AVG: Scores = 0.9533333333333334 F1-Scores = 0.9521380471380472
AVG: Scores = 0.9400000000000001 F1-Scores = 0.9397811447811447
AVG: Scores = 0.9 F1-Scores = 0.8972138972138971
AVG: Scores = 0.9466666666666667 F1-Scores = 0.9461952861952861
Wall time: 5h 37min 51s


In [None]:
print_metrics(scores, f1scores)

Scores: [0.9200000000000002, 0.9533333333333334, 0.9533333333333334, 0.9266666666666665, 0.8866666666666667, 0.9466666666666667, 0.9466666666666667, 0.9400000000000001, 0.9533333333333334, 0.9400000000000001, 0.9, 0.9466666666666667]
Best score: 0.9533333333333334
F1-Scores: [0.9188047138047137, 0.9533346283346283, 0.9527104377104377, 0.9238637288637289, 0.8805603655603657, 0.9458417508417508, 0.9457070707070706, 0.9391750841750841, 0.9521380471380472, 0.9397811447811447, 0.8972138972138971, 0.9461952861952861]
Max F1-Score: 0.9533346283346283


In [None]:
print("Avg score:", np.mean(scores))
print("Avg F1-Score:", np.mean(f1scores))

Avg score: 0.9344444444444445
Avg F1-Score: 0.9329438462771796


As expected, we got no improvement in the average score (0.9344 versus 0.9361 with a learning rate of 0.001), which means our current best result is:
- Sigma Q params = [1, 1, 1, 0]
- Learning rate = 0.001
- \# of training epochs = 2000
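For reference, the averages reported in this notebook can be ranked side by side. The snippet below is only a convenience sketch: the values are rounded from the outputs above, and "default learning rate" stands for the runs where `learning_rate` was not passed explicitly.

```python
# (label, average score, average F1-score), rounded from the outputs in this notebook.
results = [
    ("sigma_q=[1, 1, 1, 0], default learning rate", 0.9267, 0.9245),
    ("sigma_q=[1, 0, 12, 1], default learning rate", 0.8656, 0.8571),
    ("sigma_q=[0.7, 0.0, 0.4, 0.1], default learning rate", 0.8639, 0.8518),
    ("sigma_q=[1, 1, 1, 0], learning_rate=0.001", 0.9361, 0.9346),
    ("sigma_q=[1, 1, 1, 0], learning_rate=0.0008", 0.9344, 0.9329),
]

# Sort from highest to lowest average score.
ranked = sorted(results, key=lambda row: row[1], reverse=True)

for label, score, f1score in ranked:
    print(f"{score:.4f}  {f1score:.4f}  {label}")

best_label = ranked[0][0]
```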