# 1. Neural Network Models

We trained MLP models with the `adam` solver and three hidden layers of sizes (12,12,12), (16,16,16) [[Das et al.](https://dl.acm.org/doi/abs/10.1145/3185768.3186301)], (50,100,50) [[Lopez et al.](https://ieeexplore.ieee.org/document/8614157)], and (100,100,100) [[Wang et al.](https://dl.acm.org/doi/10.1145/3284127)], with a maximum of 10000 iterations (which, for `adam`, sets the number of epochs) and early stopping enabled. The grid search below also includes single-hidden-layer models with 10, 20, 30, and 50 units.

**Note**: The SPEC datasets, the development frameworks, and the exact features used in these studies are not the same as ours, but we have attempted to replicate their models as closely as possible.
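As a rough illustration of the configuration described above, the following sketch fits the four related-work architectures with scikit-learn's `MLPRegressor`. The synthetic data, the `StandardScaler` step, and `random_state=0` are assumptions made only to keep the example self-contained; they are not the pipeline or the SPEC features used in this notebook.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic regression data (hypothetical stand-in for the SPEC features).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 8))
y = X @ rng.normal(size=8) + 0.1 * rng.normal(size=300)

# Hidden-layer layouts taken from the related work cited above.
architectures = [(12, 12, 12), (16, 16, 16), (50, 100, 50), (100, 100, 100)]

models = {}
for layers in architectures:
    model = make_pipeline(
        StandardScaler(),  # assumption: inputs are standardised before the MLP
        MLPRegressor(
            hidden_layer_sizes=layers,
            solver="adam",
            max_iter=10000,       # upper bound on the number of epochs for adam
            early_stopping=True,  # stops when the held-out validation score stalls
            random_state=0,
        ),
    )
    model.fit(X, y)
    models[layers] = model
    # n_iter_ reports how many epochs were actually run before stopping.
    print(layers, "epochs run:", model.named_steps["mlpregressor"].n_iter_)
```

With early stopping enabled, `n_iter_` is typically far below `max_iter`, so the 10000-iteration limit acts as a safety cap rather than a training budget.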


In [2]:
%reset -f

#####
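# restore the notebook session; names used below without an explicit import
# (warnings, colored, set_config, pipe, suites, param_grid, ...) come from it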
import dill

dill.load_session("pickles/regress_02_fs_rfe_mask.pkl")

warnings.simplefilter(action="default")
warnings.filterwarnings("ignore", category=DeprecationWarning)
warnings.filterwarnings("ignore", category=FutureWarning)

from sklearn.model_selection import GridSearchCV
from predict import visualise

# eli5 PermutationImportance with fs_MLP_adam

est = fs_pi_mlp_adam
# e_ind of adam is 5
e_ind = 5

param_grid[e_ind] = {
    "eval__regressor__hidden_layer_sizes": [
        (10,),
        (20,),
        (30,),
        (50,),
        (12,12,12),
        (16,16,16),
        (50,100,50),
        (100,100,100)
    ],
}

PARALLEL = False
grid = {}

if PARALLEL:
    print(colored("Parallel Computation", "red"))
# display pipeline
set_config(display="diagram")

pipe.steps.pop(-1) 
pipe.steps.pop(-1) 

for suite in suites:
    cat = suite["name"]
    print(colored("\n***** " + cat + " *****", "magenta", attrs=["bold"]))

    grid[cat] = {}

    pipe.steps.pop(-1) 

    
    evaluate = est.estimator
    print("Regressor:", colored(visualise.get_estimator_label(est), "green")) 

    
    pipe.steps.extend([
        ("eval", evaluate)
    ])
    #display(pipe)
    print("Pipeline Steps: " + str(list(pipe.named_steps)))

    #####
    # Grid Search CV
    grid[cat][e_ind] = GridSearchCV(
        pipe,
        param_grid=param_grid[e_ind],
        scoring=[GRIDCV_SCORING, "r2"],
        n_jobs= GRIDCV_N_JOBS if PARALLEL else None,
        cv=k_fold,
        # for multi-metric evaluation, refit should be the scorer to be used for best_params_, etc
        refit=GRIDCV_SCORING, # refit=True,
        verbose=True,
        return_train_score = True 
    )

    # fit
    grid[cat][e_ind].fit(train_X_df[cat], train_y_df[cat])

    # print("check rfe_mask_fs support mask:\n", grid[cat][e_ind].best_estimator_["rfe_mask_fs"].get_support())
    print(
        "Best cross-validation MAE: {:.4f}".format(-grid[cat][e_ind].best_score_)
    )
    # because it has been refit before, we can get score on the test too
    print(
        "Best MAE on Test: {:.4f}".format(
            -grid[cat][e_ind].score(test_X_df[cat], test_y_df[cat])
        )
    )
    print("Best parameters: {}\n".format(grid[cat][e_ind].best_params_))
    

display(pipe)

#####
# save the notebook session
dill.dump_session("pickles/regress_related_work.pkl")



[1m[35m
***** FP_rate *****[0m
Regressor: [32mPermutation_MLPRegressor_adam[0m
Pipeline Steps: ['transform_X', 'reorder_X', 'debug1', 'scale', 'debug2', 'variance_fs', 'eval']
Fitting 5 folds for each of 8 candidates, totalling 40 fits
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done  40 out of  40 | elapsed:  2.1min finished
Best cross-validation MAE: 20.7144
Best MAE on Test: 17.0333
Best parameters: {'eval__regressor__hidden_layer_sizes': (100, 100, 100)}

[1m[35m
***** FP_speed *****[0m
Regressor: [32mPermutation_MLPRegressor_adam[0m
Pipeline Steps: ['transform_X', 'reorder_X', 'debug1', 'scale', 'debug2', 'variance_fs', 'eval']
Fitting 5 folds for each of 8 candidates, totalling 40 fits
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done  40 out of  40 | elapsed:  1.9min finished
Best cross-validation MAE: 10.6143
Best MAE on Test: 9.9045
Best parameters: {'
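The search above scores every candidate on two metrics and uses `refit` to name the one that selects the best model. The sketch below reproduces that pattern on synthetic data. The two-step pipeline, the reduced grid, the 3-fold `KFold`, and the explicit `"neg_mean_absolute_error"` scorer name are assumptions chosen for illustration (the notebook takes its scorer from `GRIDCV_SCORING` and its folds from `k_fold`).

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, a hypothetical stand-in for one suite's training set.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ np.array([3.0, -2.0, 1.0, 0.5, 0.0]) + 0.1 * rng.normal(size=200)

# Simplified pipeline: the last step is named "eval" as in the notebook.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("eval", MLPRegressor(solver="adam", max_iter=2000, random_state=0)),
])

search = GridSearchCV(
    pipe,
    param_grid={"eval__hidden_layer_sizes": [(10,), (12, 12, 12)]},
    scoring=["neg_mean_absolute_error", "r2"],
    refit="neg_mean_absolute_error",  # metric that defines best_params_ / best_score_
    cv=KFold(n_splits=3, shuffle=True, random_state=0),
    return_train_score=True,
)
search.fit(X, y)

# scikit-learn maximises scores, so MAE is stored negated; flip the sign to report it.
best_cv_mae = -search.best_score_
results = pd.DataFrame(search.cv_results_)
print(results[["params", "mean_test_neg_mean_absolute_error", "mean_test_r2"]])
print("Best CV MAE: {:.4f}".format(best_cv_mae))
print("Best parameters:", search.best_params_)
```

With a list of scorers, `cv_results_` holds one `mean_test_<scorer>` (and, with `return_train_score=True`, one `mean_train_<scorer>`) column per metric, which is why the reported MAE values are negated before printing.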

# 2. Evaluation of Related Work

## 2.1 Exploring GridSearchCV Results

Models with hidden layers of (100, 100, 100) achieve the lowest average CV test MAE in all four suites and outperform our top-performing `adam` model, but they still do not reach the accuracy of our final `lbfgs` models (refer to the paper for a more detailed discussion).

In [1]:
%reset -f

#####
# restore the notebook session
import dill

dill.load_session("pickles/regress_related_work.pkl")

warnings.simplefilter(action="default")

pd.set_option("display.max_columns", None)  # show all columns when display
pd.set_option("display.max_rows", None)

for suite in suites:
    cat = suite["name"]
    print(colored("\n***** " + cat + " *****", "magenta", attrs=["bold"]))
    print("GridSearchCV Regressor:", colored(visualise.get_estimator_label(est), "green")) 

    gs_df = pd.concat(
        [pd.DataFrame(grid[cat][e_ind].cv_results_["params"]),
        # use "mean_test_score" for single metric evaluation
        pd.DataFrame(grid[cat][e_ind].cv_results_["mean_train_r2"], columns=["Avg. CV Train R2"]),
        pd.DataFrame(grid[cat][e_ind].cv_results_["mean_test_r2"], columns=["Avg. CV Test R2"]),
        pd.DataFrame(-grid[cat][e_ind].cv_results_["mean_train_neg_mean_absolute_error"], columns=["Avg. CV Train MAE"]),
        pd.DataFrame(-grid[cat][e_ind].cv_results_["mean_test_neg_mean_absolute_error"], columns=["Avg. CV Test MAE"]),
        pd.DataFrame(grid[cat][e_ind].cv_results_["mean_score_time"], columns=["mean score time"]),
        pd.DataFrame(grid[cat][e_ind].cv_results_["mean_fit_time"], columns=["mean fit time"])
        ],
        axis=1)
    gs_df = gs_df.sort_values(by=['Avg. CV Test MAE'], ascending=True)
    
    # style the dataframe: colour every cell of the selected columns
    def f(dat, c="green"):
        return [f'background-color: {c}' for _ in dat]

    display(gs_df.head(15).style.apply(f, axis=0, subset=["Avg. CV Test MAE", "mean score time"]))
    regr_model = model_regress.RegrModel("grid_best regr_model", grid[cat][e_ind].best_estimator_)

    # regr_model.fit(train_X_df[cat], train_y_df[cat])
    # regr_model.predict(test_X_df[cat], test_y_df[cat])
    # print(regr_model)
    

[1m[35m
***** FP_rate *****[0m
GridSearchCV Regressor: [32mPermutation_MLPRegressor_adam[0m


| Candidate | `eval__regressor__hidden_layer_sizes` | Avg. CV Train R2 | Avg. CV Test R2 | Avg. CV Train MAE | Avg. CV Test MAE | mean score time (s) | mean fit time (s) |
|---|---|---|---|---|---|---|---|
| 7 | (100, 100, 100) | 0.874008 | 0.892845 | 19.848528 | 20.714375 | 0.041521 | 6.621434 |
| 6 | (50, 100, 50) | 0.848982 | 0.875575 | 23.65759 | 24.541118 | 0.028037 | 6.665019 |
| 3 | (50,) | 0.929866 | 0.862296 | 24.351562 | 26.007584 | 0.01882 | 2.620668 |
| 2 | (30,) | 0.920376 | 0.858323 | 26.598838 | 28.660537 | 0.014091 | 2.596445 |
| 1 | (20,) | 0.900722 | 0.850441 | 29.748388 | 31.240071 | 0.008518 | 1.089145 |
| 0 | (10,) | 0.779076 | 0.800638 | 30.862707 | 31.248572 | 0.0048 | 1.717436 |
| 4 | (12, 12, 12) | 0.732622 | 0.779355 | 35.797658 | 36.088389 | 0.010303 | 2.142564 |
| 5 | (16, 16, 16) | 0.719936 | 0.762676 | 38.796627 | 39.550805 | 0.010874 | 1.809012 |


[1m[35m
***** FP_speed *****[0m
GridSearchCV Regressor: [32mPermutation_MLPRegressor_adam[0m


| Candidate | `eval__regressor__hidden_layer_sizes` | Avg. CV Train R2 | Avg. CV Test R2 | Avg. CV Train MAE | Avg. CV Test MAE | mean score time (s) | mean fit time (s) |
|---|---|---|---|---|---|---|---|
| 7 | (100, 100, 100) | 0.919038 | 0.913885 | 10.239993 | 10.614286 | 0.032238 | 5.900388 |
| 6 | (50, 100, 50) | 0.912535 | 0.907152 | 10.839875 | 11.209771 | 0.025842 | 5.044334 |
| 3 | (50,) | 0.881402 | 0.868265 | 12.177838 | 12.610873 | 0.015442 | 2.218809 |
| 2 | (30,) | 0.879732 | 0.867319 | 12.551703 | 13.001403 | 0.010667 | 1.525946 |
| 0 | (10,) | 0.860105 | 0.850243 | 13.462606 | 13.731442 | 0.00557 | 2.732635 |
| 1 | (20,) | 0.853045 | 0.843266 | 13.911186 | 14.326774 | 0.009231 | 1.447774 |
| 5 | (16, 16, 16) | 0.796516 | 0.792054 | 16.418034 | 16.74702 | 0.010531 | 1.965348 |
| 4 | (12, 12, 12) | 0.752972 | 0.748576 | 20.62345 | 20.827034 | 0.006249 | 1.703017 |


[1m[35m
***** Int_rate *****[0m
GridSearchCV Regressor: [32mPermutation_MLPRegressor_adam[0m


| Candidate | `eval__regressor__hidden_layer_sizes` | Avg. CV Train R2 | Avg. CV Test R2 | Avg. CV Train MAE | Avg. CV Test MAE | mean score time (s) | mean fit time (s) |
|---|---|---|---|---|---|---|---|
| 7 | (100, 100, 100) | 0.873936 | 0.881968 | 18.445628 | 19.092234 | 0.039232 | 6.005681 |
| 3 | (50,) | 0.92915 | 0.905838 | 21.010074 | 22.216176 | 0.024743 | 4.248539 |
| 6 | (50, 100, 50) | 0.844582 | 0.853981 | 21.559497 | 22.221695 | 0.033547 | 5.692762 |
| 2 | (30,) | 0.910919 | 0.902549 | 23.054806 | 23.561819 | 0.017476 | 3.33906 |
| 1 | (20,) | 0.929345 | 0.920805 | 24.073287 | 24.649165 | 0.01024 | 1.279861 |
| 0 | (10,) | 0.853644 | 0.885217 | 26.821947 | 26.630339 | 0.006398 | 1.95536 |
| 5 | (16, 16, 16) | 0.755983 | 0.768368 | 29.62335 | 30.332531 | 0.012234 | 1.968427 |
| 4 | (12, 12, 12) | 0.697042 | 0.710183 | 33.155107 | 33.804608 | 0.01169 | 1.951209 |


[1m[35m
***** Int_speed *****[0m
GridSearchCV Regressor: [32mPermutation_MLPRegressor_adam[0m


| Candidate | `eval__regressor__hidden_layer_sizes` | Avg. CV Train R2 | Avg. CV Test R2 | Avg. CV Train MAE | Avg. CV Test MAE | mean score time (s) | mean fit time (s) |
|---|---|---|---|---|---|---|---|
| 7 | (100, 100, 100) | 0.925887 | 0.913549 | 0.328728 | 0.351869 | 0.027007 | 5.35509 |
| 6 | (50, 100, 50) | 0.912936 | 0.902187 | 0.361039 | 0.380139 | 0.021157 | 3.887036 |
| 3 | (50,) | 0.901548 | 0.891561 | 0.39141 | 0.405853 | 0.014801 | 2.354221 |
| 2 | (30,) | 0.890269 | 0.877833 | 0.403586 | 0.417939 | 0.009813 | 1.778901 |
| 1 | (20,) | 0.886596 | 0.878084 | 0.418473 | 0.429396 | 0.009529 | 1.871899 |
| 5 | (16, 16, 16) | 0.874396 | 0.867434 | 0.446859 | 0.456085 | 0.011227 | 2.314805 |
| 4 | (12, 12, 12) | 0.856123 | 0.850252 | 0.475791 | 0.48381 | 0.007132 | 2.593728 |
| 0 | (10,) | 0.848626 | 0.836804 | 0.474547 | 0.485963 | 0.005806 | 2.098655 |
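
The ranking reported in the four tables above can be checked with a few lines of plain Python. The sketch below copies the "Avg. CV Test MAE" column from each table into a dictionary (the name `cv_test_mae` is introduced here only for illustration) and reports, per suite, the best architecture and its relative MAE reduction over the runner-up.

```python
# Avg. CV Test MAE per suite and architecture, copied from the tables above.
cv_test_mae = {
    "FP_rate": {
        (100, 100, 100): 20.714375, (50, 100, 50): 24.541118,
        (50,): 26.007584, (30,): 28.660537, (20,): 31.240071,
        (10,): 31.248572, (12, 12, 12): 36.088389, (16, 16, 16): 39.550805,
    },
    "FP_speed": {
        (100, 100, 100): 10.614286, (50, 100, 50): 11.209771,
        (50,): 12.610873, (30,): 13.001403, (10,): 13.731442,
        (20,): 14.326774, (16, 16, 16): 16.74702, (12, 12, 12): 20.827034,
    },
    "Int_rate": {
        (100, 100, 100): 19.092234, (50,): 22.216176,
        (50, 100, 50): 22.221695, (30,): 23.561819, (20,): 24.649165,
        (10,): 26.630339, (16, 16, 16): 30.332531, (12, 12, 12): 33.804608,
    },
    "Int_speed": {
        (100, 100, 100): 0.351869, (50, 100, 50): 0.380139,
        (50,): 0.405853, (30,): 0.417939, (20,): 0.429396,
        (16, 16, 16): 0.456085, (12, 12, 12): 0.48381, (10,): 0.485963,
    },
}

best_per_suite = {}
for suite, scores in cv_test_mae.items():
    # Sort architectures by ascending MAE (lower is better).
    ranked = sorted(scores, key=scores.get)
    best, runner_up = ranked[0], ranked[1]
    reduction = 100 * (scores[runner_up] - scores[best]) / scores[runner_up]
    best_per_suite[suite] = best
    print(f"{suite}: best={best}, runner-up={runner_up}, "
          f"MAE reduction vs runner-up={reduction:.1f}%")
```

These figures are cross-validation averages on the training folds, so they support the ranking of architectures but say nothing about held-out test accuracy.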
