# Model comparison (dataset)
# Drop PI questions using the selection metric on the train set

Reduced models were obtained by the following procedure:
* drop all questions that belong to the PI sum score
* iteratively drop the next question according to the selection metric computed on the **train set**
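
The iterative step above can be read as a greedy backward elimination. The following is a minimal sketch under that assumption, not the notebook's actual implementation: `backward_elimination`, the `score` callable, and the toy weights are hypothetical names introduced for illustration only.

```python
from typing import Callable, List, Sequence, Tuple


def backward_elimination(
    questions: Sequence[str],
    score: Callable[[Sequence[str]], float],
    higher_is_better: bool = True,
) -> List[Tuple[str, float]]:
    """Greedily drop one question per step (hypothetical helper).

    Returns the list of (dropped question, score of the reduced set).
    """
    remaining = list(questions)
    pick = max if higher_is_better else min
    history = []
    while len(remaining) > 1:
        # Score every candidate set obtained by removing a single question.
        candidates = [
            (score([r for r in remaining if r != q]), q) for q in remaining
        ]
        best_score, dropped = pick(candidates, key=lambda c: c[0])
        remaining.remove(dropped)
        history.append((dropped, best_score))
    return history


# Toy train-set score (assumption): sum of made-up per-question weights.
weights = {"q1": 5, "q2": 1, "q3": 3}
history = backward_elimination(
    list(weights), lambda qs: sum(weights[q] for q in qs)
)
print(history)  # [('q2', 8), ('q3', 5)]
```

In the real setting, `score` would refit the model on the reduced question set and evaluate the chosen selection metric on the train set; passing `higher_is_better=False` covers the error-based metrics.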

Selection metric criteria:
* `ca`: max of mean conditional accuracy
* `ca_class`: max of min conditional accuracy
* `ca_prod`: max of product conditional accuracy
* `mse`: min of mean square error
* `mse_class`: min of max conditional mean square error
* `xent`: min of cross-entropy
* `xent_class`: min of max cross-entropy
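
As an illustration of the quantities behind these criteria, here is a hedged sketch that assumes "conditional accuracy" means accuracy computed within each true class, and that class labels double as numeric targets for the squared error. The function names and toy data are hypothetical and are not taken from `mod_evaluation`. The cross-entropy variants are omitted for brevity.

```python
import numpy as np


def conditional_accuracy(y_true, y_pred):
    """Accuracy within each true class (assumed meaning of the term)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return np.array(
        [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    )


def conditional_mse(y_true, y_score):
    """Mean squared error within each true class (labels used as targets)."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)
    return np.array(
        [np.mean((y_score[y_true == c] - c) ** 2) for c in np.unique(y_true)]
    )


def selection_scores(y_true, y_pred, y_score):
    """Aggregate the per-class values into the quantities named in the list."""
    ca = conditional_accuracy(y_true, y_pred)
    y_true_f = np.asarray(y_true, dtype=float)
    y_score_f = np.asarray(y_score, dtype=float)
    return {
        "ca": ca.mean(),        # mean conditional accuracy
        "ca_class": ca.min(),   # worst-class conditional accuracy
        "ca_prod": ca.prod(),   # product of conditional accuracies
        "mse": np.mean((y_score_f - y_true_f) ** 2),
        "mse_class": conditional_mse(y_true, y_score).max(),  # worst-class MSE
    }


# Toy binary example (made-up data).
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
y_score = [0.0, 0.0, 0.0, 0.5, 1.0, 0.5]
scores = selection_scores(y_true, y_pred, y_score)
print(scores)
```

The per-class variants (`ca_class`, `mse_class`) are driven by the worst class, which is why they can rank candidate questions differently from the plain means on imbalanced data.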

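Reference: Guyon, I. and Elisseeff, A. (2003). An Introduction to Variable and Feature Selection. Journal of Machine Learning Research 3. http://jmlr.csail.mit.edu/papers/volume3/guyon03a/guyon03a.pdf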

## Environment initialization

In [1]:
%autosave 0
%matplotlib notebook
%load_ext autoreload
%autoreload 2

import ipywidgets as widgets
import plotly.graph_objects as go
import plotly.express as px

import sys
sys.path.append("../")

import mod_evaluation
import mod_viewer
import mod_helper

Autosave disabled


## Execution params

In [2]:
results_path = 'data/results'

model_ref_id = 'linear'

n_splits = 10

train_val_random = None

run_types = [1,0]

In [3]:
cache_pre = 'model_'+model_ref_id
cache_post = str(n_splits)

if train_val_random is not None:
    cache_post += '_r'+str(train_val_random)

## Load results

In [4]:
my_data = mod_helper.load_data(
    results_path,
    cache_pre,
    cache_post,
    mod_evaluation.sort_params_train,
    run_types
)

Loaded 1
Loaded 0


## Global stats

In [5]:
df_flat, df_flat_val, df_flat_ca, df_flat_val_ca, info_flat = mod_helper.get_stats_flat(my_data)
df_multi, df_multi_val, df_multi_ca, df_multi_val_ca = mod_helper.get_stats_multi(my_data)

## Holdout set variability

In [54]:
for run_type in my_data:
    display(mod_helper.tab_plot_accuracy_multi(
        df_multi[run_type],
        df_multi_val[run_type]
    ))

Tab(children=(HBox(children=(FigureWidget({
    'data': [{'line': {'color': 'rgba(31, 119, 180, 0.2)'},
      …

Tab(children=(HBox(children=(FigureWidget({
    'data': [{'line': {'color': 'rgba(31, 119, 180, 0.2)'},
      …

# Model comparison

## Mean accuracy on validation set (cross-validation)

* Mean accuracy on validation set (cross-validation) according to selection metric
* Confidence interval estimated by bootstrap method over cross-validation repetitions
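
For readers unfamiliar with the bootstrap step, the sketch below shows one common variant, a percentile bootstrap of the mean over per-repetition accuracies. Whether the notebook's helpers use this exact variant is an assumption; `bootstrap_ci`, its parameters, and the accuracy values are hypothetical.

```python
import numpy as np


def bootstrap_ci(values, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    # Resample the repetitions with replacement, n_boot times.
    idx = rng.integers(0, len(values), size=(n_boot, len(values)))
    boot_means = values[idx].mean(axis=1)
    lower, upper = np.quantile(boot_means, [alpha / 2, 1 - alpha / 2])
    return values.mean(), lower, upper


# Made-up validation accuracies from 10 cross-validation repetitions.
fold_acc = [0.81, 0.79, 0.84, 0.80, 0.78, 0.83, 0.82, 0.77, 0.85, 0.80]
mean_acc, ci_lower, ci_upper = bootstrap_ci(fold_acc)
print(f"mean={mean_acc:.3f}, 95% CI=({ci_lower:.3f}, {ci_upper:.3f})")
```

With only `n_splits` repetitions the interval is coarse, so it is best read as a rough indication of variability rather than a precise bound.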

(clicking on labels adds/removes traces, double-clicking selects single trace)

* `ca_class`: max of min conditional accuracy
* `mse_class`: min of max conditional mean square error

Figures: **original dataset** (top), **new dataset** (bottom)

In [6]:
for run_type in my_data:
    display(mod_viewer.plot_accuracy_mse(df_flat[run_type]))

HBox(children=(FigureWidget({
    'data': [{'fill': 'toself',
              'fillcolor': 'rgba(31, 119, 180, 0…

HBox(children=(FigureWidget({
    'data': [{'fill': 'toself',
              'fillcolor': 'rgba(31, 119, 180, 0…

## Mean conditional accuracy on validation set (cross-validation)

* Mean conditional accuracy on validation set (cross-validation) according to selection metric
* Confidence interval estimated by bootstrap method over cross-validation repetitions

(clicking on labels adds/removes traces, double-clicking selects single trace)

Figures: **original dataset** (top), **new dataset** (bottom)

In [7]:
for run_type in my_data:
    display(mod_viewer.tab_plot_conditional_accuracy(
        df_flat[run_type],
        df_flat_ca[run_type],
        info_flat
    ))

Tab(children=(HBox(children=(FigureWidget({
    'data': [{'line': {'color': 'rgba(31, 119, 180, 0.6)'},
      …

Tab(children=(HBox(children=(FigureWidget({
    'data': [{'line': {'color': 'rgba(31, 119, 180, 0.6)'},
      …

# Model validation

## Mean accuracy on holdout set

* Accuracy on holdout set according to selection metric
* Holdout accuracy outside confidence interval bounds may indicate (1) model overfitting or (2) data domain shift
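
The check described above can be sketched as an element-wise comparison of the holdout accuracy against the cross-validation interval at each step of the elimination. This is an illustrative sketch only: `outside_ci` and the numbers below are hypothetical and do not come from the notebook's results.

```python
import numpy as np


def outside_ci(holdout_acc, ci_lower, ci_upper):
    """Flag entries whose holdout accuracy lies outside [ci_lower, ci_upper]."""
    holdout_acc = np.asarray(holdout_acc, dtype=float)
    ci_lower = np.asarray(ci_lower, dtype=float)
    ci_upper = np.asarray(ci_upper, dtype=float)
    return (holdout_acc < ci_lower) | (holdout_acc > ci_upper)


# Made-up values for three reduced models.
holdout = [0.81, 0.79, 0.70]
lower = [0.78, 0.76, 0.74]
upper = [0.84, 0.82, 0.80]
flags = outside_ci(holdout, lower, upper)
print(flags.tolist())  # [False, False, True]
```

A flagged entry does not say which of the two causes applies; separating overfitting from domain shift needs further inspection of the data.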

(clicking on labels adds/removes traces, double-clicking selects single trace)

Figures: **original dataset** (top), **new dataset** (bottom)

In [8]:
for run_type in my_data:
    display(mod_viewer.tab_plot_accuracy(
        df_flat[run_type],
        info_flat,
        df_questions_holdout=df_flat_val[run_type]
    ))

Tab(children=(HBox(children=(FigureWidget({
    'data': [{'fill': 'toself',
              'fillcolor': 'rgba(3…

Tab(children=(HBox(children=(FigureWidget({
    'data': [{'fill': 'toself',
              'fillcolor': 'rgba(3…

## Mean conditional accuracy on holdout set

* Conditional accuracy on holdout set according to selection metric
* Holdout conditional accuracy outside confidence interval bounds may indicate (1) model overfitting or (2) data domain shift

(clicking on labels adds/removes traces, double-clicking selects single trace)

Figures: **original dataset** (top), **new dataset** (bottom)

In [9]:
for run_type in my_data:
    display(mod_viewer.tab_plot_conditional_accuracy(
        df_flat_val[run_type],
        df_flat_val_ca[run_type],
        info_flat,
        holdout=True
    ))

Tab(children=(HBox(children=(FigureWidget({
    'data': [{'line': {'color': 'rgba(31, 119, 180, 0.6)'},
      …

Tab(children=(HBox(children=(FigureWidget({
    'data': [{'line': {'color': 'rgba(31, 119, 180, 0.6)'},
      …