
# Model Evaluation (scikit-learn)


## Table of contents

1. [Introduction](#introduction)


## Introduction


This notebook collects functions that facilitate the performance analysis of models. As output, the functions provide graphs, metrics, and basic tables commonly used in the analysis process. Pay attention to the function parameters:

- **final_model**: developed model
    > Save as **final_model_*type***

&nbsp;

- **model**: type of developed model:
    - **lgbm**: Light Gradient Boosting Machine;
    - **loglin**: Linear Regression;
    - **logreg**: Logistic Regression;

&nbsp;

- **X**: test or validation database;

&nbsp;

- **y**: test or validation target;


---



## 1. Import of packages


In [7]:
#!pip install PySimpleGUI

In [8]:
import pandas as pd  # read parquet files
import pickle  # load the pickled model

# interactive interface
import ipywidgets as widgets
from IPython.display import display


## 2. Import of data



### 2.1. Train


In [1]:
X_train = pd.read_parquet('X_train.parquet')

In [None]:
y_train = pd.read_parquet('y_train.parquet')


### 2.2. Test


In [1]:
X_test = pd.read_parquet('X_test.parquet')

In [None]:
y_test = pd.read_parquet('y_test.parquet')


### 2.3. Validation


In [1]:
X_validation = pd.read_parquet('X_validation.parquet')

In [None]:
y_validation = pd.read_parquet('y_validation.parquet')


## 3. Import of trained model


In [10]:
with open('final_model_type.pkl', 'rb') as trained_model:
    final_model = pickle.load(trained_model)


---



## 4. Function definitions



### 4.1. Classification



### 4.1.1. Accuracy


**Accuracy (accuracy_score)**:

- Definition: Accuracy measures the proportion of correct predictions out of the total predictions made. It is a common metric for classification problems but can be misleading when classes are imbalanced.
- Note: Accuracy does not take class imbalances into account, so it might not be the best choice when classes have vastly different sizes.

<br/>

**Balanced Accuracy (balanced_accuracy_score)**:

- Definition: Balanced accuracy takes class imbalance into account by computing the recall of each class (the fraction of that class's samples predicted correctly) and then taking the unweighted average, so every class counts equally regardless of its size.
- Note: It is especially useful when there is class imbalance, providing a more accurate measure of model performance in such cases.

<br/>

**Top-k Accuracy (top_k_accuracy_score)**:

- Definition: Top-k accuracy considers a prediction correct if the true class is among the top k predictions made by the model.
- Note: It is useful when you are interested in knowing if the model can predict the true class among the top k predictions, rather than just the top prediction.
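As a minimal sketch of how the three scores differ, the toy labels below (hypothetical data, not taken from the notebook's datasets) evaluate a classifier that always predicts the majority class, followed by a small three-class score matrix for top-k accuracy.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    balanced_accuracy_score,
    top_k_accuracy_score,
)

# Imbalanced toy labels: 8 negatives, 2 positives (hypothetical data).
y_true = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
# A degenerate model that always predicts the majority class.
y_pred = np.zeros(10, dtype=int)

# Fraction of correct predictions: 8 of 10.
acc = accuracy_score(y_true, y_pred)
# Unweighted mean of per-class recall: (1.0 + 0.0) / 2.
bal_acc = balanced_accuracy_score(y_true, y_pred)

# Three-class example: one score column per class (hypothetical values).
y_true_mc = np.array([0, 1, 2, 2])
y_score_mc = np.array([
    [0.6, 0.3, 0.1],  # true class 0 ranked first
    [0.3, 0.5, 0.2],  # true class 1 ranked first
    [0.2, 0.5, 0.3],  # true class 2 ranked second
    [0.7, 0.2, 0.1],  # true class 2 ranked last
])

top1 = top_k_accuracy_score(y_true_mc, y_score_mc, k=1)
top2 = top_k_accuracy_score(y_true_mc, y_score_mc, k=2)

print(f"accuracy={acc:.2f} balanced={bal_acc:.2f} top1={top1:.2f} top2={top2:.2f}")
```

On this data plain accuracy looks acceptable, while balanced accuracy exposes that the minority class is never found.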

In [13]:
#code


### 4.1.2. Precision


**Average Precision (average_precision_score)**:

- Definition: Average Precision summarizes the precision-recall curve as the weighted mean of the precision at each classification threshold, where each weight is the increase in recall from the previous threshold. In other words, it approximates the area under the precision-recall (PR) curve.
- Note: It is useful when you are interested in the quality of positive predictions across all thresholds, taking into account both precision and recall. This metric is especially important when there is class imbalance.

<br/>

**Precision (precision_score)**:

- Definition: Precision is the ratio of true positives (samples correctly predicted as positive) to the total positive predictions (true positives + false positives).
- Note: Precision is a valuable metric when the focus is on minimizing false positives. It is particularly important in situations where false positives are costly or problematic.

<br/>

In summary, average_precision is a more comprehensive metric that takes multiple classification thresholds into account and is useful for imbalanced classification problems, whereas precision evaluates the positive predictions made at a single, fixed threshold. The choice between these metrics depends on the specific goals of your classification problem and the relative importance of false positives compared to other types of classification errors.
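The difference between the two metrics can be sketched on a few hypothetical scores: precision needs hard predictions, so it changes with the chosen threshold, while average precision is computed directly from the scores. The thresholds 0.5 and 0.3 are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.metrics import average_precision_score, precision_score

# Toy binary labels and positive-class scores (hypothetical values).
y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])

# Precision depends on the threshold used to turn scores into labels.
pred_at_05 = (y_score >= 0.5).astype(int)  # only the 0.8 sample is positive
pred_at_03 = (y_score >= 0.3).astype(int)  # three samples are positive
precision_at_05 = precision_score(y_true, pred_at_05)
precision_at_03 = precision_score(y_true, pred_at_03)

# Average precision uses the raw scores and covers every threshold.
avg_precision = average_precision_score(y_true, y_score)

print(f"precision@0.5={precision_at_05:.3f}")
print(f"precision@0.3={precision_at_03:.3f}")
print(f"average precision={avg_precision:.3f}")
```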

In [13]:
#code


### 4.1.3. neg_brier_score


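**Brier score (neg_brier_score)**:

- Definition: The Brier score (brier_score_loss) is the mean squared difference between the predicted probability of the positive class and the actual outcome (0 or 1). The neg_brier_score scorer is its negation, so that higher values mean better models.
- Note: It evaluates probability estimates rather than hard class predictions. A lower Brier score (a neg_brier_score closer to 0) indicates more accurate probabilities.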
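A minimal sketch of the Brier score on toy probabilities (hypothetical values). `brier_score_loss` is the scikit-learn function, and the `neg_brier_score` scorer name refers to its negation; the manual NumPy line is included only to show the formula.

```python
import numpy as np
from sklearn.metrics import brier_score_loss

# Toy binary labels and predicted positive-class probabilities (hypothetical).
y_true = np.array([0, 1, 1, 0])
y_prob = np.array([0.1, 0.9, 0.8, 0.3])

# Mean squared difference between probability and outcome.
brier = brier_score_loss(y_true, y_prob)
brier_manual = np.mean((y_prob - y_true) ** 2)

# Scorers follow a "higher is better" convention, hence the sign flip.
neg_brier = -brier

print(f"brier={brier:.4f} manual={brier_manual:.4f} neg_brier={neg_brier:.4f}")
```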

In [13]:
#code


### 4.1.4. F1


**F1 Score (f1_score)**:

- Definition: F1 Score is the harmonic mean of precision and recall. It is useful when there is an imbalance between classes in the dataset.
- Note: It is an overall F1 metric that gives equal weight to precision and recall.

**F1 Micro-Averaged Score (f1_micro)**:

- Definition: Calculates the overall F1 score by aggregating the counts of true positives, false positives, and false negatives, and then computing precision and recall based on these sums.
- Note: Useful when you want to calculate F1 metric for multiclass classification problems by aggregating total counts of true positives, false positives, and false negatives.

<br/>


**F1 Macro-Averaged Score (f1_macro)**:

- Definition: Calculates the F1 score for each class individually and then takes the average of these scores to obtain the macro-F1 score.
- Note: Useful when you want to treat each class with equal importance regardless of the number of samples in each class.

<br/>

**F1 Weighted-Averaged Score (f1_weighted)**:

- Definition: Calculates the F1 score for each class and then takes the average of these scores, weighted by the number of samples in each class.
- Note: Useful when you want to treat each class with importance proportional to the number of samples in the class.

<br/>

**F1 Samples-Averaged Score (f1_samples)**:

- Definition: Calculates the F1 score for each sample individually and then takes the average of these scores to obtain the F1 score per sample.
- Note: Useful when you want to calculate the F1 metric for multilabel classification problems.

<br/>

In summary, the choice between f1_micro, f1_macro, f1_weighted, and f1_samples depends on your specific problem and evaluation needs, while f1 (f1_score with its default average='binary') reports the F1 score of the positive class only and therefore applies to binary targets.
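The averaging options can be compared on a small, imbalanced three-class example. The labels below are hypothetical, and the multilabel arrays exist only to show `average="samples"`, which requires multilabel input.

```python
import numpy as np
from sklearn.metrics import f1_score

# Imbalanced three-class toy labels (hypothetical data).
y_true = np.array([0, 0, 0, 0, 1, 1, 2])
y_pred = np.array([0, 0, 0, 1, 1, 2, 2])

# One F1 value per class, in label order 0, 1, 2.
f1_per_class = f1_score(y_true, y_pred, average=None)

# Pool all true positives, false positives and false negatives.
f1_micro = f1_score(y_true, y_pred, average="micro")
# Plain mean of the per-class values.
f1_macro = f1_score(y_true, y_pred, average="macro")
# Mean of the per-class values weighted by class support.
f1_weighted = f1_score(y_true, y_pred, average="weighted")

# Multilabel indicator matrices: rows are samples, columns are labels.
y_true_ml = np.array([[1, 0, 1], [0, 1, 0]])
y_pred_ml = np.array([[1, 0, 0], [0, 1, 1]])
# F1 computed per sample (row) and then averaged.
f1_samples = f1_score(y_true_ml, y_pred_ml, average="samples")

print("per class:", np.round(f1_per_class, 3))
print(f"micro={f1_micro:.3f} macro={f1_macro:.3f} weighted={f1_weighted:.3f}")
print(f"samples={f1_samples:.3f}")
```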

In [13]:
#code


### 4.1.5. neg_log_loss


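**Log loss (neg_log_loss)**:

- Definition: Log loss (log_loss), also called cross-entropy loss, is the negative mean log-probability that the model assigns to the true class. The neg_log_loss scorer is its negation, so that higher values mean better models.
- Note: It evaluates predicted probabilities rather than hard class predictions and penalizes confident wrong predictions heavily.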
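A minimal sketch of log loss on toy probabilities (hypothetical values), laid out like the output of `predict_proba` with one column per class. The second probability vector changes a single prediction to a confident mistake to show how strongly that is penalized.

```python
import numpy as np
from sklearn.metrics import log_loss

# Toy binary labels and positive-class probabilities (hypothetical).
y_true = np.array([0, 1, 1, 0])
p_pos = np.array([0.1, 0.9, 0.8, 0.3])

# Two-column layout: P(class 0), P(class 1).
y_proba = np.column_stack([1 - p_pos, p_pos])
loss = log_loss(y_true, y_proba)

# Same formula written out with NumPy for comparison.
loss_manual = -np.mean(
    y_true * np.log(p_pos) + (1 - y_true) * np.log(1 - p_pos)
)

# Make the second sample (true class 1) a confident wrong prediction.
p_pos_bad = np.array([0.1, 0.01, 0.8, 0.3])
loss_bad = log_loss(y_true, np.column_stack([1 - p_pos_bad, p_pos_bad]))

# The scorer name neg_log_loss refers to the negated value.
neg_log_loss = -loss

print(f"log_loss={loss:.4f} manual={loss_manual:.4f} with mistake={loss_bad:.4f}")
```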

In [13]:
#code


### 4.1.6. Recall


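**Recall (recall_score)**:

- Definition: Recall is the ratio of true positives to all actual positives (true positives + false negatives).
- Note: Recall is a valuable metric when the focus is on minimizing false negatives, that is, when missing a positive case is costly.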
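A minimal sketch of recall on toy labels (hypothetical data), shown next to precision because the two can diverge sharply: a cautious model may make no false positive predictions and still miss half of the positives.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

# Toy binary labels: four positives, two negatives (hypothetical data).
y_true = np.array([1, 1, 1, 1, 0, 0])
# A cautious model that flags only two samples as positive.
y_pred = np.array([1, 1, 0, 0, 0, 0])

# True positives / (true positives + false negatives) = 2 / 4.
recall = recall_score(y_true, y_pred)
# True positives / (true positives + false positives) = 2 / 2.
precision = precision_score(y_true, y_pred)

print(f"recall={recall:.2f} precision={precision:.2f}")
```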

In [13]:
#code


### 4.1.7. jaccard


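**Jaccard (jaccard_score)**:

- Definition: The Jaccard score is the size of the intersection of the predicted and true label sets divided by the size of their union. For a binary problem this is true positives / (true positives + false positives + false negatives).
- Note: It ignores true negatives, so it focuses on how well the predicted positives overlap with the actual positives.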
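A minimal sketch of the Jaccard score on toy binary labels (hypothetical data). The F1 score is computed alongside it because, in the binary case, the two are related by J = F1 / (2 - F1).

```python
import numpy as np
from sklearn.metrics import f1_score, jaccard_score

# Toy binary labels (hypothetical data).
y_true = np.array([1, 1, 1, 0, 0])
y_pred = np.array([1, 1, 0, 1, 0])

# Intersection over union of the positive sets:
# 2 true positives / (2 TP + 1 FP + 1 FN).
jaccard = jaccard_score(y_true, y_pred)

# F1 on the same predictions, for comparison.
f1 = f1_score(y_true, y_pred)
jaccard_from_f1 = f1 / (2 - f1)

print(f"jaccard={jaccard:.3f} f1={f1:.3f} f1/(2-f1)={jaccard_from_f1:.3f}")
```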

In [13]:
#code


### 4.1.8. ROC


**ROC AUC (roc_auc_score)**:

- Definition: Computes the area under the ROC curve for binary classification problems. The ROC curve shows the true positive rate versus false positive rate for different classification thresholds.
- Note: It is a general metric for evaluating binary classifiers. The higher the value, the better the model's performance.

<br/>

**ROC AUC One-Versus-Rest (roc_auc_ovr)**:

- Definition: Computes the area under the ROC curve for multiclass classification problems using the one-versus-rest strategy. Each class is treated as the positive class, while the others are grouped as the negative class.
- Note: Useful when you have a multiclass classification problem.

<br/>

**ROC AUC One-Versus-One (roc_auc_ovo)**:

- Definition: Computes the area under the ROC curve for multiclass classification problems using the one-versus-one strategy. Each pair of classes is treated separately, and the average of the ROC AUC values is calculated.
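- Note: Useful when you have a multiclass classification problem. It evaluates every pair of classes, n(n-1)/2 pairs for n classes, so it requires more comparisons than the one-versus-rest strategy. With unweighted (macro) averaging it is insensitive to class imbalance.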

<br/>

**ROC AUC One-Versus-Rest Weighted (roc_auc_ovr_weighted)**:

- Definition: Computes the area under the ROC curve for multiclass classification problems using the one-versus-rest strategy, with classes weighted by the number of samples in each class.
- Note: Weights the metric by the number of samples in each class, useful when there is class imbalance.

<br/>

**ROC AUC One-Versus-One Weighted (roc_auc_ovo_weighted)**:

- Definition: Computes the area under the ROC curve for multiclass classification problems using the one-versus-one strategy, with classes weighted by the number of samples in each class.
- Note: Weights the metric by the number of samples in each class, useful when there is class imbalance.

<br/>

In summary, the choice between roc_auc, roc_auc_ovr, roc_auc_ovo, roc_auc_ovr_weighted, and roc_auc_ovo_weighted depends on the type of classification problem you are dealing with (binary or multiclass) and specific evaluation needs, including handling class imbalances.
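The scorer names above map onto the `multi_class` and `average` arguments of `roc_auc_score`. The sketch below uses hypothetical scores; in the multiclass case the input is a probability matrix with one column per class whose rows sum to 1, as `predict_proba` would return.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Binary case: positive-class scores (hypothetical values).
y_true_bin = np.array([0, 0, 1, 1])
y_score_bin = np.array([0.1, 0.4, 0.35, 0.8])
auc_binary = roc_auc_score(y_true_bin, y_score_bin)

# Multiclass case: imbalanced labels and class probabilities (hypothetical).
y_true_mc = np.array([0, 0, 0, 1, 1, 2])
y_proba_mc = np.array([
    [0.70, 0.20, 0.10],
    [0.50, 0.30, 0.20],
    [0.30, 0.50, 0.20],
    [0.20, 0.60, 0.20],
    [0.40, 0.35, 0.25],
    [0.10, 0.30, 0.60],
])

# roc_auc_ovr / roc_auc_ovr_weighted
auc_ovr = roc_auc_score(
    y_true_mc, y_proba_mc, multi_class="ovr", average="macro"
)
auc_ovr_weighted = roc_auc_score(
    y_true_mc, y_proba_mc, multi_class="ovr", average="weighted"
)
# roc_auc_ovo / roc_auc_ovo_weighted
auc_ovo = roc_auc_score(
    y_true_mc, y_proba_mc, multi_class="ovo", average="macro"
)
auc_ovo_weighted = roc_auc_score(
    y_true_mc, y_proba_mc, multi_class="ovo", average="weighted"
)

print(f"binary={auc_binary:.3f}")
print(f"ovr={auc_ovr:.3f} ovr_weighted={auc_ovr_weighted:.3f}")
print(f"ovo={auc_ovo:.3f} ovo_weighted={auc_ovo_weighted:.3f}")
```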

In [13]:
#code


### 4.2. Clustering



### 4.2.1. 


** **:

- Definition: 
- Note: 

In [13]:
#code


### 4.3. Regression


In [None]:
def basic_graphs(model, final_model, y, X):
    # Implement your basic_graphs function here
    print("Calling basic_graphs with parameters:")
    print("Model:", model)
    print("Final Model:", final_model)
    print("y:", y)
    print("X:", X)

def basic_tables(model, final_model, y, X):
    # Implement your basic_tables function here
    print("Calling basic_tables with parameters:")
    print("Model:", model)
    print("Final Model:", final_model)
    print("y:", y)
    print("X:", X)


---



## 5. Interactive interface


In [23]:
!jupyter labextension uninstall jupyter-matplotlib && jupyter labextension uninstall @jupyter-widgets/jupyterlab-manager && conda update -y widgetsnbextension && conda update -y nodejs && pip uninstall -y ipympl && pip install git+https://github.com/matplotlib/jupyter-matplotlib.git#egg=ipympl && conda update jupyterlab -y && jupyter labextension install @jupyter-widgets/jupyterlab-manager && jupyter labextension install jupyter-matplotlib && jupyter labextension update --all && jupyter lab build && jupyter nbextension list && jupyter labextension list

(Deprecated) Uninstalling extensions with the jupyter labextension uninstall command is now deprecated and will be removed in a future major version of JupyterLab.

Users should manage prebuilt extensions with package managers like pip and conda, and extension authors are encouraged to distribute their extensions as prebuilt packages
An error occurred.
ValueError: Please install Node.js and npm before continuing installation. You may be able to install Node.js from your package manager, from conda, or directly from the Node.js website (https://nodejs.org).
See the log file for details:  /tmp/jupyterlab-debug-ub5ng281.log


In [22]:
# import module
import ipywidgets as widgets

# creating button
widgets.Button(description = 'My Button')


Button(description='My Button', style=ButtonStyle())

In [21]:
import ipywidgets as widgets
from ipywidgets import interact, interact_manual, fixed

from random import choice

def lang():
    langSelect = ["English","Deutsch","Español","Italiano","한국어","日本語"]
    print(choice(langSelect))

interact_manual(lang)


interactive(children=(Button(description='Run Interact', style=ButtonStyle()), Output()), _dom_classes=('widge…

<function __main__.lang()>

In [17]:
import ipywidgets as widgets
from IPython.display import display
import subprocess

# Create input widgets
model_widget = widgets.Text(description='Model:')
final_model_widget = widgets.Text(description='Final Model:')
y_widget = widgets.Text(description='y:')
X_widget = widgets.Text(description='X:')

function_dropdown = widgets.Dropdown(
    options=['basic_graphs', 'basic_tables'],
    description='Select Function:'
)

run_button = widgets.Button(description='Run')

output_widget = widgets.Output()

def run_button_click(b):
    selected_function = function_dropdown.value
    model = model_widget.value
    final_model = final_model_widget.value
    y = y_widget.value
    X = X_widget.value
    
    with output_widget:
        output_widget.clear_output()
        result = subprocess.run(
            ["python", "sagemaker_interface.py",
            "--function", selected_function,
            "--model", model,
            "--final_model", final_model,
            "--y", y,
            "--X", X],
            capture_output=True,
            text=True
        )
        print(result.stdout)
        print(result.stderr)

run_button.on_click(run_button_click)

# Display widgets
input_widgets = widgets.VBox([model_widget, final_model_widget, y_widget, X_widget, function_dropdown, run_button])
display(input_widgets, output_widget)

VBox(children=(Text(value='', description='Model:'), Text(value='', description='Final Model:'), Text(value=''…

Output()


---



## 6. Results
