FRAPPE

Python library that exercises Feature RAnkers to Predict classification PerformancE of binary classifiers

More information on the project can be found in the submitted paper. Below you find a summary and a usage example.

Abstract

Machine Learning algorithms that perform classification are increasingly being adopted in Information and Communication Technology (ICT) systems and infrastructures thanks to their capability to profile the expected behavior of a system and detect anomalies due to ongoing errors or intrusions. Deploying a classifier for a given system requires conducting comparison and sensitivity analyses that are time-consuming, require domain expertise, and may not even achieve satisfactory classification performance, resulting in a waste of money and time for practitioners and stakeholders. This paper predicts the expected performance of classifiers without needing to select, craft, exercise, and compare them, requiring minimal expertise and machinery. Should classification performance be predicted to fall short of expectations, users could focus on improving data quality and monitoring systems instead of wasting time exercising classifiers, saving key time and money. The prediction strategy uses scores of feature rankers, which are processed by regressors to predict metrics such as the Matthews Correlation Coefficient (MCC) and the Area Under the ROC Curve (AUC) that quantify classification performance. We validate our prediction strategy through a massive experimental analysis using up to 12 feature rankers that process features from 23 public datasets, creating additional variants in the process and exercising supervised and unsupervised classifiers. Our findings show that it is possible to predict the value of performance metrics for supervised or unsupervised classifiers with a mean absolute error (MAE) of residuals lower than 0.1 for many classification tasks. The predictors are publicly available in a Python library whose usage is straightforward and does not require domain-specific skills or expertise.
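
As a rough illustration of the idea (not FRAPPE's internal implementation), the sketch below computes feature-ranker scores with scikit-learn and feeds their aggregate statistics to a regressor that predicts a performance metric. The synthetic datasets, the choice of rankers, the placeholder "measured" MCC values, and the regressor are all assumptions made for the example.

# Illustrative sketch only: FRAPPE's actual rankers, meta-features and regressors differ.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif, f_classif
from sklearn.ensemble import RandomForestRegressor

def ranker_scores(x, y):
    # Aggregate statistics of feature-ranker scores act as meta-features of a dataset
    mi = mutual_info_classif(x, y)
    f_stat, _ = f_classif(x, y)
    return np.array([mi.mean(), mi.max(), f_stat.mean(), f_stat.max()])

# Suppose the MCC of a classifier has already been measured on several datasets ...
datasets = [make_classification(n_samples=200, n_features=10, random_state=i) for i in range(20)]
meta_x = np.array([ranker_scores(x, y) for x, y in datasets])
measured_mcc = np.random.uniform(0.3, 0.9, size=len(datasets))  # placeholder for real measurements

# ... then a regressor can be trained to predict MCC from ranker scores alone
reg = RandomForestRegressor(random_state=0).fit(meta_x, measured_mcc)
new_x, new_y = make_classification(n_samples=200, n_features=10, random_state=99)
print("predicted MCC:", reg.predict([ranker_scores(new_x, new_y)])[0])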

Usage

Examples can be found in the 'debug' folder. In a nutshell, all you need to do is prepare your dataset as features and labels, initialize a FrappeInstance, and call the predict_metric function. Below is an example of code usage taken from one of the available code snippets.

# task, metric, dataset_name, MODELS_FOLDER, and the features/labels x, y
# are assumed to be defined earlier in the snippet
fr_obj = FrappeInstance(classification_type=task, target_metric=metric,
                        instance=FrappeType.FAST, models_folder=MODELS_FOLDER)
pred_met, feature_data_time, reg_time, data_row, true_mets = \
    fr_obj.predict_metric(x, y, dataset_name, compute_true=True)
print("[" + task + "@" + metric + "] Predicted: " + str(pred_met) +
      " was " + str(true_mets[metric]) +
      " ae of " + str(abs(pred_met - true_mets[metric])) +
      " time: [" + str(feature_data_time) + "; " + str(reg_time) + "]")
