wconf: The weighted confusion matrix and accuracy scores package for R

alexandrumonahov/wconf

wconf

Weighted Confusion Matrix

wconf is a package that allows users to create weighted confusion matrices and accuracy metrics. These help with model selection in classification problems in which the distance from the correct category matters.

The package includes several weighting schemes which can be parametrized, as well as custom configuration options. Users can also decide whether applying weights to the confusion matrix should raise or lower the accuracy score. “wconf” integrates well with the “caret” package, but it can also work standalone when provided data in matrix form.

Applying a weighting scheme to the confusion matrix can be useful in applications such as performance evaluation, where characteristics such as “underperforming”, “acceptable”, “overperforming” and “worker of the year” may represent gradations that are far apart and unevenly spaced. Similarly, where the objective is to classify geographic regions and proximity of the prediction to the actual region constitutes an advantage in terms of the model’s performance, applying a weighting scheme facilitates the model selection process.

v1.0.0

About wconf

wconf consists of the following functions:

weightmatrix - configure and visualize a weight matrix

This function allows users to choose from different weighting schemes and experiment with parametrizations and custom configurations.

weightmatrix(n, weight.type, weight.penalty, standard.deviation, geometric.multiplier, interval.high, interval.low, custom.weights, plot.weights)

n – the number of classes contained in the confusion matrix.

weight.type – the weighting scheme to be used. Can be one of: "arithmetic" - a decreasing arithmetic progression weighting scheme, "geometric" - a decreasing geometric progression weighting scheme, "normal" - weights drawn from the right tail of a normal distribution, "interval" - weights contained on a user-defined interval, "custom" - custom weight vector defined by the user.

weight.penalty – determines whether the weights associated with non-diagonal elements generated by the "normal", "arithmetic" and "geometric" weight types are positive or negative values. By default, the value is set to FALSE, which means that generated weights will be positive values.

standard.deviation – standard deviation of the normal distribution, if the normal distribution weighting scheme is used.

geometric.multiplier – the multiplier used to construct the geometric progression series, if the geometric progression weighting scheme is used.

interval.high – the upper bound of the weight interval, if the interval weighting scheme is used.

interval.low – the lower bound of the weight interval, if the interval weighting scheme is used.

custom.weights – the vector of custom weights to be applied, if the custom weighting scheme is selected. The length of the vector should equal "n"; a longer vector is accepted, and the excess values are ignored.

plot.weights – optional setting to enable plotting of the weight vector, which corresponds to the first column of the weight matrix.
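To make the schemes more concrete, here is a minimal base-R sketch of how a weight vector for each weight.type might be generated and then arranged into a weight matrix. The formulas, the default values, and the helper names sketch_weights and sketch_weight_matrix are illustrative assumptions, not the package's implementation; only the meaning of the parameters is taken from the descriptions above.

```r
# Illustrative sketch only: the formulas below are assumptions, not wconf's code.
# `sketch_weights` is a hypothetical helper. It returns a vector of length n in
# which element 1 is the weight of a correct prediction (distance 0) and
# element k is the weight of a prediction that is k - 1 classes away.
sketch_weights <- function(n, type = "arithmetic", penalty = FALSE,
                           sd = 2, multiplier = 2,
                           high = 1, low = -1, custom = NULL) {
  d <- 0:(n - 1)  # distance from the correct class
  w <- switch(type,
    arithmetic = (n - d) / n,                   # 1, (n-1)/n, ..., 1/n
    geometric  = multiplier^(-d),               # 1, 1/m, 1/m^2, ...
    normal     = dnorm(d, sd = sd) / dnorm(0, sd = sd),  # right tail, scaled to 1
    interval   = seq(high, low, length.out = n),         # from high down to low
    custom     = custom[seq_len(n)],            # excess values are ignored
    stop("unknown weight type: ", type)
  )
  # Assumed penalty rule: flip the sign of every off-diagonal weight.
  if (penalty && type %in% c("arithmetic", "geometric", "normal")) {
    w[-1] <- -w[-1]
  }
  w
}

# Hypothetical helper: cell (i, j) receives the weight for distance |i - j|,
# so the first column of the result equals the weight vector itself.
sketch_weight_matrix <- function(w) toeplitz(w)

w <- sketch_weights(4, "arithmetic")
W <- sketch_weight_matrix(w)
print(W)
```

Under these assumptions the diagonal always carries a weight of 1, so correct predictions count in full while misclassifications count less the further they are from the true class.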

wconfusionmatrix - compute a weighted confusion matrix

This function calculates the weighted confusion matrix by multiplying, element-by-element, a weight matrix with a supplied confusion matrix object.
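The element-by-element product can be illustrated in base R without the package. The counts and the weight vector below are made up for illustration, and the weight matrix is built with toeplitz() on the assumption that a cell's weight depends only on its distance from the diagonal.

```r
# Confusion matrix with made-up counts for three ordered classes.
# Rows are predictions and columns are reference values, as in caret's tables.
classes <- c("low", "mid", "high")
m <- matrix(c(50,  5,  1,
               4, 40,  6,
               2,  3, 45),
            nrow = 3, byrow = TRUE,
            dimnames = list(Prediction = classes, Reference = classes))

# Assumed weight matrix: 1 on the diagonal, 0.5 one class away, 0 two classes away.
W <- toeplitz(c(1, 0.5, 0))

# `*` multiplies cell by cell; `%*%` would compute the matrix product instead.
wm <- W * m
print(wm)
```

The diagonal is unchanged, near misses are halved, and predictions two classes away contribute nothing.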

wconfusionmatrix(m, weight.type, weight.penalty, standard.deviation, geometric.multiplier, interval.high, interval.low, custom.weights, print.weighted.accuracy)

m – a caret confusion matrix object or a plain matrix.

weight.type – the weighting scheme to be used. Can be one of: "arithmetic" - a decreasing arithmetic progression weighting scheme, "geometric" - a decreasing geometric progression weighting scheme, "normal" - weights drawn from the right tail of a normal distribution, "interval" - weights contained on a user-defined interval, "custom" - custom weight vector defined by the user.

weight.penalty – determines whether the weights associated with non-diagonal elements generated by the "normal", "arithmetic" and "geometric" weight types are positive or negative values. By default, the value is set to FALSE, which means that generated weights will be positive values.

standard.deviation – standard deviation of the normal distribution, if the normal distribution weighting scheme is used.

geometric.multiplier – the multiplier used to construct the geometric progression series, if the geometric progression weighting scheme is used.

interval.high – the upper bound of the weight interval, if the interval weighting scheme is used.

interval.low – the lower bound of the weight interval, if the interval weighting scheme is used.

custom.weights – the vector of custom weights to be applied, if the custom weighting scheme is selected. The length of the vector should equal "n"; a longer vector is accepted, and the excess values are ignored.

print.weighted.accuracy – optional setting to print the weighted accuracy metric, which represents the sum of all weighted confusion matrix cells divided by the total number of observations.
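As a rough illustration of the weighted accuracy metric, the sketch below computes it by hand in base R and compares it with the ordinary accuracy. The counts and weights are invented. The final call to wconfusionmatrix uses only argument names listed above, but it assumes that the arguments left out have defaults, which this page states only for weight.penalty, so it is guarded and should be treated as untested.

```r
# Made-up confusion matrix for three ordered classes.
m <- matrix(c(50,  5,  1,
               4, 40,  6,
               2,  3, 45),
            nrow = 3, byrow = TRUE)

# Assumed weights: full credit on the diagonal, half credit one class away.
W <- toeplitz(c(1, 0.5, 0))

# Ordinary accuracy: correct predictions over all observations.
plain_accuracy <- sum(diag(m)) / sum(m)

# Weighted accuracy: sum of all weighted cells over all observations.
weighted_accuracy <- sum(W * m) / sum(m)

print(c(plain = plain_accuracy, weighted = weighted_accuracy))

# Optional: the same matrix passed to the package, if it is installed.
# Assumption: arguments not given here have default values.
if (requireNamespace("wconf", quietly = TRUE)) {
  wconf::wconfusionmatrix(m, weight.type = "arithmetic",
                          print.weighted.accuracy = TRUE)
}
```

With these numbers the ordinary accuracy is 135/156 and the weighted accuracy is 144/156, because the 18 near misses each earn half credit.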

Technical details

For custom specifications, the weights are not restricted to any particular range. Depending on the user configuration, it is therefore possible to obtain negative accuracy scores.
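A small base-R sketch shows how a negative score can arise. The custom weights and the counts are invented, and the weight matrix layout (weight determined by distance from the diagonal) is an assumption made for illustration.

```r
# Made-up confusion matrix in which most predictions are wrong.
m <- matrix(c(2, 5, 8,
              5, 2, 5,
              8, 5, 2),
            nrow = 3, byrow = TRUE)

# Hypothetical custom weights: reward correct cells, punish errors heavily.
custom_weights <- c(1, -2, -5)
W <- toeplitz(custom_weights)

# Sum of weighted cells divided by the number of observations.
weighted_accuracy <- sum(W * m) / sum(m)
print(weighted_accuracy)  # negative: penalties outweigh the diagonal
```

Here the diagonal contributes 6, while the off-diagonal cells contribute -40 and -80, giving -114/42.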

Download and installation of development version

Online, from GitHub:

You can download wconf directly from GitHub. To do so, you need to have the devtools package installed and loaded. Once you are in R, run the following commands:

install.packages("devtools")

library("devtools")

install_github("alexandrumonahov/wconf")

You may encounter download errors from GitHub if you are behind a firewall or HTTPS downloads are restricted. In that case, try setting the download method with one of the following commands (the "wininet" method is available only on Windows):

options(download.file.method = "libcurl")

options(download.file.method = "wininet")
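If you would rather not change the download method for the rest of the session, one possible pattern is to set it temporarily and restore the previous value afterwards. This is only a sketch: the install call is left commented out because it needs network access, and "libcurl" is used here merely as an example value.

```r
# options() returns the previous value(s), which makes restoring easy.
previous <- options(download.file.method = "libcurl")

# devtools::install_github("alexandrumonahov/wconf")  # run the install here

# Put the earlier setting back once the installation has finished.
options(previous)
```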

Once the package is installed, you can load it with the library(wconf) command.

Author details

Alexandru Monahov, 2023
