
Ali98Wayne/WEKA-Classifier-Comparisons

WEKA Classifier Comparisons

This is C++ code written for a homework assignment in my "Data Mining: Intelligent Systems: Algorithms and Tools" graduate course in winter 2025. Before the C++ code was written, WEKA was used to load the following datasets: anneal.arff, autos.arff, glass.arff, labor.arff, sonar.arff, tic-tac-toe.arff, and vehicle.arff. The following classifiers were then trained on each dataset using 10-fold cross-validation and default parameters: Naïve Bayes, J48, IBk, SMO, MultilayerPerceptron, Bagging, and AdaBoostM1. For each classifier on each dataset, the accuracy (the weighted average of the true positive rate) was recorded, along with the size of each dataset.

To compare the classifiers on each dataset, a Z-statistic value must be computed:

Z = (pA – pB) / √((2p(1 – p)) / N)

Where:

  • pA: Accuracy of classifier A
  • pB: Accuracy of classifier B
  • p = (pA + pB) / 2
  • N = Dataset size
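The formula above can be sketched as a small C++ function. This is a minimal illustration, not the repository's actual code: the name `zStatistic` is hypothetical, accuracies are assumed to be fractions in [0, 1] rather than percentages, and returning 0 when the denominator vanishes is an assumption made here to avoid dividing by zero.

```cpp
#include <cassert>
#include <cmath>

// Z-statistic for comparing classifier A against classifier B on one dataset.
// pA, pB: accuracies as fractions in [0, 1]; n: number of instances in the dataset.
// (Hypothetical helper; the name and signature are not taken from the repository.)
double zStatistic(double pA, double pB, int n) {
    assert(n > 0);
    const double p = (pA + pB) / 2.0;                      // pooled accuracy
    const double se = std::sqrt(2.0 * p * (1.0 - p) / n);  // denominator of Z
    if (se == 0.0) {
        // Only happens when both accuracies are 0 or both are 1.
        // Assumption: treat that case as "no difference".
        return 0.0;
    }
    return (pA - pB) / se;
}
```

For example, with made-up accuracies of 0.90 and 0.80 on a dataset of 200 instances, `zStatistic(0.9, 0.8, 200)` evaluates to roughly 2.80. Swapping the two accuracies flips the sign, which is why the comparison table is symmetric.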

The Z-statistic comparisons are summarized in a 7x7 table, with each cell formatted as win-loss-tie counts over the seven datasets. A win means classifier A is significantly better than classifier B (Z > 1.96), a loss means classifier B is significantly better than classifier A (Z < -1.96), and a tie is any value between those thresholds (-1.96 ≤ Z ≤ 1.96). The overall winner is the classifier with the largest total number of wins across all comparisons.
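The win-loss-tie bookkeeping for one table cell could look like the sketch below. It assumes the usual two-sided 95% critical value of 1.96, so a win requires Z > 1.96 and a loss requires Z < -1.96. The names `Outcome`, `classify`, `Record`, `tally`, and `formatRecord` are hypothetical and are not taken from the repository.

```cpp
#include <string>

// Result of comparing classifier A against classifier B on one dataset.
enum class Outcome { Win, Loss, Tie };

// Map a Z-statistic to an outcome using a symmetric critical value.
Outcome classify(double z, double critical = 1.96) {
    if (z > critical) return Outcome::Win;    // A significantly better than B
    if (z < -critical) return Outcome::Loss;  // B significantly better than A
    return Outcome::Tie;                      // no significant difference
}

// Running win-loss-tie counts for one cell of the 7x7 table.
struct Record {
    int wins = 0;
    int losses = 0;
    int ties = 0;
};

// Update a cell with the Z-statistic from one dataset.
void tally(Record& r, double z) {
    switch (classify(z)) {
        case Outcome::Win:  ++r.wins;   break;
        case Outcome::Loss: ++r.losses; break;
        case Outcome::Tie:  ++r.ties;   break;
    }
}

// Render a cell in the win-loss-tie format, e.g. "3-1-3".
std::string formatRecord(const Record& r) {
    return std::to_string(r.wins) + "-" + std::to_string(r.losses) + "-" +
           std::to_string(r.ties);
}
```

Calling `tally` once per dataset for a given pair of classifiers fills one cell. Summing `wins` along a classifier's row then gives the total used to pick the overall winner.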

The C++ code automates the Z-statistic calculations. The homework problem calls for 147 unique values (21 classifier pairs on each of 7 datasets, since the table is symmetric), which would be tedious to calculate by hand. To see all 294 values (with duplicates), change line 26 from “int k = j + 1” to “int k = 0”.
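The two counts can be checked with a loop of the same shape. This is only a sketch of how such a loop might be organized: `j` and `k` follow the variable names quoted above, while `countComparisons`, the dataset index `i`, and the explicit skip of the `j == k` self-comparison are assumptions made for illustration.

```cpp
// Counts how many Z-statistics are produced for a given loop start of k.
// uniqueOnly = true  mirrors "int k = j + 1" (each unordered pair once).
// uniqueOnly = false mirrors "int k = 0"     (both orderings of each pair).
// (Hypothetical helper; the repository's loop may be organized differently.)
int countComparisons(int numDatasets, int numClassifiers, bool uniqueOnly) {
    int count = 0;
    for (int i = 0; i < numDatasets; ++i) {
        for (int j = 0; j < numClassifiers; ++j) {
            for (int k = uniqueOnly ? j + 1 : 0; k < numClassifiers; ++k) {
                if (k == j) continue;  // assumption: a classifier is never compared with itself
                ++count;
            }
        }
    }
    return count;
}
```

With 7 datasets and 7 classifiers, `countComparisons(7, 7, true)` gives 7 × 21 = 147 and `countComparisons(7, 7, false)` gives 7 × 42 = 294, matching the figures quoted above.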

7x7 Classifier Comparison Table:

image

NOTE: Cells that pair a classifier with itself do not need Z-statistics and are marked with “-”.

Code Output Screenshots:

[Screenshots 1 to 4 of the code output]
