Algorithms included in LAC

The following algorithms are included in the current version of LAC.

  • CBA: the very first algorithm for AC, proposed by Liu et al. It obtains an associative classifier by means of a two-part process. First, it extracts all the rules whose confidence and support are above user-specified thresholds, using Apriori. Apriori was originally designed for association rule mining, where the target of discovery is not pre-determined since no class exists; here it is adapted to obtain only the subset of rules whose consequent is a class value, known as class association rules. Second, once all the class association rules have been extracted, CBA sorts them and removes those which do not cover enough instances of the dataset, so that redundancy is eliminated. Additionally, it includes a default class for the cases which are not covered by any rule, obtained as the majority class of the instances left uncovered during training. A sketch of this coverage step is shown after this list.
  • CBA2: it emerges as an evolution of CBA, proposed by the very same authors. Unlike its predecessor, it uses multiple minimum class supports, that is, a different minimum support per class. In this way, imbalance among the frequencies of occurrence of the classes does not harm the accuracy of the classifier. Whereas in CBA rules for minority classes may be ignored entirely because they never reach enough frequency of occurrence, this approach solves the problem: the support of each rule is evaluated with regard to the frequency of its class, so rules can be obtained for every class no matter how frequent that class is (see the per-class support sketch after this list).
  • ACAC: it obtains the classifier directly and does not require the two-step process used by many other AC algorithms. Whereas almost all AC algorithms are based on the traditional support and confidence framework, this method adds a new metric, known as all-confidence, which represents the minimum confidence of all association rules that can be extracted from an itemset. Thanks to this measure, the authors claim to reduce the number of generated rules and speed up the process. The reduction is so large that, according to the authors, no post-processing step is needed, since all the generated rules are interesting enough to be kept in the classifier. The all-confidence measure is sketched after this list.
  • ACCF: it makes use of a two-step process to obtain the final classifier. ACCF generates a much smaller set of high-quality predictive rules from the dataset by mining closed frequent itemsets: an itemset is closed if no superset has the same support count as the itemset itself. In this way, ACCF obtains fewer rules without losing accuracy, speeding up rule discovery (the closedness check is sketched after this list). Second, rules are sorted following the same criteria as CBA with one small difference: whereas CBA considers that a rule X => Y covers an example t iff X ⊆ t, ACCF relaxes this condition for the cases where no rule completely covers an example, selecting the first rule which contains exactly one item of t. In this way, the authors claim that no default class is required.
  • ACN: this algorithm builds classifiers not only with positive rules but also with negative ones. It is based on a two-step process: the first step obtains both positive and negative rules, while the second sorts and post-processes those rules to obtain an accurate classifier. In this last phase, ACN also calculates Pearson's correlation coefficient for each rule and prunes the rule if its correlation is below a user-specified threshold (sketched after this list).
  • L3: it also uses a two-step process to obtain the final classifier. First, it obtains the rules whose confidence and support are above user-specified thresholds, using an adaptation of FP-Growth that incorporates multiple support thresholds to address the class imbalance problem. Then a classifier is formed with the previously obtained rules, and this is where most of the novelty of the algorithm lies. It performs a lazy pruning: the idea is to discard from the classifier only the rules that do not correctly classify any training case, i.e. the rules that only contribute negatively to the classification of training cases. To this end, after rule sorting, the training cases are covered to detect such "harmful" rules. After discarding them, the remaining rules are divided into two groups: 1) used rules, which have correctly classified at least one training case; 2) spare rules, which have not been used during the training phase but may become useful later. This partition is sketched after this list.
  • ADT: this algorithm aims at obtaining rules whose confidence is higher than a user-specified threshold. Thus, unlike all the previous ones, it does not consider support but only confidence. Since mining confident rules does not require a minimum support, the classic support-based algorithms cannot be used; the authors instead exploit confidence-based pruning to discard unpromising rules as early as possible. Furthermore, the lattice of rules is converted into a tree, called ADT, where general rules are placed at higher levels and specific rules at lower levels, so that pruning can be carried out more efficiently (a simplified sketch of this idea appears after this list).
  • MAC: it also follows a two-step process to obtain the final classifier. It uses a vertical mining approach to discover the knowledge in the dataset, whereas the majority of AC algorithms employ a level-wise (Apriori-like) search to discover the rules. The main advantage of vertical mining is that the dataset is scanned only once; after that, simple intersections among the TID lists of ruleitems of size one are enough to derive the remaining ruleitems, which are the input of the rule-production step (sketched after this list). Second, the classifier is built with the previously obtained rules: rules are sorted, and MAC iterates over the set of discovered rules in a top-down fashion, marking the first rule that matches each training case as a classifier rule. The process is repeated until all training cases have been used. This way of constructing the classifier does not consider the similarity between the class of the candidate rule and that of the training case during the selection of classifier rules; the authors claim this reduces the overfitting of the resulting classifier as well as its size.
  • CPAR: it follows a two-step process to obtain the final classifier. First, it obtains all the rules by means of an adaptation of the First Order Inductive Learner (FOIL) algorithm for rule generation; the resulting rules are merged to form the classifier rule set and pruned according to the Laplace accuracy measure. CPAR was the first AC technique to use Laplace accuracy to assign the class to test cases during prediction, and it uses the best k rules of each group to predict the class label of each new tuple. Once all rules are found and ranked, and a test case t is about to be predicted, CPAR iterates over the rule set and marks all the rules in the classifier that cover t. If more than one rule is applicable to t, CPAR divides them into groups according to their classes and calculates the average expected accuracy of each group. Finally, it assigns to t the class with the largest average expected accuracy (see the sketch after this list).
  • CMAR: one of the pioneering algorithms that predicts unseen examples using multiple rules rather than only one. The authors proposed it to address datasets where classification based on a single high-confidence rule is unreliable. As before, rules are obtained in a two-step process. First, rules are mined with an algorithm for association rule mining; in concrete terms, CMAR uses FP-Growth, which obtains rules without candidate generation thanks to a data structure that speeds up the process. Second, once all the rules have been obtained, they are sorted and filtered using database coverage. Prediction of unseen examples relies on a weighted χ² analysis over multiple strong association rules (sketched after this list).
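
The snippets below are minimal, self-contained Java sketches of some of the mechanics described above. They are illustrations only: class names such as Rule or Instance are hypothetical and do not correspond to LAC's actual API. The first one sketches CBA's database-coverage phase, assuming the class association rules have already been mined and sorted: a rule is kept if it correctly classifies at least one still-uncovered training instance, and the default class is the majority class of whatever remains uncovered.

```java
import java.util.*;

// Minimal sketch of CBA's second phase (database coverage), assuming the
// class association rules are already mined and sorted. A rule is kept if
// it correctly classifies at least one still-uncovered training instance;
// every instance it covers is then removed. The default class is the
// majority class of the instances left uncovered. Types are illustrative
// and do not correspond to LAC's actual API.
public class CbaCoverageSketch {
    record Rule(Set<String> antecedent, String klass) {
        boolean covers(Set<String> items) { return items.containsAll(antecedent); }
    }
    record Instance(Set<String> items, String klass) {}
    record Classifier(List<Rule> rules, String defaultClass) {}

    static Classifier databaseCoverage(List<Rule> sortedRules, List<Instance> train) {
        List<Rule> kept = new ArrayList<>();
        List<Instance> uncovered = new ArrayList<>(train);
        for (Rule r : sortedRules) {
            boolean correctOnce = uncovered.stream()
                    .anyMatch(t -> r.covers(t.items()) && t.klass().equals(r.klass()));
            if (!correctOnce) continue;                    // rule never classifies correctly
            uncovered.removeIf(t -> r.covers(t.items()));  // consume every case it covers
            kept.add(r);
        }
        // Default class = majority class among the instances no kept rule covers
        Map<String, Long> counts = new HashMap<>();
        for (Instance t : uncovered) counts.merge(t.klass(), 1L, Long::sum);
        String defaultClass = counts.entrySet().stream()
                .max(Map.Entry.comparingByValue()).map(Map.Entry::getKey).orElse("none");
        return new Classifier(kept, defaultClass);
    }

    public static void main(String[] args) {
        List<Instance> train = List.of(
                new Instance(Set.of("a", "b"), "yes"),
                new Instance(Set.of("a", "c"), "yes"),
                new Instance(Set.of("c", "d"), "no"));
        List<Rule> rules = List.of(new Rule(Set.of("a"), "yes"), new Rule(Set.of("e"), "no"));
        Classifier c = databaseCoverage(rules, train);
        System.out.println(c.rules().size() + " rule(s) kept, default class = " + c.defaultClass());
    }
}
```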
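
CBA2's multiple minimum class supports can be illustrated by distributing a global minimum support among the classes according to their frequency; the helper below is hypothetical, not part of LAC.

```java
import java.util.*;

// Sketch of CBA2's multiple minimum class supports: the global minimum
// support is scaled by each class's relative frequency, so rules of rare
// classes are not required to reach the same absolute support as rules of
// frequent classes. Names and values are illustrative.
public class Cba2MinSupSketch {
    static Map<String, Double> perClassMinSup(Map<String, Integer> classCounts,
                                              double globalMinSup) {
        int total = classCounts.values().stream().mapToInt(Integer::intValue).sum();
        Map<String, Double> minSups = new HashMap<>();
        classCounts.forEach((klass, count) ->
                minSups.put(klass, globalMinSup * count / (double) total));
        return minSups;
    }

    public static void main(String[] args) {
        // 90 "no" vs 10 "yes": with a global minsup of 0.05 the minority class
        // only needs a support of 0.005 for its rules to be considered.
        System.out.println(perClassMinSup(Map.of("no", 90, "yes", 10), 0.05));
    }
}
```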
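
The all-confidence measure used by ACAC is the support of an itemset divided by the largest support of its individual items, which equals the minimum confidence of any rule derivable from that itemset. A naive computation over an in-memory list of transactions (illustrative data, not LAC code):

```java
import java.util.*;

// Sketch of the all-confidence measure used by ACAC: the support of an
// itemset divided by the largest support of its single items, i.e. the
// minimum confidence of every rule that can be derived from the itemset.
public class AllConfidenceSketch {
    static double support(Set<String> itemset, List<Set<String>> transactions) {
        long count = transactions.stream().filter(t -> t.containsAll(itemset)).count();
        return count / (double) transactions.size();
    }

    static double allConfidence(Set<String> itemset, List<Set<String>> transactions) {
        double maxItemSupport = itemset.stream()
                .mapToDouble(item -> support(Set.of(item), transactions))
                .max().orElse(0.0);
        return maxItemSupport == 0.0 ? 0.0 : support(itemset, transactions) / maxItemSupport;
    }

    public static void main(String[] args) {
        List<Set<String>> db = List.of(
                Set.of("a", "b", "c"), Set.of("a", "b"), Set.of("a", "c"), Set.of("b", "c"));
        System.out.println(allConfidence(Set.of("a", "b"), db)); // (2/4) / (3/4) = 0.666...
    }
}
```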
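
A closed-itemset check in the spirit of ACCF: because support is anti-monotone, it suffices to compare the itemset against its immediate (one-item-larger) supersets. The brute-force support counting below is only for illustration.

```java
import java.util.*;

// Sketch of the closedness test behind ACCF: an itemset is closed if no
// proper superset has exactly the same support count. By anti-monotonicity
// of support, checking one-item extensions is enough.
public class ClosedItemsetSketch {
    static long supportCount(Set<String> itemset, List<Set<String>> transactions) {
        return transactions.stream().filter(t -> t.containsAll(itemset)).count();
    }

    static boolean isClosed(Set<String> itemset, List<Set<String>> transactions,
                            Set<String> allItems) {
        long base = supportCount(itemset, transactions);
        for (String item : allItems) {
            if (itemset.contains(item)) continue;
            Set<String> superset = new HashSet<>(itemset);
            superset.add(item);
            if (supportCount(superset, transactions) == base) return false; // not closed
        }
        return true;
    }

    public static void main(String[] args) {
        List<Set<String>> db = List.of(Set.of("a", "b"), Set.of("a", "b"), Set.of("a", "c"));
        Set<String> items = Set.of("a", "b", "c");
        System.out.println(isClosed(Set.of("b"), db, items)); // false: {a,b} has the same count
        System.out.println(isClosed(Set.of("a"), db, items)); // true
    }
}
```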
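
For ACN's pruning step, Pearson's correlation between the antecedent of a rule and its class reduces, for binary indicators, to the phi coefficient of the 2x2 contingency table. The counts and threshold below are made up for the example.

```java
// Sketch of the correlation-based pruning used in ACN: for a rule X => c,
// Pearson's correlation between "X is satisfied" and "the class is c"
// reduces to the phi coefficient of the 2x2 contingency table. A rule whose
// correlation falls below a user-given threshold would be pruned.
public class AcnCorrelationSketch {
    // n11: X and c, n10: X and not c, n01: not X and c, n00: neither
    static double phi(long n11, long n10, long n01, long n00) {
        double row1 = n11 + n10, row0 = n01 + n00;
        double col1 = n11 + n01, col0 = n10 + n00;
        double denom = Math.sqrt(row1 * row0 * col1 * col0);
        return denom == 0.0 ? 0.0 : (n11 * n00 - n10 * n01) / denom;
    }

    public static void main(String[] args) {
        double correlation = phi(40, 10, 15, 35); // illustrative counts
        double threshold = 0.2;                   // user-specified, illustrative
        System.out.printf("phi = %.3f, pruned = %b%n", correlation, correlation < threshold);
    }
}
```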
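
A sketch of L3's lazy pruning, assuming the rules are already sorted: rules that cover remaining training cases but never classify one correctly are treated as "harmful" and dropped, while the rest are split into used and spare rules. The Rule and Instance types are again hypothetical.

```java
import java.util.*;

// Sketch of L3's lazy pruning over sorted rules. While covering the
// training cases, rules that only misclassify ("harmful") are discarded;
// the rest are split into "used" rules and "spare" rules, the latter kept
// for cases no used rule matches.
public class LazyPruningSketch {
    record Rule(Set<String> antecedent, String klass) {
        boolean covers(Set<String> items) { return items.containsAll(antecedent); }
    }
    record Instance(Set<String> items, String klass) {}

    public static void main(String[] args) {
        List<Instance> train = List.of(
                new Instance(Set.of("a", "b"), "yes"),
                new Instance(Set.of("b", "c"), "no"));
        List<Rule> sortedRules = List.of(
                new Rule(Set.of("a"), "yes"),   // correct for the first case -> used
                new Rule(Set.of("b"), "yes"),   // only misclassifies what is left -> harmful
                new Rule(Set.of("d"), "no"));   // matches nothing yet -> spare

        List<Rule> used = new ArrayList<>(), spare = new ArrayList<>();
        List<Instance> remaining = new ArrayList<>(train);
        for (Rule r : sortedRules) {
            List<Instance> covered = remaining.stream()
                    .filter(t -> r.covers(t.items())).toList();
            if (covered.isEmpty()) { spare.add(r); continue; }
            boolean correctOnce = covered.stream().anyMatch(t -> t.klass().equals(r.klass()));
            if (correctOnce) { used.add(r); remaining.removeAll(covered); }
            // otherwise the rule is "harmful" and silently discarded
        }
        System.out.println("used=" + used.size() + " spare=" + spare.size());
    }
}
```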
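
The general-to-specific organisation exploited by ADT can be illustrated by pruning a specific rule whenever a more general rule (one with a subset of its antecedent) is at least as confident. This is a simplified reading of the confidence-based pruning idea, not the exact criterion of the ADT paper.

```java
import java.util.*;

// Illustration of pruning rules organised from general (short antecedent)
// to specific (long antecedent): a more specific rule is only worth keeping
// if it is more confident than the general rules it extends. Simplified
// reading of ADT's confidence-based pruning, not its exact criterion.
public class GeneralToSpecificPruningSketch {
    record Rule(Set<String> antecedent, String klass, double confidence) {}

    static List<Rule> prune(List<Rule> rules) {
        List<Rule> kept = new ArrayList<>();
        for (Rule specific : rules) {
            boolean dominated = rules.stream().anyMatch(general ->
                    general != specific
                    && specific.antecedent().containsAll(general.antecedent())
                    && general.antecedent().size() < specific.antecedent().size()
                    && general.confidence() >= specific.confidence());
            if (!dominated) kept.add(specific);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Rule> rules = List.of(
                new Rule(Set.of("a"), "yes", 0.80),
                new Rule(Set.of("a", "b"), "yes", 0.75),  // pruned: no better than {a}
                new Rule(Set.of("a", "c"), "yes", 0.95)); // kept: improves on {a}
        prune(rules).forEach(r -> System.out.println(r.antecedent() + " => " + r.klass()));
    }
}
```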
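
The vertical mining behind MAC relies on TID lists: after one scan of the dataset, the support of larger ruleitems is obtained by intersecting the TID sets of their single items. The TID values below are invented for the example.

```java
import java.util.*;

// Sketch of the vertical (TID-list) mining used by MAC: each item keeps
// the set of transaction ids where it occurs, and the support of larger
// ruleitems is obtained by intersecting those sets instead of rescanning
// the dataset. Representation is illustrative.
public class TidIntersectionSketch {
    public static void main(String[] args) {
        // TID lists built from one pass over the dataset (illustrative values)
        Map<String, Set<Integer>> tids = Map.of(
                "a", Set.of(1, 2, 3, 5),
                "b", Set.of(2, 3, 4),
                "class=yes", Set.of(1, 2, 3));

        // Support of the ruleitem <{a,b}, yes> = |tids(a) ∩ tids(b) ∩ tids(class=yes)|
        Set<Integer> itemset = new HashSet<>(tids.get("a"));
        itemset.retainAll(tids.get("b"));
        Set<Integer> ruleitem = new HashSet<>(itemset);
        ruleitem.retainAll(tids.get("class=yes"));

        System.out.println("support({a,b}) = " + itemset.size()
                + ", support(<{a,b}, yes>) = " + ruleitem.size());
    }
}
```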
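
CPAR's prediction can be sketched as follows: the rules covering the test case are grouped by class, the best k rules of each group according to Laplace accuracy are averaged, and the class with the highest average is returned. Laplace accuracy is computed here as (nc + 1) / (ntot + numberOfClasses); k, the types and the rule values are illustrative.

```java
import java.util.*;
import java.util.stream.Collectors;

// Sketch of CPAR's prediction: rules covering the test case are grouped by
// class, the best k rules of each group (by Laplace accuracy) are averaged,
// and the class with the highest average wins.
public class CparPredictionSketch {
    record Rule(Set<String> antecedent, String klass, double laplaceAccuracy) {
        boolean covers(Set<String> items) { return items.containsAll(antecedent); }
    }

    // Laplace accuracy = (correctly covered + 1) / (total covered + number of classes)
    static double laplace(long correctlyCovered, long totalCovered, int numClasses) {
        return (correctlyCovered + 1.0) / (totalCovered + numClasses);
    }

    static String predict(Set<String> testCase, List<Rule> rules, int k) {
        Map<String, List<Rule>> byClass = rules.stream()
                .filter(r -> r.covers(testCase))
                .collect(Collectors.groupingBy(Rule::klass));
        String best = null;
        double bestAvg = -1;
        for (var entry : byClass.entrySet()) {
            double avg = entry.getValue().stream()
                    .sorted(Comparator.comparingDouble(Rule::laplaceAccuracy).reversed())
                    .limit(k)
                    .mapToDouble(Rule::laplaceAccuracy).average().orElse(0);
            if (avg > bestAvg) { bestAvg = avg; best = entry.getKey(); }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Rule> rules = List.of(
                new Rule(Set.of("a"), "yes", laplace(9, 10, 2)),
                new Rule(Set.of("b"), "yes", laplace(6, 10, 2)),
                new Rule(Set.of("a", "b"), "no", laplace(8, 12, 2)));
        System.out.println(predict(Set.of("a", "b", "c"), rules, 5)); // prints "yes"
    }
}
```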
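
Finally, CMAR's weighted χ² prediction can be sketched by scoring each class with the sum of chi2 * chi2 / maxChi2 over its matching rules and choosing the highest-scoring class; the χ² statistics are assumed to be precomputed from each rule's contingency table, and the numbers below are invented.

```java
import java.util.*;
import java.util.stream.Collectors;

// Sketch of CMAR's prediction by weighted chi-square: rules matching the
// test case are grouped by class, each group is scored by summing
// chi2 * chi2 / maxChi2 over its rules, and the best-scoring class wins.
// chi2 and maxChi2 are assumed precomputed; types are illustrative.
public class CmarWeightedChiSquareSketch {
    record Rule(Set<String> antecedent, String klass, double chi2, double maxChi2) {
        boolean covers(Set<String> items) { return items.containsAll(antecedent); }
    }

    static String predict(Set<String> testCase, List<Rule> rules) {
        Map<String, Double> scores = rules.stream()
                .filter(r -> r.covers(testCase))
                .collect(Collectors.groupingBy(Rule::klass,
                        Collectors.summingDouble((Rule r) -> r.chi2() * r.chi2() / r.maxChi2())));
        return scores.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey).orElse("default");
    }

    public static void main(String[] args) {
        List<Rule> rules = List.of(
                new Rule(Set.of("a"), "yes", 4.2, 9.0),
                new Rule(Set.of("b"), "yes", 2.1, 6.0),
                new Rule(Set.of("a", "b"), "no", 5.0, 7.5));
        System.out.println(predict(Set.of("a", "b"), rules)); // "no" wins: 3.33 vs 2.70
    }
}
```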