# Part 3: Rule Learning

We use the CN2 Learner of the Orange Python API to predict wine Proline class.

We begin by importing needed libraries. This time we are only going to need Orange to work with

In [1]:
import Orange

We load our data from the `wine.csv` file

In [2]:
data = Orange.data.Table("wine")

Let's take a look at the shape of our features and targets

In [3]:
data.X.shape, data.Y.shape

((178, 13), (178,))

What are the feature names?

In [4]:
data.domain

[Alcohol, Malic Acid, Ash, Alcalinity of ash, Magnesium, Total phenols, Flavanoids, Nonflavanoid phenols, Proanthocyanins, Color intensity, Hue, OD280/OD315 of diluted wines, Proline | Wine]

We will use these hyperparameters to train 3 different learners.

The values below are the optimal values.

In [5]:
ordered_learner_optimal = {'beam_width':4,
                          'min_rule_cvg':15,
                          'max_rule_length':3}
unordered_learner_optimal = {'beam_width':2,
                            'min_rule_cvg':5,
                            'max_rule_length':3}
ordered_laplace_learner_optimal = {'beam_width':11,
                                  'min_rule_cvg':8,
                                  'max_rule_length':3}

We build our learners

In [6]:
ordered_learner = Orange.classification.CN2Learner()
ordered_learner.rule_finder.search_algorithm.beam_width = ordered_learner_optimal['beam_width']
ordered_learner.rule_finder.general_validator.min_covered_examples = ordered_learner_optimal['min_rule_cvg']
ordered_learner.rule_finder.general_validator.max_rule_length = ordered_learner_optimal['max_rule_length']

unordered_learner = Orange.classification.CN2UnorderedLearner()
unordered_learner.rule_finder.search_algorithm.beam_width = unordered_learner_optimal['beam_width']
unordered_learner.rule_finder.general_validator.min_covered_examples = unordered_learner_optimal['min_rule_cvg']
unordered_learner.rule_finder.general_validator.max_rule_length = unordered_learner_optimal['max_rule_length']
            
ordered_laplace_learner = Orange.classification.CN2Learner()
ordered_laplace_learner.rule_finder.quality_evaluator = Orange.classification.rules.LaplaceAccuracyEvaluator()
ordered_laplace_learner.rule_finder.search_algorithm.beam_width = ordered_laplace_learner_optimal['beam_width']
ordered_laplace_learner.rule_finder.general_validator.min_covered_examples = ordered_laplace_learner_optimal['min_rule_cvg']
ordered_laplace_learner.rule_finder.general_validator.max_rule_length = ordered_laplace_learner_optimal['max_rule_length']

We will use a dictionary to manage training of the three different models

In [7]:
learners = {'ordered learner':ordered_learner,
            'unordered learner':unordered_learner,
            'ordered laplace':ordered_laplace_learner}

We create our cross-validator. We will use this to evaluate our models.

For number of folds we will use the default setting, which is 10.

In [8]:
cv = Orange.evaluation.CrossValidation(stratified=True)

We proceed to train our models with the cross-validator and gain some results.

In [9]:
results = cv(data, learners.values())

We compute the accuracy, recall, precision and F1 scores of our models

In [10]:
accuracy = Orange.evaluation.scoring.CA(results)
recall = Orange.evaluation.scoring.Recall(results, average='macro')
precision = Orange.evaluation.scoring.Precision(results, average='macro')
f1 = Orange.evaluation.scoring.F1(results, average='macro')

And proceed to display them

In [11]:
for learner, acc, rec, prec, f1_score in zip(learners.keys(), accuracy, recall, precision, f1):
    print("\n" + learner)
    print("Accuracy: " + str(acc))
    print("Precision: " + str(rec))
    print("Recall: " + str(rec))
    print("F1: " + str(f1_score))


ordered learner
Accuracy: 0.9325842696629213
Precision: 0.9340485000397868
Recall: 0.9340485000397868
F1: 0.9354243665301502

unordered learner
Accuracy: 0.9269662921348315
Precision: 0.9229446831649027
Recall: 0.9229446831649027
F1: 0.9282453367910639

ordered laplace
Accuracy: 0.9269662921348315
Precision: 0.924854446831649
Recall: 0.924854446831649
F1: 0.9275011339574802


We will now extract some rules by training our models again and making new classifiers out of them

In [12]:
classifiers = dict([(learner[0], learner[1](data)) for learner in learners.items()])

We display the rules of those classifiers

In [13]:
for classifier in classifiers.items():
    print("\n" + classifier[0])
    for rule in classifier[1].rule_list:
        print(rule)


ordered learner
IF Proline>=990.0 THEN Wine=1 
IF Color intensity>=6.62 THEN Wine=3 
IF Color intensity<=3.52 AND Alcalinity of ash>=20.4 THEN Wine=2 
IF Ash<=2.1 AND Hue>=0.93 THEN Wine=2 
IF Flavanoids<=1.3 AND Color intensity>=3.85 THEN Wine=3 
IF Alcohol<=12.93 AND Nonflavanoid phenols>=0.26 THEN Wine=2 
IF Proline>=680.0 AND Flavanoids>=2.41 THEN Wine=1 
IF TRUE THEN Wine=2 

unordered learner
IF Proline>=760.0 AND Alcohol>=13.39 THEN Wine=1 
IF Proline>=1015.0 THEN Wine=1 
IF Alcohol>=12.77 AND OD280/OD315 of diluted wines>=2.48 AND Magnesium>=102.0 THEN Wine=1 
IF Color intensity<=3.52 AND Nonflavanoid phenols>=0.29 THEN Wine=2 
IF Alcohol<=12.2 AND Total phenols>=1.45 THEN Wine=2 
IF Malic Acid<=1.35 AND Hue>=1.04 THEN Wine=2 
IF Proline<=480.0 AND Flavanoids>=1.41 THEN Wine=2 
IF Flavanoids<=0.99 AND Color intensity>=3.85 THEN Wine=3 
IF Hue<=0.79 AND Color intensity>=4.1 THEN Wine=3 
IF TRUE THEN Wine=2 

ordered laplace
IF Proline>=760.0 AND Alcohol>=13.39 THEN Wine=1 
IF A