In [None]:
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"
%matplotlib inline

# Hand-computing support and confidence

In [None]:
from sympy import *                 # SymPy is a symbolic mathematics library for Python.
from itertools import combinations  # combinations() enumerates the k-item subsets we need to compute support and confidence.
#init_printing()

I want to generate k-subsets from a given set.

In [None]:
def subsets(S, k):
  # combinations(S, k) yields every k-element tuple drawn from S;
  # set(s) converts each tuple to a set, so the result is a list of k-element sets.
  return [set(s)
    for s in combinations(S, k)]

In [None]:
subsets({1, 2, 3}, 2)

[{1, 2}, {1, 3}, {2, 3}]

And sometimes I want to print each item of a list on its own line:

In [None]:
def print_all(iterable):
  for item in iterable:
    print(item)

In [None]:
print_all([1, 2, 3, 4, 5])

1
2
3
4
5


Now, let's talk about __association rule mining__.

In association rule mining, we try to discover rules based on itemsets.

Our input data is a list of _transactions_. Each transaction contains a set of items (called an _itemset_).
For example, each row of this table lists the items bought by a customer in a single transaction:

| TID | Items |
| --- | ----- |
| 1 | Bread, Milk |
| 2 | Bread, Diaper, Beer, Eggs |
| 3 | Milk, Diaper, Beer, Coke |
| 4 | Bread, Milk, Diaper, Beer |
| 5 | Bread, Milk, Diaper, Coke |

In [None]:
T = [
 {'Bread', 'Milk'},
 {'Beer', 'Bread', 'Diaper', 'Eggs'},
 {'Beer', 'Coke', 'Diaper', 'Milk'},
 {'Beer', 'Bread', 'Diaper', 'Milk'},
 {'Bread', 'Coke', 'Diaper', 'Milk'},
]
#Transactional Data is represented as a list of sets.

We use them to find association rules, such as:

$$ \{\text{Milk}, \text{Bread}\} \Rightarrow \{\text{Eggs}, \text{Coke}\} $$

This means that if a customer buys Milk and Bread, they are likely to buy Eggs and Coke as well.

Itemsets
========

An _itemset_ is simply a set of items, such as $\{ \text{Milk}, \text{Bread}, \text{Eggs} \}$.

The number of transactions that contain an _itemset_ is called its support count.

$$ \sigma(X) = \left|\{ x \in T \:|\: X \subseteq x \}\right| $$

In [None]:
def support_count(X, T):
  return (sum(1 for Y in T if X <= Y))
# X <= Y tests whether X is a subset of Y.
# These set operators are built into Python's set type; they do not come from sympy.
#
# s | t  new set with elements from both s and t
# s & t  new set with elements common to s and t
# s - t  new set with elements in s but not in t
# s ^ t  new set with elements in either s or t but not both

In [None]:
support_count({'Milk', 'Bread', 'Diaper'}, T)
print("\n")
support_count({'Coke', 'Milk'}, T)
print("\n")
support_count(set(), T)  # set() is the empty set; {} would create an empty dict instead

2





2





5

Support is the _proportion_ of transactions that contain an itemset.

$$ s(X) = \frac{\sigma(X)}{|T|} $$


In [None]:
def support(X, T):
  return float(support_count(X, T)) / len(T)

In [None]:
support({'Milk', 'Bread', 'Diaper'}, T)

0.4

If the _support_ of an itemset is at least a given minimum ratio, the itemset is called a _frequent itemset_.
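As a quick check of this definition, the sketch below tests two itemsets against an assumed threshold of 0.5. The names `baskets`, `min_support` and `is_frequent` are illustrative and not part of the notebook; the transactions are copied from the table above so the snippet runs on its own.

```python
# Market-basket transactions from the table above, repeated so this runs standalone.
baskets = [
    {'Bread', 'Milk'},
    {'Beer', 'Bread', 'Diaper', 'Eggs'},
    {'Beer', 'Coke', 'Diaper', 'Milk'},
    {'Beer', 'Bread', 'Diaper', 'Milk'},
    {'Bread', 'Coke', 'Diaper', 'Milk'},
]

min_support = 0.5  # assumed threshold, chosen only for illustration

def is_frequent(itemset, transactions, threshold):
    """Return True when the itemset's support reaches the threshold."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions) >= threshold

# {Milk, Diaper} appears in 3 of 5 transactions (support 0.6).
milk_diaper = is_frequent({'Milk', 'Diaper'}, baskets, min_support)
# {Milk, Bread, Diaper} appears in 2 of 5 transactions (support 0.4).
milk_bread_diaper = is_frequent({'Milk', 'Bread', 'Diaper'}, baskets, min_support)
print(milk_diaper, milk_bread_diaper)
```

With this threshold the pair is frequent and the triple is not; a lower threshold such as 0.4 would admit both.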

Rules
=====

A rule has the form $X \Rightarrow Y$, where $X$ and $Y$ are itemsets. For example:

$$ \{ \text{Milk}, \text{Diaper} \} \Rightarrow \{ \text{Beer} \} $$

In [None]:
rule = ({'Milk', 'Diaper'}, {'Beer'})

The _support_ of the rule is the fraction of transactions that contain both $X$ and $Y$.

$$ s(X \Rightarrow Y) = s(X \cup Y) $$

In [None]:
def rule_support(rule, T):
    (x, y)=rule
    return support(x | y, T)

In [None]:
rule_support(rule, T)

0.4

The _confidence_ of the rule is the proportion of transactions containing $X$ that also contain $Y$.

$$ c(X \Rightarrow Y) = \frac{\sigma(X \cup Y)}{\sigma(X)} \left(= \frac{s(X \Rightarrow Y)}{s(X)}\right) $$

In [None]:
def rule_confidence(rule, T):
    (x, y)=rule
    return (float(support_count(x | y, T))
        / support_count(x, T))

In [None]:
rule_confidence(rule, T)

0.6666666666666666

Association Rule Mining
=======================

In association rule mining, we want to find all rules that have enough _support_ and _confidence_. In other words, we want to find $\{ X \Rightarrow Y \:|\: s(X \Rightarrow Y) \geq s_{min}, c(X \Rightarrow Y) \geq c_{min} \}$.

We use a two-step approach:

- First, find frequent itemsets with enough support.
    - For example, $\{A, B, C\}$
- Then, generate rules from these itemsets.
    - We can generate rules by finding binary partitions of a given itemset.
    - For example, from $\{A, B, C\}$, we can generate 6 rules:
        - $\{A\} \Rightarrow \{C,B\}$
        - $\{C\} \Rightarrow \{A,B\}$
        - $\{B\} \Rightarrow \{A,C\}$
        - $\{A,C\} \Rightarrow \{B\}$
        - $\{A,B\} \Rightarrow \{C\}$
        - $\{C,B\} \Rightarrow \{A\}$
    - Note that these rules all have the same support.
    - We then select only the rules with enough confidence.
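The binary partitions mentioned in this list can be enumerated directly. This is a minimal sketch; `binary_partitions` and `candidate_rules` are names made up for the illustration.

```python
from itertools import combinations

def binary_partitions(itemset):
    """Yield every (antecedent, consequent) split with both sides non-empty."""
    items = sorted(itemset)
    for size in range(1, len(items)):
        for left in combinations(items, size):
            antecedent = set(left)
            yield antecedent, set(items) - antecedent

candidate_rules = list(binary_partitions({'A', 'B', 'C'}))
for antecedent, consequent in candidate_rules:
    print(sorted(antecedent), '=>', sorted(consequent))
print(len(candidate_rules))  # 2**3 - 2 = 6
```

An itemset with $n$ items has $2^n - 2$ such splits, because the two splits with an empty side are excluded.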

Let's forget about finding frequent itemsets for now; let's assume we were already given frequent itemsets.

In [None]:
T = [
 {'A', 'B', 'E'},
 {'B', 'D'},
 {'B', 'C'},
 {'A', 'B', 'D'},
 {'A', 'C'},
 {'B', 'C'},
 {'A', 'C'},
 {'A', 'B', 'C', 'E'},
 {'A', 'B', 'C'},
]

cmin = 0.5

Given a Frequent Itemset
------------------------

We can use a naive algorithm:

In [None]:
# frequent itemset `l`
def find_rules(l, T):
  rules = []
  for n in range(1, len(l)):
    for c in subsets(l, n):
      rule = (set(c), l - set(c))
      if rule_confidence(rule, T) >= cmin:
        rules.append(rule)
  return rules

It just tries all "binary partitions" of the frequent itemset `l`, and only emits rules with enough confidence.

In [None]:
[(rule, rule_confidence(rule, T))
  for rule in find_rules({'A', 'B', 'E'}, T)]

[(({'E'}, {'A', 'B'}), 1.0),
 (({'A', 'B'}, {'E'}), 0.5),
 (({'A', 'E'}, {'B'}), 1.0),
 (({'B', 'E'}, {'A'}), 1.0)]

Without a Frequent Itemset
--------------------------

Now, recall that we still have to find the frequent itemsets. What should we do?

The brute-force approach is to compute the support of every possible itemset, but for a large dataset this is impractical. To find all the candidates, we must try all subsets of all items!
If we have $d$ items, the number of subsets becomes $2^d$.
See how fast it grows!
What can we do to help?
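To put numbers on that growth, here is a small sketch. The function name `candidate_count` is illustrative; it counts the non-empty subsets, which is $2^d - 1$.

```python
def candidate_count(d):
    """Number of non-empty itemsets that can be formed from d distinct items."""
    return 2 ** d - 1

for d in (5, 10, 20, 40):
    print(d, candidate_count(d))
```

Already at 40 distinct items there are more than a trillion candidate itemsets, each of which would need a pass over the transactions to count its support.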

Apriori Principle
-----------------
The Apriori principle says:
_"If an itemset is frequent, then all of its subsets must also be frequent."_

From that, we know that if an itemset is _infrequent_, all its supersets are also infrequent.
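The principle holds because every transaction that contains an itemset also contains each of its subsets, so a subset can never have lower support. The sketch below checks this on a small made-up dataset; `sample`, `supp` and `big` are illustrative names.

```python
from itertools import combinations

# Small illustrative dataset (an assumption for this sketch).
sample = [
    {'A', 'C', 'D'},
    {'B', 'C', 'E'},
    {'A', 'B', 'C', 'E'},
    {'B', 'E'},
]

def supp(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

big = {'B', 'C', 'E'}

# Collect any proper subset whose support is lower than that of the full itemset.
violations = [
    set(sub)
    for k in range(1, len(big))
    for sub in combinations(sorted(big), k)
    if supp(set(sub), sample) < supp(big, sample)
]
print(supp(big, sample), violations)
```

The list of violations comes out empty, as the principle predicts.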

Again, let's use the example from the slides.

In [None]:
T = [
  {'A', 'C', 'D'},
  {'B', 'C', 'E'},
  {'A', 'B', 'C', 'E'},
  {'B', 'E'},
  {'A', 'B', 'C', 'E'}
]

smin = 0.4


### Starting Small

So here's our approach.

- First, we find all candidate itemsets of size 1 (_1-itemsets_).
- Next, we "prune" itemsets whose support is too low, which leaves the _frequent 1-itemsets_.
- Then, we generate candidate _2_-itemsets from the remaining 1-itemsets.
- Again, we "prune" itemsets whose support is too low.
- Increase the size and repeat.


### Generating Frequent 1-itemsets

First, let's generate frequent 1-itemsets.
Before that, I will create a function to union multiple sets.

In [None]:
def union_all(sets):
  """Finds the union of given sets."""
  result = set()
  print(result)  # debug output: shows the empty starting set on every call
  for c in sets:
    result = result | c
  return result

In [None]:
union_all([{1, 2}, {2, 3}])

set([])


{1, 2, 3}

In [None]:
union_all([])

set([])


set()

Now, some code to find frequent 1-itemsets:

In [None]:
def frequent_1(T):
  items = union_all(T)
  print("Items:",items)
  return [{item}
    for item in items
      if support({item}, T) >= smin]

In [None]:
L1 = frequent_1(T)
L1

set([])
('Items:', set(['A', 'C', 'B', 'E', 'D']))


[{'A'}, {'C'}, {'B'}, {'E'}]

As you see, "D" is eliminated from the candidates.
That means any itemset with "D" in it will not be frequent enough.

### Expanding It

Next, we generate candidate 2-itemsets from the frequent 1-itemsets.
The easiest way to do it is to put these items together and select 2 items. First, we put them together:

In [None]:
union_all(L1)

set([])


{'A', 'B', 'C', 'E'}

In [None]:
C2 = subsets(_, 2)  # In IPython, the underscore holds the result of the last evaluated cell.
print_all(C2)

set(['A', 'C'])
set(['A', 'B'])
set(['A', 'E'])
set(['C', 'B'])
set(['C', 'E'])
set(['B', 'E'])


These are the candidate itemsets. But maybe... not all of them are frequent enough.

Now, for each candidate $c$ in `C2`,
we must make sure that all of $c$'s 1-subsets are in `L1`.
Why? If one of $c$'s subsets (let's call it $s$) is not in `L1`,
it means that $s$ has already been pruned, because $s$ is not frequent enough.
Since $c$ is a superset of $s$, $c$ will also not be frequent enough.

In [None]:
def good_candidate(c, P):
  for s in subsets(c, len(c) - 1):
    if s not in P:
        return False
  return True

In [None]:
F2 = [c for c in C2 if good_candidate(c, L1)]
print_all(F2)

set(['A', 'C'])
set(['A', 'B'])
set(['A', 'E'])
set(['C', 'B'])
set(['C', 'E'])
set(['B', 'E'])


Well, it seems that every candidate is a good one. Anyway, now we have the finalists!
For the final round, you might have guessed it:
We simply check the support to see if each itemset is frequent enough!

In [None]:
L2 = [f for f in F2 if support(f, T) >= smin]
print(L2)

[set(['A', 'C']), set(['A', 'B']), set(['A', 'E']), set(['C', 'B']), set(['C', 'E']), set(['B', 'E'])]


Again, all of them are frequent enough! So, now we have the 2-itemsets.
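Because every candidate passed in this run, the subset check never rejected anything. Here is a minimal sketch of a case where it does, assuming a hypothetical previous level from which `{'A', 'E'}` was pruned. The names `all_subsets_frequent` and `level_2` are made up for the illustration.

```python
from itertools import combinations

def all_subsets_frequent(candidate, previous_level):
    """Check that every (k-1)-subset of a k-candidate survived the previous level."""
    k = len(candidate)
    return all(set(s) in previous_level for s in combinations(candidate, k - 1))

# Hypothetical previous level: {'A', 'E'} is assumed to have been pruned.
level_2 = [{'A', 'B'}, {'A', 'C'}, {'B', 'C'}, {'B', 'E'}, {'C', 'E'}]

keeps = all_subsets_frequent({'A', 'B', 'C'}, level_2)  # AB, AC and BC are all present
drops = all_subsets_frequent({'A', 'B', 'E'}, level_2)  # AE is missing
print(keeps, drops)  # True False
```

The rejected candidate is discarded without scanning the transactions, which is where the Apriori principle saves work.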

### Moving On

Now, let's generate `L3`.

In [None]:
def generate_candidates(P, k):
  return subsets(union_all(P), k)

In [None]:
C3 = generate_candidates(L2, 3)
print_all(C3)

set([])
set(['A', 'C', 'B'])
set(['A', 'C', 'E'])
set(['A', 'B', 'E'])
set(['C', 'B', 'E'])


In [None]:
F3 = [c for c in C3
         if good_candidate(c, L2)]
print_all(F3)

set(['A', 'C', 'B'])
set(['A', 'C', 'E'])
set(['A', 'B', 'E'])
set(['C', 'B', 'E'])


In [None]:
L3 = [f for f in F3
         if support(f, T) >= smin]
print_all(L3)

set(['A', 'C', 'B'])
set(['A', 'C', 'E'])
set(['A', 'B', 'E'])
set(['C', 'B', 'E'])


### Generalizing It

We can turn the above steps into this function:

In [None]:
def frequent_k(P, k, T):
  C = generate_candidates(P, k)
  F = [c for c in C if good_candidate(c, P)]
  return [f for f in F if support(f, T) >= smin]

We then use that function to generate `L4`.

In [None]:
L4 = frequent_k(L3, 4, T)
L4

set([])


[{'A', 'B', 'C', 'E'}]

Finally, generating `L5` will return no itemsets, which concludes the Apriori algorithm:

In [None]:
L5 = frequent_k(L4, 5, T)
L5

set([])


[]

Putting It Together
-------------------

We take all the previous answers to find the frequent itemsets!

In [None]:
L1 + L2 + L3 + L4 + L5  # + concatenates the lists; this is built-in Python list behaviour, not sympy.

[{'A'},
 {'C'},
 {'B'},
 {'E'},
 {'A', 'C'},
 {'A', 'B'},
 {'A', 'E'},
 {'B', 'C'},
 {'C', 'E'},
 {'B', 'E'},
 {'A', 'B', 'C'},
 {'A', 'C', 'E'},
 {'A', 'B', 'E'},
 {'B', 'C', 'E'},
 {'A', 'B', 'C', 'E'}]

Summing It Up
-------------

Finally, here's the apriori algorithm!

In [None]:
def apriori(T):
  result = []
  L = frequent_1(T)
  k = 1
  while len(L) > 0:
    result += L
    k += 1
    L = frequent_k(L, k, T)
  return result

In [None]:
L = apriori(T)
L

set([])
('Items:', set(['A', 'C', 'B', 'E', 'D']))
set([])
set([])
set([])
set([])


[{'A'},
 {'C'},
 {'B'},
 {'E'},
 {'A', 'C'},
 {'A', 'B'},
 {'A', 'E'},
 {'B', 'C'},
 {'C', 'E'},
 {'B', 'E'},
 {'A', 'B', 'C'},
 {'A', 'C', 'E'},
 {'A', 'B', 'E'},
 {'B', 'C', 'E'},
 {'A', 'B', 'C', 'E'}]

For each frequent itemset, we generate rules from it.

In [None]:
cmin = 0.75

[rule
  for itemset in L
    for rule in find_rules(itemset, T)]

[({'A'}, {'C'}),
 ({'C'}, {'A'}),
 ({'C'}, {'B'}),
 ({'B'}, {'C'}),
 ({'C'}, {'E'}),
 ({'E'}, {'C'}),
 ({'B'}, {'E'}),
 ({'E'}, {'B'}),
 ({'A', 'B'}, {'C'}),
 ({'A', 'E'}, {'C'}),
 ({'A', 'B'}, {'E'}),
 ({'A', 'E'}, {'B'}),
 ({'C'}, {'B', 'E'}),
 ({'B'}, {'C', 'E'}),
 ({'E'}, {'B', 'C'}),
 ({'B', 'C'}, {'E'}),
 ({'C', 'E'}, {'B'}),
 ({'B', 'E'}, {'C'}),
 ({'A', 'B'}, {'C', 'E'}),
 ({'A', 'E'}, {'B', 'C'}),
 ({'A', 'B', 'C'}, {'E'}),
 ({'A', 'C', 'E'}, {'B'}),
 ({'A', 'B', 'E'}, {'C'})]

The Lift
--------

How can you be sure that there really is a correlation between the itemsets $X$ and $Y$?

- 90% of customers buy coffee.
- 25% of customers buy tea.
- 20% of customers buy both.

After filling in a Venn diagram, here are our transactions:

In [None]:
T = (
  20 * [{'coffee', 'tea'}] +
  70 * [{'coffee'}] +
   5 * [{'tea'}] +
   5 * [set()]  # 5% of people buy neither coffee nor tea.
)

Given $\{\text{coffee}, \text{tea}\}$ is a frequent itemset,
let's mine some rules!

In [None]:
rules = find_rules({'coffee', 'tea'}, T)
rules

[({'tea'}, {'coffee'})]

Here, we mined the rule $\{\text{tea}\} \Rightarrow \{\text{coffee}\}$.
How confident are we?

In [None]:
rule = ({'tea'}, {'coffee'})
rule_confidence(rule, T)

0.8

We found that __80% of customers that buy tea also buy coffee__.
We're highly confident, at 80 percent!
But is 80% good?

Does it really mean that the customer buys coffee _because_ they buy tea?
To find out, let's remove the condition.
Let's see how many people buy coffee "no matter what:"

In [None]:
rule = (set(), {'coffee'})
rule_confidence(rule, T)

0.9

This is called the unconditional, _expected confidence_.
As you see, people who buy tea are actually less likely to buy coffee.
This is a _negative correlation_.

To quantify this correlation, we use a measure called "lift:"

$$ L(X \Rightarrow Y) = \frac{c(X \Rightarrow Y)}{c(\varnothing \Rightarrow Y)}$$

In [None]:
def rule_lift(rule, T):
  (x, y)=rule
  return (rule_confidence((x, y), T)
        / rule_confidence((set(), y), T))  # ratio of the rule's confidence to the baseline rate at which y is bought regardless of x

In [None]:
rule = ({'tea'}, {'coffee'})
rule_lift(rule, T)

0.888888888888889

The "lift" measure tells us the correlation of the rule.

- $L > 1 \Rightarrow$ positive correlation
- $L = 1 \Rightarrow$ independence
- $L < 1 \Rightarrow$ negative correlation
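Since $c(\varnothing \Rightarrow Y) = s(Y)$, the lift can also be written as $s(X \cup Y) / (s(X)\, s(Y))$, which makes it symmetric in $X$ and $Y$. The sketch below checks this with the coffee and tea figures; `lift_from_supports` is an illustrative helper, not a function from the notebook.

```python
def lift_from_supports(s_xy, s_x, s_y):
    """Lift written with supports: s(X u Y) / (s(X) * s(Y))."""
    return s_xy / (s_x * s_y)

# Figures from the coffee/tea example: 25% buy tea, 90% buy coffee, 20% buy both.
tea_to_coffee = lift_from_supports(0.20, 0.25, 0.90)
coffee_to_tea = lift_from_supports(0.20, 0.90, 0.25)
print(round(tea_to_coffee, 3), round(coffee_to_tea, 3))
```

Both directions give the same value, about 0.889, so lift measures association between the two itemsets rather than the direction of a rule.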

And this concludes this notebook on association rules and the Apriori algorithm.

# Problem on computing association rules with 100% confidence

Imagine there are 100 baskets, numbered 1,2,...,100, and 100 items, similarly numbered. Item i is in basket j if and only if i divides j evenly. For example, basket 24 is the set of items {1,2,3,4,6,8,12,24}.  Which of the following rules has 100% confidence?

{1,2}-> 4; {1}-> 2; {1,4,7}-> 14; {1,3,6}-> 12; {4,6}-> 12; {8,12}-> 96; {4,6}-> 24; {1,3,6}-> 12

In [None]:
baskets = range(1,101)  # Baskets are numbered 1 to 100; basket j holds the divisors of j.
items = range(1,101)    

# Create transactions
transactions = []

for i in baskets:
    basket = []
    for item in items:
        if i % item == 0:
            basket.append(item)
    transactions.append(basket)

In [None]:
transactions[47]  # index 47 is basket number 48; it holds all divisors of 48.

[1, 2, 3, 4, 6, 8, 12, 16, 24, 48]

In [None]:
#Computes the support count for the query set
def check(transactions,query):
    count=0
    for t in transactions:
        query_in = True
        for q in query:
            if q not in t:
                query_in = False
        if query_in:
            count+=1
    return count

In [None]:
check(transactions,{1,2})

50

In [None]:
def confidence(num,denom):  #num is numerator, denom is denominator.
    count_denom = check(transactions,denom)
    count_num = check(transactions,num)
    confidence = count_num /(1.0* count_denom ) * 100
    return confidence

In [None]:
print("{1,2}-> 4,Confidence",confidence([1,2,4],[1,2]))
print("{1}-> 2,Confidence",confidence([1,2],[1]))
print("{1,4,7}-> 14,Confidence",confidence([1,4,7,14],[1,4,7]))
print("{1,3,6}-> 12,Confidence",confidence([1,3,6,12],[1,3,6]))
print("{4,6}-> 12,Confidence",confidence([4,6,12],[4,6]))
print("{8,12}-> 96,Confidence",confidence([8,12,96],[8,12]))
print("{4,6}-> 24,Confidence",confidence([4,6,24],[4,6]))
print("{1,3,6}-> 12,Confidence",confidence([1,3,6,12],[1,3,6]))

('{1,2}-> 4,Confidence', 50.0)
('{1}-> 2,Confidence', 50.0)
('{1,4,7}-> 14,Confidence', 100.0)
('{1,3,6}-> 12,Confidence', 50.0)
('{4,6}-> 12,Confidence', 100.0)
('{8,12}-> 96,Confidence', 25.0)
('{4,6}-> 24,Confidence', 50.0)
('{1,3,6}-> 12,Confidence', 50.0)
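These results can be cross-checked without enumerating baskets. A basket contains every item of $X$ exactly when its number is a multiple of $\mathrm{lcm}(X)$, so a rule $X \Rightarrow y$ has 100% confidence when $y$ divides $\mathrm{lcm}(X)$. As long as $\mathrm{lcm}(X) \le 100$, which holds for all the rules listed, the converse is true as well. The helper names below are illustrative.

```python
from functools import reduce
from math import gcd

def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)

def always_holds(antecedent, consequent):
    """True when every basket containing the antecedent also contains the consequent."""
    smallest = reduce(lcm, antecedent)  # baskets containing X are the multiples of lcm(X)
    return smallest % consequent == 0

rule_checks = {
    '{1,2} -> 4': always_holds([1, 2], 4),
    '{1} -> 2': always_holds([1], 2),
    '{1,4,7} -> 14': always_holds([1, 4, 7], 14),
    '{1,3,6} -> 12': always_holds([1, 3, 6], 12),
    '{4,6} -> 12': always_holds([4, 6], 12),
    '{8,12} -> 96': always_holds([8, 12], 96),
    '{4,6} -> 24': always_holds([4, 6], 24),
}
for name, holds in rule_checks.items():
    print(name, holds)
```

Only `{1,4,7} -> 14` and `{4,6} -> 12` come out true, matching the confidences computed by brute force.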


# Orange way of computing association rules and frequent patterns
pip install orange3

pip install orange3-associate

In [None]:
T = [[1,    3, 4   ],
[   2, 3,    5],
[1, 2, 3,    5],
[   2,       5]]

#orange library has automated methods of finding association rules.

In [None]:
# We can enumerate all frequent itemsets that appear in at least two transactions:
from orangecontrib.associate.fpgrowth import *
itemsets = frequent_itemsets(T, 2)  # minimum support here is 2 out of 4 transactions, i.e. 50%

In [None]:
itemsets
list(itemsets)

[(frozenset({1}), 2),
 (frozenset({2}), 3),
 (frozenset({3}), 3),
 (frozenset({1, 3}), 2),
 (frozenset({2, 3}), 2),
 (frozenset({5}), 3),
 (frozenset({2, 5}), 3),
 (frozenset({3, 5}), 2),
 (frozenset({2, 3, 5}), 2)]

Note that the functions in this module produce generators. The result space can explode quite quickly and can easily be too large to fit in your RAM. By using generators, you can filter the results to your liking as you iterate over them.
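The benefit of generators can be illustrated without the library. In this sketch, `itemsets_lazily` is a hypothetical stand-in for a generator-producing function, written only to show filtering and early stopping; it is not part of Orange.

```python
from itertools import combinations, islice

def itemsets_lazily(items):
    """Hypothetical stand-in for a library generator: yields itemsets one at a time."""
    ordered = sorted(items)
    for size in range(1, len(ordered) + 1):
        for combo in combinations(ordered, size):
            yield frozenset(combo)

# Filter while iterating and stop early; only the itemsets we keep are stored.
stream = itemsets_lazily(range(1, 21))      # 2**20 - 1 itemsets if fully materialised
with_seven = (s for s in stream if 7 in s)  # lazy filter, nothing is computed yet
first_three = list(islice(with_seven, 3))
print(first_three)
```

Only as many itemsets as needed to find three matches are ever generated, whereas calling `list()` on the whole stream would build over a million frozensets.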

In [None]:
#We can try it with a larger and more real-world database of categorical values:
import Orange
data = Orange.data.Table('zoo')
data

# The zoo data has 101 rows (animals), each described by 16 attributes.

# Along with the 16 features, each animal has a class label, i.e. whether it is a mammal, a fish, etc.

In [None]:
data.n_rows
data.X.shape

(101, 16)

In [None]:
data.domain

[hair, feathers, eggs, milk, airborne, aquatic, predator, toothed, backbone, breathes, venomous, fins, legs, tail, domestic, catsize | type] {name}

In [None]:
for i in data.metas:
    print(i, sep=",",end="")
data.metas.shape

['aardvark']['antelope']['bass']['bear']['boar']['buffalo']['calf']['carp']['catfish']['cavy']['cheetah']['chicken']['chub']['clam']['crab']['crayfish']['crow']['deer']['dogfish']['dolphin']['dove']['duck']['elephant']['flamingo']['flea']['frog']['frog']['fruitbat']['giraffe']['girl']['gnat']['goat']['gorilla']['gull']['haddock']['hamster']['hare']['hawk']['herring']['honeybee']['housefly']['kiwi']['ladybird']['lark']['leopard']['lion']['lobster']['lynx']['mink']['mole']['mongoose']['moth']['newt']['octopus']['opossum']['oryx']['ostrich']['parakeet']['penguin']['pheasant']['pike']['piranha']['pitviper']['platypus']['polecat']['pony']['porpoise']['puma']['pussycat']['raccoon']['reindeer']['rhea']['scorpion']['seahorse']['seal']['sealion']['seasnake']['seawasp']['skimmer']['skua']['slowworm']['slug']['sole']['sparrow']['squirrel']['starfish']['stingray']['swan']['termite']['toad']['tortoise']['tuatara']['tuna']['vampire']['vole']['vulture']['wallaby']['wasp']['wolf']['worm']['wren']

(101, 1)

In [None]:
data.domain.class_var.values
data.Y

array([ 5.,  5.,  2.,  5.,  5.,  5.,  5.,  2.,  2.,  5.,  5.,  1.,  2.,
        4.,  4.,  4.,  1.,  5.,  2.,  5.,  1.,  1.,  5.,  1.,  3.,  0.,
        0.,  5.,  5.,  5.,  3.,  5.,  5.,  1.,  2.,  5.,  5.,  1.,  2.,
        3.,  3.,  1.,  3.,  1.,  5.,  5.,  4.,  5.,  5.,  5.,  5.,  3.,
        0.,  4.,  5.,  5.,  1.,  1.,  1.,  1.,  2.,  2.,  6.,  5.,  5.,
        5.,  5.,  5.,  5.,  5.,  5.,  1.,  4.,  2.,  5.,  5.,  6.,  4.,
        1.,  1.,  6.,  4.,  2.,  1.,  5.,  4.,  2.,  1.,  3.,  0.,  6.,
        6.,  2.,  5.,  5.,  1.,  5.,  3.,  5.,  4.,  1.])

In [None]:
#We can’t use table data directly; we first have to one-hot transform it:
X, mapping = OneHot.encode(data, include_class=True)

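# One-hot encoding turns the 16 features plus the class variable into 43 binary columns:
# 15 two-valued features (30 columns), legs with 6 values, and the 7 animal types.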

In [None]:
X.shape

(101, 43)

In [None]:
type(X)

numpy.ndarray

In [None]:
X

array([[False,  True,  True, ..., False,  True, False],
       [False,  True,  True, ..., False,  True, False],
       [ True, False,  True, ..., False, False, False],
       ..., 
       [False,  True,  True, ..., False,  True, False],
       [ True, False,  True, ...,  True, False, False],
       [ True, False, False, ..., False, False, False]], dtype=bool)

In [None]:
sorted(mapping.items())

[(0, (0, 0)),
 (1, (0, 1)),
 (2, (1, 0)),
 (3, (1, 1)),
 (4, (2, 0)),
 (5, (2, 1)),
 (6, (3, 0)),
 (7, (3, 1)),
 (8, (4, 0)),
 (9, (4, 1)),
 (10, (5, 0)),
 (11, (5, 1)),
 (12, (6, 0)),
 (13, (6, 1)),
 (14, (7, 0)),
 (15, (7, 1)),
 (16, (8, 0)),
 (17, (8, 1)),
 (18, (9, 0)),
 (19, (9, 1)),
 (20, (10, 0)),
 (21, (10, 1)),
 (22, (11, 0)),
 (23, (11, 1)),
 (24, (12, 0)),
 (25, (12, 1)),
 (26, (12, 2)),
 (27, (12, 3)),
 (28, (12, 4)),
 (29, (12, 5)),
 (30, (13, 0)),
 (31, (13, 1)),
 (32, (14, 0)),
 (33, (14, 1)),
 (34, (15, 0)),
 (35, (15, 1)),
 (36, (16, 0)),
 (37, (16, 1)),
 (38, (16, 2)),
 (39, (16, 3)),
 (40, (16, 4)),
 (41, (16, 5)),
 (42, (16, 6))]
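The mapping printed above pairs each column index with a `(variable index, value index)` tuple. The sketch below builds the same kind of mapping for a toy table, to make that layout concrete. It is an illustration under assumed inputs, not the library's implementation; `one_hot`, `n_values` and the toy rows are made up.

```python
def one_hot(rows, n_values):
    """Encode rows of category indices as boolean rows plus a column mapping.

    rows      -- list of rows, each a list of value indices (one per variable)
    n_values  -- number of distinct values for each variable
    """
    mapping = {}
    for var, count in enumerate(n_values):
        for val in range(count):
            mapping[len(mapping)] = (var, val)  # next free column -> (variable, value)
    encoded = [
        [row[var] == val for var, val in mapping.values()]
        for row in rows
    ]
    return encoded, mapping

# Toy table: two binary variables and one three-valued variable (made-up data).
encoded, toy_mapping = one_hot([[0, 1, 2], [1, 1, 0]], n_values=[2, 2, 3])
print(sorted(toy_mapping.items()))
print(encoded[0])
```

Each row has exactly one `True` per original variable, which is why 16 features plus a class variable expand to 43 columns in the zoo data.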

In [None]:
#We want itemsets with at least 40% support
itemsets = dict(frequent_itemsets(X, .4))
len(itemsets)

520

In [None]:
#The transaction-coded items corresponding to class values are:
class_items = {item for item, var, _ in OneHot.decode(mapping, data, mapping) if var is data.domain.class_var}
sorted(class_items)

[36, 37, 38, 39, 40, 41, 42]

In [None]:
#Now we can generate all association rules that have consequent 
#equal to one of the class values and >80% confidence (i.e. classification rules):
rules = [(P, Q, supp, conf) for P, Q, supp, conf in association_rules(itemsets, .8) if len(Q) == 1 and Q & class_items]
len(rules)
rules

[(frozenset({2, 7, 17, 19, 20}), frozenset({41}), 41, 1.0),
 (frozenset({2, 7, 17, 19}), frozenset({41}), 41, 1.0),
 (frozenset({2, 7, 17, 20}), frozenset({41}), 41, 1.0),
 (frozenset({2, 7, 19, 20}), frozenset({41}), 41, 1.0),
 (frozenset({2, 17, 19, 20}), frozenset({41}), 41, 0.8723404255319149),
 (frozenset({7, 17, 19, 20}), frozenset({41}), 41, 1.0),
 (frozenset({2, 7, 17}), frozenset({41}), 41, 1.0),
 (frozenset({2, 7, 19}), frozenset({41}), 41, 1.0),
 (frozenset({2, 17, 19}), frozenset({41}), 41, 0.8367346938775511),
 (frozenset({7, 17, 19}), frozenset({41}), 41, 1.0),
 (frozenset({2, 7, 20}), frozenset({41}), 41, 1.0),
 (frozenset({7, 17, 20}), frozenset({41}), 41, 1.0),
 (frozenset({7, 19, 20}), frozenset({41}), 41, 1.0),
 (frozenset({2, 7}), frozenset({41}), 41, 1.0),
 (frozenset({7, 17}), frozenset({41}), 41, 1.0),
 (frozenset({7, 19}), frozenset({41}), 41, 1.0),
 (frozenset({7, 20}), frozenset({41}), 41, 1.0),
 (frozenset({7}), frozenset({41}), 41, 1.0)]

In [None]:
#To make them more helpful, we can use mapping to transform the rules’
#items back into table domain values:
names = {item: '{}={}'.format(var.name, val) for item, var, val in OneHot.decode(mapping, data, mapping)}
for ante, cons, supp, conf in rules:
     print(', '.join(names[i] for i in ante), '-->', names[next(iter(cons))], '(supp: {}, conf: {})'.format(supp, conf))

feathers=0, milk=1, backbone=1, breathes=1, venomous=0 --> type=mammal (supp: 41, conf: 1.0)
backbone=1, feathers=0, breathes=1, milk=1 --> type=mammal (supp: 41, conf: 1.0)
backbone=1, feathers=0, venomous=0, milk=1 --> type=mammal (supp: 41, conf: 1.0)
feathers=0, breathes=1, venomous=0, milk=1 --> type=mammal (supp: 41, conf: 1.0)
backbone=1, feathers=0, breathes=1, venomous=0 --> type=mammal (supp: 41, conf: 0.8723404255319149)
backbone=1, breathes=1, venomous=0, milk=1 --> type=mammal (supp: 41, conf: 1.0)
backbone=1, feathers=0, milk=1 --> type=mammal (supp: 41, conf: 1.0)
feathers=0, breathes=1, milk=1 --> type=mammal (supp: 41, conf: 1.0)
backbone=1, feathers=0, breathes=1 --> type=mammal (supp: 41, conf: 0.8367346938775511)
backbone=1, breathes=1, milk=1 --> type=mammal (supp: 41, conf: 1.0)
feathers=0, venomous=0, milk=1 --> type=mammal (supp: 41, conf: 1.0)
backbone=1, venomous=0, milk=1 --> type=mammal (supp: 41, conf: 1.0)
breathes=1, venomous=0, milk=1 --> type=mammal (su

More examples here: https://orange3-associate.readthedocs.io/en/latest/scripting.html#fpgrowth.frequent_itemsets