# Exercise Sheet 8

## Task 1

**To show:** $\sum_{k=0}^N \binom{N}{k} = 2^N$

**Proof:**
By the binomial theorem, $(a + b)^n = \sum_{k=0}^n \binom{n}{k} a^{n-k} \cdot b^k$.
Setting $a = 1$, $b = 1$, $n = N$ gives: $2^N = (1 + 1)^N = \sum_{k=0}^N \binom{N}{k} 1^{N-k} \cdot 1^k = \sum_{k=0}^N \binom{N}{k}$.
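
The identity can also be checked numerically for small $N$. The following is a minimal sketch using Python's `math.comb`; the helper name `binomial_row_sum` is chosen here purely for illustration, and the check is a sanity test, not a replacement for the proof.

```python
from math import comb


def binomial_row_sum(n: int) -> int:
    """Sum of the binomial coefficients C(n, k) for k = 0..n."""
    return sum(comb(n, k) for k in range(n + 1))


# Compare the sum against 2^N for a few small values of N.
for n in range(11):
    assert binomial_row_sum(n) == 2 ** n
```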


## Task 2

### a)
A single-item set $\{c\}$ with $c \in \{1, \dots, 100\}$ is frequent if $c$ appears in at least five baskets. Since $c$ appears exactly in the baskets whose number is a multiple of $c$, this means that $c$ needs at least five multiples that are $\leq 100$, i.e. $\lfloor 100/c \rfloor \geq 5$. This holds exactly for $c \leq 20$. For example, $c = 21$ has only four multiples within the range (21, 42, 63, 84) and thus does not reach the threshold of 5. So the frequent single-item sets are exactly the sets $\{c\}$ with $c \in \{1, \dots, 20\}$.
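
The counting argument can be reproduced by brute force. The sketch below assumes, as the argument above does, that basket $b$ (for $b = 1, \dots, 100$) contains exactly the items that divide $b$, and that the support threshold is 5. The names `baskets`, `item_counts` and `frequent_items` are illustrative.

```python
from collections import Counter

N_BASKETS = 100
THRESHOLD = 5

# Assumption: basket b contains exactly the items (1..100) that divide b.
baskets = {
    b: [i for i in range(1, 101) if b % i == 0]
    for b in range(1, N_BASKETS + 1)
}

# Count in how many baskets each item occurs.
item_counts = Counter(item for items in baskets.values() for item in items)

# Keep the items that reach the support threshold.
frequent_items = sorted(i for i, n in item_counts.items() if n >= THRESHOLD)
print(frequent_items)
```

Under these assumptions the printed list should contain the numbers 1 to 20, matching the argument above.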

### b)
A pair of items $\{a, b\}$ with $a, b \in \{1, \dots, 100\}$ is frequent if both items appear together in at least five baskets. That is the case exactly if the least common multiple (lcm) of $a$ and $b$ is at most 20. Both $a$ and $b$ divide the lcm, so they also divide every multiple of the lcm and therefore appear together in the corresponding basket. They do not appear together in any other basket, because every common multiple of $a$ and $b$ is a multiple of the lcm. The pair therefore appears in $\lfloor 100/\text{lcm}(a, b) \rfloor$ baskets. So if the lcm is above 20, $a$ and $b$ appear together in fewer than five baskets, which means the pair is not frequent.
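
The lcm criterion can be cross-checked against a direct count. This is a sketch under the same assumption as before (item $i$ is in basket $b$ exactly when $i$ divides $b$, threshold 5); `pair_support` and `pair_lcm` are hypothetical helper names.

```python
from itertools import combinations
from math import gcd

N_BASKETS = 100
THRESHOLD = 5


def pair_support(a: int, b: int) -> int:
    """Number of baskets (1..100) whose number is divisible by both a and b."""
    return sum(
        1 for basket in range(1, N_BASKETS + 1)
        if basket % a == 0 and basket % b == 0
    )


def pair_lcm(a: int, b: int) -> int:
    """Least common multiple of a and b."""
    return a * b // gcd(a, b)


# Collect every pair where the direct count and the lcm criterion disagree.
mismatches = [
    (a, b)
    for a, b in combinations(range(1, 101), 2)
    if (pair_support(a, b) >= THRESHOLD) != (pair_lcm(a, b) <= 20)
]
print(mismatches)
```

An empty list means the two criteria agree for all pairs.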

### c)

Item 1 is included in 100 baskets, item 2 in $100/2 = 50$ baskets, item 3 in $\lfloor 100/3 \rfloor = 33$ baskets, and so on.
Hence, the total sum can be expressed as:
$$\text{sumBasketSizes} = \sum_{k=1}^{100} \left\lfloor \frac{100}{k} \right\rfloor = 482$$
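
The sum is easy to evaluate directly. The sketch below computes it once via the formula and once by counting the divisors of every basket number, again assuming that basket $b$ contains exactly the divisors of $b$; the variable names are illustrative.

```python
# Item k is contained in floor(100 / k) baskets.
sum_basket_sizes = sum(100 // k for k in range(1, 101))

# Cross-check: add up the number of divisors of every basket number.
brute_force = sum(
    1 for b in range(1, 101) for i in range(1, b + 1) if b % i == 0
)

print(sum_basket_sizes, brute_force)
```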

### d)

The confidence of a rule $I \to J$ is defined as follows:
$$\text{confidence}(I \to J) = \frac{\text{support}(I \cup J)}{\text{support}(I)}$$

For $R_1$: $\text{support}(I) = 2$ and $\text{support}(I \cup J) = 1$, hence the confidence is $1/2 = 0.5$.

For $R_2$: $\text{support}(I) = 8$ and $\text{support}(I \cup J) = 1$, hence the confidence is $1/8 = 0.125$.
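
The definition translates directly into code. The following sketch assumes the basket model used throughout (item $i$ is in basket $b$ exactly when $i$ divides $b$), so the support of an itemset is the number of multiples of its lcm up to 100. The helper names `lcm_of`, `support` and `confidence` are hypothetical, and the rule in the example is only an illustration, not $R_1$ or $R_2$.

```python
from functools import reduce
from math import gcd

N_BASKETS = 100


def lcm_of(items):
    """Least common multiple of all numbers in items."""
    return reduce(lambda x, y: x * y // gcd(x, y), items, 1)


def support(items):
    """Number of baskets (1..100) containing all items.

    Assumption: item i is in basket b iff i divides b.
    """
    return N_BASKETS // lcm_of(items)


def confidence(antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(both) / support(antecedent)


# Illustrative rule only (not R_1 or R_2): {2} -> {3}
print(support({2}), support({2, 3}), confidence({2}, {3}))
```

For the illustrative rule, item 2 is in 50 baskets and the pair $\{2, 3\}$ in 16 baskets, so the confidence is $16/50 = 0.32$.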

## Task 3

After 1st pass:
$$C_1=\{\{1\},\{2\},\{3\},\{4\},\{5\},\{6\},\{7\},\{8\},\{9\},\{10\},\{11\},\{12\},\{13\},\{14\},\{15\},\{16\},\{17\},\{18\},\{19\},\{20\}\}$$
The items 21 to 100 are not frequent.


```python
# Compute the frequent itemsets based on the observation from Task 2 b):
# only pairs / triples whose least common multiple is at most 20
# reach the support threshold of 5.
from itertools import combinations
from math import gcd


def lcm(nums):
    """Return the least common multiple of all numbers in nums."""
    lcm_temp = 1
    for i in nums:
        lcm_temp = lcm_temp * i // gcd(lcm_temp, i)
    return lcm_temp


def holds_threshold(nums):
    """Return True if the itemset appears in at least 5 baskets (lcm <= 20)."""
    return lcm(nums) <= 20


# Frequent single items found in the 1st pass
numbers = list(range(1, 21))

print('After 2nd pass:')
lcm_pairs = filter(holds_threshold, combinations(numbers, 2))
print(list(lcm_pairs))

print('After 3rd pass:')
lcm_triples = filter(holds_threshold, combinations(numbers, 3))
print(list(lcm_triples))
```



After 2nd pass:
[(1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (1, 7), (1, 8), (1, 9), (1, 10), (1, 11), (1, 12), (1, 13), (1, 14), (1, 15), (1, 16), (1, 17), (1, 18), (1, 19), (1, 20), (2, 3), (2, 4), (2, 5), (2, 6), (2, 7), (2, 8), (2, 9), (2, 10), (2, 12), (2, 14), (2, 16), (2, 18), (2, 20), (3, 4), (3, 5), (3, 6), (3, 9), (3, 12), (3, 15), (3, 18), (4, 5), (4, 6), (4, 8), (4, 10), (4, 12), (4, 16), (4, 20), (5, 10), (5, 15), (5, 20), (6, 9), (6, 12), (6, 18), (7, 14), (8, 16), (9, 18), (10, 20)]
After 3rd pass:
[(1, 2, 3), (1, 2, 4), (1, 2, 5), (1, 2, 6), (1, 2, 7), (1, 2, 8), (1, 2, 9), (1, 2, 10), (1, 2, 12), (1, 2, 14), (1, 2, 16), (1, 2, 18), (1, 2, 20), (1, 3, 4), (1, 3, 5), (1, 3, 6), (1, 3, 9), (1, 3, 12), (1, 3, 15), (1, 3, 18), (1, 4, 5), (1, 4, 6), (1, 4, 8), (1, 4, 10), (1, 4, 12), (1, 4, 16), (1, 4, 20), (1, 5, 10), (1, 5, 15), (1, 5, 20), (1, 6, 9), (1, 6, 12), (1, 6, 18), (1, 7, 14), (1, 8, 16), (1, 9, 18), (1, 10, 20), (2, 3, 4), (2, 3, 6), (2, 3, 9), (2, 3, 12), (2, 3, 18