Activity 5-2
------------

Recall that, given a set of attributes $\{A_1, \dots, A_n\}$ and a set of FDs $\Gamma$, the closure, denoted $\{A_1, \dots, A_n\}^+$, is defined to be the largest set of attributes $B$ such that $$A_1,\dots,A_n \rightarrow B \text{ using } \Gamma.$$

### Exercise 0

First, we show how easy it is to compute closures. We start by defining some utility functions:

In [1]:
def to_set(x):
    # Coerce a set, a list, or a single attribute (str or int) into a set.
    if isinstance(x, set):
        return x
    elif isinstance(x, list):
        return set(x)
    elif isinstance(x, (str, int)):
        return set([x])
    else:
        raise Exception("Unrecognized type.")
def fd_to_str(fd):
    lhs, rhs = fd  # unpack in the body: tuple parameters are a syntax error in Python 3
    return ",".join(to_set(lhs)) + " -> " + ",".join(to_set(rhs))
def fds_to_str(fds): return "\n\t".join(map(fd_to_str, fds))
def set_to_str(x): return "{" + ",".join(x) + "}"

Next, we define an extremely simple but important sub-function, which returns true if a given FD applies to a given set of attributes $x$, i.e. if the FD's left-hand side is a subset of $x$:

In [2]:
def fd_applies_to(fd, x): 
    lhs, rhs = map(to_set, fd)
    return lhs.issubset(x)
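
As a quick illustration of the check above, here is a minimal stand-alone sketch. The name `lhs_is_contained` is hypothetical and not part of the activity's code; it simply restates the subset test on plain Python sets.

```python
# Stand-alone restatement of the subset check (hypothetical name, for illustration only).
def lhs_is_contained(lhs, attrs):
    """Return True if every attribute in lhs also appears in attrs."""
    return set(lhs).issubset(attrs)

attrs = {"A", "B", "F"}
applies_ab = lhs_is_contained({"A", "B"}, attrs)  # both A and B are in attrs
applies_ac = lhs_is_contained({"A", "C"}, attrs)  # C is not in attrs
print(applies_ab, applies_ac)
```

An FD whose left-hand side is not fully contained in the current set cannot be used yet, although it may become usable once other FDs have added attributes.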

Finally, the algorithm to compute the closure of a set of attributes $x$ given a list of FDs `fds`:

In [3]:
def compute_closure(x, fds, verbose=False):
    bChanged = True        # We will repeat until there are no changes.
    x_ret    = x.copy()    # Make a copy of the input to hold x^{+}
    while bChanged:
        bChanged = False   # Reset; set back to True only if some FD adds new attributes
        for fd in fds:     # loop through all the FDs.
            (lhs, rhs) = map(to_set, fd) # recall: lhs -> rhs
            if fd_applies_to(fd, x_ret) and not rhs.issubset(x_ret):
                x_ret = x_ret.union(rhs)
                if verbose:
                    print("Using FD " + fd_to_str(fd))
                    print("\t Updated x to " + set_to_str(x_ret))
                bChanged = True
    return x_ret
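
To see why the outer `while` loop is needed, the following sketch re-implements the same fixed-point idea in a self-contained form and counts passes over the FD list. The function name `closure_with_pass_count` and the tiny FD list are assumptions made for illustration, not part of the activity.

```python
def closure_with_pass_count(attrs, fds):
    """Fixed-point closure; also returns how many passes over fds were made."""
    result = set(attrs)
    passes = 0
    changed = True
    while changed:
        changed = False
        passes += 1
        for lhs, rhs in fds:
            lhs, rhs = set(lhs), set(rhs)
            # Use the FD only if its LHS is available and it adds something new.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result, passes

# B -> C is listed before A -> B, so it cannot fire during the first pass.
fds = [({"B"}, {"C"}), ({"A"}, {"B"})]
closure, passes = closure_with_pass_count({"A"}, fds)
print(sorted(closure), passes)
```

With this ordering, the first pass only adds `B`, the second adds `C`, and a third pass is needed to confirm that nothing changes any more.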

**Note** that our algorithm accepts FDs of the form $\{a_1,...,a_n\}\rightarrow a'$ as well as of the form $\{a_1,...,a_n\}\rightarrow \{a'_1,...,a'_m\}$. Why can we get away with only considering FDs with a single attribute on the RHS?  _(Hint: see lecture!)_
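
One way to convince yourself empirically (this is a check on one example, not a proof) is to split every multi-attribute RHS into single-attribute FDs and compare closures. The helpers `closure` and `split_rhs` below are hypothetical, self-contained sketches rather than the functions defined above.

```python
def closure(attrs, fds):
    """Minimal fixed-point closure over FDs given as (lhs_set, rhs_set) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def split_rhs(fds):
    """Rewrite each FD lhs -> {b1, ..., bm} as m FDs lhs -> {bi}."""
    return [(set(lhs), {b}) for lhs, rhs in fds for b in sorted(rhs)]

multi = [({"A"}, {"B", "C"}), ({"B", "C"}, {"D"})]
single = split_rhs(multi)  # A -> B, A -> C, {B, C} -> D

same = closure({"A"}, multi) == closure({"A"}, single)
print(same)
```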

### Exercise 1

Consider a schema with attributes $X=\{A,B,C,D,E,F,G,H\}$.

In this exercise, you are given a set of attributes $A\subset X$ and a set of FDs $F$.  Find **one FD** to add to $F$ so that the closure $A^+=X$.

(As we'll find out immediately after this activity, this is equivalent to saying: _Find one FD to add such that $A$ becomes a superkey for $X$_)

In [4]:
A = set(['A', 'B','F'])
F = [(set(['A', 'C']), 'D'),
     (set(['D','H', 'G']), 'E'),
     (set(['A', 'B']), 'G'),
     (set(['F', 'B', 'G']), 'C')]

In [5]:
compute_closure(A, F, verbose=True)

Using FD A,B -> G
	 Updated x to {A,B,G,F}
Using FD B,G,F -> C
	 Updated x to {A,C,B,G,F}
Using FD A,C -> D
	 Updated x to {A,C,B,D,G,F}


{'A', 'B', 'C', 'D', 'F', 'G'}
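
To test a candidate FD, it can help to list which attributes of $X$ are still missing from the closure once the candidate is added. The sketch below is self-contained and uses hypothetical helper names (`closure`, `missing_after_adding`); it restates the exercise data with set-valued right-hand sides, and the candidate shown is deliberately one that does not solve the exercise.

```python
def closure(attrs, fds):
    """Minimal fixed-point closure over FDs given as (lhs_set, rhs_set) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def missing_after_adding(attrs, fds, candidate, schema):
    """Attributes of schema still outside attrs+ once candidate is added to fds."""
    return set(schema) - closure(attrs, list(fds) + [candidate])

X = set("ABCDEFGH")
A = {"A", "B", "F"}
F = [({"A", "C"}, {"D"}),
     ({"D", "H", "G"}, {"E"}),
     ({"A", "B"}, {"G"}),
     ({"F", "B", "G"}, {"C"})]

candidate = ({"A", "B"}, {"D"})  # already implied by F, so it cannot help
still_missing = missing_after_adding(A, F, candidate, X)
print(sorted(still_missing))
```

A candidate solves the exercise exactly when `missing_after_adding` returns the empty set.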