# Functional Dependencies

Functional dependencies (FDs) play a crucial role in database design and data analysis. They help identify relationships between attributes, enabling:

- **Database Normalization** - Eliminating redundancy and update anomalies by decomposing tables into well-structured relations.
- **Schema Optimization** - Improving query performance through efficient table design.
- **Data Quality Analysis** - Detecting inconsistencies and constraints in datasets.

Let $r$ be a relation, and let $X$ and $Y$ be arbitrary subsets of the attribute set of $r$.

We say that $Y$ is functionally dependent on $X$, denoted as $X \rightarrow Y$, if and only if every value of $X$ in $r$ is associated with exactly one value of $Y$ in $r$.

$$X \rightarrow Y \iff \forall t_1, t_2 \in r: \; t_1[X] = t_2[X] \Rightarrow t_1[Y] = t_2[Y]$$

- $t_1, t_2$: tuples in relation $r$;
- $t[X]$: the projection of tuple $t$ on attribute set $X$.
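
To make the definition concrete, here is a minimal sketch that checks a dependency by comparing every pair of tuples, exactly as the formula reads. The helper `holds` and the toy relation with attributes `A`, `B`, `C` are illustrative assumptions and are not part of Desbordante; the pairwise scan is quadratic and only meant for small examples.

```python
from itertools import combinations


def holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in rows (hypothetical helper).

    Two tuples that agree on every attribute in lhs must also
    agree on every attribute in rhs.
    """
    for t1, t2 in combinations(rows, 2):
        same_lhs = all(t1[a] == t2[a] for a in lhs)
        same_rhs = all(t1[a] == t2[a] for a in rhs)
        if same_lhs and not same_rhs:
            return False
    return True


# Toy relation made up for illustration.
rows = [
    {"A": 1, "B": "x", "C": 10},
    {"A": 1, "B": "x", "C": 20},
    {"A": 2, "B": "y", "C": 10},
]

a_to_b = holds(rows, ["A"], ["B"])  # the two tuples with A=1 share B="x"
a_to_c = holds(rows, ["A"], ["C"])  # the two tuples with A=1 differ on C
print(a_to_b, a_to_c)
```

Here `A -> B` holds, while `A -> C` is violated by the first two tuples.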

If you wish to know more, there are a few articles we would recommend:
- [TANE: Efficient discovery of approximate dependencies.](https://dl.acm.org/doi/10.14778/3192965.3192968)
- [Functional dependency discovery: an experimental evaluation of seven algorithms.](https://dl.acm.org/doi/abs/10.14778/2794367.2794377?download=true)

Let us now show how Desbordante helps you discover functional dependencies in a dataset.

# Install Python libraries

In [10]:
!pip install desbordante==2.3.2
!pip install pandas



# Import desbordante and pandas

In [11]:
import desbordante as db
import pandas as pd

# Get sample datasets

In [None]:
!wget -q https://raw.githubusercontent.com/Desbordante/desbordante-core/main/examples/datasets/duplicates_short.csv
!wget -q https://raw.githubusercontent.com/Desbordante/desbordante-core/main/examples/datasets/university_fd.csv

# Load the data

In [13]:
pd.read_csv("university_fd.csv")

Unnamed: 0,Course,Classroom,Professor,Semester
0,Math,512,Dr. Smith,Fall
1,Physics,406,Dr. Green,Fall
2,English,208,Prof. Turner,Fall
3,History,209,Prof. Davis,Fall
4,Math,512,Dr. Smith,Spring
5,Physics,503,Dr. Gray,Spring
6,English,116,Prof. Turner,Spring
7,Biology,209,Prof. Light,Spring


# Discover functional dependencies

With Desbordante, discovering all functional dependencies in a dataset takes only a few lines.

In [14]:
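# Use Desbordante's default FD discovery algorithm.
algo = db.fd.algorithms.Default()
# The table tuple is (path, column separator, whether the file has a header row).
algo.load_data(table=("university_fd.csv", ',', True))
algo.execute()
print('FDs:')
for fd in algo.get_fds():
    print(fd)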

FDs:
[Course Classroom] -> Professor
[Classroom Semester] -> Professor
[Classroom Semester] -> Course
[Professor] -> Course
[Professor Semester] -> Classroom
[Course Semester] -> Classroom
[Course Semester] -> Professor
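
As a sanity check, the reported dependencies can be verified independently with pandas. The sketch below rebuilds the table from the rows shown earlier (without the index column) and uses a hypothetical helper, `fd_holds`, which is not part of Desbordante. It groups by the left-hand side and checks that each group contains a single right-hand-side value.

```python
import pandas as pd

# Rows copied from the university_fd.csv preview (index column omitted).
df = pd.DataFrame(
    [
        ("Math", 512, "Dr. Smith", "Fall"),
        ("Physics", 406, "Dr. Green", "Fall"),
        ("English", 208, "Prof. Turner", "Fall"),
        ("History", 209, "Prof. Davis", "Fall"),
        ("Math", 512, "Dr. Smith", "Spring"),
        ("Physics", 503, "Dr. Gray", "Spring"),
        ("English", 116, "Prof. Turner", "Spring"),
        ("Biology", 209, "Prof. Light", "Spring"),
    ],
    columns=["Course", "Classroom", "Professor", "Semester"],
)


def fd_holds(frame, lhs, rhs):
    """Return True if each distinct lhs value maps to one rhs value (hypothetical helper)."""
    distinct_rhs_per_group = frame.groupby(lhs)[rhs].nunique()
    return bool((distinct_rhs_per_group <= 1).all())


professor_to_course = fd_holds(df, ["Professor"], "Course")
course_to_professor = fd_holds(df, ["Course"], "Professor")
course_classroom_to_professor = fd_holds(df, ["Course", "Classroom"], "Professor")

print("Professor -> Course:", professor_to_course)
print("Course -> Professor:", course_to_professor)
print("[Course Classroom] -> Professor:", course_classroom_to_professor)
```

`Professor -> Course` holds because every professor teaches a single course, whereas `Course -> Professor` fails because Physics is taught by both Dr. Green and Dr. Gray. This is why only the first of the two appears in the output.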
