# Functional Dependencies

Functional dependencies (FDs) play a crucial role in database design and data analysis. They help identify relationships between attributes, enabling:

- **Database Normalization** - Eliminating redundancy and update anomalies by decomposing tables into well-structured relations.
- **Schema Optimization** - Improving query performance through efficient table design.
- **Data Quality Analysis** - Detecting inconsistencies and constraints in datasets.

Let $r$ be a relation, and let $X$ and $Y$ be arbitrary subsets of the attribute set of $r$.

We say that $Y$ is functionally dependent on $X$, denoted as $X \rightarrow Y$, if and only if every value of $X$ in $r$ is associated with exactly one value of $Y$ in $r$.

$$X \rightarrow Y \iff \forall t_1, t_2 \in r, \; t_1[X] = t_2[X] \implies t_1[Y] = t_2[Y]$$

- $t_1, t_2$: tuples in relation $r$;
- $t[X]$: the projection of tuple $t$ onto attribute set $X$.
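The definition can be checked directly on a small in-memory relation. The following is a minimal sketch in plain Python and is not part of Desbordante: the helper name `holds` and the four sample rows are illustrative assumptions.

```python
def holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs holds in rows (a list of dicts)."""
    seen = {}  # maps each X-projection to the Y-projection first seen with it
    for t in rows:
        x = tuple(t[a] for a in lhs)
        y = tuple(t[a] for a in rhs)
        # setdefault stores y the first time x appears and returns the stored value
        if seen.setdefault(x, y) != y:
            return False  # two tuples agree on X but differ on Y
    return True


# Hypothetical sample relation
rows = [
    {"Course": "Math", "Professor": "Dr. Smith", "Semester": "Fall"},
    {"Course": "Physics", "Professor": "Dr. Green", "Semester": "Fall"},
    {"Course": "Math", "Professor": "Dr. Smith", "Semester": "Spring"},
    {"Course": "Physics", "Professor": "Dr. Gray", "Semester": "Spring"},
]

print(holds(rows, ["Professor"], ["Course"]))  # True
print(holds(rows, ["Course"], ["Professor"]))  # False
```

In this sample, `Course -> Professor` fails because Physics appears with two different professors, while every professor teaches exactly one course.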

Let us show how Desbordante helps you discover functional dependencies in a dataset.

# Install Python libraries

In [2]:
!pip install desbordante==2.3.2
!pip install pandas

Collecting desbordante==2.3.2
  Downloading desbordante-2.3.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (19 kB)
Downloading desbordante-2.3.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.0 MB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m4.0/4.0 MB[0m [31m12.4 MB/s[0m eta [36m0:00:00[0m
Installing collected packages: desbordante
Successfully installed desbordante-2.3.2


# Import desbordante and pandas

In [3]:
import desbordante as db
import pandas as pd

# Get the sample dataset

In [4]:
!wget -q https://raw.githubusercontent.com/Desbordante/desbordante-core/main/examples/datasets/university_fd.csv

# Load the data

In [5]:
pd.read_csv("university_fd.csv")

| | Course | Classroom | Professor | Semester |
|---|---|---|---|---|
| 0 | Math | 512 | Dr. Smith | Fall |
| 1 | Physics | 406 | Dr. Green | Fall |
| 2 | English | 208 | Prof. Turner | Fall |
| 3 | History | 209 | Prof. Davis | Fall |
| 4 | Math | 512 | Dr. Smith | Spring |
| 5 | Physics | 503 | Dr. Gray | Spring |
| 6 | English | 116 | Prof. Turner | Spring |
| 7 | Biology | 209 | Prof. Light | Spring |


# Discover functional dependencies

With Desbordante, discovering all functional dependencies in a dataset takes only a few lines of code.

In [6]:
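# Default() selects Desbordante's default FD discovery algorithm
algo = db.fd.algorithms.Default()
# table = (path to the CSV file, column separator, whether the first row is a header)
algo.load_data(table=("university_fd.csv", ',', True))
algo.execute()
print('FDs:')
for fd in algo.get_fds():
    print(fd)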

FDs:
[Course Classroom] -> Professor
[Classroom Semester] -> Professor
[Classroom Semester] -> Course
[Professor] -> Course
[Professor Semester] -> Classroom
[Course Semester] -> Classroom
[Course Semester] -> Professor
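
A discovered dependency can be sanity-checked by hand with pandas: an FD $X \rightarrow Y$ holds when every group of rows sharing the same $X$ value contains a single distinct $Y$ value. The sketch below is an independent check, not Desbordante's API. The helper `fd_holds` is a hypothetical name, and the DataFrame is re-typed inline from the table above so that the block runs without the CSV file.

```python
import pandas as pd

# The university dataset, re-typed inline (assumed identical to university_fd.csv)
df = pd.DataFrame({
    "Course": ["Math", "Physics", "English", "History",
               "Math", "Physics", "English", "Biology"],
    "Classroom": [512, 406, 208, 209, 512, 503, 116, 209],
    "Professor": ["Dr. Smith", "Dr. Green", "Prof. Turner", "Prof. Davis",
                  "Dr. Smith", "Dr. Gray", "Prof. Turner", "Prof. Light"],
    "Semester": ["Fall"] * 4 + ["Spring"] * 4,
})


def fd_holds(frame, lhs, rhs):
    """Check lhs -> rhs, where lhs is a list of columns and rhs a single column."""
    # Each lhs group must contain at most one distinct rhs value
    return bool(frame.groupby(lhs)[rhs].nunique().le(1).all())


print(fd_holds(df, ["Professor"], "Course"))            # True
print(fd_holds(df, ["Course"], "Professor"))            # False
print(fd_holds(df, ["Course", "Semester"], "Classroom"))  # True
print(fd_holds(df, ["Course"], "Classroom"))            # False
```

The last two calls illustrate why `[Course Semester] -> Classroom` is reported with a two-attribute left-hand side: `Course` alone does not determine `Classroom`, since Physics is taught in rooms 406 and 503.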
