### Relational Data

It’s rare that a data analysis involves only a single table of data. Typically you have many tables of data, and you must combine them to answer the questions that you’re interested in. Collectively, multiple tables of data are called relational data because it is the relations, not just the individual datasets, that are important.

We will use the tables from the nycflights13 package to learn about relational data.

**Let's import the libraries and data**

In [None]:
import pandas as pd
import numpy as np

# Load the data
flights = pd.read_csv('../Data/nycflights13_flights.csv', index_col=0)
flights.reset_index(drop=True, inplace=True)

airlines = pd.read_csv('../Data/nycflights13_airlines.csv')
airports = pd.read_csv('../Data/nycflights13_airports.csv')
planes = pd.read_csv('../Data/nycflights13_planes.csv')
weather = pd.read_csv('../Data/nycflights13_weather.csv')

display(flights)
display(airlines)
display(airports)
display(planes)
display(weather)

One way to show the relationships between the different tables is with a drawing:

![Relationships between the nycflights13 tables](fligths_data.png)

There are two types of joins:

**Mutating joins:**

| Join Type              | Description                                                                              |
|------------------------|------------------------------------------------------------------------------------------|
| Inner Join             | Returns only rows with matching values in both tables.                                   |
| Left Join              | Returns all rows from the left table, and matching rows from the right.                  |
| Right Join             | Returns all rows from the right table, and matching rows from the left.                  |
| Full Join (Outer Join) | Returns all rows from both tables, matching where possible and filling the rest with NaN. |

![Mutating joins](mutating_joins.png)
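To make the four mutating joins concrete, here is a minimal sketch on two tiny tables. `flights_toy` and `airlines_toy` are made-up stand-ins (not the real nycflights13 data), each holding one carrier that the other table lacks. In pandas, the join type is chosen with the `how` argument of `merge`, and a full join is spelled `how='outer'`.

```python
import pandas as pd

# Hypothetical toy tables, invented for illustration only
flights_toy = pd.DataFrame({
    'flight': [1, 2, 3],
    'carrier': ['AA', 'UA', 'ZZ'],   # 'ZZ' has no match in airlines_toy
})
airlines_toy = pd.DataFrame({
    'carrier': ['AA', 'UA', 'DL'],   # 'DL' has no match in flights_toy
    'name': ['American', 'United', 'Delta'],
})

# Run the same merge with each join type and record which carriers survive
kept = {}
for how in ['inner', 'left', 'right', 'outer']:
    joined = flights_toy.merge(airlines_toy, on='carrier', how=how)
    kept[how] = sorted(joined['carrier'])
    print(how, kept[how])
```

Only the inner join drops both unmatched carriers. The left join keeps `ZZ`, the right join keeps `DL`, and the outer join keeps both, with NaN in the columns that have no partner row.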

**Filtering joins:**

| Join Type   | Description                                                                 |
|-------------|-----------------------------------------------------------------------------|
| Semi Join   | Returns all rows from the left table where a match exists in the right table, but without bringing in the columns from the right table. |
| Anti Join   | Returns all rows from the left table where **no match** exists in the right table. |
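pandas has no `how='semi'` or `how='anti'` option in `merge`, so filtering joins are usually built from other pieces. The sketch below shows two common approaches on made-up toy tables (`flights_toy` and `airlines_toy` are hypothetical): a boolean mask with `isin` for a single key column, and `merge` with `indicator=True`, which also works when the key spans several columns.

```python
import pandas as pd

# Hypothetical toy tables, invented for illustration only
flights_toy = pd.DataFrame({
    'flight': [1, 2, 3, 4],
    'carrier': ['AA', 'UA', 'ZZ', 'AA'],   # 'ZZ' has no match in airlines_toy
})
airlines_toy = pd.DataFrame({
    'carrier': ['AA', 'UA', 'DL'],
    'name': ['American', 'United', 'Delta'],
})

# Single-key approach: a boolean mask marks flights whose carrier is known
has_match = flights_toy['carrier'].isin(airlines_toy['carrier'])
semi = flights_toy[has_match]    # semi join: matched rows, left columns only
anti = flights_toy[~has_match]   # anti join: unmatched rows

# Alternative for an anti join: a left merge with an indicator column.
# Merging only the de-duplicated key avoids adding columns or duplicating rows.
merged = flights_toy.merge(
    airlines_toy[['carrier']].drop_duplicates(),
    on='carrier', how='left', indicator=True,
)
anti_via_merge = merged.loc[merged['_merge'] == 'left_only'].drop(columns='_merge')

print(semi)
print(anti)
```

Neither result contains the `name` column from `airlines_toy`: a filtering join only decides which rows of the left table to keep.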


**Inner Join**

In [None]:
# Keeps only flights where the carrier exists in the airlines table.
flights.merge(airlines, on='carrier', how='inner')

**Left Join**

In [None]:
# Keeps all flights and adds airline info if the carrier exists.
flights.merge(airlines, on='carrier', how='left')

**Right Join**

In [None]:
# Keeps all airlines and only the flights that match them.
flights.merge(airlines, on='carrier', how='right')

**Full Join**

In [None]:
# Keeps all flights and all airlines, matching where possible, filling missing values with NaN.
flights.merge(airlines, on='carrier', how='outer')

**Chaining Multiple Joins**

Let’s say you want to combine:

- `flights` (main table)
- `airlines` (carrier name)
- `planes` (plane details by tail number)
- `airports` (destination info)

Recall from Data Transformation that it’s better to assign larger chains of functions to a variable using `=`.

In [None]:
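flights_all = (flights.merge(airlines, on='carrier', how='left')  # carrier name
               .merge(planes, on='tailnum', how='left', suffixes=('', '_plane'))  # assumes key column `tailnum`
               .merge(airports, left_on='dest', right_on='faa', how='left', suffixes=('_airline', '_airport')))  # assumes airports key `faa`
flights_all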

### All Done!