# US Flights

In this notebook we will load a dataset with information about US flights. The data is publicly accessible at the [Bureau of Transportation Statistics](https://www.transtats.bts.gov/Homepage.asp) of the US Department of Transportation. We will load a selection of this data that is hosted on AWS CloudFront.

To execute queries and upload data to the Exasol database, we will use the <a href="https://github.com/exasol/pyexasol" target="_blank" rel="noopener">`pyexasol`</a> module.

## Prerequisites

Prior to using this notebook the following steps need to be completed:
1. [Configure the AI-Lab](../main_config.ipynb).

## Setup

### Open Secure Configuration Storage

In [None]:
%run ../utils/access_store_ui.ipynb
display(get_access_store_ui('../'))

## Create tables

In [None]:
from exasol.nb_connector.connections import open_pyexasol_connection

In [46]:
sql = f"""
CREATE OR REPLACE TABLE "{ai_lab_config.db_schema}"."US_FLIGHTS" (
        FL_DATE DATE,
        OP_CARRIER_AIRLINE_ID DECIMAL(10, 0),
        ORIGIN_AIRPORT_SEQ_ID DECIMAL(10, 0),
        ORIGIN_STATE_ABR CHAR(2),
        DEST_AIRPORT_SEQ_ID DECIMAL(10, 0),
        DEST_STATE_ABR CHAR(2),
        CRS_DEP_TIME CHAR(4),
        DEP_DELAY DECIMAL(6, 2),
        CRS_ARR_TIME CHAR(4),
        ARR_DELAY DECIMAL(6, 2),
        CANCELLED BOOLEAN,
        CANCELLATION_CODE CHAR(1),
        DIVERTED BOOLEAN,
        CRS_ELAPSED_TIME DECIMAL(6, 2),
        ACTUAL_ELAPSED_TIME DECIMAL(6, 2),
        DISTANCE DECIMAL(6, 2),
        CARRIER_DELAY DECIMAL(6, 2),
        WEATHER_DELAY DECIMAL(6, 2),
        NAS_DELAY DECIMAL(6, 2),
        SECURITY_DELAY DECIMAL(6, 2),
        LATE_AIRCRAFT_DELAY DECIMAL(6, 2)
);
"""

with open_pyexasol_connection(ai_lab_config, compression=True) as conn:
    conn.execute(query=sql)

In [47]:
sql = f"""
CREATE OR REPLACE TABLE "{ai_lab_config.db_schema}"."US_AIRLINES" (
        OP_CARRIER_AIRLINE_ID DECIMAL(10, 0) IDENTITY PRIMARY KEY,
        CARRIER_NAME VARCHAR(1000)
);
"""

with open_pyexasol_connection(ai_lab_config, compression=True) as conn:
    conn.execute(query=sql)
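
As an optional sanity check, the sketch below counts the rows in the two new tables; a successful count of zero indicates that a table exists and is still empty. The helper names `count_rows` and `table_row_counts` are hypothetical and not part of the AI-Lab utilities. The only assumption about the connection is that it behaves like a `pyexasol` connection, i.e. `execute()` returns a statement that offers `fetchval()`.

```python
def count_rows(conn, schema: str, table: str) -> int:
    """Return the number of rows in "schema"."table".

    `conn` is assumed to be a pyexasol-style connection whose
    execute() result provides fetchval().
    """
    sql = f'SELECT COUNT(*) FROM "{schema}"."{table}"'
    return int(conn.execute(sql).fetchval())


def table_row_counts(conn, schema: str, tables=("US_FLIGHTS", "US_AIRLINES")) -> dict:
    """Map each table name to its current row count."""
    return {table: count_rows(conn, schema, table) for table in tables}


# Usage in this notebook (assumes ai_lab_config from the configuration cell above):
# with open_pyexasol_connection(ai_lab_config, compression=True) as conn:
#     print(table_row_counts(conn, ai_lab_config.db_schema))
```

If a table was not created, the count query should fail with a database error instead of returning a number.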

## Bring in the UI functions

We will need some UI functions that will handle loading the data.

In [None]:
%run utils/flight_utils.ipynb

## Load the data

Please select one or more data periods for the flights in the table below. Once the data for the selected periods has been loaded, the corresponding entries are removed from the table. Please do not load data for the same period more than once.

You can also load the airlines' data (their codes and names). Load it only once: a repeated attempt will result in a primary key violation error.

In [51]:
display(get_data_selection_ui(ai_lab_config))

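
Because loading the same period twice would duplicate rows, it may help to see which periods are already in `US_FLIGHTS`. The following is a minimal sketch that groups the flights by month of `FL_DATE`. It assumes that a data period corresponds to a calendar month, which this notebook does not state explicitly. The helper names `loaded_periods_sql` and `loaded_periods` are hypothetical.

```python
def loaded_periods_sql(schema: str) -> str:
    """Build a query that counts flights per month of FL_DATE."""
    month = "TO_CHAR(FL_DATE, 'YYYY-MM')"
    return (
        f"SELECT {month}, COUNT(*) "
        f'FROM "{schema}"."US_FLIGHTS" '
        f"GROUP BY {month}"
    )


def loaded_periods(conn, schema: str) -> list:
    """Return (month, number_of_flights) pairs, sorted by month.

    `conn` is assumed to be a pyexasol-style connection whose
    execute() result provides fetchall() returning row tuples.
    """
    rows = conn.execute(loaded_periods_sql(schema)).fetchall()
    # Sort in Python so the result does not depend on the database's row order.
    return sorted((str(month), int(count)) for month, count in rows)


# Usage in this notebook (assumes ai_lab_config from the configuration cell above):
# with open_pyexasol_connection(ai_lab_config, compression=True) as conn:
#     for month, count in loaded_periods(conn, ai_lab_config.db_schema):
#         print(month, count)
```

A month whose count is roughly double that of its neighbours is a hint that its data may have been loaded twice.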