#### SQL Assignment

We will use the OpenFlights database. The following code downloads the data and sets up a SQLite database that serves as the basis for the quiz.

In [25]:
import sqlalchemy
import sqlite3
import pandas as pd

# Map each table name to its OpenFlights data URL and the column names for that file
url_dict = {
'airports' : ('https://raw.githubusercontent.com/jpatokal/openflights/master/data/airports.dat',["Airport ID", "Name","City",'Country','IATA','ICAO','Latitude','Longitude','Altitude',
                                                                                                 'Timezone','DST','Tz database timezone','Type','Source']),
'airlines' : ('https://raw.githubusercontent.com/jpatokal/openflights/master/data/airlines.dat',['Airline ID','Name','Alias','IATA','ICAO','Callsign','Country','Active']),
'routes' : ('https://raw.githubusercontent.com/jpatokal/openflights/master/data/routes.dat',['Airline','Airline ID','Source airport','Source airport ID','Destination airport',
                                                                                             'Destination airport ID','Codeshare','Stops','Equipment']),
'planes' : ('https://raw.githubusercontent.com/jpatokal/openflights/master/data/planes.dat',['Name','IATA code','ICAO code'])
}



conn = sqlite3.connect("openflights.db")
cursor = conn.cursor()

for db_name, (url,columns) in url_dict.items():
    df = pd.read_csv(url,names=columns)
    df.columns = df.columns.str.lower().str.replace(" ", "_")
    print(df)
    df.to_sql(db_name, conn, if_exists='replace', index=False)

      airport_id                                         name          city  \
0              1                               Goroka Airport        Goroka   
1              2                               Madang Airport        Madang   
2              3                 Mount Hagen Kagamuga Airport   Mount Hagen   
3              4                               Nadzab Airport        Nadzab   
4              5  Port Moresby Jacksons International Airport  Port Moresby   
...          ...                                          ...           ...   
7693       14106                          Rogachyovo Air Base        Belaya   
7694       14107                        Ulan-Ude East Airport      Ulan Ude   
7695       14108                         Krechevitsy Air Base      Novgorod   
7696       14109                  Desierto de Atacama Airport       Copiapo   
7697       14110                           Melitopol Air Base     Melitopol   

               country iata  icao   latitude   long

In [26]:
query = "select * from airports"

df = pd.read_sql_query(query,conn)

df

| | airport_id | name | city | country | iata | icao | latitude | longitude | altitude | timezone | dst | tz_database_timezone | type | source |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Goroka Airport | Goroka | Papua New Guinea | GKA | AYGA | -6.081690 | 145.391998 | 5282 | 10 | U | Pacific/Port_Moresby | airport | OurAirports |
| 1 | 2 | Madang Airport | Madang | Papua New Guinea | MAG | AYMD | -5.207080 | 145.789001 | 20 | 10 | U | Pacific/Port_Moresby | airport | OurAirports |
| 2 | 3 | Mount Hagen Kagamuga Airport | Mount Hagen | Papua New Guinea | HGU | AYMH | -5.826790 | 144.296005 | 5388 | 10 | U | Pacific/Port_Moresby | airport | OurAirports |
| 3 | 4 | Nadzab Airport | Nadzab | Papua New Guinea | LAE | AYNZ | -6.569803 | 146.725977 | 239 | 10 | U | Pacific/Port_Moresby | airport | OurAirports |
| 4 | 5 | Port Moresby Jacksons International Airport | Port Moresby | Papua New Guinea | POM | AYPY | -9.443380 | 147.220001 | 146 | 10 | U | Pacific/Port_Moresby | airport | OurAirports |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 7693 | 14106 | Rogachyovo Air Base | Belaya | Russia | \N | ULDA | 71.616699 | 52.478298 | 272 | \N | \N | \N | airport | OurAirports |
| 7694 | 14107 | Ulan-Ude East Airport | Ulan Ude | Russia | \N | XIUW | 51.849998 | 107.737999 | 1670 | \N | \N | \N | airport | OurAirports |
| 7695 | 14108 | Krechevitsy Air Base | Novgorod | Russia | \N | ULLK | 58.625000 | 31.385000 | 85 | \N | \N | \N | airport | OurAirports |
| 7696 | 14109 | Desierto de Atacama Airport | Copiapo | Chile | CPO | SCAT | -27.261200 | -70.779198 | 670 | \N | \N | \N | airport | OurAirports |
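
Before starting the quiz, it can help to confirm which tables were loaded and what their columns are called after the lowercase/underscore renaming. The following is a minimal sketch, not part of the original assignment: `describe_tables` is a hypothetical helper, and the in-memory database with two cut-down tables is a stand-in so the example runs without downloading anything. In the notebook you would pass the existing `conn` instead.

```python
import sqlite3


def describe_tables(connection):
    """Return {table_name: [column names]} for every table in the database.

    Hypothetical helper for illustration; it is not part of the assignment code.
    """
    table_names = [
        row[0]
        for row in connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    schema = {}
    for table in table_names:
        # PRAGMA table_info yields one row per column; index 1 holds the column name.
        info_rows = connection.execute(f'PRAGMA table_info("{table}")')
        schema[table] = [row[1] for row in info_rows]
    return schema


# Toy stand-in for openflights.db (assumption: only a few columns are reproduced).
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("CREATE TABLE airports (airport_id INTEGER, name TEXT, city TEXT)")
demo_conn.execute(
    "CREATE TABLE routes (airline TEXT, source_airport TEXT, stops INTEGER)"
)

schema = describe_tables(demo_conn)
for table, columns in schema.items():
    print(table, columns)
```

Calling `describe_tables(conn)` against the real database should list `airports`, `airlines`, `routes`, and `planes`, assuming the setup cell above ran successfully.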


#### SQL Quiz

1. Write a SQL query to retrieve all columns for the first 5 rows from the `airports` table.

2. Write a SQL query to select the `name`, `city`, and `country` columns from the `airports` table for airports located in the United States.

3. Write a SQL query that counts the total number of active airlines in the `airlines` table (assume an active airline is marked with `'Y'` in the `active` column).

4. Write a SQL query to list the `name`, `city`, `country`, and `altitude` columns from the `airports` table, ordered by altitude in descending order.

5. Write a SQL query that shows the number of routes originating from each source airport by displaying the `source_airport` and the count of routes.

6. Write a SQL query to list the airline name along with the source and destination airport codes for each route by joining the `airlines` and `routes` tables on `airline_id`.

7. Write a SQL query to find the names of airports that do not serve as a source airport in any route.

8. Write a SQL query using a self-join on the `routes` table to identify source airports that serve multiple distinct destination airports, listing the source airport code along with two different destination airport codes.

9. Write a SQL statement to update the `active` status to `'N'` for the airline with a specific `airline_id` (e.g., `1234`) in the `airlines` table.

10. Write a SQL transaction that deletes all routes from the `routes` table where `stops` is greater than 0, commits the transaction, and then verifies that no routes with `stops` greater than 0 remain.
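
Questions 9 and 10 modify data, so they cannot be run through `pd.read_sql_query` alone. The sketch below shows one way to execute write statements with Python's `sqlite3` module: a parameterized `UPDATE`, then a `DELETE` that is committed and verified. It deliberately uses a made-up `demo_items` table in an in-memory database (an assumption for illustration, not the OpenFlights schema), so the quiz queries themselves are left to you.

```python
import sqlite3

# Toy database; table and column names here are hypothetical.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute(
    "CREATE TABLE demo_items (item_id INTEGER PRIMARY KEY, label TEXT, qty INTEGER)"
)
demo_conn.executemany(
    "INSERT INTO demo_items (item_id, label, qty) VALUES (?, ?, ?)",
    [(1, "a", 0), (2, "b", 3), (3, "c", 0)],
)
demo_conn.commit()

# Parameterized UPDATE: the "?" placeholders keep values out of the SQL string.
# Using the connection as a context manager commits on success and
# rolls back if an exception is raised inside the block.
target_id = 2
with demo_conn:
    demo_conn.execute(
        "UPDATE demo_items SET label = ? WHERE item_id = ?",
        ("updated", target_id),
    )

# DELETE with an explicit commit, rolling back if anything goes wrong.
try:
    cursor = demo_conn.execute("DELETE FROM demo_items WHERE qty > 0")
    deleted = cursor.rowcount  # number of rows removed by the DELETE
    demo_conn.commit()
except sqlite3.Error:
    demo_conn.rollback()
    raise

# Verify that no matching rows remain after the commit.
remaining = demo_conn.execute(
    "SELECT COUNT(*) FROM demo_items WHERE qty > 0"
).fetchone()[0]
print(deleted, remaining)
```

The same pattern should carry over to the notebook's `conn`. Note that the setup cell uses `if_exists='replace'`, so re-running it restores the original tables after a destructive query.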
