## How to use Itinerary Builder

Itinerary Builder is a module that queries a database of possible itineraries and returns a dataframe with information on each. The critical outputs include the origin and destination of each leg of a two-flight itinerary, along with the departure and arrival times, airlines, and durations. Each itinerary also includes the next-best second-leg flight, in case the connecting flight is missed.

A machine learning algorithm weights the time cost of a missed connection by the likelihood of that missed connection occurring.
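
The weighting can be illustrated with values from the sample output further down. In those rows, TOTAL_RISK matches RISK_MISSED_CONNECTION multiplied by NEXT_FLIGHT_TIMELOSS, and NEXT_FLIGHT_TIMELOSS matches the minutes between the scheduled arrival and the next-best arrival. The sketch below reproduces that arithmetic with the standard library only. The function names are hypothetical, and this is an illustration of the calculation, not the module's actual code.

```python
from datetime import datetime


def minutes_between(start: str, end: str) -> float:
    """Minutes from start to end; both are ISO timestamps with UTC offsets."""
    delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
    return delta.total_seconds() / 60


def expected_time_cost(p_missed: float, timeloss_minutes: float) -> float:
    """Probability-weighted time cost of a missed connection, in minutes."""
    return p_missed * timeloss_minutes


# Values copied from the ATL > PHX > SEA row of the sample output.
scheduled_arrival = "2022-08-03 14:26:00-07:00"
next_best_arrival = "2022-08-03 18:41:00-07:00"

timeloss = minutes_between(scheduled_arrival, next_best_arrival)
risk = expected_time_cost(0.040165, timeloss)
print(timeloss, round(risk, 2))  # 255.0 10.24
```

The small difference from the 10.242185 shown in the table is expected, since the displayed probability is rounded to six decimal places.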

Users of the module must provide a connection time assumption: the minimum time between connecting flights that the user finds acceptable (e.g., allow at least 45 minutes for a connection).
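
As a rough sketch of what this assumption means, the check below compares the gap between the first leg's arrival and the second leg's departure against the minimum. The helper names are hypothetical, and treating the threshold as inclusive (>=) is an assumption; the module may apply the cutoff differently.

```python
from datetime import datetime


def connect_minutes(first_leg_arrival: str, second_leg_departure: str) -> float:
    """Minutes between arriving on leg one and departing on leg two."""
    arrival = datetime.fromisoformat(first_leg_arrival)
    departure = datetime.fromisoformat(second_leg_departure)
    return (departure - arrival).total_seconds() / 60


def is_acceptable(first_leg_arrival: str, second_leg_departure: str, tc: float) -> bool:
    """True if the connection allows at least tc minutes (assumed inclusive)."""
    return connect_minutes(first_leg_arrival, second_leg_departure) >= tc


# Timestamps from the ATL > SJC > SEA row of the sample output (53-minute connection).
arrival = "2022-08-03 11:02:00-07:00"
departure = "2022-08-03 11:55:00-07:00"

print(is_acceptable(arrival, departure, tc=45))  # True
print(is_acceptable(arrival, departure, tc=60))  # False
```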

Examples of each function's usage are below.

In [1]:
from itineraryBuilder import *

In [2]:
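import pandas as pd  # explicit import, in case the star import above does not export pd

# Show every column when a dataframe is displayed
pd.set_option('display.max_columns', None)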

### Itinerary Builder main function call

With default options, the resulting dataframe is ordered by risk (ascending TOTAL_RISK in the output), as shown below.

In [3]:
origin = 'ATL'
destination = 'SEA'
flight_date = '8/3/2022'

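# Time-window bounds, given as HHMM strings. The names presumably stand for
# depart no earlier / no later than and arrive no earlier / no later than.
dne = '700'
dnl = '1000'
ane = '1'
anl = '2000'
tc = 45  # minimum acceptable connection time, in minutes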

df = itineraryBuilder('faa', origin, destination, flight_date, tc, dne, dnl, ane, anl)
df.head(10)

Unnamed: 0,FIRST_LEG_AIRLINE,FIRST_LEG_ORIG,FIRST_LEG_ORIG_CITY,FIRST_LEG_DEST,FIRST_LEG_DEST_CITY,FIRST_LEG_DATE,FIRST_LEG_FLIGHT_NUM,FIRST_LEG_DEP_TIME,FIRST_LEG_ARR_TIME,SECOND_LEG_AIRLINE,SECOND_LEG_ORIG,SECOND_LEG_ORIG_CITY,SECOND_LEG_DEST,SECOND_LEG_DEST_CITY,SECOND_LEG_DATE,SECOND_LEG_FLIGHT_NUM,SECOND_LEG_DEP_TIME,SECOND_LEG_ARR_TIME,NEXT_BEST_SECOND_LEG_DATE,NEXT_BEST_SECOND_LEG_DEP_TIME,NEXT_BEST_SECOND_LEG_ARR_TIME,FIRST_LEG_ORIG_TZ,FIRST_LEG_DEST_TZ,SECOND_LEG_ORIG_TZ,SECOND_LEG_DEST_TZ,FIRST_LEG_DEP_TIMESTAMP,FIRST_LEG_ARR_TIMESTAMP,SECOND_LEG_DEP_TIMESTAMP,SECOND_LEG_ARR_TIMESTAMP,NEXT_BEST_SECOND_LEG_DEP_TIMESTAMP,NEXT_BEST_SECOND_LEG_ARR_TIMESTAMP,overnight_bool_1,overnight_bool_2,overnight_bool_3,FIRST_FLIGHT_DURATION,SECOND_FLIGHT_DURATION,CONNECT_TIME,TRIP_TIME,RISK_MISSED_CONNECTION,NEXT_FLIGHT_TIMELOSS,TOTAL_RISK
0,DL,ATL,"Atlanta, GA",PHX,"Phoenix, AZ",8/3/2022,1065,810,901,DL,PHX,"Phoenix, AZ",SEA,"Seattle, WA",8/3/2022,1322,1121,1426,8/3/2022,1540,1841,US/Eastern,US/Arizona,US/Arizona,US/Pacific,2022-08-03 08:10:00-04:00,2022-08-03 09:01:00-07:00,2022-08-03 11:21:00-07:00,2022-08-03 14:26:00-07:00,2022-08-03 15:40:00-07:00,2022-08-03 18:41:00-07:00,0,0,0,231.0,185.0,140.0,556.0,0.040165,255.0,10.242185
1,DL,ATL,"Atlanta, GA",SFO,"San Francisco, CA",8/3/2022,715,834,1033,DL,SFO,"San Francisco, CA",SEA,"Seattle, WA",8/3/2022,3609,1304,1515,8/3/2022,1644,1843,US/Eastern,US/Pacific,US/Pacific,US/Pacific,2022-08-03 08:34:00-04:00,2022-08-03 10:33:00-07:00,2022-08-03 13:04:00-07:00,2022-08-03 15:15:00-07:00,2022-08-03 16:44:00-07:00,2022-08-03 18:43:00-07:00,0,0,0,299.0,131.0,151.0,581.0,0.105715,208.0,21.988646
2,DL,ATL,"Atlanta, GA",LAS,"Las Vegas, NV",8/3/2022,419,825,933,DL,LAS,"Las Vegas, NV",SEA,"Seattle, WA",8/3/2022,2760,1330,1610,8/3/2022,1541,1818,US/Eastern,US/Pacific,US/Pacific,US/Pacific,2022-08-03 08:25:00-04:00,2022-08-03 09:33:00-07:00,2022-08-03 13:30:00-07:00,2022-08-03 16:10:00-07:00,2022-08-03 15:41:00-07:00,2022-08-03 18:18:00-07:00,0,0,0,248.0,160.0,237.0,645.0,0.188957,128.0,24.18644
3,DL,ATL,"Atlanta, GA",SFO,"San Francisco, CA",8/3/2022,824,1000,1203,DL,SFO,"San Francisco, CA",SEA,"Seattle, WA",8/3/2022,3586,1644,1843,8/4/2022,630,842,US/Eastern,US/Pacific,US/Pacific,US/Pacific,2022-08-03 10:00:00-04:00,2022-08-03 12:03:00-07:00,2022-08-03 16:44:00-07:00,2022-08-03 18:43:00-07:00,2022-08-04 06:30:00-07:00,2022-08-04 08:42:00-07:00,0,0,0,303.0,119.0,281.0,703.0,0.03199,839.0,26.83954
4,DL,ATL,"Atlanta, GA",MSP,"Minneapolis, MN",8/3/2022,2262,1007,1140,DL,MSP,"Minneapolis, MN",SEA,"Seattle, WA",8/3/2022,890,1616,1800,8/3/2022,1816,2005,US/Eastern,US/Central,US/Central,US/Pacific,2022-08-03 10:07:00-04:00,2022-08-03 11:40:00-05:00,2022-08-03 16:16:00-05:00,2022-08-03 18:00:00-07:00,2022-08-03 18:16:00-05:00,2022-08-03 20:05:00-07:00,0,0,0,153.0,224.0,276.0,653.0,0.220384,125.0,27.547938
5,DL,ATL,"Atlanta, GA",SAN,"San Diego, CA",8/3/2022,868,950,1112,DL,SAN,"San Diego, CA",SEA,"Seattle, WA",8/3/2022,2904,1355,1650,8/3/2022,1520,1811,US/Eastern,US/Pacific,US/Pacific,US/Pacific,2022-08-03 09:50:00-04:00,2022-08-03 11:12:00-07:00,2022-08-03 13:55:00-07:00,2022-08-03 16:50:00-07:00,2022-08-03 15:20:00-07:00,2022-08-03 18:11:00-07:00,0,0,0,262.0,175.0,163.0,600.0,0.350741,81.0,28.409999
6,DL,ATL,"Atlanta, GA",PDX,"Portland, OR",8/3/2022,841,839,1041,DL,PDX,"Portland, OR",SEA,"Seattle, WA",8/3/2022,3839,1340,1444,8/3/2022,1520,1624,US/Eastern,US/Pacific,US/Pacific,US/Pacific,2022-08-03 08:39:00-04:00,2022-08-03 10:41:00-07:00,2022-08-03 13:40:00-07:00,2022-08-03 14:44:00-07:00,2022-08-03 15:20:00-07:00,2022-08-03 16:24:00-07:00,0,0,0,302.0,64.0,179.0,545.0,0.416782,100.0,41.678186
7,DL,ATL,"Atlanta, GA",SLC,"Salt Lake City, UT",8/3/2022,381,810,1012,DL,SLC,"Salt Lake City, UT",SEA,"Seattle, WA",8/3/2022,500,1140,1254,8/3/2022,1530,1640,US/Eastern,US/Mountain,US/Mountain,US/Pacific,2022-08-03 08:10:00-04:00,2022-08-03 10:12:00-06:00,2022-08-03 11:40:00-06:00,2022-08-03 12:54:00-07:00,2022-08-03 15:30:00-06:00,2022-08-03 16:40:00-07:00,0,0,0,242.0,134.0,88.0,464.0,0.188475,226.0,42.595435
8,DL,ATL,"Atlanta, GA",SLC,"Salt Lake City, UT",8/3/2022,381,810,1012,DL,SLC,"Salt Lake City, UT",SEA,"Seattle, WA",8/3/2022,796,1530,1640,8/3/2022,1720,1828,US/Eastern,US/Mountain,US/Mountain,US/Pacific,2022-08-03 08:10:00-04:00,2022-08-03 10:12:00-06:00,2022-08-03 15:30:00-06:00,2022-08-03 16:40:00-07:00,2022-08-03 17:20:00-06:00,2022-08-03 18:28:00-07:00,0,0,0,242.0,130.0,318.0,690.0,0.535442,108.0,57.82771
9,DL,ATL,"Atlanta, GA",MSP,"Minneapolis, MN",8/3/2022,503,715,848,DL,MSP,"Minneapolis, MN",SEA,"Seattle, WA",8/3/2022,894,1100,1248,8/3/2022,1616,1800,US/Eastern,US/Central,US/Central,US/Pacific,2022-08-03 07:15:00-04:00,2022-08-03 08:48:00-05:00,2022-08-03 11:00:00-05:00,2022-08-03 12:48:00-07:00,2022-08-03 16:16:00-05:00,2022-08-03 18:00:00-07:00,0,0,0,153.0,228.0,132.0,513.0,0.1886,312.0,58.843203


To order by trip duration instead, pass orderby='duration':

In [4]:
df = itineraryBuilder('faa', origin, destination, flight_date, tc, dne, dnl, ane, anl, orderby='duration')
df.head(10)

Unnamed: 0,FIRST_LEG_AIRLINE,FIRST_LEG_ORIG,FIRST_LEG_ORIG_CITY,FIRST_LEG_DEST,FIRST_LEG_DEST_CITY,FIRST_LEG_DATE,FIRST_LEG_FLIGHT_NUM,FIRST_LEG_DEP_TIME,FIRST_LEG_ARR_TIME,SECOND_LEG_AIRLINE,SECOND_LEG_ORIG,SECOND_LEG_ORIG_CITY,SECOND_LEG_DEST,SECOND_LEG_DEST_CITY,SECOND_LEG_DATE,SECOND_LEG_FLIGHT_NUM,SECOND_LEG_DEP_TIME,SECOND_LEG_ARR_TIME,NEXT_BEST_SECOND_LEG_DATE,NEXT_BEST_SECOND_LEG_DEP_TIME,NEXT_BEST_SECOND_LEG_ARR_TIME,FIRST_LEG_ORIG_TZ,FIRST_LEG_DEST_TZ,SECOND_LEG_ORIG_TZ,SECOND_LEG_DEST_TZ,FIRST_LEG_DEP_TIMESTAMP,FIRST_LEG_ARR_TIMESTAMP,SECOND_LEG_DEP_TIMESTAMP,SECOND_LEG_ARR_TIMESTAMP,NEXT_BEST_SECOND_LEG_DEP_TIMESTAMP,NEXT_BEST_SECOND_LEG_ARR_TIMESTAMP,overnight_bool_1,overnight_bool_2,overnight_bool_3,FIRST_FLIGHT_DURATION,SECOND_FLIGHT_DURATION,CONNECT_TIME,TRIP_TIME,RISK_MISSED_CONNECTION,NEXT_FLIGHT_TIMELOSS,TOTAL_RISK
0,DL,ATL,"Atlanta, GA",DEN,"Denver, CO",8/3/2022,1537,955,1059,DL,DEN,"Denver, CO",SEA,"Seattle, WA",8/3/2022,3551,1230,1436,8/4/2022,650,848,US/Eastern,US/Mountain,US/Mountain,US/Pacific,2022-08-03 09:55:00-04:00,2022-08-03 10:59:00-06:00,2022-08-03 12:30:00-06:00,2022-08-03 14:36:00-07:00,2022-08-04 06:50:00-06:00,2022-08-04 08:48:00-07:00,0,0,0,184.0,186.0,91.0,461.0,0.582742,1092.0,636.353955
1,DL,ATL,"Atlanta, GA",SLC,"Salt Lake City, UT",8/3/2022,381,810,1012,DL,SLC,"Salt Lake City, UT",SEA,"Seattle, WA",8/3/2022,500,1140,1254,8/3/2022,1530,1640,US/Eastern,US/Mountain,US/Mountain,US/Pacific,2022-08-03 08:10:00-04:00,2022-08-03 10:12:00-06:00,2022-08-03 11:40:00-06:00,2022-08-03 12:54:00-07:00,2022-08-03 15:30:00-06:00,2022-08-03 16:40:00-07:00,0,0,0,242.0,134.0,88.0,464.0,0.036421,226.0,8.231198
2,DL,ATL,"Atlanta, GA",ONT,"Ontario, CA",8/3/2022,1144,953,1110,DL,ONT,"Ontario, CA",SEA,"Seattle, WA",8/3/2022,3659,1206,1445,8/4/2022,613,855,US/Eastern,US/Pacific,US/Pacific,US/Pacific,2022-08-03 09:53:00-04:00,2022-08-03 11:10:00-07:00,2022-08-03 12:06:00-07:00,2022-08-03 14:45:00-07:00,2022-08-04 06:13:00-07:00,2022-08-04 08:55:00-07:00,0,0,0,257.0,159.0,56.0,472.0,0.459274,1090.0,500.608265
3,DL,ATL,"Atlanta, GA",GEG,"Spokane, WA",8/3/2022,921,950,1127,DL,GEG,"Spokane, WA",SEA,"Seattle, WA",8/3/2022,3605,1325,1443,8/3/2022,1725,1839,US/Eastern,US/Pacific,US/Pacific,US/Pacific,2022-08-03 09:50:00-04:00,2022-08-03 11:27:00-07:00,2022-08-03 13:25:00-07:00,2022-08-03 14:43:00-07:00,2022-08-03 17:25:00-07:00,2022-08-03 18:39:00-07:00,0,0,0,277.0,78.0,118.0,473.0,0.399938,236.0,94.385425
4,DL,ATL,"Atlanta, GA",DTW,"Detroit, MI",8/3/2022,352,910,1107,DL,DTW,"Detroit, MI",SEA,"Seattle, WA",8/3/2022,830,1215,1410,8/3/2022,1630,1821,US/Eastern,US/Eastern,US/Eastern,US/Pacific,2022-08-03 09:10:00-04:00,2022-08-03 11:07:00-04:00,2022-08-03 12:15:00-04:00,2022-08-03 14:10:00-07:00,2022-08-03 16:30:00-04:00,2022-08-03 18:21:00-07:00,0,0,0,117.0,295.0,68.0,480.0,0.644722,251.0,161.825328
5,DL,ATL,"Atlanta, GA",SJC,"San Jose, CA",8/3/2022,1743,910,1102,DL,SJC,"San Jose, CA",SEA,"Seattle, WA",8/3/2022,3608,1155,1414,8/3/2022,1431,1645,US/Eastern,US/Pacific,US/Pacific,US/Pacific,2022-08-03 09:10:00-04:00,2022-08-03 11:02:00-07:00,2022-08-03 11:55:00-07:00,2022-08-03 14:14:00-07:00,2022-08-03 14:31:00-07:00,2022-08-03 16:45:00-07:00,0,0,0,292.0,139.0,53.0,484.0,0.043363,151.0,6.547876
6,DL,ATL,"Atlanta, GA",SNA,"Santa Ana, CA",8/3/2022,778,954,1123,DL,SNA,"Santa Ana, CA",SEA,"Seattle, WA",8/3/2022,1525,1226,1506,8/4/2022,730,1027,US/Eastern,US/Pacific,US/Pacific,US/Pacific,2022-08-03 09:54:00-04:00,2022-08-03 11:23:00-07:00,2022-08-03 12:26:00-07:00,2022-08-03 15:06:00-07:00,2022-08-04 07:30:00-07:00,2022-08-04 10:27:00-07:00,0,0,0,269.0,160.0,63.0,492.0,0.436891,1161.0,507.230176
7,DL,ATL,"Atlanta, GA",SFO,"San Francisco, CA",8/3/2022,824,1000,1203,DL,SFO,"San Francisco, CA",SEA,"Seattle, WA",8/3/2022,3609,1304,1515,8/3/2022,1644,1843,US/Eastern,US/Pacific,US/Pacific,US/Pacific,2022-08-03 10:00:00-04:00,2022-08-03 12:03:00-07:00,2022-08-03 13:04:00-07:00,2022-08-03 15:15:00-07:00,2022-08-03 16:44:00-07:00,2022-08-03 18:43:00-07:00,0,0,0,303.0,131.0,61.0,495.0,0.546321,208.0,113.6348
8,DL,ATL,"Atlanta, GA",LAS,"Las Vegas, NV",8/3/2022,419,825,933,DL,LAS,"Las Vegas, NV",SEA,"Seattle, WA",8/3/2022,2138,1100,1340,8/3/2022,1330,1610,US/Eastern,US/Pacific,US/Pacific,US/Pacific,2022-08-03 08:25:00-04:00,2022-08-03 09:33:00-07:00,2022-08-03 11:00:00-07:00,2022-08-03 13:40:00-07:00,2022-08-03 13:30:00-07:00,2022-08-03 16:10:00-07:00,0,0,0,248.0,160.0,87.0,495.0,0.596534,150.0,89.480124
9,DL,ATL,"Atlanta, GA",TPA,"Tampa, FL",8/3/2022,2211,840,1007,DL,TPA,"Tampa, FL",SEA,"Seattle, WA",8/3/2022,459,1106,1401,8/4/2022,1106,1401,US/Eastern,US/Eastern,US/Eastern,US/Pacific,2022-08-03 08:40:00-04:00,2022-08-03 10:07:00-04:00,2022-08-03 11:06:00-04:00,2022-08-03 14:01:00-07:00,2022-08-04 11:06:00-04:00,2022-08-04 14:01:00-07:00,0,0,0,87.0,355.0,59.0,501.0,0.592095,1440.0,852.616798
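
The two orderings can rank the same itineraries very differently. The sketch below sorts three rows taken from the output above by TRIP_TIME and by TOTAL_RISK using plain Python. That these are the columns behind each ordering is inferred from the sample output rather than from the module's source, and the record layout is a simplification for illustration.

```python
# (connecting airport, TRIP_TIME in minutes, TOTAL_RISK) from the output above
rows = [
    {"via": "SJC", "TRIP_TIME": 484.0, "TOTAL_RISK": 6.547876},
    {"via": "DEN", "TRIP_TIME": 461.0, "TOTAL_RISK": 636.353955},
    {"via": "SLC", "TRIP_TIME": 464.0, "TOTAL_RISK": 8.231198},
]

# Shortest trip first
by_duration = [r["via"] for r in sorted(rows, key=lambda r: r["TRIP_TIME"])]

# Lowest probability-weighted time loss first
by_risk = [r["via"] for r in sorted(rows, key=lambda r: r["TOTAL_RISK"])]

print(by_duration)  # ['DEN', 'SLC', 'SJC']
print(by_risk)      # ['SJC', 'SLC', 'DEN']
```

The shortest itinerary (via DEN) carries by far the highest risk here, because its next-best second leg departs the following morning.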


### Query Flights function call

Query Flights takes many of the same parameters as Itinerary Builder and is in fact called by Itinerary Builder. It returns the initial query result, with the dates and times as string values.

In [5]:
origin = 'ATL'
destination = 'SEA'
flight_date = '8/3/2022'

dne = '700'
dnl = '1000'
ane = '1'
anl = '2000'

df = queryFlights('faa', origin, destination, flight_date, dne, dnl, ane, anl)
df

Unnamed: 0,FIRST_LEG_AIRLINE,FIRST_LEG_ORIG,FIRST_LEG_ORIG_CITY,FIRST_LEG_DEST,FIRST_LEG_DEST_CITY,FIRST_LEG_DATE,FIRST_LEG_FLIGHT_NUM,FIRST_LEG_DEP_TIME,FIRST_LEG_ARR_TIME,SECOND_LEG_AIRLINE,SECOND_LEG_ORIG,SECOND_LEG_ORIG_CITY,SECOND_LEG_DEST,SECOND_LEG_DEST_CITY,SECOND_LEG_DATE,SECOND_LEG_FLIGHT_NUM,SECOND_LEG_DEP_TIME,SECOND_LEG_ARR_TIME,NEXT_BEST_SECOND_LEG_DATE,NEXT_BEST_SECOND_LEG_DEP_TIME,NEXT_BEST_SECOND_LEG_ARR_TIME
0,DL,ATL,"Atlanta, GA",AUS,"Austin, TX",8/3/2022,2725,820,937,DL,AUS,"Austin, TX",SEA,"Seattle, WA",8/3/2022,988,600,825,8/3/2022,1628,1852
1,DL,ATL,"Atlanta, GA",AUS,"Austin, TX",8/3/2022,2725,820,937,DL,AUS,"Austin, TX",SEA,"Seattle, WA",8/3/2022,1067,1628,1852,8/4/2022,600,825
2,DL,ATL,"Atlanta, GA",AUS,"Austin, TX",8/3/2022,2725,820,937,DL,AUS,"Austin, TX",SEA,"Seattle, WA",8/4/2022,988,600,825,8/4/2022,1628,1852
3,DL,ATL,"Atlanta, GA",AUS,"Austin, TX",8/3/2022,2725,820,937,DL,AUS,"Austin, TX",SEA,"Seattle, WA",8/4/2022,1067,1628,1852,8/5/2022,600,825
4,DL,ATL,"Atlanta, GA",AUS,"Austin, TX",8/3/2022,2725,820,937,DL,AUS,"Austin, TX",SEA,"Seattle, WA",8/5/2022,988,600,825,8/5/2022,1628,1852
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
303,DL,ATL,"Atlanta, GA",SNA,"Santa Ana, CA",8/3/2022,778,954,1123,DL,SNA,"Santa Ana, CA",SEA,"Seattle, WA",8/5/2022,2557,730,1027,8/5/2022,1226,1506
304,DL,ATL,"Atlanta, GA",TPA,"Tampa, FL",8/3/2022,1532,1000,1128,DL,TPA,"Tampa, FL",SEA,"Seattle, WA",8/3/2022,459,1106,1401,8/4/2022,1106,1401
305,DL,ATL,"Atlanta, GA",TPA,"Tampa, FL",8/3/2022,2211,840,1007,DL,TPA,"Tampa, FL",SEA,"Seattle, WA",8/3/2022,459,1106,1401,8/4/2022,1106,1401
306,DL,ATL,"Atlanta, GA",TPA,"Tampa, FL",8/3/2022,1532,1000,1128,DL,TPA,"Tampa, FL",SEA,"Seattle, WA",8/4/2022,459,1106,1401,8/5/2022,1106,1401
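
Because the dates and times come back as strings, they need to be parsed before any arithmetic. A minimal sketch of that step, assuming dates are formatted M/D/YYYY and times are HHMM without leading zeros (as they appear in the output above); parse_local is a hypothetical helper, not part of the module.

```python
from datetime import datetime


def parse_local(date_str: str, hhmm) -> datetime:
    """Combine an M/D/YYYY date and an HHMM clock time into a naive datetime."""
    padded = str(hhmm).zfill(4)  # '820' becomes '0820', '1' becomes '0001'
    return datetime.strptime(f"{date_str} {padded}", "%m/%d/%Y %H%M")


departure = parse_local("8/3/2022", "820")
arrival = parse_local("8/3/2022", 937)
print(departure)  # 2022-08-03 08:20:00
print(arrival)    # 2022-08-03 09:37:00
```

These values are local to each airport, so they should not be subtracted from one another until a time zone is attached, which is what the *_TZ and *_TIMESTAMP columns in the Itinerary Builder output provide.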


### getValidDestinations function call

Use this function to get a list of locations that can be reached with exactly two flights from the origin city, within a two-day period.

In [6]:
df2 = getValidDestinations('faa', origin, flight_date)
df2

Unnamed: 0,AIRPORT,CITY
0,ABR,"Aberdeen, SD"
1,ALB,"Albany, NY"
2,ABQ,"Albuquerque, NM"
3,ABE,"Allentown/Bethlehem/Easton, PA"
4,APN,"Alpena, MI"
...,...,...
169,HPN,"White Plains, NY"
170,ICT,"Wichita, KS"
171,XWA,"Williston, ND"
172,ILM,"Wilmington, NC"
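
The idea behind this lookup can be sketched as a two-hop search over a list of direct routes. The version below is a simplified illustration, not the module's query: the route list is a toy example rather than real schedule data, the function name is hypothetical, and the two-day window is ignored.

```python
from collections import defaultdict


def two_flight_destinations(flights, origin):
    """Airports reachable from origin through exactly one intermediate stop."""
    by_origin = defaultdict(set)
    for src, dst in flights:
        by_origin[src].add(dst)

    reachable = set()
    for hub in by_origin[origin]:
        reachable |= by_origin[hub]  # everything one more flight past the hub

    reachable.discard(origin)  # a round trip back to the origin is not a destination
    return sorted(reachable)


# Toy route list of (origin, destination) pairs, for illustration only
flights = [
    ("ATL", "SLC"), ("ATL", "MSP"),
    ("SLC", "SEA"), ("SLC", "ATL"),
    ("MSP", "SEA"), ("MSP", "ABR"),
]
print(two_flight_destinations(flights, "ATL"))  # ['ABR', 'SEA']
```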
