# Adding Support for Other Locations
To create a heatmap calendar for a given location and year, the `wunderground` API evidently requires the request to identify the location by its ICAO airport code. This Jupyter Notebook therefore builds a pandas DataFrame that maps locations to their associated codes.

Fortunately, `pd.read_html` parses the tables on an HTML page into a list of DataFrames, which makes table extraction possible (if imperfect):

In [1]:
import pandas as pd

df = pd.read_html('https://en.wikipedia.org/wiki/List_of_airports_in_the_United_States')[2]
df.head()

Unnamed: 0,City,FAA,IATA,ICAO,Airport,Role,Enplanements
0,ALABAMA,,,,,,
1,Birmingham,BHM,BHM,KBHM,Birmingham–Shuttlesworth International Airport,P-S,1304467.0
2,Dothan,DHN,DHN,KDHN,Dothan Regional Airport,P-N,49411.0
3,Huntsville,HSV,HSV,KHSV,Huntsville International Airport (Carl T. Jone...,P-S,527801.0
4,Mobile,MOB,MOB,KMOB,Mobile Regional Airport,P-N,288209.0
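
The hard-coded `[2]` depends on the page layout at the time of scraping. Below is a minimal sketch of a more defensive selection, assuming the airport table is the only one with `City`, `FAA`, `IATA` and `ICAO` columns. `pick_table` is a hypothetical helper, and the two small DataFrames only stand in for the list that `pd.read_html` returns.

```python
import pandas as pd


def pick_table(tables, required=("City", "FAA", "IATA", "ICAO")):
    """Return the first DataFrame whose columns include every name in `required`."""
    for table in tables:
        # Column labels are not always strings, so compare them as text
        columns = {str(col) for col in table.columns}
        if set(required) <= columns:
            return table
    raise ValueError(f"no table has all of the columns {required}")


# Stand-ins for the list of DataFrames returned by pd.read_html (assumption)
tables = [
    pd.DataFrame({0: ["navigation box"]}),
    pd.DataFrame({"City": ["Birmingham"], "FAA": ["BHM"],
                  "IATA": ["BHM"], "ICAO": ["KBHM"]}),
]

airports = pick_table(tables)
print(airports)
```

Selecting by column names rather than by position should keep working if tables are added to or removed from the page, as long as the column headers stay the same.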


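It appears that the states have not been correctly extracted: each state name sits in a row of its own in the 'City' column, with every other column empty. However, filtering for rows that contain a missing value captures the states.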

In [2]:
iterate = df[df.isna().any(axis=1)]
iterate[:5]

Unnamed: 0,City,FAA,IATA,ICAO,Airport,Role,Enplanements
0,ALABAMA,,,,,,
6,ALASKA,,,,,,
7,Anchorage,LHD,,PALH,Lake Hood Seaplane Base (also see Lake Hood Ai...,P-N,23382.0
35,ARIZONA,,,,,,
45,ARKANSAS,,,,,,


The filter isn't perfect, since the Anchorage row has no IATA code and so slips through, but it is tolerable.

In [3]:
states = [state for state in iterate['City'] if state != 'Anchorage']
states[:5]

['ALABAMA', 'ALASKA', 'ARIZONA', 'ARKANSAS', 'CALIFORNIA']
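
The hard-coded exclusion of 'Anchorage' works for this snapshot of the table. A stricter filter might avoid it, on the assumption that a state row has no codes at all while an airport row has at least one. The sketch below uses a toy DataFrame built from rows shown in the outputs above; `toy`, `code_columns` and `toy_states` are illustrative names.

```python
import numpy as np
import pandas as pd

# Toy rows shaped like the scraped table (values taken from the outputs above)
toy = pd.DataFrame({
    "City": ["ALABAMA", "Birmingham", "ALASKA", "Anchorage"],
    "FAA": [np.nan, "BHM", np.nan, "LHD"],
    "IATA": [np.nan, "BHM", np.nan, np.nan],
    "ICAO": [np.nan, "KBHM", np.nan, "PALH"],
})

# Assumption: a state row is missing every code, an airport row at most some
code_columns = ["FAA", "IATA", "ICAO"]
is_state = toy[code_columns].isna().all(axis=1)

toy_states = toy.loc[is_state, "City"].tolist()
print(toy_states)  # ['ALABAMA', 'ALASKA']
```

Using `all` instead of `any` means a row with a single missing code, such as Anchorage here, is no longer mistaken for a state.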

The index of each state row in the 'City' column is captured so that a new 'State' column can be created.

In [4]:
# Index labels of the rows whose 'City' value is a state name
idx_purge = df.index[df['City'].isin(states)].tolist()

In [5]:
idx_purge[:5]

[0, 6, 35, 45, 50]

Verify that a slice starting just after one state row contains no state rows at its beginning or end.

In [6]:
df[1:5]

Unnamed: 0,City,FAA,IATA,ICAO,Airport,Role,Enplanements
1,Birmingham,BHM,BHM,KBHM,Birmingham–Shuttlesworth International Airport,P-S,1304467.0
2,Dothan,DHN,DHN,KDHN,Dothan Regional Airport,P-N,49411.0
3,Huntsville,HSV,HSV,KHSV,Huntsville International Airport (Carl T. Jone...,P-S,527801.0
4,Mobile,MOB,MOB,KMOB,Mobile Regional Airport,P-N,288209.0


Each 'chunk' of the DataFrame between consecutive entries of `idx_purge` belongs to one state. Iterate over the indices in `idx_purge` and collect each chunk in a list.

In [7]:
chunks = []

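for idx in range(len(idx_purge) - 1):
    lower = idx_purge[idx] + 1    # first airport row after the state row
    upper = idx_purge[idx + 1]    # next state row (excluded by the slice)
    chunks.append(df.iloc[lower:upper])

# Capture the last one, which runs to the end of the DataFrame
term = idx_purge[-1] + 1
chunks.append(df.iloc[term:])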
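
As a sanity check on the chunking, every row should be either a state row or inside exactly one chunk. The sketch below shows this on a toy DataFrame, pairing each state position with the next one using `zip` so that the last chunk needs no special case. `toy`, `header_idx` and `toy_chunks` are illustrative names, with `header_idx` playing the role of `idx_purge`.

```python
import pandas as pd

toy = pd.DataFrame(
    {"City": ["ALABAMA", "Birmingham", "Dothan", "ALASKA", "Anchorage"]}
)
header_idx = [0, 3]  # positions of the state rows, as in idx_purge

# Pair each state position with the next one; the last chunk runs to the end
bounds = zip(header_idx, header_idx[1:] + [len(toy)])
toy_chunks = [toy.iloc[start + 1:stop] for start, stop in bounds]

# One chunk per state, and no row lost or duplicated
assert len(toy_chunks) == len(header_idx)
assert sum(len(chunk) for chunk in toy_chunks) + len(header_idx) == len(toy)
```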

Add a 'State' column to each chunk, holding the state that its rows share. After iterating through each chunk, concatenate them all into a single pandas DataFrame.

The assignment triggers a `SettingWithCopyWarning`, with or without the use of `.loc`, because each chunk is a slice of `df`. It is of no concern here, since `df` is not used again.

In [10]:
temp_gatherer = []

for idx in range(len(chunks)):
    temp = chunks[idx]
    temp['State'] = states[idx]
    temp_gatherer.append(temp)

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  temp['State'] = states[idx]
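
One way to avoid the warning altogether is `DataFrame.assign`, which returns a new DataFrame instead of writing into a slice. This is a minimal sketch on toy data, not the cell that produced the output above; `toy`, `toy_chunks`, `toy_states` and `labelled` are illustrative names.

```python
import pandas as pd

toy = pd.DataFrame({"City": ["Birmingham", "Dothan", "Anchorage"]})
toy_chunks = [toy.iloc[0:2], toy.iloc[2:3]]
toy_states = ["ALABAMA", "ALASKA"]

# assign builds a new DataFrame, so no slice of `toy` is modified in place
labelled = [
    chunk.assign(State=state)
    for chunk, state in zip(toy_chunks, toy_states)
]

result = pd.concat(labelled)
print(result)
```

Calling `.copy()` on each chunk before assigning the column should have the same effect, at the cost of an extra line.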


In [14]:
complete_df = pd.concat(temp_gatherer)
complete_df['City'] = complete_df['City'].str.upper()
complete_df.head()

Unnamed: 0,City,FAA,IATA,ICAO,Airport,Role,Enplanements,State
1,BIRMINGHAM,BHM,BHM,KBHM,Birmingham–Shuttlesworth International Airport,P-S,1304467.0,ALABAMA
2,DOTHAN,DHN,DHN,KDHN,Dothan Regional Airport,P-N,49411.0,ALABAMA
3,HUNTSVILLE,HSV,HSV,KHSV,Huntsville International Airport (Carl T. Jone...,P-S,527801.0,ALABAMA
4,MOBILE,MOB,MOB,KMOB,Mobile Regional Airport,P-N,288209.0,ALABAMA
5,MONTGOMERY,MGM,MGM,KMGM,Montgomery Regional Airport (Dannelly Field),P-N,173210.0,ALABAMA
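
The chunk-and-concatenate steps above could also be collapsed into a forward fill: keep the state name on the state rows only, carry it down to the airport rows beneath, then drop the state rows. The sketch below uses toy data and assumes that a state row is one with no ICAO code, which may not hold for every airport in the real table.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "City": ["ALABAMA", "Birmingham", "Dothan", "ALASKA", "Anchorage"],
    "ICAO": [np.nan, "KBHM", "KDHN", np.nan, "PALH"],
})

# Assumption: a state row is one with no ICAO code
is_state = toy["ICAO"].isna()

# Keep the state name on state rows only, then carry it down to the airports
toy["State"] = toy["City"].where(is_state).ffill()

# Drop the state rows and upper-case the city names
flat = toy.loc[~is_state].copy()
flat["City"] = flat["City"].str.upper()
print(flat)
```

This avoids tracking indices and building intermediate lists, and it preserves the original row index in the same way the concatenated result does.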


Save the DataFrame as a `.csv` file.

In [15]:
complete_df.to_csv('code_table.csv')
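
Once saved, the table can be read back to look up the ICAO code for a location. The sketch below uses an in-memory stand-in for `code_table.csv` with two rows from the output above; `icao_for` is a hypothetical helper, not part of pandas or of the `wunderground` API.

```python
import io

import pandas as pd

# Stand-in for code_table.csv (rows copied from the output above)
csv_text = """,City,FAA,IATA,ICAO,Airport,Role,Enplanements,State
1,BIRMINGHAM,BHM,BHM,KBHM,Birmingham–Shuttlesworth International Airport,P-S,1304467.0,ALABAMA
2,DOTHAN,DHN,DHN,KDHN,Dothan Regional Airport,P-N,49411.0,ALABAMA
"""

# The first, unnamed column is the index written by to_csv
codes = pd.read_csv(io.StringIO(csv_text), index_col=0)


def icao_for(city, state, table=codes):
    """Return the ICAO codes listed for a city and state (hypothetical helper)."""
    match = (table["City"] == city.upper()) & (table["State"] == state.upper())
    return table.loc[match, "ICAO"].tolist()


print(icao_for("Dothan", "Alabama"))  # ['KDHN']
```

The helper returns a list because a city may have more than one airport, and an empty list when nothing matches.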