## Basic Pandas Operations

### Q1:
Import airports.csv and use the first column of the dataset, 'iata', as the row labels.
You can pass index_col='name_of_column' to read_csv() to label the rows this way.

In [1]:
# Import libraries
import pandas as pd

# Read dataset from CSV file, with the 'iata' column containing unique IDs
# (therefore, the 'iata' column can be used as the index column)
airports = pd.read_csv('../Dataset/airports.csv', index_col='iata')

# Show first 5 rows, just to make sure everything is OK
airports.head()

iata,airport,city,state,country,lat,long
00M,Thigpen,Bay Springs,MS,USA,31.953765,-89.234505
00R,Livingston Municipal,Livingston,TX,USA,30.685861,-95.017928
00V,Meadow Lake,Colorado Springs,CO,USA,38.945749,-104.569893
01G,Perry-Warsaw,Perry,NY,USA,42.741347,-78.052081
01J,Hilliard Airpark,Hilliard,FL,USA,30.688012,-81.905944
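
The effect of index_col can be seen in a minimal sketch. It assumes a small inline CSV (three rows copied from the output above, read through io.StringIO) as a stand-in for airports.csv, so that it runs without the file; the names csv_text, default_index and labeled are illustrative only.

```python
import io

import pandas as pd

# Three rows copied from the output above, standing in for airports.csv
csv_text = """iata,airport,city,state,country,lat,long
00M,Thigpen,Bay Springs,MS,USA,31.953765,-89.234505
00R,Livingston Municipal,Livingston,TX,USA,30.685861,-95.017928
00V,Meadow Lake,Colorado Springs,CO,USA,38.945749,-104.569893
"""

# Without index_col, pandas adds a default integer index and 'iata' stays a column
default_index = pd.read_csv(io.StringIO(csv_text))

# With index_col, the 'iata' values become the row labels
labeled = pd.read_csv(io.StringIO(csv_text), index_col='iata')

print(default_index.columns.tolist())
print(labeled.index.tolist())
```

With the index set this way, rows can later be selected by their 'iata' code through .loc.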


### Q2:
Extract the first 10 rows and first four columns excluding 'iata' (i.e. 'airport', 'city', 'state' and 'country').

Also extract the last 10 rows and first four columns excluding 'iata'.

In [2]:
# Get first 10 rows and first 4 columns
airports.iloc[:10, :4]

iata,airport,city,state,country
00M,Thigpen,Bay Springs,MS,USA
00R,Livingston Municipal,Livingston,TX,USA
00V,Meadow Lake,Colorado Springs,CO,USA
01G,Perry-Warsaw,Perry,NY,USA
01J,Hilliard Airpark,Hilliard,FL,USA
01M,Tishomingo County,Belmont,MS,USA
02A,Gragg-Wade,Clanton,AL,USA
02C,Capitol,Brookfield,WI,USA
02G,Columbiana County,East Liverpool,OH,USA
03D,Memphis Memorial,Memphis,MO,USA


In [3]:
# Get last 10 rows and first 4 columns
airports.iloc[-10:, :4]

iata,airport,city,state,country
Z55,Lake Louise,Lake Louise,AK,USA
Z73,Nelson Lagoon,Nelson Lagoon,AK,USA
Z84,Clear,Clear A.F.B.,AK,USA
Z91,Birch Creek,Birch Creek,AK,USA
Z95,Cibecue,Cibecue,AZ,USA
ZEF,Elkin Municipal,Elkin,NC,USA
ZER,Schuylkill Cty/Joe Zerbey,Pottsville,PA,USA
ZPH,Zephyrhills Municipal,Zephyrhills,FL,USA
ZUN,Black Rock,Zuni,NM,USA
ZZV,Zanesville Municipal,Zanesville,OH,USA
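
As a side note, head() and tail() select the same rows as the positional slices used above. The sketch below assumes a five-row stand-in DataFrame (values copied from the output above; the name sample is illustrative) rather than the full dataset.

```python
import pandas as pd

# Five rows copied from the output above, standing in for the full dataset
sample = pd.DataFrame(
    {
        'airport': ['Thigpen', 'Livingston Municipal', 'Meadow Lake',
                    'Perry-Warsaw', 'Hilliard Airpark'],
        'city': ['Bay Springs', 'Livingston', 'Colorado Springs',
                 'Perry', 'Hilliard'],
    },
    index=pd.Index(['00M', '00R', '00V', '01G', '01J'], name='iata'),
)

first_two = sample.head(2)  # same rows as sample.iloc[:2]
last_two = sample.tail(2)   # same rows as sample.iloc[-2:]

print(first_two.equals(sample.iloc[:2]))
print(last_two.equals(sample.iloc[-2:]))
```

The iloc form is still needed when columns must be limited by position in the same step, as in the cells above.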


### Q3:

Print summary statistics for the DataFrame.

In [4]:
# Show statistics of all numeric variables
airports.describe()

,lat,long
count,3376.0,3376.0
mean,40.036524,-98.621205
std,8.329559,22.869458
min,7.367222,-176.646031
25%,34.688427,-108.761121
50%,39.434449,-93.599425
75%,43.372612,-84.137519
max,71.285448,145.621384
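
By default, describe() summarises only the numeric columns, which is why only 'lat' and 'long' appear above. The sketch below, on a small stand-in DataFrame (four rows copied from the earlier output; sample is an illustrative name), shows how include='all' also brings in text columns such as 'state'.

```python
import pandas as pd

# Four rows copied from the earlier output, standing in for the full dataset
sample = pd.DataFrame(
    {
        'state': ['MS', 'TX', 'CO', 'NY'],
        'lat': [31.953765, 30.685861, 38.945749, 42.741347],
        'long': [-89.234505, -95.017928, -104.569893, -78.052081],
    },
    index=pd.Index(['00M', '00R', '00V', '01G'], name='iata'),
)

# Default: numeric columns only
numeric_summary = sample.describe()

# include='all': text columns get count/unique/top/freq, numeric ones the usual statistics
full_summary = sample.describe(include='all')

print(numeric_summary.columns.tolist())
print(full_summary.columns.tolist())
```

Cells that do not apply to a column's type (for example the mean of 'state') are filled with NaN in the combined summary.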


### Q4:

Use the row labels ('iata') to get only the rows with the labels '2P2', 'PAO', 'PIT' and 'PNN'

In [5]:
# Get the rows with labels '2P2', 'PAO', 'PIT' and 'PNN'.
# There are two sets of square brackets:
# the inner pair builds a Python list of row labels,
# and the outer pair is the .loc indexer, which takes the row selection
# (and, optionally, a column selection after a comma).
airports.loc[['2P2', 'PAO', 'PIT', 'PNN']]

iata,airport,city,state,country,lat,long
2P2,Washington Island,Washington Island,WI,USA,45.386208,-86.924481
PAO,Palo Alto Arpt of Santa Clara Co,Palo Alto,CA,USA,37.461119,-122.115044
PIT,Pittsburgh International,Pittsburgh,PA,USA,40.491466,-80.232871
PNN,Princeton Municipal,Princeton,ME,USA,45.200667,-67.564389
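
The difference between passing a single label and a list of labels to .loc can be sketched as follows. The three-row DataFrame is a stand-in built from the output above, and the variable names are illustrative.

```python
import pandas as pd

# Three rows copied from the output above, standing in for the full dataset
sample = pd.DataFrame(
    {
        'airport': ['Washington Island', 'Palo Alto Arpt of Santa Clara Co',
                    'Pittsburgh International'],
        'state': ['WI', 'CA', 'PA'],
    },
    index=pd.Index(['2P2', 'PAO', 'PIT'], name='iata'),
)

one_row = sample.loc['PAO']            # single label: returns a Series
one_row_frame = sample.loc[['PAO']]    # list with one label: returns a DataFrame
two_rows = sample.loc[['PIT', '2P2']]  # rows come back in the order requested

print(type(one_row).__name__)
print(type(one_row_frame).__name__)
print(two_rows.index.tolist())
```

Using a list therefore keeps the result a DataFrame even when only one row is requested.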


### Q5:

Now use integer row positions to display the rows at positions 100 to 104 (both inclusive).

In [6]:
# Get the rows at positions 100 to 104 (positions are counted from 0).
# The slice ends at 105, not 104, because iloc excludes the end position.
airports.iloc[100:105]

iata,airport,city,state,country,lat,long
11R,Brenham Municipal,Brenham,TX,USA,30.219,-96.374278
12C,Rochelle Municipal,Rochelle,IL,USA,41.893001,-89.07829
12D,Tower Municipal,Tower,MN,USA,47.818333,-92.291667
12J,Brewton Municipal,Brewton,AL,USA,31.051263,-87.067968
12K,Superior Municipal,Superior,NE,USA,40.046361,-98.060111
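
The end-exclusive behaviour of iloc contrasts with label slices in .loc, which include the end label. A minimal sketch, assuming a five-row stand-in DataFrame built from the rows above (sample is an illustrative name):

```python
import pandas as pd

# The five rows shown above, reduced to one column for brevity
sample = pd.DataFrame(
    {'state': ['TX', 'IL', 'MN', 'AL', 'NE']},
    index=pd.Index(['11R', '12C', '12D', '12J', '12K'], name='iata'),
)

by_position = sample.iloc[1:3]      # positions 1 and 2; position 3 is excluded
by_label = sample.loc['12C':'12J']  # labels '12C' through '12J'; end label is included

print(by_position.index.tolist())
print(by_label.index.tolist())
```

This is worth keeping in mind when switching between the two indexers, since the same-looking slice can return a different number of rows.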


### Q6:

Display the first 4 columns of the rows with the labels '2P2', 'PAO', 'PIT' and 'PNN'.



In [7]:
# First, get rows with labels '2P2', 'PAO', 'PIT' and 'PNN',
# then, get only the first 4 columns.
airports.loc[['2P2', 'PAO', 'PIT', 'PNN']].iloc[:,:4]

iata,airport,city,state,country
2P2,Washington Island,Washington Island,WI,USA
PAO,Palo Alto Arpt of Santa Clara Co,Palo Alto,CA,USA
PIT,Pittsburgh International,Pittsburgh,PA,USA
PNN,Princeton Municipal,Princeton,ME,USA
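
The chained .loc and .iloc calls can also be written as one .loc call that names the columns by label. This is an alternative sketch, not the notebook's approach; it assumes a two-row stand-in DataFrame with values copied from the earlier output.

```python
import pandas as pd

# Two rows copied from the earlier output, standing in for the full dataset
sample = pd.DataFrame(
    {
        'airport': ['Washington Island', 'Pittsburgh International'],
        'city': ['Washington Island', 'Pittsburgh'],
        'state': ['WI', 'PA'],
        'country': ['USA', 'USA'],
        'lat': [45.386208, 40.491466],
        'long': [-86.924481, -80.232871],
    },
    index=pd.Index(['2P2', 'PIT'], name='iata'),
)

rows = ['2P2', 'PIT']

# Two steps: select rows by label, then columns by position
chained = sample.loc[rows].iloc[:, :4]

# One step: select rows by label and columns by a label slice (end label included)
single_call = sample.loc[rows, 'airport':'country']

print(chained.equals(single_call))
```

The single call relies on the column order being known, since the label slice takes every column from 'airport' through 'country'.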


### Q7:

Display the airports (print all the columns) in the state 'CO'

In [8]:
# First, get only the 'state' column,
# then, check if it is equal to "CO",
# then, use the above result with boolean indexing to show only the airports where 'state' column is "CO"
airports[airports['state'] == "CO"]

iata,airport,city,state,country,lat,long
00V,Meadow Lake,Colorado Springs,CO,USA,38.945749,-104.569893
0V2,Harriet Alexander,Salida,CO,USA,38.539164,-106.045848
1V5,Boulder Muni,Boulder,CO,USA,40.03943,-105.225822
1V6,Fremont County,Canon City,CO,USA,38.428381,-105.105499
1V9,Blake,Delta,CO,USA,38.785397,-108.063661
20V,McElroy Airfield,Kremmling,CO,USA,40.05368,-106.368947
2V1,Stevens,Pagosa Springs,CO,USA,37.277505,-107.055874
2V2,Vance Brand,Longmont,CO,USA,40.163671,-105.163037
2V5,Wray Municipal,Wray,CO,USA,40.100323,-102.24096
2V6,Yuma Municipal,Yuma,CO,USA,40.104153,-102.712987
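
The same boolean-indexing pattern extends to several states at once with Series.isin(). A minimal sketch on a four-row stand-in DataFrame (values copied from outputs in this notebook; the names are illustrative):

```python
import pandas as pd

# Four rows copied from outputs in this notebook, standing in for the full dataset
sample = pd.DataFrame(
    {
        'city': ['Bay Springs', 'Colorado Springs', 'Adak', 'Salida'],
        'state': ['MS', 'CO', 'AK', 'CO'],
    },
    index=pd.Index(['00M', '00V', 'ADK', '0V2'], name='iata'),
)

# Boolean mask for one state, as in the cell above
one_state = sample[sample['state'] == 'CO']

# Boolean mask for any of several states
several_states = sample[sample['state'].isin(['CO', 'AK'])]

print(one_state.index.tolist())
print(several_states.index.tolist())
```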


### Q8:

Display the airports with lat > 48 and long < -170

In [9]:
# First, for each row in the dataset, check if 'lat' is greater than 48.
# Then, for each row in the dataset, check if 'long' is less than -170.
# Then, combine the two results with & (element-wise AND);
# each condition needs its own parentheses because & binds more tightly than > and <.
# Finally, use the combined result with boolean indexing to get only the airports that meet both conditions.
airports[(airports['lat'] > 48) & (airports['long'] < -170)]

iata,airport,city,state,country,lat,long
ADK,Adak,Adak,AK,USA,51.877964,-176.646031
AKA,Atka,Atka,AK,USA,52.220348,-174.20635
GAM,Gambell,Gambell,AK,USA,63.766766,-171.732824
SNP,St. Paul,St. Paul,AK,USA,57.167333,-170.220444
SVA,Savoonga,Savoonga,AK,USA,63.686394,-170.492636
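
The combined condition can also be expressed with DataFrame.query(), which takes the filter as a string. This is an alternative sketch, not the notebook's approach; it assumes a four-row stand-in DataFrame with coordinates copied from outputs in this notebook.

```python
import pandas as pd

# Four rows copied from outputs in this notebook, standing in for the full dataset
sample = pd.DataFrame(
    {
        'state': ['AK', 'AK', 'AK', 'MS'],
        'lat': [51.877964, 63.766766, 71.285448, 31.953765],
        'long': [-176.646031, -171.732824, -156.766002, -89.234505],
    },
    index=pd.Index(['ADK', 'GAM', 'BRW', '00M'], name='iata'),
)

# Boolean masks combined with &, each condition in its own parentheses
masked = sample[(sample['lat'] > 48) & (sample['long'] < -170)]

# The same filter written as a query string
queried = sample.query('lat > 48 and long < -170')

print(masked.index.tolist())
print(masked.equals(queried))
```

Note that Python's `and` keyword cannot be used between two Series outside of query(); the element-wise operator `&` is required there.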


### Q9:

Sort the DataFrame by its index and display the first 10 rows. You can use sort_index() for this.

In [10]:
# Sort DataFrame by index, then show first 10 rows
airports.sort_index().head(10)

iata,airport,city,state,country,lat,long
00M,Thigpen,Bay Springs,MS,USA,31.953765,-89.234505
00R,Livingston Municipal,Livingston,TX,USA,30.685861,-95.017928
00V,Meadow Lake,Colorado Springs,CO,USA,38.945749,-104.569893
01G,Perry-Warsaw,Perry,NY,USA,42.741347,-78.052081
01J,Hilliard Airpark,Hilliard,FL,USA,30.688012,-81.905944
01M,Tishomingo County,Belmont,MS,USA,34.491667,-88.201111
02A,Gragg-Wade,Clanton,AL,USA,32.850487,-86.611453
02C,Capitol,Brookfield,WI,USA,43.08751,-88.177869
02G,Columbiana County,East Liverpool,OH,USA,40.673313,-80.641406
03D,Memphis Memorial,Memphis,MO,USA,40.447259,-92.226961
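
Because the dataset already appears to be ordered by 'iata', the effect of sort_index() is easier to see on deliberately unsorted data. A minimal sketch, assuming a three-row stand-in DataFrame (rows copied from outputs in this notebook, placed out of order on purpose):

```python
import pandas as pd

# Three rows from this notebook, deliberately out of order
sample = pd.DataFrame(
    {'state': ['OH', 'MS', 'NM']},
    index=pd.Index(['ZZV', '00M', 'ZUN'], name='iata'),
)

ascending = sample.sort_index()                  # returns a new, sorted DataFrame
descending = sample.sort_index(ascending=False)  # reverse label order

print(ascending.index.tolist())
print(descending.index.tolist())
print(sample.index.tolist())  # the original is left unchanged
```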


### Q10:

Now sort the DataFrame in descending order of latitude and print the first 10 rows. You may use sort_values(column_name, ascending=False) for this.

In [11]:
# Sort DataFrame by 'lat' in descending order, then show first 10 rows
airports.sort_values('lat', ascending=False).head(10)

iata,airport,city,state,country,lat,long
BRW,Wiley Post Will Rogers Memorial,Barrow,AK,USA,71.285448,-156.766002
AWI,Wainwright,Wainwright,AK,USA,70.638,-159.99475
ATK,Atqasuk,Atqasuk,AK,USA,70.467276,-157.435736
AQT,Nuiqsut,Nuiqsut,AK,USA,70.209953,-151.005561
SCC,Deadhorse,Deadhorse,AK,USA,70.194756,-148.465161
BTI,Barter Island,Kaktovik,AK,USA,70.133903,-143.577044
PIZ,Point Lay Dew Station,Point Lay,AK,USA,69.732875,-163.005342
GBH,Galbraith Lake,Galbraith Lake,AK,USA,68.479063,-149.490021
PHO,Point Hope,Point Hope,AK,USA,68.348774,-166.799309
AKP,Anaktuvuk Pass,Anaktuvuk Pass,AK,USA,68.134322,-151.74168
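
When only the top few rows are needed, DataFrame.nlargest() is a possible shorthand for sorting and then taking the head. The sketch below is an alternative, not the notebook's approach; it assumes a four-row stand-in DataFrame with latitudes copied from outputs in this notebook.

```python
import pandas as pd

# Four rows copied from outputs in this notebook, standing in for the full dataset
sample = pd.DataFrame(
    {
        'city': ['Barrow', 'Wainwright', 'Bay Springs', 'Atqasuk'],
        'lat': [71.285448, 70.638, 31.953765, 70.467276],
    },
    index=pd.Index(['BRW', 'AWI', '00M', 'ATK'], name='iata'),
)

# Sort by 'lat' in descending order, then keep the first 2 rows
sorted_top = sample.sort_values('lat', ascending=False).head(2)

# Ask directly for the 2 rows with the largest 'lat'
largest_top = sample.nlargest(2, 'lat')

print(sorted_top.index.tolist())
print(sorted_top.equals(largest_top))
```

The two forms agree here because the latitudes are all distinct; with tied values the order of the tied rows may differ.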
