# Assignment 5
## Author: Ranil Rai
## Introduction

In this assignment, we use Python and pandas to explore and validate data in order to answer three questions about airports and weather conditions in the United States: which airport is the northernmost, which is the easternmost, and which New York area airport experienced the windiest weather on February 12th, 2013. Answering them takes both careful data manipulation and validation of the findings against trusted external sources.



## Fetching Data
The code below fetches CSV data directly from URLs using the `requests` library. The airport and weather data for this assignment live in two separate CSV files hosted on GitHub, and `requests.get` retrieves the content of each file as raw bytes.

To make this content readable by pandas, the bytes are decoded as UTF-8 and wrapped in an `io.StringIO` buffer, which `pd.read_csv` then parses into a DataFrame. This loads online CSV data straight into pandas without downloading files manually. The `airports` DataFrame contains information about various airports, while the `weather` DataFrame holds the weather observations used in this study.

The next section displays the first few rows of each DataFrame with `.head()`, giving a quick look at the structure and types of the available data.

In [15]:
import requests
import pandas as pd
import io

# URLs for the CSV data
airports_url = 'https://raw.githubusercontent.com/tidyverse/nycflights13/main/data-raw/airports.csv'
weather_url = 'https://raw.githubusercontent.com/tidyverse/nycflights13/main/data-raw/weather.csv'

# Fetching the data using requests
airports_data = requests.get(airports_url).content
weather_data = requests.get(weather_url).content

# Reading the data into pandas DataFrames
airports = pd.read_csv(io.StringIO(airports_data.decode('utf-8')))
weather = pd.read_csv(io.StringIO(weather_data.decode('utf-8')))
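
The decode-and-wrap step used above can be tried offline. The sketch below is only an illustration: `raw_bytes` is a made-up stand-in for what `requests.get(...).content` returns, and the two airport rows are invented rather than taken from the real file.

```python
import io
import pandas as pd

# Hypothetical stand-in for the bytes returned by requests.get(...).content
raw_bytes = (
    b"faa,name,lat,lon\n"
    b"AAA,Example Field,40.5,-75.25\n"
    b"BBB,Sample Airport,35.0,-90.5\n"
)

# Decode the bytes to text, wrap the text in a file-like buffer,
# and let pandas parse that buffer as CSV
sample = pd.read_csv(io.StringIO(raw_bytes.decode("utf-8")))

print(sample.dtypes)
print(sample)
```

pandas infers numeric types for `lat` and `lon` here, which is what makes the later `idxmax()` comparisons on coordinates possible.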




## Exploring the Airports and Weather DataFrames

In [21]:
# Display the first few rows of the airports DataFrame
print("Airports DataFrame:")
print(airports.head())

# Display the first few rows of the weather DataFrame
print("\nWeather DataFrame:")
print(weather.head())


Airports DataFrame:
   faa                           name        lat        lon   alt  tz dst  \
0  04G              Lansdowne Airport  41.130472 -80.619583  1044  -5   A   
1  06A  Moton Field Municipal Airport  32.460572 -85.680028   264  -6   A   
2  06C            Schaumburg Regional  41.989341 -88.101243   801  -6   A   
3  06N                Randall Airport  41.431912 -74.391561   523  -5   A   
4  09J          Jekyll Island Airport  31.074472 -81.427778    11  -5   A   

              tzone  
0  America/New_York  
1   America/Chicago  
2   America/Chicago  
3  America/New_York  
4  America/New_York  

Weather DataFrame:
  origin  year  month  day  hour   temp   dewp  humid  wind_dir  wind_speed  \
0    EWR  2013      1    1     1  39.02  26.06  59.37     270.0    10.35702   
1    EWR  2013      1    1     2  39.02  26.96  61.63     250.0     8.05546   
2    EWR  2013      1    1     3  39.02  28.04  64.43     240.0    11.50780   
3    EWR  2013      1    1     4  39.92  28.04  6

### Overview of the Initial DataFrames

The Airports DataFrame provides essential details about airports, including their FAA codes, names, and geographical coordinates (latitude and longitude), crucial for identifying the northernmost and easternmost airports in the U.S.

The Weather DataFrame offers detailed meteorological data for specific locations and times, including temperature, humidity, wind speed, and direction, vital for analyzing weather conditions such as the windiest day at New York area airports.
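Beyond `.head()`, a few quick checks can reveal missing or implausible values before any question is answered. The following is a minimal sketch on a small invented frame named `demo`, which only imitates the shape of the weather table. The values are not from the real dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical rows shaped like the weather table; all values are made up
demo = pd.DataFrame({
    "origin": ["EWR", "EWR", "JFK", "LGA"],
    "wind_speed": [10.4, np.nan, 12.7, 9.2],
    "wind_gust": [np.nan, np.nan, 20.1, np.nan],
})

# Count missing values per column
missing_counts = demo.isna().sum()

# Summary statistics make implausible minima and maxima easy to spot
summary = demo["wind_speed"].describe()

print(missing_counts)
print(summary[["min", "max"]])
```

Running the same two calls on the real `airports` and `weather` DataFrames would be one way to spot suspicious coordinates or wind speeds early.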

## Northernmost Airport

In [25]:
# Filtering airports with latitudes that are plausible for the U.S. to exclude any erroneous entries
# The continental U.S. extends up to about 49.38° N (Northern border with Canada), but Alaska extends further north.
# We'll use a latitude filter that includes all possible U.S. territories, including Alaska, but excludes clearly erroneous latitudes.
realistic_us_airports = airports[(airports['lat'] > 0) & (airports['lat'] <= 72)]

# Identify the northernmost airport among the rows that pass the latitude filter
northernmost_airport = realistic_us_airports.loc[realistic_us_airports['lat'].idxmax()]

northernmost_airport





faa                             BRW
name     Wiley Post Will Rogers Mem
lat                       71.285446
lon                     -156.766003
alt                              44
tz                               -9
dst                               A
tzone             America/Anchorage
Name: 230, dtype: object

### Northernmost Airport in the U.S.

The **Wiley Post-Will Rogers Memorial Airport (BRW)** in Utqiaġvik (Barrow), Alaska, is identified as the northernmost airport in the United States, based on the dataset analysis. This result is validated by its latitude of 71.285446 and external verification through a [Wikipedia article](https://en.wikipedia.org/wiki/Wiley_Post%E2%80%93Will_Rogers_Memorial_Airport), confirming BRW's status as the northernmost U.S. airport. This finding highlights the importance of cross-referencing data analysis with reliable external sources to ensure accuracy.
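When a filter like the latitude bounds above is applied, it can help to look at what was dropped, so that the exclusion is a visible decision rather than a silent one. The sketch below assumes the same bounds as the filter above. The frame `demo_airports` and its codes (`AAA`, `BBB`, `BAD`) are invented for illustration.

```python
import pandas as pd

# Hypothetical rows; the 'BAD' entry imitates a coordinate outside the expected range
demo_airports = pd.DataFrame({
    "faa": ["AAA", "BBB", "BAD"],
    "lat": [40.6, 71.3, 85.0],
    "lon": [-73.8, -156.8, 40.0],
})

# Same bounds as the filter above: keep latitudes in (0, 72]
in_range = (demo_airports["lat"] > 0) & (demo_airports["lat"] <= 72)

kept = demo_airports[in_range]
dropped = demo_airports[~in_range]

print("Dropped rows:")
print(dropped)
print("Northernmost of the rest:", kept.loc[kept["lat"].idxmax(), "faa"])
```

Inspecting `dropped` on the real data would show which entries the latitude bound removes and whether each one is an error.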


## Easternmost Airport

In [32]:
# Keep only Western Hemisphere longitudes (lon < 0) so that entries with positive longitudes do not count as easternmost
continental_us_airports = airports[airports['lon'] < 0]  # Exclude Eastern Hemisphere
easternmost_airport_continental = continental_us_airports.loc[continental_us_airports['lon'].idxmax()]

easternmost_airport_continental




faa                             EPM
name     Eastport Municipal Airport
lat                       44.910111
lon                      -67.012694
alt                              45
tz                               -5
dst                               A
tzone              America/New_York
Name: 444, dtype: object

### Easternmost Airport in the U.S.

After restricting the data to Western Hemisphere longitudes, the analysis identifies the **Eastport Municipal Airport (EPM)** in Eastport, Maine, as the easternmost airport. With a longitude of -67.012694, it is the easternmost airport in the filtered dataset and sits at the eastern edge of the continental U.S. This finding aligns with the assignment's expectation and shows why geographical context matters in data analysis: the raw maximum longitude alone would not have given this answer.
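The `lon < 0` filter removes Eastern Hemisphere entries but does not restrict the data to the contiguous states. One alternative is a bounding box, sketched below. The box limits are rough approximations chosen for illustration. The codes `ISL` and `ALU` are invented, and only the EPM coordinates come from the output above.

```python
import pandas as pd

# Rough bounding box for the contiguous U.S. (approximate values, an assumption)
LAT_MIN, LAT_MAX = 24.0, 50.0
LON_MIN, LON_MAX = -125.0, -66.0

demo_airports = pd.DataFrame({
    "faa": ["EPM", "ISL", "ALU"],  # ISL and ALU are made-up codes
    "lat": [44.910111, 18.3, 52.7],
    "lon": [-67.012694, -64.9, 174.1],
})

# Series.between is inclusive on both ends by default
inside_box = (
    demo_airports["lat"].between(LAT_MIN, LAT_MAX)
    & demo_airports["lon"].between(LON_MIN, LON_MAX)
)
contiguous = demo_airports[inside_box]

easternmost = contiguous.loc[contiguous["lon"].idxmax()]
print(easternmost["faa"], easternmost["lon"])
```

In this toy frame, the hypothetical `ISL` row has a negative longitude east of EPM, so `lon < 0` alone would pick it. The latitude bound is what excludes it.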


In [33]:
# For the windiest weather on February 12, 2013
ny_weather_on_date = weather[
    (weather['year'] == 2013) &
    (weather['month'] == 2) &
    (weather['day'] == 12) &
    (weather['origin'].isin(['EWR', 'JFK', 'LGA']))
]
windiest_weather = ny_weather_on_date.sort_values(by='wind_speed', ascending=False).iloc[0]
windiest_weather


origin                         EWR
year                          2013
month                            2
day                             12
hour                             3
temp                         39.02
dewp                         26.96
humid                        61.63
wind_dir                     260.0
wind_speed              1048.36058
wind_gust                      NaN
precip                         0.0
pressure                    1008.3
visib                         10.0
time_hour     2013-02-12T08:00:00Z
Name: 1009, dtype: object

### Windiest New York Area Airport on February 12th, 2013

On February 12th, 2013, **Newark Liberty International Airport (EWR)** recorded the highest wind speed among the New York area airports. The code above sorts individual hourly observations, and its top row is an EWR reading of **1048.36058** at hour 3, a value that is physically implausible and most likely a recording error. The average wind speed of **56.38822** reported for EWR that day is not computed in the cell shown and would be heavily inflated if it includes that reading. EWR's ranking as the windiest airport should therefore be rechecked with the outlier excluded before drawing conclusions about local weather conditions or their impact on airport operations.
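A per-airport daily average can be computed with `groupby`, and comparing it with and without extreme readings shows how much one value can sway the ranking. The sketch below uses an invented frame, `demo_weather`. The cap `MAX_PLAUSIBLE_MPH` is an assumed cutoff, and the units are assumed to be miles per hour. The toy result says nothing about which airport was actually windiest.

```python
import pandas as pd

# Hypothetical hourly readings for one day; 1048.36 imitates an implausible value
demo_weather = pd.DataFrame({
    "origin": ["EWR", "EWR", "EWR", "JFK", "JFK", "JFK", "LGA", "LGA", "LGA"],
    "wind_speed": [10.0, 1048.36, 12.0, 14.0, 16.0, 15.0, 13.0, 12.0, 14.0],
})

# Assumed plausibility cap in mph; the right cutoff is a judgment call
MAX_PLAUSIBLE_MPH = 200

# Mean wind speed per airport using every reading
raw_means = demo_weather.groupby("origin")["wind_speed"].mean()

# Mean wind speed per airport after dropping readings above the cap
plausible = demo_weather[demo_weather["wind_speed"] <= MAX_PLAUSIBLE_MPH]
clean_means = plausible.groupby("origin")["wind_speed"].mean()

print("Windiest with outlier:   ", raw_means.idxmax())
print("Windiest without outlier:", clean_means.idxmax())
```

In this toy data the ranking flips once the extreme value is removed. That is why the same comparison is worth running on the real `ny_weather_on_date` frame.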


## Conclusion

Through data analysis and external validation, we identified Wiley Post-Will Rogers Memorial Airport (BRW) in Utqiaġvik, Alaska, as the northernmost airport in the U.S., and Eastport Municipal Airport (EPM) in Eastport, Maine, as the easternmost. For February 12th, 2013, Newark Liberty International Airport (EWR) came out as the windiest of the New York area airports, although that result rests on a 1048.36058 wind speed reading that looks like a data error. The exercise built our skills in Python and pandas and underscored the importance of cross-referencing data with external information to ensure accuracy. The findings illustrate how data analysis can surface facts about the geography and weather of U.S. airports.
