# INFO 3402 – Week 01: Weekly Assignment

[Brian C. Keegan, Ph.D.](http://brianckeegan.com/)  
[Assistant Professor, Department of Information Science](https://www.colorado.edu/cmci/people/information-science/brian-c-keegan)  
University of Colorado Boulder  

Copyright and distributed under an [MIT License](https://opensource.org/licenses/MIT)

## Question 1: Import libraries (2 points)

Import pandas and numpy.

## Question 2: Load the Boulder and Broomfield data (4 points)

Use pandas's `read_csv` function to read in the "boulder_weather.csv" and "broomfield_weather.csv" files and assign to `boulder_df` and `broomfield_df` (respectively).

* DATE - Given in mm/dd/yyyy format
* TEMP - Mean temperature for the day in degrees Fahrenheit to tenths.
* MIN - Minimum temperature reported during the day in Fahrenheit to tenths.
* MAX - Maximum temperature reported during the day in Fahrenheit to tenths.
* DEWP - Mean dew point for the day in degrees Fahrenheit to tenths.
* STP - Mean station pressure for the day in millibars to tenths.
* VISIB - Mean visibility for the day in miles to tenths.
* WDSP - Mean wind speed for the day in knots to tenths.
* MXSPD - Maximum sustained wind speed reported for the day in knots to tenths.
* GUST - Maximum wind gust reported for the day in knots to tenths. 
* PRCP - Total precipitation (rain and/or melted snow) reported during the day in inches.

## Question 3: Display the last 5 rows of data (4 points)
Show the last five rows of data for `boulder_df` and `broomfield_df`.

## Question 4: Windiest day (4 points)

For each station, use the WDSP variable to identify the windiest day. Print a statement like "2015-01-01 was the windiest day for Boulder."

## Question 5: Biggest temperature swing (4 points)

For each station, what date had the largest range of temperatures between the daily minimum and maximum?

## Question 6: Convert the wind values from knots to MPH (4 points)

The WDSP, MXSPD, and GUST columns are reported in [knots](https://en.wikipedia.org/wiki/Knot_(unit)). Convert these values miles per hour. What is the maximum wind speed (MXSPD) in MPH and corresponding date for each station?

## Question 7: Identify dates with max temperatures above 100 (4 points)

For each station, identify the dates with max temperatures above 100.

## Question 8: What was the gust speed at each station on December 30, 2021? (4 points)
For each station, identify the gust speed (in MPH) for December 30, 2021.

## Extra Credit: Identify the date with the most dissimilar minimum temperature (4 points)

Identify the date where the minimum temperature between the two stations is the greatest.

## Appendix

You don't need to run any of the code here to complete the assignment. I'm sharing how I retrieved the data if you're curious.

Downloading data from [NOAA's Global Summary of the Day](https://www.ncei.noaa.gov/metadata/geoportal/rest/metadata/item/gov.noaa.ncdc%3AC00516/html#) [archives](https://www.ncei.noaa.gov/data/global-summary-of-the-day/access/) and [documentation](https://www.ncei.noaa.gov/data/global-summary-of-the-day/doc/readme.txt). Find stations from [this map](https://www.ncei.noaa.gov/maps/daily/?layers=0001).

* The Rocky Mountain Metro airport in Broomfield is station 72469903065.
* Boulder Municipal Airport in east Boulder is station 72053300160.

In [37]:
# Empty containers to hold data from the loop
boulder_daily_weather = {}
broomfield_daily_weather = {}

# A string containing (1) the URL of where each station's data lives and (2) a formatting character for the year
boulder_s = 'https://www.ncei.noaa.gov/data/global-summary-of-the-day/access/{0}/72469903065.csv'
broomfield_s = 'https://www.ncei.noaa.gov/data/global-summary-of-the-day/access/{0}/72053300160.csv'

# Loop through years in the range from 2010 through 2022
for year in range(2010,2023):
    boulder_daily_weather[year] = pd.read_csv(boulder_s.format(year), na_values=[99.99,999.9,9999.9])
    broomfield_daily_weather[year] = pd.read_csv(broomfield_s.format(year), na_values=[99.99,999.9,9999.9])

In [45]:
boulder_weather_df = pd.concat(boulder_daily_weather.values())
broomfield_weather_df = pd.concat(broomfield_daily_weather.values())

_cols = ['DATE','TEMP','MIN','MAX','DEWP','STP','VISIB','WDSP','MXSPD','GUST','PRCP']

boulder_weather_df = boulder_weather_df[_cols]
broomfield_weather_df = broomfield_weather_df[_cols]

boulder_weather_df.to_csv('boulder_weather.csv',sep='|',index=False)
broomfield_weather_df.to_csv('broomfield_weather.csv',sep='|',index=False)

In [46]:
broomfield_weather_df.tail(20)

Unnamed: 0,DATE,TEMP,MIN,MAX,DEWP,STP,VISIB,WDSP,MXSPD,GUST,PRCP
348,2021-12-17,35.2,28.4,42.8,9.8,832.0,10.0,6.3,17.1,25.1,0.0
349,2021-12-18,24.1,12.2,37.4,9.3,841.2,10.0,2.7,13.0,22.0,0.0
350,2021-12-19,36.2,21.2,60.8,7.4,836.7,10.0,2.7,8.0,14.0,0.0
351,2021-12-20,49.0,35.6,60.8,5.9,836.9,10.0,5.2,14.0,20.0,0.0
352,2021-12-21,43.6,28.4,62.6,10.8,835.5,10.0,4.1,18.1,27.0,0.0
353,2021-12-22,49.2,35.6,62.6,0.8,833.5,10.0,4.1,15.9,24.1,0.0
354,2021-12-23,54.8,44.6,60.8,16.7,829.9,10.0,10.2,22.0,34.0,0.0
355,2021-12-24,46.9,39.2,53.6,32.7,822.1,9.9,4.6,14.0,22.9,
356,2021-12-25,44.2,35.6,53.6,20.1,826.4,10.0,8.0,25.1,35.0,0.0
357,2021-12-26,37.1,24.8,50.0,13.9,824.5,10.0,3.9,22.9,35.9,0.0
