For each 24-hour day (midnight to midnight) we'd like to know the following:
* Temperature and barometric pressure values at sunrise and sunset
* Difference in temperature and in pressure between sunrise and sunset
* Maximum, minimum and average temperature and pressure for each of two time frames: sunset to sunrise and sunrise to sunset
* Differences between the maximum, minimum and average values of these two time frames


# Load libraries

In [1]:
# pandas for data structures
import pandas as pd

# Load data

#### Weather Data

* data collected from [Wunderground](https://www.wunderground.com/weather/api/)
* hosted file: [Google Drive](https://drive.google.com/file/d/1eS0gGM14g7iFulUeqz3XwbKb5OtK9aSI/view)

In [58]:
# local file
filename_wunderground = '../data/wunderground-170701_171101-day_night.csv'

In [62]:
# load data into a DataFrame, parsing utc_date as datetime
wund = pd.read_csv(filename_wunderground, parse_dates=['utc_date'])

In [68]:
wund['utc_date'] = wund['utc_date'].dt.tz_localize('utc')

In [71]:
# convert the UTC timestamps to US/Mountain in a new local_date column
wund['local_date'] = wund['utc_date'].dt.tz_convert('US/Mountain')

In [76]:
wund = wund[['local_date', 'utc_date', 'station_id', 'pressurei', 'pressurem', 'tempi', 'tempm']]

In [77]:
wund.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 727764 entries, 0 to 727763
Data columns (total 7 columns):
local_date    727764 non-null datetime64[ns, US/Mountain]
utc_date      727764 non-null datetime64[ns, UTC]
station_id    727764 non-null object
pressurei     727764 non-null float64
pressurem     727764 non-null float64
tempi         727764 non-null float64
tempm         727764 non-null float64
dtypes: datetime64[ns, US/Mountain](1), datetime64[ns, UTC](1), float64(4), object(1)
memory usage: 38.9+ MB


In [119]:
wund.head(2)

| | local_date | utc_date | station_id | pressurei | pressurem | tempi | tempm |
|---|---|---|---|---|---|---|---|
| 0 | 2017-07-01 00:12:00-06:00 | 2017-07-01 06:12:00+00:00 | KMTCORVA9 | 26.0 | 880.4 | 58.8 | 14.9 |
| 1 | 2017-07-01 00:28:00-06:00 | 2017-07-01 06:28:00+00:00 | KMTCORVA9 | 26.0 | 880.4 | 59.0 | 15.0 |


#### Sunset Sunrise Data

In [108]:
# Load Sunset Sunrise data
sun_filename = '../data/sunrise_sunset-wunderground-utc.csv'
sun = pd.read_csv(sun_filename, parse_dates=['sunrise', 'sunset'])

In [109]:
# Select a subset of the loaded DataFrame
# (copy so later column assignments don't raise SettingWithCopyWarning)
sun = sun[['sunrise', 'sunset']].copy()

In [110]:
# Rename columns
sun.columns = ['sunrise_utc', 'sunset_utc']

In [111]:
# Localize datetime to UTC
sun['sunrise_utc'] = sun['sunrise_utc'].dt.tz_localize('utc')
sun['sunset_utc'] = sun['sunset_utc'].dt.tz_localize('utc')

In [114]:
# Create US/Mountain datetimes from the tz-aware UTC columns
sun['sunrise_local'] = sun['sunrise_utc'].dt.tz_convert('US/Mountain')
sun['sunset_local'] = sun['sunset_utc'].dt.tz_convert('US/Mountain')

In [117]:
sun = sun[['sunrise_local', 'sunset_local', 'sunrise_utc', 'sunset_utc']]

In [118]:
sun.head(2)

| | sunrise_local | sunset_local | sunrise_utc | sunset_utc |
|---|---|---|---|---|
| 0 | 2017-06-30 05:48:13-06:00 | 2017-06-30 21:31:54-06:00 | 2017-06-30 11:48:13+00:00 | 2017-07-01 03:31:54+00:00 |
| 1 | 2017-07-01 05:48:49-06:00 | 2017-07-01 21:31:41-06:00 | 2017-07-01 11:48:49+00:00 | 2017-07-02 03:31:41+00:00 |


*add local time column*

# New DataFrame

### Columns for values at Sunrise and Sunset
* 'values' refers to temperature and pressure data
* indexed by day
* each day runs from sunrise to sunset
* columns = ['rise_tempi','set_tempi','rise_pressurei','set_pressurei']
* index = ['2017-07-01', .... '2017-10-31']

* Get closest wund.local_date to sun.sunrise_local
* [query the closest datetime index](https://stackoverflow.com/questions/42264848/pandas-dataframe-how-to-query-the-closest-datetime-index)
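
One possible way to pick the reading closest to each sunrise and sunset is `pd.merge_asof` with `direction='nearest'`. The sketch below is only an illustration under assumptions: `wund_demo` and `sun_demo` are made-up stand-ins for `wund` and `sun`, the helper `values_at` is hypothetical, and a single station is assumed (with several stations the match would likely need to be done per `station_id`).

```python
import pandas as pd

# Toy stand-ins for the wund and sun frames (readings are made up for illustration).
wund_demo = pd.DataFrame({
    'local_date': pd.to_datetime([
        '2017-07-01 05:40', '2017-07-01 05:50',
        '2017-07-01 21:25', '2017-07-01 21:40',
    ]).tz_localize('US/Mountain'),
    'tempi': [55.0, 56.0, 70.0, 68.0],
    'pressurei': [26.00, 26.01, 25.95, 25.96],
})
sun_demo = pd.DataFrame({
    'sunrise_local': pd.to_datetime(['2017-07-01 05:48:49']).tz_localize('US/Mountain'),
    'sunset_local': pd.to_datetime(['2017-07-01 21:31:41']).tz_localize('US/Mountain'),
})


def values_at(events, readings, event_col, prefix):
    """Hypothetical helper: attach the reading nearest in time to each event."""
    matched = pd.merge_asof(
        events[[event_col]].sort_values(event_col),   # both sides must be sorted on the key
        readings.sort_values('local_date'),
        left_on=event_col,
        right_on='local_date',
        direction='nearest',                          # closest reading before or after
    )
    out = matched[['tempi', 'pressurei']].add_prefix(prefix)
    out.index = matched[event_col].dt.date            # index by local calendar day
    return out


rise = values_at(sun_demo, wund_demo, 'sunrise_local', 'rise_')
sset = values_at(sun_demo, wund_demo, 'sunset_local', 'set_')
at_sun = rise.join(sset)   # columns: rise_tempi, rise_pressurei, set_tempi, set_pressurei
print(at_sun)
```

Passing a `tolerance` (for example `pd.Timedelta('30min')`) to `merge_asof` would leave days without a nearby reading as NaN instead of matching a distant one.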

### Difference sunset, sunrise values
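
A minimal sketch of this step, assuming a per-day frame with the `rise_*` and `set_*` columns listed earlier. The frame `at_sun_demo` and its values are made up, and the sign convention (sunset minus sunrise) and the `diff_*` column names are assumptions, not something fixed by the notes.

```python
import datetime as dt

import pandas as pd

# Made-up per-day values at sunrise and sunset, indexed by local date.
at_sun_demo = pd.DataFrame(
    {
        'rise_tempi': [56.0],
        'set_tempi': [70.0],
        'rise_pressurei': [26.01],
        'set_pressurei': [25.95],
    },
    index=[dt.date(2017, 7, 1)],
)

# Assumed convention: sunset value minus sunrise value.
sun_diff = pd.DataFrame({
    'diff_tempi': at_sun_demo['set_tempi'] - at_sun_demo['rise_tempi'],
    'diff_pressurei': at_sun_demo['set_pressurei'] - at_sun_demo['rise_pressurei'],
})
print(sun_diff)
```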

### Max, min, average : sunset to sunrise : sunrise to sunset
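
One way this could be approached is to mask the readings that fall inside each window and aggregate them. The sketch below assumes a single station; `readings_demo`, `sun_demo` and the helper `window_stats` are hypothetical, and all values (including the second sunrise and sunset) are made up for illustration.

```python
import pandas as pd

# Made-up readings spanning one daytime window and the following night.
readings_demo = pd.DataFrame({
    'local_date': pd.to_datetime([
        '2017-07-01 06:00', '2017-07-01 14:00', '2017-07-01 21:00',
        '2017-07-01 23:00', '2017-07-02 03:00',
    ]).tz_localize('US/Mountain'),
    'tempi': [56.0, 80.0, 70.0, 62.0, 54.0],
    'pressurei': [26.01, 25.97, 25.95, 25.98, 26.02],
})
sun_demo = pd.DataFrame({
    'sunrise_local': pd.to_datetime(
        ['2017-07-01 05:48:49', '2017-07-02 05:49:00']).tz_localize('US/Mountain'),
    'sunset_local': pd.to_datetime(
        ['2017-07-01 21:31:41', '2017-07-02 21:31:00']).tz_localize('US/Mountain'),
})


def window_stats(readings, start, end):
    """Hypothetical helper: max, min and mean of readings with start <= time < end."""
    in_window = (readings['local_date'] >= start) & (readings['local_date'] < end)
    return readings.loc[in_window, ['tempi', 'pressurei']].agg(['max', 'min', 'mean'])


# Sunrise to sunset on the first day.
day_stats = window_stats(
    readings_demo, sun_demo.loc[0, 'sunrise_local'], sun_demo.loc[0, 'sunset_local'])
# Sunset on the first day to sunrise on the next day.
night_stats = window_stats(
    readings_demo, sun_demo.loc[0, 'sunset_local'], sun_demo.loc[1, 'sunrise_local'])
print(day_stats)
print(night_stats)
```

Looping this over every row of the sunrise/sunset frame is simple but slow on several hundred thousand readings; labelling each reading with its window once and using `groupby` would likely scale better.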

### Difference max, min, average sunset to sunrise : sunrise to sunset
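
If the statistics for the two time frames are held in frames with the same row and column labels, the differences follow from plain label-aligned subtraction. This is a sketch with made-up numbers; the names `day_demo` and `night_demo` and the day-minus-night direction are assumptions.

```python
import pandas as pd

stat_names = ['max', 'min', 'mean']

# Made-up statistics for one sunrise-to-sunset and one sunset-to-sunrise window.
day_demo = pd.DataFrame(
    {'tempi': [80.0, 56.0, 68.0], 'pressurei': [26.01, 25.95, 25.98]},
    index=stat_names,
)
night_demo = pd.DataFrame(
    {'tempi': [62.0, 54.0, 58.0], 'pressurei': [26.02, 25.98, 26.00]},
    index=stat_names,
)

# pandas aligns on both index and column labels before subtracting.
day_minus_night = day_demo - night_demo
print(day_minus_night)
```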

### Scratch

In [21]:
col = ['astronomical_twilight_begin','astronomical_twilight_end','civil_twilight_begin',
       'civil_twilight_end','date','day_length','nautical_twilight_begin','nautical_twilight_end',
       'solar_noon','station_id','sunrise','sunset']