## MONET-Analysis Airnow prep notebook

### How to use

- Start the notebook.
- In cell 2, set the start date and end date.
- In cell 2, set the output filename (something like `AIRNOW_STARTDATE_ENDDATE.nc`, with STARTDATE and ENDDATE in YYYYMMDD format).
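
The filename convention above can also be derived from the two dates instead of being typed by hand. The following is a minimal sketch; `airnow_filename` is a hypothetical helper introduced here for illustration and is not part of MONET or monetio.

```python
from datetime import date


def airnow_filename(start: date, end: date) -> str:
    """Hypothetical helper: build AIRNOW_STARTDATE_ENDDATE.nc with YYYYMMDD dates."""
    return f"AIRNOW_{start:%Y%m%d}_{end:%Y%m%d}.nc"


# Same period as the one used in cell 2 of this notebook.
filename = airnow_filename(date(2019, 9, 1), date(2019, 9, 30))
print(filename)  # AIRNOW_20190901_20190930.nc
```

Deriving the name from the dates keeps the file label and the requested period from drifting apart when only one of them is edited.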

In [1]:
import monetio as mio
import pandas as pd
import xarray as xr
from util import write_util

In [2]:
filename = 'AIRNOW_20190901_20190930.nc'
dates = pd.date_range(start='2019-09-01',end='2019-09-30',freq='H')

# Helper function that adds a local-time variable. This could be important for EPA statistics.
def get_local_time(ds):
    # only compute local time when the UTC offset is available
    if 'utcoffset' in ds.data_vars:
        tim = ds.time.copy()
        # broadcast the time coordinate across sites so it has dims (time, x)
        o = tim.expand_dims({'x':ds.x.values}).transpose('time','x')
        on = xr.Dataset({'time_local':o,'utcoffset':ds.utcoffset})
        y = on.to_dataframe()
        # local time = UTC time + per-site offset in hours
        y['time_local'] = y.time_local + pd.to_timedelta(y.utcoffset, unit='h')
        time_local = y[['time_local']].to_xarray()
        ds = xr.merge([ds,time_local])
    return ds
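
The core of the local-time helper is adding a per-site UTC offset (in hours) to the UTC timestamp. The sketch below shows that arithmetic on a small pandas table. The rows are illustrative: the first mirrors the offset of -4 seen in the AirNow output later in this notebook, and the second site id and its offset are made up.

```python
import pandas as pd

# Toy table: UTC timestamps with a per-site UTC offset in hours.
# "site_b" and its offset of -7 are hypothetical values for illustration.
obs = pd.DataFrame(
    {
        "time": pd.to_datetime(["2019-09-01 00:00", "2019-09-01 00:00"]),
        "x": ["10102", "site_b"],
        "utcoffset": [-4, -7],
    }
)

# Local time is the UTC time shifted by the offset, expressed as a timedelta.
obs["time_local"] = obs["time"] + pd.to_timedelta(obs["utcoffset"], unit="h")
print(obs[["x", "time", "time_local"]])
```

With an offset of -4, midnight UTC on 2019-09-01 becomes 20:00 on 2019-08-31 local time.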

In [3]:
df = mio.airnow.add_data(dates,wide_fmt=False,n_procs=12)

Aggregating AIRNOW files...
Building AIRNOW URLs...
[########################################] | 100% Completed |  0.8s
[########################################] | 100% Completed |  0.8s
[########################################] | 100% Completed |  0.9s
[########################################] | 100% Completed |  0.9s
[########################################] | 100% Completed | 52.3s
[########################################] | 100% Completed | 52.3s
[########################################] | 100% Completed | 52.4s
[########################################] | 100% Completed | 52.4s
    Adding in Meta-data


In [4]:
df = df.dropna(subset=['latitude','longitude']) # drop all values without an assigned latitude and longitude 
dfp = df.rename({'siteid':'x'},axis=1).pivot_table(values='obs',index=['time','x'], columns=['variable']) # convert to wide format
dfx = dfp.to_xarray() # convert to xarray 
# df.head()
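
The long-to-wide conversion in the cell above can be hard to picture on the full dataset. Here is a minimal sketch of the same `pivot_table` call on a toy long-format table; the site labels `A` and `B` and the observation values are made up for illustration.

```python
import pandas as pd

# Toy long-format table: one row per (time, site, variable) observation.
long_df = pd.DataFrame(
    {
        "time": pd.to_datetime(["2019-09-01"] * 3),
        "x": ["A", "A", "B"],
        "variable": ["OZONE", "PM2.5", "OZONE"],
        "obs": [25.0, 4.0, 27.0],
    }
)

# Wide format: one row per (time, site), one column per variable.
wide = long_df.pivot_table(values="obs", index=["time", "x"], columns=["variable"])
print(wide)
```

Site `B` has no PM2.5 row, so its PM2.5 entry is NaN in the wide table. Note that `pivot_table` aggregates with the mean by default, so duplicate (time, site, variable) rows would be averaged rather than raising an error.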

In [7]:
# When converting to wide format, we have to merge the site data back into the dataset.
dfpsite = df.rename({'siteid':'x'},axis=1).drop_duplicates(subset=['x']) # rename siteid to x and drop duplicate sites
# convert sites to xarray 
test = dfpsite.drop(['time','time_local','variable','obs'],axis=1).set_index('x').dropna(subset=['latitude','longitude']).to_xarray()
# merge sites back into the data 
t = xr.merge([dfx,test])
# get local time
tt = get_local_time(t)

Unnamed: 0,time,x,site,utcoffset,variable,units,obs,time_local,latitude,longitude,cmsa_name,msa_code,msa_name,state_name,epa_region
0,2019-09-01,10102,St. John's,-4,OZONE,PPB,25.0,2019-08-31 20:00:00,47.6528,-52.8167,,,,CC,CA
671,2019-09-01,10401,Mount Pearl,-4,OZONE,PPB,27.0,2019-08-31 20:00:00,47.505,-52.7947,,,,CC,CA
3345,2019-09-01,10501,Grand Falls Windsor,-4,OZONE,PPB,22.0,2019-08-31 20:00:00,49.0194,-55.8028,,,,CC,CA
4014,2019-09-01,10601,Goose Bay,-4,OZONE,PPB,20.0,2019-08-31 20:00:00,53.3047,-60.3644,,,,CC,CA
4701,2019-09-01,10602,MacPherson Avenue -,-4,PM2.5,UG/M3,4.0,2019-08-31 20:00:00,48.95224,-57.92207,,,,CC,CA
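
The site-metadata step in cell 7 reduces the long table to one row per site before merging it back. The sketch below illustrates that reduction on a toy table using pandas only; the site labels and coordinates are made up.

```python
import pandas as pd

# Toy long-format table: site coordinates repeat on every observation row.
long_df = pd.DataFrame(
    {
        "time": pd.to_datetime(
            ["2019-09-01 00:00", "2019-09-01 01:00", "2019-09-01 00:00"]
        ),
        "x": ["A", "A", "B"],
        "variable": ["OZONE", "OZONE", "OZONE"],
        "obs": [25.0, 26.0, 27.0],
        "latitude": [47.65, 47.65, 47.50],
        "longitude": [-52.82, -52.82, -52.79],
    }
)

# Keep one row per site, then drop the per-observation columns so that
# only static site metadata remains, indexed by site.
sites = (
    long_df.drop_duplicates(subset=["x"])
    .drop(["time", "variable", "obs"], axis=1)
    .set_index("x")
)
print(sites)
```

The result is indexed by `x` alone, which is what lets it line up with the `x` dimension of the wide-format data when the two are merged.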


In [11]:
# add siteid back as a variable and create x as an array of integers 
tt['siteid'] = (('x'),tt.x.values)
tt['x'] = range(len(tt.x))
# expand dimensions so that it is (time,y,x)
t = tt.expand_dims('y').set_coords(['siteid','latitude','longitude']).transpose('time','y','x')
t

In [12]:
# write out to the filename set in cell 2
write_util.write_ncf(t,filename)

Writing: AIRNOW_20190901_20190930.nc
Compressing: BARPR, original_dtype: float64
Compressing: BC, original_dtype: float64
Compressing: CO, original_dtype: float64
Compressing: NO, original_dtype: float64
Compressing: NO2, original_dtype: float64
Compressing: NO2Y, original_dtype: float64
Compressing: NOX, original_dtype: float64
Compressing: NOY, original_dtype: float64
Compressing: OZONE, original_dtype: float64
Compressing: PM10, original_dtype: float64
Compressing: PM2.5, original_dtype: float64
Compressing: PRECIP, original_dtype: float64
Compressing: RHUM, original_dtype: float64
Compressing: RWD, original_dtype: float64
Compressing: RWS, original_dtype: float64
Compressing: SO2, original_dtype: float64
Compressing: SRAD, original_dtype: float64
Compressing: TEMP, original_dtype: float64
Compressing: UV-AETH, original_dtype: float64
Compressing: WD, original_dtype: float64
Compressing: WS, original_dtype: float64
Compressing: cmsa_name, original_dtype: float64
Compressing: msa_cod

In [17]:
ls -lh data/* 

-rw-r--r-- 1 bbaker25 users 16M May 18 13:56 data/AERONET_L15_20190801_20190831.nc
-rw-r--r-- 1 bbaker25 users 42M Jun  7 10:56 data/AIRNOW_20190831_20190831.nc
-rw-r--r-- 1 bbaker25 users 42M Jun  7 11:00 data/AIRNOW_20190901_20190930.nc
