## NetCDF Weather File Converter
Description:  ETL system for weather files from the EU Copernicus project.  The input is a NetCDF file with 2-m temperature, 10-m wind speed, total cloud cover, and total precipitation data, provided as monthly means for a one-year period.  The output is a reduced CSV file for use in web browsers on the "2018 Weather" web site.
Rationale:  This notebook automates the data table generation process and makes it reproducible.  As the originator of the data files and builder of the end-use web site, I am the logical choice for authoring the notebook.  The process is needed to finish the web site, and doing it in a notebook will make future website updates faster.
by:  Andrew Guenthner
v0.5:  05-07-2019

### Requirements:
The notebook expects a file named 'means_30yr.csv' in the same directory as this notebook.  The file is generated by running the notebook named 'generate_30yr_mean.ipynb'.  The input file and input path can be set below.  The SciPy (0.16), pandas (0.23), and NumPy (1.15) modules are also needed.

### Note:  The files involved can be >100 MB in size
Make sure your system resources are adequate before running this notebook.
The operations involved attempt to keep data on disk as much as possible, but can still consume a lot of memory. 
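
As a rough guide to what "adequate" means, the sketch below estimates the in-memory size of the data arrays. It assumes the grid dimensions reported by the exploration cells further down (12 time steps, 721 latitudes, 1440 longitudes, four data variables). The 2-byte and 8-byte element sizes are assumptions about how the values might be stored on disk and after conversion to floating point; they are not read from the file.

```python
# Rough memory estimate for the data arrays (illustrative only).
# Grid dimensions are those reported by the exploration cells below;
# the bytes-per-value figures are assumptions, not read from the file.
n_time, n_lat, n_lon = 12, 721, 1440
n_variables = 4  # si10, t2m, tcc, tp


def estimate_megabytes(bytes_per_value):
    """Return the size in MB (10**6 bytes) of all data variables combined."""
    n_values = n_time * n_lat * n_lon * n_variables
    return n_values * bytes_per_value / 1e6


packed_mb = estimate_megabytes(2)    # assumed 16-bit packed integers
float64_mb = estimate_megabytes(8)   # assumed 64-bit floats after unpacking
print('2 bytes/value: %.1f MB' % packed_mb)    # → 2 bytes/value: 99.7 MB
print('8 bytes/value: %.1f MB' % float64_mb)   # → 8 bytes/value: 398.7 MB
```

Under these assumptions, loading every variable as 64-bit floats takes about four times the space of the 2-byte case, which is why it helps to read one variable or one time step at a time.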

In [32]:
# Import dependencies
import numpy as np
from scipy.io import netcdf
import datetime as dt
import pandas as pd
filepath = '../Large_File/'

In [3]:
# Input the filename you want to process here...
file_to_open = 'data1.nc'

In [4]:
# Do a quick file test before starting ...
infile = filepath + file_to_open
with netcdf.netcdf_file(infile, 'r') as f:
    print(f.history)

b'2019-05-07 21:07:41 GMT by grib_to_netcdf-2.10.0: /opt/ecmwf/eccodes/bin/grib_to_netcdf -o /cache/data6/adaptor.mars.internal-1557263252.1882875-9670-11-dcde4521-4ef2-4a84-b494-d4054b6b9cb7.nc /cache/tmp/dcde4521-4ef2-4a84-b494-d4054b6b9cb7-adaptor.mars.internal-1557263252.189089-9670-4-tmp.grib'


The cell above should have printed the file's `history` attribute, a byte string recording when and how the file was generated.

### File content check

The next steps are meant for basic file exploration.  These can be skipped if you know the file contains the needed info.

In [5]:
# Print the available variables
infile = filepath + file_to_open
with netcdf.netcdf_file(infile, 'r') as f:
    print(f.variables)

OrderedDict([('longitude', <scipy.io.netcdf.netcdf_variable object at 0x000001647D175748>), ('latitude', <scipy.io.netcdf.netcdf_variable object at 0x000001647D175D30>), ('time', <scipy.io.netcdf.netcdf_variable object at 0x000001647D175828>), ('si10', <scipy.io.netcdf.netcdf_variable object at 0x000001647D1758D0>), ('t2m', <scipy.io.netcdf.netcdf_variable object at 0x000001647D1759B0>), ('tcc', <scipy.io.netcdf.netcdf_variable object at 0x000001647D175A90>), ('tp', <scipy.io.netcdf.netcdf_variable object at 0x000001647D1756D8>)])


What you should see is:
*  longitude
*  latitude
*  time
*  si10 -- this is the wind speed at 10 m height
*  t2m -- this is the 2-meter air temperature
*  tcc -- total cloud cover
*  tp -- total precipitation
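
The same check can be done programmatically rather than by eye. The helper below is a minimal sketch: `missing_variables` and `REQUIRED` are names introduced here for illustration and are not part of the original notebook. It accepts any collection of variable names, such as `f.variables.keys()`.

```python
# Hypothetical helper (not in the original notebook): report which of the
# expected variables are absent from a collection of variable names.
REQUIRED = ('longitude', 'latitude', 'time', 'si10', 't2m', 'tcc', 'tp')


def missing_variables(available, required=REQUIRED):
    """Return the required names not found in `available`, in listed order."""
    available = set(available)
    return [name for name in required if name not in available]


# Example with a plain list; in the notebook, pass f.variables.keys() instead.
example_names = ['longitude', 'latitude', 'time', 't2m', 'tp']
print(missing_variables(example_names))  # → ['si10', 'tcc']
```

An empty list means the file has everything the rest of the notebook needs.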

In [33]:
# Print the units of these variables
infile = filepath + file_to_open
with netcdf.netcdf_file(infile, 'r') as f:
    print('longitude:  ',f.variables['longitude'].units)
    print('latitude:   ',f.variables['latitude'].units)
    print('time:       ',f.variables['time'].units)
    print('temperature:',f.variables['t2m'].units)
    print('wind speed: ',f.variables['si10'].units)
    print('cloud cover ',f.variables['tcc'].units)
    print('total precip',f.variables['tp'].units)

longitude:   b'degrees_east'
latitude:    b'degrees_north'
time:        b'hours since 1900-01-01 00:00:00.0'
temperature: b'K'
wind speed:  b'm s**-1'
cloud cover  b'(0 - 1)'
total precip b'm'
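
If friendlier units are wanted for the web site, the values can be converted from the units listed above. The functions below are a sketch under that assumption; their names are hypothetical, and the original notebook does not state which units the final CSV uses. They work on plain floats and, through element-wise arithmetic, on NumPy arrays.

```python
# Hypothetical conversion helpers (not in the original notebook).
# Input units are those printed above: K, m, and a 0-1 fraction.

def kelvin_to_celsius(t_kelvin):
    """Convert temperature from kelvin to degrees Celsius."""
    return t_kelvin - 273.15


def metres_to_millimetres(depth_m):
    """Convert a precipitation depth from metres to millimetres."""
    return depth_m * 1000.0


def fraction_to_percent(fraction):
    """Convert cloud cover from a 0-1 fraction to a percentage."""
    return fraction * 100.0


print(kelvin_to_celsius(273.15))      # freezing point of water
print(metres_to_millimetres(0.002))   # 2 mm of precipitation, given in metres
print(fraction_to_percent(0.5))       # half-covered sky
```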


In [29]:
# Print the spatio-temporal characteristics of the data
infile = filepath + file_to_open
with netcdf.netcdf_file(infile, 'r') as f:
    print('Longitude:')
    print('# of Points: ',f.variables['longitude'].shape)
    print('From: ',f.variables['longitude'][0],' to ',f.variables['longitude'][-1])
    print('Latitude:')
    print('# of Points: ',f.variables['latitude'].shape)
    print('From: ',f.variables['latitude'][0],' to ',f.variables['latitude'][-1])
    print('Time:')
    print('# of Points: ',f.variables['time'].shape)
    print('From: ',f.variables['time'][0],' to ',f.variables['time'][-1])

Longitude:
# of Points:  (1440,)
From:  0.0  to  359.75
Latitude:
# of Points:  (721,)
From:  90.0  to  -90.0
Time:
# of Points:  (12,)
From:  1034376  to  1042392
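
The time values are hard to read as raw hour counts. The sketch below converts them to calendar dates using the epoch from the `units` attribute printed earlier ('hours since 1900-01-01 00:00:00.0'). `hours_to_datetime` is a name introduced here for illustration, and the conversion assumes the standard Gregorian calendar.

```python
import datetime as dt

# Reference epoch taken from the time variable's units attribute:
# 'hours since 1900-01-01 00:00:00.0'
EPOCH = dt.datetime(1900, 1, 1)


def hours_to_datetime(hours):
    """Convert an hour offset from EPOCH to a datetime (Gregorian calendar)."""
    return EPOCH + dt.timedelta(hours=int(hours))


print(hours_to_datetime(1034376))  # first time step → 2018-01-01 00:00:00
print(hours_to_datetime(1042392))  # last time step  → 2018-12-01 00:00:00
```

The first and last stamps fall on 1 January and 1 December 2018, which matches twelve monthly means for a single year.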


### Generate the 2018 Data

In [31]:
# Print the shape of one time step of the 2-m temperature field (latitude x longitude)
infile = filepath + file_to_open
with netcdf.netcdf_file(infile, 'r') as f:
    print (f.variables['t2m'][0].shape)

(721, 1440)
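
Each time step is therefore a 721 x 1440 array indexed as (latitude, longitude). To look up a location, a latitude/longitude pair has to be mapped to a row and column. The sketch below assumes the grid is regularly spaced at 0.25 degrees, which is what the endpoints and point counts printed above imply, with latitude running from 90 down to -90 and longitude from 0 to 359.75. `grid_index` is a hypothetical helper, not part of the original notebook.

```python
# Hypothetical helper (not in the original notebook). Assumes a regular
# 0.25-degree grid: latitude 90 → -90 (721 rows), longitude 0 → 359.75
# (1440 columns), as implied by the values printed above.

def grid_index(lat, lon, step=0.25):
    """Return (row, col) of the grid point nearest to lat/lon in degrees.

    Longitudes may be given as -180..180 or 0..360; they wrap around.
    """
    n_lon = int(round(360.0 / step))
    row = int(round((90.0 - lat) / step))
    col = int(round((lon % 360.0) / step)) % n_lon
    return row, col


print(grid_index(90.0, 0.0))       # → (0, 0)
print(grid_index(-90.0, 359.75))   # → (720, 1439)
print(grid_index(0.0, -0.25))      # → (360, 1439)
```

With the file open, a value could then be read as `f.variables['t2m'][0][row, col]`, which avoids loading more than one time step.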


In [25]:
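# Clean up: delete the temporary variable `t` to free memory
# (raises NameError if `t` was never defined in this session)
del t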

In [26]:
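# Make sure the file handle is closed (redundant after a `with` block,
# which already closes the file on exit)
f.close()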