# Tidal currents from ferry data: download code
I created this notebook to try out the download process and to learn how to work with NetCDF data in Python.

There is tidal current data from two ferries. 

To read NetCDF files we need the netCDF4 module, which is built on HDF5. I have already installed h5py and netCDF4 on my computer as follows:

In [2]:
!conda install h5py
!pip install netCDF4

Fetching package metadata .......
Solving package specifications: ..........

# All requested packages already installed.
# packages in environment at /Users/Maru/miniconda3:
#
h5py                      2.6.0               np111py35_2  


#### Andrew: I've added a compact "download ferry data" function below. 

In [3]:
import numpy as np
from netCDF4 import Dataset
import pandas as pd

In [12]:
def download_ferry_data():
    """Downloads the ADCP data in NetCDF format from two WaDOT ferries"""
    from netCDF4 import Dataset
    # Remote datasets served over OPeNDAP (THREDDS "dodsC" endpoints)
    URL1='http://107.170.217.21:8080/thredds/dodsC/Salish_L1_STA/Salish_L1_STA.ncml'
    URL2='http://107.170.217.21:8080/thredds/dodsC/Kennewick_L1_STA/Kennewick_L1_STA.ncml'
    # Opening a remote dataset reads only its metadata;
    # values are fetched later, when a variable is sliced
    Salish = Dataset(URL1, 'r')
    Kennewick = Dataset(URL2, 'r')
    return Salish, Kennewick




A tutorial on how to read NetCDF is available here:
http://unidata.github.io/netcdf4-python/


There are two ferries, and their data lives at these URLs:

In [4]:
URL1='http://107.170.217.21:8080/thredds/dodsC/Salish_L1_STA/Salish_L1_STA.ncml'
URL2='http://107.170.217.21:8080/thredds/dodsC/Kennewick_L1_STA/Kennewick_L1_STA.ncml'


In [5]:
Salish = Dataset(URL1, 'r')
Salish.variables

OrderedDict([('depth', <class 'netCDF4._netCDF4.Variable'>
              float32 depth(depth)
                  long_name: Depth of Bin Center
                  standard_name: depth
                  units: meters
                  positive: down
              unlimited dimensions: 
              current shape = (60,)
              filling off), ('time', <class 'netCDF4._netCDF4.Variable'>
              float64 time(time)
                  long_name: Time (UTC) as Seconds Since 2014-01-01 00:00:00
                  standard_name: time
                  units: seconds since 2014-01-01 00:00:00
                  time_zone: UTC
                  calendar: standard
              unlimited dimensions: 
              current shape = (991145,)
              filling off), ('latitude', <class 'netCDF4._netCDF4.Variable'>
              float64 latitude(time)
                  long_name: Latitude
                  standard_name: latitude
                  units: degrees_north
                  _F

In [6]:
Kennewick = Dataset(URL2, 'r')
Kennewick.variables

OrderedDict([('depth', <class 'netCDF4._netCDF4.Variable'>
              float32 depth(depth)
                  long_name: Depth of Bin Center
                  standard_name: depth
                  units: meters
                  positive: down
              unlimited dimensions: 
              current shape = (60,)
              filling off), ('time', <class 'netCDF4._netCDF4.Variable'>
              float64 time(time)
                  long_name: Time (UTC) as Seconds Since 2014-01-01 00:00:00
                  standard_name: time
                  units: seconds since 2014-01-01 00:00:00
                  time_zone: UTC
                  calendar: standard
              unlimited dimensions: 
              current shape = (1007475,)
              filling off), ('latitude', <class 'netCDF4._netCDF4.Variable'>
              float64 latitude(time)
                  long_name: Latitude
                  standard_name: latitude
                  units: degrees_north
                  _
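
Both datasets store `time` as seconds since 2014-01-01 00:00:00 UTC (see the `units` attribute in the listings above). As a minimal sketch of what that encoding means, the cell below converts a few made-up second counts into Python datetimes using only the standard library. The names `EPOCH`, `seconds_to_datetime` and the example values are mine, not part of the datasets; the netCDF4 package also provides `num2date` for the same job on real variables.

```python
from datetime import datetime, timedelta, timezone

# Reference epoch taken from the 'units' attribute of the time variable
EPOCH = datetime(2014, 1, 1, tzinfo=timezone.utc)


def seconds_to_datetime(seconds):
    """Convert seconds since 2014-01-01 00:00:00 UTC to an aware datetime."""
    return EPOCH + timedelta(seconds=float(seconds))


# Made-up sample values (not read from the ferry data)
example_seconds = [0.0, 86400.0, 90061.5]
example_times = [seconds_to_datetime(s) for s in example_seconds]

for s, t in zip(example_seconds, example_times):
    print(s, "->", t.isoformat())
```

With the real data, the same function could be applied to values read from `Salish.variables['time']`.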

A netCDF4 dataset works very similarly to a pandas DataFrame: each variable is accessed by name, like this:

In [7]:
depth=Salish.variables['depth']

In [9]:
print(depth)

<class 'netCDF4._netCDF4.Variable'>
float32 depth(depth)
    long_name: Depth of Bin Center
    standard_name: depth
    units: meters
    positive: down
unlimited dimensions: 
current shape = (60,)
filling off



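To access the variable's values we need to slice it. Note that a slice excludes its end index, so `0:59` returns the first 59 of the 60 bins, while `[:]` returns all of them: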

In [10]:
depth_bins = depth[0:59]

In [11]:
depth_bins

array([   4.05000019,    6.05000019,    8.05000019,   10.05000019,
         12.05000019,   14.05000019,   16.04999924,   18.04999924,
         20.04999924,   22.04999924,   24.04999924,   26.04999924,
         28.04999924,   30.04999924,   32.04999924,   34.04999924,
         36.04999924,   38.04999924,   40.04999924,   42.04999924,
         44.04999924,   46.04999924,   48.04999924,   50.04999924,
         52.04999924,   54.04999924,   56.04999924,   58.04999924,
         60.04999924,   62.04999924,   64.05000305,   66.05000305,
         68.05000305,   70.05000305,   72.05000305,   74.05000305,
         76.05000305,   78.05000305,   80.05000305,   82.05000305,
         84.05000305,   86.05000305,   88.05000305,   90.05000305,
         92.05000305,   94.05000305,   96.05000305,   98.05000305,
        100.05000305,  102.05000305,  104.05000305,  106.05000305,
        108.05000305,  110.05000305,  112.05000305,  114.05000305,
        116.05000305,  118.05000305,  120.05000305], dtype=flo
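
To make the slicing behaviour concrete without touching the server, here is a small sketch on a stand-in list. It assumes 60 bin centres starting at 4.05 m with 2 m spacing, which matches the values printed above; the 60th value is my extrapolation, not something read from the file.

```python
# Stand-in for the 60 depth bin centres (assumed: start 4.05 m, 2 m spacing)
n_bins = 60
bin_centers = [4.05 + 2.0 * i for i in range(n_bins)]

first_59 = bin_centers[0:59]  # end index is excluded, so the last bin is dropped
all_bins = bin_centers[:]     # a bare colon keeps every bin

print(len(first_59), len(all_bins))
```

The same indexing rules apply when slicing a netCDF4 variable such as `depth`.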

In [16]:
lat_1=Salish.variables['latitude']
lat_1

<class 'netCDF4._netCDF4.Variable'>
float64 latitude(time)
    long_name: Latitude
    standard_name: latitude
    units: degrees_north
    _FillValue: -32768.0
unlimited dimensions: 
current shape = (987194,)
filling off

In [17]:
latitude1=lat_1[:]

In [18]:
latitude1

array([ 48.10909167,  48.10864249,  48.10826333, ...,  48.10882584,
        48.10910332,  48.10933668])

At this point I think it is better to start working on functions:
- Download data:
    - inputs: NetCDF URL
    - outputs: needed variables
- QC data: can be called inside download data
    - inputs: variables needed for QC of ferry data
    - outputs: QC'd velocity data, time of measurement, location of measurement, ferry
- Save final velocities, location and time in a new file (either a NetCDF file or a .csv, if that is easier for everyone)
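
The QC and save steps in the plan above could be sketched roughly as follows. This is only a skeleton under my own assumptions: the record layout, the field names, and the functions `qc_records` and `save_csv` are hypothetical, and the only QC rule shown is dropping records that contain the `-32768.0` fill value reported for latitude. The real QC criteria still need to be decided.

```python
import csv
import io

# _FillValue reported for latitude; assumed here to apply to the other fields too
FILL_VALUE = -32768.0


def qc_records(records, fill_value=FILL_VALUE):
    """Placeholder QC: keep only records in which no field equals the fill value."""
    return [r for r in records if fill_value not in r.values()]


def save_csv(records, handle, fieldnames):
    """Write the records to an open text handle as CSV with a header row."""
    writer = csv.DictWriter(handle, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(records)


# Made-up sample records standing in for downloaded ferry data
FIELDS = ["ferry", "time", "latitude", "velocity"]
sample = [
    {"ferry": "Salish", "time": 0.0, "latitude": 48.109, "velocity": 0.5},
    {"ferry": "Salish", "time": 30.0, "latitude": FILL_VALUE, "velocity": 0.4},
]

clean = qc_records(sample)
buffer = io.StringIO()  # an in-memory file; a real run would open a .csv file instead
save_csv(clean, buffer, FIELDS)
csv_text = buffer.getvalue()
print(csv_text)
```

Writing to an in-memory buffer keeps the sketch free of side effects; passing a handle from `open('ferry.csv', 'w', newline='')` would write to disk instead.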

In [19]:
lat_1.size

987194