# Frosty Dawgs Data Pipeline
This notebook walks through the steps used to acquire and process the data used in the Frosty Dawgs project SnowML model. It can be used to create additional or modified model-ready data. 

# Step 0 - Set up Notebook

Before running the cell below, you will need to install the snowML module. To do so, clone the repo at https://github.com/DSHydro/SnowML.git, navigate to the SnowML directory, and run `pip install .` from the command line. You may also need to run `pip install easysnowdata` separately.
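Before importing everything, it can help to confirm that the required packages are visible to the notebook kernel. The following is a minimal sketch using only the standard library; the helper name `missing_modules` is an assumption made for illustration and is not part of snowML.

```python
import importlib.util


def missing_modules(module_names):
    """Return the names in module_names that cannot be found for import."""
    missing = []
    for name in module_names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised for some malformed or partially installed packages
            found = False
        if not found:
            missing.append(name)
    return missing


# Top-level packages imported in the setup cell of this notebook
required = ["ee", "xarray", "zarr", "easysnowdata", "snowML"]
print("Missing modules:", missing_modules(required))
```

An empty list means every package was found; any name that is printed still needs to be installed in the active environment.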

In [1]:
# import needed modules 
import ee 
import xarray as xr 
import zarr 
import easysnowdata 
from snowML.datapipe.utils import set_data_constants as sdc
from snowML.datapipe import get_bronze as gb
from snowML.datapipe.utils import get_geos as gg
from snowML.datapipe import bronze_to_gold as btg
from snowML.datapipe.to_silver import to_silver as silver
from snowML.datapipe.to_silver import combine_silver as cs 
from snowML.datapipe import to_model_ready as gtm
from snowML.datapipe.utils import data_utils as du

# Step 1 - Credentials, Credentials, Credentials 

**1A: Set up Google Earth Engine Account**
First, you'll need a Google Earth Engine account to access the HUC geometry information. Once you have created an account and registered a project at https://code.earthengine.google.com/register, run the code below and follow the instructions to generate an access token. 

In [2]:
ee.Authenticate(auth_mode='notebook')

True

In [3]:
ee.Initialize(project = "ee-frostydawgs") # Replace with your project name 




**1B: Set up NASA Earthdata Account** <br> 
You also need a NASA Earthdata account, which you can create for free here: https://urs.earthdata.nasa.gov/home. If you sign in from your browser on the same machine from which you run this notebook, all will go well. If you are running the notebook from a hosted environment such as SageMaker, credential management for NASA Earthdata is tricky and is not addressed here.  

# Step 2 - Set up S3 Storage Buckets

The code in this notebook and in the snowML.datapipe module assumes that data will be stored in one or more AWS S3 buckets.  

If you do not have an AWS account, you can go to https://aws.amazon.com/s3/ to create one. A single S3 bucket in the Amazon Free Tier should be sufficient to run the example code in this notebook. However, large-scale data processing may require more. 

**Note**: If your buckets are not public, you should also configure an IAM role with the ability to access the buckets programmatically, and store the relevant credentials in your working environment. 

Once you have set up one or more S3 buckets, you need to define a bucket dictionary that specifies where data should be stored. The Frosty Dawgs data pipeline uses a medallion architecture with the following tiers:
 - Bronze data,
 - Gold data, and
 - Model Ready data.

Intermediate files resulting from early processing steps are saved at the end of each processing tier to enable a more modular approach. This will enable future researchers to more easily update the data pipeline without losing the benefit of the early, computationally intensive processing stages.


In [4]:
# create a dictionary defining where various data should be stored 
# bucket types are ["shape-bronze", "bronze", "silver", "gold", "model-ready"]
# in production the names we used were ["snowml-shape", "snowml-bronze", "snowml-silver", "snowml-gold", "snowml-model-ready"]
# for this notebook we'll use a single bucket ["sues-test"] 
# create the bucket dictionary by calling the create_bucket_dict function in the set_data_constants module (sdc) and passing the parameter "test"
BUCKET_DICT = sdc.create_bucket_dict("test")
BUCKET_DICT


{'shape-bronze': 'sues-test',
 'bronze': 'sues-test',
 'silver': 'sues-test',
 'gold': 'sues-test',
 'model-ready': 'sues-test'}
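
If you want to point the pipeline at your own buckets, a dictionary with the same five keys can be built by hand. The sketch below assumes the downstream functions only need a plain dictionary keyed by the bucket types listed above, which this notebook does not confirm; `make_bucket_dict` and the bucket name `my-snow-bucket` are hypothetical.

```python
# Bucket types copied from the comment in the cell above
BUCKET_TYPES = ["shape-bronze", "bronze", "silver", "gold", "model-ready"]


def make_bucket_dict(bucket_names):
    """Map each bucket type to a bucket name.

    bucket_names may be a single name (used for every tier) or a list
    with one name per entry in BUCKET_TYPES, in the same order.
    """
    if isinstance(bucket_names, str):
        bucket_names = [bucket_names] * len(BUCKET_TYPES)
    if len(bucket_names) != len(BUCKET_TYPES):
        raise ValueError(f"expected {len(BUCKET_TYPES)} bucket names")
    return dict(zip(BUCKET_TYPES, bucket_names))


# Hypothetical bucket name, used for every tier
my_buckets = make_bucket_dict("my-snow-bucket")
print(my_buckets)
```

Using one bucket for every tier keeps a test run simple, while separate buckets per tier make it easier to manage access and storage costs independently.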

# Step 3 - Create Bronze Level Data (SWE and WRF Data)

**3A - Convert SWE data to Zarr files** <br>
The University of Arizona SWE data is available here: https://nsidc.org/data/nsidc-0719/versions/1, with each year of data contained in its own NetCDF file. Since we most often access the data by region across all years, the first step is to download the raw data and rewrite it as Zarr files with storage chunks better suited to that access pattern. This data is then saved in the bronze bucket. 

This step can be time consuming! For purposes of this notebook, we'll download only a year or two, and even that can take a while . . . 

In [5]:
# download the specified years and save as a zarr file 
bucket_nm = BUCKET_DICT["bronze"] 
var = "swe" 
yr_start = 1995
yr_end =  1996
gb.get_bronze(var, bucket_nm, year_start = yr_start, year_end = yr_end)

Resuming with completed years: []
Processing year: 1995
Downloading https://daacdata.apps.nsidc.org/pub/DATASETS/nsidc0719_SWE_Snow_Depth_v1/4km_SWE_Depth_WY1995_v01.nc
<xarray.Dataset> Size: 3GB
Dimensions:   (lat: 621, lon: 1405, time: 365, time_str_len: 11)
Coordinates:
  * lat       (lat) float32 2kB 24.08 24.12 24.17 24.21 ... 49.83 49.88 49.92
  * lon       (lon) float32 6kB -125.0 -125.0 -124.9 ... -66.58 -66.54 -66.5
  * time      (time) datetime64[ns] 3kB 1994-10-01 1994-10-02 ... 1995-09-30
Dimensions without coordinates: time_str_len
Data variables:
    crs       |S1 1B ...
    time_str  (time_str_len, time) |S1 4kB dask.array<chunksize=(11, 365), meta=np.ndarray>
    SWE       (time, lat, lon) float32 1GB dask.array<chunksize=(61, 104, 235), meta=np.ndarray>
    DEPTH     (time, lat, lon) float32 1GB dask.array<chunksize=(61, 104, 235), meta=np.ndarray>
Created new Zarr file at s3://sues-test/swe_all.zarr
Processing year: 1996
Downloading https://daacdata.apps.nsidc.org/pub

'swe_all.zarr'

In [6]:
# reload the data into an xarray to see how it looks 
zarr_store_url = f's3://{bucket_nm}/{var}_all.zarr'
ds = xr.open_zarr(zarr_store_url, consolidated=True, storage_options={'anon': False}) # set anon to True if public bucket
ds

,Array,Chunk
Bytes,2.38 GiB,34.03 MiB
Shape,"(731, 621, 1405)","(365, 104, 235)"
Dask graph,108 chunks in 2 graph layers,108 chunks in 2 graph layers
Data type,float32 numpy.ndarray,float32 numpy.ndarray
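
The chunk count and chunk size reported in the output above can be checked by hand from the array shape and chunk shape. This is a small arithmetic sketch using only the standard library; the helper names are made up for illustration, and 4 bytes per value corresponds to the float32 data type shown.

```python
import math


def count_chunks(shape, chunk_shape):
    """Number of chunks needed to tile an array of `shape` with `chunk_shape`."""
    return math.prod(math.ceil(n / c) for n, c in zip(shape, chunk_shape))


def chunk_mib(chunk_shape, bytes_per_value):
    """Size of one full chunk in MiB."""
    return math.prod(chunk_shape) * bytes_per_value / 2**20


shape = (731, 621, 1405)        # (time, lat, lon) from the output above
chunk_shape = (365, 104, 235)   # chunk shape from the output above

print(count_chunks(shape, chunk_shape))      # 3 * 6 * 6 = 108
print(round(chunk_mib(chunk_shape, 4), 2))   # 34.03
```

Because each chunk spans a full year of days but only a small patch of latitude and longitude, reading a long time series for one region touches relatively few chunks.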


The get_bronze function also creates a local file called "swe_progress.json" that tracks which years have already been downloaded and protects against overwriting.   If you want to add additional years, you can call the function again and pass the parameter append_to = True.  
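
To make the idea of a progress file concrete, here is a minimal sketch of how completed years could be tracked in JSON so that a rerun skips them. It is not the snowML implementation: the file layout (a single `completed_years` key) and all function names are assumptions made for illustration.

```python
import json
from pathlib import Path


def load_completed_years(progress_path):
    """Return the list of years recorded as complete, or [] if no file exists."""
    path = Path(progress_path)
    if not path.exists():
        return []
    return json.loads(path.read_text())["completed_years"]


def mark_year_complete(progress_path, year):
    """Record a year as complete and return the updated, sorted list."""
    completed = set(load_completed_years(progress_path))
    completed.add(year)
    ordered = sorted(completed)
    Path(progress_path).write_text(json.dumps({"completed_years": ordered}))
    return ordered


def years_to_process(progress_path, year_start, year_end):
    """Years in [year_start, year_end] that are not yet recorded as complete."""
    done = set(load_completed_years(progress_path))
    return [y for y in range(year_start, year_end + 1) if y not in done]
```

With a file that records 1995 and 1996, asking for 1995 through 1998 would return only 1997 and 1998, which mirrors the "Resuming with completed years" message in the output of the next cell.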

In [7]:
yr_start = 1997
yr_end =  1998
gb.get_bronze(var, bucket_nm, year_start = yr_start, year_end = yr_end, append_to = True)

Resuming with completed years: [1995, 1996]
Processing year: 1997
Downloading https://daacdata.apps.nsidc.org/pub/DATASETS/nsidc0719_SWE_Snow_Depth_v1/4km_SWE_Depth_WY1997_v01.nc
<xarray.Dataset> Size: 3GB
Dimensions:   (lat: 621, lon: 1405, time: 365, time_str_len: 11)
Coordinates:
  * lat       (lat) float32 2kB 24.08 24.12 24.17 24.21 ... 49.83 49.88 49.92
  * lon       (lon) float32 6kB -125.0 -125.0 -124.9 ... -66.58 -66.54 -66.5
  * time      (time) datetime64[ns] 3kB 1996-10-01 1996-10-02 ... 1997-09-30
Dimensions without coordinates: time_str_len
Data variables:
    crs       |S1 1B ...
    time_str  (time_str_len, time) |S1 4kB dask.array<chunksize=(11, 365), meta=np.ndarray>
    SWE       (time, lat, lon) float32 1GB dask.array<chunksize=(61, 104, 235), meta=np.ndarray>
    DEPTH     (time, lat, lon) float32 1GB dask.array<chunksize=(61, 104, 235), meta=np.ndarray>
Appended year 1997 to s3://sues-test/swe_all.zarr
______Elapsed time is 18 seconds
Processing year: 1998
Downloa

'swe_all.zarr'

**3B: Meteorological Data to Zarr** <br>
Now repeat the process for the meteorological variables you want. The University of Idaho gridMET data can be downloaded from https://www.climatologylab.org/gridmet.html or via Google Earth Engine. The available variables are described here: https://explorer.earthengine.google.com/#detail/IDAHO_EPSCOR%2FGRIDMET. 

Note that the function skips zarr files that already exist unless the append_to parameter is set to True. This is to protect against unintended overwrites that might corrupt the existing files.  


In [8]:
var_list_wrf = ["pr", "tmmn", "tmmx", "vs", "srad", "rmax", "rmin"]
for var in var_list_wrf:
    print(f"processing {var}")
    gb.get_bronze(var, bucket_nm, year_start = 1995, year_end = 1998)


processing pr
Resuming with completed years: []
Processing year: 1995
Downloading http://www.northwestknowledge.net/metdata/data/pr_1995.nc
<xarray.Dataset> Size: 2GB
Dimensions:               (lon: 1386, lat: 585, day: 365, crs: 1)
Coordinates:
  * lon                   (lon) float64 11kB -124.8 -124.7 ... -67.1 -67.06
  * lat                   (lat) float64 5kB 49.4 49.36 49.32 ... 25.11 25.07
  * day                   (day) datetime64[ns] 3kB 1995-01-01 ... 1995-12-31
  * crs                   (crs) uint16 2B 3
Data variables:
    precipitation_amount  (day, lat, lon) float64 2GB dask.array<chunksize=(365, 98, 231), meta=np.ndarray>
Attributes: (12/19)
    geospatial_bounds_crs:      EPSG:4326
    Conventions:                CF-1.6
    geospatial_bounds:          POLYGON((-124.7666666333333 49.40000000000000...
    geospatial_lat_min:         25.066666666666666
    geospatial_lat_max:         49.40000000000000
    geospatial_lon_min:         -124.7666666333333
    ...               

In [9]:
# reload the data for one of the variables into an xarray to see how it looks 
var = "pr"
zarr_store_url = f's3://{bucket_nm}/{var}_all.zarr'
ds = xr.open_zarr(zarr_store_url, consolidated=True, storage_options={'anon': False}) # set anon to True if public bucket
ds

,Array,Chunk
Bytes,8.83 GiB,62.40 MiB
Shape,"(1461, 585, 1386)","(365, 97, 231)"
Dask graph,210 chunks in 2 graph layers,210 chunks in 2 graph layers
Data type,float64 numpy.ndarray,float64 numpy.ndarray


# Step 4 - Define your Region(s) of Interest 

The remainder of this data pipeline is designed to process data for a given set of regions (geos), creating aggregated data for each specified geometry.
So first we need to define the geometry or geometries of interest. The Frosty Dawgs ML model uses data aggregated at the watershed (HUC10) or sub-watershed (HUC12) level for model training and evaluation. The function below can be used to create the geometry for a specified huc, or for all the subunits within a specified huc.  

In [10]:
# to obtain a geopandas dataframe with the geometry for a specific huc 
huc_id = 17110001
huc_lev = '08' 
geos_single = gg.get_geos(huc_id, huc_lev)
geos_single

Unnamed: 0,huc_id,geometry
0,17110001,"POLYGON ((-121.51825 49.20725, -121.51877 49.2..."


In [11]:
#  or you can obtain the geometries for each of the lower level hucs within a given huc unit 
huc_id = 17110001
huc_lev = '12' 
geos = gg.get_geos(huc_id, huc_lev)
geos.head()

Unnamed: 0,huc_id,geometry
0,171100010101,"POLYGON ((-121.30653 48.9338, -121.30707 48.93..."
1,171100010102,"POLYGON ((-121.41907 48.95988, -121.41941 48.9..."
2,171100010103,"POLYGON ((-121.45712 49.01607, -121.45763 49.0..."
3,171100010104,"POLYGON ((-121.37376 49.03833, -121.3738 49.03..."
4,171100010105,"POLYGON ((-121.3447 49.097, -121.34493 49.0972..."


In [12]:
# sometimes seeing the English language names is helpful 
huc_id = 17110001
huc_lev = '12'
geos_names = gg.get_geos_with_name(huc_id, huc_lev)
geos_names.head()

Unnamed: 0,name,huc_id,huc_name,geometry
0,Indian Creek-Chilliwack River,171100010101,Indian Creek-Chilliwack River,"POLYGON ((-121.30653 48.9338, -121.30707 48.93..."
1,Little Chilliwack River,171100010102,Little Chilliwack River,"POLYGON ((-121.41907 48.95988, -121.41941 48.9..."
2,Bear Creek-Chilliwack River,171100010103,Bear Creek-Chilliwack River,"POLYGON ((-121.45712 49.01607, -121.45763 49.0..."
3,Depot Creek,171100010104,Depot Creek,"POLYGON ((-121.37376 49.03833, -121.3738 49.03..."
4,Paleface Creek,171100010105,Paleface Creek,"POLYGON ((-121.3447 49.097, -121.34493 49.0972..."


Permitted values of huc_lev are '02', '04', '06', '08', '10', and '12'. The huc_lev must be at the same level as the huc_id or finer. In other words, if your starting huc_id is at level '06', then the permissible levels are '06', '08', '10', and '12'. 
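
The rule above can be expressed in a few lines, because the level of a HUC id equals its number of digits (a HUC08 id has 8 digits, a HUC12 id has 12). The following is an illustrative sketch, not part of snowML; the helper name `permitted_levels` is hypothetical, and it restores a leading zero that is lost when an id is stored as an integer.

```python
HUC_LEVELS = ["02", "04", "06", "08", "10", "12"]


def permitted_levels(huc_id):
    """Return the huc_lev values that are valid for the given starting huc_id."""
    digits = str(huc_id)
    if len(digits) % 2 == 1:
        # An odd digit count means a leading zero was dropped (e.g. region 01)
        digits = "0" + digits
    if f"{len(digits):02d}" not in HUC_LEVELS:
        raise ValueError(f"{huc_id} is not a valid HUC id")
    return [lev for lev in HUC_LEVELS if int(lev) >= len(digits)]


print(permitted_levels(17110001))   # ['08', '10', '12']
print(permitted_levels("171100"))   # ['06', '08', '10', '12']
```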

In [13]:
geos_names.explore()

In [14]:
# Optionally, the get_geos function will save the results as a shape file in the specified S3 bucket. 
bucket_nm_shape = BUCKET_DICT["shape-bronze"]
geos = gg.get_geos(huc_id, huc_lev, s3_save=True, bucket_nm=bucket_nm_shape)

File Huc12_in_17110001.geojson successfully uploaded to sues-test


Note that some of the HUC12 regions are in Canada, for which there is no SWE data in the University of Arizona dataset.  We exclude those hucs from our model training set later. 

# Step 5 - Process the Static Variables For the Region of Interest

Next we will gather the static variables for each region: snow types, elevation (DEM), and forest cover. These are saved in the "silver" bucket.

In [15]:
# create a small set of hucs for purposes of this example 
geos = gg.get_geos(1711000101, '12').iloc[0:2, :]
huc_ls = list(geos["huc_id"])
huc_ls

['171100010101', '171100010102']

This next function downloads the snow-type classification from: "https://daacdata.apps.nsidc.org/pub/DATASETS/nsidc0768_global_seasonal_snow_classification_v01/SnowClass_NA_05km_2.50arcmin_2021_v01.0.nc"
It also downloads elevation (DEM) data.
Note: Don't forget that you need to be logged in to your NASA Earthdata account. This step of the pipeline is easiest to manage from a local computer, and only needs to be done once.

In [16]:
var_names = ["snow_types", "dem",  "forest_cover"]
tif_path = "Land_Cover/nlcd_tcc_conus_2021_v2021-4.tif"
for var_name in var_names: 
    silver.process_single_hucs(huc_ls, var_name, region = 17, bucket = BUCKET_DICT["silver"], tif_path = tif_path)

File Snow_Types_17.csv successfully uploaded to sues-test
File Snow_Types_17.csv successfully uploaded to sues-test
The following hucs had errors []
The following hucs were excluded as being in Canada []
File Dem_17.csv successfully uploaded to sues-test
File Dem_17.csv successfully uploaded to sues-test
The following hucs had errors []
The following hucs were excluded as being in Canada []
File Forest_Cover_17.csv successfully uploaded to sues-test
File Forest_Cover_17.csv successfully uploaded to sues-test
The following hucs had errors []
The following hucs were excluded as being in Canada []


In [17]:
cs.combine_no_geo(var_names = ["Snow_Types", "Dem", "Forest_Cover"], bucket = BUCKET_DICT["silver"])

File Static_No_Geo_Region_17.csv successfully uploaded to sues-test


huc_id,Predominant_Snow,Mean Elevation,Mean_Forest_Cover
171100010101,Maritime,1369.745239,49.722009
171100010102,Maritime,1448.802856,58.402845


# Step 6 - Process SWE and WRF Data Into Gold Files 

The next step is to aggregate data to a mean value for each region of interest.  This requires: <br>
1.  Extracting the relevant information from the underlying bronze data set, using rioxarray to apply a geomask; and <br>
2.  Taking the daily mean across all of the latitude/longitude pairs in the relevant geo.
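
Conceptually, the two steps above amount to a masked average per day. The pipeline itself uses rioxarray for the masking; the sketch below only illustrates the idea with plain Python lists, and the function name and toy values are made up.

```python
def daily_masked_mean(daily_grids, mask):
    """Average each day's grid over the cells where mask is True.

    daily_grids: list of 2D grids (lists of rows), one grid per day
    mask: 2D grid of booleans with the same shape, True inside the region
    Cells holding None are treated as missing and skipped.
    """
    means = []
    for grid in daily_grids:
        values = [
            value
            for grid_row, mask_row in zip(grid, mask)
            for value, inside in zip(grid_row, mask_row)
            if inside and value is not None
        ]
        means.append(sum(values) / len(values) if values else None)
    return means


# Toy 2 x 2 grids: the lower-left cell lies outside the region
mask = [[True, True], [False, True]]
day_1 = [[1.0, 2.0], [9.0, 3.0]]
day_2 = [[4.0, None], [9.0, 6.0]]
print(daily_masked_mean([day_1, day_2], mask))  # [2.0, 5.0]
```

The result is one value per day for the region, which matches the shape of the gold files created later in this step (one row per day, one mean value column).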

Taking the daily mean for ~40 years of data is time consuming and computationally intensive.  **This notebook performs the computation over 2 years only; be warned that it still takes a while and requires significant RAM. Be prepared to hang out for a while . . .**

We parallelized the process, so you'll want to set the number of workers to the highest your system will allow. It's also recommended to process the data in batches for a given variable, to avoid excessive I/O reading of the bronze files. 
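
As a rough illustration of what processing regions in parallel with a worker limit looks like, here is a sketch built on the standard library's `concurrent.futures`. It is not the snowML implementation; `process_in_parallel` and `fake_process_huc` are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor


def process_in_parallel(items, worker, max_workers=4):
    """Apply worker to every item using a pool of threads; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, items))


def fake_process_huc(huc_id):
    # Stand-in for the real per-region work (masking, averaging, uploading)
    return f"mean_pr_in_{huc_id}.csv"


hucs = ["171100010101", "171100010102"]
print(process_in_parallel(hucs, fake_process_huc, max_workers=2))
```

Threads suit work that is dominated by waiting on downloads and uploads; for CPU-heavy aggregation a process pool may be a better fit, at the cost of more memory per worker.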

For SWE and WRF data, the rioxarray masking and aggregation is performed as an independent phase in the pipeline, with the results saved as "gold" files. The elevation and snow-type datasets do not vary over time, so those calculations are more tractable and are performed dynamically in the next phase.  

One more thing: processing the SWE and WRF data requires both: <br>
    1. The short name used in the file system (e.g. "pr" for precipitation, "tmmn" for min temperature, etc.) and <br>
    2. The long name of the variable in the data set (e.g. "precipitation_amount", "air_temperature") 

So our first step is to create a dictionary of var names that stores this info. The available variables, long and short names, are described here: https://explorer.earthengine.google.com/#detail/IDAHO_EPSCOR%2FGRIDMET. 


In [18]:
# create var dict
var_list = ["pr", "tmmn", "tmmx", "vs", "srad", "rmax", "rmin", "swe"]
var_names = ["precipitation_amount", "air_temperature", "air_temperature", \
                 "wind_speed", "surface_downwelling_shortwave_flux_in_air",  \
                  "relative_humidity", "relative_humidity", "SWE"]
var_dict = dict(zip(var_list, var_names))

In [19]:
geos

Unnamed: 0,huc_id,geometry
0,171100010101,"POLYGON ((-121.30653 48.9338, -121.30707 48.93..."
1,171100010102,"POLYGON ((-121.41907 48.95988, -121.41941 48.9..."


Now it's time to process the gold files.  

**NOTE** If you were running this in script mode, you would want to run the code below, which takes advantage of some parallel processing logic:

    `for var in var_list: btg.process_geos(geos, var, bucket_dict= bucket_dict, max_wk = 8)`

Adjust the max_wk argument to the maximum your system will allow to speed up the data processing. The parallel processing doesn't play well with Jupyter notebooks though, so we'll use the local function below instead. 


In [20]:
def process_geos(geos, var, var_dict, bucket_dict, overwrite=False, append_start = None):
    """
    Process each geometry in geos sequentially for a single variable.

    Args:
        geos (GeoDataFrame): The geometries (one row per huc) to be processed.
        var (str): The short name of the variable to be processed, used to
            look up the long variable name in var_dict.
        var_dict (dict): Maps short variable names to long variable names.
        bucket_dict (dict): A dictionary containing bucket info.
        overwrite (bool, optional): A flag indicating whether to
            overwrite existing data. Defaults to False.
        append_start (optional): The date on which to begin appending to
            an existing file. Defaults to None.

    Returns:
        None
    """
    crs = geos.crs
    var_name = var_dict.get(var)
    for idx in range(geos.shape[0]):
        row = geos.iloc[idx, :]
        btg.process_row(row, var, idx, bucket_dict, crs, var_name, overwrite, append_start)


In [21]:
# Note that the function will identify where a gold file already exists and skip processing unless overwrite = True 

for var in var_list: 
    process_geos(geos, var, var_dict, BUCKET_DICT)
    

Processing huc 1, huc_id: 171100010101
File mean_pr_in_171100010101.csv successfully uploaded to sues-test
Processing huc 2, huc_id: 171100010102
File mean_pr_in_171100010102.csv successfully uploaded to sues-test
Processing huc 1, huc_id: 171100010101
File mean_tmmn_in_171100010101.csv successfully uploaded to sues-test
Processing huc 2, huc_id: 171100010102
File mean_tmmn_in_171100010102.csv successfully uploaded to sues-test
Processing huc 1, huc_id: 171100010101
File mean_tmmx_in_171100010101.csv successfully uploaded to sues-test
Processing huc 2, huc_id: 171100010102
File mean_tmmx_in_171100010102.csv successfully uploaded to sues-test
Processing huc 1, huc_id: 171100010101
File mean_vs_in_171100010101.csv successfully uploaded to sues-test
Processing huc 2, huc_id: 171100010102
File mean_vs_in_171100010102.csv successfully uploaded to sues-test
Processing huc 1, huc_id: 171100010101
File mean_srad_in_171100010101.csv successfully uploaded to sues-test
Processing huc 2, huc_id: 1

In [23]:
# reload an example to see how it looks 
f = "mean_pr_in_171100010101.csv"
b = BUCKET_DICT["gold"]
df = du.s3_to_df(f, b) # a handy function to retrieve csv file by bucket and file name 
df.head()

Unnamed: 0,day,mean_pr,huc_id
0,1995-01-01,0.0,171100010101
1,1995-01-02,0.0,171100010101
2,1995-01-03,0.0,171100010101
3,1995-01-04,0.0,171100010101
4,1995-01-05,0.0,171100010101


# Step 7 - Putting It All Together into Model Ready Data 

Once you have processed the SWE and meteorological variables into gold files for each relevant huc, the last step is to combine them into one dataset for model training. This next function: <br> 
 - combines the SWE and meteorological data for a given huc
 - adds in elevation
 - adds classification data
 - updates the units to be more human interpretable, such as changing the temperature values from Kelvin to Celsius
 - averages "tmmx" and "tmmn", the max and min daily temperatures, into one daily average, "tair"
 - averages "rmax" and "rmin", the max and min daily relative humidity, into one daily average
 
Please review the data pipeline documentation for a complete discussion of variable naming conventions and units. 
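
To illustrate the temperature and humidity averaging described in the list above, here is a minimal sketch for a single day of data. The input column names follow the gold-file naming seen earlier (for example "mean_pr") and are assumptions for the other variables; the input values are made up, and any further unit changes the pipeline applies are not reproduced here.

```python
KELVIN_OFFSET = 273.15


def combine_daily_row(row):
    """Combine one day's gold values into daily mean temperature and humidity."""
    mean_tair_kelvin = (row["mean_tmmn"] + row["mean_tmmx"]) / 2
    return {
        "day": row["day"],
        "mean_tair": mean_tair_kelvin - KELVIN_OFFSET,  # Kelvin to Celsius
        "mean_hum": (row["mean_rmin"] + row["mean_rmax"]) / 2,
    }


# Made-up values for a single day, for illustration only
example = {
    "day": "1996-02-13",
    "mean_tmmn": 275.15,
    "mean_tmmx": 283.15,
    "mean_rmin": 40.0,
    "mean_rmax": 60.0,
}
print(combine_daily_row(example))
```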

In [24]:
huc_id = 171100010101
var_list = ["pr", "tmmn", "tmmx", "vs", "srad", "rmax", "rmin", "swe"]
df_model_ready = gtm.huc_model(huc_id, var_list = var_list, bucket_dict = BUCKET_DICT)

File model_ready_huc171100010101.csv successfully uploaded to sues-test


In [26]:
df_model_ready.iloc[500:510, :]


day,mean_pr,mean_tair,mean_vs,mean_srad,mean_hum,mean_swe,Mean Elevation,Predominant Snow,Mean Forest Cover,mean_swe_lag_7,mean_swe_lag_30,mean_swe_lag_60
1996-02-13,0.0,6.335714,2.928571,108.385714,0.471071,0.751875,1369.745239,Maritime,49.722009,0.91325,0.65875,0.4015
1996-02-14,0.0,5.907143,2.785714,126.785714,0.451857,0.7565,1369.745239,Maritime,49.722009,0.893625,0.682625,0.405625
1996-02-15,0.0,5.757143,3.1,123.0,0.507714,0.85325,1369.745239,Maritime,49.722009,0.826625,0.7,0.40625
1996-02-16,2.028571,5.442857,3.2,131.757143,0.484357,0.860625,1369.745239,Maritime,49.722009,0.801125,0.721625,0.419375
1996-02-17,40.3,4.228571,4.114286,67.6,0.699357,0.856375,1369.745239,Maritime,49.722009,0.779625,0.7225,0.420625
1996-02-18,22.628571,2.085714,7.857143,61.814286,0.798,0.916875,1369.745239,Maritime,49.722009,0.761875,0.784125,0.421875
1996-02-19,20.457143,0.764286,4.557143,61.985714,0.804286,0.932125,1369.745239,Maritime,49.722009,0.771125,0.812125,0.419625
1996-02-20,9.9,0.357143,8.871429,100.614286,0.795786,0.941,1369.745239,Maritime,49.722009,0.751875,0.997,0.425875
1996-02-21,10.285714,-2.414286,4.171429,74.614286,0.802929,1.081125,1369.745239,Maritime,49.722009,0.7565,0.935875,0.425
1996-02-22,28.742857,-5.064286,7.585714,59.9,0.805214,1.107625,1369.745239,Maritime,49.722009,0.85325,1.011875,0.425625


# Phew! You Did It 
Check out the Data Visualization Notebook to get a feel for all that data! 