## Time Series

* **time series (stations)**: Data is located at named locations, called *stations*. There can be many stations, and usually each station has multiple observations at different times. Each station has a unique identifier. *Examples: weather station data, fixed buoys*.
    * Global attribute `featureType = timeSeries`.
    * The altitude coordinate is optional.
    * Special station variables are recognized by standard names as given below. For backwards compatibility, the given aliases are allowed.
    
        |**standard_name**|**alias**|
        |-----------------|---------|
        |`timeseries_id`|`station_id`|
        |`platform_name`|`station_description`|
        |`surface_altitude`|`station_altitude`|
        |`platform_id`|`station_WMO_id`|
        
    <br>
        
    * [Only one station in the file](https://cfconventions.org/Data/cf-conventions/cf-conventions-1.11/cf-conventions.html#_single_time_series_including_deviations_from_a_nominal_fixed_spatial_location)

    * [All stations have the same time coordinates](https://cfconventions.org/Data/cf-conventions/cf-conventions-1.11/cf-conventions.html#_orthogonal_multidimensional_array_representation_of_time_series)
    * [Each station has the same number of time coordinates but the coordinate values may be different](https://cfconventions.org/Data/cf-conventions/cf-conventions-1.11/cf-conventions.html#_incomplete_multidimensional_array_representation_of_time_series)
        * The `lat`, `lon` and `altitude` coordinates must have the same dimension, called the *`station`* or *`instance`* dimension. All variables with the *`station`* dimension as their outer dimension are *station variables*.
        * The time coordinate must have the form **time(time)** or **time(station, time)**, where the `time` dimension is the *`obs`* or *`sample`* dimension.
        * All data variables must have the form **data(station, time)**.
        * For backwards compatibility, the following aliases are allowed:

            |**standard_name**|**alias**|
            |-----------------|---------|
            |`sample_dimension`|`ragged_row_count`|
            |`feature_dimension`|`ragged_row_index`|

    * Each station has a different number of time coordinates, and we want to keep the file size as small as possible:
        * [We already have all the data and want to optimize reading all the data for one station](https://cfconventions.org/Data/cf-conventions/cf-conventions-1.11/cf-conventions.html#_contiguous_ragged_array_representation_of_time_series)
        * [We want to write the data as it arrives, in any order](https://cfconventions.org/Data/cf-conventions/cf-conventions-1.11/cf-conventions.html#_indexed_ragged_array_representation_of_time_series)
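
The difference between the two ragged layouts in the last item can be illustrated with a minimal sketch in plain Python. All names and values here (`temperature`, `row_size`, `station_index`, the two helper functions) are hypothetical and chosen only for illustration; they are not taken from the example dataset. In the contiguous layout, the observations of each station are stored next to each other and a count per station says how many belong to it. In the indexed layout, observations are stored in arrival order and each one carries the index of its station.

```python
from itertools import accumulate

# --- Contiguous ragged array (hypothetical values) ---
# Observations are grouped by station; row_size holds one count per station.
temperature = [14.1, 14.3, 14.0, 15.2, 15.0, 13.8, 13.9, 13.7, 13.6]
row_size = [3, 2, 4]  # station 0 has 3 values, station 1 has 2, station 2 has 4


def read_contiguous(values, row_size, station):
    """Return the values of one station using the cumulative counts."""
    starts = [0, *accumulate(row_size)]  # [0, 3, 5, 9]
    return values[starts[station]:starts[station + 1]]


# --- Indexed ragged array (hypothetical values) ---
# Observations are stored in arrival order; station_index holds one entry
# per observation, pointing at the station it belongs to.
temperature_arrival = [14.1, 15.2, 14.3, 13.8, 15.0, 13.9, 14.0, 13.7, 13.6]
station_index = [0, 1, 0, 2, 1, 2, 0, 2, 2]


def read_indexed(values, station_index, station):
    """Return the values of one station by scanning the index array."""
    return [v for v, s in zip(values, station_index) if s == station]


print(read_contiguous(temperature, row_size, 1))
print(read_indexed(temperature_arrival, station_index, 1))
```

Reading one station from the contiguous layout is a single slice, which is why it suits data that is already complete. The indexed layout needs a scan over all observations, but new observations can simply be appended in whatever order they arrive.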


In [None]:
import os
from glob import glob
import numpy as np
import pandas as pd
import xarray as xr
import matplotlib.pyplot as plt

In [None]:
# Change into the data directory; adjust this path to your local copy of the data.
os.chdir('/Users/icdc/Documents/NFDI/Kemeng/cfbook/src/data')
os.getcwd()

'/Users/icdc/Documents/NFDI/Kemeng/cfbook/src/data'

Example Dataset: [Kelp Forest Monitoring Sea Temperature](https://coastwatch.pfeg.noaa.gov/erddap/tabledap/erdCinpKfmT.html)

In [None]:
# Collect all time series example files; glob returns them in arbitrary order.
ts_files = glob(os.path.join(os.getcwd(), "dsg_timeSeries", "*.nc"))
ts_files

['/Users/icdc/Documents/NFDI/Kemeng/cfbook/src/data/dsg_timeSeries/KFMTemperature_Anacapa_Cathedral_Cove.nc',
 '/Users/icdc/Documents/NFDI/Kemeng/cfbook/src/data/dsg_timeSeries/KFMTemperature_Anacapa_Black_Sea_Bass_Reef.nc']

In [None]:
# Open the first station file and print the station identifier stored in "ID".
ds_ts1 = xr.open_dataset(ts_files[0])

print(ds_ts1["ID"].data)
ds_ts1

b'Anacapa (Cathedral Cove)'


In [None]:
# Open the second station file and print the station identifier stored in "ID".
ds_ts2 = xr.open_dataset(ts_files[1])

print(ds_ts2["ID"].data)
ds_ts2

b'Anacapa (Black Sea Bass Reef)'
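
To see which variables of a dataset play the special station roles from the table at the top of this section, one option is to scan the variable attributes for the standard names and their aliases. The following is a minimal sketch, not part of any library: the helper `find_station_variables` and the dictionary `example_attrs` are hypothetical. It works on a plain mapping of variable name to attribute dictionary so that it runs without any data file. As an assumption beyond the text above, it also accepts the role in a `cf_role` attribute, which is how CF itself marks `timeseries_id`.

```python
# Standard names from the table above, mapped to their backwards-compatible aliases.
STATION_NAMES = {
    "timeseries_id": "station_id",
    "platform_name": "station_description",
    "surface_altitude": "station_altitude",
    "platform_id": "station_WMO_id",
}


def find_station_variables(var_attrs):
    """Map each recognized station role to the first variable that declares it.

    var_attrs: mapping of variable name -> attribute dictionary.
    """
    found = {}
    for var_name, attrs in var_attrs.items():
        # Assumption: the role may be given as `standard_name` or as `cf_role`.
        labels = {attrs.get("standard_name"), attrs.get("cf_role")}
        for name, alias in STATION_NAMES.items():
            if name not in found and labels & {name, alias}:
                found[name] = var_name
    return found


# Hypothetical attributes, made up for illustration only.
example_attrs = {
    "ID": {"cf_role": "timeseries_id"},
    "station_name": {"standard_name": "station_description"},
    "alt": {"standard_name": "surface_altitude"},
    "Temperature": {"standard_name": "sea_water_temperature"},
}

print(find_station_variables(example_attrs))
```

For a dataset opened with xarray, such as `ds_ts1`, the input mapping could be built with `{name: var.attrs for name, var in ds_ts1.variables.items()}`.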
