# Convert from instrument data to raw .cdf

In a terminal, change to the `data/rbrCSF20SC201` directory and type:

`runrskcsv2cdf.py gatts_CSF20SC2.txt csf20sc201_config.yaml`

This should generate the file
`CSF20SC201pt-raw.cdf`

Take a look at the output of the run script.

In [None]:
%matplotlib notebook
import xarray as xr
import matplotlib.pyplot as plt
import pandas as pd
import requests
ds = xr.load_dataset('data/rbrCSF20SC201/CSF20SC201pt-raw.cdf')

Type `ds` to see the data.

In [None]:
ds

Plot the data. Note that this plots the raw data with time on the y axis and burst number on the x axis.

In [None]:
plt.figure()
ds.P_1.plot()

In [None]:
plt.close()
plt.figure()
ds.T_28.plot()

Note that there is a lot of out-of-water data we need to clip, but this raw file preserves all the data.

# Get atmospheric pressure data

This downloads hourly air pressure data from the NOAA Tides & Currents API using the `requests` module and writes it to a text file.


In [None]:
# we use plus signs to concatenate a long string
r = requests.get("https://api.tidesandcurrents.noaa.gov/api/prod/datagetter?" +
                 "product=air_pressure&application=NOS.COOPS.TAC.WL&" +
                 "begin_date=20200701&end_date=20200901&station=9414763" +
                 "&time_zone=GMT&units=metric&interval=h&format=csv")
# stop here if the server returned an HTTP error instead of data
r.raise_for_status()

with open('data/rbrCSF20SC201/atmos_pressure.txt', 'w') as f:
    f.write(r.text)
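
The query string above can also be assembled from a dictionary, which makes it easier to change the station or date range. This is a minimal sketch using only the standard library; the names `BASE_URL` and `params` are illustrative and not part of the original notebook.

```python
from urllib.parse import urlencode

# Endpoint and parameter values copied from the request above
BASE_URL = "https://api.tidesandcurrents.noaa.gov/api/prod/datagetter"
params = {
    "product": "air_pressure",
    "application": "NOS.COOPS.TAC.WL",
    "begin_date": "20200701",
    "end_date": "20200901",
    "station": "9414763",
    "time_zone": "GMT",
    "units": "metric",
    "interval": "h",
    "format": "csv",
}

# urlencode joins the key=value pairs with "&" in dictionary order
url = BASE_URL + "?" + urlencode(params)
print(url)
```

The same dictionary can be passed directly as `requests.get(BASE_URL, params=params)`, which builds the query string for you.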

In [None]:
df = pd.read_csv('data/rbrCSF20SC201/atmos_pressure.txt')

Take a look at the data

In [None]:
df

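Now clean up the DataFrame so it matches what stglib expects: tidy the column names, convert units, and convert to an xarray Dataset.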

In [None]:
# need to strip the column names because there are spaces after the commas
df.columns = df.columns.str.strip()
# rename Date Time to time and Pressure to atmpres to match stglib expectations
df = df.rename(columns={'Date Time': 'time', 'Pressure': 'atmpres'})
# pressure is in mbar, so convert to dbar
df['atmpres'] = df['atmpres'] * 0.01
# set time to be the index
df = df.set_index('time')

# and convert to an xarray Dataset
atm = df.to_xarray()
# ensure time is stored as a datetime
atm['time'] = pd.DatetimeIndex(atm['time'])
# drop unneeded variables (drop_vars replaces the deprecated Dataset.drop)
atm = atm.drop_vars(['X', 'N', 'R'])
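
To check the cleanup steps without downloading anything, they can be run on a couple of lines of text in the same layout. This is a sketch only: the two data rows are made-up illustrative values, and the header is assumed to match the downloaded file (note the spaces after the commas).

```python
import io
import pandas as pd

# Made-up sample in the assumed layout of the downloaded CSV
sample = (
    "Date Time, Pressure, X, N, R\n"
    "2020-07-01 00:00,1013.2,0,0,0\n"
    "2020-07-01 01:00,1013.0,0,0,0\n"
)

df_sample = pd.read_csv(io.StringIO(sample))
# Without stripping, the columns would be ' Pressure', ' X', ...
df_sample.columns = df_sample.columns.str.strip()
df_sample = df_sample.rename(columns={"Date Time": "time", "Pressure": "atmpres"})
df_sample["time"] = pd.to_datetime(df_sample["time"])
# 1 bar = 1000 mbar = 10 dbar, so 1 mbar = 0.01 dbar
df_sample["atmpres"] = df_sample["atmpres"] * 0.01
df_sample = df_sample.set_index("time").drop(columns=["X", "N", "R"])
print(df_sample)
```

A sea-level pressure near 1013 mbar should come out near 10.13 dbar, which is a quick sanity check on the conversion factor.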

Look at our new xarray Dataset

In [None]:
atm

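Because our atmospheric pressure data was collected hourly, but our water-level data was collected every 5 minutes, we need to reindex the atmospheric data onto the water-level time base. Each water-level time gets the nearest hourly value, provided it is within 60 minutes.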

In [None]:
atm = atm.reindex_like(ds, method='nearest', tolerance='60min')
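
The effect of nearest-neighbor reindexing with a tolerance can be seen on a small example. This sketch uses a pandas Series rather than the xarray Dataset above, with made-up values and a 20-minute target spacing chosen only to keep the output short.

```python
import pandas as pd

# Three made-up hourly values at 00:00, 01:00 and 02:00
hourly = pd.Series(
    [1013.0, 1014.0, 1015.0],
    index=pd.date_range("2020-07-01 00:00", periods=3, freq="60min"),
)

# Finer target time base that also runs past the end of the hourly data
target = pd.date_range("2020-07-01 00:00", "2020-07-01 04:00", freq="20min")

# Each target time takes the nearest hourly value; target times more than
# 60 minutes from any hourly value become NaN
filled = hourly.reindex(target, method="nearest", tolerance=pd.Timedelta("60min"))
print(filled)
```

The tolerance is what keeps stale values from being carried indefinitely past the end of the atmospheric record.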

Plot the difference between the burst-mean pressure and the atmospheric pressure to figure out our offset.

In [None]:
plt.close()
plt.figure()
(ds.P_1.mean(dim='sample')-atm.atmpres).plot()

In [None]:
# looks like we need an offset
plt.close()
plt.figure()

offset = 0.12
(ds.P_1.mean(dim='sample')-atm.atmpres-offset).plot()

# zoom in to see if the offset looks right
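
The offset above was chosen by eye. As a cross-check, it could also be estimated numerically from a period when the instrument is known to be out of the water. The sketch below uses synthetic data and assumes the sensor reads atmospheric pressure plus a constant offset while in air; the names `in_air` and `offset_est` are hypothetical and not part of stglib.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Synthetic atmospheric pressure in dbar (illustrative values only)
atmpres = 10.13 + 0.01 * np.sin(np.linspace(0, 6, n))

# Synthetic burst-mean sensor pressure: atmosphere + constant offset + noise,
# plus 1.5 dbar of water once the instrument is deployed
true_offset = 0.12
water = np.where(np.arange(n) >= 50, 1.5, 0.0)
p_mean = atmpres + true_offset + water + rng.normal(0, 0.001, n)

# Assumption: the first 50 bursts were recorded in air
in_air = np.arange(n) < 50

# The median is less sensitive to stray spikes than the mean
offset_est = float(np.median((p_mean - atmpres)[in_air]))
print(round(offset_est, 3))
```

With real data, the in-air period would come from the deployment log or from inspecting the plot, and the estimate should still be checked visually as in the cell above.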

Attach our offset value as an attribute on the xarray Dataset

In [None]:
atm.atmpres.attrs['offset'] = offset
# display the Dataset to confirm the attribute was attached
atm

And write to netCDF

In [None]:
atm.to_netcdf('data/rbrCSF20SC201/atmpres.cdf')

Now we have an atmospheric pressure compensation file and the raw data file, so we are ready to process to the final .nc file with atmospheric compensation applied. In a terminal, from the `data/rbrCSF20SC201` directory, type:

`runrskcdf2nc.py CSF20SC201pt-raw.cdf --atmpres=atmpres.cdf`

In [None]:
ds = xr.load_dataset('data/rbrCSF20SC201/CSF20SC201ptb-cal.nc')

In [None]:
ds

Looks great!

In [None]:
plt.close()
plt.figure()
ds.P_1ac.mean(dim='sample').plot()
plt.show()

# During the meeting: computing wave statistics