# Read and Match-to-Trawl CTD Data

This notebook uses a set of functions in `readCtd.py` that read CTD data in one of two formats:
- SeaBird .cnv files produced by Seaterm: `cnv2table`
- Post-processed QC data files produced by PMEL (contact Shaun Bell/Phyllis Stabeno for more information): `csv2table`

This also requires the trawl event data, formatted as described in `MarinovichEventData.ipynb` in the `pCod` directory, which contains the SQL code for pulling the headrope data from clamsbase. Everything in `readCtd.py` is written to use the header expected in that file.

Each of these two read functions returns:
- `dfCtd`: Complete dataframe of temperature and salinity by depth bin, with depths converted by the `seawater` package
- `dfCtdKey`: Reference dataframe containing the time and location of each cast

These outputs can be paired with exports from Clams2ABL/Clamsbase2 of trawl times, locations, and headrope statistics. `eventTemps` then uses them to produce a 'matched' dataframe containing summary statistics of temperature conditions for each trawl event.
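
As a rough illustration of the kind of per-cast summary statistics involved, the sketch below computes surface, bottom, and whole-column mean temperatures for a single cast. It assumes a cast table with `depth` and `temp` columns. The 5 m bottom window follows the expression that `readCtd` appears to use; the surface window, the column mean, and the helper name `summarize_cast` are assumptions for illustration, not the actual `eventTemps` logic.

```python
import numpy as np
import pandas as pd

# Hypothetical single cast: 1 m depth bins with one temperature per bin.
cast = pd.DataFrame({
    "depth": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
    "temp":  [7.0, 7.0, 6.5, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0, np.nan],
})

def summarize_cast(cast, window=5.0):
    """Return surface, bottom, and whole-column mean temperature for one cast.

    `window` is the thickness (m) of the surface and bottom layers; NaN bins
    are ignored by np.nanmean.
    """
    surf = cast.temp[cast.depth <= window]
    bot = cast.temp[cast.depth > cast.depth.max() - window]
    return {
        "tempSurf": float(np.nanmean(surf)),
        "tempBot": float(np.nanmean(bot)),
        "tempCol": float(np.nanmean(cast.temp)),
    }

stats = summarize_cast(cast)
print(stats)
```

Using `np.nanmean` keeps a single missing bin from blanking the whole statistic, but a layer with no valid bins at all still yields NaN.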

### Comments on missing data from final table

As of 06/11/20, the following event-cast pairs are missing one or more temperature values:
- 2017
    - 128: CTD shallower than average HR depth (cast 23)
    - 689: CTD shallower than average HR depth (cast 125)
    - 708: CTD shallower than haul (cast 128)
    - 736: CTD failed after 16 m (cast 131)
- 2019
    - 157: CTD shallower than haul (cast 49)
    - 169: CTD shallower than haul (cast 52)
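
A list like the one above could be regenerated from the matched table with a check along these lines. This is a minimal sketch: the column names are taken from the `eventTemps` output shown in this notebook, but the rows below are made-up placeholder values, not survey data.

```python
import numpy as np
import pandas as pd

# Hypothetical matched table; the temperature values are placeholders.
dfMatched = pd.DataFrame({
    "Event_id": [112, 124, 128],
    "ctdCast": [42, 43, 23],
    "tempCol": [2.57, 4.07, 3.10],
    "tempBot": [0.56, 0.65, 0.70],
    "tempSurf": [7.04, 6.95, 6.80],
    "tempOpen": [0.42, 0.68, np.nan],
    "tempRange": [0.28, 2.91, np.nan],
})

temp_cols = ["tempCol", "tempBot", "tempSurf", "tempOpen", "tempRange"]

# Keep only the rows where at least one temperature statistic is missing.
missing = dfMatched[dfMatched[temp_cols].isna().any(axis=1)]
missing_pairs = list(zip(missing.Event_id, missing.ctdCast))
print(missing_pairs)
```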

Below are examples of calls to these functions. First, we need to import `readCtd` and pandas.

In [1]:
import readCtd
import pandas as pd

### csv2table

This function accepts either a directory or single file path, and will merge all .csv files found.
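
The directory-or-file behavior can be sketched as follows. This is an illustrative stand-in, not the code in `readCtd.py`: the helper name `read_csv_tree` and the added `source_file` column are hypothetical, and the real function also builds the cast key table.

```python
from pathlib import Path

import pandas as pd

def read_csv_tree(path, pattern="*.csv"):
    """Read one CSV file, or every matching CSV in a directory, into one table."""
    p = Path(path)
    # A directory is expanded to its matching files; anything else is
    # treated as a single file path.
    files = sorted(p.glob(pattern)) if p.is_dir() else [p]
    if not files:
        raise FileNotFoundError(f"no files matching {pattern} in {p}")
    # Tag each row with its source file so casts stay distinguishable.
    frames = [pd.read_csv(f).assign(source_file=f.name) for f in files]
    return pd.concat(frames, ignore_index=True)
```

Sorting the file list makes the merged row order reproducible across runs and operating systems.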

In [3]:
# Trawl event data for all surveys (read once, filtered per survey below).
dfTrawls = pd.read_csv('../pCod/AIESMarinovichEventData.csv')
# 2017 survey: read the QC'd casts and match them to the 2017 trawl events.
dfCtd, dfCtdKey = readCtd.csv2table('D:\\AIESII\\OceanStarr_201701_AIERP\\ctd\\qc\\')
dfMatched17 = readCtd.eventTemps(dfCtd, dfCtdKey, dfTrawls[dfTrawls.SURVEY == 201701])
# 2019 survey: same steps.
dfCtd, dfCtdKey = readCtd.csv2table('D:\\AIESII\\OS201901\\ctd\\qc\\')
dfMatched19 = readCtd.eventTemps(dfCtd, dfCtdKey, dfTrawls[dfTrawls.SURVEY == 201901])
# Stack both survey years into one table and save it.
dfMatchedAll = pd.concat([dfMatched17, dfMatched19])
dfMatchedAll.to_csv('TrawlEventTemperatures.csv', index=False)
dfMatchedAll.head()


### cnv2table

This function accepts either a directory or single file path, and will merge all .cnv files found. An example is below.

In [2]:
dfCtd, dfCtdKey = readCtd.cnv2table('D:\\AIESII\\OS201901\\ctd\\')
dfTrawls = pd.read_csv('../pCod/AIESMarinovichEventData.csv')
dfMatched19 = readCtd.eventTemps(dfCtd, dfCtdKey, dfTrawls[dfTrawls.SURVEY == 201901])
dfMatched19.head()


| | Event_id | ctdCast | ctdDist | tempCol | tempBot | tempSurf | tempOpen | tempRange |
|---|---|---|---|---|---|---|---|---|
| 0 | 112 | 42.0 | 18.44001 | 2.566525 | 0.5612 | 7.036967 | 0.421943 | 0.28208 |
| 1 | 124 | 43.0 | 1.661649 | 4.072509 | 0.652767 | 6.952133 | 0.675525 | 2.909647 |
| 2 | 125 | 44.0 | 4.7626 | 4.06779 | 0.548717 | 6.179867 | 0.555829 | 2.02108 |
| 3 | 131 | 45.0 | 17.82901 | 3.565437 | 0.459733 | 5.616667 | 0.4593 | 0.4595 |
| 4 | 137 | 45.0 | 11.999223 | 3.565437 | 0.459733 | 5.616667 | 0.640125 | 1.3437 |
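
The `ctdCast` and `ctdDist` columns pair each event with a cast and a distance to it. One way such a pairing could be computed is a nearest-cast search by great-circle distance, sketched below. Whether `eventTemps` matches on space, time, or both, and the units of `ctdDist`, are not stated in this notebook; the haversine distance in km, the helper names, and the positions used here are all assumptions.

```python
import numpy as np
import pandas as pd

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between points given in decimal degrees."""
    r = 6371.0  # mean Earth radius, km
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * r * np.arcsin(np.sqrt(a))

# Hypothetical cast key table: one row per cast with its position.
casts = pd.DataFrame({
    "cast": [42, 43],
    "lat": [71.0, 72.0],
    "lon": [-160.0, -158.0],
})

def nearest_cast(trawl_lat, trawl_lon, casts):
    """Return (cast id, distance in km) of the cast closest to a trawl position."""
    dist = haversine_km(trawl_lat, trawl_lon,
                        casts.lat.to_numpy(), casts.lon.to_numpy())
    i = int(np.argmin(dist))
    return int(casts.cast.iloc[i]), float(dist[i])

cast_id, dist_km = nearest_cast(71.1, -160.0, casts)
print(cast_id, round(dist_km, 2))
```

A purely spatial match like this can pair a trawl with a cast taken days earlier, so a time window may also be worth applying.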
