# Data Setup
Range Driver was created to help users better understand and analyze the data collected during acoustic range tests. Critically, this data needs to be made available to range_driver in files that follow specific types and formats.


## Organization
We recommend placing all the data files relevant for a specific range test in a single `data` folder. This folder can contain sub-directories to further organize the data. For example, we use the following folder structure for one of our range tests.
```
data
 |
 +-- detections.csv
 +-- tag_metadata.xlsx
 +-- EnvironmentalData
 |    |
 |    +-- salinity.nc
 |    +-- temperature.nc
 |    +-- turbidity.nc 
```

## Data
There are three main kinds of data users should provide to range_driver.   

|                                   |                         |
|-----------------------------------|:------------------------|
| [Detection event data](#event)    | The tabular dataset (a CSV file) containing event-specific information <br/> (datetime, receiver, transmitter, etc.) from the range test.|
| [Metadata](#meta)                 | Metadata specific to the range test of interest, including vendor tag and <br/> deployment information. |
| [Environmental data](#env)        | Environmental data sets for contextualizing and analyzing differences in <br/> detection performance. |

### Detection Event Data<a id="event"></a>

Detection event data is stored in a CSV file that is fed into range_driver. The location of this file is specified using the `detections_csv` key which resides under the `reader` key in the configuration file. As discussed in the [configuration tutorial](), there are currently two readers supported by range_driver. However, custom readers can be specified. 

Both of the current readers expect tabular detection event data, where each row in the file corresponds to a single detection recorded by a deployed receiver. Depending on the reader you are using, slightly different data is required in these files.

#### `otn`
When using the `otn` reader, the following columns are required in the detections file:
* `Date and Time (UTC)` &mdash; The date and time a detection was recorded (e.g. `2016-03-09 13:25:33`) 
* `Receiver` &mdash; The receiver string specifying both the type of receiver and its ID (e.g. `VR4-UWM-250473`)
* `Transmitter` &mdash; The transmitter string specifying both the type of transmitter and its ID (e.g. `A69-1601-31341`)

#### `nsog`
When using the `nsog` reader, many more columns are required because much of the metadata is contained within the detections file itself:
* `datecollected` &mdash; The date and time a detection was recorded (e.g. `2016-03-09 13:25:33`) 
* `rcvrcatnumber` &mdash; The catalog number for the receiver (e.g. `VR4-UWM`).
* `collectornumber` &mdash; The receiver's ID number (e.g. `250473`).
* `catalognumber` &mdash; The transmitter string specifying the type of transmitter and its ID (e.g. `A69-1601-31341`).
* `station` &mdash; The station where the receiver was deployed (e.g. `NSOG011P`).
* `latitude` &mdash; The latitude of the receiver (e.g. `48.333210`).
* `longitude` &mdash; The longitude of the receiver (e.g. `-123.979820`).
* `bottom_depth` &mdash; The bottom depth (in meters) where the receiver was placed.
* `receiver_depth` &mdash; The depth (in meters) of the receiver.
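
Before running range_driver, it can be useful to confirm that a detections file has the columns its reader expects. The following is a minimal sketch using only the Python standard library; `REQUIRED_COLUMNS` and `missing_columns` are hypothetical helpers written for this example (they are not part of range_driver), and the column names are copied from the lists above.

```python
import csv
import io

# Required columns per reader, copied from the lists above.
REQUIRED_COLUMNS = {
    "otn": ["Date and Time (UTC)", "Receiver", "Transmitter"],
    "nsog": [
        "datecollected", "rcvrcatnumber", "collectornumber", "catalognumber",
        "station", "latitude", "longitude", "bottom_depth", "receiver_depth",
    ],
}


def missing_columns(csv_file, reader):
    """Return the required columns absent from the header row of an open CSV file."""
    header = next(csv.reader(csv_file), [])
    present = {name.strip() for name in header}
    return [col for col in REQUIRED_COLUMNS[reader] if col not in present]


# Example with an in-memory file; for a real file use open(path, newline="").
sample = io.StringIO(
    "Date and Time (UTC),Receiver,Transmitter\n"
    "2016-03-09 13:25:33,VR4-UWM-250473,A69-1601-31341\n"
)
print(missing_columns(sample, "otn"))  # []
```

An empty list means every required column was found. The check only looks at the header row; it does not validate the values in each column.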


### Metadata<a id="meta"></a>
There are two kinds of metadata that can be provided to range_driver. Which metadata you need to provide depends on the reader you are using (as specified in your configuration file). 

#### Tag Metadata
Both the `otn` and `nsog` readers require users to provide tag metadata using the `vender_tag_specs` key in the configuration file. This Excel file provides information about the transmitter tags (e.g. power levels and expected detection delays).

The `vender_tag_specs` Excel file requires the following columns:
* `Tag Family` &mdash; The family of tags the transmitter belongs to.
* `ID Code` &mdash; The transmitter's ID number.
* `VUE Tag ID\n(Freq-Space-ID)` &mdash; The transmitter's VUE tag ID. Formatted as Freq-Space-ID.
* `Power\n(L/H)` &mdash; The transmitter's power level.
* `Min \n(sec)` &mdash; The transmitter's minimum delay.
* `Max \n(sec)` &mdash; The transmitter's maximum delay.

#### OTN Metadata
Only the `otn` reader requires users to provide additional metadata using the `otn_metadata` key in the configuration file. This Excel file provides deployment information specific to the range test being analyzed. The columns required in this spreadsheet are: 

* `INS_SERIAL_NO` &mdash; The deployed receiver's serial number (aka ID).
* `DEPLOY_LAT` &mdash; The latitude at which the receiver was deployed.
* `DEPLOY_LONG` &mdash; The longitude at which the receiver was deployed.
* `INSTRUMENT_DEPTH` &mdash; The depth at which the receiver was deployed.
* `BOTTOM_DEPTH` &mdash; The bottom depth at the location where the receiver was deployed.

### Environmental Data<a id="env"></a>

Finally, environmental data can either be provided in NetCDF format or, for specific environmental variables, be fetched automatically using range_driver and kadlu. This environmental data is interpolated to approximate environmental conditions at the time and location at which a detection (or set of detections) is received.
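
To make the idea of interpolating to a detection concrete, here is a minimal sketch of linear interpolation along the time axis only. This is an illustration under stated assumptions, not range_driver's implementation: the interpolation scheme range_driver actually uses is not described here, the function `interpolate_in_time` is hypothetical, and the temperature samples are made up.

```python
from bisect import bisect_left
from datetime import datetime


def interpolate_in_time(times, values, when):
    """Linearly interpolate `values` (sampled at sorted `times`) to the datetime `when`."""
    if not times[0] <= when <= times[-1]:
        raise ValueError("detection time lies outside the environmental record")
    i = bisect_left(times, when)
    if times[i] == when:
        return values[i]
    t0, t1 = times[i - 1], times[i]
    weight = (when - t0) / (t1 - t0)  # fraction of the way from t0 to t1
    return values[i - 1] + weight * (values[i] - values[i - 1])


# Hypothetical hourly temperature samples (degrees C) bracketing one detection.
times = [datetime(2016, 3, 9, 13), datetime(2016, 3, 9, 14)]
temperature = [8.0, 9.0]
print(interpolate_in_time(times, temperature, datetime(2016, 3, 9, 13, 30)))  # 8.5
```

Interpolating in space (and optionally depth) follows the same idea along additional axes.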

#### NetCDF
Often, users will want to incorporate environmental data from sources that aren't supported by kadlu. These custom environmental datasets should be provided as a NetCDF file. To provide environmental data in this way, make sure the NetCDF file contains the environmental variable of interest along with `lat`, `lon`, and `time` data. Optionally, `depth` can also be included. `lat`, `lon`, `time`, and `depth` (if included) are the axes used for interpolation. 

The `file_map` configuration key is used to specify which NetCDF files contain which environmental variables. The key (e.g. `uVelocity`) should correspond _exactly_ to the variable name within the NetCDF file. An example of how environmental data from NetCDFs can be specified is below:

```yaml
file_map:
  data_dir: '{repo_path}/data/NSOG_Jan2018/SalishSeaCast'
  uVelocity: uVelocity_2018-01-01_2018-02-01.nc
  vVelocity: vVelocity_2018-01-01_2018-02-01.nc
  wVelocity: wVelocity_2018-01-01_2018-02-01.nc
  salinity: salinity_2018-01-01_2018-02-01.nc
  temperature: temperature_2018-01-01_2018-02-01.nc
  fraserTurbidity: fraserTurbidity_2018-01-01_2018-02-01.nc
```
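
A quick way to catch typos in `file_map` is to expand it into full paths and check that each file exists. The sketch below assumes that `{repo_path}` is a placeholder filled in with the repository location and that file names are relative to `data_dir`; how range_driver resolves these internally is not shown here. `resolve_file_map` and the `/home/user/range_driver` path are hypothetical.

```python
from pathlib import Path


def resolve_file_map(file_map, repo_path):
    """Map each environmental variable name to the full path of its NetCDF file."""
    data_dir = Path(file_map["data_dir"].format(repo_path=repo_path))
    return {
        variable: data_dir / filename
        for variable, filename in file_map.items()
        if variable != "data_dir"
    }


# A subset of the configuration above, written as a Python dict.
file_map = {
    "data_dir": "{repo_path}/data/NSOG_Jan2018/SalishSeaCast",
    "salinity": "salinity_2018-01-01_2018-02-01.nc",
    "temperature": "temperature_2018-01-01_2018-02-01.nc",
}
paths = resolve_file_map(file_map, repo_path="/home/user/range_driver")
for variable, path in paths.items():
    # Report whether each expected NetCDF file is actually on disk.
    print(variable, path.exists())
```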

#### Fetching w/ kadlu
Certain environmental variables can be fetched using kadlu. To determine which variables (and data sources) are available, check out the [kadlu documentation](https://docs.meridian.cs.dal.ca/kadlu/introduction.html) or `print(kadlu.source_map)`.

If you identify environmental variables to fetch from kadlu, use the `data` configuration key to specify each variable and the source it should be fetched from. For example:

```yaml
data:
  sources:
    load_wavedir: 'era5'
    load_waveheight: 'era5'
    load_waveperiod: 'era5'
    load_wind_uv: 'era5'
    load_wind_u: 'era5'
    load_wind_v: 'era5'
```