# MCM LTER Aquarius Data Processing

This script was created by Jared Collins on 6/26/2025 for the MCM LTER stream team as an easy way to merge and organize data downloaded from Aquarius. The original processing code was written in R. The hope is that this Python script will be easier to use and more intuitive, allowing users to convert data files into publishable form. It may need to be edited in the future to reflect changes in data collection strategies.

## Importing Packages + Setting Directory

The computer used for processing must have the required packages installed, and the data files must be in the expected location under their expected names. **The script expects the data files from Aquarius to be stored in the Downloads folder with their original file names unchanged. Please ensure that this is the case.**

In [4]:
import pandas as pd
from pathlib import Path

In [9]:
home = Path.home()
downloads_path = home / "Downloads"
downloads_path = downloads_path.as_posix()
print("Downloads folder:", downloads_path)

Downloads folder: C:/Users/jared/Downloads


## Specifying Data (EDIT THIS SECTION ONLY)

For this script to work properly, you need to enter the following information about the files you want to publish:
- Stream ID: Acceptable values are B3, B5, C1, F1, F2, F3, F5, F6, F7, F8, F9, F10, H1, M1, M2, Onyx_LWRT, Onyx_Vanda
- Season: The season code with an A or B suffix for the portion of the data you are working on (for example, `2425A`); this only affects the name of the saved file
- Date: The first day of the data file, written in "YYYYMMDD" format

**Edit the following cell with the information above. If done correctly, this is the only cell you will have to edit.**

In [10]:
streamid = 'F1'
season = '2425A'
date = '20241203'
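Because a typo in any of these three values only surfaces later as a missing-file or lookup error, it can help to check them right away. The following is a minimal sketch, not part of the original workflow: the function name `check_inputs` is hypothetical, the set of IDs is copied from the list above, and the season pattern (four digits followed by A or B) is an assumption based on the example value `2425A`.

```python
import re
from datetime import datetime

# Stream IDs copied from the list of acceptable values above.
VALID_STREAM_IDS = {
    'B3', 'B5', 'C1', 'F1', 'F2', 'F3', 'F5', 'F6', 'F7', 'F8', 'F9',
    'F10', 'H1', 'M1', 'M2', 'Onyx_LWRT', 'Onyx_Vanda'}


def check_inputs(streamid, season, date):
    """Hypothetical helper: raise ValueError if an input looks wrong."""
    if streamid not in VALID_STREAM_IDS:
        raise ValueError(f"Unknown Stream ID: {streamid!r}")
    # Assumed pattern: four digits plus A or B, as in '2425A'.
    if not re.fullmatch(r"\d{4}[AB]", season):
        raise ValueError(f"Season should look like '2425A', got {season!r}")
    # Require exactly eight digits, then let strptime reject impossible dates.
    if not re.fullmatch(r"\d{8}", date):
        raise ValueError(f"Date should be YYYYMMDD, got {date!r}")
    try:
        datetime.strptime(date, "%Y%m%d")
    except ValueError:
        raise ValueError(f"Date is not a real calendar day: {date!r}")


# Example values matching the cell above; passes silently when all is well.
check_inputs('F1', '2425A', '20241203')
```

Running this cell directly after the edited cell gives a readable error message before any files are read.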

## Organizing Data and Publishing CSV File

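The following cell contains two dictionaries: one that the script uses to look up the gage ID for the stream being worked on, and one that translates Aquarius grade codes into quality labels. These should rarely need to be changed. Note that H1 is listed above as an acceptable Stream ID but has no entry in `streamid_map`, so using it raises a `KeyError` unless an entry is added.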

In [13]:
streamid_map = {
    'B3': 'lawson_b3',
    'B5': 'bohner_b5',
    'C1': 'common_c1',
    'F1': 'canada_f1',
    'F2': 'huey_f2',
    'F3': 'lostseal_f3',
    'F5': 'aiken_f5',
    'F6': 'vonguerard_f6',
    'F7': 'harnish_f7',
    'F8': 'crescent_f8',
    'F9': 'green_f9',
    'F10': 'delta_f10',
    'M1': 'adams_m1',
    'M2': 'miers_m2',
    'Onyx_LWRT': 'onyx_lwright',
    'Onyx_Vanda': 'onyx_vnda'}

grade_map = {
    '1': 'unverified',
    '11': 'poor',
    '21': 'fair',
    '31': 'good',
    '41': 'excellent'}
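To illustrate how `grade_map` is applied later in the script, here is a small sketch using made-up grade values. The code 99 is invented for the example and is not a known Aquarius grade; it shows that any code missing from the dictionary becomes a missing value rather than raising an error.

```python
import pandas as pd

grade_map = {
    '1': 'unverified',
    '11': 'poor',
    '21': 'fair',
    '31': 'good',
    '41': 'excellent'}

# Made-up grade codes; 99 is deliberately absent from grade_map.
grades = pd.Series([31, 11, 41, 99])

# Same pattern the script uses: convert to text, then look up each code.
labels = grades.astype(str).map(grade_map)

# Codes without a dictionary entry come back as NaN, so list them explicitly.
unmapped = grades[labels.isna()]
print(labels.tolist())
print("Codes with no label:", unmapped.tolist())
```

One thing to watch for: if a Grade column is read as floating point (for example because it contains blanks), `astype(str)` yields text such as `'31.0'`, which would not match any key in `grade_map`.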

The next cell reads the data files downloaded from Aquarius that this script combines. Each file is read with `skiprows=14`, which skips the 14 lines that precede the column headers. As long as the files are in the Downloads folder of your computer and still have the names Aquarius gave them, this cell should run without changes.

In [12]:
dis = pd.read_csv(f'{downloads_path}/Discharge.Metric@{streamid}.{date}.csv', skiprows=14)
temp = pd.read_csv(f'{downloads_path}/Water_Temp.Working@{streamid}.{date}.csv', skiprows=14)
spc = pd.read_csv(f'{downloads_path}/Sp_Cond.Working@{streamid}.{date}.csv', skiprows=14)
dosat = pd.read_csv(f'{downloads_path}/Dis_Oxygen_Sat.Working@{streamid}.{date}.csv', skiprows=14)
doconc = pd.read_csv(f'{downloads_path}/O2_(Dis).Working@{streamid}.{date}.csv', skiprows=14)
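If one of the five files is missing or was renamed, `pd.read_csv` stops with a `FileNotFoundError` for the first file it cannot find. The sketch below lists every missing file at once. It assumes the same naming pattern as the cell above; the helper name `missing_files` is hypothetical.

```python
from pathlib import Path

# File-name prefixes taken from the read_csv calls above.
PREFIXES = [
    'Discharge.Metric',
    'Water_Temp.Working',
    'Sp_Cond.Working',
    'Dis_Oxygen_Sat.Working',
    'O2_(Dis).Working']


def missing_files(folder, streamid, date):
    """Hypothetical helper: return expected file names not found in folder."""
    expected = [f'{prefix}@{streamid}.{date}.csv' for prefix in PREFIXES]
    return [name for name in expected if not (Path(folder) / name).exists()]


# Example values matching the edited cell earlier in this notebook.
missing = missing_files(Path.home() / 'Downloads', 'F1', '20241203')
if missing:
    print("Missing from Downloads:")
    for name in missing:
        print("  ", name)
else:
    print("All five Aquarius files were found.")
```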

The following cell builds the dataframe that holds all of the specified data in publishable form and writes it out as a CSV file. **The file is saved to your Downloads folder, so be sure to move it to the appropriate Stream Team folder once done.**

In [21]:
# Start with the columns that are constant for the whole file
subm_file = pd.DataFrame()
subm_file['DATASET_CODE'] = ['DSCRT_STRM_GAGE'] * len(dis)
subm_file['STRM_GAGE_ID'] = [streamid_map[streamid]] * len(dis)
# Timestamps come from the discharge file; the other files are aligned to it by row index
subm_file['DATE_TIME'] = dis['Timestamp (UTC+13:00)']
# Each measurement gets a value column and a quality column translated through grade_map
subm_file['DISCHARGE_RATE'] = dis['Value']
subm_file['DISCHARGE_QLTY'] = dis['Grade'].astype(str).map(grade_map)
subm_file['WATER_TEMP'] = temp['Value']
subm_file['WATER_TEMP_QLTY'] = temp['Grade'].astype(str).map(grade_map)
subm_file['SPECIFIC_CONDUCTANCE'] = spc['Value']
subm_file['SPECIFIC_CONDUCTANCE_QLTY'] = spc['Grade'].astype(str).map(grade_map)
subm_file['DIS_OXYGEN_SAT'] = dosat['Value']
subm_file['DIS_OXYGEN_SAT_QLTY'] = dosat['Grade'].astype(str).map(grade_map)
subm_file['DIS_OXYGEN_CONC'] = doconc['Value']
subm_file['DIS_OXYGEN_CONC_QLTY'] = doconc['Grade'].astype(str).map(grade_map)
# Write the combined file to the Downloads folder
subm_file.to_csv(f'{downloads_path}/{streamid}_{season}_SUBM.csv')
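The cell above combines the five files by row position, which works when every file covers exactly the same timestamps. If one file has gaps, values can end up beside the wrong timestamp. The following sketch shows one possible alternative, which is not the method used in this script: joining on the timestamp column instead. The data are made up, only two of the five measurements are shown, and the helper name `tidy` is hypothetical.

```python
import pandas as pd

TS = 'Timestamp (UTC+13:00)'

# Made-up discharge data with three readings.
dis = pd.DataFrame({
    TS: ['2024-12-03 00:00:00', '2024-12-03 00:15:00', '2024-12-03 00:30:00'],
    'Value': [1.0, 1.2, 1.1],
    'Grade': [31, 31, 21]})

# Made-up temperature data that is missing the 00:15 reading.
temp = pd.DataFrame({
    TS: ['2024-12-03 00:00:00', '2024-12-03 00:30:00'],
    'Value': [0.5, 0.7],
    'Grade': [31, 31]})


def tidy(df, name):
    """Hypothetical helper: rename Value/Grade so each file has unique columns."""
    renamed = df.rename(columns={'Value': name, 'Grade': f'{name}_QLTY'})
    return renamed[[TS, name, f'{name}_QLTY']]


# A left join keeps every discharge timestamp and leaves gaps as NaN.
merged = tidy(dis, 'DISCHARGE_RATE').merge(
    tidy(temp, 'WATER_TEMP'), on=TS, how='left')
print(merged)
```

With this approach the 00:30 temperature stays on the 00:30 row, and the missing 00:15 temperature is left blank. Separately, passing `index=False` to `to_csv` omits the unnamed row-number column that pandas writes by default, if that column is not wanted in the published file.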