## **The code below is to**
**a. Correct for water level logger drift,**

**b. Correct for SC sensor drift,**

**c. Check accuracy of logger deployment, and**

**d. Add the deployment block information to the block table.**

Read the code text and comments embedded in each code block carefully (denoted by '#'), as some components require user input (initials, manual water level measurements, etc.).


Code blocks below marked with (^) are to be run only if you are using a local runtime.

---



**If working through Google Colab**

You will be prompted to click on a link that will show you an authorization code. Copy the authorization code into the input box below. You may also be asked to allow Google Colab access to your Drive.

In [None]:
from google.colab import drive, auth
drive.mount("/content/drive/")

import pandas as pd
import matplotlib.pyplot as plt
#%matplotlib notebook
import numpy as np
from datetime import date
import csv
import os

Path names will be the same, but specify the file name you are working on.

In [None]:
#Change 'file' below to whatever file you're working on
file='MBHF1_20210416_LTC_baroC.csv'

path='/content/drive/My Drive/Water/preprocess_files/'
endpath='/content/drive/My Drive/Water/postprocess_files/'

df=pd.read_csv(path+file,index_col=[0])

**(^) If working from a local runtime on your computer**

In [None]:
#For working on local machine
import pandas as pd
import matplotlib.pyplot as plt
#%matplotlib notebook
import numpy as np
from datetime import date
import csv
import os


(^) Path names will be the same, but specify the file name you are working on.

In [None]:
#Change 'file' below to whatever file you're working on
file='MBM1_20210416_LTC_baroC.csv'

path='G:/Shared drives/CZN_HydroGroup/Water/preprocess_files/'
endpath='G:/Shared drives/CZN_HydroGroup/Water/postprocess_files/'

df=pd.read_csv(path+file,index_col=[0])
df.index=pd.to_datetime(df.index)

For **Solinst brand loggers**, run the code block below. Skip this if you're using a different brand of logger.

In [None]:
#Add extra DateTime field for interactive plot
df['DateTime']=pd.to_datetime(df['Date']+" "+df['Time'])

#Find start and end dates from logger file
start_date=df.index[0]
end_date=df.index[-1]
#Use an f-string so this works whether the index holds strings or Timestamps
print(f'Need manual measurements collected on {start_date} and {end_date}')

**Enter field water level measurements as variables below.**

These values can be found in either the Manual_msmt.gsheet or the watlev table in the AccessDB.

start_lev - the manual measurement collected when you deployed the sensor

end_lev - the manual measurement collected when you stopped and downloaded the sensor

In [None]:
start_lev=0.966216
end_lev=start_lev+-0.18419

(^) Plot water level data. Will pop up in a new browser window if using a local runtime, otherwise use Excel for now.

In [None]:
#%matplotlib widget
df.plot(x='DateTime',y='LOGGER_mWater_corr',style='.',rot=45)
plt.grid()
plt.tight_layout()
plt.show()
#To see first 5 rows of dataframe
df.head()

In [None]:
#To see last 5 rows of dataframe
df.tail()


---
##**Clean up the data**##

**Below are some common corrections to remove noisy or rogue data points**

All require some manual entry (i.e. number of rows, date and time). Zoom in and pan through the interactive plot to find info for specific points or periods of data. You can run one or more separately.

In [None]:
#To remove a specified number of records at the start of the file
#Enter the number of rows below as start_del
start_del=2
df=df.iloc[start_del:]
df.head()

In [None]:
#To remove a specified number of records at the end of the file
#Enter the number of rows below as end_del
end_del=1
df=df.iloc[:-end_del]
df.tail()

In [None]:
#To interpolate values between two records (i.e. smooth over a point)
#Change the date and time to that of whatever point you want to smooth over
#Can do this multiple times for multiple points and the interpolate function will do all at once
col=df.columns.get_loc('LOGGER_mWater_corr')
point=df.index.get_loc('2021-02-03 11:00:00')
#Assign through .iloc so the value is set on df itself, not on a temporary copy
df.iloc[point,col]=np.nan
df['LOGGER_mWater_corr']=df['LOGGER_mWater_corr'].interpolate(axis=0)
print(df['LOGGER_mWater_corr'].iloc[point])

In [None]:
#To interpolate values over multiple consecutive records
#Change the date and time for the first (first_pt) and last (last_pt) of the interval you want to change
col=df.columns.get_loc('LOGGER_mWater_corr')
first_pt=df.index.get_loc('2020-12-01 14:15:00')
last_pt=df.index.get_loc('2020-12-01 14:45:00')+1
#Assign through .iloc so the values are set on df itself, not on a temporary copy
df.iloc[first_pt:last_pt,col]=np.nan
df['LOGGER_mWater_corr']=df['LOGGER_mWater_corr'].interpolate(axis=0)
df.head()
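
As a minimal sketch of what the interpolation blocks above do, the example below blanks out one rogue record in a small synthetic series and lets pandas fill it in. The values, timestamps and the name `demo` are made up for illustration. Note that the default linear interpolation treats records as equally spaced, which is assumed to hold for a fixed logging interval.

```python
import numpy as np
import pandas as pd

# Synthetic 15-minute record with one rogue spike (all values are illustrative)
idx = pd.date_range("2021-02-03 10:30:00", periods=5, freq="15min")
demo = pd.DataFrame({"LOGGER_mWater_corr": [1.00, 1.02, 9.99, 1.06, 1.08]}, index=idx)

# Locate the rogue record by its timestamp and blank it out
col = demo.columns.get_loc("LOGGER_mWater_corr")
point = demo.index.get_loc(pd.Timestamp("2021-02-03 11:00:00"))
demo.iloc[point, col] = np.nan

# Linear interpolation fills the gap from the two neighbouring records
demo["LOGGER_mWater_corr"] = demo["LOGGER_mWater_corr"].interpolate()
smoothed = demo["LOGGER_mWater_corr"].iloc[point]
print(smoothed)
```

The filled value is the midpoint of the records on either side, so a single spike is replaced without touching the rest of the series.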


---

##**Correct water levels**##


**After removing and cleaning all noisy data, run the code below to correct water level values for sensor drift**

Your starting level (first record value) should match your starting manual measurement and your ending level (last record value) should match your end manual measurement.

In [None]:
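n=len(df['LOGGER_mWater_corr'])-1   #position of the last record
LTCinit=df['LOGGER_mWater_corr'].iloc[0]
LTCend=df['LOGGER_mWater_corr'].iloc[n]
#Convert logger water level to depth to water, tied to the starting manual measurement
df['LOGGER_mDTW_corr']=start_lev-(df['LOGGER_mWater_corr']-LTCinit)
R1=df['LOGGER_mDTW_corr'].iloc[0]
R2=df['LOGGER_mDTW_corr'].iloc[n]
#Total mismatch between the logger and the manual measurements over the deployment
acc=round((end_lev-R2)-(start_lev-R1),3)
#Mismatch per record; dividing by n puts the full correction on the last record
K=acc/n

#Apply a correction that grows linearly from 0 (first record) to acc (last record)
new_lc=[]
for index,val in enumerate(df['LOGGER_mDTW_corr']):
    corr=K*index
    new=round(val+corr,3)
    new_lc.append(new)

df['LOGGER_mDTW_corr']=new_lc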
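
The sketch below restates the linear drift correction on a small synthetic series, so you can check that the first and last corrected values land on the manual measurements. The function name `correct_level_drift` and all numbers are hypothetical and only for illustration. It assumes at least two records and scales the correction so the last record receives the full mismatch.

```python
import numpy as np
import pandas as pd

def correct_level_drift(water, start_lev, end_lev):
    """Convert logger water level to depth to water and remove linear drift.

    Hypothetical helper for illustration; needs at least two records.
    """
    water = pd.Series(water, dtype=float).reset_index(drop=True)
    n = len(water) - 1                                  # position of the last record
    # A rise in logger water level means a smaller depth to water
    dtw = start_lev - (water - water.iloc[0])
    # Mismatch between the manual and logger-derived level at the end
    acc = (end_lev - dtw.iloc[n]) - (start_lev - dtw.iloc[0])
    ramp = np.arange(n + 1) / n                         # 0 at first record, 1 at last
    return dtw + acc * ramp, acc

# Illustrative logger readings (m of water above the sensor)
water = [2.000, 2.010, 2.030, 2.020, 2.050]
corrected, acc = correct_level_drift(water, start_lev=0.966, end_lev=0.926)
print(corrected.round(4).tolist(), round(acc, 3))
```

If the first and last corrected values do not match `start_lev` and `end_lev`, the most likely cause is that rows were trimmed after the manual measurements were matched to the record.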

---
##**Correct SC for sensor drift**##

Use calibration check and calibration values to complete this part.

In [None]:
#Enter starting calibration value and ending calibration check value
start_std=1413
start_cal=1413
end_std=1413
end_cal= 1413

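n=len(df['LOGGER_SC'])-1   #position of the last record
K1=start_std/start_cal     #calibration factor at the start of the deployment
K2=end_std/end_cal         #calibration factor at the end of the deployment
drift=round(K1-K2,4)
#Change in factor per record; dividing by n puts the full change on the last record
dK=(K2-K1)/n

#Scale each record by a factor that grows linearly from 1 (first record) to K2/K1 (last record)
new_sc=[]
for index,val in enumerate(df['LOGGER_SC']):
    corr=1+(dK/K1)*index
    new=round(val*corr,3)
    new_sc.append(new)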

df['LOGGER_SC']=new_sc
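
To see the effect of the SC correction with non-trivial calibration values, the sketch below applies the same linear scaling to a constant synthetic series. The function name `correct_sc_drift` and the numbers are hypothetical. It also assumes that `start_cal` and `end_cal` are the sensor readings taken in the standard solution, which is an interpretation of the variable names above rather than something the notebook states.

```python
import numpy as np
import pandas as pd

def correct_sc_drift(sc, start_std, start_cal, end_std, end_cal):
    """Scale SC values by a factor that changes linearly over the deployment.

    Hypothetical helper for illustration; needs at least two records.
    """
    sc = pd.Series(sc, dtype=float).reset_index(drop=True)
    n = len(sc) - 1                      # position of the last record
    k1 = start_std / start_cal           # calibration factor at deployment
    k2 = end_std / end_cal               # calibration factor at retrieval
    dk = (k2 - k1) / n                   # change in factor per record
    factor = 1 + (dk / k1) * np.arange(n + 1)
    return sc * factor

# Constant synthetic SC record; the sensor read 1400 in a 1413 standard at the end
sc = [500.0, 500.0, 500.0, 500.0, 500.0]
corrected_sc = correct_sc_drift(sc, 1413, 1413, 1413, 1400)
print(corrected_sc.round(3).tolist())
```

With identical start and end calibration values the factor stays at 1 and the series is returned unchanged, which matches the default values in the block above.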


---

##**Add deployment record data to block and data tables**##

This will call up the block.csv file and add a new record. Run #1-5 for EACH MEASUREMENT VARIABLE. For example, if your logger is collecting temperature, level and SC, the following #1-5 blocks must be run three times, once per variable. Make sure you change the variables in #1 each time.


**#1. Manually enter some variables below**

In [None]:
#Enter the variables below
initials='RWM'
dat_type='T'
matrix='W'
unit='DC'
sensor='M3001'
sensor_sn=1080544
datum='TOC'
interval=15

**#2. Open block table through (a) Colab or (b) local runtime**

  (a) Open block table on the Shared Drive **if working through Google Colab**

In [None]:
df_block=pd.read_csv('/content/drive/My Drive/Water/data_tables/block.csv')
print(df_block)

(b) (^) Open block table **if in a local runtime**

In [None]:
df_block=pd.read_csv('G:/Shared drives/CZN_HydroGroup/Water/data_tables/block.csv')
print(df_block)

**#3. Calculate some variables**

Do not manually enter anything below.

In [None]:
#Get block table info
block_start_time=df.index[0]
block_end_time=df.index[-1]
site_id=file.split('_')[0]
blockno=df_block['blockno'].max()+1
ind1=df_block['index2'].max()+1
#One index per record, computed from df so it works even if the correction blocks were skipped
ind2=ind1+len(df)-1
process_date=date.today().strftime('%m/%d/%Y')

def find_acc():
    if dat_type=='L':
        return (round((end_lev-R2)-(start_lev-R1),3))
    elif dat_type=='T':
        return np.nan
    elif dat_type=='C':
        return np.nan
    else:
        print('You have entered an invalid data type')

acc=find_acc()

def find_drift():
    if dat_type=='L':
        return np.nan
    elif dat_type=='T':
        return np.nan
    elif dat_type=='C':
        return (round(K1-K2,4))
    else:
        print('You have entered an invalid data type')

drift=find_drift()

**#4. Append new deployment record to existing block table**

In [None]:
#Append to block table (DataFrame.append was removed in pandas 2.0, so use pd.concat)
new_record=pd.DataFrame([{'blockno':blockno,
                          'site_id':site_id,
                          'start_time':block_start_time,
                          'index1':ind1,
                          'end_time':block_end_time,
                          'index2':ind2,
                          'matrix':matrix,
                          'data_type':dat_type,
                          'sensor':sensor,
                          'sensor_sn':sensor_sn,
                          'unit':unit,
                          'interval':interval,
                          'mp_datum':datum,
                          'accuracy':acc,
                          'drift':drift,
                          'process_initials':initials,
                          'process_date':process_date}])
df_block=pd.concat([df_block,new_record],ignore_index=True)

#Check block table and make sure the new record looks OK
df_block

**#5. Commit this new record to the table**

No need to append here, since you brought in the existing file that contained previous records

In [None]:
#index=False keeps pandas from writing an extra unnamed index column on every save
df_block.to_csv('/content/drive/My Drive/Water/data_tables/block.csv',index=False)

---

##**Add the actual data values to the data.csv file**##

Each variable must be done separately, but they will be appended to the existing file.

In [None]:
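#If data is water levels
data_file='/content/drive/My Drive/Water/data_tables/data.csv'
#Build only the new records; the existing file is appended to, not rewritten
df_data=pd.DataFrame({'index':range(ind1,ind2+1),
                      'blockno':blockno,
                      'amount':df['LOGGER_mDTW_corr'].to_list()})
#Match the column order of the existing file, then append without repeating the header
df_data=df_data[pd.read_csv(data_file,nrows=0).columns]
df_data.to_csv(data_file,mode='a',header=False,index=False)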

In [None]:
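#If data is conductivity
data_file='/content/drive/My Drive/Water/data_tables/data.csv'
#Build only the new records; the existing file is appended to, not rewritten
df_data=pd.DataFrame({'index':range(ind1,ind2+1),
                      'blockno':blockno,
                      'amount':df['LOGGER_SC'].to_list()})
#Match the column order of the existing file, then append without repeating the header
df_data=df_data[pd.read_csv(data_file,nrows=0).columns]
df_data.to_csv(data_file,mode='a',header=False,index=False)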

In [None]:
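#If data is temperature
data_file='/content/drive/My Drive/Water/data_tables/data.csv'
#Build only the new records; the existing file is appended to, not rewritten
df_data=pd.DataFrame({'index':range(ind1,ind2+1),
                      'blockno':blockno,
                      'amount':df['LOGGER_Temp_C'].to_list()})
#Match the column order of the existing file, then append without repeating the header
df_data=df_data[pd.read_csv(data_file,nrows=0).columns]
df_data.to_csv(data_file,mode='a',header=False,index=False)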
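
One way to append new records to an existing CSV without duplicating the header is sketched below. It writes to a temporary folder rather than the shared drive, and the stand-in table, the values and the name `new_rows` are made up for illustration. The column names `index`, `blockno` and `amount` are taken from the blocks above; any other columns in the real data.csv are not handled here.

```python
import os
import tempfile
import pandas as pd

# Work in a temporary folder so the real data table is never touched
tmp_dir = tempfile.mkdtemp()
data_file = os.path.join(tmp_dir, "data.csv")

# Stand-in for the existing data table (illustrative values)
existing = pd.DataFrame({"index": [1, 2, 3], "blockno": [1, 1, 1], "amount": [0.5, 0.6, 0.7]})
existing.to_csv(data_file, index=False)

# New records for the next block; indices continue from the last one in the file
ind1, blockno, values = 4, 2, [0.966, 0.958]
new_rows = pd.DataFrame({"index": range(ind1, ind1 + len(values)),
                         "blockno": blockno,
                         "amount": values})

# Read only the header of the existing file to match its column order,
# then append the rows without writing a second header line
new_rows = new_rows[pd.read_csv(data_file, nrows=0).columns]
new_rows.to_csv(data_file, mode="a", header=False, index=False)

result = pd.read_csv(data_file)
print(result)
```

Reading the file back after appending is a quick way to confirm that the row count grew by the number of new records and that no stray header row ended up in the data.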

##**Move the logger file to the Post Process folder**##

This comes after you've finished adding records to the block table AND adding to the data table.

In [None]:
#Save to csv with '_levC' suffix in postprocess, then delete the original from preprocess
#Do not assign the result back to df: to_csv returns None when given a path
df.to_csv(endpath+file.split('.')[0]+'_levC.csv',index=True)
os.remove(path+file)