## Modeling barometric pressure, linear trend, tidal corrections, regional strains

Sections in this notebook:

&emsp;Introduction <br>
&emsp;&emsp;BAYTAP
1. Import the necessary modules, data, and metadata <br>
   1.1 Select the station you want to analyze <br>
   1.2 Load the file into a dataframe <br>
   1.3 Download and store the xml metadata <br>
2. Barometric pressure correction <br>
   2.1 Available pressure channels <br>
   2.2 Scale and interpolate the raw pressure data <br>
3. Linear trend correction <br>
4. Tidal correction <br>
   4.1 Model tides in SPOTL <br>
   4.2 Load tides to dataframe <br>
5. Regional strains <br>
6. Saving and plotting the product <br>
   6.1 Save the dataframe to text files <br>
   6.2 Plot <br>

Tips: 
- If you don't see the cell widget, try running all initialization cells, or the individual cell, and it should pop up. 
- If at any point you wish to take a look at the dataframe we continually add to, type ```df.head()``` into a new code cell and run it. 
- If something seems off, make sure you ran the prior cells. Some cells depend on the cells before them, even across sections. 

***
## Introduction
***

The goal of this notebook is to model gauge strain corrections for the barometric pressure response, linear trend, and tides. 

Although we will not run BAYTAP, describing it will help us to understand predictable responses in the strainmeter data. Once modeled, these corrections can be removed from the linearized strain to produce residual time series - which will highlight anomalous signals of interest. 

### Time series analysis in BAYTAP08
The original BAYTAP-G program was rewritten by Duncan Agnew in the BAYTAP08 iteration: http://igppweb.ucsd.edu/~agnew/Baytap/baytap.html. 
 
In sum, the program assumes a strain signal ($y_i$) composed of the following components: 

>$$y_i = t_i + d_i + c_i + s_i$$
- $t_i$: tidal signal
- $d_i$: long term trend (instrument drift)
- $c_i$: response to other effects (e.g., barometric pressure)
- $s_i$: data offsets

The tidal signal is composed of the tidal amplitudes and phases we seek ($A_m$ and $B_m$), and the known theoretical tidal group values ($C_{mi}$ and $S_{mi}$) that are the sum of the constituents with similar frequencies:
>$$t_i =\sum_{m=1}^{M} (A_m C_{mi}+B_m S_{mi})$$
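
As a minimal sketch of the sum above, the following fits $A_m$ and $B_m$ for a single group by ordinary least squares, taking $C_{mi}=\cos(\omega t_i)$ and $S_{mi}=\sin(\omega t_i)$. This is a simplification for illustration only: BAYTAP builds $C_{mi}$ and $S_{mi}$ from many constituents per group and solves a penalized problem, and the sampling, the 12.42-hour period, and the coefficient values below are all assumed.

```python
import numpy as np

# Hourly samples over 30 days (hypothetical sampling, chosen for illustration)
t_hours = np.arange(0.0, 30 * 24, 1.0)

# One tidal group, approximated by a single sinusoid near the M2 period
period_hours = 12.42
omega = 2.0 * np.pi / period_hours
C = np.cos(omega * t_hours)   # stands in for C_mi
S = np.sin(omega * t_hours)   # stands in for S_mi

# Synthetic tidal signal built from known (made-up) coefficients
A_true, B_true = 12.0, -5.0
tide = A_true * C + B_true * S

# Solve tide = A*C + B*S for A and B by least squares
design = np.column_stack([C, S])
(A_fit, B_fit), *_ = np.linalg.lstsq(design, tide, rcond=None)

# Amplitude and phase implied by the coefficients, using the convention
# A*cos(wt) + B*sin(wt) = R*cos(wt - phase)
amplitude = np.hypot(A_fit, B_fit)
phase_deg = np.degrees(np.arctan2(B_fit, A_fit))
```

The amplitude and phase conversion at the end is one common convention; the sign convention used in the station metadata may differ.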

The barometric response is assumed to be linear in the barometric pressure change, where $p_i$ is the barometric pressure and $b$ is the barometric response coefficient:
>$$c_i = b\,p_i$$

The residual time series is left once these signals are removed: 
>$$ r_i = y_i - (t_i + d_i + c_i + s_i)$$

BAYTAP solves for the model parameters by minimizing S:
>$$S = \sum_{i=1}^{N}r_i^2+ D^2 \sum_{i=1}^{N} (d_i - 2d_{i-1} + d_{i-2})^2 + W^2 \sum_{m=2}^{M}\left[(A_m - A_{m-1})^2+(B_m-B_{m-1})^2\right] $$
- D is an input smoothness parameter, where a very large D provides a linear drift with time. 
- W is an input parameter that controls how much the tidal admittance can vary over frequency bands. 
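
To make the roles of D and W concrete, here is a small sketch that evaluates the three terms of S with NumPy. The function names are hypothetical and this is not BAYTAP code; it only shows that a perfectly linear drift has zero second differences, which is why a very large D pushes the drift toward a straight line.

```python
import numpy as np

def drift_roughness(d):
    """Sum of squared second differences of the drift (the term D^2 multiplies)."""
    d = np.asarray(d, dtype=float)
    return float(np.sum((d[2:] - 2.0 * d[1:-1] + d[:-2]) ** 2))

def admittance_roughness(A, B):
    """Sum of squared first differences of A_m and B_m (the term W^2 multiplies)."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    return float(np.sum(np.diff(A) ** 2 + np.diff(B) ** 2))

def objective(r, d, A, B, D, W):
    """Misfit plus the two weighted smoothness penalties."""
    r = np.asarray(r, dtype=float)
    return (float(np.sum(r ** 2))
            + D ** 2 * drift_roughness(d)
            + W ** 2 * admittance_roughness(A, B))

t = np.arange(10.0)
linear_drift = 2.0 * t + 1.0   # straight line: no curvature penalty
curved_drift = 0.1 * t ** 2    # parabola: constant second difference of 0.2

print(drift_roughness(linear_drift))
print(drift_roughness(curved_drift))
```

Because the penalty on the curved drift is scaled by $D^2$, increasing D makes any curvature in $d_i$ progressively more expensive relative to the misfit.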

The strainmeters have already been analyzed with BAYTAP, and those outputs are what we will use to model the tidal correction and the barometric pressure correction. We can find the info in the station metadata.  



***
## 1. Import the necessary modules, data, and metadata 
***

In [100]:
# Import the necessary modules

# This imports matplotlib for later plotting
import matplotlib.pyplot as plt
# Make plotting interactive
#%matplotlib notebook
plt.style.use('ggplot')
plt.rcParams['figure.figsize'] = 8, 6

import numpy as np
import pandas as pd
from datetime import timedelta
import os
from scipy import signal, linalg
import xmltodict

# This imports obspy, a python toolbox for seismology,
# the iris web services client, a reader for stream data
# and the UTC date time format
import obspy
from obspy import read, UTCDateTime, read_inventory
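# Alias the two Client classes so the second import does not shadow the first
from obspy.clients.iris import Client as IrisClient
client = IrisClient()
from obspy.clients.fdsn import Client as FdsnClient
inv_client = FdsnClient('IRIS')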

import ipywidgets as widgets
from ipywidgets import HBox, VBox, interact, Layout
style = {'description_width': 'initial'}
layout=Layout(width='30%', height='40px')

*** 
### 1.1 Select the station you want to analyze
***

In [127]:
# Assign station seed codes from selected file

dir = './DataFiles/Level1/'
sta_list = []
for file in os.listdir(dir):
    if file.endswith('Level1.txt'):
        sta_list.append(file)
        
# Set initial values   
file = sta_list[0]
network = file[0:2]
scode = file[3:7]
loc = file[8:10]
cha = file[11:13]

print('Pick the file you would like to model corrections for:')
sta_select = widgets.Dropdown(
            options=sta_list,
            value=sta_list[0],
            description='Station file:',
            )
# Change the station and network as the dropdown is changed
def the_ccodes(siteval):
    global scode, network, loc, cha, file
    file = siteval
    network = siteval[0:2]
    scode = siteval[3:7]
    loc = siteval[8:10]
    cha = siteval[11:13]
def on_cselect(change):
    the_ccodes(change.new)
sta_select.observe(on_cselect,names='value')

display(sta_select)

Pick the file you would like to model corrections for:


Dropdown(description='Station file:', options=('PB.B916.T0.RS.2009-02-062009-02-22_Level1.txt', 'PB.B916.T0.RS…

***
### 1.2 Load the file into a dataframe
***

In [102]:
# Make a dataframe with the file and assign start and end dates
fbutton = widgets.Button(description="Load dataframe", button_style='danger')
foutput = widgets.Output()

display(fbutton, foutput)

# Initial start and end times (arbitrarily chosen)
start = UTCDateTime('2000-01-01 00:00:00.000')
end = UTCDateTime('2000-01-02 00:00:00.000')
file_list = ['DF not loaded','DF not loaded','DF not loaded','DF not loaded']

def on_fbutton_clicked(b):
    with foutput:
        foutput.clear_output()
        global df, start, end, file_list, comment_lines
        
        # Store file comments
        comment_lines = []
        with open(dir+file,'r') as f:
            for ln in f:
                if ln.startswith('#'):
                    comment_lines.append('#'+ln[1:])
        
        df = pd.read_csv(dir+file,sep='\t',index_col=0,header=0,comment='#')
        print('Wait for the dataframe to display.')
        longscode=df.index.name
        ind = []
        for i in range(0,len(df)):
            ind.append(UTCDateTime(df.index[i]))
        df.index = ind
        start = df.index[0]
        end = df.index[-1]
        
        # For tide code cell later
        file_list = [scode+'.'+cha+'.'+'gauge0tides'+str(start)[0:4]+str(start)[5:7]+str(start)[8:10]+'.txt',
                     scode+'.'+cha+'.'+'gauge1tides'+str(start)[0:4]+str(start)[5:7]+str(start)[8:10]+'.txt',
                     scode+'.'+cha+'.'+'gauge2tides'+str(start)[0:4]+str(start)[5:7]+str(start)[8:10]+'.txt',
                     scode+'.'+cha+'.'+'gauge3tides'+str(start)[0:4]+str(start)[5:7]+str(start)[8:10]+'.txt']

        
        print(df.head())
fbutton.on_click(on_fbutton_clicked)

Button(button_style='danger', description='Load dataframe', style=ButtonStyle())

Output()

***
### 1.3 Download and store the xml metadata
***
We will use the station XML metadata associated with the level 2 processed data from UNAVCO to model the tides and regional strains. 

The xml file can be found for any station under the processed ASCII link here: https://www.unavco.org/data/strain-seismic/bsm-data/bsm-data.html. The file is updated regularly as more processed data becomes available.

In [103]:
# Download the level 2 xml metadata and store it in a dictionary-type object

xbutton = widgets.Button(description="Download xml metadata",layout=layout, button_style='danger')
xoutput = widgets.Output()

display(xbutton, xoutput)

def on_xbutton_clicked(b):
    with xoutput:
        xoutput.clear_output()
        print('Working on it...')

        # Check box to download meta again, only displays if already downloaded
        cx = widgets.Checkbox(value=False,
            description='Download again?',
            disabled=False, style=style,layout=Layout(width='50%')
            )
     
        xml_file = scode +'.xml'
        xml_path = 'ftp://bsm.unavco.org/pub/bsm/level2/'+scode+'/'
        metadir = './DataFiles/Metadata/'
        os.makedirs(metadir, exist_ok=True)
        if os.path.exists(metadir+xml_file) == True:
            print('You already downloaded the',xml_file,'!')
            print('Check the box if you would like to download it again: ')
            display(cx)
        else:
            # Import the necessary libraries to download the file
            import shutil
            import urllib.request as request
            from contextlib import closing
            
            with closing(request.urlopen(xml_path+xml_file)) as r:
                with open(metadir+xml_file, 'wb') as f:
                    shutil.copyfileobj(r, f)
            print(xml_file, 'has been saved to your', metadir, 'directory')
            
        xobutton = widgets.Button(description="Now load the metadata",layout=layout, button_style='danger')
        xooutput = widgets.Output()

        display(xobutton, xooutput)

        def on_xbutton_click(b):
            with xooutput:
                xooutput.clear_output()
                # If box is checked, download again
                if os.path.exists(metadir+xml_file) == True:
                    if cx.value == True:
                        import shutil
                        import urllib.request as request
                        from contextlib import closing
                        with closing(request.urlopen(xml_path+xml_file)) as r:
                            with open(metadir+xml_file, 'wb') as f:
                                shutil.copyfileobj(r, f)
                        print(xml_file, 'has been re-saved to your', metadir, 'directory')
                global xmldict
                # Use a context manager so the file handle is closed after parsing
                with open(metadir+xml_file) as xf:
                    xmldict = xmltodict.parse(xf.read(),process_namespaces=True)
                print('Loaded!')
        xobutton.on_click(on_xbutton_click)
        
xbutton.on_click(on_xbutton_clicked)



Button(button_style='danger', description='Download xml metadata', layout=Layout(height='40px', width='30%'), …

Output()

***
## 2. Barometric pressure correction
***
The strainmeters are sensitive to changes in barometric pressure. In this section we will correct each gauge with its associated pressure response coefficient from the latest xml metadata. The pressure response coefficients were originally calculated in BAYTAP to produce the level 2 data corrections available from UNAVCO, and they are suitable for our purposes as well. 

The barometric response at each gauge is modelled as a linear relationship; so, to remove the response we simply need to subtract the correction from the linearized data. 

***
### 2.1 Available pressure channels
***
The strainmeters were each installed with a barometric pressure sensor that returns data at 30 minute intervals. At some of the strainmeters, sensors with higher rate data are available. You can pick which channel you would like to use to compute the barometric pressure correction in this section. 

In [104]:
# Find available pressure channels

abutton = widgets.Button(description="Find Pressure Channels",layout=layout, button_style='danger')
aoutput = widgets.Output()

display(abutton, aoutput)


def on_abutton_click(b):
    with aoutput:
        aoutput.clear_output()
        plt.close()
        print('Finding barometric pressure channels...')
        
        # This should return the LDO and RDO options for pressure data, if they are available.
        # One issue is the time references the station time, not the individual sensor time, so if an error
        # appears when plotting the data, there may not be data for that time.
        inv = inv_client.get_stations(network = network, station = scode, channel= '*DO',level='response',matchtimeseries=True,starttime=start,endtime=end)
        
        print('...channels found! Click to update the dropdown selection')
        print('RDO provides 30 minute data while LDO provides 1sps data, if it is available')
        chan = widgets.Dropdown(
            options=inv.get_contents()['channels'],
            value = inv.get_contents()['channels'][0],
            description='Pick a channel from the list:',
            ) 
        if len(inv.get_contents()['channels'][0])==2:
            print('yes!')
        else:
            print('no!')
        global baro_channel
        # initial value
        baro_channel = chan.value[-3:]
        
        display(chan)
        # Update the pressure channel as the dropdown is changed
        def the_codes(chanval):
            global baro_channel
            baro_channel = chanval[-3:]
        def on_select(change):
            the_codes(change.new)
        chan.observe(on_select,names='value')
        
        obutton = widgets.Button(description="Download and Examine Raw Pressure Data",layout=Layout(width='40%', height='40px'), button_style='danger')
        ooutput = widgets.Output()
        
        display(obutton, ooutput)
        
        def on_abutton_clicked(b):
            with ooutput:
                ooutput.clear_output()
                plt.close()
                global  baro_loc, baro_meta, sf, baro
                baro_meta = inv_client.get_stations(network = network, station = scode,channel=baro_channel, level = 'response')
                print(baro_meta[0][0][0],baro_meta[0][0][0].response)
                
                # Let's get the pressure data, take an initial look, and get the conversion to geophysical units
                if baro_channel == 'LDO':
                    baro_loc = ''
                    # Technically we need to add 1000 to get real units of pressure for LDO
                    # but we demean the data so just applying the factor is sufficient
                    sf = baro_meta[0][0][0].response.instrument_sensitivity.value
                else:
                    # The scale factor is a conversion to kPa for RDO; multiply by 10 to get millibars
                    baro_loc = 'TS'
                    sf = baro_meta[0][0][0].response.instrument_sensitivity.value * 10
                print('From this, we see the conversion to millibar is:',sf)
                print('Loading the pressure data to a plot...')
                # Download time series from iris
                baro = client.timeseries(network,scode,baro_loc,baro_channel,start,end)
                plt.plot(baro[0].times('utcdatetime'),baro[0].data) 
        obutton.on_click(on_abutton_clicked)

abutton.on_click(on_abutton_click)

Button(button_style='danger', description='Find Pressure Channels', layout=Layout(height='40px', width='30%'),…

Output()

***
### 2.2 Scale and interpolate the raw pressure data
***
This next cell performs three main tasks:
1. Apply the appropriate geophysical scale factor found in section 2.1 to the raw pressure data.
1. Demean the pressure data. 
1. Multiply the pressure data by the barometric pressure response coefficient (from the metadata) and interpolate to the times of our strain data. 
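
The three steps above can be sketched with plain NumPy arrays. The function name, the scale factor, the response coefficient, and the sample values are all made up for illustration; the notebook cell takes the real values from the channel response and the xml metadata. Since every step is linear, interpolating before or after multiplying by the coefficient gives the same result.

```python
import numpy as np

def pressure_correction(raw, raw_times, strain_times, scale, coeff):
    """Scale, demean, interpolate, and apply the response coefficient."""
    millibar = np.asarray(raw, dtype=float) * scale   # 1. raw counts -> millibar
    millibar = millibar - millibar.mean()             # 2. demean
    # 3. interpolate to the strain sample times, then apply the coefficient
    on_strain_times = np.interp(strain_times, raw_times, millibar)
    return coeff * on_strain_times

# Hypothetical 30-minute pressure samples (times in seconds)
raw = [100.0, 102.0, 104.0]
raw_times = [0.0, 1800.0, 3600.0]
# Hypothetical strain sample times, every 15 minutes
strain_times = [0.0, 900.0, 1800.0, 2700.0, 3600.0]

correction = pressure_correction(raw, raw_times, strain_times, scale=0.1, coeff=3.0)
print(correction)
```

The corrected strain would then be the linearized gauge strain minus `correction`.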
 

In [105]:
# Store the pressure data in a dataframe with times interpolated to match the gauge data

def interp(df, new_index):
    """Return a new DataFrame with all columns values interpolated
    to the new_index values."""
    df_out = pd.DataFrame(index=new_index)
    df_out.index.name = df.index.name

    # DataFrame.iteritems() was removed in pandas 2.0; items() is the equivalent
    for colname, col in df.items():
        df_out[colname] = np.interp(new_index, df.index, col) # default is a linear interpolation

    return df_out

cbutton = widgets.Button(description="Scale and Calculate Pressure Correction",layout=Layout(width='40%', height='40px'), button_style='danger')
coutput = widgets.Output()

display(cbutton, coutput)

def on_cbutton_clicked(b):
    with coutput:
        coutput.clear_output()
        print('Working on it...')
        global df
        # Concatenate the samples from every trace in the stream
        # (a stream holds more than one trace when the data have gaps)
        p_array = np.concatenate([tr.data for tr in baro])
        # First use sf from the instrument response info to get millibar
        # (check that the above output is in hectopascals, hPa = millibar)
        # Note that the Setra data requires 1000 to be added after unit conversion for real units,
        # but we remove the mean anyway so it doesn't matter
        millibar = p_array * sf
        # Get the UTC datetimes for every trace
        tmpt = np.concatenate([tr.times('utcdatetime') for tr in baro])
        # Demean the data
        baro_df = pd.DataFrame(millibar-millibar.mean(),tmpt,columns=['millibar'])
        
        baro_df = interp(baro_df,df.index)
        
        # compute the barometric pressure correction 
        baro_ch = {'0':np.array([]),'1':np.array([]),'2':np.array([]),'3':np.array([])}
        # get the channel response from the xml data
        for channel in baro_ch:
            ch_resp = xmldict['strain_xml']['inst_info']['processing']['bsm_processing_history'][-1]['bsm_processing']['atm_pressure']['apc_g'+channel]
            baro_ch[channel] = np.array(baro_df['millibar']*float(ch_resp))
            df['baro_ch'+channel] = baro_ch[channel]
        print('Done!')
cbutton.on_click(on_cbutton_clicked)


Button(button_style='danger', description='Scale and Calculate Pressure Correction', layout=Layout(height='40p…

Output()

***
## 3. Linear trend correction
***
The long term trend of gauge strain is dominated by compression as the ground closes in on the borehole. In some cases, it is helpful to remove this trend (especially if a full month is used). Here, we compute a simple least squares linear fit to the data after the data has been corrected for barometric pressure. One complication to this simple solution is that any offsets or significant variation in the data (whether real or instrumental in origin) will skew the trend. If this is the case, more rigorous post-processing may be necessary.

[Cleanstrain+](https://www.usgs.gov/software/cleanstrain) is a software package (external to these notebooks) that can estimate tidal constituents, pressure admittance, offsets, rate changes and temporally correlated noise in strainmeter data, among other things. For offsets, the times of offsets, averaging window to estimate the offsets, and nature of the offsets (tectonic or instrumental) must be supplied. This is a useful resource if further processing is needed. 
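
The sensitivity of the least squares trend to offsets can be seen in a small synthetic example. This sketch uses `numpy.polyfit` rather than the `scipy.signal.detrend` call in the cell that follows (both fit a straight line by least squares), and the drift rate and offset size are made-up values.

```python
import numpy as np

def fit_linear_trend(y):
    """Return (slope, trend) from a least-squares straight-line fit to y."""
    y = np.asarray(y, dtype=float)
    x = np.arange(y.size, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    return slope, slope * x + intercept

n = 200
x = np.arange(n, dtype=float)
clean = 0.05 * x + 1.0        # steady drift only
stepped = clean.copy()
stepped[n // 2:] += 5.0       # hypothetical offset halfway through the record

slope_clean, trend_clean = fit_linear_trend(clean)
slope_stepped, trend_stepped = fit_linear_trend(stepped)

print(f'slope without offset: {slope_clean:.4f}')
print(f'slope with offset:    {slope_stepped:.4f}')
```

The fit to the stepped series absorbs part of the offset into the slope, so subtracting that trend would leave a spurious ramp in the residual.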

In [106]:
# Compute the linear trend by least squares regression
# remove the pressure correction first
# if there are major offsets in the data, the offsets should be removed first

lbutton = widgets.Button(description="Compute Linear Trend",layout=layout, button_style='danger')
loutput = widgets.Output()

display(lbutton,loutput)

def on_lbutton_clicked(b):
    with loutput:
        loutput.clear_output()
        global df
        
        print('Computing linear trend...')
        
        trend_ch = {'0':np.array([]),'1':np.array([]),'2':np.array([]),'3':np.array([])}
    
        # get the least squares linear trend of the data using the scipy 'detrend' function
        for channel in trend_ch:
            # Pressure-corrected gauge strain
            corrected = np.array(df['ch'+channel+' [ms]']-df['baro_ch'+channel])
            # The trend is whatever detrend() removed from the corrected data
            trend = corrected - signal.detrend(corrected,type='linear')
            trend_ch[channel] = trend
            df['trend_ch'+channel] = trend_ch[channel]
            
        print('Done!')
lbutton.on_click(on_lbutton_clicked)



Button(button_style='danger', description='Compute Linear Trend', layout=Layout(height='40px', width='30%'), s…

Output()

***
## 4. Tidal correction
***

To model the solid earth tides and ocean loads in one 'tidal correction,' we will use a program called SPOTL [(Some Programs for Ocean Tide Loading)](https://igppweb.ucsd.edu/~agnew/Spotl/spotlmain.html). The relevant module from the program for us is **hartid**. From the man pages: 

```
hartid − predicts tides from harmonic constants

SYNOPSIS
hartid year [day-of-year | month day] hr min sec nterms samp

DESCRIPTION
Given the ‘‘harmonic constants’’ for a tidal series, hartid computes the predicted tides for a specified time.
The harmonic constants (in the format specified below) are read in from the standard input, and the predicted tides written to the standard output. The computation (described in more detail below) infers the
value of small constituents from those of larger ones, so only a few are needed to give a good result. The
arguments on the command line are:
year day hr min sec

    The time of the first output value, in Greenwich time (UTC); the date may be given either as day
of the year, or as month and day (Gregorian calendar). There are no explicit restrictions on the
range of admissable dates, but the relevant formulae for the fundamental tidal arguments will be
increasingly inaccurate before 1700.

nterms    The number of values to be written out
samp    The sample interval, in seconds.
```

hartid can read from standard input in the following format:
```
l
-116.455
 1-1 0 0 0 0 7.41000 -78.0
 1 1 0 0 0 0 11.7900 -94.0
 ...
 2 0 0 0 0 0 16.0600 -287.0
-1
```
The first line is a lowercase L, the last line is a negative one. The second line is the longitude of the location (W is negative). The remaining lines in the middle are the tidal constituent "Cartwright" code followed by the amplitude in nanostrain and phase in degrees. There can be many supplied harmonic constituents, but the program requires at least one diurnal and one semidiurnal constituent. The program then automatically models the signal from other smaller amplitude species within the provided tidal bands. 
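
A short sketch of assembling that input as a string is given here. The helper name `hartid_input` is hypothetical, and the column widths (two characters per code digit, amplitude padded with zeros to seven characters) are assumed from the example above and from the formatting used in section 4.1, not taken from the SPOTL documentation.

```python
def hartid_input(longitude, constituents):
    """Build the text that hartid reads from standard input.

    constituents: iterable of (cartwright_code, amplitude, phase), where the
    code is six integers and amplitude/phase are strings as stored in the metadata.
    """
    lines = ['l', str(longitude)]
    for code, amp, phase in constituents:
        # Each code digit is right-aligned in a two-character field
        digits = ''.join(f'{int(c):>2}' for c in code)
        # Amplitude is left-aligned and zero-padded to seven characters
        lines.append(f'{digits} {amp:0<7} {phase}')
    lines.append('-1')
    return '\n'.join(lines) + '\n'

text = hartid_input(-116.455, [
    ((1, -1, 0, 0, 0, 0), '7.41', '-78.0'),
    ((1, 1, 0, 0, 0, 0), '11.79', '-94.0'),
    ((2, 0, 0, 0, 0, 0), '16.06', '-287.0'),
])
print(text)
```

The resulting string could be piped to hartid on the command line, which is what the `printf ... | hartid` command built in section 4.1 does.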

Other packages in SPOTL would allow us to compute and recombine the tidal harmonic constituents, but for now we will use the already computed tidal amplitudes and phases from the metadata of our stations. These amplitudes and phases were originally computed in BAYTAP for the level 2 processed data. 

***
### 4.1 Model tides in SPOTL
***
The code cell in this section produces SPOTL modeled tides in individual text files for each gauge from a print statement with the tide Cartwright codes, amplitudes, and phases. 

In [126]:
# Get tidal amplitudes and phases from the metadata and produce a print statement to run with SPOTL

# Directory of SPOTL and directory to store the tidal time series 
spotl = 'docker run -i spotl hartid'
dname = './DataFiles/SpotlTides/'
rel_dname = './DataFiles/SpotlTides/'
os.makedirs(rel_dname, exist_ok=True)

if all(os.path.exists(rel_dname+f) for f in file_list):
    print('Tide files already exist for this station, channel, and startdate.')
    print('To calculate anyway, click the button. Otherwise, move on to section 4.2.')

sbutton = widgets.Button(description="Compute SPOTL Tides",layout=layout, button_style='danger')
soutput = widgets.Output()

def on_button_clicker(b):
    with soutput:
        soutput.clear_output()
        print('Working on it...')
        
        global samp, val
        # xml metadata starting point
        h_stit = xmldict['strain_xml']['inst_info']['processing']['bsm_processing_history'][-1]['bsm_processing']['tidal_parameters']['tide']

        start_dt = str(start)[0:4] + ' ' +str(start)[5:7] + ' ' +str(start)[8:10] + ' ' +str(start)[11:13] + ' ' +str(start)[14:16] + ' ' +str(start)[17:19]
        val = []
        for ch in ['gauge0','gauge1','gauge2','gauge3']:
            # Start a formatted print statement to run from the command line 
            print_str = f'printf \'l\\n{str(baro_meta[0][0][0].longitude)}'
            # loop through each constituent present in the metadata
            for i in range(0,len(h_stit)):
                gauge = h_stit[i]['phz']['@kind']
                tide = h_stit[i]['@name']
                if ch == gauge and tide in ('O1', 'M2', 'K1', 'S2', 'N2', 'P1'):
                    # Separated Cartwright codes for each harmonic constituent in the metadata
                    dood = str.split(h_stit[i]['@doodson'])
                    # Each of the six code digits is right-aligned in a two-character field
                    code = ''.join(f'{digit:>2}' for digit in dood[0:6])
                    # Amplitude and phase
                    amp = h_stit[i]['amp']['#text']
                    phz = h_stit[i]['phz']['#text']
                    # Append one line with the code, amplitude, and phase
                    print_str = print_str + f'\\n{code} {amp:0<7} {phz:<8}'
            fname = f'{scode}.{cha}.{ch}tides{str(start)[0:4]}{str(start)[5:7]}{str(start)[8:10]}.txt'   
            # SPOTL will only allow up to 999999 values to be calculated
            # if the dataframe is longer than this, we will have to modify the number of 
            # samples and sample interval
            if len(df) > 999999:
                nterms = 999999
            else:
                nterms = len(df)
            samp = round((end-start)/(nterms-2))
            # Finish the print statement
            print_str = print_str + f'\\n-1\' | {spotl} {start_dt} {nterms} {samp} > {dname}{fname} \n'
            # Run in SPOTL
            print(print_str)
            os.system("%s" % (print_str))
            # Make sure the files wrote
            if int(os.popen('wc -l '+dname+fname).read().split(  )[0]) > 0:
                print(ch+' tides saved to '+rel_dname)
            else: 
                print('The '+ch+' tides were not computed.')
sbutton.on_click(on_button_clicker)

display(sbutton, soutput)

Button(button_style='danger', description='Compute SPOTL Tides', layout=Layout(height='40px', width='30%'), st…

Output()

***
### 4.2 Load tides to dataframe
***

This cell loads the SPOTL tides to the dataframe, interpolating values if necessary. 

In [117]:
# Load the tides, and store in the dataframe

tbutton = widgets.Button(description="Load Tidal Corrections to DataFrame",layout=Layout(width='40%', height='40px'), button_style='danger')
toutput = widgets.Output()

def on_tbutton_click(b):
    with toutput:
        toutput.clear_output()
        print('Working on it...')
        global df
        tide = {'0':None,'1':None,'2':None,'3':None}
        for file in os.listdir('./DataFiles/SpotlTides'): 
            if file.startswith(f'{scode}.{cha}.gauge') and file.endswith(f'{str(start)[0:4]}{str(start)[5:7]}{str(start)[8:10]}.txt'):
                n = file[13:14]
                tmpdf = pd.read_csv('./DataFiles/SpotlTides/'+file,header=None)
                # if the dataframe is longer than 999999, interpolate the tides to the dataframe times
                if len(df) > 999999:
                    # One timestamp per SPOTL sample, so the index length matches the file
                    tmpdf.index = [start + samp*i for i in range(len(tmpdf))]
                    tide[n] = interp(tmpdf,df.index)
                else: 
                    tide[n] = tmpdf.values
                
        df['tide_ch0'] = tide['0']/1000
        df['tide_ch1'] = tide['1']/1000
        df['tide_ch2'] = tide['2']/1000
        df['tide_ch3'] = tide['3']/1000
        print('Done!')
tbutton.on_click(on_tbutton_click)

display(tbutton, toutput)

Button(button_style='danger', description='Load Tidal Corrections to DataFrame', layout=Layout(height='40px', …

Output()

***
## 5. Regional strains
***
In this section, we will apply the orientation matrix ($a^{-1}_{ij}$) to transform the gauge strains into regional areal and shear strains ($E_j$). Two orientation matrices may exist for the strainmeters: (1) The manufacturer's orientation matrix, and (2) the tidally calibrated matrix. See Hodgkinson et al. (2013) for more information on the calibration methods. We will start by using the latest orientation matrix in the xml metadata. 

Generally, each gauge strain ($e_i$) can be expressed as a linear combination of the areal and shear strains multiplied by coefficients derived from instrument orientation and instrument-bedrock coupling information. This is:
>$$e_{i} = a_{i1}E_A + a_{i2}E_D + a_{i3}E_S $$

Where the coupling coefficients ($a_{ij}$) can be determined through calibration with a known physical event (e.g. the tides, seismic waves), or by assuming constrained instrument properties. In the latter case, the manufacturer's coupling coefficients ($c_i$ for areal coupling, $d_i$ for shear coupling), and gauge orientations ($\theta_i$) are used to create the orientation matrix. With these estimated quantities for one gauge, the equation above becomes: 
>$$e_{i} = 0.5[c_{i}E_A + d_{i}\cos(2\theta_i)E_D + d_{i}\sin(2\theta_i)E_S] $$

Whether determined by calibration with known geophysical events or by assuming instrument coupling and orientation, we can apply the inverse of the orientation matrix (note that it is the Moore-Penrose Pseudoinverse) to the measured gauge strain as follows:

>$$a^{-1}_{ij} e_{i} = E_{j}$$ <br>
>$$\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \\ a_{41} & a_{42} & a_{43} \end{pmatrix}^{-1} \begin{pmatrix} e_{0} \\ e_{1} \\ e_{2} \\ e_{3} \end{pmatrix} = \begin{pmatrix} e_{EE} + e_{NN} \\ e_{EE} - e_{NN} \\ 2e_{EN} \end{pmatrix} = \begin{pmatrix} E_A \\ E_D \\ E_S \end{pmatrix}$$
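
As a rough illustration of the relations above, the following sketch builds an orientation matrix from the coupling form of the gauge equation, forward-models four gauge strains from known regional strains, and recovers them with the pseudoinverse. The gauge azimuths and strain values are hypothetical; the coupling coefficients 1.5 and 3.0 match the values in the code cell of this section. Unlike the notebook function, which loops over samples, this version applies the matrix to all samples at once.

```python
import numpy as np

# Hypothetical gauge azimuths in degrees; real values come from the station metadata
theta = np.radians([0.0, 60.0, 120.0, 30.0])
c, d = 1.5, 3.0   # areal and shear coupling coefficients

# One row per gauge: e_i = 0.5 * [c*E_A + d*cos(2*theta_i)*E_D + d*sin(2*theta_i)*E_S]
a_mat = 0.5 * np.column_stack([
    c * np.ones_like(theta),
    d * np.cos(2.0 * theta),
    d * np.sin(2.0 * theta),
])                                # shape (4, 3)

# Moore-Penrose pseudoinverse maps 4 gauge strains to 3 regional strains
a_pinv = np.linalg.pinv(a_mat)    # shape (3, 4)

# Made-up regional strains: rows are E_A, E_D, E_S; columns are time samples
E_true = np.array([
    [10.0, 12.0, 8.0],
    [-4.0, -3.5, -5.0],
    [2.0, 2.5, 1.0],
])

gauges = a_mat @ E_true           # forward model, shape (4, 3 samples)
E_est = a_pinv @ gauges           # recovered regional strains, shape (3, 3 samples)
```

With four gauges and three unknowns the system is overdetermined, so on real, noisy data the pseudoinverse returns the least squares estimate rather than an exact recovery.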


In [128]:
# Define a function to apply the orientation matrix to gauge data
# Get the orientation matrix from the latest xml metadata entry
def regional_s(z,y,x,w):
    '''
    A function that takes 4 dataframe type columns (strain) and applies the orientation matrix to produce areal and shear strains.
    ''' 
    EA = []
    ED = []
    ES = []
    for i in range(0,len(df)): 
        gauge_strain = np.array([z.iloc[i],y.iloc[i],x.iloc[i],w.iloc[i]])
        Ei = np.matmul(a_inv,gauge_strain)
        EA.append(Ei[0])
        ED.append(Ei[1])
        ES.append(Ei[2])
    return EA, ED, ES

print('Tip: if you know one of your gauges has bad data, and you don\'t want that to appear in your regional strains, exclude it now.')

exclude = {'ch0':0,'ch1':1,'ch2':2,'ch3':3}
exc = widgets.SelectMultiple(
    options=exclude,
    description='Gauges to exclude:',
    disabled=False, style=style
    )

display(exc)

ibutton = widgets.Button(description="Get Orientation Matrix",layout=layout, button_style='danger')
ioutput = widgets.Output()

def on_ibutton_clicked(b):
    with ioutput:
        ioutput.clear_output()
        print('Working on it...')
        global a_inv
        # The orientation matrix from the xml metadata
        a_loc = xmldict['strain_xml']['inst_info']['processing']['bsm_processing_history'][-1]['bsm_processing']['orientation_matrix']
        c = 1.5; d = 3.0
        a1 = np.array([c*float(a_loc['o11']),d*float(a_loc['o12']),d*float(a_loc['o13'])])
        a2 = np.array([c*float(a_loc['o21']),d*float(a_loc['o22']),d*float(a_loc['o23'])])
        a3 = np.array([c*float(a_loc['o31']),d*float(a_loc['o32']),d*float(a_loc['o33'])])
        a4 = np.array([c*float(a_loc['o41']),d*float(a_loc['o42']),d*float(a_loc['o43'])])
        a_mat = np.array([a1,a2,a3,a4])
        
        # Compute the Moore-Penrose pseudo inverse matrix with SciPy linalg module
        a_inv = linalg.pinv(a_mat)
            
        # If any gauges are excluded, set their column to 0
        for gauge in exc.value:
            a_inv[:,gauge] = 0
        print(a_inv)
        print('Done!')
ibutton.on_click(on_ibutton_clicked)

display(ibutton, ioutput)


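Zeroing a column of the pseudoinverse drops that gauge's contribution while leaving the weights of the other gauges unchanged. An alternative, shown here only as a sketch and not what the cell above does, is to delete the excluded gauge's row from the orientation matrix and recompute the pseudoinverse from the remaining gauges. The helper name `pinv_excluding`, the demo matrix, and the strain values are hypothetical.

```python
import numpy as np

def pinv_excluding(a_mat, excluded):
    """Pseudoinverse of a_mat using only the rows (gauges) not listed in `excluded`.

    The result has the same shape as pinv(a_mat), with zero columns for the
    excluded gauges, so it can be applied to all four gauge channels.
    """
    a_mat = np.asarray(a_mat, dtype=float)
    skip = set(excluded)
    keep = [i for i in range(a_mat.shape[0]) if i not in skip]
    inv = np.zeros((a_mat.shape[1], a_mat.shape[0]))
    inv[:, keep] = np.linalg.pinv(a_mat[keep, :])
    return inv

# Hypothetical ideal orientation matrix (azimuths counterclockwise from east)
az = np.radians([0.0, 60.0, 120.0, 30.0])
a_demo = 0.5 * np.column_stack([np.ones(4), np.cos(2 * az), np.sin(2 * az)])

regional_true = np.array([1.0, 0.5, -1.0])   # E_A, E_D, E_S
gauge_demo = a_demo @ regional_true
gauge_demo[3] = 999.0                        # pretend gauge 3 returned bad data

# Option A: recompute the pseudoinverse without gauge 3
recomputed = pinv_excluding(a_demo, [3]) @ gauge_demo

# Option B: zero the column of the full pseudoinverse
zeroed_inv = np.linalg.pinv(a_demo)
zeroed_inv[:, 3] = 0
zeroed = zeroed_inv @ gauge_demo

print('recomputed:', recomputed)
print('zeroed:    ', zeroed)
```

In this synthetic case only the recomputed pseudoinverse returns the original regional strain. At least three well-oriented gauges must remain for the three regional components to be resolved, and which approach suits a given station is a judgment call that depends on its calibration.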
***
## 6. Saving and plotting the product
***
Congrats! We can now save and plot the linearized gauge strains, areal and shear strains, and corrections.

The first cell offers to save what we have calculated in two files: (1) the linearized gauge strains and their corrections, and (2) the areal and shear strains and their corrections.

The second cell is an interactive plotting option. It does not depend on the first, so you can plot without saving the files. 
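
The saved files start with `#` comment lines, followed by the tab-separated dataframe. The sketch below illustrates that layout on a tiny synthetic dataframe and shows one way such a file could be read back with `pd.read_csv`. The file name, column values, and temporary directory are illustration-only assumptions.

```python
import os
import tempfile
import pandas as pd

# Tiny synthetic stand-in for the regional strain dataframe
demo = pd.DataFrame({'gaugeEA': [0.1, 0.2], 'gaugeED': [0.0, -0.1]},
                    index=pd.Index([0, 60], name='time'))

# Write to a temporary directory so nothing in ./DataFiles is touched
out_dir = tempfile.mkdtemp()
out_file = os.path.join(out_dir, 'demo_regional_strain.txt')

# Comment lines first, then append the tab-separated data below them
with open(out_file, 'w') as f:
    f.write('# ch1 excluded in regional strain transformation.\n')
demo.to_csv(out_file, sep='\t', mode='a')

# Read the file back, skipping the comment lines
loaded = pd.read_csv(out_file, sep='\t', comment='#', index_col=0)
print(loaded)
```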

***
### 6.1 Save the dataframe to text files
***

In [110]:
# Option to save:
# (1) a file with linearized gauge strain, pressure correction, tidal correction, and linear trend correction
# (2) a file with areal and shear strain, plus the areal and shear tidal, pressure, and linear trend corrections

save_button = widgets.Button(description="Compute and Save Strains",layout=layout, button_style='danger')
save_output = widgets.Output()

display(save_button, save_output)

def on_save_button_clicked(b):
    with save_output:
        save_output.clear_output()
        print('Working on it...')
        print('This will take time with long datasets...')
        global moddf
        # regional_s returns (areal, differential shear, engineering shear), in that order
        gaugeEA, gaugeED, gaugeES = regional_s(df['ch0 [ms]'],df['ch1 [ms]'],df['ch2 [ms]'],df['ch3 [ms]'])
        baroEA, baroED, baroES = regional_s(df['baro_ch0'],df['baro_ch1'],df['baro_ch2'],df['baro_ch3'])
        tideEA, tideED, tideES = regional_s(df['tide_ch0'],df['tide_ch1'],df['tide_ch2'],df['tide_ch3'])
        trendEA, trendED, trendES = regional_s(df['trend_ch0'],df['trend_ch1'],df['trend_ch2'],df['trend_ch3'])
        cols = np.column_stack([gaugeEA,gaugeED,gaugeES,tideEA,tideED,tideES,baroEA,baroED,baroES,trendEA,trendED,trendES])
        moddf = pd.DataFrame(cols,df.index,columns=['gaugeEA','gaugeED','gaugeES','tideEA','tideED','tideES','baroEA','baroED','baroES','trendEA','trendED','trendES'])
        
        chkf = widgets.Checkbox(value=False,
                description='Save level 2 data files?',
                disabled=False, style=style
                )
        voutput = widgets.Output()
        display(chkf,voutput)
        def on_checkf(c):
            with voutput:
                voutput.clear_output()
                # Only write files when the box is checked, not when it is unchecked
                if not c['new']:
                    return
                out_dir = './DataFiles/Level2/'
                os.makedirs(out_dir, exist_ok=True)
                sfile = network + '.' + scode + '.' + loc + '.' + cha + '.' +str(start.date)+'_gauge_strain_and_corrections.txt'
                rfile = network + '.' + scode + '.' + loc + '.' + cha + '.' +str(start.date)+'_regional_strain_and_corrections.txt'
                # Note any excluded gauges as comments at the top of the regional strain file
                with open(out_dir+rfile,'w') as f:
                    for idx in exc.value:
                        f.write('# ch'+str(idx)+' excluded in regional strain transformation.\n')
                # Write the reference strain values as comments at the top of the gauge strain file
                with open(out_dir+sfile,'w') as f:
                    for val in [0,1,2,3]:
                        f.write(comment_lines[val])
                # Append the data below the comment lines
                df.to_csv(out_dir+sfile,sep='\t',mode='a')
                moddf.to_csv(out_dir+rfile,sep='\t',mode='a')
                print(sfile+' and '+rfile+' saved to '+out_dir)
        chkf.observe(on_checkf,'value')
    
save_button.on_click(on_save_button_clicked)


***
### 6.2 Plot
***

This plotting option performs calculations on the fly for regional strains. Another plotting option is available in NB4 with the files saved in section 6.1.
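
The on-the-fly calculation amounts to subtracting each selected correction from the linearized gauge strain before any regional transformation. A minimal sketch of that step on a synthetic dataframe follows. The column names match those used in this notebook, while the function name `corrected_gauges` and the constant values are illustration-only assumptions.

```python
import numpy as np
import pandas as pd

def corrected_gauges(frame, pressure=False, linear=False, tides=False):
    """Return gauge strain with the selected corrections subtracted, one column per channel."""
    out = {}
    for ch in ['ch0', 'ch1', 'ch2', 'ch3']:
        series = frame[ch + ' [ms]'].copy()
        if pressure:
            series = series - frame['baro_' + ch]
        if linear:
            series = series - frame['trend_' + ch]
        if tides:
            series = series - frame['tide_' + ch]
        out[ch] = series
    return pd.DataFrame(out)

# Synthetic dataframe with constant strain and corrections (values are made up)
demo = pd.DataFrame(index=np.arange(4))
for ch in ['ch0', 'ch1', 'ch2', 'ch3']:
    demo[ch + ' [ms]'] = 10.0
    demo['baro_' + ch] = 1.0
    demo['trend_' + ch] = 2.0
    demo['tide_' + ch] = 3.0

# Remove the pressure and tidal corrections, keep the linear trend
resid = corrected_gauges(demo, pressure=True, tides=True)
print(resid['ch0'].tolist())  # 10 - 1 - 3 = 6 for every sample
```

The four corrected columns can then be passed to the regional strain transformation in the same way as the uncorrected gauges.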

In [119]:
# Interactive plotting
# Correction Options: linearized strain, barometric correction, linear correction, (offset correction?),tidal correction
# Plot Options: gauges, regional strains
        
# Set initial values
correct = {'Pressure':1, 'Linear':2, 'Tides':3}
A = widgets.SelectMultiple(
    options=correct,
    description='Corrections to apply:',
    disabled=False, style=style
    )

plot = {'ch0':0, 'ch1':1, 'ch2':2, 'ch3':3, 'Areal':4,'Differential Shear':5,'Engineering Shear':6}
B = widgets.SelectMultiple(
    options=plot,
    description='Plot:',
    disabled=False, style=style
    )
plot_corr = {'Pressure Correction':1,'Modelled Tides':2,'Linear Trend':3}
C = widgets.SelectMultiple(
    options=plot_corr,
    description='Include Correction:',
    disabled=False, style=style
    )
display(HBox([A,B,C]))
pbutton = widgets.Button(description="Plot", button_style='danger')
poutput = widgets.Output()

def on_pbutton_clicking(but):
    with poutput:
        poutput.clear_output()
        print('Working on it...')
        plt.close()
        # Each selected correction gets a weight of 1, each unselected correction a weight of 0
        p = int(1 in A.value)    # pressure
        lin = int(2 in A.value)  # linear trend
        t = int(3 in A.value)    # tides
        cor_ch = {}
        for ch in ['0','1','2','3']:
            # Corrected gauge data
            cor_ch[ch] = df['ch'+ch+' [ms]'] - df['baro_ch'+ch] * p - df['trend_ch'+ch] * lin - df['tide_ch'+ch] * t
        # Nice time for plotting
        plt.close()
        xtime = (df.index - df.index[0])/60/60/24
        # If areal and/or shear strains are selected, apply the orientation matrix
        plot_regional = any(k in B.value for k in (4,5,6))
        if plot_regional:
            reg = regional_s(cor_ch['0'],cor_ch['1'],cor_ch['2'],cor_ch['3'])
            if 4 in B.value: plt.plot(xtime,reg[0],label='Areal')
            if 5 in B.value: plt.plot(xtime,reg[1],label='Differential Shear')
            if 6 in B.value: plt.plot(xtime,reg[2],label='Engineering Shear')
        # If gauges are selected, plot the corrected gauge strain
        for k in range(4):
            if k in B.value: plt.plot(xtime,cor_ch[str(k)],label='ch'+str(k))
        # If corrections are selected, plot them for every selected gauge or regional strain
        corr_names = {1:'baro',2:'tide',3:'trend'}
        for key in C.value:
            name = corr_names[key]
            for k in range(4):
                if k in B.value: plt.plot(xtime,df[name+'_ch'+str(k)],label='ch'+str(k)+' '+name)
            if plot_regional:
                # Computed on the fly so this cell does not need moddf from section 6.1
                reg_corr = regional_s(df[name+'_ch0'],df[name+'_ch1'],df[name+'_ch2'],df[name+'_ch3'])
                if 4 in B.value: plt.plot(xtime,reg_corr[0],label='EA '+name)
                if 5 in B.value: plt.plot(xtime,reg_corr[1],label='ED '+name)
                if 6 in B.value: plt.plot(xtime,reg_corr[2],label='ES '+name)
        plt.legend()
        plt.title(scode+' strain')
        plt.xlabel('Days from '+str(start)[0:10]+' '+str(start)[11:19])
        plt.ylabel('Microstrain')
        plt.show()
pbutton.on_click(on_pbutton_clicking)

display(pbutton, poutput)


### References

Agnew, D. C. (2012). SPOTL: Some Programs for Ocean-Tide Loading, SIO Technical Report, Scripps Institution of Oceanography. From https://igppweb.ucsd.edu/~agnew/Spotl/spotlmain.html

Hodgkinson, K., Langbein, J., Henderson, B., Mencin, D., & Borsa, A. (2013). Tidal calibration of Plate Boundary Observatory borehole strainmeters. Journal of Geophysical Research: Solid Earth, 118, 447–458. doi:10.1029/2012JB009651

Tamura, Y., & Agnew, D. C. (2008). Baytap08 User's Manual. UC San Diego: Library – Scripps Digital Collection. From https://escholarship.org/uc/item/4c27740c