In [None]:
%matplotlib inline
import matplotlib.pyplot as plt
import glob
import os
from obspy import read, UTCDateTime, read_inventory
from obspy.signal import PPSD
import warnings
import pandas as pd

## Poor man's RMS

We will first simply compute, for each station-component, the file size as a function of time:

In [None]:
stations = ["BE.MEM", "GR.AHRW"]
channels = ["HHZ", "HHE", "HHN"]


In [None]:

for station in stations:
    for channel in channels:
        files = sorted(glob.glob("DATA/MSEED/{}.*.{}*".format(station, channel)))
        if not len(files):
            continue
        days = []
        sizes = []
        for file in files:
            # headonly=True reads only the headers, which is enough to get the start time
            st = read(file, headonly=True)
            
            sizes.append(os.path.getsize(file) / 1024)
            days.append((st[0].stats.starttime + 3600).date)
        plt.scatter(days, sizes, label="{}.{}".format(station, channel))
        break
    break
plt.legend(loc=4, ncols=2)
plt.show()
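One plausible reason why file size works as a rough amplitude proxy is that miniSEED records are usually stored compressed, so larger sample values need more bytes. The sketch below illustrates that effect with synthetic integers and `zlib` as a stand-in compressor. It is a toy under stated assumptions: `compressed_size` is a hypothetical helper, and `zlib` is not the Steim encoding typically used in miniSEED files.

```python
import random
import struct
import zlib


def compressed_size(samples):
    """Return the zlib-compressed size, in bytes, of int32 samples."""
    raw = struct.pack("<{}i".format(len(samples)), *samples)
    return len(zlib.compress(raw))


random.seed(0)  # fixed seed so the toy example is reproducible
n = 10000

# Two synthetic "days" with the same number of samples but different amplitudes
quiet = [random.randint(-5, 5) for _ in range(n)]
noisy = [random.randint(-5000, 5000) for _ in range(n)]

quiet_size = compressed_size(quiet)
noisy_size = compressed_size(noisy)
print("quiet: {} bytes, noisy: {} bytes".format(quiet_size, noisy_size))
```

Both series contain the same number of samples, so any difference in compressed size comes from the amplitude alone.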

## Computing PSDs using ObsPy

For this, we will first check that the instrument response information is correct:

In [None]:
inv = read_inventory("DATA/RESPONSES/*")
print(inv)

In [None]:
inv.plot_response(min_freq=0.01);

Next step, let's make sure we are able to compute a PSD for each of the two stations:

Reference: https://docs.obspy.org/tutorial/code_snippets/probabilistic_power_spectral_density.html

In [None]:
for station in stations:
    for channel in channels:
        firstfile = sorted(glob.glob("DATA/MSEED/{}.*.{}*".format(station, channel)))[0]
        st = read(firstfile)
        ppsd = PPSD(st[0].stats, metadata=inv)
        ppsd.add(st)
        ppsd.plot()
        break
    break

We can also play with the parameters to obtain more "nervous" spectra:

Reference:
Robert E. Anthony, Adam T. Ringler, David C. Wilson, Manochehr Bahavar, Keith D. Koper; How Processing Methodologies Can Distort and Bias Power Spectral Density Estimates of Seismic Background Noise. Seismological Research Letters 2020; 91 (3): 1694–1706. doi: https://doi.org/10.1785/0220190212

In [None]:
for station in stations:
    for channel in channels:
        firstfile = sorted(glob.glob("DATA/MSEED/{}.*.{}*".format(station, channel)))[0]
        st = read(firstfile)
        ppsd = PPSD(st[0].stats, metadata=inv,
                    period_smoothing_width_octaves=0.125,
                    period_step_octaves=0.0125,
                    period_limits=(0.01, 100))
        ppsd.add(st)
        ppsd.plot()
        break
    break
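To get a feeling for why a smaller `period_step_octaves` gives more detailed spectra, we can estimate how many period bins fit between the two period limits. The helper `approx_bin_count` below is hypothetical and only counts steps of a fixed width in octaves; ObsPy's actual binning differs in its details. The value 0.125 is taken here to be the ObsPy default step.

```python
import math


def approx_bin_count(period_min, period_max, step_octaves):
    """Approximate number of period bins spaced `step_octaves` apart.

    One octave is a factor of two in period, so the range spans
    log2(period_max / period_min) octaves.
    """
    octaves = math.log2(period_max / period_min)
    return int(math.floor(octaves / step_octaves)) + 1


# Same period limits as in the cell above
default_bins = approx_bin_count(0.01, 100, 0.125)   # assumed ObsPy default step
fine_bins = approx_bin_count(0.01, 100, 0.0125)     # step used above

print("bins with step 0.125:  ", default_bins)
print("bins with step 0.0125: ", fine_bins)
```

A ten times smaller step gives roughly ten times more bins, and the narrower smoothing width averages each bin over fewer neighbouring frequencies, which is what makes the curves look more "nervous".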

# Compute PSDs using MSNoise

For this, we will set up our first MSNoise project in this very folder, either from the command line or using ! commands here in the Jupyter notebook:

## First, create the db.ini file and the SQLite database locally:

In [None]:
! msnoise db init --tech=1

## Output of MSNoise's default configuration

In [None]:
! msnoise info

## Then, define some of the msnoise parameters:
In the console, you can start `msnoise admin`, which launches a Python web server listening on http://localhost:5000 (or http://127.0.0.1:5000).

We will define:

* ``data_folder`` = ``./DATA/MSEED``
* ``response_path`` = ``./DATA/RESPONSES``
* ``startdate`` = ``2021-06-01``
* ``enddate`` = ``2021-08-01``

There are thus three ways to set parameters in msnoise:
* using the console: ``msnoise config set data_folder=./DATA/MSEED``
* using the admin interface
* using the API: ``from msnoise.api import * ; db = connect() ; update_config(db, "data_folder", "./DATA/MSEED")``

In [None]:
! msnoise config set data_folder=./DATA/MSEED
! msnoise config set response_path=./DATA/RESPONSES
! msnoise config set startdate=2021-06-01
! msnoise config set enddate=2021-08-01
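When there are many parameters, the console commands can also be generated from a dictionary. The sketch below only builds the `msnoise config set` command lines shown above; `config_set_commands` is a hypothetical helper, not part of MSNoise, and the call that would actually execute the commands is left commented out.

```python
import subprocess


def config_set_commands(params):
    """Build one `msnoise config set key=value` argument list per parameter."""
    return [["msnoise", "config", "set", "{}={}".format(key, value)]
            for key, value in params.items()]


params = {
    "data_folder": "./DATA/MSEED",
    "response_path": "./DATA/RESPONSES",
    "startdate": "2021-06-01",
    "enddate": "2021-08-01",
}

commands = config_set_commands(params)
for argv in commands:
    print(" ".join(argv))
    # Uncomment to really run the command (needs MSNoise and a db.ini here):
    # subprocess.run(argv, check=True)
```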

## First things first, MSNoise Scan Archive

MSNoise works by building and maintaining a data_availability table containing basic metadata. This table is populated by the ``scan_archive`` command, which we have to use here in its "lazy" version since our archive is not SDS structured:


In [None]:
! msnoise scan_archive --init --path ./DATA/MSEED

## Update Station table

Since we scanned the archive first, we now need to "populate" the station table:

In [None]:
! msnoise populate --fromDA
! msnoise db update_loc_chan

## Plotting data_availability

In [None]:
from msnoise.plots.data_availability import main
main(chan="HH?", show=True)


## Define jobs to do

MSNoise is job-based: each day is "one job", and this is true for any type of job (CC, QC, etc.). You can list the jobs with:

In [None]:
! msnoise info -j
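To make the "one day, one job" idea concrete, here is a toy model that creates one job per day and per station between our start and end dates. It is only an illustration: `daily_jobs`, the dictionary layout and the flag value are assumptions made for this sketch and do not reproduce MSNoise's actual jobs table.

```python
from datetime import date, timedelta


def daily_jobs(start, end, stations, jobtype="PSD"):
    """Return one job dict per (day, station), both end dates included."""
    jobs = []
    day = start
    while day <= end:
        for station in stations:
            jobs.append({
                "day": day.isoformat(),
                "station": station,
                "jobtype": jobtype,
                "flag": "T",  # illustrative flag meaning "to do"
            })
        day += timedelta(days=1)
    return jobs


jobs = daily_jobs(date(2021, 6, 1), date(2021, 8, 1), ["BE.MEM", "GR.AHRW"])
print(len(jobs), "jobs, first one:", jobs[0])
```

In this toy model a job is created for every day in the range, whereas in the workflow here jobs are derived from the files recorded in the data_availability table.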

We have files in the data_availability table that are marked "N"ew, let's use them to define jobs, and list them again:

In [None]:
! msnoise new_jobs --init
! msnoise info -j

Compute the PSDs! It's better to run this in a console, so that we keep the interactivity here, but the following command will also work (it doesn't output the debug info!):

In [None]:
#! msnoise -t 2 -d 5 qc compute_psd

and plot the result:

In [None]:
from msnoise.plots.ppsd import main
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    main("BE","MEM", "--", "HHZ", period_lim=(0.01, 100), show=True);
    main("GR","AHRW", "--", "HHZ", period_lim=(0.01, 100), show=True);

What about the jobs?

In [None]:
! msnoise info -j

We have PSD2HDF and then HDF2RMS jobs to do; let's do them in the console (commands commented out here for reference):

In [None]:
# ! msnoise qc psd_to_hdf
# ! msnoise qc hdf_to_rms
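The last step turns PSDs into RMS values. A common way to do this is to convert the PSD from dB to linear power, integrate it over a frequency band and take the square root. The sketch below does that for a synthetic flat spectrum; it is an assumption about the general technique, not a description of what MSNoise does internally, and `band_rms` is a hypothetical helper.

```python
import math


def band_rms(freqs, psd_db, fmin, fmax, to_displacement=False):
    """RMS amplitude in [fmin, fmax] from an acceleration PSD given in dB.

    The PSD is converted to linear power, optionally divided by
    (2*pi*f)**4 to go from acceleration to displacement, then
    integrated with the trapezoidal rule.
    """
    points = [(f, 10 ** (db / 10.0))
              for f, db in zip(freqs, psd_db) if fmin <= f <= fmax]
    if to_displacement:
        points = [(f, p / (2 * math.pi * f) ** 4) for f, p in points]
    total = 0.0
    for (f0, p0), (f1, p1) in zip(points, points[1:]):
        total += 0.5 * (p0 + p1) * (f1 - f0)
    return math.sqrt(total)


# Synthetic flat acceleration PSD of -120 dB between 1 and 10 Hz
freqs = [1 + i / 10 for i in range(91)]
psd_db = [-120.0] * len(freqs)

acc_rms = band_rms(freqs, psd_db, 1.0, 10.0)
disp_rms = band_rms(freqs, psd_db, 1.0, 10.0, to_displacement=True)
print("acceleration RMS: {:.3e}".format(acc_rms))
print("displacement RMS: {:.3e}".format(disp_rms))
```

For a flat spectrum the acceleration RMS is simply the square root of the power level times the bandwidth, which makes the result easy to check by hand.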

and finally, we can export the RMS dataframes:

In [None]:
# ! msnoise qc export_rms

and check their content:

In [None]:
for sta in ["BE.MEM.--.HHZ", "GR.AHRW.--.HHZ"]:
    df = pd.read_csv(os.path.join("PSD","RMS","DISP","{}.csv".format(sta)), index_col=0, parse_dates=True)
    df = df.resample("1H").mean()
    print(df.head())
    df.plot(subplots=True)
    plt.suptitle(sta)

Let's look at the week leading up to the flood, up to the day of the flood itself:

In [None]:
for sta in ["BE.MEM.--.HHZ", "GR.AHRW.--.HHZ"]:
    df = pd.read_csv(os.path.join("PSD","RMS","DISP","{}.csv".format(sta)), index_col=0, parse_dates=True)
    df = df.resample("1H").mean()
    df = df.loc["2021-07-07":"2021-07-15 00:00"]
    print(df.head())
    df.plot(subplots=True)
    plt.suptitle(sta)
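Note that slicing a time-indexed DataFrame with `.loc` and date strings includes both ends of the range. The sketch below checks this on a synthetic hourly series (the `rms` column and its values are made up for the example), using the same slice as above.

```python
import pandas as pd

# Synthetic hourly series covering the period of interest
index = pd.date_range("2021-07-01", "2021-07-20", freq=pd.Timedelta(hours=1))
df = pd.DataFrame({"rms": range(len(index))}, index=index)

# Same label-based slice as in the cell above: both bounds are included
window = df.loc["2021-07-07":"2021-07-15 00:00"]

print(window.index[0], "->", window.index[-1])
print(len(window), "hourly rows")
```

The selection therefore covers eight full days plus the first sample of 15 July.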