Skip to content

Commit

Permalink
Major updates to plotting, new dependencies, several new helper scrip…
Browse files Browse the repository at this point in the history
…ts, and now functional in Python 2.7 and 3.5.
  • Loading branch information
ahotovec committed Mar 8, 2018
1 parent 9b8b029 commit 0bce911
Show file tree
Hide file tree
Showing 24 changed files with 1,160 additions and 304 deletions.
27 changes: 19 additions & 8 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,23 +1,34 @@
<img src="https://raw.githubusercontent.com/ahotovec/REDPy/master/img/logo.png" width=800 alt="REDPy Logo" />

## Overview
REDPy (Repeating Earthquake Detector in Python) is a tool for automated detection and analysis of repeating earthquakes in continuous data. It works without any previous assumptions of what repeating seismicity looks like (that is, does not require a template event). Repeating earthquakes are clustered into "families" via the OPTICS ([Ordering Points To Identify the Clustering Structure](https://en.wikipedia.org/wiki/OPTICS_algorithm)) algorithm, with distance defined by cross-correlation. All data, including waveforms, are stored in an HDF5 table using [PyTables](http://www.pytables.org/).
REDPy (Repeating Earthquake Detector in Python) is a tool for automated detection and analysis of repeating earthquakes in continuous data. It works without any previous assumptions of what repeating seismicity looks like (that is, does not require a template event). Repeating earthquakes are clustered into "families" based on cross-correlation across multiple stations. All data, including waveforms, are stored in an HDF5 table using [PyTables](http://www.pytables.org/).

## Installation
REDPy runs on Python 2.7, and currently has the following major package dependencies:
[numpy](http://www.numpy.org/) | [scipy](http://www.scipy.org/) | [matplotlib with basemap add-on](http://www.matplotlib.org/) | [obspy](http://www.obspy.org/) | [pytables](http://www.pytables.org/) | [pandas](http://pandas.pydata.org/) | [bokeh](http://bokeh.pydata.org/)
REDPy runs on Python 2.7 and Python 3.5, with the following major package dependencies:
[numpy](http://www.numpy.org/) | [scipy](http://www.scipy.org/) | [matplotlib](http://www.matplotlib.org/) | [obspy](http://www.obspy.org/) | [pytables](http://www.pytables.org/) | [pandas](http://pandas.pydata.org/) | [bokeh](http://bokeh.pydata.org/) | [cartopy](http://scitools.org.uk/cartopy/)

These dependencies can be easily installed via [Anaconda](https://www.continuum.io/) on the command line:
These dependencies can be easily installed via [Anaconda](https://www.continuum.io/) on the command line. I *highly* recommend using a virtual environment so that your REDPy environment does not conflict with any other Python packages you may be using. This can be done with the following commands:
```
>> conda install -c obspy obspy pytables pandas basemap bokeh=0.9.3 mock nose PIL
>> conda create -n redpy python=3.5
>> source activate redpy
>> conda install -c obspy obspy
>> conda install -c conda-forge cartopy shapely=1.5.17
>> conda install pytables bokeh pandas
```
You may either use `python=3.5` or `python=2.7`, but other versions are not supported. Whenever you intend to run REDPy, be sure to `source activate redpy` and then `source deactivate` when you are done.

## Usage
Once dependencies are installed and REDPy is downloaded, REDPy can be run out of the box with the following commands to test if the code is working on your computer:
Once dependencies are installed and REDPy is downloaded, REDPy can be run out of the box with the following commands to test if the code is working on your computer. If it completes without error, it will produce files in a folder named `default` after several minutes.
```
>> python initialize.py
>> python catfill.py mshcat.csv
>> python backfill.py -s 2004-09-15 -e 2004-09-24
>> python catfill.py -v mshcat.csv
>> python backfill.py -v -s 2004-09-15 -e 2004-09-24
```

Check out the [Wiki](https://github.com/ahotovec/REDPy/wiki) for more detailed usage!

## Reference

If you would like to reference REDPy in your paper, please cite the following abstract until I finish writing the *Electronic Seismologist* paper for it:

> Hotovec-Ellis, A.J., and Jeffries, C., 2016. Near Real-time Detection, Clustering, and Analysis of Repeating Earthquakes: Application to Mount St. Helens and Redoubt Volcanoes – *Invited*, presented at Seismological Society of America Annual Meeting, Reno, Nevada, 20 Apr.
25 changes: 12 additions & 13 deletions backfill.py
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
# REDPy - Repeating Earthquake Detector in Python
# Copyright (C) 2016 Alicia Hotovec-Ellis (ahotovec@gmail.com)
# Copyright (C) 2016-2018 Alicia Hotovec-Ellis (ahotovec@gmail.com)
# Licensed under GNU GPLv3 (see LICENSE.txt)

import argparse
Expand All @@ -9,21 +9,16 @@
from obspy import UTCDateTime
import time

# Added this to remove the slew of warnings obspy/numpy was throwing at me
import warnings
warnings.filterwarnings("ignore")

"""
Run this script to fill the table with data from the past. If a start time is not
specified, it will check the attributes of the repeater table to pick up where it left
off. Additionally, if this is the first run and a start time is not specified, it will
assume one time chunk prior to the end time. If an end time is not specified, "now" is
assumed. The end time updates at the end of each time chunk processed (default: by hour,
set in configuration). This script can be run as a cron job that will pick up where it
left off if a chunk is missed, but will not run until a full chunk of time has elapsed
since the last trigger. Use -n if you are backfilling with a large amount of time; it will
consume less time downloading the data in small chunks if NSEC is an hour or a day instead
of a few minutes.
left off if a chunk is missed. Use -n if you are backfilling with a large amount of time;
it will consume less time downloading the data in small chunks if NSEC is an hour or a day
instead of a few minutes, but at the cost of keeping orphans for longer.
usage: backfill.py [-h] [-v] [-s STARTTIME] [-e ENDTIME] [-c CONFIGFILE] [-n NSEC]
Expand Down Expand Up @@ -103,16 +98,20 @@
endtime = tend
st, stC = redpy.trigger.getData(tstart+n*opt.nsec-opt.atrig, endtime, opt)
alltrigs = redpy.trigger.trigger(st, stC, rtable, opt)
except (TypeError, obspy.fdsn.header.FDSNException, Exception):
except (TypeError, obspy.clients.fdsn.header.FDSNException, Exception):
print('Could not download or trigger data... moving on')
alltrigs = []

# Clean out data spikes etc.
trigs, junk = redpy.trigger.dataclean(alltrigs, opt, flag=1)
trigs, junk, junkFI, junkKurt = redpy.trigger.dataClean(alltrigs, opt, flag=1)

# Save junk triggers in separate table for quality checking purposes
for i in range(len(junk)):
redpy.table.populateJunk(jtable, junk[i], 0, opt)
redpy.table.populateJunk(jtable, junk[i], 2, opt) # Both types of junk
for i in range(len(junkKurt)):
redpy.table.populateJunk(jtable, junkKurt[i], 1, opt) # Just kurtosis junk
for i in range(len(junkFI)):
redpy.table.populateJunk(jtable, junkFI[i], 0, opt) # Just 'teleseisms'

# Append times of triggers to ttable to compare total seismicity later
redpy.table.populateTriggers(ttable, trigs, ttimes, opt)
Expand Down Expand Up @@ -168,7 +167,7 @@
print("Caught up to: {}".format(endtime-opt.atrig))

if args.verbose: print("Updating plots...")
redpy.plotting.createPlots(rtable, ftable, ttable, ctable, otable, jtable, opt)
redpy.plotting.createPlots(rtable, ftable, ttable, ctable, otable, opt)

if args.verbose: print("Closing table...")
h5file.close()
Expand Down
16 changes: 8 additions & 8 deletions catfill.py
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
# REDPy - Repeating Earthquake Detector in Python
# Copyright (C) 2016 Alicia Hotovec-Ellis (ahotovec@gmail.com)
# Copyright (C) 2016-2018 Alicia Hotovec-Ellis (ahotovec@gmail.com)
# Licensed under GNU GPLv3 (see LICENSE.txt)

import argparse
Expand All @@ -10,10 +10,6 @@
import time
import pandas as pd

# Added this to remove the slew of warnings obspy/numpy was throwing at me
import warnings
warnings.filterwarnings("ignore")

"""
Run this script to fill the table with data from the past using a catalog of events.
Expand Down Expand Up @@ -80,11 +76,15 @@
alltrigs = []

# Clean out data spikes etc.
trigs, junk = redpy.trigger.dataclean(alltrigs, opt, flag=1)
trigs, junk, junkFI, junkKurt = redpy.trigger.dataClean(alltrigs, opt, flag=1)

# Save junk triggers in separate table for quality checking purposes
for i in range(len(junk)):
redpy.table.populateJunk(jtable,junk[i],0,opt)
redpy.table.populateJunk(jtable, junk[i], 2, opt)
for i in range(len(junkKurt)):
redpy.table.populateJunk(jtable, junkKurt[i], 1, opt)
for i in range(len(junkFI)):
redpy.table.populateJunk(jtable, junkFI[i], 0, opt)

# Append times of triggers to ttable to compare total seismicity later
redpy.table.populateTriggers(ttable, trigs, ttimes, opt)
Expand Down Expand Up @@ -130,7 +130,7 @@

if len(rtable) > 1:
if args.verbose: print("Creating plots...")
redpy.plotting.createPlots(rtable, ftable, ttable, ctable, otable, jtable, opt)
redpy.plotting.createPlots(rtable, ftable, ttable, ctable, otable, opt)
else:
print("No repeaters to plot.")

Expand Down
55 changes: 55 additions & 0 deletions clearJunk.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,55 @@
# REDPy - Repeating Earthquake Detector in Python
# Copyright (C) 2016-2018 Alicia Hotovec-Ellis (ahotovec@gmail.com)
# Licensed under GNU GPLv3 (see LICENSE.txt)

import redpy.config
import redpy.table
import redpy.plotting
import argparse
import numpy as np
import os

"""
Run this script to clear the contents of the junk table.
usage: clearJunk.py [-h] [-v] [-c CONFIGFILE]
optional arguments:
-h, --help show this help message and exit
-v, --verbose increase written print statements
-c CONFIGFILE, --configfile CONFIGFILE
use configuration file named CONFIGFILE instead of
default settings.cfg
"""

parser = argparse.ArgumentParser(description=
"Run this script to clear the contents of the junk table.")
parser.add_argument("-v", "--verbose", action="count", default=0,
help="increase written print statements")
parser.add_argument("-c", "--configfile",
help="use configuration file named CONFIGFILE instead of default settings.cfg")
args = parser.parse_args()

if args.configfile:
opt = redpy.config.Options(args.configfile)
if args.verbose: print("Using config file: {0}".format(args.configfile))
else:
opt = redpy.config.Options("settings.cfg")
if args.verbose: print("Using config file: settings.cfg")

if args.verbose: print("Opening hdf5 table: {0}".format(opt.filename))
h5file, rtable, otable, ttable, ctable, jtable, dtable, ftable = redpy.table.openTable(opt)

if args.verbose: print("Removing junk...")

if len(jtable) > 1:
# This will remove all but the last row (have to leave one)
for n in range(len(jtable)-1,0,-1):
jtable.remove_row(n)
jtable.flush()
else:
if args.verbose: print("No junk to remove!")

if args.verbose: print("Closing table...")
h5file.close()
if args.verbose: print("Done")
124 changes: 124 additions & 0 deletions compareCatalog.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,124 @@
# REDPy - Repeating Earthquake Detector in Python
# Copyright (C) 2016-2018 Alicia Hotovec-Ellis (ahotovec@gmail.com)
# Licensed under GNU GPLv3 (see LICENSE.txt)

import argparse
import redpy
import numpy as np
import obspy
from obspy import UTCDateTime
import time
import pandas as pd
from matplotlib.dates import num2date

"""
Run this script to compare table with a specific catalog of events for agreement.
usage: compareCatalog.py [-h] [-v] [-c CONFIGFILE] csvfile
positional arguments:
csvfile catalog csv file with required column 'Time UTC'
optional arguments:
-h, --help show this help message and exit
-v, --verbose increase written print statements
-c CONFIGFILE, --configfile CONFIGFILE
use configuration file named CONFIGFILE instead of
default settings.cfg
"""

parser = argparse.ArgumentParser(description=
"Compares REDPy catalog with csv catalog")
parser.add_argument("csvfile",
help="catalog csv file with required column 'Time UTC'")
parser.add_argument("-v", "--verbose", action="count", default=0,
help="increase written print statements")
parser.add_argument("-c", "--configfile",
help="use configuration file named CONFIGFILE instead of default settings.cfg")
args = parser.parse_args()

if args.configfile:
opt = redpy.config.Options(args.configfile)
if args.verbose: print("Using config file: {0}".format(args.configfile))
else:
opt = redpy.config.Options("settings.cfg")
if args.verbose: print("Using config file: settings.cfg")

if args.verbose: print("Opening hdf5 table: {0}".format(opt.filename))
h5file, rtable, otable, ttable, ctable, jtable, dtable, ftable = redpy.table.openTable(opt)

# Read in csv file using pandas
df = pd.read_csv(args.csvfile)

# First off, a huge assumption here is that the event times are simply within some number
# of seconds of the trigger times. Unless otherwise necessary, I'm going to start off with
# the assumption that they're within the length of time used for the correlation window.
terr = opt.winlen/opt.samprate

# I'll append the best candidate family and the number of seconds the event times differ,
# so that the user can identify questionable matches. I may add the ability to assume a
# location and do some simple ray-tracing like in checkComCat(), but at this point I'm
# not convinced it's necessary. Column dt here is the REDPy trigger time - csv catalog
# time; if it is negative, REDPy either triggered early or it may not be a match.
df['Cluster'] = ''
df['dt'] = ''

# Set up times to compare
evtimes = [UTCDateTime(i) for i in pd.to_datetime(df['Time UTC']).tolist()]
rtimes = [UTCDateTime(i) for i in rtable.cols.startTime[:]]
rtimes += rtable.cols.windowStart[:]/opt.samprate
otimes = [UTCDateTime(i) for i in otable.cols.startTime[:]]
otimes += otable.cols.windowStart[:]/opt.samprate
jtimes = [UTCDateTime(i) for i in jtable.cols.startTime[:]]
jtimes += jtable.cols.windowStart[:]/opt.samprate
ttimes = [UTCDateTime(num2date(i)) for i in ttable.cols.startTimeMPL[:]]

# Flatten families to list
famlist = np.zeros((len(rtimes),)).astype(int)
for fnum in range(len(ftable)):
members = np.fromstring(ftable[fnum]['members'], dtype=int, sep=' ')
famlist[members] = fnum

for i in range(len(df)):

# See if there's junk that matches
if len(jtimes) > 0:
dtimesj = jtimes-evtimes[i]
bestjunk = dtimesj[np.argmin(np.abs(dtimesj))]
if np.abs(bestjunk) < terr:
df['Cluster'][i] = 'junk'
df['dt'][i] = bestjunk

# See if there are any expired orphans that match
if len(ttimes) > 0:
dtimest = np.array(ttimes)-evtimes[i]
besttrig = dtimest[np.argmin(np.abs(dtimest))]
if np.abs(besttrig) < terr:
df['Cluster'][i] = 'expired'
df['dt'][i] = besttrig

# See if there's an orphan that matches
if len(otimes) > 0:
dtimeso = otimes-evtimes[i]
bestorph = dtimeso[np.argmin(np.abs(dtimeso))]
if np.abs(bestorph) < terr:
df['Cluster'][i] = 'orphan'
df['dt'][i] = bestorph

# See if there's a repeater that matches
if len(rtimes) > 0:
dtimesr = rtimes-evtimes[i]
bestr = dtimesr[np.argmin(np.abs(dtimesr))]
if np.abs(bestr) < terr:
df['Cluster'][i] = famlist[np.argmin(np.abs(dtimesr))]
df['dt'][i] = bestr


# Write to matches.csv
if args.verbose: print("Saving to matches_{}.csv".format(opt.groupName))
df.to_csv('matches_{}.csv'.format(opt.groupName), index=False)

if args.verbose: print("Closing table...")
h5file.close()

if args.verbose: print("Done")
55 changes: 55 additions & 0 deletions createReport.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,55 @@
# REDPy - Repeating Earthquake Detector in Python
# Copyright (C) 2016-2018 Alicia Hotovec-Ellis (ahotovec@gmail.com)
# Licensed under GNU GPLv3 (see LICENSE.txt)

import redpy.config
import redpy.table
import argparse

"""
Run this script to manually produce a more detailed "report" page for a given family
(or families)
usage: createReport.py [-h] [-v] [-o] [-c CONFIGFILE] N [N ...]
positional arguments:
N family number(s) to be reported on
optional arguments:
-h, --help show this help message and exit
-v, --verbose increase written print statements
-o, --ordered order plots by OPTICS
-c CONFIGFILE, --configfile CONFIGFILE
use configuration file named CONFIGFILE instead of
default settings.cfg
"""

parser = argparse.ArgumentParser(description=
"Run this script to manually remove families/clusters")
parser.add_argument('famnum', metavar='N', type=int, nargs='+',
help="family number(s) to be reported on")
parser.add_argument("-v", "--verbose", action="count", default=0,
help="increase written print statements")
parser.add_argument("-o", "--ordered", action="count", default=0,
help="order plots by OPTICS")
parser.add_argument("-c", "--configfile",
help="use configuration file named CONFIGFILE instead of default settings.cfg")
args = parser.parse_args()

if args.configfile:
opt = redpy.config.Options(args.configfile)
if args.verbose: print("Using config file: {0}".format(args.configfile))
else:
opt = redpy.config.Options("settings.cfg")
if args.verbose: print("Using config file: settings.cfg")

if args.verbose: print("Opening hdf5 table: {0}".format(opt.filename))
h5file, rtable, otable, ttable, ctable, jtable, dtable, ftable = redpy.table.openTable(opt)

for fnum in args.famnum:
if args.verbose: print("Creating report for family {}...".format(fnum))
redpy.plotting.plotReport(rtable, ftable, ctable, opt, fnum, args.ordered)

if args.verbose: print("Closing table...")
h5file.close()
if args.verbose: print("Done")
Loading

0 comments on commit 0bce911

Please sign in to comment.