# An Example Inversion Setup

Here we provide a rough template for setting up an inversion: gathering an event catalog with moment tensors, collecting observed waveforms and response files, organizing the data into a suitable directory structure, and generating ASDFDataSets that can be used in a SeisFlows inversion.


## Event Catalog (Region: Alaska)

Alaska's cool, let's work there. First we'll use ObsPy to gather our initial catalog of events from the past decade in a box bounding Anchorage and Fairbanks.

In [2]:
from obspy import UTCDateTime
from obspy.clients.fdsn import Client

c = Client("USGS")
cat = c.get_events(starttime=UTCDateTime("2010-01-01T00:00:00"), 
                   endtime=UTCDateTime("2020-01-01T00:00:00"), 
                   maxdepth=60.0,
                   minmagnitude=5.0,
                   maxmagnitude=6.0, 
                   minlatitude=59.75, 
                   maxlatitude=65.50, 
                   minlongitude=-154.5, 
                   maxlongitude=-143.789
                  )
cat

15 Event(s) in Catalog:
2019-01-13T16:45:55.437000Z | +61.299, -150.065 | 5.0 ml | manual
2019-01-06T03:45:34.525000Z | +65.407, -153.280 | 5.1 ml | manual
...
2011-06-16T19:06:05.214000Z | +60.765, -151.076 | 5.1 mw | manual
2011-01-23T02:50:04.629000Z | +63.542, -150.865 | 5.2 mw | manual
To see all events call 'print(CatalogObject.__str__(print_all=True))'

Let's have a look at the event IDs of our events. If we had known these a priori, we could have gathered our catalog based on event IDs.

In [21]:
from pyatoa.utils.form import format_event_name

event_ids = []
for event in cat:
    event_ids.append(format_event_name(event))
event_ids

['ak019lrs7iu',
 'ak0199za3yf',
 'ak0191pccr7',
 'ak018fe5jk85',
 'ak20421672',
 'ak018fcpk9xi',
 'us1000hyge',
 'ak018fcntv5m',
 'ak018dsf3btv',
 'ak017f7s3c06',
 'ak014dlss56k',
 'ak014b5xf1in',
 'ak013ae2ycca',
 'ak0117oi3hnt',
 'ak011122ukq6']

## Getting Moment Tensors

Great, we now have an event catalog, but waveform simulations require moment tensors. Unfortunately it's not straightforward to grab moment tensor information directly from USGS, as they do not directly provide XML files. It would be possible to manually generate moment tensor objects from the [individual event pages](https://earthquake.usgs.gov/earthquakes/eventpage/ak019lrs7iu/moment-tensor), but that seems tedious for a tutorial. Instead we'll use good ol' GCMT. Pyatoa has a function that reads the .ndk files hosted online by the GCMT team and finds events based on origin time and magnitude.

In [24]:
from obspy import Catalog
from pyatoa.core.gatherer import get_gcmt_moment_tensor

events = []
for event in cat:
    origintime = event.preferred_origin().time
    magnitude = event.preferred_magnitude().mag
    try:
        events.append(get_gcmt_moment_tensor(origintime, magnitude))
    except FileNotFoundError:
        print(f"No GCMT event found for: {format_event_name(event)}")
        continue
    
gcmt_catalog = Catalog(events)

No GCMT event found for: ak018fcpk9xi
No GCMT event found for: us1000hyge
No GCMT event found for: ak018fcntv5m
No GCMT event found for: ak013ae2ycca


Great, 11 out of 15 isn't bad, so we'll go ahead and use the GCMT catalog that we just collected. However, if we wanted to retain the origin information from the USGS catalog, we would need to move the moment tensor objects from the GCMT catalog over to the USGS catalog, an exercise left for the reader...
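
As a starting point for that exercise, here is a minimal sketch of the matching logic, assuming events are paired by origin time and magnitude. It deliberately uses a hypothetical `SimpleEvent` stand-in rather than ObsPy `Event` objects, and the tolerances, event IDs and moment tensor values are made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class SimpleEvent:
    """Hypothetical stand-in for an ObsPy Event (not part of Pyatoa or ObsPy)."""
    event_id: str
    origin_time: datetime
    magnitude: float
    moment_tensor: Optional[dict] = None


def attach_moment_tensors(usgs_events, gcmt_events,
                          max_time_diff=timedelta(seconds=60),
                          max_mag_diff=0.5):
    """Copy the moment tensor of the closest-in-time GCMT event onto each
    USGS event, provided time and magnitude agree within the tolerances.
    Returns the IDs of the USGS events that found no match."""
    unmatched = []
    for usgs in usgs_events:
        candidates = [
            g for g in gcmt_events
            if abs(g.origin_time - usgs.origin_time) <= max_time_diff
            and abs(g.magnitude - usgs.magnitude) <= max_mag_diff
        ]
        if not candidates:
            unmatched.append(usgs.event_id)
            continue
        best = min(candidates,
                   key=lambda g: abs(g.origin_time - usgs.origin_time))
        usgs.moment_tensor = best.moment_tensor
    return unmatched


# Made-up values, purely for illustration
usgs = [SimpleEvent("evt_a", datetime(2019, 1, 13, 16, 45, 55), 5.0),
        SimpleEvent("evt_b", datetime(2018, 6, 1, 0, 0, 0), 5.3)]
gcmt = [SimpleEvent("gcmt_a", datetime(2019, 1, 13, 16, 45, 58), 5.1,
                    moment_tensor={"m_rr": 1.0})]

missing = attach_moment_tensors(usgs, gcmt)
print(missing)
```

With real catalogs the same loop would read times and magnitudes from `preferred_origin()` and `preferred_magnitude()`, as in the cell above, and the tolerances would need tuning for the region.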

## Gathering Data

Now we need seismic waveform data for all the events in our catalog. We can use the multithreaded data gathering functionality contained in the Gatherer class. But first we need to use ObsPy to determine which broadband stations are available in the area.

Some pieces of relevant information that help motivate our search:
*  The Alaska Earthquake Center (AEC) operates stations under the network code "AK".
*  The SEED instrument code (second character of the channel code) for a high-gain seismometer is "H".
*  The SEED band code (first character of the channel code) for broadband instruments is "B" or "H", depending on the sampling rate.
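
To make the channel naming concrete, the sketch below splits a three-character SEED channel code into band, instrument and orientation, and shows how a wildcard pattern such as "BH?" selects channels. The lookup tables are intentionally partial, and `describe_channel` is a hypothetical helper written for this example, not an ObsPy or Pyatoa function.

```python
from fnmatch import fnmatchcase

# Partial lookup tables; the SEED manual defines many more codes
BAND_CODES = {
    "B": "broadband (10 to <80 Hz sampling)",
    "H": "high broadband (80 to <250 Hz sampling)",
}
INSTRUMENT_CODES = {
    "H": "high-gain seismometer",
    "L": "low-gain seismometer",
    "N": "accelerometer",
}


def describe_channel(channel):
    """Split a three-character SEED channel code into its parts."""
    if len(channel) != 3:
        raise ValueError(f"expected a 3-character channel code, got {channel!r}")
    band, instrument, orientation = channel
    return {
        "band": BAND_CODES.get(band, "not in this partial table"),
        "instrument": INSTRUMENT_CODES.get(instrument, "not in this partial table"),
        "orientation": orientation,
    }


# "?" matches exactly one character, so "BH?" keeps every orientation
# of the broadband, high-gain channels and nothing else
channels = ["BHZ", "BHN", "BHE", "HHZ", "BNZ"]
selected = [c for c in channels if fnmatchcase(c, "BH?")]
print(selected)
print(describe_channel("BHZ"))
```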


In [19]:
c = Client("IRIS")
inv = c.get_stations(network="AK", 
                     station="*", 
                     location="*",
                     channel="BH?",
                     starttime=UTCDateTime("2010-01-01T00:00:00"), 
                     endtime=UTCDateTime("2020-01-01T00:00:00"), 
                     minlatitude=59.75,                    
                     maxlatitude=65.50, 
                     minlongitude=-154.5, 
                     maxlongitude=-143.789,
                     level="channel"
                    )
inv

Inventory created at 2020-10-08T23:47:15.000000Z
	Created by: IRIS WEB SERVICE: fdsnws-station | version: 1.1.46
		    http://service.iris.edu/fdsnws/station/1/query?starttime=2010-01-01...
	Sending institution: IRIS-DMC (IRIS-DMC)
	Contains:
		Networks (1):
			AK
		Stations (76):
			AK.BMR (Bremner River, AK, USA)
			AK.BPAW (Bear Paw Mountain, AK, USA)
			AK.BRLK (Bradley Lake, AK, USA)
			AK.BWN (Browne, AK, USA)
			AK.CAPN (Captain Cook Nikiski, AK, USA)
			AK.CAST (Castle Rocks, AK, USA)
			AK.CCB (Clear Creek Butte, AK, USA)
			AK.CHUM (Lake Minchumina, AK, USA)
			AK.CUT (Chulitna, AK, USA)
			AK.DDM (Donnely Dome, AK, USA)
			AK.DHY (Denali Highway, AK, USA)
			AK.DIV (Divide Microwave, AK, USA)
			AK.DOT (Dot Lake, AK, USA)
			AK.EYAK (Cordova Ski Area, AK, USA)
			AK.FIB (Fire Island, AK, USA)
			AK.FID (Fidalgo, AK, USA)
			AK.FIRE (Fire Island, AK, USA)
			AK.GHO (Gloryhole, AK, USA)
			AK.GLB (Gilahina Butte, AK, USA)
			AK.GLI (Glacier Island, AK, USA)
			AK.GLM (Gilmore 

In [27]:
# We'll need to create a list of station ids for data gathering
station_codes = []
for net in inv:
    for sta in net:
        station_codes.append(f"{net.code}.{sta.code}.*.BH?")
station_codes

['AK.BMR.*.BH?',
 'AK.BPAW.*.BH?',
 'AK.BRLK.*.BH?',
 'AK.BWN.*.BH?',
 'AK.CAPN.*.BH?',
 'AK.CAST.*.BH?',
 'AK.CCB.*.BH?',
 'AK.CHUM.*.BH?',
 'AK.CUT.*.BH?',
 'AK.DDM.*.BH?',
 'AK.DHY.*.BH?',
 'AK.DIV.*.BH?',
 'AK.DOT.*.BH?',
 'AK.EYAK.*.BH?',
 'AK.FIB.*.BH?',
 'AK.FID.*.BH?',
 'AK.FIRE.*.BH?',
 'AK.GHO.*.BH?',
 'AK.GLB.*.BH?',
 'AK.GLI.*.BH?',
 'AK.GLM.*.BH?',
 'AK.GOAT.*.BH?',
 'AK.HDA.*.BH?',
 'AK.HDA.*.BH?',
 'AK.HIN.*.BH?',
 'AK.HMT.*.BH?',
 'AK.I21K.*.BH?',
 'AK.I23K.*.BH?',
 'AK.J20K.*.BH?',
 'AK.J25K.*.BH?',
 'AK.K20K.*.BH?',
 'AK.K24K.*.BH?',
 'AK.KAI.*.BH?',
 'AK.KLU.*.BH?',
 'AK.KNK.*.BH?',
 'AK.KTH.*.BH?',
 'AK.L20K.*.BH?',
 'AK.L22K.*.BH?',
 'AK.M19K.*.BH?',
 'AK.M20K.*.BH?',
 'AK.MCK.*.BH?',
 'AK.MDM.*.BH?',
 'AK.MLY.*.BH?',
 'AK.N19K.*.BH?',
 'AK.NEA.*.BH?',
 'AK.NEA2.*.BH?',
 'AK.NICH.*.BH?',
 'AK.NKA.*.BH?',
 'AK.O19K.*.BH?',
 'AK.O20K.*.BH?',
 'AK.P23K.*.BH?',
 'AK.PAX.*.BH?',
 'AK.PPLA.*.BH?',
 'AK.PWL.*.BH?',
 'AK.RAG.*.BH?',
 'AK.RC01.*.BH?',
 'AK.RIDG.*.BH?',
 'AK
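
Note that the list above contains 'AK.HDA.*.BH?' twice, so the same station would be requested twice during gathering. A minimal sketch of removing such duplicates while preserving order is shown below; the short `codes` list is just a sample taken from the output above.

```python
# Sample taken from the station code list above, including the duplicate
codes = ["AK.GOAT.*.BH?", "AK.HDA.*.BH?", "AK.HDA.*.BH?", "AK.HIN.*.BH?"]

# dict keys are unique and keep insertion order, so this drops repeats
# without reshuffling the list
unique_codes = list(dict.fromkeys(codes))
print(unique_codes)
```

Applying the same one-liner to `station_codes` before gathering would avoid redundant requests.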

76 stations is a lot! Let's see how many have data by creating an ASDFDataSet for a single event and trying to fill it with as much recorded data as possible.

In [33]:
from pyasdf import ASDFDataSet
from pyatoa import Gatherer, Config

event = gcmt_catalog[0]
event_id = format_event_name(event)

# The gatherer needs to know where to look (Client) and when to look (origintime)
cfg = Config(client="IRIS")
origintime = event.preferred_origin().time

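# Open (or create) an ASDFDataSet for this event and fill it with observations.
# Assumption: the line that opens the dataset is garbled in the source, so the
# file name used here is a guess based on the event ID.
with ASDFDataSet(f"{event_id}.h5") as ds:
    gthr = Gatherer(config=cfg, ds=ds, origintime=origintime)
    gthr.gather_obs_threaded(codes=station_codes)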

AK.FIB.*.BH? status: 0
AK.DDM.*.BH? status: 0
AK.DHY.*.BH? status: 1
AK.FID.*.BH? status: 1
AK.GLI.*.BH? status: 1
AK.GLM.*.BH? status: 0
AK.CAST.*.BH? status: 4
AK.CHUM.*.BH? status: 4
AK.DOT.*.BH? status: 4
AK.BPAW.*.BH? status: 4
AK.EYAK.*.BH? status: 4
AK.BMR.*.BH? status: 4
AK.GLB.*.BH? status: 4
AK.CCB.*.BH? status: 4
AK.DIV.*.BH? status: 4
AK.CUT.*.BH? status: 4
AK.BWN.*.BH? status: 4
AK.GHO.*.BH? status: 4
AK.CAPN.*.BH? status: 4
AK.BRLK.*.BH? status: 4
AK.HDA.*.BH? status: 4
AK.GOAT.*.BH? status: 4
AK.HDA.*.BH? status: 4
AK.I23K.*.BH? status: 0
AK.I21K.*.BH? status: 0
AK.J20K.*.BH? status: 0
AK.J25K.*.BH? status: 0
AK.K20K.*.BH? status: 0
AK.K24K.*.BH? status: 0
AK.HIN.*.BH? status: 4
AK.L20K.*.BH? status: 0
AK.L22K.*.BH? status: 0
AK.FIRE.*.BH? status: 4
AK.M19K.*.BH? status: 0
AK.M20K.*.BH? status: 0
AK.MDM.*.BH? status: 0
AK.NKA.*.BH? status: 0
AK.N19K.*.BH? status: 0
AK.NEA.*.BH? status: 0
AK.MLY.*.BH? status: 1
AK.O19K.*.BH? status: 0
AK.P23K.*.BH? status: 0
AK.O20K.*.BH? status: 0
AK.HMT.*.BH? status: 4
AK.SGA.*.BH? status: 0
AK.KAI.*.BH? status: 4
AK.KLU.*.BH? status: 4
AK.KTH.*.BH? status: 4
AK.SSN.*.BH? status: 1
AK.MCK.*.BH? status: 4
AK.TRF.*.BH? exception: Unable to create link (name already exists)
AK.NEA2.*.BH? status: 4
AK.PAX.*.BH? status: 4
AK.NICH.*.BH? status: 4
AK.WAT1.*.BH? status: 0
AK.WAT3.*.BH? status: 0
AK.WAT2.*.BH? status: 0
AK.PPLA.*.BH? status: 4
AK.KNK.*.BH? status: 4
AK.PWL.*.BH? status: 4
AK.SAW.*.BH? status: 4
AK.SCRK.*.BH? status: 4
AK.WAT4.*.BH? status: 0
AK.RND.*.BH? status: 4
AK.RIDG.*.BH? status: 4
AK.WAT5.*.BH? status: 0
AK.WAT6.*.BH? status: 1
AK.RC01.*.BH? status: 4
AK.SCM.*.BH? 

Great! Looks like we got quite a lot of data. Some stations did not return any data, as expected, but many of them returned a StationXML plus three-component waveforms (indicated by status == 4).
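
To get a quick overview of such a log, the status lines can be tallied with a few lines of standard-library Python. This is only a sketch: the `log` string is a small sample copied from the output above, and `tally_statuses` is a hypothetical helper, not part of Pyatoa.

```python
import re
from collections import Counter

# A few lines copied from the gathering output above
log = """\
AK.FIB.*.BH? status: 0
AK.DHY.*.BH? status: 1
AK.CAST.*.BH? status: 4
AK.CHUM.*.BH? status: 4
AK.TRF.*.BH? exception: Unable to create link (name already exists)
"""

STATUS_LINE = re.compile(r"^(?P<code>\S+) status: (?P<status>\d+)$")


def tally_statuses(text):
    """Count how often each status occurs and list the codes with status 4."""
    counts = Counter()
    complete = []
    for line in text.splitlines():
        match = STATUS_LINE.match(line.strip())
        if match is None:
            continue  # skip blank lines and exception lines
        status = int(match.group("status"))
        counts[status] += 1
        if status == 4:
            complete.append(match.group("code"))
    return counts, complete


counts, complete = tally_statuses(log)
print(dict(counts))
print(complete)
```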

In [34]:
ds.waveforms.list()

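ValueError: Not a location (invalid object ID)

This error is what h5py typically raises when the underlying HDF5 file is no longer open. If it appears, reopen the ASDFDataSet before listing its waveforms.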