# Data retrieval with Pyrocko

In this notebook, we use Pyrocko to retrieve event, waveform, and station data from FDSN web services by script. See the [Pyrocko FDSN client documentation](https://pyrocko.org/docs/current/library/reference/client/waveform.html).

Reference: https://pyrocko.org/docs/current/library/examples/fdsn_download.html


### Table of contents:
- [Event](#Event)
- [Waveform](#Waveform)
- [Station](#Station)
- [Combining](#Combining)
- [Summary](#Summary)


In [None]:
import matplotlib.pyplot as plt

from pyrocko.client import fdsn
from pyrocko import util, io, trace, model
from pyrocko.io import quakeml

# Event
<a id='Event'></a>
First, we use Pyrocko to query events. This requires a time range and a service that offers the 'event' option, e.g. IRIS or GFZ. The time range applies to the origin time of the event, not to the time at which its signal arrives at a particular station.
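
To make the origin-time criterion concrete, here is a minimal sketch that filters a small, made-up catalogue using only the standard library. The `to_epoch` helper and the catalogue entries are hypothetical; they only mimic the selection the service performs on its side, and treating the window bounds as inclusive is an assumption.

```python
import calendar
import time


def to_epoch(time_string):
    """Convert 'YYYY-MM-DD HH:MM:SS' (interpreted as UTC) to epoch seconds."""
    return calendar.timegm(time.strptime(time_string, '%Y-%m-%d %H:%M:%S'))


tmin = to_epoch('2014-01-01 16:10:00')
tmax = to_epoch('2014-01-01 16:39:59')

# Hypothetical catalogue entries: (name, origin time in epoch seconds)
catalogue = [
    ('before_window', to_epoch('2014-01-01 16:05:00')),
    ('inside_window', to_epoch('2014-01-01 16:20:00')),
    ('after_window', to_epoch('2014-01-01 16:45:00')),
]

# Keep only events whose origin time falls inside the requested window.
# Arrival times at stations play no role in this selection.
selected = [name for name, origin in catalogue if tmin <= origin <= tmax]
print(selected)
```

An event that originated shortly before `tmin` is not returned, even if its waves are still arriving at distant stations during the window.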


In [None]:
tmin = util.stt('2014-01-01 16:10:00.000')
tmax = util.stt('2014-01-01 16:39:59.000')

# request events at IRIS for the given time span
request_event = fdsn.event(
    site='iris', starttime=tmin, endtime=tmax)

# parse QuakeML and extract pyrocko events
events = quakeml.QuakeML.load_xml(request_event).get_pyrocko_events()

# If wanted, one can easily store the events:
# model.dump_events(events, 'iris-events.pf')

In [None]:
for event in events:
    print(event)

# Waveform
<a id='Waveform'></a>
As with events, we need a service and a time range, plus information about which stations to request. Each channel is identified by its NSLC code, short for "Network.Station.Location.Channel". Several selections can be passed as a list.
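
The sketch below illustrates how NSLC patterns with asterisk wildcards select channels. It uses the standard library's `fnmatch` as a stand-in for the matching done by the FDSN service; the `matches` helper and the list of available channels (including the `XX` network code) are hypothetical and only serve the illustration.

```python
from fnmatch import fnmatchcase


def matches(nslc, pattern):
    """Return True if every NSLC code matches the corresponding pattern."""
    return all(fnmatchcase(code, pat) for code, pat in zip(nslc, pattern))


# Patterns as used in the selection, without the time span
patterns = [
    ('*', 'HMDT', '*', '*'),    # all components of station HMDT
    ('GE', 'EIL', '*', '*Z'),   # vertical components of GE.EIL
]

# Hypothetical channels a service might offer (made up for illustration)
available = [
    ('GE', 'EIL', '', 'BHZ'),
    ('GE', 'EIL', '', 'BHN'),
    ('XX', 'HMDT', '', 'BHE'),
    ('GE', 'APE', '', 'BHZ'),
]

selected = [
    nslc for nslc in available
    if any(matches(nslc, pattern) for pattern in patterns)
]
for nslc in selected:
    print('.'.join(nslc))
```

Only the vertical channel of `GE.EIL` passes the second pattern, while every channel of `HMDT` passes the first, regardless of network.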

In [None]:
tmin = util.stt('2014-01-01 16:10:00.000')
tmax = util.stt('2014-01-01 16:39:59.000')

# select stations by their NSLC id and wildcards (asterisk) for waveform download
selection = [
    ('*', 'HMDT', '*', '*', tmin, tmax),    # all available components
    ('GE', 'EIL', '*', '*Z', tmin, tmax),   # all vertical components
]

# Restricted access token
# token = open('token.asc', 'rb').read()
# request_waveform = fdsn.dataselect(site='geofon', selection=selection,
#                                    token=token)

# setup a waveform data request
request_waveform = fdsn.dataselect(site='geofon', selection=selection)

# write the incoming data stream to '/tmp/traces.mseed'
with open('/tmp/traces.mseed', 'wb') as file:
    file.write(request_waveform.read())

In [None]:
traces = io.load('/tmp/traces.mseed')
for tr in traces:
    print(tr)

# Station data
<a id='Station'></a>
Station metadata is retrieved in much the same way as waveforms. The response can be saved as StationXML or as YAML, the format Pyrocko favors internally.

In [None]:
tmin = util.stt('2014-01-01 16:10:00.000')
tmax = util.stt('2014-01-01 16:39:59.000')

# select stations by their NSLC id and wildcards (asterisk) for metadata download
selection = [
    ('*', 'HMDT', '*', '*', tmin, tmax),    # all available components
    ('GE', 'EIL', '*', '*Z', tmin, tmax),   # all vertical components
]

# request meta data
request_response = fdsn.station(
    site='geofon', selection=selection, level='response')

# save the response in YAML and StationXML format
request_response.dump(filename='/tmp/responses_geofon.yaml')
request_response.dump_xml(filename='/tmp/responses_geofon.xml')

# Combining
Here we download waveform and station data from the same service and apply a response correction to each trace.
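
The restitution in the next cell passes four corner frequencies as `freqlimits`. They describe a frequency-domain taper that limits the deconvolution to a band where it is stable. As a rough illustration, assuming a cosine-shaped ramp between the corner frequencies, the taper could be sketched as follows. The function `cosine_taper` is a hypothetical stand-in written for this notebook, not Pyrocko's implementation.

```python
import math


def cosine_taper(f, a, b, c, d):
    """Taper value at frequency f for corner frequencies a < b < c < d.

    Zero outside [a, d], one inside [b, c], cosine ramps in between.
    """
    if f <= a or f >= d:
        return 0.0
    if f < b:
        return 0.5 - 0.5 * math.cos(math.pi * (f - a) / (b - a))
    if f <= c:
        return 1.0
    return 0.5 + 0.5 * math.cos(math.pi * (f - c) / (d - c))


# Same corner frequencies (in Hz) as in the restitution below
freqlimits = (0.01, 0.1, 1., 2.)

for f in (0.005, 0.055, 0.5, 1.5, 3.0):
    print('%.3f Hz -> %.2f' % (f, cosine_taper(f, *freqlimits)))
```

Frequencies between 0.1 Hz and 1 Hz pass unchanged, while content below 0.01 Hz and above 2 Hz is suppressed entirely.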

In [None]:
service = 'geofon'

tmin = util.stt('2014-01-01 16:10:00.000')
tmax = util.stt('2014-01-01 16:39:59.000')

# select stations by their NSLC id and wildcards (asterisk) for waveform download
selection = [
    ('*', 'HMDT', '*', '*', tmin, tmax),    # all available components
    ('GE', 'EIL', '*', '*Z', tmin, tmax),   # all vertical components
]

request_waveform = fdsn.dataselect(site=service, selection=selection)

with open('/tmp/traces2.mseed', 'wb') as file:
    file.write(request_waveform.read())

request_response = fdsn.station(
    site=service, selection=selection, level='response')

# Loop through retrieved waveforms and request meta information for each trace
traces = io.load('/tmp/traces2.mseed')
displacement = []
for tr in traces:
    polezero_response = request_response.get_pyrocko_response(
        nslc=tr.nslc_id,
        timespan=(tr.tmin, tr.tmax),
        fake_input_units='M')
    # *fake_input_units*: required for consistent responses throughout entire
    # data set

    # deconvolve transfer function
    restituted = tr.transfer(
        tfade=2.,
        freqlimits=(0.01, 0.1, 1., 2.),
        transfer_function=polezero_response,
        invert=True)

    displacement.append(restituted)

In [None]:
# Inspect waveforms using Snuffler
# trace.snuffle(displacement)

In [None]:
print(displacement)
# squeeze=False keeps `ax` two-dimensional even if only one trace is plotted
f, ax = plt.subplots(
    len(displacement), 1, sharex=True, squeeze=False, figsize=(16, 9))
for cnt, tr in enumerate(displacement):
    ax[cnt, 0].plot(tr.get_xdata(), tr.ydata, color='k')
    ax[cnt, 0].set_ylabel('%s.%s' % (tr.station, tr.channel))
plt.show()

# Summary
In this notebook we have learned how to use Pyrocko's FDSN client to request and download event, waveform, and station data, and how to correct waveforms for the instrument response.