### Worked example with Parkfield metadata

In this example we use ObsPy to load the StationXML as an Inventory object, and IRIS ROVER to retrieve the data. The inventory provides an iterable view of all the filter stages stored in the StationXML.

ROVER downloads the data to local miniSEED files, which can also be read with ObsPy.

Much of the code here is a variant of the example archived in the mth5 repository, in the file make_mth5_from_iris_dmc_local.py.




In [1]:
from obspy import read, UTCDateTime
from pathlib import Path

from aurora.general_helper_functions import execute_command
from aurora.sandbox.io_helpers.inventory_review import scan_inventory_for_nonconformity
from aurora.sandbox.io_helpers.iris_dataset_config import IRISDatasetConfig
from aurora.sandbox.mth5_helpers import initialize_mth5
from mth5 import mth5
from mth5.timeseries import RunTS

from mt_metadata.timeseries.stationxml import XMLInventoryMTExperiment


2021-09-02 15:58:56,510 [line 106] mth5.setup_logger - INFO: Logging file can be found /home/kkappler/software/irismt/mth5/logs/mth5_debug.log


# Installing ROVER

To install ROVER, follow the directions at
https://iris-edu.github.io/rover/
or install it with pip:

In [2]:
!pip install rover[mseedindex]
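
After installation it can be useful to confirm that the rover command is visible from the notebook's environment. The following is a minimal sketch using only the standard library; the helper name `find_rover` is hypothetical and is not part of ROVER or aurora.

```python
import shutil


def find_rover(executable="rover"):
    """Return the full path of the executable, or None if it is not on PATH."""
    return shutil.which(executable)


rover_path = find_rover()
if rover_path is None:
    print("rover was not found on PATH; check that the pip install succeeded")
else:
    print(f"rover found at {rover_path}")
```

If the command is not found, the install may have gone into a different Python environment than the one running the notebook.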



# Configuring ROVER

#### Create the directory where we will run ROVER

In [10]:
ROVER_DIR = Path.home().joinpath("rover")
ROVER_DIR.mkdir(exist_ok=True)
print(f"rover path = {ROVER_DIR}")

rover path = /home/kkappler/rover


In [11]:
cmd = f"rover init-repository {ROVER_DIR}"
print(cmd)

rover init-repository /home/kkappler/rover


In [12]:
execute_command(cmd)

executing from /home/kkappler/


init-repository  DEFAULT: Writing new config file "/home/kkappler/rover/rover.config"
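
The implementation of `execute_command` (an aurora helper) is not shown in this notebook. If aurora is not available, a rough stand-in can be sketched with `subprocess`. The function name `run_command` and its exact behavior are assumptions made for illustration and are not the aurora implementation.

```python
import subprocess
import sys
from pathlib import Path


def run_command(cmd, exec_dir=None):
    """Run a command string in exec_dir (default: home directory); return its stdout."""
    exec_dir = Path(exec_dir) if exec_dir is not None else Path.home()
    print(f"executing from {exec_dir}")
    result = subprocess.run(
        cmd,
        shell=True,           # cmd is a single string, as in the cells above
        cwd=exec_dir,
        capture_output=True,
        text=True,
        check=True,           # raise CalledProcessError on a non-zero exit status
    )
    print(result.stdout, end="")
    return result.stdout


# Harmless usage example: ask the current Python interpreter to print a number.
output = run_command(f'"{sys.executable}" -c "print(42)"', exec_dir=".")
```

Passing `check=True` makes a failed command raise an exception instead of failing silently, which is convenient when a later cell depends on the files the command creates.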


## Defining a request file

In [15]:
request_file = ROVER_DIR.joinpath("BK_PKD_B_Request.txt")
# Each line requests one channel over the same two-hour window.
# The context manager closes the file even if a write fails.
with open(request_file, "w") as f:
    f.write("BK PKD * BQ2 2004-09-28T00:00:00.000000Z 2004-09-28T01:59:59.974999Z\n")
    f.write("BK PKD * BQ3 2004-09-28T00:00:00.000000Z 2004-09-28T01:59:59.974999Z\n")
    f.write("BK PKD * BT1 2004-09-28T00:00:00.000000Z 2004-09-28T01:59:59.974999Z\n")
    f.write("BK PKD * BT2 2004-09-28T00:00:00.000000Z 2004-09-28T01:59:59.974999Z\n")
    # f.write("BK PKD * BT3 2004-09-28T00:00:00.000000Z 2004-09-28T01:59:59.974999Z\n")
cmd = f"cat {request_file}"
execute_command(cmd)


executing from /home/kkappler/
BK PKD * BQ2 2004-09-28T00:00:00.000000Z 2004-09-28T01:59:59.974999Z
BK PKD * BQ3 2004-09-28T00:00:00.000000Z 2004-09-28T01:59:59.974999Z
BK PKD * BT1 2004-09-28T00:00:00.000000Z 2004-09-28T01:59:59.974999Z
BK PKD * BT2 2004-09-28T00:00:00.000000Z 2004-09-28T01:59:59.974999Z
BK PKD * BT3 2004-09-28T00:00:00.000000Z 2004-09-28T01:59:59.974999Z
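
Every request line above follows the same pattern, with only the channel code changing, so the lines can be generated rather than typed out. The sketch below assumes the fields are network, station, location, channel, start, and end, as the lines above suggest; the helper name `build_request_lines` is hypothetical.

```python
import tempfile
from pathlib import Path


def build_request_lines(network, station, channels, start, end, location="*"):
    """Return one request line per channel, mirroring the format shown above."""
    return [f"{network} {station} {location} {ch} {start} {end}\n" for ch in channels]


lines = build_request_lines(
    "BK",
    "PKD",
    ["BQ2", "BQ3", "BT1", "BT2"],
    "2004-09-28T00:00:00.000000Z",
    "2004-09-28T01:59:59.974999Z",
)

# Write to a temporary directory here so the sketch does not touch ROVER_DIR.
with tempfile.TemporaryDirectory() as tmp_dir:
    request_path = Path(tmp_dir) / "BK_PKD_B_Request.txt"
    with open(request_path, "w") as f:
        f.writelines(lines)
    written = request_path.read_text()

print(written, end="")
```

In the notebook itself, `request_path` would simply be replaced by `request_file`.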


## Retrieving Data

In [14]:
cmd = f"rover retrieve {request_file}"
execute_command(cmd, exec_dir=str(ROVER_DIR))

executing from /home/kkappler/rover


retrieve  DEFAULT: ROVER version 1.0.5 - starting retrieve
retrieve  DEFAULT: Status available at http://127.0.0.1:8000
retrieve  DEFAULT: Trying new retrieval attempt 1 of 3.
retrieve  DEFAULT: Downloading BK_PKD 2004-272 (N_S 1/1; day 1/1)
retrieve  DEFAULT: Successful retrieval, downloaded data so resetting retry count and verify.
retrieve  DEFAULT: Trying new retrieval attempt 1 of 3.
retrieve  DEFAULT: Retrieval attempt 1 of 3 is complete.
retrieve  DEFAULT: The initial retrieval attempt resulted in no errors or data downloaded, will verify.
retrieve  DEFAULT: Trying new retrieval attempt 2 of 3.
retrieve  DEFAULT: Retrieval attempt 2 of 3 is complete.
retrieve  DEFAULT: The final retrieval, attempt 2 of 3, made no downloads and had no errors, we are complete.
retrieve  DEFAULT: 
retrieve  DEFAULT: ----- Retrieval Finished -----
retrieve  DEFAULT: 
retrieve  DEFAULT: 
retrieve  DEFAULT: A ROVER retrieve task on thales4
retrieve  DEFAULT: started 2021-09-02T16:00:43 (2021-09-02T23:

## Using IRISDatasetConfig to get metadata and data

In [16]:
test_data_set = IRISDatasetConfig()
test_data_set.dataset_id = "rover_example"
test_data_set.network = "BK"
test_data_set.station = "PKD"
test_data_set.starttime = UTCDateTime("2004-09-28T00:00:00")
test_data_set.endtime = UTCDateTime("2004-09-28T02:00:00")
test_data_set.channel_codes = "BQ2,BQ3,BT1,BT2"
test_data_set.description = "Two hours of data at 10Hz from PKD 17h before M6"
test_data_set.components_list = ["ex", "ey", "hx", "hy",]

In [17]:
inventory = test_data_set.get_inventory_from_iris(ensure_inventory_stages_are_named=True)

['Q2', 'Q3', 'T1', 'T2']
Detected a likely non-FDSN conformant convnetion unless there is a vertical electric dipole
Fixing Electric channel codes
HACK FIX ELECTRIC CHANNEL CODES COMPLETE
Detected a likely non-FDSN conformant convnetion unless there are Tidal data in this study
Fixing Magnetic channel codes
HACK FIX MAGNETIC CHANNEL CODES COMPLETE
BQ1 1 V/M
BQ1 2 V
BQ1 3 V
BQ1 4 COUNTS
BQ1 5 COUNTS
BQ1 6 COUNTS
BQ1 7 COUNTS
BQ2 1 V/M
BQ2 2 V
BQ2 3 V
BQ2 4 COUNTS
BQ2 5 COUNTS
BQ2 6 COUNTS
BQ2 7 COUNTS
BF1 1 T
BF1 2 V
BF1 3 COUNTS
BF1 4 COUNTS
BF1 5 COUNTS
BF1 6 COUNTS
BF2 1 T
BF2 2 V
BF2 3 COUNTS
BF2 4 COUNTS
BF2 5 COUNTS
BF2 6 COUNTS
BK-PKD-BQ1 7-stage response
stagename None
ASSIGNING stage Response type: PolesZerosResponseStage, Stage Sequence Number: 1
	BQ1_0 
	From V/M (Electric field in Volts per meter) to V (Volts)
	Stage gain: 101.0, defined at 8.00 Hz
	Transfer function type: LAPLACE (RADIANS/SECOND)
	Normalization factor: 1.22687e+09, Normalization frequency: 8.00 Hz
	Poles: 1

## Handle non-FDSN compliant metadata

In [18]:
inventory = scan_inventory_for_nonconformity(inventory)

['Q1', 'Q2', 'F1', 'F2']
BQ1 1 V/M
BQ1 2 V
BQ1 3 V
BQ1 4 COUNTS
BQ1 5 COUNTS
BQ1 6 COUNTS
BQ1 7 COUNTS
BQ2 1 V/M
BQ2 2 V
BQ2 3 V
BQ2 4 COUNTS
BQ2 5 COUNTS
BQ2 6 COUNTS
BQ2 7 COUNTS
BF1 1 T
BF1 2 V
BF1 3 COUNTS
BF1 4 COUNTS
BF1 5 COUNTS
BF1 6 COUNTS
BF2 1 T
BF2 2 V
BF2 3 COUNTS
BF2 4 COUNTS
BF2 5 COUNTS
BF2 6 COUNTS
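
The listing above appears to give, for each channel, the stage number and the input units of every response stage. To make that structure concrete, here is a minimal sketch that walks a StationXML document with the standard library instead of ObsPy. The XML is a hand-written toy document (network XX, station TOY), not the PKD metadata, and the names `TOY_XML` and `list_stages` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by FDSN StationXML documents.
NS = {"sx": "http://www.fdsn.org/xml/station/1"}

# Toy document for illustration only; it is NOT the real BK.PKD response.
TOY_XML = """
<FDSNStationXML xmlns="http://www.fdsn.org/xml/station/1" schemaVersion="1.1">
  <Network code="XX">
    <Station code="TOY">
      <Channel code="BQ1" locationCode="">
        <Response>
          <Stage number="1">
            <PolesZeros>
              <InputUnits><Name>V/M</Name></InputUnits>
              <OutputUnits><Name>V</Name></OutputUnits>
            </PolesZeros>
          </Stage>
          <Stage number="2">
            <Coefficients>
              <InputUnits><Name>V</Name></InputUnits>
              <OutputUnits><Name>COUNTS</Name></OutputUnits>
            </Coefficients>
          </Stage>
        </Response>
      </Channel>
    </Station>
  </Network>
</FDSNStationXML>
"""


def list_stages(xml_text):
    """Return (channel code, stage number, input units) for every response stage."""
    root = ET.fromstring(xml_text)
    rows = []
    for channel in root.iterfind(".//sx:Channel", NS):
        for stage in channel.iterfind("sx:Response/sx:Stage", NS):
            # The filter element (PolesZeros, Coefficients, ...) varies by stage,
            # so match any child and then look for its InputUnits name.
            units = stage.find("*/sx:InputUnits/sx:Name", NS)
            rows.append(
                (
                    channel.get("code"),
                    int(stage.get("number")),
                    units.text if units is not None else None,
                )
            )
    return rows


for code, number, units in list_stages(TOY_XML):
    print(code, number, units)
```

With an ObsPy Inventory the same walk goes through `channel.response.response_stages`, where each stage carries `stage_sequence_number` and `input_units`.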


In [19]:
inventory

Inventory created at 2021-09-02T23:08:34.000000Z
	Created by: IRIS WEB SERVICE: fdsnws-station | version: 1.1.47
		    http://service.iris.edu/fdsnws/station/1/query?starttime=2004-09-28...
	Sending institution: IRIS-DMC (IRIS-DMC)
	Contains:
		Networks (1):
			BK
		Stations (1):
			BK.PKD (Bear Valley Ranch, Parkfield, CA, USA)
		Channels (4):
			BK.PKD..BF1, BK.PKD..BF2, BK.PKD..BQ1, BK.PKD..BQ2

In [20]:

translator = XMLInventoryMTExperiment()
experiment = translator.xml_to_mt(inventory_object=inventory)
print(experiment)

NEWLY ENCOUNTERED 20210514 -- may need some massaging
NEWLY ENCOUNTERED 20210514 -- may need some massaging
Experiment Contents
--------------------
Number of Surveys: 1
	Survey ID: None
	Number of Stations: 1
	--------------------
		Station ID: PKD
		Number of Runs: 1
		--------------------
			Run ID: 001
			Number of Channels: 4
			Recorded Channels: ex, ey, hx, hy
			Start: 2003-09-12T18:54:00+00:00
			End:   2005-03-15T16:45:00+00:00
			--------------------


In [21]:
run_metadata = experiment.surveys[0].stations[0].runs[0]

# We have the metadata, now get the data

In [22]:
cmd = f"ls -lr {ROVER_DIR.joinpath('data','BK','2004','272')}"
print(cmd)

ls -lr /home/kkappler/rover/data/BK/2004/272


In [23]:
execute_command(cmd)

executing from /home/kkappler/
total 1700
-rw-rw-r-- 1 kkappler kkappler 1740288 Sep  2 16:00 PKD.BK.2004.272


In [24]:
seed_path = ROVER_DIR.joinpath('data','BK','2004','272','PKD.BK.2004.272')
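
The directory listing above suggests that ROVER stores one file per station and day under data/network/year/day-of-year. Assuming that layout (it is inferred from this listing, not taken from the ROVER documentation), the path can be derived from a calendar date instead of hard-coding 272. The helper name `rover_day_file` is hypothetical.

```python
from datetime import date
from pathlib import Path


def rover_day_file(rover_dir, network, station, day):
    """Build the path to one station-day file, following the layout seen above."""
    year = f"{day.year:04d}"
    doy = f"{day.timetuple().tm_yday:03d}"  # day of year, zero padded to 3 digits
    file_name = f"{station}.{network}.{year}.{doy}"
    return Path(rover_dir) / "data" / network / year / doy / file_name


example = rover_day_file("/home/kkappler/rover", "BK", "PKD", date(2004, 9, 28))
print(example.as_posix())
```

For 2004-09-28 this reproduces the path used in the cell above, since that date is day 272 of a leap year.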


In [25]:
streams = read(seed_path.as_posix())

In [26]:
for i, trace in enumerate(streams):
    print(i, trace.stats.channel)


0 BQ2
1 BQ3
2 BT1
3 BT2


In [27]:
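# Rename the traces so their channel codes match those in the corrected
# inventory (BQ1, BQ2, BF1, BF2). This relies on the trace order printed above.
streams[0].stats["channel"] = "BQ1"
streams[1].stats["channel"] = "BQ2"
streams[2].stats["channel"] = "BF1"
streams[3].stats["channel"] = "BF2"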
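
Renaming by position works here because the trace order is known, but it would silently mislabel channels if the order changed. A slightly more defensive sketch renames by code using a lookup table. The mapping is inferred from the renaming cell above, and the names `CHANNEL_MAP` and `remap_channel_codes` are hypothetical.

```python
# Mapping inferred from the cell above: retrieved code -> code in the fixed inventory.
CHANNEL_MAP = {"BQ2": "BQ1", "BQ3": "BQ2", "BT1": "BF1", "BT2": "BF2"}


def remap_channel_codes(codes, channel_map=CHANNEL_MAP):
    """Return the new channel codes; a code missing from the map raises KeyError."""
    return [channel_map[code] for code in codes]


original = ["BQ2", "BQ3", "BT1", "BT2"]
renamed = remap_channel_codes(original)
print(renamed)
```

Applied to the ObsPy stream, the same idea would be `tr.stats.channel = CHANNEL_MAP[tr.stats.channel]` for each trace, done in a single pass because BQ2 is both an old and a new code.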

In [28]:
# Runs are defined by channels with common start times and sample rates;
# collect the unique start and end times found in the stream.
start_times = sorted({tr.stats.starttime.isoformat() for tr in streams})
end_times = sorted({tr.stats.endtime.isoformat() for tr in streams})
print(start_times)
print(end_times)

['2004-09-28T00:00:00']
['2004-09-28T01:59:59.950000']
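
The cell above collects start times and end times separately. That is sufficient here because all four traces share one time span, but with several spans the two sorted lists would no longer be guaranteed to pair up correctly. A minimal sketch of grouping by the (start, end) pair instead is shown below; `FakeTrace` is a hypothetical stand-in for an ObsPy Trace so that the example runs without data.

```python
from collections import defaultdict, namedtuple

# Stand-in for an ObsPy Trace; only the fields needed for grouping are kept.
FakeTrace = namedtuple("FakeTrace", ["channel", "starttime", "endtime"])


def group_by_time_span(traces):
    """Group channel codes by their (starttime, endtime) pair, sorted by span."""
    groups = defaultdict(list)
    for tr in traces:
        groups[(tr.starttime, tr.endtime)].append(tr.channel)
    return dict(sorted(groups.items()))


traces = [
    FakeTrace("BQ1", "2004-09-28T00:00:00", "2004-09-28T01:59:59.950000"),
    FakeTrace("BQ2", "2004-09-28T00:00:00", "2004-09-28T01:59:59.950000"),
    FakeTrace("BF1", "2004-09-28T00:00:00", "2004-09-28T01:59:59.950000"),
    FakeTrace("BF2", "2004-09-28T00:00:00", "2004-09-28T01:59:59.950000"),
]

runs = group_by_time_span(traces)
for index, (span, channels) in enumerate(runs.items(), 1):
    print(f"{index:03}", span, channels)
```

Each key of `runs` is then a candidate run, numbered in the same zero-padded style used for run IDs in this notebook.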


In [29]:
# initiate MTH5 file
h5_path = "from_rover.h5"
mth5_obj = initialize_mth5(h5_path)
# fill metadata
mth5_obj.from_experiment(experiment)
station_group = mth5_obj.get_station(test_data_set.station)

2021-09-02 16:09:34,671 [line 526] mth5.mth5.MTH5._initialize_file - INFO: Initialized MTH5 file from_rover.h5 in mode w


### Below:
We need to add run metadata to each RunTS. In the StationXML, the channel metadata is a single entry for all similar channels regardless of their duration, so the run metadata must be attached explicitly for it to propagate to the MTH5 file.

In [30]:
for index, times in enumerate(zip(start_times, end_times), 1):
    run_id = f"{index:03}"
    run_stream = streams.slice(UTCDateTime(times[0]), UTCDateTime(times[1]))
    run_ts_obj = RunTS()
    run_ts_obj.from_obspy_stream(run_stream, run_metadata)
    run_ts_obj.run_metadata.id = run_id
    run_group = station_group.add_run(run_id)
    run_group.from_runts(run_ts_obj)

  if self._ts.coords.indexes["time"][0].freq is None:
  sr = 1e9 / self._ts.coords.indexes["time"][0].freq.nanos
2021-09-02 16:09:40,314 [line 707] mth5.groups.base.Station.add_run - INFO: run 001 already exists, returning existing group.
2021-09-02 16:09:40,570 [line 1165] mth5.groups.base.Run.add_channel - INFO: channel ex already exists, returning existing group.
2021-09-02 16:09:40,586 [line 1169] mth5.groups.base.Run.add_channel - INFO: updating data and metadata
2021-09-02 16:09:43,754 [line 1165] mth5.groups.base.Run.add_channel - INFO: channel ey already exists, returning existing group.
2021-09-02 16:09:43,768 [line 1169] mth5.groups.base.Run.add_channel - INFO: updating data and metadata
2021-09-02 16:09:46,590 [line 1165] mth5.groups.base.Run.add_channel - INFO: channel hx already exists, returning existing group.
2021-09-02 16:09:46,608 [line 1169] mth5.groups.base.Run.add_channel - INFO: updating data and metadata
2021-09-02 16:09:49,552 [line 1165] mth5.groups.base.Run.ad

In [31]:
mth5_obj.close_mth5()

2021-09-02 16:10:02,098 [line 569] mth5.mth5.MTH5.close_mth5 - INFO: Flushing and closing from_rover.h5
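
If one of the cells above raises an exception, `close_mth5()` is never reached and the HDF5 file may be left open. One way to guard against this is a small context manager that always calls `close_mth5()`. The sketch below is an assumption-laden illustration: `closing_mth5` and `DummyMTH5` are hypothetical names, and the dummy class stands in for the real MTH5 object so the example runs without an HDF5 file.

```python
from contextlib import contextmanager


@contextmanager
def closing_mth5(mth5_obj):
    """Yield mth5_obj and call its close_mth5() on exit, even after an exception."""
    try:
        yield mth5_obj
    finally:
        mth5_obj.close_mth5()


class DummyMTH5:
    """Hypothetical stand-in used only to demonstrate the pattern."""

    def __init__(self):
        self.closed = False

    def close_mth5(self):
        self.closed = True


dummy = DummyMTH5()
with closing_mth5(dummy) as m:
    pass  # fill metadata and add runs here

print(dummy.closed)
```

In the notebook, the object returned by `initialize_mth5(h5_path)` would take the place of `dummy`.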
