# Make CAS04 -- Single Station

This notebook builds a single-station MTH5 file. Multi-station archives are built in a very similar way and will be addressed in a separate notebook.

Station "CAS04" is used as an example. The data are archived at EarthScope, which is an FDSN-compliant data source, so we can use the FDSN class in mth5.clients to access them. The FDSN class is first used to extract metadata about the station.

A more general class, MakeMTH5, which calls the FDSN class, is then used to generate the final MTH5 file. The workflow for accessing FDSN data is built around a "request dataframe", a tabular structure with one row per contiguous channel of data.


Footnote: This notebook was based on aurora/tests/cas04/01_make_cas04_mth5.py

In [1]:
#Imports
import pandas as pd
from aurora.sandbox.mth5_channel_summary_helpers import channel_summary_to_make_mth5
from mth5.clients import FDSN
from mth5.clients.make_mth5 import MakeMTH5
from mth5.utils.helpers import initialize_mth5
from mth5.utils.helpers import read_back_data
from mt_metadata.timeseries.stationxml import XMLInventoryMTExperiment

In [2]:
import logging, sys
logging.disable(sys.maxsize)

In [3]:
import warnings
warnings.filterwarnings('ignore')

In [4]:
# Placeholder for controlling which acquisition runs to request
# Leave it empty to get all of them
active_runs = []
#active_runs = ["a",]
#active_runs = ["b", "c", "d"]


In [5]:
# Access an FDSN object to get a list of the columns for the request dataframe
fdsn = FDSN()
print(f" Request df has columns {fdsn.request_columns}")

 Request df has columns ['network', 'station', 'location', 'channel', 'start', 'end']


In [6]:
# Build a dataframe that tells the FDSN data provider which
# network, station, location, channel, start time, and end time are of interest
network = "8P"
station = "CAS04"
channels = ["LQE", "LQN", "LFE", "LFN", "LFZ", ]
start = "2020-06-02T19:00:00"
end = "2020-07-13T19:00:00"

request_list = []
for channel in channels:
    request = [network, station, "", channel, start, end]
    request_list.append(request)

print(f"Request List \n {request_list}")

# Turn list into dataframe
metadata_request_df = pd.DataFrame(request_list, columns=fdsn.request_columns)
print("\n\n metadata_request_df \n ")
metadata_request_df

Request List 
 [['8P', 'CAS04', '', 'LQE', '2020-06-02T19:00:00', '2020-07-13T19:00:00'], ['8P', 'CAS04', '', 'LQN', '2020-06-02T19:00:00', '2020-07-13T19:00:00'], ['8P', 'CAS04', '', 'LFE', '2020-06-02T19:00:00', '2020-07-13T19:00:00'], ['8P', 'CAS04', '', 'LFN', '2020-06-02T19:00:00', '2020-07-13T19:00:00'], ['8P', 'CAS04', '', 'LFZ', '2020-06-02T19:00:00', '2020-07-13T19:00:00']]


 metadata_request_df 
 


Unnamed: 0,network,station,location,channel,start,end
0,8P,CAS04,,LQE,2020-06-02T19:00:00,2020-07-13T19:00:00
1,8P,CAS04,,LQN,2020-06-02T19:00:00,2020-07-13T19:00:00
2,8P,CAS04,,LFE,2020-06-02T19:00:00,2020-07-13T19:00:00
3,8P,CAS04,,LFN,2020-06-02T19:00:00,2020-07-13T19:00:00
4,8P,CAS04,,LFZ,2020-06-02T19:00:00,2020-07-13T19:00:00
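
The loop above covers a single station. As a minimal sketch, the same pattern extends to several stations by taking the product of stations and channels. The helper name `build_request_rows` is hypothetical (it is not part of mth5), and the column order is copied from the `fdsn.request_columns` output printed earlier.

```python
from itertools import product

# Column order copied from the fdsn.request_columns output shown earlier
REQUEST_COLUMNS = ["network", "station", "location", "channel", "start", "end"]


def build_request_rows(network, stations, channels, start, end, location=""):
    """Return one request row per (station, channel) pair.

    Hypothetical helper for illustration; it is not part of mth5.
    """
    return [
        [network, station, location, channel, start, end]
        for station, channel in product(stations, channels)
    ]


rows = build_request_rows(
    network="8P",
    stations=["CAS04"],
    channels=["LQE", "LQN", "LFE", "LFN", "LFZ"],
    start="2020-06-02T19:00:00",
    end="2020-07-13T19:00:00",
)
for row in rows:
    print(dict(zip(REQUEST_COLUMNS, row)))
```

The resulting rows can be passed to `pd.DataFrame(rows, columns=REQUEST_COLUMNS)`, as in the cell above.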


In [7]:
# Request the inventory information from IRIS
inventory, traces = fdsn.get_inventory_from_df(metadata_request_df, data=False)
translator = XMLInventoryMTExperiment()
experiment = translator.xml_to_mt(inventory_object=inventory)


2023-09-27T10:25:22.784578-0700 | INFO | mt_metadata.timeseries.filters.obspy_stages | create_filter_from_stage | Converting PoleZerosResponseStage electric_si_units to a CoefficientFilter.
2023-09-27T10:25:22.794197-0700 | INFO | mt_metadata.timeseries.filters.obspy_stages | create_filter_from_stage | Converting PoleZerosResponseStage electric_dipole_92.000 to a CoefficientFilter.
2023-09-27T10:25:22.839649-0700 | INFO | mt_metadata.timeseries.filters.obspy_stages | create_filter_from_stage | Converting PoleZerosResponseStage electric_si_units to a CoefficientFilter.
2023-09-27T10:25:22.853525-0700 | INFO | mt_metadata.timeseries.filters.obspy_stages | create_filter_from_stage | Converting PoleZerosResponseStage electric_dipole_92.000 to a CoefficientFilter.
2023-09-27T10:25:22.911610-0700 | INFO | mt_metadata.timeseries.filters.obspy_stages | create_filter_from_stage | Converting PoleZerosResponseStage electric_si_units to a CoefficientFilter.


In [8]:
# Initialize an mth5 container, packing the metadata contained in "experiment" variable
h5_path = "tmp.h5"
mth5_obj = initialize_mth5(h5_path)  # mode="a")
mth5_obj.from_experiment(experiment)
mth5_obj.channel_summary.summarize()

summary_df = mth5_obj.channel_summary.to_dataframe()

2023-09-27T10:25:23.459291-0700 | INFO | mth5.groups.base | _add_group | StationGroup CAS04 already exists, returning existing group.
2023-09-27T10:25:23.461325-0700 | INFO | mth5.groups.base | _add_group | RunGroup a already exists, returning existing group.
2023-09-27T10:25:23.518234-0700 | INFO | mth5.groups.base | _add_group | RunGroup b already exists, returning existing group.
2023-09-27T10:25:23.584353-0700 | INFO | mth5.groups.base | _add_group | RunGroup c already exists, returning existing group.
2023-09-27T10:25:23.640654-0700 | INFO | mth5.groups.base | _add_group | RunGroup d already exists, returning existing group.


In [9]:
# Take a look at the channel summary: an index of the available channels, one row per "channel-run"
summary_df

Unnamed: 0,survey,station,run,latitude,longitude,elevation,component,start,end,n_samples,sample_rate,measurement_type,azimuth,tilt,units,hdf5_reference,run_hdf5_reference,station_hdf5_reference
0,CONUS South,CAS04,a,37.633351,-121.468382,329.3875,ex,2020-06-02 18:41:43+00:00,2020-06-02 22:07:46+00:00,12363,1.0,electric,13.2,0.0,digital counts,<HDF5 object reference>,<HDF5 object reference>,<HDF5 object reference>
1,CONUS South,CAS04,a,37.633351,-121.468382,329.3875,ey,2020-06-02 18:41:43+00:00,2020-06-02 22:07:46+00:00,12363,1.0,electric,103.2,0.0,digital counts,<HDF5 object reference>,<HDF5 object reference>,<HDF5 object reference>
2,CONUS South,CAS04,a,37.633351,-121.468382,329.3875,hx,2020-06-02 18:41:43+00:00,2020-06-02 22:07:46+00:00,12363,1.0,magnetic,13.2,0.0,digital counts,<HDF5 object reference>,<HDF5 object reference>,<HDF5 object reference>
3,CONUS South,CAS04,a,37.633351,-121.468382,329.3875,hy,2020-06-02 18:41:43+00:00,2020-06-02 22:07:46+00:00,12363,1.0,magnetic,103.2,0.0,digital counts,<HDF5 object reference>,<HDF5 object reference>,<HDF5 object reference>
4,CONUS South,CAS04,a,37.633351,-121.468382,329.3875,hz,2020-06-02 18:41:43+00:00,2020-06-02 22:07:46+00:00,12363,1.0,magnetic,0.0,90.0,digital counts,<HDF5 object reference>,<HDF5 object reference>,<HDF5 object reference>
5,CONUS South,CAS04,b,37.633351,-121.468382,329.3875,ex,2020-06-02 22:24:55+00:00,2020-06-12 17:52:23+00:00,847648,1.0,electric,13.2,0.0,digital counts,<HDF5 object reference>,<HDF5 object reference>,<HDF5 object reference>
6,CONUS South,CAS04,b,37.633351,-121.468382,329.3875,ey,2020-06-02 22:24:55+00:00,2020-06-12 17:52:23+00:00,847648,1.0,electric,103.2,0.0,digital counts,<HDF5 object reference>,<HDF5 object reference>,<HDF5 object reference>
7,CONUS South,CAS04,b,37.633351,-121.468382,329.3875,hx,2020-06-02 22:24:55+00:00,2020-06-12 17:52:23+00:00,847648,1.0,magnetic,13.2,0.0,digital counts,<HDF5 object reference>,<HDF5 object reference>,<HDF5 object reference>
8,CONUS South,CAS04,b,37.633351,-121.468382,329.3875,hy,2020-06-02 22:24:55+00:00,2020-06-12 17:52:23+00:00,847648,1.0,magnetic,103.2,0.0,digital counts,<HDF5 object reference>,<HDF5 object reference>,<HDF5 object reference>
9,CONUS South,CAS04,b,37.633351,-121.468382,329.3875,hz,2020-06-02 22:24:55+00:00,2020-06-12 17:52:23+00:00,847648,1.0,magnetic,0.0,90.0,digital counts,<HDF5 object reference>,<HDF5 object reference>,<HDF5 object reference>
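
As a quick consistency check on the summary, `n_samples` for runs a and b equals the run duration in seconds multiplied by `sample_rate`. The sketch below recomputes this from the `start`, `end`, and `sample_rate` values in the table. The function name `expected_n_samples` is an illustrative assumption, not an mth5 call.

```python
from datetime import datetime


def expected_n_samples(start, end, sample_rate):
    """Duration in seconds times sample rate (the end sample is not counted)."""
    t0 = datetime.fromisoformat(start)
    t1 = datetime.fromisoformat(end)
    return round((t1 - t0).total_seconds() * sample_rate)


# start, end, sample_rate, and n_samples copied from the channel summary above
runs = {
    "a": ("2020-06-02 18:41:43+00:00", "2020-06-02 22:07:46+00:00", 1.0, 12363),
    "b": ("2020-06-02 22:24:55+00:00", "2020-06-12 17:52:23+00:00", 1.0, 847648),
}

for run_id, (start, end, rate, n_samples) in runs.items():
    computed = expected_n_samples(start, end, rate)
    print(run_id, computed, computed == n_samples)
```

Counting both endpoints would give one more sample per run (12364 and 847649), which appears to match the data shapes that `read_back_data` logs for these runs at the end of this notebook.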


In [10]:
# A channel summary can be transformed into a request dataframe for the specific runs of interest
if active_runs:
    summary_df = summary_df[summary_df["run"].isin(active_runs)]  # summary_df[0:5]
data_request_df = channel_summary_to_make_mth5(summary_df, network=network, verbose=True)
data_request_df

('CAS04', 'a'), from 2020-06-02 18:41:43+00:00, to 2020-06-02 22:07:46+00:00
('CAS04', 'b'), from 2020-06-02 22:24:55+00:00, to 2020-06-12 17:52:23+00:00
('CAS04', 'c'), from 2020-06-12 18:32:17+00:00, to 2020-07-01 17:32:59+00:00
('CAS04', 'd'), from 2020-07-01 19:36:55+00:00, to 2020-07-13 21:46:12+00:00


Unnamed: 0,network,station,location,channel,start,end
0,8P,CAS04,,LQN,2020-06-02 18:41:43+00:00,2020-06-02 22:07:46+00:00
1,8P,CAS04,,LQE,2020-06-02 18:41:43+00:00,2020-06-02 22:07:46+00:00
2,8P,CAS04,,LFN,2020-06-02 18:41:43+00:00,2020-06-02 22:07:46+00:00
3,8P,CAS04,,LFE,2020-06-02 18:41:43+00:00,2020-06-02 22:07:46+00:00
4,8P,CAS04,,LFZ,2020-06-02 18:41:43+00:00,2020-06-02 22:07:46+00:00
5,8P,CAS04,,LQN,2020-06-02 22:24:55+00:00,2020-06-12 17:52:23+00:00
6,8P,CAS04,,LQE,2020-06-02 22:24:55+00:00,2020-06-12 17:52:23+00:00
7,8P,CAS04,,LFN,2020-06-02 22:24:55+00:00,2020-06-12 17:52:23+00:00
8,8P,CAS04,,LFE,2020-06-02 22:24:55+00:00,2020-06-12 17:52:23+00:00
9,8P,CAS04,,LFZ,2020-06-02 22:24:55+00:00,2020-06-12 17:52:23+00:00
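
The request table uses FDSN/SEED channel codes (LQN, LFZ, ...), while the channel summary uses MT component names (ex, hz, ...). The sketch below shows one way such a mapping could work for this dataset, assuming the SEED convention of band code `L` for roughly 1 Hz data, instrument code `Q` for electric and `F` for magnetic channels, and orientation codes `N`/`E`/`Z` for x/y/z. It is an illustration only and is not the implementation used by `channel_summary_to_make_mth5`; `component_to_channel_code` is a hypothetical name.

```python
# Assumed SEED-style lookup tables (illustrative; restricted to this example)
INSTRUMENT_CODE = {"electric": "Q", "magnetic": "F"}
ORIENTATION_CODE = {"x": "N", "y": "E", "z": "Z"}


def component_to_channel_code(component, measurement_type, sample_rate):
    """Map an MT component such as 'ex' to a three-letter channel code."""
    if sample_rate != 1.0:
        # Only the 1 Hz (band code "L") case from this notebook is handled
        raise ValueError(f"unsupported sample rate: {sample_rate}")
    band = "L"
    instrument = INSTRUMENT_CODE[measurement_type]
    orientation = ORIENTATION_CODE[component[-1]]
    return band + instrument + orientation


# (component, measurement_type, sample_rate) as listed in the channel summary
summary_rows = [
    ("ex", "electric", 1.0),
    ("ey", "electric", 1.0),
    ("hx", "magnetic", 1.0),
    ("hy", "magnetic", 1.0),
    ("hz", "magnetic", 1.0),
]
codes = [component_to_channel_code(*row) for row in summary_rows]
print(codes)
```

For the five components in the summary this yields LQN, LQE, LFN, LFE, and LFZ, the channel codes that appear in the request table above.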


### Build MTH5
- Set mth5 version
- Initialize an mth5 Maker and set some params
    - client "IRIS" is used for EarthScope data
    - mth5 version can be "0.1.0" or "0.2.0"
    - if interact is True, an mth5.mth5.MTH5 object will be returned; if False, the path to the h5 file will be returned
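
Because the return type depends on `interact`, downstream code has to handle both cases. A minimal sketch of normalizing the result to a path follows. `FakeMTH5` is a stand-in used only so the example runs without mth5 or network access, and `resolve_mth5_path` is a hypothetical helper. The only attribute assumed of the real object is `filename`, which this notebook itself uses.

```python
from pathlib import Path


class FakeMTH5:
    """Stand-in for an open MTH5 object; only the filename attribute is mimicked."""

    def __init__(self, filename):
        self.filename = Path(filename)


def resolve_mth5_path(result):
    """Return a Path whether `result` is an MTH5-like object or a path."""
    if hasattr(result, "filename"):
        return Path(result.filename)
    return Path(result)


# interact=True case: an object is returned
print(resolve_mth5_path(FakeMTH5("8P_CAS04.h5")))
# interact=False case: a path is returned
print(resolve_mth5_path("8P_CAS04.h5"))
```

With a real MTH5 object, the file should still be closed (for example with `close_mth5()`, as done later in this notebook) before it is reopened for reading.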

In [11]:
client = "IRIS"
mth5_version = "0.1.0" 
# mth5_version = "0.2.0"
interact = False

maker = MakeMTH5(mth5_version=mth5_version, client=client)
maker.client = client

In [12]:
# print("FAILED FOR 0.2.0 with some other error")
# inventory, streams = maker.get_inventory_from_df(request_df, data=False, client="IRIS")    # inventory==inventory0??
mth5_obj = maker.from_fdsn_client(data_request_df, path="", interact=interact)
if interact:
    mth5_path = mth5_obj.filename
else:
    mth5_path = mth5_obj
print(f"Made MTH5 at {mth5_path}")

2023-09-27T10:25:24.220680-0700 | INFO | mth5.mth5 | _initialize_file | Initialized MTH5 0.1.0 file /home/kkappler/software/irismt/aurora/docs/examples/8P_CAS04.h5 in mode w
2023-09-27T10:25:43.681187-0700 | INFO | mt_metadata.timeseries.filters.obspy_stages | create_filter_from_stage | Converting PoleZerosResponseStage electric_si_units to a CoefficientFilter.
2023-09-27T10:25:43.690041-0700 | INFO | mt_metadata.timeseries.filters.obspy_stages | create_filter_from_stage | Converting PoleZerosResponseStage electric_dipole_92.000 to a CoefficientFilter.
2023-09-27T10:25:43.735354-0700 | INFO | mt_metadata.timeseries.filters.obspy_stages | create_filter_from_stage | Converting PoleZerosResponseStage electric_si_units to a CoefficientFilter.
2023-09-27T10:25:43.743114-0700 | INFO | mt_metadata.timeseries.filters.obspy_stages | create_filter_from_stage | Converting PoleZerosResponseStage electric_dipole_92.000 to a CoefficientFilter.
2023-09-27T1

In [13]:
if interact:
    mth5_obj.close_mth5()

In [14]:
# Apply a sanity check to make sure that the data are readable
if not active_runs:
    active_runs = ["a", "b", "c", "d"]
for run_id in active_runs:
    if mth5_version == "0.1.0":
        survey = None
    else: 
        survey = "CONUS South"
    read_back_data(mth5_path, "CAS04", run_id, survey)

2023-09-27T10:25:51.904412-0700 | INFO | mth5.utils.helpers | read_back_data | data shape = (5, 12364)
2023-09-27T10:25:51.904923-0700 | INFO | mth5.mth5 | close_mth5 | Flushing and closing /home/kkappler/software/irismt/aurora/docs/examples/8P_CAS04.h5
2023-09-27T10:25:53.487777-0700 | INFO | mth5.utils.helpers | read_back_data | data shape = (5, 847649)
2023-09-27T10:25:53.488332-0700 | INFO | mth5.mth5 | close_mth5 | Flushing and closing /home/kkappler/software/irismt/aurora/docs/examples/8P_CAS04.h5
2023-09-27T10:26:13.489546-0700 | INFO | mth5.utils.helpers | read_back_data | data shape = (5, 1638043)
2023-09-27T10:26:13.490313-0700 | INFO | mth5.mth5 | close_mth5 | Flushing and closing /home/kkappler/software/irismt/aurora/docs/examples/8P_CAS04.h5
2023-09-27T10:26:15.383433-0700 | INFO | mth5.utils.helpers | read_back_data | data shape = (5, 1044558)
2023-09-27T10:26:15.383982-0700 | INFO | mth5.mth5 | close_mth5 | Flus