# Make MTH5 from IRIS Data Management Center v0.2.0

**Note:** This example assumes that the data availability (network, station, channel, start, end) is already known. If you do not know which data you want to download, use [IRIS tools](https://ds.iris.edu/ds/nodes/dmc/tools/##) to find out what is available.

In [1]:
from pathlib import Path

import numpy as np
import pandas as pd
from mth5.mth5 import MTH5
from mth5.clients.make_mth5 import FDSN

from matplotlib import pyplot as plt
#%matplotlib widget

## Set the path to save files to as the current working directory

In [2]:
default_path = Path().cwd()
print(default_path)

C:\Users\jpeacock\OneDrive - DOI\Documents\GitHub\mt_examples\notebooks\mth5


## Initialize a MakeMTH5 object

Here we set the MTH5 file version to 0.2.0 so that a single file can hold multiple surveys, and we set the client to "IRIS". The request is made with `obspy.clients` tools; see the list of available [FDSN clients](https://docs.obspy.org/packages/obspy.clients.fdsn.html).

**Note:** Only the "IRIS" client has been tested.

In [3]:
fdsn_object = FDSN(mth5_version='0.2.0')
fdsn_object.client = "IRIS"

## Make the data inquiry as a DataFrame

There are a few ways to build the data request.

1. Make a DataFrame by hand. Here we will make a list of entries and then create a DataFrame with the proper column names.
2. Create a CSV file with one row per entry. The column names must match those in the table below, and date-times must be formatted as YYYY-MM-DDThh:mm:ss.


| Column Name         |   Description                                                                                                 |
| ------------------- | --------------------------------------------------------------------------------------------------------------|
| **network**         | [FDSN Network code (2 letters)](http://www.fdsn.org/networks/)                                                |
| **station**         | [FDSN Station code (usually 5 characters)](https://ds.iris.edu/ds/nodes/dmc/data/formats/seed-channel-naming/)|
| **location**        | [FDSN Location code (typically not used for MT)](http://docs.fdsn.org/projects/source-identifiers/en/v1.0/location-codes.html) |
| **channel**         | [FDSN Channel code (3 characters)](http://docs.fdsn.org/projects/source-identifiers/en/v1.0/channel-codes.html)|
| **start**           | Start time (YYYY-MM-DDThh:mm:ss) UTC |
| **end**             | End time (YYYY-MM-DDThh:mm:ss) UTC  |
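
Before sending a request, it can help to check each row against the rules in the table above. The following is a minimal sketch that uses only the standard library; `check_request_row` is a hypothetical helper written for this illustration and is not part of MTH5.

```python
from datetime import datetime

# Column names taken from the table above
REQUEST_COLUMNS = ("network", "station", "location", "channel", "start", "end")
TIME_FORMAT = "%Y-%m-%dT%H:%M:%S"


def check_request_row(row):
    """Return a list of problems found in one request row (empty if none)."""
    missing = [name for name in REQUEST_COLUMNS if name not in row]
    if missing:
        return [f"missing columns: {missing}"]

    problems = []
    if len(row["channel"]) != 3:
        problems.append(f"channel {row['channel']!r} is not 3 characters")

    times = {}
    for key in ("start", "end"):
        try:
            times[key] = datetime.strptime(row[key], TIME_FORMAT)
        except ValueError:
            problems.append(f"{key} {row[key]!r} is not YYYY-MM-DDThh:mm:ss")

    if len(times) == 2 and times["start"] >= times["end"]:
        problems.append("start is not before end")
    return problems


good = dict(network="8P", station="CAS04", location="", channel="LFE",
            start="2020-06-02T19:00:00", end="2020-07-13T19:00:00")
bad = dict(good, start="2020/06/02 19:00", channel="LF")

print(check_request_row(good))
print(check_request_row(bad))
```

The check only covers formatting. It does not confirm that the network, station, or channel actually exists at the data center.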

#### From IRIS Metadata

Here are the stations that were running during WYYS2. The original transfer function was processed with MTF20.

| Station | Start | End |
|---------|-------|-----|
| MTF18 | 2009-07-11T00:07:55.0000 | 2009-08-28T21:31:41.0000 |
| MTF19 | 2009-07-10T00:17:16.0000 | 2009-09-16T19:39:22.0000 |
| MTF20 | 2009-07-03T22:38:13.0000 | 2009-08-13T17:15:39.0000 |
| MTF21 | 2009-07-31T21:45:22.0000 | 2009-09-18T23:56:11.0000 |
| WYI17 | 2009-07-13T22:17:15.0000 | 2009-08-04T19:41:33.0000 |
| WYI21 | 2009-07-27T07:26:00.0000 | 2009-09-11T16:36:30.0000 |
| WYJ18 | 2009-07-13T00:58:30.0000 | 2009-08-03T19:39:24.0000 |
| WYJ21 | 2009-07-23T22:23:20.0000 | 2009-08-16T21:03:53.0000 |
| WYK21 | 2009-07-22T22:01:44.0000 | 2009-08-06T12:11:52.0000 |
| WYL21 | 2009-07-21T20:27:41.0000 | 2009-08-18T17:56:39.0000 |
| WYYS1 | 2009-07-14T21:58:53.0000 | 2009-08-05T21:14:56.0000 |
| WYYS2 | 2009-07-15T22:43:28.0000 | 2009-08-20T00:17:06.0000 |

Here are the stations that were running during WYYS3. The original transfer function was processed with MTC18.

| Station | Start | End |
|---------|-------|-----|
| MTC18 | 2009-08-21T21:52:53.0000 | 2009-09-13T00:02:18.0000 | 
| MTE18 | 2009-08-07T20:40:38.0000 | 2009-08-28T19:03:01.0000 | 
| MTE19 | 2009-08-12T22:45:47.0000 | 2009-09-02T17:10:53.0000 | 
| MTE20 | 2009-08-10T21:12:58.0000 | 2009-09-19T20:25:30.0000 | 
| MTE21 | 2009-08-09T02:45:02.0000 | 2009-08-30T21:43:43.0000 | 
| WYH21 | 2009-08-01T22:50:53.0000 | 2009-08-24T20:23:03.0000 | 
| WYG21 | 2009-08-03T02:56:34.0000 | 2009-08-18T19:56:53.0000 | 
| WYYS3 | 2009-08-20T01:55:41.0000 | 2009-09-17T20:04:21.0000 |

In the examples below we will download only the original remote references, but as an exercise you could add some of the other stations.
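
When adding other stations, the useful part of a recording is the time window it shares with the local station. The sketch below computes that shared window for a few entries copied from the WYYS2 table above. The `overlap` helper is hypothetical and uses only the standard library.

```python
from datetime import datetime

# The tables above give times with a fractional-seconds field
TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%f"


def parse(text):
    """Parse a time string in the format used in the station tables."""
    return datetime.strptime(text, TIME_FORMAT)


def overlap(window_a, window_b):
    """Return the (start, end) shared by two windows, or None if disjoint."""
    start = max(parse(window_a[0]), parse(window_b[0]))
    end = min(parse(window_a[1]), parse(window_b[1]))
    return (start, end) if start < end else None


wyys2 = ("2009-07-15T22:43:28.0000", "2009-08-20T00:17:06.0000")
candidates = {
    "MTF20": ("2009-07-03T22:38:13.0000", "2009-08-13T17:15:39.0000"),
    "MTF21": ("2009-07-31T21:45:22.0000", "2009-09-18T23:56:11.0000"),
    "WYK21": ("2009-07-22T22:01:44.0000", "2009-08-06T12:11:52.0000"),
}

for station, window in candidates.items():
    shared = overlap(wyys2, window)
    if shared is not None:
        start, end = shared
        print(station, start.isoformat(), end.isoformat(), end - start)
```

The shared start and end times can be used directly as the `start` and `end` entries of a request row.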

In [4]:
channels = ["LFE", "LFN", "LFZ", "LQE", "LQN"]
CAS04 = ["8P", "CAS04",  '2020-06-02T19:00:00', '2020-07-13T19:00:00'] 
NVR08 = ["8P", "NVR08", '2020-06-02T19:00:00', '2020-07-13T19:00:00']

request_list = []
for entry in [CAS04, NVR08]:
    for channel in channels:
        request_list.append(
            [entry[0], entry[1], "", channel, entry[2], entry[3]]
        )

# Turn list into dataframe
request_df =  pd.DataFrame(request_list, columns=fdsn_object.request_columns) 
request_df

Unnamed: 0,network,station,location,channel,start,end
0,8P,CAS04,,LFE,2020-06-02T19:00:00,2020-07-13T19:00:00
1,8P,CAS04,,LFN,2020-06-02T19:00:00,2020-07-13T19:00:00
2,8P,CAS04,,LFZ,2020-06-02T19:00:00,2020-07-13T19:00:00
3,8P,CAS04,,LQE,2020-06-02T19:00:00,2020-07-13T19:00:00
4,8P,CAS04,,LQN,2020-06-02T19:00:00,2020-07-13T19:00:00
5,8P,NVR08,,LFE,2020-06-02T19:00:00,2020-07-13T19:00:00
6,8P,NVR08,,LFN,2020-06-02T19:00:00,2020-07-13T19:00:00
7,8P,NVR08,,LFZ,2020-06-02T19:00:00,2020-07-13T19:00:00
8,8P,NVR08,,LQE,2020-06-02T19:00:00,2020-07-13T19:00:00
9,8P,NVR08,,LQN,2020-06-02T19:00:00,2020-07-13T19:00:00


## Save the request as a CSV

It's helpful to save the request as a CSV so that you can modify it and reuse it later. A CSV file can also be passed as the request to `MakeMTH5`.

In [5]:
request_df.to_csv(default_path.joinpath("fdsn_request.csv"))
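
If you prefer to write the request file by hand, the layout is a header row followed by one row per channel. The sketch below builds such a file in memory with the standard-library `csv` module instead of pandas, purely to show the layout. Note that `DataFrame.to_csv` also writes the DataFrame index as an unnamed first column unless `index=False` is passed. Whether that extra column matters when the file is read back as a request is not verified here.

```python
import csv
import io

columns = ["network", "station", "location", "channel", "start", "end"]
rows = [
    ["8P", "CAS04", "", "LFE", "2020-06-02T19:00:00", "2020-07-13T19:00:00"],
    ["8P", "CAS04", "", "LFN", "2020-06-02T19:00:00", "2020-07-13T19:00:00"],
]

# Write the header and rows to an in-memory buffer instead of a file
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(columns)
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)

# Read the rows back as dictionaries keyed by column name
buffer.seek(0)
read_back = list(csv.DictReader(buffer))
print(read_back[0])
```

The empty `location` field is written as two adjacent commas and is read back as an empty string.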

## Get only the metadata from IRIS

It can be helpful to check that your request returns what you expect. To do that, you can request only the metadata from IRIS. The request is quick and light, so speed should not be a concern. It returns a StationXML file, which is loaded into an `obspy.Inventory` object.

In [6]:
inventory, data = fdsn_object.get_inventory_from_df(request_df, data=False)

Have a look at the Inventory to make sure it contains what is requested.

In [7]:
inventory

Inventory created at 2024-02-16T17:27:49.399052Z
	Created by: ObsPy 1.4.0
		    https://www.obspy.org
	Sending institution: MTH5
	Contains:
		Networks (1):
			8P
		Stations (2):
			8P.CAS04 (Corral Hollow, CA, USA)
			8P.NVR08 (Rhodes Salt Marsh, NV, USA)
		Channels (13):
			8P.CAS04..LFZ, 8P.CAS04..LFN, 8P.CAS04..LFE, 8P.CAS04..LQN (2x), 
			8P.CAS04..LQE (3x), 8P.NVR08..LFZ, 8P.NVR08..LFN, 8P.NVR08..LFE, 
			8P.NVR08..LQN, 8P.NVR08..LQE
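
One way to make this check systematic is to compare the requested channel identifiers with those listed in the inventory. The sketch below hard-codes the identifiers from the output above, so it runs without a network connection. With a live inventory, the same list should be available from `inventory.get_contents()["channels"]`. The interpretation of repeated identifiers as separate metadata epochs is an assumption.

```python
from collections import Counter

# Identifiers for the ten requested channels (network.station.location.channel)
requested = {
    f"8P.{station}..{channel}"
    for station in ("CAS04", "NVR08")
    for channel in ("LFE", "LFN", "LFZ", "LQE", "LQN")
}

# Channel list copied from the inventory output above, including repeats
inventory_channels = (
    ["8P.CAS04..LFZ", "8P.CAS04..LFN", "8P.CAS04..LFE"]
    + ["8P.CAS04..LQN"] * 2
    + ["8P.CAS04..LQE"] * 3
    + ["8P.NVR08..LFZ", "8P.NVR08..LFN", "8P.NVR08..LFE",
       "8P.NVR08..LQN", "8P.NVR08..LQE"]
)

counts = Counter(inventory_channels)
missing = sorted(requested - set(counts))      # requested but not returned
unexpected = sorted(set(counts) - requested)   # returned but not requested
repeated = {name: n for name, n in counts.items() if n > 1}

print("missing:", missing)
print("unexpected:", unexpected)
print("repeated:", repeated)
```

Here nothing is missing or unexpected. The 13 channel entries exceed the 10 requested because two CAS04 channels are listed more than once.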

## Make an MTH5 from a request

Now that we've created a request and made sure that it's what we expect, we can make an MTH5 file. The input can be either the DataFrame or the CSV file.

We are going to time it to get an indication of how long it might take. It should take about 4 minutes.

**Note:** We are setting `interact=False`. If you want to keep the file open to interrogate it, set `interact=True`.

In [8]:
%%time

mth5_object = fdsn_object.make_mth5_from_fdsn_client(request_df, interact=False)

print(f"Created {mth5_object}")

2024-02-16T09:28:08.224093-0800 | INFO | mth5.mth5 | _initialize_file | Initialized MTH5 0.2.0 file C:\Users\jpeacock\OneDrive - DOI\Documents\GitHub\mt_examples\notebooks\mth5\8P_CAS04_NVR08.h5 in mode w
2024-02-16T09:30:08.684160-0800 | INFO | mt_metadata.timeseries.filters.obspy_stages | create_filter_from_stage | Converting PoleZerosResponseStage electric_si_units to a CoefficientFilter.
2024-02-16T09:30:08.699813-0800 | INFO | mt_metadata.timeseries.filters.obspy_stages | create_filter_from_stage | Converting PoleZerosResponseStage electric_dipole_92.000 to a CoefficientFilter.
2024-02-16T09:30:08.792751-0800 | INFO | mt_metadata.timeseries.filters.obspy_stages | create_filter_from_stage | Converting PoleZerosResponseStage electric_si_units to a CoefficientFilter.
2024-02-16T09:30:08.793325-0800 | INFO | mt_metadata.timeseries.filters.obspy_stages | create_filter_from_stage | Converting PoleZerosResponseStage electric_dipole_92.000 to a Coeffici

In [9]:
# open file already created
mth5_object = MTH5()
mth5_object.open_mth5("8P_CAS04_NVR08.h5")

/:
    |- Group: Experiment
    --------------------
        |- Group: Reports
        -----------------
        |- Group: Standards
        -------------------
            --> Dataset: summary
            ......................
        |- Group: Surveys
        -----------------
            |- Group: CONUS_South
            ---------------------
                |- Group: Filters
                -----------------
                    |- Group: coefficient
                    ---------------------
                        |- Group: electric_analog_to_digital
                        ------------------------------------
                        |- Group: electric_dipole_92.000
                        --------------------------------
                        |- Group: electric_dipole_94.000
                        --------------------------------
                        |- Group: electric_si_units
                        ---------------------------
                        |- Group: magnetic_an

### Add transfer function

In [12]:
from mt_metadata.transfer_functions.core import TF

In [14]:
cas04_tf = TF()
cas04_tf.read(r"USMTArray.CAS04.2020.edi")

In [16]:
cas04_tf.survey = "CONUS_South"

In [17]:
mth5_object.add_transfer_function(cas04_tf)

/Experiment/Surveys/CONUS_South/Stations/CAS04/Transfer_Functions/CAS04:
    --> Dataset: period
    .....................
    --> Dataset: transfer_function
    ................................
    --> Dataset: transfer_function_error
    ......................................

## Have a look at the contents of the created file

In [10]:
mth5_object

/:
    |- Group: Experiment
    --------------------
        |- Group: Reports
        -----------------
        |- Group: Standards
        -------------------
            --> Dataset: summary
            ......................
        |- Group: Surveys
        -----------------
            |- Group: CONUS_South
            ---------------------
                |- Group: Filters
                -----------------
                    |- Group: coefficient
                    ---------------------
                        |- Group: electric_analog_to_digital
                        ------------------------------------
                        |- Group: electric_dipole_92.000
                        --------------------------------
                        |- Group: electric_dipole_94.000
                        --------------------------------
                        |- Group: electric_si_units
                        ---------------------------
                        |- Group: magnetic_an

## Channel Summary

A convenience table is supplied with an MTH5 file. This table provides some information about each channel that is present in the file. It also provides the columns `hdf5_reference`, `run_hdf5_reference`, and `station_hdf5_reference`. These are internal references within the HDF5 file and can be used to directly access a group or dataset with the `mth5_object.from_reference` method.

**Note:** When an MTH5 file is closed, the table is resummarized, so the next time you open the file the `channel_summary` will be up to date. The same is true for the `tf_summary`.

In [11]:
mth5_object.channel_summary.clear_table()
mth5_object.channel_summary.summarize()

ch_df = mth5_object.channel_summary.to_dataframe()
ch_df

Unnamed: 0,survey,station,run,latitude,longitude,elevation,component,start,end,n_samples,sample_rate,measurement_type,azimuth,tilt,units,hdf5_reference,run_hdf5_reference,station_hdf5_reference
0,CONUS South,CAS04,a,37.633351,-121.468382,329.3875,ex,2020-06-02 19:00:00+00:00,2020-06-02 22:07:46+00:00,11267,1.0,electric,13.2,0.0,digital counts,<HDF5 object reference>,<HDF5 object reference>,<HDF5 object reference>
1,CONUS South,CAS04,a,37.633351,-121.468382,329.3875,ey,2020-06-02 19:00:00+00:00,2020-06-02 22:07:46+00:00,11267,1.0,electric,103.2,0.0,digital counts,<HDF5 object reference>,<HDF5 object reference>,<HDF5 object reference>
2,CONUS South,CAS04,a,37.633351,-121.468382,329.3875,hx,2020-06-02 19:00:00+00:00,2020-06-02 22:07:46+00:00,11267,1.0,magnetic,13.2,0.0,digital counts,<HDF5 object reference>,<HDF5 object reference>,<HDF5 object reference>
3,CONUS South,CAS04,a,37.633351,-121.468382,329.3875,hy,2020-06-02 19:00:00+00:00,2020-06-02 22:07:46+00:00,11267,1.0,magnetic,103.2,0.0,digital counts,<HDF5 object reference>,<HDF5 object reference>,<HDF5 object reference>
4,CONUS South,CAS04,a,37.633351,-121.468382,329.3875,hz,2020-06-02 19:00:00+00:00,2020-06-02 22:07:46+00:00,11267,1.0,magnetic,0.0,90.0,digital counts,<HDF5 object reference>,<HDF5 object reference>,<HDF5 object reference>
5,CONUS South,CAS04,b,37.633351,-121.468382,329.3875,ex,2020-06-02 22:24:55+00:00,2020-06-12 17:52:23+00:00,847649,1.0,electric,13.2,0.0,digital counts,<HDF5 object reference>,<HDF5 object reference>,<HDF5 object reference>
6,CONUS South,CAS04,b,37.633351,-121.468382,329.3875,ey,2020-06-02 22:24:55+00:00,2020-06-12 17:52:23+00:00,847649,1.0,electric,103.2,0.0,digital counts,<HDF5 object reference>,<HDF5 object reference>,<HDF5 object reference>
7,CONUS South,CAS04,b,37.633351,-121.468382,329.3875,hx,2020-06-02 22:24:55+00:00,2020-06-12 17:52:23+00:00,847649,1.0,magnetic,13.2,0.0,digital counts,<HDF5 object reference>,<HDF5 object reference>,<HDF5 object reference>
8,CONUS South,CAS04,b,37.633351,-121.468382,329.3875,hy,2020-06-02 22:24:55+00:00,2020-06-12 17:52:23+00:00,847649,1.0,magnetic,103.2,0.0,digital counts,<HDF5 object reference>,<HDF5 object reference>,<HDF5 object reference>
9,CONUS South,CAS04,b,37.633351,-121.468382,329.3875,hz,2020-06-02 22:24:55+00:00,2020-06-12 17:52:23+00:00,847649,1.0,magnetic,0.0,90.0,digital counts,<HDF5 object reference>,<HDF5 object reference>,<HDF5 object reference>
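
The `start`, `end`, `n_samples`, and `sample_rate` columns can be cross-checked against each other. The sketch below assumes that `end` is the time of the last sample, so that the number of samples is the duration times the sample rate, plus one. That assumption is consistent with the two runs shown above. `expected_n_samples` is a hypothetical helper, not an MTH5 function.

```python
from datetime import datetime


def expected_n_samples(start, end, sample_rate):
    """Samples in a run whose first and last samples fall on start and end."""
    duration = datetime.fromisoformat(end) - datetime.fromisoformat(start)
    return int(round(duration.total_seconds() * sample_rate)) + 1


# (start, end, sample_rate, n_samples) copied from the channel summary above
runs = {
    "a": ("2020-06-02 19:00:00+00:00", "2020-06-02 22:07:46+00:00", 1.0, 11267),
    "b": ("2020-06-02 22:24:55+00:00", "2020-06-12 17:52:23+00:00", 1.0, 847649),
}

for run_id, (start, end, sample_rate, n_samples) in runs.items():
    expected = expected_n_samples(start, end, sample_rate)
    print(run_id, expected, expected == n_samples)
```

A mismatch would suggest gaps or a metadata problem in that run and is worth investigating before processing.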


## Have a look at a station

Let's grab one station, `CAS04`, and have a look at its metadata and contents.
Here we will grab it from the `mth5_object`.

In [12]:
cas04 = mth5_object.get_station("CAS04", survey="CONUS_South")
cas04.metadata


### Changing Metadata
If you want to change the metadata of any group, be sure to use the `write_metadata` method.  Here's an example:

In [None]:
cas04.metadata.location.declination.value = 12.2
cas04.write_metadata()
print(cas04.metadata.location.declination)

## Have a look at a single channel

Let's pick out a channel and interrogate it. There are a few ways to get a channel:
1. From the `hdf5_reference` [*demonstrated here*]
2. From `mth5_object`
3. Get a station first, then get a channel from it


In [None]:
ex = mth5_object.from_reference(ch_df.iloc[-6].hdf5_reference).to_channel_ts()
print(ex)

In [None]:
ex.channel_metadata

## Calibrate time series data
Most data loggers output data in digital counts. A series of filters that represent the various instrument responses is then applied to convert the data into physical units, after which the data can be analyzed and processed. Commonly this is done during the processing step, but it is important to be able to look at time series data in physical units. Here we provide a `remove_instrument_response` method in the `ChannelTS` object. Here's an example:

In [None]:
print(ex.channel_response_filter)
ex.channel_response_filter.plot_response(np.logspace(0, 4, 50))

In [None]:
ex.remove_instrument_response(plot=True)
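
To make the idea concrete, consider the special case in which every stage of the response is a frequency-independent gain, as with a coefficient filter. Removing the response then reduces to dividing the samples by the product of the stage gains. The sketch below illustrates only that special case, with made-up gain values. It is not how `remove_instrument_response` is implemented. Frequency-dependent stages require the division to be done in the frequency domain.

```python
from math import prod


def remove_gain_stages(counts, gains):
    """Divide samples by the product of frequency-independent stage gains."""
    total_gain = prod(gains)
    if total_gain == 0:
        raise ValueError("total gain is zero; the response cannot be inverted")
    return [sample / total_gain for sample in counts]


# Hypothetical stage gains and samples, for illustration only
stage_gains = [2.0, 5.0]
raw_counts = [100.0, -250.0, 40.0]

physical = remove_gain_stages(raw_counts, stage_gains)
print(physical)
```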

## Have a look at a run

Let's pick out a run, take a slice of it, and interrogate it. There are a few ways to get a run:
1. From the `run_hdf5_reference` [*demonstrated here*]
2. From `mth5_object`
3. Get a station first, then get a run from it

In [None]:
run_from_reference = mth5_object.from_reference(ch_df.iloc[0].run_hdf5_reference).to_runts(start=ch_df.iloc[0].start.isoformat(), n_samples=360)
print(run_from_reference)

In [None]:
run_from_reference.plot()

### Calibrate Run

In [None]:
calibrated_run = run_from_reference.calibrate()
calibrated_run.plot()

### Have a look at the transfer function summary

In [None]:
mth5_object.tf_summary.summarize()
tf_df = mth5_object.tf_summary.to_dataframe()
tf_df

## Close MTH5

We have now loaded all the long-period data we need. We can now process these data using **Aurora** and have a look at the transfer functions using **MTpy**.

In [18]:
mth5_object.close_mth5()

2024-02-16T09:36:19.636939-0800 | INFO | mth5.mth5 | close_mth5 | Flushing and closing 8P_CAS04_NVR08.h5
