# Create a v0.2.0 MTH5 file from the IRIS Data Management Center

**Note:** This example assumes that the data availability (network, station, channel, start, end) is already known. If you do not know which data you want to download, use [IRIS tools](https://ds.iris.edu/ds/nodes/dmc/tools/##) to check data availability.

In [1]:
# IMPORTS
from pathlib import Path
import sys

import pandas as pd

from mth5.clients.make_mth5 import MakeMTH5
from mth5 import mth5, timeseries

from mt_metadata.utils.mttime import get_now_utc, MTime

2021-11-01 15:38:49,614 [line 135] mth5.setup_logger - INFO: Logging file can be found C:\Users\jpeacock\Documents\GitHub\mth5\logs\mth5_debug.log


## Set the path to save files to as the current working directory

In [2]:
default_path = Path().cwd()

## Initialize a MakeMTH5 object

Here, we set the MTH5 file version to 0.2.0 so that a single file can hold multiple surveys, and we set the client to "IRIS". The request is made with `obspy.clients` tools; see the list of available [FDSN clients](https://docs.obspy.org/packages/obspy.clients.fdsn.html).

**Note:** Only the "IRIS" client has been tested.

In [3]:
m = MakeMTH5(mth5_version='0.2.0')
m.client = "IRIS"

## Make the data inquiry as a DataFrame

There are a couple of ways to build the data request.

1. Make a DataFrame by hand. Here we make a list of entries and then create a DataFrame with the proper column names.
2. Create a CSV file with a row for each entry. Be aware of the formatting requirements: the column names must match those in the table below, and date-times must be formatted as YYYY-MM-DDThh:mm:ss.


| Column Name         |   Description                                                                                                 |
| ------------------- | --------------------------------------------------------------------------------------------------------------|
| **network**         | [FDSN Network code (2 letters)](http://www.fdsn.org/networks/)                                                |
| **station**         | [FDSN Station code (usually 5 characters)](https://ds.iris.edu/ds/nodes/dmc/data/formats/seed-channel-naming/)|
| **location**        | [FDSN Location code (typically not used for MT)](http://docs.fdsn.org/projects/source-identifiers/en/v1.0/location-codes.html) |
| **channel**         | [FDSN Channel code (3 characters)](http://docs.fdsn.org/projects/source-identifiers/en/v1.0/channel-codes.html)|
| **start**           | Start time (YYYY-MM-DDThh:mm:ss) UTC |
| **end**             | End time (YYYY-MM-DDThh:mm:ss) UTC  |
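
As a rough illustration of the CSV route, the sketch below writes a small request file by hand with the standard library and checks it against the formatting rules in the table. The column names and time format come from the table; the helper `check_request_rows` is made up for this example and is not part of MTH5.

```python
import csv
import io
from datetime import datetime

# Column names and time format taken from the table above.
REQUEST_COLUMNS = ["network", "station", "location", "channel", "start", "end"]
TIME_FORMAT = "%Y-%m-%dT%H:%M:%S"


def check_request_rows(rows):
    """Hypothetical helper: list formatting problems found in request rows."""
    problems = []
    for number, row in enumerate(rows):
        missing = [name for name in REQUEST_COLUMNS if name not in row]
        if missing:
            problems.append(f"row {number}: missing columns {missing}")
            continue
        for key in ("start", "end"):
            try:
                datetime.strptime(row[key] or "", TIME_FORMAT)
            except ValueError:
                problems.append(f"row {number}: bad {key} time {row[key]!r}")
    return problems


entries = [
    {"network": "ZU", "station": "CAS04", "location": "", "channel": "LQE",
     "start": "2020-06-02T19:00:00", "end": "2020-07-13T19:00:00"},
    {"network": "ZU", "station": "CAS04", "location": "", "channel": "LQN",
     "start": "2020-06-02T19:00:00", "end": "2020-07-13T19:00:00"},
]

# Write the header plus one row per channel into an in-memory CSV.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=REQUEST_COLUMNS)
writer.writeheader()
writer.writerows(entries)
csv_text = buffer.getvalue()

# Read the CSV back and check every row.
rows_read = list(csv.DictReader(io.StringIO(csv_text)))
problems = check_request_rows(rows_read)
print(csv_text)
print(problems)
```

Running a check like this before sending the request catches a mistyped column name or a date written as `2020/06/02` early, without contacting IRIS.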

In [4]:
# Request entries from two networks (EM and ZU) to test a multi-network MTH5

EMCAY10LFE = ['EM', 'CAY10', '', 'LFE', '2019-10-07T00:00:00', '2019-10-30T00:00:00'] 
EMCAY10LFN = ['EM', 'CAY10', '', 'LFN', '2019-10-07T00:00:00', '2019-10-30T00:00:00'] 
EMCAY10LFZ = ['EM', 'CAY10', '', 'LFZ', '2019-10-07T00:00:00', '2019-10-30T00:00:00'] 
EMCAY10LQE = ['EM', 'CAY10', '', 'LQE', '2019-10-07T00:00:00', '2019-10-30T00:00:00'] 
EMCAY10LQN = ['EM', 'CAY10', '', 'LQN', '2019-10-07T00:00:00', '2019-10-30T00:00:00'] 
ZUCAS04LQ1 = ['ZU', 'CAS04', '', 'LQE', '2020-06-02T19:00:00', '2020-07-13T19:00:00']
ZUCAS04LQ2 = ['ZU', 'CAS04', '', 'LQN', '2020-06-02T19:00:00', '2020-07-13T19:00:00']
ZUCAS04BF1 = ['ZU', 'CAS04', '', 'LFE', '2020-06-02T19:00:00', '2020-07-13T19:00:00']
ZUCAS04BF2 = ['ZU', 'CAS04', '', 'LFN', '2020-06-02T19:00:00', '2020-07-13T19:00:00']
ZUCAS04BF3 = ['ZU', 'CAS04', '', 'LFZ', '2020-06-02T19:00:00', '2020-07-13T19:00:00']
ZUNRV08LQ1 = ['ZU', 'NVR08', '', 'LQE', '2020-06-02T19:00:00', '2020-07-13T19:00:00']
ZUNRV08LQ2 = ['ZU', 'NVR08', '', 'LQN', '2020-06-02T19:00:00', '2020-07-13T19:00:00']
ZUNRV08BF1 = ['ZU', 'NVR08', '', 'LFE', '2020-06-02T19:00:00', '2020-07-13T19:00:00']
ZUNRV08BF2 = ['ZU', 'NVR08', '', 'LFN', '2020-06-02T19:00:00', '2020-07-13T19:00:00']
ZUNRV08BF3 = ['ZU', 'NVR08', '', 'LFZ', '2020-06-02T19:00:00', '2020-07-13T19:00:00']
request_list = [
    EMCAY10LFE, EMCAY10LFN, EMCAY10LFZ, EMCAY10LQE, EMCAY10LQN,
    ZUCAS04LQ1, ZUCAS04LQ2, ZUCAS04BF1, ZUCAS04BF2, ZUCAS04BF3,
    ZUNRV08LQ1, ZUNRV08LQ2, ZUNRV08BF1, ZUNRV08BF2, ZUNRV08BF3
]

# Turn list into dataframe
request_df =  pd.DataFrame(request_list, columns=m.column_names)
request_df

|    | network | station | location | channel | start               | end                 |
| -- | ------- | ------- | -------- | ------- | ------------------- | ------------------- |
| 0  | EM      | CAY10   |          | LFE     | 2019-10-07T00:00:00 | 2019-10-30T00:00:00 |
| 1  | EM      | CAY10   |          | LFN     | 2019-10-07T00:00:00 | 2019-10-30T00:00:00 |
| 2  | EM      | CAY10   |          | LFZ     | 2019-10-07T00:00:00 | 2019-10-30T00:00:00 |
| 3  | EM      | CAY10   |          | LQE     | 2019-10-07T00:00:00 | 2019-10-30T00:00:00 |
| 4  | EM      | CAY10   |          | LQN     | 2019-10-07T00:00:00 | 2019-10-30T00:00:00 |
| 5  | ZU      | CAS04   |          | LQE     | 2020-06-02T19:00:00 | 2020-07-13T19:00:00 |
| 6  | ZU      | CAS04   |          | LQN     | 2020-06-02T19:00:00 | 2020-07-13T19:00:00 |
| 7  | ZU      | CAS04   |          | LFE     | 2020-06-02T19:00:00 | 2020-07-13T19:00:00 |
| 8  | ZU      | CAS04   |          | LFN     | 2020-06-02T19:00:00 | 2020-07-13T19:00:00 |
| 9  | ZU      | CAS04   |          | LFZ     | 2020-06-02T19:00:00 | 2020-07-13T19:00:00 |


## Save the request as a CSV

It's helpful to save the request as a CSV so that you can modify and reuse it later. A CSV file can also be passed as the request to `MakeMTH5`.

In [5]:
request_df.to_csv(default_path.joinpath("fdsn_request.csv"))
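
With its default settings, `to_csv` also writes the DataFrame index as an unnamed first column, and pandas reads empty fields back as `NaN` unless told otherwise. The sketch below shows one way to round-trip a request through a CSV with plain pandas. The file name `roundtrip_request.csv` is made up for this example, and how `MakeMTH5` itself parses a CSV is not covered here.

```python
import tempfile
from pathlib import Path

import pandas as pd

columns = ["network", "station", "location", "channel", "start", "end"]
small_request = pd.DataFrame(
    [["EM", "CAY10", "", "LFE", "2019-10-07T00:00:00", "2019-10-30T00:00:00"]],
    columns=columns,
)

# Save with the default settings, which include the index column.
csv_file = Path(tempfile.mkdtemp()) / "roundtrip_request.csv"
small_request.to_csv(csv_file)

# A plain read picks up the index as an extra column and turns "" into NaN.
naive = pd.read_csv(csv_file)

# Treat the first column as the index, read the request columns as strings,
# and keep empty fields (the location code) as empty strings.
restored = pd.read_csv(
    csv_file,
    index_col=0,
    dtype={name: str for name in columns},
    keep_default_na=False,
)

print(list(naive.columns))
print(list(restored.columns))
```

If you edit the CSV by hand, reading it back this way keeps the empty location code intact before the table is reused as a request.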

## Get only the metadata from IRIS

It can be helpful to confirm that your request returns what you expect. To do that, you can request only the metadata from IRIS. The request is quick and lightweight, so you should not need to worry about speed.

In [6]:
inventory, data = m.get_inventory_from_df(request_df, data=False)

Have a look at the Inventory to make sure it contains what is requested.

In [7]:
inventory

Inventory created at 2021-11-01T22:38:50.335383Z
	Created by: ObsPy 1.2.2
		    https://www.obspy.org
	Sending institution: MTH5
	Contains:
		Networks (2):
			EM, ZU
		Stations (3):
			EM.CAY10 (Indio Hills, CA, USA)
			ZU.CAS04 (Corral Hollow, CA, USA)
			ZU.NVR08 (Rhodes Salt Marsh, NV, USA)
		Channels (15):
			EM.CAY10..LFZ, EM.CAY10..LFN, EM.CAY10..LFE, EM.CAY10..LQN, 
			EM.CAY10..LQE, ZU.CAS04..LFZ, ZU.CAS04..LFN, ZU.CAS04..LFE, 
			ZU.CAS04..LQN, ZU.CAS04..LQE, ZU.NVR08..LFZ, ZU.NVR08..LFN, 
			ZU.NVR08..LFE, ZU.NVR08..LQN, ZU.NVR08..LQE
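
Checking the inventory by eye gets tedious for larger requests. The sketch below compares request rows against a list of channel identifiers in the `NET.STA.LOC.CHA` form printed above. It works on plain Python lists; the helper names are made up for this example, and the assumption that the identifier list would come from ObsPy's `inventory.get_contents()["channels"]` should be checked against your ObsPy version.

```python
def requested_ids(rows):
    """Build NET.STA.LOC.CHA identifiers from the first four request fields."""
    return {".".join(row[:4]) for row in rows}


def compare_request_to_inventory(rows, inventory_channels):
    """Hypothetical helper: return (missing, extra) channel identifiers."""
    wanted = requested_ids(rows)
    found = set(inventory_channels)
    return sorted(wanted - found), sorted(found - wanted)


request_rows = [
    ["EM", "CAY10", "", "LFE", "2019-10-07T00:00:00", "2019-10-30T00:00:00"],
    ["ZU", "NVR08", "", "LQN", "2020-06-02T19:00:00", "2020-07-13T19:00:00"],
    ["ZU", "NVR08", "", "LQE", "2020-06-02T19:00:00", "2020-07-13T19:00:00"],
]

# Assumption: in the notebook this list would be taken from the inventory,
# e.g. inventory.get_contents()["channels"]. Here it is typed in by hand,
# with one channel left out on purpose.
inventory_channels = ["EM.CAY10..LFE", "ZU.NVR08..LQN"]

missing, extra = compare_request_to_inventory(request_rows, inventory_channels)
print("Requested but not in inventory:", missing)
print("In inventory but not requested:", extra)
```

An empty `missing` list means every requested channel has metadata, which is worth confirming before starting the much longer data download.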

## Make an MTH5 from a request

Now that we've created a request and made sure that it's what we expect, we can make an MTH5 file. The input can be either the DataFrame or the CSV file.

We time the request to get an indication of how long it might take. It should take about 4 minutes.

**Note:** We set `interact=True` so that we can interrogate the file when it's complete. If you just want to write the file, leave `interact=False`, the default.

In [None]:
begin = MTime(get_now_utc())

mth5_object = m.make_mth5_from_fdsnclient(request_df, interact=True)

end = MTime(get_now_utc())

print(f"Created {mth5_object.filename}.  Took {(int(end - begin) // 60)}:{(end - begin) % 60:05.2f} minutes")

2021-11-01 15:38:52,410 [line 653] mth5.mth5.MTH5._initialize_file - INFO: Initialized MTH5 file C:\Users\jpeacock\Documents\GitHub\mth5\examples\notebooks\EM_CAY10_ZU_CAS04_NVR08.h5 in mode w
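
The timing pattern in the cell above can also be written with the standard library alone. This is a minimal sketch, assuming you only need elapsed wall-clock time: it uses `time.perf_counter` instead of `MTime`, and `format_elapsed` is a made-up helper that produces the same `minutes:seconds` layout as the print statement above.

```python
import time


def format_elapsed(seconds):
    """Format a duration in seconds as M:SS.ss."""
    # Round first so that 59.999 s becomes 1:00.00 rather than 0:60.00.
    minutes, remainder = divmod(round(seconds, 2), 60)
    return f"{int(minutes)}:{remainder:05.2f}"


start = time.perf_counter()
time.sleep(0.01)  # stand-in for the long-running request
elapsed = time.perf_counter() - start

print(f"Took {format_elapsed(elapsed)} minutes")
```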


## Have a look at the contents of the created file

In [None]:
mth5_object

## Create a DataFrame that summarizes each channel dataset

**Note:** This is quite slow because neither attribute access nor the MTH5 code that builds this summary table is optimized for speed. You should only need to do this once for a given file. Also note the `hdf5_reference` column: it is an internal reference for an open HDF5 file and can be used to directly access a group or dataset.

In [None]:
begin = MTime(get_now_utc())

ch_summary = mth5_object.channel_summary
ch_summary.to_csv(default_path.joinpath("channel_summary.csv"))

end = MTime(get_now_utc())

print(f"Created channel summary {default_path.joinpath('channel_summary.csv')}")
print(f"Took {(int(end - begin) // 60)}:{(end - begin) % 60:05.2f} minutes")
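
Because the summary is slow to build, one option is to reuse the saved CSV on later runs. The sketch below shows that caching pattern with a stand-in table. `load_or_build_summary` and `fake_summary` are made up for this example, and the column names in the stand-in are placeholders rather than the real summary columns.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_or_build_summary(csv_path, build_summary):
    """Hypothetical helper: read the cached CSV, or build and save the table."""
    csv_path = Path(csv_path)
    if csv_path.exists():
        return pd.read_csv(csv_path, index_col=0)
    summary = build_summary()
    summary.to_csv(csv_path)
    return summary


calls = []


def fake_summary():
    """Stand-in for the slow summary step; records each time it runs."""
    calls.append(1)
    return pd.DataFrame({"station": ["CAS04", "NVR08"], "component": ["ex", "hx"]})


cache_file = Path(tempfile.mkdtemp()) / "channel_summary.csv"
first = load_or_build_summary(cache_file, fake_summary)   # builds and saves
second = load_or_build_summary(cache_file, fake_summary)  # reads the CSV
print(f"Summary built {len(calls)} time(s)")
```

In the notebook, the builder could presumably be `lambda: mth5_object.channel_summary`. Keep in mind that `hdf5_reference` refers to an open HDF5 file, so a summary reloaded from CSV is likely only useful for its descriptive columns.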

## When you are finished, be sure to close the file

In [None]:
#mth5_object.close_mth5()