# MTpy Example 02

## Data Querying

With an `MTCollection` stored in an `MTH5` file, we can now work with the collection of transfer functions.  We can query for transfer functions with the same survey name, set a bounding box to get all transfer functions in a given area, and so on.  Queries are run against the pandas DataFrame that `MTCollection` provides.

`MTCollection` provides a few dataframes to use.  

  - `master_dataframe` which never changes and contains all the transfer functions in the file.
  - `working_dataframe` which is the dataframe that contains only the stations you have queried for from the `master_dataframe`.  By default it is equal to the `master_dataframe`.
  - `dataframe` which is an alias for the `working_dataframe`.

### MTCollection vs MTData

`MTCollection` is meant to be the archive and database where the transfer functions are stored.  This object is backed by an MTH5 file on disk.

`MTData` is meant to be the working object for analyzing, plotting, and creating input files for modeling.  If you want to store the manipulated transfer functions you can put them back into the `MTCollection`.  This object is held in memory (RAM).

In [1]:
from pathlib import Path
from mtpy import MTCollection
%matplotlib inline

### Open MTCollection

In the previous example we created an MTH5 file from existing Yellowstone data.  Let's open that file here for plotting.

In [2]:
mc = MTCollection()
mc.open_collection(
    Path.cwd().parent.parent.joinpath(
        "data", "transfer_functions", "yellowstone_mt_collection_02.h5"
    )
)

Make sure that everything is there as expected.

In [3]:
mc.dataframe

Unnamed: 0,station,survey,latitude,longitude,elevation,tf_id,units,has_impedance,has_tipper,has_covariance,period_min,period_max,hdf5_reference,station_hdf5_reference
0,IDD11,Transportable_Array,47.043986,-116.345982,850.750,IDD11,none,True,True,True,7.31429,18724.57,<HDF5 object reference>,<HDF5 object reference>
1,IDD12,Transportable_Array,47.047800,-115.348000,1137.862,IDD12,none,True,True,True,7.31429,18724.57,<HDF5 object reference>,<HDF5 object reference>
2,IDE11,Transportable_Array,46.353000,-116.212965,882.625,IDE11,none,True,True,True,7.31429,18724.57,<HDF5 object reference>,<HDF5 object reference>
3,IDE12,Transportable_Array,46.387100,-115.582000,1633.900,IDE12,none,True,True,True,7.31429,18724.57,<HDF5 object reference>,<HDF5 object reference>
4,IDF11,Transportable_Array,45.889496,-116.157113,1175.550,IDF11,none,True,True,True,7.31429,18724.57,<HDF5 object reference>,<HDF5 object reference>
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
216,SR930,Yellowstone-Snake_River_Plain,43.188800,-113.042300,1568.050,SR930,none,True,True,True,7.31429,18724.57,<HDF5 object reference>,<HDF5 object reference>
217,SR954,Yellowstone-Snake_River_Plain,43.689700,-112.375700,1536.925,SR954,none,True,True,True,7.31429,18724.57,<HDF5 object reference>,<HDF5 object reference>
218,SR966,Yellowstone-Snake_River_Plain,43.948300,-112.039000,1486.875,SR966,none,True,True,True,7.31429,18724.57,<HDF5 object reference>,<HDF5 object reference>
219,SR980,Yellowstone-Snake_River_Plain,44.258100,-111.553000,2075.900,SR980,none,True,True,True,7.31429,18724.57,<HDF5 object reference>,<HDF5 object reference>


### MTCollection Working Data Frame

The MTH5 file includes a summary of all the transfer functions it contains.  This summary is a property, so each time it is called it provides current information.  `MTCollection` uses the `tf_summary` of MTH5 and calls it `MTCollection.master_dataframe`.  If you have a file with many transfer functions that cover a wide area, you may not always want to use all the stations for plotting.  In that case you can set `MTCollection.working_dataframe` to a subset of the `master_dataframe`.  For convenience, `MTCollection` also has a property called `dataframe`, which returns the `working_dataframe` if one has been set and the `master_dataframe` otherwise.  We will see examples of this later.

 - `MTCollection.master_dataframe` is a property that calls `MTH5.tf_summary`.  Because it is a property it is updated in real time, providing a summary of all transfer functions in the collection or MTH5.
 - `MTCollection.working_dataframe` is an attribute that is a subset of the `master_dataframe`.  A user can set the `working_dataframe` by querying the `master_dataframe`.  Once the `working_dataframe` is set this will be used by the methods of `MTCollection`.
 - `MTCollection.dataframe` is a property that returns the `working_dataframe` if one has been set and the `master_dataframe` otherwise.  This is a convenience property for the user.
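
The fallback behaviour described in the list above can be illustrated with a small stand-in class.  This is only a sketch of the idea, not MTpy's implementation: `ToyCollection` is a hypothetical name, plain lists stand in for dataframes, and using `None` to mean "no working subset set" is an assumption made for the example.

```python
class ToyCollection:
    """Hypothetical stand-in that mimics the three dataframe attributes."""

    def __init__(self, rows):
        self._rows = list(rows)        # plays the role of MTH5.tf_summary
        self.working_dataframe = None  # assumption: None means "not set"

    @property
    def master_dataframe(self):
        # Rebuilt on every access, so it always reflects the current contents.
        return list(self._rows)

    @property
    def dataframe(self):
        # Fall back to the master summary when no working subset has been set.
        if self.working_dataframe is None:
            return self.master_dataframe
        return self.working_dataframe


toy = ToyCollection(["IDD11", "YNP01S", "YNP02S"])
before = toy.dataframe  # nothing set yet, so this is the full summary

# Setting a working subset changes what `dataframe` returns,
# while `master_dataframe` is left untouched.
toy.working_dataframe = [s for s in toy.master_dataframe if "YNP" in s]
after = toy.dataframe
```

Here `before` holds all three station names and `after` holds only the two that contain "YNP".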

In [4]:
mc.working_dataframe = mc.master_dataframe[mc.master_dataframe.station.str.contains("YNP")]
mc.dataframe

Unnamed: 0,station,survey,latitude,longitude,elevation,tf_id,units,has_impedance,has_tipper,has_covariance,period_min,period_max,hdf5_reference,station_hdf5_reference
140,YNP01S,YSBB,44.742639,-111.260583,2017.63,YNP01S,none,True,True,False,0.003906,1024.002621,<HDF5 object reference>,<HDF5 object reference>
141,YNP02S,YSBB,44.916667,-111.015472,2246.02,YNP02S,none,True,True,False,0.003906,1024.002621,<HDF5 object reference>,<HDF5 object reference>
142,YNP03B,YSBB,45.058972,-110.773333,1587.59,YNP03B,none,True,True,False,0.003906,1024.002621,<HDF5 object reference>,<HDF5 object reference>
143,YNP04B,YSBB,44.653278,-111.084722,2037.31,YNP04B,none,True,True,False,0.003906,1024.002621,<HDF5 object reference>,<HDF5 object reference>
144,YNP05S,YSBB,44.696556,-110.967917,2079.35,YNP05S,none,True,True,False,0.003906,1024.002621,<HDF5 object reference>,<HDF5 object reference>
145,YNP06S,YSBB,44.819056,-110.770444,2293.95,YNP06S,none,True,True,False,0.003906,512.006554,<HDF5 object reference>,<HDF5 object reference>
146,YNP07B,YSBB,44.915333,-110.73275,2217.66,YNP07B,none,True,True,False,0.003906,1024.002621,<HDF5 object reference>,<HDF5 object reference>
147,YNP08S,YSBB,44.958361,-110.549583,2076.33,YNP08S,none,True,True,False,0.003906,1024.002621,<HDF5 object reference>,<HDF5 object reference>
148,YNP09S,YSBB,44.356667,-111.305833,1918.18,YNP09S,none,True,True,False,0.003906,1024.002621,<HDF5 object reference>,<HDF5 object reference>
149,YNP10S,YSBB,44.531444,-111.118,2415.23,YNP10S,none,True,True,False,0.003906,1024.002621,<HDF5 object reference>,<HDF5 object reference>
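
The same boolean-mask pattern works on any column of the summary.  The sketch below uses a small hand-built pandas DataFrame (a few rows copied from the tables in this notebook, with only three of the columns) so that it runs without an MTH5 file; the variable names are illustrative only.

```python
import pandas as pd

# A few rows copied from the summary tables in this notebook.
summary = pd.DataFrame(
    {
        "station": ["IDD11", "SR930", "YNP01S", "YNP06S"],
        "survey": [
            "Transportable_Array",
            "Yellowstone-Snake_River_Plain",
            "YSBB",
            "YSBB",
        ],
        "period_max": [18724.57, 18724.57, 1024.002621, 512.006554],
    }
)

# Select every transfer function that belongs to one survey.
ysbb = summary[summary.survey == "YSBB"]

# Combine conditions with & (each condition wrapped in parentheses).
long_period = summary[(summary.survey == "YSBB") & (summary.period_max > 1000)]

ysbb_stations = ysbb.station.tolist()
long_period_stations = long_period.station.tolist()
```

With an open collection, a mask built this way from `mc.master_dataframe` can be assigned to `mc.working_dataframe` in the same way as the station-name query above.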


## Use a Bounding Box to set Working DataFrame

We can also set the `working_dataframe` by applying a bounding box to the `master_dataframe`.  This can be done with the method `apply_bbox`.


In [5]:
# Bounds are given as longitude min, longitude max, latitude min, latitude max.
mc.apply_bbox(-112, -109.5, 44, 45.75)
mc.dataframe

Unnamed: 0,station,survey,latitude,longitude,elevation,tf_id,units,has_impedance,has_tipper,has_covariance,period_min,period_max,hdf5_reference,station_hdf5_reference
72,MTG16,Transportable_Array,45.24855,-111.615,1781.738,MTG16,none,True,True,True,7.31429,7281.778,<HDF5 object reference>,<HDF5 object reference>
73,MTG17,Transportable_Array,45.269325,-110.455,1977.8,MTG17,none,True,True,True,7.31429,18724.57,<HDF5 object reference>,<HDF5 object reference>
74,MTG18,Transportable_Array,45.3061,-109.518,1688.0,MTG18,none,True,True,True,7.31429,18724.57,<HDF5 object reference>,<HDF5 object reference>
78,MTH16,Transportable_Array,44.6047,-111.565,2085.15,MTH16,none,True,True,True,7.31429,18724.57,<HDF5 object reference>,<HDF5 object reference>
106,WYH18,Transportable_Array,44.6759,-109.665,2213.7,WYH18,none,True,True,True,7.31429,18724.57,<HDF5 object reference>,<HDF5 object reference>
137,WYYS1,Transportable_Array,44.7185,-110.638,2423.45,WYYS1,none,True,True,True,7.31429,18724.57,<HDF5 object reference>,<HDF5 object reference>
138,WYYS2,Transportable_Array,44.39635,-110.577,2399.188,WYYS2,none,True,True,True,7.31429,18724.57,<HDF5 object reference>,<HDF5 object reference>
139,WYYS3,Transportable_Array,44.5607,-110.315,2387.725,WYYS3,none,True,True,True,7.31429,18724.57,<HDF5 object reference>,<HDF5 object reference>
140,YNP01S,YSBB,44.742639,-111.260583,2017.63,YNP01S,none,True,True,False,0.003906,1024.002621,<HDF5 object reference>,<HDF5 object reference>
141,YNP02S,YSBB,44.916667,-111.015472,2246.02,YNP02S,none,True,True,False,0.003906,1024.002621,<HDF5 object reference>,<HDF5 object reference>
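
Conceptually, a bounding box is a pair of range checks on the longitude and latitude columns.  The following is a rough pandas equivalent on a hand-built DataFrame (rows copied from the tables above); it is a sketch of the idea and not necessarily how `apply_bbox` is implemented.

```python
import pandas as pd

# A few rows copied from the summary tables in this notebook.
stations = pd.DataFrame(
    {
        "station": ["IDD11", "MTG18", "WYYS1", "YNP01S", "SR930"],
        "latitude": [47.043986, 45.3061, 44.7185, 44.742639, 43.1888],
        "longitude": [-116.345982, -109.518, -110.638, -111.260583, -113.0423],
    }
)

# Same bounds as the apply_bbox call above.
lon_min, lon_max, lat_min, lat_max = -112, -109.5, 44, 45.75

# Series.between is inclusive on both ends by default.
inside = stations.longitude.between(lon_min, lon_max) & stations.latitude.between(
    lat_min, lat_max
)
in_box = stations[inside].station.tolist()
```

`IDD11` and `SR930` fall outside the box on both axes, so only the other three stations remain in `in_box`.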


## MTData Object

Now that we have narrowed the data to just the stations we want, let's convert those stations to an `MTData` object so we can plot and analyze the data.  In the next series of notebooks we will demonstrate how to plot and analyze the data from the `MTData` object.

In [6]:
mt_data = mc.to_mt_data()

## Close Collection

Remember, it is important to close the collection when we are done so that no open instances of the H5 file are left behind.

In [7]:
mc.close_collection()

24:09:26T12:14:22 | INFO | line:777 |mth5.mth5 | close_mth5 | Flushing and closing c:\Users\jpeacock\OneDrive - DOI\Documents\GitHub\iris-mt-course-2022\data\transfer_functions\yellowstone_mt_collection_02.h5
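
In a script, one way to make sure the close call always runs, even if a query raises an error, is a `try`/`finally` block.  The sketch below uses a hypothetical `FakeCollection` class and a made-up file name so that it runs without an MTH5 file; only the method names `open_collection` and `close_collection` are taken from this notebook.

```python
class FakeCollection:
    """Hypothetical stand-in for an object that holds an open file handle."""

    def __init__(self):
        self.is_open = False

    def open_collection(self, filename):
        self.is_open = True

    def close_collection(self):
        self.is_open = False


fc = FakeCollection()
fc.open_collection("example.h5")  # hypothetical file name
try:
    # Any work that might fail goes here.
    raise ValueError("simulated failure while querying")
except ValueError as error:
    message = str(error)
finally:
    # Runs whether or not the block above raised an error.
    fc.close_collection()
```

After this runs, `fc.is_open` is `False` even though the work inside the `try` block failed.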
