# Make an MTH5 from NIMS data

This notebook shows how to read NIMS (.BIN) files into an MTH5 file. Each NIMS file represents a single run.

There are two ways to do this. First, we demonstrate the automated `from_nims()` method in `MakeMTH5`, which should satisfy most users' needs. If a more precise read is required, or if more metadata needs to be added to the MTH5 file before archiving, the older, more granular read is outlined afterward.

In [1]:
from mth5.mth5 import MTH5
from mth5.io.nims import NIMSCollection
from mth5 import read_file
from mth5.clients import MakeMTH5

### The New `from_nims()` Method

As of MTH5 v0.3.5, you can call `from_nims()` with the path to a folder containing a .BIN file, and the file is loaded automatically. By default, the result is saved under the name `from_nims.h5`; in the run shown in this notebook, the log reports it being written to the same folder as the .BIN file.
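
A minimal sketch of that call with an early check that the folder really contains a .BIN file. `find_bin_files` and `build_mth5_from_nims` are hypothetical helper names, not part of MTH5; the only library call is `MakeMTH5.from_nims(path)`, used the same way as in the build cell of this notebook. The import is placed inside the function so the file check can run without MTH5 installed.

```python
from pathlib import Path


def find_bin_files(folder):
    """Return the NIMS .BIN files in `folder`, matching the suffix case-insensitively."""
    folder = Path(folder)
    return sorted(
        p for p in folder.iterdir() if p.is_file() and p.suffix.lower() == ".bin"
    )


def build_mth5_from_nims(folder):
    """Hypothetical wrapper: fail early if there is nothing to read, then build."""
    if not find_bin_files(folder):
        raise FileNotFoundError(f"No .BIN files found in {folder}")
    # Lazy import so the check above does not require MTH5.
    from mth5.clients import MakeMTH5

    # Same call as in this notebook; it returns the path of the new MTH5 file.
    return MakeMTH5.from_nims(folder)
```

The wrapper adds nothing to the read itself; it only turns a silent mistake (pointing at the wrong folder) into an immediate error.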

### NIMS Collection

We will use `NIMSCollection` to organize the *.bin* files by run. Each NIMS output file includes the data for all channels of a single run, so the collection is relatively simple.

*Metadata:* set `survey_id` to provide the minimal metadata needed when making an MTH5 file.

`NIMSCollection.get_runs()` returns a two-level ordered dictionary (`OrderedDict`). The first level is keyed by station ID, and each value is in turn an ordered dictionary keyed by run ID, so you can loop over stations and then over runs.

**Note**: `n_samples` and `end` are estimates based on file size, not on the data. To get accurate values, read in the full file.
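
The nested loop can be sketched with a stand-in dictionary that has the same shape. `iter_runs` is a hypothetical helper, and the innermost value here is a plain dict used only for illustration; in the notebook it is the per-run table returned by `get_runs()`.

```python
from collections import OrderedDict


def iter_runs(runs):
    """Yield (station_id, run_id, run_table) from a station -> run mapping."""
    for station_id, station_runs in runs.items():
        for run_id, run_table in station_runs.items():
            yield station_id, run_id, run_table


# Stand-in with the same two-level layout as the output of get_runs().
example_runs = OrderedDict(
    KLA103=OrderedDict(KLA103a={"sample_rate": 8, "sequence_number": 1})
)

for station_id, run_id, run_table in iter_runs(example_runs):
    print(station_id, run_id, run_table["sample_rate"])  # KLA103 KLA103a 8
```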

In [2]:
nims_station_path = r"C:\Users\jpopelar\OneDrive - DOI\Documents\Kilauea\raw\NIMS_103_1305-1"
nc = NIMSCollection(nims_station_path)
nc.survey_id = "KL103"
runs = nc.get_runs(sample_rates=[1])
print(f"Found {len(runs)} station with {len(runs[list(runs.keys())[0]])} runs")
list(runs.keys())

Found 1 station with 1 runs


['KLA103']

In [3]:
for run_id, run_df in runs["KLA103"].items():
    display(run_df)

Unnamed: 0,survey,station,run,start,end,channel_id,component,fn,sample_rate,file_size,n_samples,sequence_number,dipole,coil_number,latitude,longitude,elevation,instrument_id,calibration_fn
0,KL103,KLA103,KLA103a,2023-07-13 00:31:53+00:00,2023-08-04 08:49:19+00:00,1,"hx,hy,hz,ex,ey,temperature",C:\Users\jpopelar\OneDrive - DOI\Documents\Kil...,8,252914688,15445171,1,"[41.2, 75.9]",,,,,NIMS,
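
The `end` value in the table above is consistent with adding `n_samples / sample_rate` seconds to `start`. The sketch below reproduces it with the standard library. `estimate_end` is a hypothetical helper, and the formula is inferred from the numbers in the table rather than taken from the MTH5 source.

```python
from datetime import datetime, timedelta, timezone


def estimate_end(start, n_samples, sample_rate):
    """Estimate the end time as start + n_samples / sample_rate, in whole seconds."""
    duration_s = int(n_samples / sample_rate)
    return start + timedelta(seconds=duration_s)


start = datetime(2023, 7, 13, 0, 31, 53, tzinfo=timezone.utc)
end = estimate_end(start, n_samples=15445171, sample_rate=8)
print(end.isoformat())  # 2023-08-04T08:49:19+00:00
```

Because `n_samples` itself comes from the file size, both values shift once the file is actually read, which is why the channel summary later in this notebook reports 15445104 samples and slightly different start and end times.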


## Build MTH5

Now that we have a logical collection of files, let's load them into an MTH5. As mentioned above, the `from_nims()` method handles this for you. If you would prefer to build the file by hand, you can loop over the stations, runs, and channels in the ordered dictionary.

There are a few things to keep in mind if you opt for the latter method:

- The NIMS raw files come with very little metadata, so you will have to input most of it manually.
- The resulting NIMS .bin file(s) are believed to be already calibrated into units of nT and mV/km, in which case there are no filters to apply to calibrate the data.
- Since this is an MTH5 file version 0.2.0, the filters are in the `survey_group`, so add them there.

The process for doing this is very similar to the make_mth5_from_lemi424 example notebook. Please reference the routines there for an idea on how to accomplish a manual read.
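
For orientation, a manual read might look roughly like the sketch below. It is modeled on the loop pattern of the LEMI example and is an assumption, not the author's routine: `manual_nims_read` is a hypothetical name, and the MTH5 calls it relies on (`add_survey`, `stations_group.add_station`, `add_run`, `from_runts`, `metadata.update`, `write_metadata`) should be checked against the installed MTH5 version. The open `MTH5` object and `read_file` are passed in as arguments so the loop can be exercised without real data.

```python
def manual_nims_read(runs, mth5_obj, read_file, survey_id):
    """Sketch: add every station and run in `runs` to an already-open MTH5.

    `mth5_obj` is assumed to be an open mth5.mth5.MTH5 (file version 0.2.0)
    and `read_file` is assumed to be mth5.read_file.
    """
    survey_group = mth5_obj.add_survey(survey_id)
    for station_id, station_runs in runs.items():
        station_group = survey_group.stations_group.add_station(station_id)
        run_ts = None
        for run_id, run_df in station_runs.items():
            run_group = station_group.add_run(run_id)
            # A NIMS run is a single .BIN file, so this normally loops once.
            for fn in run_df["fn"]:
                run_ts = read_file(fn)
                run_ts.run_metadata.id = run_id
                run_group.from_runts(run_ts)
        if run_ts is not None:
            # Carry over the station metadata read from the file.
            station_group.metadata.update(run_ts.station_metadata)
            station_group.write_metadata()
    return survey_group
```

With the objects from this notebook, the call would be along the lines of `manual_nims_read(runs, m, read_file, nc.survey_id)` on an `MTH5` opened in write mode. Any extra metadata or filters would be added to the returned survey group before closing the file.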

In [4]:
mth5_path = MakeMTH5.from_nims(nims_station_path)

2026-01-14T09:46:17.112509-0700 | INFO | mth5.mth5 | _initialize_file | Initialized MTH5 0.2.0 file C:\Users\jpopelar\OneDrive - DOI\Documents\Kilauea\raw\NIMS_103_1305-1\from_nims.h5 in mode w
2026-01-14T09:46:59.666740-0700 | INFO | mth5.mth5 | close_mth5 | Flushing and closing C:\Users\jpopelar\OneDrive - DOI\Documents\Kilauea\raw\NIMS_103_1305-1\from_nims.h5


#### MTH5 Structure

Have a look at the MTH5 structure and make sure it looks correct.

In [5]:
m = MTH5()
m.open_mth5(mth5_path)

/:
    |- Group: Experiment
    --------------------
        |- Group: Reports
        -----------------
        |- Group: Standards
        -------------------
            --> Dataset: summary
            ......................
        |- Group: Surveys
        -----------------
            |- Group: none
            --------------
                |- Group: Filters
                -----------------
                    |- Group: coefficient
                    ---------------------
                        |- Group: dipole_41.20
                        ----------------------
                        |- Group: dipole_75.90
                        ----------------------
                        |- Group: e_analog_to_digital
                        -----------------------------
                        |- Group: h_analog_to_digital
                        -----------------------------
                        |- Group: to_mt_units
                        ---------------------
                 

### Channel Summary

Have a look at the channel summary and make sure everything looks good.

In [6]:
m.channel_summary.summarize()
m.channel_summary.to_dataframe()

Unnamed: 0,survey,station,run,latitude,longitude,elevation,component,start,end,n_samples,sample_rate,measurement_type,azimuth,tilt,units,has_data,hdf5_reference,run_hdf5_reference,station_hdf5_reference
0,none,KLA103,KLA103a,19.433237,-155.308795,1213.5,ex,2023-07-13 00:34:55+00:00,2023-08-04 08:52:12.875000+00:00,15445104,8.0,electric,0.0,0.0,counts,True,<HDF5 object reference>,<HDF5 object reference>,<HDF5 object reference>
1,none,KLA103,KLA103a,19.433237,-155.308795,1213.5,ey,2023-07-13 00:34:55+00:00,2023-08-04 08:52:12.875000+00:00,15445104,8.0,electric,90.0,0.0,counts,True,<HDF5 object reference>,<HDF5 object reference>,<HDF5 object reference>
2,none,KLA103,KLA103a,19.433237,-155.308795,1213.5,hx,2023-07-13 00:34:55+00:00,2023-08-04 08:52:12.875000+00:00,15445104,8.0,magnetic,0.0,0.0,counts,True,<HDF5 object reference>,<HDF5 object reference>,<HDF5 object reference>
3,none,KLA103,KLA103a,19.433237,-155.308795,1213.5,hy,2023-07-13 00:34:55+00:00,2023-08-04 08:52:12.875000+00:00,15445104,8.0,magnetic,90.0,0.0,counts,True,<HDF5 object reference>,<HDF5 object reference>,<HDF5 object reference>
4,none,KLA103,KLA103a,19.433237,-155.308795,1213.5,hz,2023-07-13 00:34:55+00:00,2023-08-04 08:52:12.875000+00:00,15445104,8.0,magnetic,0.0,0.0,counts,True,<HDF5 object reference>,<HDF5 object reference>,<HDF5 object reference>
5,none,KLA103,KLA103a,19.433237,-155.308795,1213.5,temperature,2023-07-13 00:34:55+00:00,2023-08-04 08:52:12.875000+00:00,15445104,8.0,auxiliary,0.0,0.0,celsius,True,<HDF5 object reference>,<HDF5 object reference>,<HDF5 object reference>
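
One way to make "everything looks good" concrete is to check that all channels share the same time span and that `end` equals `start + (n_samples - 1) / sample_rate`. The sketch below does this on plain dictionaries typed in from the summary above. `check_channel_rows` is a hypothetical helper, not an MTH5 function.

```python
from datetime import datetime, timedelta, timezone


def check_channel_rows(rows):
    """Return a list of problems found in channel-summary rows (empty if none)."""
    problems = []
    spans = {(r["start"], r["end"], r["n_samples"], r["sample_rate"]) for r in rows}
    if len(spans) > 1:
        problems.append("channels do not share start, end, n_samples and sample_rate")
    for r in rows:
        # The last sample sits (n_samples - 1) sample intervals after the first.
        expected = r["start"] + timedelta(
            seconds=(r["n_samples"] - 1) / r["sample_rate"]
        )
        if expected != r["end"]:
            problems.append(f"{r['component']}: end does not match n_samples")
        if not r["has_data"]:
            problems.append(f"{r['component']}: no data")
    return problems


start = datetime(2023, 7, 13, 0, 34, 55, tzinfo=timezone.utc)
end = datetime(2023, 8, 4, 8, 52, 12, 875000, tzinfo=timezone.utc)
rows = [
    {"component": c, "start": start, "end": end, "n_samples": 15445104,
     "sample_rate": 8.0, "has_data": True}
    for c in ("ex", "ey", "hx", "hy", "hz", "temperature")
]
print(check_channel_rows(rows))  # []
```

In the notebook, the rows could presumably be taken from `m.channel_summary.to_dataframe().to_dict("records")` instead of being typed by hand.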


## Close the MTH5

This is important: close the file when you are done using it. Otherwise the file may be locked or left in an inconsistent state if you try to open it with another program or Python interpreter.
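
To make the close hard to forget, the open and close calls can be wrapped in a small context manager. This is a sketch under stated assumptions: `opened_mth5` and its `factory` argument are hypothetical (the latter exists only so the pattern can be tested without MTH5), and the only library calls are `open_mth5()` and `close_mth5()` as used in this notebook.

```python
from contextlib import contextmanager


@contextmanager
def opened_mth5(path, mode="r", factory=None):
    """Open an MTH5 file and make sure close_mth5() runs, even after an error."""
    if factory is None:
        # Lazy import; by default the real MTH5 class is used.
        from mth5.mth5 import MTH5 as factory
    m = factory()
    m.open_mth5(path, mode)
    try:
        yield m
    finally:
        m.close_mth5()
```

Used as `with opened_mth5(mth5_path) as m: ...`, the file is closed when the block exits, whether or not the code inside raised an exception.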

In [7]:
m.close_mth5()

2026-01-14T09:47:02.804014-0700 | INFO | mth5.mth5 | close_mth5 | Flushing and closing C:\Users\jpopelar\OneDrive - DOI\Documents\Kilauea\raw\NIMS_103_1305-1\from_nims.h5
