# Check Compatibility of Input File

## License Information

This file is part of _hvsrpy_, a Python package for horizontal-to-vertical spectral ratio processing.

    Copyright (C) 2019-2025 Joseph P. Vantassel (joseph.p.vantassel@gmail.com)

    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program.  If not, see <https://www.gnu.org/licenses/>.
    
## About _hvsrpy_

_hvsrpy_ is an open-source Python package for performing horizontal-to-vertical spectral ratio (HVSR) processing of microtremor and earthquake recordings. _hvsrpy_ was developed by [Joseph P. Vantassel](https://www.jpvantassel.com/) with contributions from Dana M. Brannon under the supervision of Brady R. Cox at The University of Texas at Austin. _hvsrpy_ continues to be developed and maintained by [Joseph P. Vantassel and his research group at Virginia Tech](https://geoimaging-research.org/).

## Citation

If you use _hvsrpy_ in your research or consulting, we ask that you please cite the following:

> Vantassel, J.P. (2025). "_hvsrpy_: An Open‐Source Python Package for Microtremor
> and Earthquake Horizontal‐to‐Vertical Spectral Ratio Processing". Seismological
> Research Letters. 96 (4): 2671–2682,
> [https://doi.org/10.1785/0220240395](https://doi.org/10.1785/0220240395)

> Joseph Vantassel. (2020). jpvantassel/hvsrpy: latest (Concept). Zenodo.
> [http://doi.org/10.5281/zenodo.3666956](http://doi.org/10.5281/zenodo.3666956)

_For software, version specific citations should be preferred to
general concept citations. To generate a version specific citation
for `hvsrpy`, please use the citation tool on the `hvsrpy`
[archive](http://doi.org/10.5281/zenodo.3666956)._

## About this notebook

This notebook checks an example data file for common formatting issues. If a formatting issue is identified, the notebook suggests how to fix the file so that it can be read by _hvsrpy_.

## Getting Started

1. Install _hvsrpy_ with `pip install hvsrpy`. More information on _pip_ can be found [here](https://jpvantassel.github.io/python3-course/#/intro/pip). __(~3 minutes)__
2. Run the provided example. __(~2 minutes)__
3. Swap the provided example with your own. __(~2 minutes)__

Happy Processing!

In [1]:
import pathlib

from termcolor import colored
import obspy
import hvsrpy

## User Input

Provide the path(s) to the file(s) for processing. The file(s) can be of the following types:
- MiniSEED (MSEED)
- Guralp Compressed Format (GCF)
- Seismic Analysis Code (SAC)

In [2]:
# Inputs -------------
# Option 1: for three-components in a single file
fname_set = ["./data/UT.STN11.A2_C150.miniseed"]

# Option 2: for three, single-component files
# fname_set = [f"./data/453016990.0001.{c}.miniseed" for c in "NEZ"]

# --------------------
print()
print(f"{len(fname_set)} file name(s) provided:")
for fname in fname_set:
    print(f"  {fname}")

for file in fname_set:
    if not pathlib.Path(file).exists():
        raise FileNotFoundError(f"file {file} not found; check spelling.")
print(colored("All file(s) exist.", "green"))
print()


1 file name(s) provided:
  ./data/UT.STN11.A2_C150.miniseed
All file(s) exist.



## Check File Type

Check whether each file can be read by ObsPy and whether any special read arguments (`obspy_read_kwargs`) are needed.

In [3]:
# Inputs -------------
obspy_read_kwargs = dict()

# --------------------
failed = False
for file in fname_set:
    try:
        obspy.read(file, **obspy_read_kwargs)
    except Exception:
        failed = True
        break

print()
if failed:
    print(colored("At least one file could not be read by ObsPy.", "red"))
    print("  1. Check if your file is in MiniSEED, GCF, or SAC format.")
    print("  2. Try supplying the file type as part of obspy_read_kwargs, e.g., obspy_read_kwargs = dict(format='MSEED').")
    print("  3. Try reading your file with Geopsy (geopsy.org) and exporting it as MiniSEED.")
else:
    print(colored("All file(s) were able to be read by ObsPy.", "green"))
print()


All file(s) were able to be read by ObsPy.


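If ObsPy cannot detect the file type on its own, the second suggestion printed by the cell above (supplying `format` through `obspy_read_kwargs`) can be scripted. The following is a minimal sketch and not part of _hvsrpy_: the helper `guess_read_kwargs` and the suffix table are hypothetical and assume that your files use conventional extensions. The format names `MSEED`, `GCF`, and `SAC` are assumed to match the identifiers accepted by `obspy.read`.

```python
import pathlib

# Hypothetical mapping from common file suffixes to ObsPy format names.
# Adjust it to match the naming convention of your own files.
SUFFIX_TO_FORMAT = {
    ".miniseed": "MSEED",
    ".mseed": "MSEED",
    ".gcf": "GCF",
    ".sac": "SAC",
}


def guess_read_kwargs(fname):
    """Return keyword arguments for obspy.read based on the file suffix."""
    suffix = pathlib.Path(fname).suffix.lower()
    file_format = SUFFIX_TO_FORMAT.get(suffix)
    # An empty dict leaves format detection to ObsPy.
    return {} if file_format is None else {"format": file_format}


obspy_read_kwargs = guess_read_kwargs("./data/UT.STN11.A2_C150.miniseed")
print(obspy_read_kwargs)
```

The resulting dictionary can be assigned to `obspy_read_kwargs` in the input section of the cell above. A suffix is only a hint, so a file that still fails to read may be in a different format than its name suggests.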

## Check Number of Traces

Check that the provided data contain exactly three traces, one for each component of the sensor.

In [4]:
print()
if len(fname_set) == 1:
    print("1 file name provided:")
    stream = obspy.read(fname_set[0], **obspy_read_kwargs)
    print(f"  {fname_set[0]}")
    if len(stream) != 3:
        print(colored(f"    File must contain 3 traces of data, but {len(stream)} traces were found.", "red"))
        print("      1. Try removing extra components from the ObsPy Stream object.")
        print("      2. Try merging together components, e.g., stream.merge()")
    else:
        print(colored("    File has 3 components as expected.", "green"))
elif len(fname_set) == 3:
    print("3 file names provided, checking each:")
    for fname in fname_set:
        print(f"  {fname}")
        stream = obspy.read(fname, **obspy_read_kwargs)
        if len(stream) != 1:
            print(colored(f"    File must contain 1 trace of data, but {len(stream)} traces were found.", "red"))
            print("      1. Try removing extra components from the ObsPy Stream object.")
            print("      2. Try merging together components, e.g., stream.merge()")
        else:
            print(colored("    File has 1 component as expected.", "green"))
else:
    print(colored(f"You can only provide one 3-component file or three 1-component files, but {len(fname_set)} file(s) were provided.", "red"))
print()


1 file name provided:
  ./data/UT.STN11.A2_C150.miniseed
    File has 3 components as expected.


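When a file contains more than the expected number of traces, it helps to know whether a component was split into several segments (a candidate for `stream.merge()`) or whether extra components are present (a candidate for removal). The sketch below is illustrative only and uses just the standard library. The helper `count_traces_per_channel` and the channel codes are hypothetical; with real data, the codes could be gathered with `[trace.stats.channel for trace in stream]`.

```python
from collections import Counter


def count_traces_per_channel(channels):
    """Count how many traces share each channel code."""
    return Counter(channels)


# Hypothetical channel codes for a file that holds four traces.
channels = ["BHN", "BHN", "BHE", "BHZ"]
counts = count_traces_per_channel(channels)

for channel, count in counts.items():
    # More than one trace per channel suggests a split recording.
    note = "" if count == 1 else " (split into several traces; consider merging)"
    print(f"  {channel}: {count} trace(s){note}")
```

A channel that appears more than once points to a gap or a segmented recording, whereas more than three distinct channels points to extra components that should be removed.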

## Check Component Naming

Check that the component names end in one of the following sets: NEZ, 123, 12Z, or XYZ.

In [5]:
# Inputs -------------
degrees_from_north = None

# --------------------
print()
if len(fname_set) == 1:
    stream = obspy.read(fname_set[0], **obspy_read_kwargs)
elif len(fname_set) == 3:
    traces = []
    for fname in fname_set:
        _stream = obspy.read(fname, **obspy_read_kwargs)
        if len(_stream) != 1:
            raise ValueError(f"File must contain 1 trace of data, but {len(_stream)} traces were found.")
        traces.append(_stream[0])
    stream = obspy.Stream(traces)
else:
    raise ValueError(f"You can only provide one 3-component file or three 1-component files, but {len(fname_set)} file(s) were provided.")

try:
    hvsrpy.data_wrangler._orient_traces(stream, degrees_from_north)
except ValueError:
    failed = True
else:
    failed = False

print("The provided components are:")
for trace in stream:
    print(f"  {trace.meta.channel}")
if failed:
    print(colored("The components provided are not valid.", "red"))
    print("  1. If the components end in '123' or '12Z', try providing the sensor's orientation using degrees_from_north.")
    print("  2. Check that the data are from a three-component sensor.")
    print("  3. Try renaming the components such that they end with 'NEZ', '123', '12Z', or 'XYZ'.")
else:
    print(colored("The components provided are self-consistent and acceptable.", "green"))
print()


The provided components are:
  BHN
  BHE
  BHZ
The components provided are self-consistent and acceptable.


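The naming rule stated above can be illustrated without any seismic data. The sketch below is a simplified stand-in and not the logic used inside _hvsrpy_: the helper `matching_component_set` is hypothetical, it only compares the last character of each channel code against the four accepted sets, and it ignores orientation handling such as `degrees_from_north`.

```python
VALID_COMPONENT_SETS = ("NEZ", "123", "12Z", "XYZ")


def matching_component_set(channels):
    """Return the accepted component set matching the channel codes, or None."""
    if len(channels) != 3:
        return None
    # Only the final character of each channel code identifies the component.
    last_characters = {channel[-1].upper() for channel in channels}
    for component_set in VALID_COMPONENT_SETS:
        if last_characters == set(component_set):
            return component_set
    return None


print(matching_component_set(["BHN", "BHE", "BHZ"]))
print(matching_component_set(["BHN", "BHE", "BH1"]))
```

The first call matches the `NEZ` set, while the second mixes conventions and returns `None`, which corresponds to the case where renaming the components is suggested.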

## Check Start and End Times

Check that the traces share an overlapping period of recording.

In [6]:
start_times_utc = [trace.stats.starttime for trace in stream]
end_times_utc = [trace.stats.endtime for trace in stream]

print()
print("The start and end time of each trace in UTC is:")
for start_time, end_time in zip(start_times_utc, end_times_utc):
    print(f"  {start_time} - {end_time}")

duration_in_seconds = min(end_times_utc) - max(start_times_utc)
if duration_in_seconds < 0.1:
    print(colored("There is no segment where all three components are recording at the same time.", "red"))
    print("  1. Check the start times were correctly recorded by the sensor.")
    print("  2. Check the sensor's state-of-health to confirm the sensor was working correctly.")
    print("  3. Try a recording from another sensor.")
    print("  4. Try a second recording with the same sensor.")
else:
    duration_in_minutes = duration_in_seconds/60
    print(colored(f"{duration_in_minutes:.1f} minutes of three-component data were recorded.", "green"))
print()


The start and end time of each trace in UTC is:
  2017-05-04T07:00:00.000000Z - 2017-05-04T08:00:00.000000Z
  2017-05-04T07:00:00.000000Z - 2017-05-04T08:00:00.000000Z
  2017-05-04T07:00:00.000000Z - 2017-05-04T08:00:00.000000Z
60.0 minutes of three-component data were recorded.

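The overlap check above takes the latest start time and the earliest end time, and their difference is the common recording duration. A minimal sketch of that arithmetic with the standard library is shown below. The helper `common_window` and the example times are hypothetical, and `datetime` objects stand in for the `UTCDateTime` values that ObsPy provides.

```python
from datetime import datetime, timedelta


def common_window(start_times, end_times):
    """Return (start, end, duration in seconds) of the shared recording window."""
    start = max(start_times)  # the latest start bounds the window from below
    end = min(end_times)      # the earliest end bounds the window from above
    return start, end, (end - start).total_seconds()


# Hypothetical traces whose start and end times differ slightly.
t0 = datetime(2017, 5, 4, 7, 0, 0)
starts = [t0, t0 + timedelta(seconds=30), t0 + timedelta(seconds=10)]
ends = [t0 + timedelta(hours=1), t0 + timedelta(minutes=59), t0 + timedelta(hours=1)]

start, end, seconds = common_window(starts, ends)
if seconds <= 0:
    print("The traces do not overlap.")
else:
    print(f"Common window: {start} - {end} ({seconds / 60:.1f} minutes)")
```

With an ObsPy `Stream`, the same start and end values could then be passed to `stream.trim(starttime, endtime)` so that all three components cover an identical window. Whether trimming is needed before further processing is an assumption to verify for your workflow.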
