# Build MTH5 from USGS Geomagnetic data

It's common to use observatory data to study geomagnetic storms or as a remote reference. The USGS provides geomagnetic observatory data for observatories in North America. In the future this will be expanded to other observatories using well-developed packages like [geomagpy](https://pypi.org/project/geomagpy/).

You need to know ahead of time which observatories you want data from, the dates, and the type of data. Wildcards are not supported, so every observatory and time window must be listed explicitly. See [USGS Geomagnetic webservices](https://www.usgs.gov/tools/web-service-geomagnetism-data) for more information on allowed options.

Here we will download 2 days of data from 2 different observatories for the x and y components of calibrated data ('adjusted').

In [1]:
import pandas as pd
from mth5.clients import MakeMTH5


## Create a request DataFrame

The request is a `pandas.DataFrame` with the following columns:

| Column | Description | Options |
|--------|-------------|---------|
| observatory | Observatory code | BDT, BOU, TST, BRW, BRT, BSL, CMO, CMT, DED, DHT, FRD, FRN, GUA, HON, NEW, SHU, SIT, SJG, TUC, USGS, BLC, BRD, CBB, EUA, FCC, IQA, MEA, OTT, RES, SNK, STJ, VIC, YKC, HAD, HER, KAK |
| type | The type of data to download | variation, adjusted (*default*), quasi-definitive, definitive |
| elements | Components or elements of the geomagnetic data to download; should be a list | D, DIST, DST, E, E-E, E-N, F, G, H, SQ, SV, UK1, UK2, UK3, UK4, X, Y, Z |
| sampling_period | Sampling period of data to download, in seconds | 1, 60, 3600 |
| start | Start time (YYYY-MM-DDThh:mm:ss) in UTC | |
| end | End time (YYYY-MM-DDThh:mm:ss) in UTC | |
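
Since a malformed request is easier to catch before anything is downloaded, a quick sanity check against the options in the table can help. The following is only a sketch: `check_request` and the constants are hypothetical names introduced here and are not part of MTH5, and the check covers only the type, sampling period, element container, and time ordering.

```python
import pandas as pd

# Allowed values copied from the table above.
ALLOWED_TYPES = {"variation", "adjusted", "quasi-definitive", "definitive"}
ALLOWED_PERIODS = {1, 60, 3600}
REQUIRED_COLUMNS = [
    "observatory", "type", "elements", "sampling_period", "start", "end",
]


def check_request(df):
    """Hypothetical helper: return a list of problems found in a request."""
    missing = [col for col in REQUIRED_COLUMNS if col not in df.columns]
    if missing:
        return [f"missing columns: {missing}"]

    problems = []
    for idx, row in df.iterrows():
        if str(row["type"]).lower() not in ALLOWED_TYPES:
            problems.append(f"row {idx}: unknown type {row['type']!r}")
        if int(row["sampling_period"]) not in ALLOWED_PERIODS:
            problems.append(f"row {idx}: sampling_period must be 1, 60, or 3600")
        if not isinstance(row["elements"], (list, tuple)):
            problems.append(f"row {idx}: elements should be a list")
        if pd.Timestamp(row["end"]) <= pd.Timestamp(row["start"]):
            problems.append(f"row {idx}: end is not after start")
    return problems


example = pd.DataFrame(
    {
        "observatory": ["frn"],
        "type": ["adjusted"],
        "elements": [["x", "y"]],
        "sampling_period": [1],
        "start": ["2022-01-01T00:00:00"],
        "end": ["2022-01-02T00:00:00"],
    }
)
print(check_request(example))
```

An empty list means no problems were found by these particular checks; it does not guarantee that the web service will accept the request.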

In [2]:
request_df = pd.DataFrame(
    {
        "observatory": ["frn", "frn", "ott", "ott"],
        "type": ["adjusted"] * 4,
        "elements": [["x", "y"]] * 4,
        "sampling_period": [1] * 4,
        "start": [
            "2022-01-01T00:00:00",
            "2022-01-03T00:00:00",
            "2022-01-01T00:00:00",
            "2022-01-03T00:00:00",
        ],
        "end": [
            "2022-01-02T00:00:00",
            "2022-01-04T00:00:00",
            "2022-01-02T00:00:00",
            "2022-01-04T00:00:00",
        ],
    }
)

In [3]:
request_df

| | observatory | type | elements | sampling_period | start | end |
|---|-------------|------|----------|-----------------|-------|-----|
| 0 | frn | adjusted | [x, y] | 1 | 2022-01-01T00:00:00 | 2022-01-02T00:00:00 |
| 1 | frn | adjusted | [x, y] | 1 | 2022-01-03T00:00:00 | 2022-01-04T00:00:00 |
| 2 | ott | adjusted | [x, y] | 1 | 2022-01-01T00:00:00 | 2022-01-02T00:00:00 |
| 3 | ott | adjusted | [x, y] | 1 | 2022-01-03T00:00:00 | 2022-01-04T00:00:00 |
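
Because every observatory and time window needs its own row, typing the lists by hand gets tedious for larger requests. One possible way to build the same rows is to expand the combinations with `itertools.product`. This is a sketch using only pandas and the standard library; the variable names `observatories`, `windows`, and `expanded_df` are illustrative.

```python
from itertools import product

import pandas as pd

observatories = ["frn", "ott"]
# Each window is a (start, end) pair in UTC.
windows = [
    ("2022-01-01T00:00:00", "2022-01-02T00:00:00"),
    ("2022-01-03T00:00:00", "2022-01-04T00:00:00"),
]

# One row per (observatory, window) combination.
rows = [
    {
        "observatory": obs,
        "type": "adjusted",
        "elements": ["x", "y"],
        "sampling_period": 1,
        "start": start,
        "end": end,
    }
    for obs, (start, end) in product(observatories, windows)
]
expanded_df = pd.DataFrame(rows)
print(expanded_df)
```

The result has the same four rows, in the same order, as the request built above.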


## Adding Run ID

When the request is submitted, run names are assigned automatically to the different time windows using the pattern `f"sp{sampling_period}_{count:03}"`. For a 1-second sampling period, the first run is therefore `sp1_001`.
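
To make the naming pattern concrete, the sketch below applies it to the four time windows used in this example. It is not the MTH5 implementation: it assumes that the count restarts for each observatory and sampling period and follows the order of the start times, which may differ from what the library does internally.

```python
import pandas as pd

runs_df = pd.DataFrame(
    {
        "observatory": ["frn", "frn", "ott", "ott"],
        "sampling_period": [1, 1, 1, 1],
        "start": [
            "2022-01-01T00:00:00",
            "2022-01-03T00:00:00",
            "2022-01-01T00:00:00",
            "2022-01-03T00:00:00",
        ],
    }
)

# Assumption: windows are numbered in start-time order within each
# observatory and sampling period.
runs_df = runs_df.sort_values(
    ["observatory", "sampling_period", "start"]
).reset_index(drop=True)
count = runs_df.groupby(["observatory", "sampling_period"]).cumcount() + 1

# Apply the naming pattern quoted above.
runs_df["run"] = [
    f"sp{int(sampling_period)}_{int(n):03}"
    for sampling_period, n in zip(runs_df["sampling_period"], count)
]
print(runs_df[["observatory", "start", "run"]])
```

Under these assumptions each observatory gets the runs `sp1_001` and `sp1_002`.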