# Example of Creating Metadata for MTH5

This example shows how to build MTH5 metadata from several kinds of input: a CSV file, a pandas DataFrame, and a JSON file.

The same approach works for any mth5.metadata class object.  The classes most relevant to an MT survey are:
    - Survey
    - Station
    - Run
    - Electric
    - Magnetic
    - Auxiliary


In [1]:
# Temporary workaround: remove this once mth5 can be installed as a package.
import os
from pathlib import Path

# change this to your local mth5 repository
os.chdir(r"c:\Users\jpeacock\Documents\GitHub\mth5")

# import mth5 metadata
from mth5 import metadata

2020-12-04T10:36:52 [line 23] mth5.<module> - INFO: Started MTH5
2020-12-04T10:36:52 [line 28] mth5.<module> - INFO: Debug Log file can be found at c:\Users\jpeacock\Documents\GitHub\mth5\mth5_debug.log
2020-12-04T10:36:52 [line 29] mth5.<module> - INFO: Error Log file can be found at c:\Users\jpeacock\Documents\GitHub\mth5\mth5_error.log


### *Note*: 

Metadata key names must match the MTH5 standards; otherwise you will run into issues.  If you do not know the standard keys for a particular metadata class, call get_attribute_list() on an instance, as shown here.  This works for any mth5.metadata class object.

In [21]:
survey = metadata.Survey()
survey.get_attribute_list()

['acquired_by.author',
 'acquired_by.comments',
 'citation_dataset.doi',
 'citation_journal.doi',
 'comments',
 'country',
 'datum',
 'fdsn.channel_code',
 'fdsn.id',
 'fdsn.network',
 'fdsn.new_epoch',
 'geographic_name',
 'name',
 'northwest_corner.latitude',
 'northwest_corner.longitude',
 'project',
 'project_lead.author',
 'project_lead.email',
 'project_lead.organization',
 'release_license',
 'southeast_corner.latitude',
 'southeast_corner.longitude',
 'summary',
 'survey_id',
 'time_period.end_date',
 'time_period.start_date']
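
Because key names must match the standard, it can help to check a dictionary against the attribute list before loading it. The following is a minimal sketch, not part of mth5: `find_unknown_keys` is a hypothetical helper, and `standard_keys` is a hand-copied subset of the list printed above (with mth5 installed it would come from `survey.get_attribute_list()`). The check does not account for aliases such as 'lon', so an alias would be reported as unknown.

```python
from difflib import get_close_matches

# A few of the standard Survey keys, copied by hand from the output above.
# With mth5 installed this list would come from survey.get_attribute_list().
standard_keys = [
    "country",
    "datum",
    "northwest_corner.latitude",
    "northwest_corner.longitude",
    "release_license",
    "survey_id",
    "time_period.end_date",
    "time_period.start_date",
]


def find_unknown_keys(input_dict, allowed_keys):
    """Map each key that is not in allowed_keys to a list of close matches."""
    unknown = {}
    for key in input_dict:
        if key not in allowed_keys:
            unknown[key] = get_close_matches(key, allowed_keys, n=3)
    return unknown


# made-up input with two deliberate mistakes
candidate = {
    "survey_id": "mt-2020",
    "time_period.start": "2020-01-01",  # standard key is time_period.start_date
    "release_licence": "CC-0",  # misspelled
}

unknown_keys = find_unknown_keys(candidate, standard_keys)
for key, suggestions in unknown_keys.items():
    print(f"{key!r} is not a standard key; close matches: {suggestions}")
```

Running a check like this first makes spelling mistakes visible before any metadata object is filled.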

If you are unsure what a key name means, use attribute_information to get details about that attribute:

In [22]:
survey.attribute_information("release_license")

release_license:
	alias: []
	description: How the data can be used. The options are based on Creative Commons licenses. For details visit https://creativecommons.org/licenses/
	example: CC-0
	options: ['CC-0', 'CC-BY', 'CC-BY-SA', 'CC-BY-ND', 'CC-BY-NC-SA', 'CC-BY-NC-ND']
	required: True
	style: controlled vocabulary
	type: string
	units: None


In [26]:
survey.attribute_information("northwest_corner.longitude")

northwest_corner.longitude:
	alias: ['lon', 'long']
	description: longitude of location in datum specified at survey level
	example: 14.23
	options: []
	required: True
	style: number
	type: float
	units: degrees


## Create example files to read in
This includes a CSV file and a JSON file.

In [13]:
# create an example csv as a list of lines (header first)
example_csv = ["id,location.latitude,location.longitude,location.elevation,time_period.start,time_period.end",
               "mt01,10.0,12.0,100,2020-01-01,2020-01-02",
               "mt02,15,25,150,2020-01-05,2020-01-10"]

# write to a file
example_csv_fn = Path(Path.cwd(), "example.csv")
with open(example_csv_fn, 'w') as fid:
    fid.write("\n".join(example_csv))
print(f" --> Wrote example to {example_csv_fn}")

print("\n".join(example_csv))

 --> Wrote example to c:\Users\jpeacock\Documents\GitHub\mth5\example.csv
id,location.latitude,location.longitude,location.elevation,time_period.start,time_period.end
mt01,10.0,12.0,100,2020-01-01,2020-01-02
mt02,15,25,150,2020-01-05,2020-01-10


In [36]:
import json

# create an example JSON file
example_json = '{"station": {"id": "mt01", "location.elevation": 100.0, "location.latitude": 10.0, "location.longitude": 12.0, "time_period.end": "2020-01-02T00:00:00+00:00", "time_period.start": "2020-01-01T00:00:00+00:00"}}' 
example_json_fn = Path(Path.cwd(), "example.json")
with open(example_json_fn, 'w') as fid:
    json.dump(json.loads(example_json), fid)
    
print(f" --> Wrote example to {example_json_fn}")

print(example_json)

 --> Wrote example to c:\Users\jpeacock\Documents\GitHub\mth5\example.json
{"station": {"id": "mt01", "location.elevation": 100.0, "location.latitude": 10.0, "location.longitude": 12.0, "time_period.end": "2020-01-02T00:00:00+00:00", "time_period.start": "2020-01-01T00:00:00+00:00"}}


## Create metadata from a csv file
Use this approach if your metadata were collected in a table, for example a spreadsheet made in Excel or a similar program.

### Read the csv lines into a list of dictionaries to use as metadata input

In [3]:
# the first line holds the column names, which become the dictionary keys
keys = example_csv[0].split(',')
list_of_dictionaries = []
for line in example_csv[1:]:
    # pair each key with the value in the same column; values remain strings
    line_dict = dict(zip(keys, line.split(',')))
    list_of_dictionaries.append(line_dict)

print(list_of_dictionaries)

[{'id': 'mt01', 'location.latitude': '10.0', 'location.longitude': '12.0', 'location.elevation': '100', 'time_period.start': '2020-01-01', 'time_period.end': '2020-01-02'}, {'id': 'mt02', 'location.latitude': '15', 'location.longitude': '25', 'location.elevation': '150', 'time_period.start': '2020-01-05', 'time_period.end': '2020-01-10'}]
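
Splitting on commas by hand works for this small example but breaks if a value contains a quoted comma. As an alternative sketch, the standard library's `csv.DictReader` produces the same list of dictionaries and handles quoting. The text is read from an in-memory string here so the block runs on its own; the variable names `csv_text` and `rows` are illustrative.

```python
import csv
import io

# same text as the example CSV file above
csv_text = "\n".join([
    "id,location.latitude,location.longitude,location.elevation,time_period.start,time_period.end",
    "mt01,10.0,12.0,100,2020-01-01,2020-01-02",
    "mt02,15,25,150,2020-01-05,2020-01-10",
])

# DictReader takes the keys from the first row; all values stay strings
reader = csv.DictReader(io.StringIO(csv_text))
rows = [dict(row) for row in reader]

for row in rows:
    print(row)
```

To read the file written earlier instead, open it with `open(example_csv_fn, newline="")` and pass the file object to `csv.DictReader`.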


#### Input into MTH5 station metadata

We'll do one entry at a time as an example.  Later you can loop over all entries to create a separate metadata instance for each.

In [27]:
csv_station = metadata.Station()
csv_station.from_dict(list_of_dictionaries[0])
csv_station

{
    "station": {
        "acquired_by.author": null,
        "channels_recorded": [],
        "data_type": null,
        "geographic_name": null,
        "id": "mt01",
        "location.declination.model": null,
        "location.declination.value": null,
        "location.elevation": 100.0,
        "location.latitude": 10.0,
        "location.longitude": 12.0,
        "orientation.method": null,
        "orientation.reference_frame": "geographic",
        "provenance.creation_time": "2020-12-04T19:18:31.029691+00:00",
        "provenance.software.author": null,
        "provenance.software.name": null,
        "provenance.software.version": null,
        "provenance.submitter.author": null,
        "provenance.submitter.email": null,
        "provenance.submitter.organization": null,
        "time_period.end": "2020-01-02T00:00:00+00:00",
        "time_period.start": "2020-01-01T00:00:00+00:00"
    }
}
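
Looping over every entry could look like the sketch below. So that it runs without mth5 installed, it uses `StandInStation`, a hypothetical stand-in that only stores the dictionary it is given; in practice you would pass `metadata.Station` instead. `stations_from_rows` is likewise an illustrative helper, not an mth5 function, and the shortened `rows` list stands in for `list_of_dictionaries`.

```python
class StandInStation:
    """Hypothetical stand-in for metadata.Station; it only stores the input."""

    def __init__(self):
        self.attributes = {}

    def from_dict(self, input_dict):
        self.attributes.update(input_dict)


def stations_from_rows(rows, station_class):
    """Create one metadata instance per row, keyed by the row's station id."""
    stations = {}
    for row in rows:
        station = station_class()
        station.from_dict(row)
        stations[row["id"]] = station
    return stations


# shortened version of the list of dictionaries built from the csv
rows = [
    {"id": "mt01", "location.latitude": "10.0", "location.longitude": "12.0"},
    {"id": "mt02", "location.latitude": "15", "location.longitude": "25"},
]

# with mth5 available: stations_from_rows(list_of_dictionaries, metadata.Station)
stations = stations_from_rows(rows, StandInStation)
print(sorted(stations))
```

Keying the result by station id makes it easy to look up a single station later.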

### Read in csv using Pandas

In [28]:
import pandas as pd

example_df = pd.read_csv(example_csv_fn, header=0)

print(example_df)

     id  location.latitude  location.longitude  location.elevation  \
0  mt01               10.0                12.0                 100   
1  mt02               15.0                25.0                 150   

  time_period.start time_period.end  
0        2020-01-01      2020-01-02  
1        2020-01-05      2020-01-10  


In [29]:
pandas_station = metadata.Station()
pandas_station.from_dict(example_df.iloc[0].to_dict())
pandas_station

{
    "station": {
        "acquired_by.author": null,
        "channels_recorded": [],
        "data_type": null,
        "geographic_name": null,
        "id": "mt01",
        "location.declination.model": null,
        "location.declination.value": null,
        "location.elevation": 100.0,
        "location.latitude": 10.0,
        "location.longitude": 12.0,
        "orientation.method": null,
        "orientation.reference_frame": "geographic",
        "provenance.creation_time": "2020-12-04T19:18:38.060924+00:00",
        "provenance.software.author": null,
        "provenance.software.name": null,
        "provenance.software.version": null,
        "provenance.submitter.author": null,
        "provenance.submitter.email": null,
        "provenance.submitter.organization": null,
        "time_period.end": "2020-01-02T00:00:00+00:00",
        "time_period.start": "2020-01-01T00:00:00+00:00"
    }
}

## Create metadata from JSON file
If your metadata are stored in a JSON file, you can read it with the json package, which loads the contents into a dictionary.

In [37]:
with open(example_json_fn, "r") as fid:
    example_json_dict = json.load(fid)
    
print(example_json_dict)
    
station_json = metadata.Station()
station_json.from_dict(example_json_dict)
station_json

{'station': {'id': 'mt01', 'location.elevation': 100.0, 'location.latitude': 10.0, 'location.longitude': 12.0, 'time_period.end': '2020-01-02T00:00:00+00:00', 'time_period.start': '2020-01-01T00:00:00+00:00'}}


{
    "station": {
        "acquired_by.author": null,
        "channels_recorded": [],
        "data_type": null,
        "geographic_name": null,
        "id": "mt01",
        "location.declination.model": null,
        "location.declination.value": null,
        "location.elevation": 100.0,
        "location.latitude": 10.0,
        "location.longitude": 12.0,
        "orientation.method": null,
        "orientation.reference_frame": "geographic",
        "provenance.creation_time": "2020-12-04T19:25:30.486927+00:00",
        "provenance.software.author": null,
        "provenance.software.name": null,
        "provenance.software.version": null,
        "provenance.submitter.author": null,
        "provenance.submitter.email": null,
        "provenance.submitter.organization": null,
        "time_period.end": "2020-01-02T00:00:00+00:00",
        "time_period.start": "2020-01-01T00:00:00+00:00"
    }
}
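
The example file above already uses dot-separated keys such as location.latitude. If your own JSON file nests these values instead (an assumption, not something the example requires), one option is to flatten it into dotted keys before passing it on. `flatten_keys` below is a hypothetical helper, not part of mth5, and whether from_dict accepts nested dictionaries directly is not covered here.

```python
def flatten_keys(nested, parent_key=""):
    """Flatten nested dictionaries into one level with dot-separated keys."""
    flat = {}
    for key, value in nested.items():
        full_key = f"{parent_key}.{key}" if parent_key else key
        if isinstance(value, dict):
            # recurse, carrying the path so far as the key prefix
            flat.update(flatten_keys(value, full_key))
        else:
            flat[full_key] = value
    return flat


# hypothetical nested layout of the same station metadata
nested_station = {
    "id": "mt01",
    "location": {"latitude": 10.0, "longitude": 12.0, "elevation": 100.0},
    "time_period": {
        "start": "2020-01-01T00:00:00+00:00",
        "end": "2020-01-02T00:00:00+00:00",
    },
}

flat_station = flatten_keys(nested_station)
print(flat_station)
```

The flattened dictionary has the same keys as the station entry in the example JSON file.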