# Standard Name Convention

A standard name convention defines how metadata is named and which syntax is accepted. The intent is efficient, automated and unambiguous data exploration and processing.

At a minimum, a name identifier should be defined as an attribute of every dataset in an HDF file. A popular one is "standard_name", as used by the climate and forecast (CF) community. For example, a standard name must be lower case and must not contain spaces. Furthermore, the rules for constructing names are defined in online documentation, and naming tables (standard name tables) list the standard names currently accepted by the community. This package adopts this concept by introducing standardized name tables (class `StandardizedNameTable`), which allow flexible usage of such name definitions.
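
The spelling rules mentioned above can be sketched with a regular expression. This is only an illustration and not the check performed by the package: the text states "lower case" and "no spaces", while the exact pattern (letters, digits and underscores, starting with a letter) and the helper name `is_valid_spelling` are assumptions.

```python
import re

# Assumed pattern: lower-case letters, digits and underscores, starting with
# a letter. Only "lower case" and "no spaces" are taken from the text.
STANDARD_NAME_PATTERN = re.compile(r'[a-z][a-z0-9_]*')


def is_valid_spelling(name: str) -> bool:
    """Return True if `name` satisfies the assumed spelling rules."""
    return STANDARD_NAME_PATTERN.fullmatch(name) is not None


for candidate in ('x_velocity', 'X_Velocity', 'x velocity'):
    print(candidate, is_valid_spelling(candidate))
```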

In [None]:
import h5rdmtoolbox as h5tbx

Whenever a dataset is written and the parameter "standard_name" is set, the name is verified against the standard name convention/table associated with the wrapper class. If the constant `STRICT` is set to True (default), the name is looked up in the table and, if it is not found, the dataset cannot be written. To allow standard names that fulfill the spelling requirements but are not yet listed in the table, set `STRICT` to False:

In [None]:
h5tbx.conventions.identifier.STRICT = False
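
The difference between strict and non-strict checking can be sketched in plain Python. The helper `is_acceptable` and the dictionary-based table are hypothetical and only mirror the behaviour described above; they are not the package's implementation.

```python
def is_acceptable(name, table, strict=True):
    """Hypothetical check: spelling first, then (if strict) table lookup."""
    # Spelling requirements taken from the text: lower case, no spaces.
    spelled_ok = name != '' and name == name.lower() and ' ' not in name
    if not spelled_ok:
        return False
    if strict:
        # Strict mode: the name must also be listed in the table.
        return name in table
    return True


table = {'x_velocity': {'canonical_units': 'm/s'}}

print(is_acceptable('x_velocity', table))                # listed name
print(is_acceptable('y_velocity', table))                # not listed, strict
print(is_acceptable('y_velocity', table, strict=False))  # not listed, relaxed
```

In this sketch a badly spelled name is rejected in both modes; only the table lookup is relaxed.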

## Initialize a Standard Name Convention
A standardized name table is an XML document that contains (at least) a description and a canonical unit for each standardized name. We'll build one from scratch first and then look at tables that are already implemented:

Call `StandardizedNameTable` from the sub-package `conventions` and provide a `name`, `version_number`, `table_dict`, `contact` and `institution`:

In [None]:
sc = h5tbx.conventions.StandardizedNameTable(name='Test_SNC', table_dict={}, version_number=1, contact='contact@python.com', institution='my_institution')
sc

We have built an empty convention (no table content). Let's add content. We can do this by creating a dictionary first...

In [None]:
tabledict = {'x_velocity': dict(canonical_units='m/s', description='velocity is a vector quantity.')}
tabledict

... and add it to the object by calling `update()`:

In [None]:
sc.update(tabledict)
sc.dump()

New entries are added with `set`, and existing entries are changed with `modify`:

In [None]:
sc.set('time', canonical_units='s', description='physical time')
sc.modify('x_velocity', canonical_units='m/s', description='velocity is a vector quantity. x indicates the component in x-axis direction')
sc.set('y_velocity', canonical_units='m/s', description='velocity is a vector quantity. y indicates the component in y-axis direction')
sc.set('z_velocity', canonical_units='m/s', description='velocity is a vector quantity. z indicates the component in z-axis direction')
sc.sdump()
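
The split between adding and changing entries can be illustrated with a small dictionary-backed class. `MiniTable` is a hypothetical stand-in; in particular, raising an error when `set` hits an existing name or `modify` hits a missing one is an assumption made for this sketch, not documented behaviour of `StandardizedNameTable`.

```python
class MiniTable:
    """Hypothetical stand-in for a name table, backed by a plain dict."""

    def __init__(self):
        self.entries = {}

    def set(self, name, canonical_units, description):
        # Assumption: `set` is meant for names that are not in the table yet.
        if name in self.entries:
            raise KeyError(f'{name!r} already exists; use modify() instead')
        self.entries[name] = {'canonical_units': canonical_units,
                              'description': description}

    def modify(self, name, **fields):
        # Assumption: `modify` is meant for names that already exist.
        if name not in self.entries:
            raise KeyError(f'{name!r} does not exist; use set() instead')
        self.entries[name].update(fields)


mini = MiniTable()
mini.set('time', canonical_units='s', description='physical time')
mini.modify('time', description='physical time since start')
print(mini.entries['time'])
```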

## Writing Standard Name Convention to file

Standardized name tables should be saved as XML documents or YAML files:

In [None]:
xml_filename = h5tbx.generate_temporary_filename(suffix='.xml')
sc.to_xml(xml_filename)

yml_filename = h5tbx.generate_temporary_filename(suffix='.yml')
sc.to_yaml(yml_filename)
pass

## Load Standard Name Convention from file

If you already have standard name tables at hand, just load them. They must be provided as XML or YAML files:

In [None]:
sc_test = h5tbx.conventions.StandardizedNameTable.from_xml(xml_filename)
print(sc_test.versionname)
sc_test.dump()

sc_test = h5tbx.conventions.StandardizedNameTable.from_yml(yml_filename)
print(sc_test.versionname)
sc_test.dump()

## Load from web
Ideally, a community has already defined a naming convention, as is the case for the CF conventions, from which the concept is adopted. Let's import their XML document (version 79 in this example):

In [None]:
cf_xml_filename = h5tbx.conventions.StandardizedNameTable.from_web(url='https://cfconventions.org/Data/cf-standard-names/79/src/cf-standard-name-table.xml')
cf_xml_filename

In [None]:
cf_xml_filename.dump(max_rows=4)
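
For orientation, the CF XML document stores each standard name as an `entry` element with `canonical_units` and `description` children. The following sketch reads such a structure with the Python standard library only. The XML excerpt is hand-written and abbreviated (it is not downloaded, and the descriptions are placeholders), and it does not show how the package parses the file.

```python
import xml.etree.ElementTree as ET

# Hand-written, abbreviated excerpt in the layout of the CF table.
# The descriptions are shortened placeholders, not the official texts.
XML_EXCERPT = """
<standard_name_table>
  <version_number>79</version_number>
  <entry id="x_wind">
    <canonical_units>m s-1</canonical_units>
    <description>Placeholder description.</description>
  </entry>
  <entry id="air_temperature">
    <canonical_units>K</canonical_units>
    <description>Placeholder description.</description>
  </entry>
</standard_name_table>
"""

root = ET.fromstring(XML_EXCERPT.strip())

# Build a plain dictionary: standard name -> units and description.
table = {
    entry.get('id'): {
        'canonical_units': entry.findtext('canonical_units'),
        'description': entry.findtext('description'),
    }
    for entry in root.iter('entry')
}
version = root.findtext('version_number')

print(version, sorted(table))
```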

## Perform checks
A naming convention can be used to check whether a new standard name and its units comply with it:

In [None]:
cf_xml_filename.check_name('zenith_angle', strict=True)

In [None]:
cf_xml_filename['x_wind_gust'].canonical_units

In [None]:
try:
    cf_xml_filename.check_units('x_wind_gust', units='m/s')
except h5tbx.conventions.StandardizedNameError as e:
    print(e)

In [None]:
from h5rdmtoolbox.conventions.identifier import _units_power_fix

In [None]:
_units_power_fix('degree')

In [None]:
try:
    cf_xml_filename.check_units('zenith_angle', units='K')
except h5tbx.conventions.StandardizedNameError as e:
    print(e)
cf_xml_filename.check_units('zenith_angle', units='degree')
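
A units check has to treat different spellings of the same unit as equal, for example `m s-1` (the notation typically used in CF tables) and `m/s`. The sketch below shows one simple way to do that by reducing a unit string to a mapping of symbol to exponent. The helpers `parse_units` and `units_match` are hypothetical, handle only simple strings, do not convert between units (for example `km` and `m`), and are not the package's implementation.

```python
import re


def parse_units(units: str) -> dict:
    """Parse a simple unit string into {symbol: exponent}.

    Supports space-separated factors with optional integer exponents
    ('m s-1', 'm s^-1', 'kg m-3') and a single '/' ('m/s').
    """
    exponents = {}
    numerator, _, denominator = units.partition('/')
    for part, sign in ((numerator, 1), (denominator, -1)):
        for token in part.split():
            match = re.fullmatch(r'([A-Za-z]+)\^?(-?\d+)?', token)
            if match is None:
                raise ValueError(f'cannot parse unit token: {token!r}')
            symbol = match.group(1)
            power = int(match.group(2) or 1)
            exponents[symbol] = exponents.get(symbol, 0) + sign * power
    # Drop symbols that cancel out, e.g. 'm/m'.
    return {s: p for s, p in exponents.items() if p != 0}


def units_match(a: str, b: str) -> bool:
    """Return True if both strings describe the same symbols and exponents."""
    return parse_units(a) == parse_units(b)


print(units_match('m s-1', 'm/s'))
print(units_match('degree', 'K'))
```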