# Standard Name Convention

A standard name convention defines how metadata is named and which syntax is accepted. The intent is efficient, automated and unambiguous data exploration and processing.

At a minimum, every dataset in an HDF file should carry a name identifier as an attribute. A popular choice is "standard_name", as used by the climate and forecast (CF) community. For example, a standard name must be lower case and must not contain spaces. How names are constructed is defined in the online documentation, and naming tables (standard name tables) list the standard names currently accepted by the community. This package adopts the concept by introducing standardized name tables (class `StandardizedNameTable`), which allow flexible use of such name definitions.
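
The two spelling rules named above (lower case, no spaces) can be written as a small check. This is a minimal sketch only: the function name `looks_like_standard_name` and the exact regular expression are assumptions made for illustration. They are not taken from this package or from the CF documents, whose rules may be stricter.

```python
import re

# Assumed pattern: starts with a lower-case letter, followed by
# lower-case letters, digits or underscores. No spaces, no upper case.
_NAME_PATTERN = re.compile(r"[a-z][a-z0-9_]*")


def looks_like_standard_name(name: str) -> bool:
    """Return True if `name` satisfies the assumed spelling rules."""
    return _NAME_PATTERN.fullmatch(name) is not None


print(looks_like_standard_name("x_velocity"))  # True
print(looks_like_standard_name("X Velocity"))  # False
```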

In [1]:
import h5rdmtoolbox as h5tbx

Whenever a dataset is written and the parameter "standard_name" is set, the name is verified against the standard name convention/table associated with the wrapper class. If the constant `STRICT` is set to True (default), the name is looked up in the table and, if it is not found, the dataset cannot be written. To allow standard names that fulfill the spelling requirements but are not yet listed in the table, set `STRICT` to False:

In [2]:
h5tbx.conventions.identifier.STRICT = False
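
The effect of the strict setting can be pictured with a plain dictionary standing in for the table. The helper `is_accepted` below is hypothetical and only mirrors the behaviour described in the text. It is not the package's code, and it assumes the name has already passed the spelling check.

```python
def is_accepted(name: str, table: dict, strict: bool = True) -> bool:
    """Hypothetical helper: decide whether `name` may be written.

    Assumes `name` already fulfills the spelling requirements.
    """
    if name in table:
        return True
    # Names missing from the table are accepted only in non-strict mode.
    return not strict


table = {"x_velocity": {"canonical_units": "m/s"}}

print(is_accepted("x_velocity", table, strict=True))   # True
print(is_accepted("y_velocity", table, strict=True))   # False
print(is_accepted("y_velocity", table, strict=False))  # True
```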

## Initialize a Standard Name Convention
A standardized name table is an XML document that contains (at least) a description and a canonical unit for each standardized name. We'll build one from scratch first and then look at tables that are already implemented:

Call `StandardizedNameTable` from the sub-package `conventions` and provide a `name`, `version_number`, `table_dict`, `contact` and `institution`:

In [3]:
sc = h5tbx.conventions.StandardizedNameTable(name='Test_SNC', table_dict={}, version_number=1, contact='contact@python.com', institution='my_institution')
sc

Test_SNC (version number: 1)

We have built an empty convention (no table content). Let's add content. We can do this by creating a dictionary first...

In [4]:
tabledict = {'x_velocity': dict(canonical_units='m/s', description='velocity is a vector quantity.')}
tabledict

{'x_velocity': {'canonical_units': 'm/s',
  'description': 'velocity is a vector quantity.'}}

... and add it to the object by calling `update()`:

In [5]:
sc.update(tabledict)
sc.dump()

Unnamed: 0,canonical_units,description
x_velocity,m/s,velocity is a vector quantity.


Entries can be assigned with `set` or `modify`, depending on whether the entry is new (`set`) or already exists (`modify`):

In [6]:
sc.set('time', canonical_units='s', description='physical time')
sc.modify('x_velocity', canonical_units='m/s', description='velocity is a vector quantity. x indicates the component in x-axis direction')
sc.set('y_velocity', canonical_units='m/s', description='velocity is a vector quantity. y indicates the component in y-axis direction')
sc.set('z_velocity', canonical_units='m/s', description='velocity is a vector quantity. z indicates the component in z-axis direction')
sc.dump()

Unnamed: 0,canonical_units,description
time,s,physical time
x_velocity,m/s,velocity is a vector quantity. x indicates the component in x-axis direction
y_velocity,m/s,velocity is a vector quantity. y indicates the component in y-axis direction
z_velocity,m/s,velocity is a vector quantity. z indicates the component in z-axis direction


## Writing Standard Name Convention to XML

Standardized name tables should be provided as XML documents:

In [7]:
xml_filename = h5tbx.generate_temporary_filename(suffix='.xml')
sc.to_xml(xml_filename)

WindowsPath('C:/Users/da4323/AppData/Local/h5rdmtoolbox/h5rdmtoolbox/tmp/tmp202/tmp0.xml')

## Load Standard Name Convention from XML

In [8]:
sc_test = h5tbx.conventions.StandardizedNameTable.from_xml(xml_filename)
print(sc_test.versionname)
sc_test.dump()

Test_SNC-v1


Unnamed: 0,canonical_units,description
time,s,physical time
x_velocity,m/s,velocity is a vector quantity. x indicates the component in x-axis direction
y_velocity,m/s,velocity is a vector quantity. y indicates the component in y-axis direction
z_velocity,m/s,velocity is a vector quantity. z indicates the component in z-axis direction


## Existing Standard Name Convention
The repository already provides a domain-specific convention for fluid problems. As it is far from complete and still under development, we instead look at the CF conventions, from which the concept is adopted. The current XML document can be downloaded from https://cfconventions.org/Data/cf-standard-names/79/src/cf-standard-name-table.xml using the `tutorial` module of this repo. As the output shows, the call returns a table object rather than a file name, despite the variable name `cf_xml_filename`:

In [9]:
cf_xml_filename = h5tbx.tutorial.Conventions.fetch_cf_standard_name_table()
cf_xml_filename

standard_name_table (version number: 79)

In [10]:
cf_xml_filename.dump(max_rows=4)

Unnamed: 0,canonical_units,grib,amip,description
acoustic_signal_roundtrip_travel_time_in_sea_water,s,,,"The quantity with standard name acoustic_signal_roundtrip_travel_time_in_sea_water is the time taken for an acoustic signal to propagate from the emitting instrument to a reflecting surface and back again to the instrument. In the case of an instrument based on the sea floor and measuring the roundtrip time to the sea surface, the data are commonly used as a measure of ocean heat content."
aerodynamic_particle_diameter,m,,,The diameter of a spherical particle with density 1000 kg m-3 having the same aerodynamic properties as the particles in question.
...,...,...,...,...
y_wind_gust,m s-1,,,"""y"" indicates a vector component along the grid y-axis, positive with increasing y. Wind is defined as a two-dimensional (horizontal) air velocity vector, with no vertical component. (Vertical motion in the atmosphere has the standard name upward_air_velocity.) A gust is a sudden brief period of high wind speed. In an observed time series of wind speed, the gust wind speed can be indicated by a cell_methods of maximum for the time-interval. In an atmospheric model which has a parametrised calculation of gustiness, the gust wind speed may be separately diagnosed from the wind speed."
zenith_angle,degree,,,Zenith angle is the angle to the local vertical; a value of zero is directly overhead.
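
The columns in the output above (`canonical_units`, `grib`, `amip`, `description`) correspond to child elements of each `entry` in the CF XML file. As a rough sketch of how such a file can be read with the Python standard library alone, the snippet below parses a heavily abbreviated document. The embedded XML is an assumption about the layout reduced to a single entry, and `read_table` is a hypothetical helper, not part of this package.

```python
import xml.etree.ElementTree as ET

# Abbreviated stand-in for the CF table; the element layout is assumed.
XML_TEXT = """
<standard_name_table>
  <version_number>79</version_number>
  <entry id="zenith_angle">
    <canonical_units>degree</canonical_units>
    <grib></grib>
    <amip></amip>
    <description>Zenith angle is the angle to the local vertical; a value of zero is directly overhead.</description>
  </entry>
</standard_name_table>
"""


def read_table(xml_text: str):
    """Return (version, table) where table maps entry id -> field dict."""
    root = ET.fromstring(xml_text.strip())
    table = {}
    for entry in root.findall("entry"):
        # Empty elements have text None; store them as empty strings.
        table[entry.get("id")] = {
            child.tag: (child.text or "").strip() for child in entry
        }
    return root.findtext("version_number"), table


version, table = read_table(XML_TEXT)
print(version)
print(table["zenith_angle"]["canonical_units"])
```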


In [11]:
cf_xml_filename.check_name('zenith_angle', strict=True)

True

In [12]:
cf_xml_filename['x_wind_gust'].canonical_units

In [13]:
try:
    cf_xml_filename.check_units('x_wind_gust', units='m/s')
except h5tbx.conventions.StandardizedNameError as e:
    print(e)

In [14]:
from h5rdmtoolbox.conventions.identifier import _units_power_fix

In [15]:
_units_power_fix('degree')

'degree'
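
`_units_power_fix` is a private helper and its behaviour is not described here beyond the call above. One plausible reason such a helper exists is that the CF table writes exponents without a caret, for example `m s-1` in the table output shown earlier. The sketch below illustrates that kind of normalization. The function `add_power_carets` and its regular expression are assumptions for illustration, not the package's implementation.

```python
import re


def add_power_carets(units: str) -> str:
    """Insert '^' before an integer exponent that directly follows a letter.

    Hypothetical helper: 'm s-1' becomes 'm s^-1'; strings without such
    exponents (e.g. 'degree') are returned unchanged.
    """
    return re.sub(r"(?<=[A-Za-z])(-?\d+)", r"^\1", units)


print(add_power_carets("m s-1"))   # m s^-1
print(add_power_carets("degree"))  # degree
```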

In [16]:
try:
    cf_xml_filename.check_units('zenith_angle', units='K')
except h5tbx.conventions.StandardizedNameError as e:
    print(e)
cf_xml_filename.check_units('zenith_angle', units='degree')

Unit of standard name "zenith_angle" not as expected: "K" != "deg"


True