# Semantic Interoperability

This is a story about Bob and Alice. Bob works hard to generate data: he collects information from sensor systems and publishes it. Alice is an analyst who consumes data to run her forecasting and business support systems. She collects information from several systems, but each provider organizes and structures its data differently. She does, however, know how the input to her analysis tools should be structured.

Bob produces data for many consumers, all using different systems and software. To preserve his knowledge about what data is provided, how it was acquired, which analysis methods were used, and which unit system and representations were used for encoding, he documents all of this in a machine-readable way.

Alice has neither the time nor the knowledge to adapt Bob's data to her analytics tool. She realizes, however, that since machines can interpret Bob's data, she can document her requirements in a similar fashion and let the computers figure out how to bridge the two representations. Alice discovers that such systems actually exist and learns that they are called "frameworks for semantic interoperability". Eager to learn how this works, she asks Bob to give her a documented dataset.

Bob is busy, but explains that there is a set of example CSV files on the internet she can play with, and insists that the CSV files contain enough information for her to get started.
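
The idea Alice has in mind can be illustrated with a toy sketch. It is not one of the actual frameworks for semantic interoperability. It only shows the principle: both sides describe a field by a shared concept and a unit, and a generic function bridges the two descriptions. The class `FieldDescription`, the function `bridge`, the conversion table and the field names are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class FieldDescription:
    """Machine-readable description of one field (hypothetical structure)."""
    name: str     # local name used by the producer or the consumer
    concept: str  # shared vocabulary term both sides agree on
    unit: str     # unit the values are expressed in


# Hypothetical conversion table keyed by (source unit, target unit).
CONVERSIONS: Dict[Tuple[str, str], Callable[[float], float]] = {
    ("K", "degC"): lambda v: v - 273.15,
}


def bridge(source: FieldDescription, target: FieldDescription,
           values: List[float]) -> List[float]:
    """Convert values from the producer's representation to the consumer's."""
    if source.concept != target.concept:
        raise ValueError(f"{source.name} and {target.name} describe different concepts")
    if source.unit == target.unit:
        return list(values)
    convert = CONVERSIONS[(source.unit, target.unit)]
    return [convert(v) for v in values]


# Bob documents what he publishes; Alice documents what her tool expects.
bob = FieldDescription(name="Tair", concept="air_temperature", unit="K")
alice = FieldDescription(name="temperature_c", concept="air_temperature", unit="degC")

converted = bridge(bob, alice, [273.15, 283.15])
print(converted)
```

Neither Bob nor Alice has to know the other's naming or units. Each only maintains a description of their own side, and the bridging logic is written once.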

In [None]:
# https://cds.climate.copernicus.eu/

This dataset provides bias-corrected reconstruction of near-surface meteorological variables derived from the fifth generation of the European Centre for Medium-Range Weather Forecasts (ECMWF) atmospheric reanalyses (ERA5). It is intended to be used as a meteorological forcing dataset for land surface and hydrological models.

The dataset has been obtained using the same methodology used to derive the widely used water, energy and climate change (WATCH) forcing data, and is thus also referred to as WATCH Forcing Data methodology applied to ERA5 (WFDE5). The data are derived from the ERA5 reanalysis product that have been re-gridded to a half-degree resolution. Data have been adjusted using an elevation correction and monthly-scale bias corrections based on Climatic Research Unit (CRU) data (for temperature, diurnal temperature range, cloud-cover, wet days number and precipitation fields) and Global Precipitation Climatology Centre (GPCC) data (for precipitation fields only). Additional corrections are included for varying atmospheric aerosol-loading and separate precipitation gauge observations. For full details please refer to the product user-guide.

This dataset was produced on behalf of Copernicus Climate Change Service (C3S) and was generated entirely within the Climate Data Store (CDS) Toolbox. The toolbox source code is provided in the documentation tab.

Name | Units | Description
---|---|---
Grid-point altitude | m | The altitude of each grid-point. Values correspond to altitudes of CRU grid-points.
Near-surface air temperature | K | The temperature of air at 2 metres above the surface of land, sea or inland waters. Values are derived from ERA5 2m air temperature with an elevation correction and bias correction using CRU mean monthly temperature and mean diurnal temperature range.
Near-surface specific humidity | kg kg-1 | The amount of moisture in the air divided by amount of air plus moisture at that location. Values are derived from ERA5 vapor pressure and saturation vapor pressure with an elevation correction.
Near-surface wind speed | m s-1 | The horizontal speed of the wind, or movement of air, at a height of 10 metres above the surface of the Earth. Values are derived from ERA5 near-surface wind speed.
Rainfall flux | kg m-2 s-1 | The rate of rain that falls to the Earth's surface. Values are derived from ERA5 total precipitation and snowfall and are bias corrected primarily using precipitation data from CRU and GPCC.
Snowfall flux | kg m-2 s-1 | The rate of snow that falls to the Earth's surface. Values are derived from ERA5 total precipitation and snowfall and are bias corrected primarily using precipitation data from CRU and GPCC.
Surface air pressure | Pa | The pressure (force per unit area) of the atmosphere at the surface of land, sea and inland water. Values are derived from ERA5 surface air pressure with an elevation correction.
Surface downwelling longwave radiation | W m-2 | The amount of thermal (also known as longwave or terrestrial) radiation emitted by the atmosphere and clouds that reaches a horizontal plane at the surface of the Earth. Values are derived from ERA5 surface downwelling longwave radiation with an elevation correction.
Surface downwelling shortwave radiation | W m-2 | The amount of solar radiation (also known as shortwave radiation) that reaches a horizontal plane at the surface of the Earth. This parameter comprises both direct and diffuse solar radiation. Values are derived from ERA5 surface downwelling shortwave radiation and bias corrected using CRU cloud cover and effects of inter-annual changes in atmospheric aerosol loading.
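
The units in the table matter as soon as a consumer expects something else. As a small sketch, assuming a consumer that wants precipitation as water-equivalent depth per hour, a flux in kg m-2 s-1 can be converted to mm h-1. The conversion relies on the density of liquid water (1000 kg m-3), so 1 kg of water spread over 1 m² forms a layer 1 mm deep. The function name is hypothetical, and the result is a water-equivalent depth, not a snow depth.

```python
SECONDS_PER_HOUR = 3600.0


def flux_to_mm_per_hour(flux_kg_m2_s: float) -> float:
    """Convert a mass flux in kg m-2 s-1 to water-equivalent mm per hour.

    Assumes liquid-water density of 1000 kg m-3, so 1 kg m-2 equals 1 mm.
    """
    return flux_kg_m2_s * SECONDS_PER_HOUR


# A flux of 1/3600 kg m-2 s-1 sustained for one hour deposits 1 kg m-2.
example_flux = 1.0 / 3600.0
print(flux_to_mm_per_hour(example_flux))
```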

In [None]:
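from pydantic import BaseModel
from typing import List

class Snowfall(BaseModel):
    """Schema mirroring the variables of the Snowf NetCDF file."""
    lat: List[float]                # degrees_north
    lon: List[float]                # degrees_east
    time: List[int]                 # hours since 1900-01-01 (int64 in the file)
    Snowf: List[List[List[float]]]  # kg m-2 s-1, indexed as (time, lat, lon)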

In [1]:
import netCDF4
import numpy as np

# Open the snowfall file (read-only by default) and print its global attributes
f = netCDF4.Dataset('datasets/Snowf_WFDE5_CRU+GPCC_201901_v2.0.nc')
print(f)

<class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF4 data model, file format HDF5):
    title: WATCH Forcing Data methodology applied to ERA5 data
    institution: Copernicus Climate Change Service
    contact: http://copernicus-support.ecmwf.int
    comment: Methodology implementation for ERA5 and dataset production by B-Open Solutions for the Copernicus Climate Change Service in the context of contract C3S_25c
    Conventions: CF-1.7
    summary: ERA5 data regridded to half degree regular lat-lon; Genuine land points from CRU grid and ERA5 land-sea mask only; Snowf bias-corrected using CRU TS4.04 wet days & GPCCv2020 precip totals, catch correction, and precip phase correction according to elevation and bias-corrected Tair
    reference: Cucchi et al., 2020, Earth Syst. Sci. Data, 12(3), 2097–2120, doi:10.5194/essd-12-2097-2020; Weedon et al., 2014, Water Resources Res., 50, 7505-7514, doi:10.1002/2014WR015638; Harris et al., 2020, Scientific Data, 7(1), doi:10.1038/s41597-020-0453

In [10]:
# Print every variable together with its attributes, dimensions and shape
for key, variable in f.variables.items():
    print(key, variable, '\n')

lat <class 'netCDF4._netCDF4.Variable'>
float64 lat(lat)
    _FillValue: nan
    long_name: Latitude
    units: degrees_north
    standard_name: latitude
    axis: Y
unlimited dimensions: 
current shape = (360,)
filling on 

lon <class 'netCDF4._netCDF4.Variable'>
float64 lon(lon)
    _FillValue: nan
    long_name: Longitude
    units: degrees_east
    standard_name: longitude
    axis: X
    type: double
    valid_max: 360.0
    valid_min: -180.0
unlimited dimensions: 
current shape = (720,)
filling on 

time <class 'netCDF4._netCDF4.Variable'>
int64 time(time)
    standard_name: time
    long_name: Time
    axis: T
    units: hours since 1900-01-01
    calendar: proleptic_gregorian
unlimited dimensions: 
current shape = (744,)
filling on, default _FillValue of -9223372036854775806 used 

Snowf <class 'netCDF4._netCDF4.Variable'>
float32 Snowf(time, lat, lon)
    _FillValue: 1e+20
    units: kg m-2 s-1
    long_name: Snowfall Flux
    standard_name: snowfall_flux
unlimited dimensions:
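
The `time` variable above stores integers with the unit `hours since 1900-01-01` in a `proleptic_gregorian` calendar, which is also the calendar used by Python's `datetime`. A minimal sketch of decoding such values with the standard library follows. The sample values are an assumption: they are generated for January 2019 based on the file name, not read from the file. The helper names are hypothetical. The `netCDF4` package also provides `num2date` for this purpose.

```python
from datetime import datetime, timedelta

# Reference point taken from the `units` attribute of the time variable
EPOCH = datetime(1900, 1, 1)


def decode_hours(hours):
    """Turn 'hours since 1900-01-01' integers into datetime objects."""
    return [EPOCH + timedelta(hours=int(h)) for h in hours]


def encode_hours(moment):
    """Inverse of decode_hours for a single datetime."""
    return int((moment - EPOCH).total_seconds() // 3600)


# Assumed sample: 744 hourly steps starting at 2019-01-01 00:00
start = encode_hours(datetime(2019, 1, 1))
timestamps = decode_hours(range(start, start + 744))
print(timestamps[0], timestamps[-1])
```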

In [7]:
for dim in f.dimensions.items():
    print(dim)

('lat', <class 'netCDF4._netCDF4.Dimension'>: name = 'lat', size = 360)
('lon', <class 'netCDF4._netCDF4.Dimension'>: name = 'lon', size = 720)
('time', <class 'netCDF4._netCDF4.Dimension'>: name = 'time', size = 744)
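
The dimension sizes already say a lot about the dataset. As a quick sanity check, assuming the regular global lat-lon grid mentioned in the file summary, the sketch below derives the grid spacing, compares the number of time steps with the hours in a 31-day month, and estimates how much memory the full `Snowf` array (stored as `float32`) needs when read in one go. The sizes are copied from the output above rather than read from the file.

```python
import math

# Dimension sizes as reported by f.dimensions
dims = {"time": 744, "lat": 360, "lon": 720}
BYTES_PER_FLOAT32 = 4

# Grid spacing in degrees for a regular global grid
lat_step = 180 / dims["lat"]
lon_step = 360 / dims["lon"]

# Hourly steps expected for a 31-day month
hours_in_31_days = 31 * 24

# Memory needed for the raw values of Snowf(time, lat, lon), excluding the mask
size_bytes = math.prod(dims.values()) * BYTES_PER_FLOAT32

print(f"grid spacing: {lat_step} x {lon_step} degrees")
print(f"time steps match a 31-day month: {dims['time'] == hours_in_31_days}")
print(f"Snowf size: {size_bytes / 1e6:.0f} MB")
```

At roughly 0.77 GB for the values alone, reading the whole variable at once is feasible on most machines, but slicing a single time step or region is cheaper when only part of the data is needed.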


In [12]:
# Read the full Snowf array into memory; missing points come back masked
snow = f.variables['Snowf']
snow[:]

masked_array(
  data=[[[0.0, 0.0, 0.0, ..., 0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, ..., 0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, ..., 0.0, 0.0, 0.0],
         ...,
         [--, --, --, ..., --, --, --],
         [--, --, --, ..., --, --, --],
         [--, --, --, ..., --, --, --]],

        [[0.0, 0.0, 0.0, ..., 0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, ..., 0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, ..., 0.0, 0.0, 0.0],
         ...,
         [--, --, --, ..., --, --, --],
         [--, --, --, ..., --, --, --],
         [--, --, --, ..., --, --, --]],

        [[0.0, 0.0, 0.0, ..., 0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, ..., 0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, ..., 0.0, 0.0, 0.0],
         ...,
         [--, --, --, ..., --, --, --],
         [--, --, --, ..., --, --, --],
         [--, --, --, ..., --, --, --]],

        ...,

        [[0.0, 0.0, 0.0, ..., 0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, ..., 0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, ..., 0.0, 0.0, 0.0],
         ...