
# RADOLAN download & upload

This notebook downloads RADOLAN data for all of Germany for a given time period from the [DWD open data server](https://opendata.dwd.de/).  
The data is then converted to a CF-compliant netCDF file, which is saved to the `data/` folder.

In the second part of this notebook, the metadata for the RADOLAN data is created in metacatalog, and the netCDF file is linked and uploaded.


## 1.) RADOLAN download

In this part, recent RADOLAN data is downloaded from the DWD open data server using the package `wetterdienst`.  
Each data file is decoded with the package `wradlib`, and the results are combined into an `xarray` dataset.  
After metadata is added, the CF-compliant dataset is saved as a netCDF file to the `data/` folder.

In [1]:
# load dependencies
from wetterdienst.provider.dwd.radar import DwdRadarValues, DwdRadarParameter, DwdRadarResolution
import wradlib as wrl
import xarray as xr
from cftime import date2num
import numpy as np
from datetime import datetime as dt
from datetime import timedelta as td
from metacatalog import api, ext


In [2]:
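# function to add RADOLAN metadata which makes the netCDF file CF-compliant (CF-1.10)

def add_metadata(data: xr.Dataset, attrs: dict, time_units: str) -> xr.Dataset:
    """
    Add metadata to an xr.Dataset to make it CF-compliant.

    `attrs` is the attribute dictionary returned by wradlib for one RADOLAN
    composite; `time_units` is the CF units string of the time coordinate.
    Note: compliance can be checked with dataset_compliance() (cf-python).
    """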
    # global attributes (metadata source: https://opendata.dwd.de/climate_environment/CDC/grids_germany/daily/radolan/recent/bin/DESCRIPTION_gridsgermany-daily-radolan-recent-bin_en.pdf)
    data.attrs['Conventions'] = 'CF-1.10'
    data.attrs['title'] = 'DWD Climate Data Center (CDC): Recent hourly sliding RADOLAN grid of daily precipitation'
    data.attrs['institution'] = 'Deutscher Wetterdienst (DWD) - DWD Climate Data Center (CDC)'
    data.attrs['source'] = f"RADOLAN version: {attrs['radolanversion']}. The routine procedure RADOLAN (Radar-Online-Aneichung) \n provides area-wide, spatially and temporally highly resolved quantitative precipitation data in \n real-time operation for Germany from the combination of the hourly values measured at the precipitation \n stations with the precipitation recording of the 17 weather radars."
    data.attrs['history'] = f"{dt.utcnow()}: data download and file creation with xarray / python"
    data.attrs['references'] = "Bartels, H. et al., 2004: Zusammenfassender Abschlussbericht zum Projekt RADOLAN \n Winterrath T. et al., 2012: On the DWD quantitative precipitation analysis and nowcasting system for real-time application in German flood risk management. Weather Radar and Hydrology, IAHS Publ. 351"
    data.attrs['comment'] = "**DATA ORIGIN:** Weather Radars can only measure the reflected signals from the hydrometeors in the atmosphere and not the precipitation directly. For \n the radarbased quantitative precipitation estimation the radar data are adjusted with the measurements of the conventional precipitation \n stations. The adjusted radar data is a combination of the two sources of radar and surface stations and therefore these data are using \n the advantages of both data sets. \n **VALIDATION AND UNCERTAINTY ESTIMATE:** Verification of the data from 2013 till 2016 against the daily measurements of the precipitation stations shows a mean median of the \n absolute daily deviations of 0.761 mm/day. This is quite better than the corresponding value of 2.390 mm/day for the non adjusted radar data. \n **ADDITIONAL INFORMATION:** The data are not measured values, but represent a best estimate of precipitation due to the indirect method of radar measurement, \n which is calibrated (quantified) on the measured values of conventional stations (average station distance: approx. 20 km)."

    data.attrs['id'] = "urn:x-wmo:md:de.dwd.cdc::gridsgermany-daily-radolan-recent-bin"

    # variable attributes
    data.radolan.attrs['standard_name'] = 'precipitation_amount'
    data.radolan.attrs['long_name'] = 'Recent hourly sliding RADOLAN grid of daily precipitation'
    data.radolan.attrs['units'] = 'mm'
    data.radolan.attrs['missing_value'] = attrs['nodataflag'] / 10

    # coordinate variables
    data.x_dist.attrs['axis'] = 'X'
    data.x_dist.attrs['long_name'] = "RADOLAN Grid x coordinate of projection"
    data.x_dist.attrs['standard_name'] = "projection_x_coordinate"
    data.x_dist.attrs['units'] = "km"

    data.y_dist.attrs['axis'] = 'Y'
    data.y_dist.attrs['long_name'] = "RADOLAN Grid y coordinate of projection"
    data.y_dist.attrs['standard_name'] = "projection_y_coordinate"
    data.y_dist.attrs['units'] = "km"

    data.lon.attrs['standard_name'] = 'longitude'
    data.lon.attrs['long_name'] = 'longitude'
    data.lon.attrs['units'] = 'degrees_east'

    data.lat.attrs['standard_name'] = 'latitude'
    data.lat.attrs['long_name'] = 'latitude'
    data.lat.attrs['units'] = 'degrees_north'

    data.time.attrs['axis'] = 'T'
    data.time.attrs['standard_name'] = 'time'
    data.time.attrs['long_name'] = 'time'
    data.time.encoding['calendar'] = 'gregorian'
    data.time.attrs['units'] = time_units

    # cf convention checker: https://pumatest.nerc.ac.uk/cgi-bin/cf-checker.pl
    
    return data
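
The attributes set by `add_metadata` can be spot-checked before writing the file. The following is a minimal sketch using plain dictionaries only. The helper `missing_cf_attrs` and the `REQUIRED_ATTRS` table are hypothetical and not part of the notebook, and the table simply mirrors a subset of what `add_metadata` sets; it is not a substitute for a full CF checker.

```python
# Hypothetical helper (not part of the notebook): report which expected
# attributes are missing from a mapping of variable name -> attribute dict.
# REQUIRED_ATTRS is an assumed subset mirroring add_metadata, not the CF spec.
REQUIRED_ATTRS = {
    "radolan": ("standard_name", "long_name", "units"),
    "x_dist": ("standard_name", "units"),
    "y_dist": ("standard_name", "units"),
    "lon": ("standard_name", "units"),
    "lat": ("standard_name", "units"),
    "time": ("standard_name", "units"),
}


def missing_cf_attrs(var_attrs: dict) -> dict:
    """Return {variable: [missing attribute names]} for incomplete variables."""
    missing = {}
    for name, required in REQUIRED_ATTRS.items():
        present = var_attrs.get(name, {})
        absent = [key for key in required if key not in present]
        if absent:
            missing[name] = absent
    return missing


# Toy input; with an xarray Dataset `ds` it could be built as
# {name: dict(ds[name].attrs) for name in REQUIRED_ATTRS}.
example = {
    "radolan": {"standard_name": "precipitation_amount", "units": "mm"},
    "time": {"standard_name": "time", "units": "days since 2023-01-01"},
}
print(missing_cf_attrs(example))
```

An empty result means every listed attribute is present; anything else names the variables that still need attention.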

In [3]:
# function to download RADOLAN data, add metadata and save as netCDF to the given path (if given)


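def download_radolan_data(start_date, end_date, path=None, mask=False):
    """
    Download RADOLAN CDC data in daily resolution.

    If `path` is given, the dataset is written to a netCDF file at `path`;
    otherwise the dataset is returned. With `mask=True`, no-data cells
    (-9999) are masked before the unit conversion.
    """
    tstamps = []
    data_chunks = []
    attributes_list = []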

    radar = DwdRadarValues(
        parameter=DwdRadarParameter.RADOLAN_CDC,
        resolution=DwdRadarResolution.DAILY,
        start_date=start_date,
        end_date=end_date
    )

    for item in radar.query():
        # Decode data using wradlib.
        data, attributes = wrl.io.read_radolan_composite(item.data)

        # Get grid.
        radolan_grid_xy = wrl.georef.get_radolan_grid(
            attributes['nrow'], attributes['ncol'], wgs84=False)
        # Get coordinates (lat, lon).
        radolan_grid_latlon = wrl.georef.get_radolan_grid(
            attributes['nrow'], attributes['ncol'], wgs84=True)

        # Mask no-data values.
        if mask:
            data = np.ma.masked_equal(data, -9999)

        # unit conversion (1/10 mm -> mm)
        data = data / 10

        # append datetime
        tstamps.append(attributes['datetime'])

        # append data
        data_chunks.append(data)

        # append attributes
        attributes_list.append(attributes)

    # time cf format
    time_units = f"days since {tstamps[0]}"
    time_values = date2num(tstamps, time_units)

    # build xarray Dataset
    # wradlib returns grids of shape (rows, cols, 2) with x / lon at index 0
    # and y / lat at index 1; rows run along y, columns along x
    data = xr.Dataset(data_vars={'radolan': (['time', 'y', 'x'], data_chunks)},
                      coords={'time': time_values,
                              'x_dist': (['y', 'x'], radolan_grid_xy[:, :, 0]),
                              'y_dist': (['y', 'x'], radolan_grid_xy[:, :, 1]),
                              'lat': (['y', 'x'], radolan_grid_latlon[:, :, 1]),
                              'lon': (['y', 'x'], radolan_grid_latlon[:, :, 0])})

    # add cf metadata to netCDF
    data = add_metadata(data, attributes_list[0], time_units)

    # export / return
    if path:
        data.to_netcdf(path=path, mode='w', format='NETCDF4')
    else:
        return data
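
The time encoding used in `download_radolan_data` ("days since" the first timestamp) can be illustrated without `cftime`. The sketch below is a standard-library stand-in for what `date2num` does in the notebook, assuming naive datetimes on the standard calendar; the helper name `encode_days_since` and the sample timestamps are made up for illustration.

```python
from datetime import datetime, timedelta


def encode_days_since(tstamps: list) -> tuple:
    """Encode datetimes as fractional days since the first timestamp."""
    reference = tstamps[0]
    units = f"days since {reference}"
    one_day = timedelta(days=1)
    # timedelta / timedelta yields a float number of days
    values = [(t - reference) / one_day for t in tstamps]
    return units, values


# Made-up timestamps for illustration only.
stamps = [
    datetime(2023, 1, 1, 0, 50),
    datetime(2023, 1, 1, 1, 50),
    datetime(2023, 1, 2, 0, 50),
]
units, values = encode_days_since(stamps)
print(units)
print(values)
```

Using the first timestamp as the reference keeps the numbers small, but it also means the units string changes with every download period.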


**Give time period here**

In [4]:
start_date = dt.now() - td(days=2)
end_date = dt.now()
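
Calling `dt.now()` twice gives two slightly different reference instants. A minimal sketch of an alternative is to take the current time once and derive both ends from it. The helper `time_window` is hypothetical; it keeps naive datetimes to mirror the cell above, and whether that is the best input for `wetterdienst` is not verified here.

```python
from datetime import datetime, timedelta
from typing import Optional


def time_window(days: int, now: Optional[datetime] = None) -> tuple:
    """Return (start, end) covering the last `days` days up to `now`."""
    if now is None:
        now = datetime.now()
    return now - timedelta(days=days), now


# A fixed reference time makes the example reproducible.
fixed_now = datetime(2023, 1, 3, 12, 0)
start, end = time_window(2, now=fixed_now)
print(start.isoformat(), end.isoformat())
```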

In [5]:
# execute functions: download the data into memory
data = download_radolan_data(start_date=start_date, end_date=end_date)

# download again and save the result as a netCDF file
download_radolan_data(start_date=start_date, end_date=end_date, path='data/radolan.nc')



The in-memory `data` and the dataset obtained by reopening the saved netCDF file differ,  
e.g. in the `time` values and the `time` units.
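
A likely explanation, not verified against this particular file, is that `xarray` decodes CF time coordinates when opening a file (`decode_times=True` is the default of `xr.open_dataset`). The numeric offsets are then turned into datetimes and the units move out of the attributes; passing `decode_times=False` keeps the raw numbers. The standard-library sketch below only illustrates the decoding step. The helper `decode_days_since` and the sample values are hypothetical.

```python
from datetime import datetime, timedelta


def decode_days_since(values: list, units: str) -> list:
    """Turn numeric 'days since <reference>' offsets back into datetimes."""
    prefix = "days since "
    if not units.startswith(prefix):
        raise ValueError(f"unsupported units: {units!r}")
    reference = datetime.fromisoformat(units[len(prefix):])
    return [reference + timedelta(days=v) for v in values]


# What the in-memory dataset holds: plain numbers plus a units string.
in_memory = [0.0, 0.5, 1.0]
units = "days since 2023-01-01 00:50:00"

# What a decoding reader would present instead: datetimes.
decoded = decode_days_since(in_memory, units)
print(decoded)
```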


## 2.) Metacatalog metadata creation

Create a metadata Entry for RADOLAN data in metacatalog.

In [7]:
UPLOAD = False
CONNECTION = 'test'
#CONNECTION = 'default'

session = api.connect_database(CONNECTION)
print('Using: %s' % session.bind)

Using: Engine(postgresql://postgres:***@localhost:5432/test)


In [6]:
# check if the IO extension is activated
try:
    print(ext.extension('io'))
except AttributeError:
    ext.activate_extension('io', 'metacatalog.ext.io', 'IOExtension')
    from metacatalog.ext.io import IOExtension
    ext.extension('io', IOExtension)

<class 'metacatalog.ext.io.extension.IOExtension'>


#### Author

In [8]:
author = api.find_person(session, first_name=None, last_name=None,
                         organisation_name='Deutscher Wetterdienst', return_iterator=True).first()

if author is None and UPLOAD:
    author = api.add_organisation(session, organisation_name='Deutscher Wetterdienst',
                                  affiliation='DWD Climate Data Center (CDC)',
                                  organisation_abbrev='DWD'
                                  #attribution='Source: Deutscher Wetterdienst'
                                  )

print(author)


None



#### Location???



#### License

Candidates: `Open Data Commons` or, preferably, `GeoNutzV` (https://www.gesetze-im-internet.de/geonutzv/GeoNutzV.pdf, http://www.gesetze-im-internet.de/geonutzv/), provided the latter counts as a license. The cell below looks up `dl-by-de/2.0`.

In [12]:
license = api.find_license(session, short_title='dl-by-de/2.0')[0]

print(license)

Data licence Germany – attribution – version 2.0 <ID=10002>



#### Variable

`variable.column_names` only makes sense with timeseries data!

In [15]:
variable = api.find_variable(session, name='daily rainfall sum')[0]

print(variable.name, variable.column_names)

daily rainfall sum ['daily_rainfall_sum']



#### Unit


In [38]:
unit = api.find_unit(session, name="milimeter")[0]

print(unit.name)

milimeter



#### Create Entry


In [None]:
entry = api.find_entry(
    session, title="DWD Climate Data Center (CDC): Recent hourly sliding RADOLAN grid of daily precipitation", return_iterator=True).first()

if not entry and UPLOAD:
    entry = api.add_entry(session,
                          title="DWD Climate Data Center (CDC): Recent hourly sliding RADOLAN grid of daily precipitation",
                          abstract="The routine procedure RADOLAN (Radar-Online-Aneichung) provides area-wide, spatially and temporally highly resolved quantitative precipitation data in real-time operation for Germany from the combination of the hourly values measured at the precipitation stations with the precipitation recording of the 17 weather radars.",
                          location=None,
                          variable=variable.id,
                          comment="Unit conversion from 0.1 mm (DWD) to mm",
                          license=license,
                          author=author.id,
                          embargo=False,
                          is_partial=False
                          )

print(entry)


#### Details

from: https://opendata.dwd.de/climate_environment/CDC/grids_germany/daily/radolan/recent/bin/DESCRIPTION_gridsgermany-daily-radolan-recent-bin_en.pdf

In [None]:
# entry is None if it was not found and UPLOAD is False, so check it first
if UPLOAD and entry is not None and not entry.details:
    details_dict = [
        {
            "key": "Spatial coverage",
            "value": "Germany"
        },
        {
            "key": "Uncertainties",
            "value": "A first validation of the data shows that the mean absolute error is about 1.05 mm/day against the measurements of conventional precipitation stations; details in the contribution to the European Radar Conference 2010 in Sibiu."
        },
        {
            "key": "DATA ORIGIN",
            "value": "Weather Radars can only measure the reflected signals from the hydrometeors in the atmosphere and not the precipitation directly. For the radarbased quantitative precipitation estimation the radar data are adjusted with the measurements of the conventional precipitation stations. The adjusted radar data is a combination of the two sources of radar and surface stations and therefore these data are using the advantages of both data sets."
        },
        {
            "key": "VALIDATION AND UNCERTAINTY ESTIMATE",
            "value": "Verification of the data from 2013 till 2016 against the daily measurements of the precipitation stations shows a mean median of the absolute daily deviations of 0.761 mm/day. This is quite better than the corresponding value of 2.390 mm/day for the non adjusted radar data."
        },
        {
            "key": "ADDITIONAL INFORMATION",
            "value": "The data are not measured values, but represent a best estimate of precipitation due to the indirect method of radar measurement, which is calibrated (quantified) on the measured values of conventional stations (average station distance: approx. 20 km)."
        },
        {
            "key": "REFERENCES",
            "value": "Bartels, H. et al., 2004: Zusammenfassender Abschlussbericht zum Projekt RADOLAN; Winterrath T. et al., 2012: On the DWD quantitative precipitation analysis and nowcasting system for real-time application in German flood risk management. Weather Radar and Hydrology, IAHS Publ. 351"
        }
    ]
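
The list of `key` / `value` records above can also be generated from a plain dictionary, which keeps the cell shorter and rules out duplicate keys. This is a minimal sketch; the helper `to_detail_records` is hypothetical and only builds the list, it does not talk to metacatalog.

```python
def to_detail_records(details: dict) -> list:
    """Convert {key: value} into a list of {'key': ..., 'value': ...} dicts."""
    records = []
    for key, value in details.items():
        # Reject empty values early instead of storing blank details.
        if not str(value).strip():
            raise ValueError(f"empty value for detail {key!r}")
        records.append({"key": key, "value": value})
    return records


records = to_detail_records({
    "Spatial coverage": "Germany",
    "REFERENCES": "Bartels, H. et al., 2004",
})
print(records)
```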



#### Thesaurus?



#### Data upload
