# ZAMG Data Hub: Synoptic Data

Sources:
- **Weather stations, hourly SYNOP data:** https://data.hub.zamg.ac.at/dataset/synop-v1-1h
- **API client:** https://dataset.api.hub.zamg.ac.at/app/frontend/station/historical/synop-v1-1h?anonymous=true

See the parameter descriptions in the file *synop_params.tsv*, especially for precipitation amounts and wind.
Tmax is reported at 18:00 UTC and Tmin at 06:00 UTC.
More recent reports include both values at 06:00 and at 18:00 UTC.
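
The reporting hours above can be used to pick the daily extremes out of a table of reports. The following is a minimal sketch, assuming a DataFrame with `time`, `Tmax` and `Tmin` columns like the one built later in this notebook. The sample values are made up for illustration.

```python
import pandas as pd

nan = float("nan")

# Made-up sample: one day of reports at three synoptic hours (UTC)
sample = pd.DataFrame({
    "time": pd.to_datetime(
        ["2020-07-01 06:00", "2020-07-01 12:00", "2020-07-01 18:00"], utc=True),
    "Tmax": [nan, nan, 28.4],
    "Tmin": [14.1, nan, nan],
})

# Tmax is reported at 18 UTC, Tmin at 06 UTC
tmax = sample.loc[sample.time.dt.hour == 18, ["time", "Tmax"]].dropna()
tmin = sample.loc[sample.time.dt.hour == 6, ["time", "Tmin"]].dropna()
print(tmax)
print(tmin)
```

For newer reports that carry both values at both hours, filtering on the hour still avoids counting the same extreme twice.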

In [None]:
import pandas as pd, datetime as dt, numpy as np, requests as req
import matplotlib.pyplot as plt
import bz2
pd.set_option('display.max_rows', 128)

In [None]:
def generate_url(date_from, date_to, station_ids, parameters):
    """Build the CSV download URL for the given time range, station IDs and parameter names."""
    get_params = {"parameters": ",".join(parameters), "start": date_from,
                "end":date_to, "station_ids": ",".join([str(val) for val in station_ids]),
                "output_format":"csv"}
    return "https://dataset.api.hub.zamg.ac.at/v1/station/historical/synop-v1-1h?"+'&'.join([f"{key}={val}" for key, val in get_params.items()])
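
`generate_url` joins the query string by hand, which works as long as no value contains characters that need escaping. As a hedged alternative, the sketch below builds the same kind of URL with `urllib.parse.urlencode` from the standard library. The name `build_url` and the station ID in the usage line are hypothetical and only serve the illustration.

```python
from urllib.parse import urlencode

BASE_URL = "https://dataset.api.hub.zamg.ac.at/v1/station/historical/synop-v1-1h"


def build_url(date_from, date_to, station_ids, parameters):
    """Hypothetical variant of generate_url that percent-encodes the query."""
    query = {
        "parameters": ",".join(parameters),
        "start": date_from,
        "end": date_to,
        "station_ids": ",".join(str(val) for val in station_ids),
        "output_format": "csv",
    }
    # safe=",:" leaves commas and colons unescaped, matching the hand-built URL
    return f"{BASE_URL}?{urlencode(query, safe=',:')}"


# Placeholder station ID and a short parameter list
url = build_url("2020-01-01T00:00:00Z", "2020-01-31T23:00:00Z", [11035], ["T", "Tmax"])
print(url)
```

Any other special character in a value would be percent-encoded instead of ending up raw in the URL.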

Read the available stations and parameters from the metadata.

In [None]:
metadata = req.get("https://dataset.api.hub.zamg.ac.at/v1/station/historical/synop-v1-1h/metadata").json()
params = pd.DataFrame(metadata.get("parameters")).astype(
    {"name": "string", "long_name": "string", "desc": "string", "unit": "string"})
stations = pd.DataFrame(metadata.get("stations")).astype(
    {"type": "string", "id": int, "group_id": "Int64", "name": "string", "state": "string",
     "lat": float, "lon": float, "altitude": float, "valid_from": "datetime64[ns]", "valid_to": "datetime64[ns]",
     "has_sunshine": bool, "has_global_radiation": bool, "is_active": bool })
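
To try the table construction without network access, the sketch below uses a small made-up dictionary in place of the JSON response. It assumes only what the cell above relies on: a `parameters` list and a `stations` list of records. All station names, IDs, descriptions and units here are invented.

```python
import pandas as pd

# Made-up stand-in for the JSON returned by the metadata endpoint
mock_metadata = {
    "parameters": [
        {"name": "T", "long_name": "Air temperature", "desc": "Example description", "unit": "°C"},
        {"name": "ff", "long_name": "Wind speed", "desc": "Example description", "unit": "m/s"},
    ],
    "stations": [
        {"id": 1, "name": "EXAMPLE A", "state": "Wien",
         "valid_from": "1990-01-01T00:00:00", "is_active": True},
        {"id": 2, "name": "EXAMPLE B", "state": "Tirol",
         "valid_from": "2005-06-01T00:00:00", "is_active": False},
    ],
}

mock_params = pd.DataFrame(mock_metadata["parameters"]).astype("string")
mock_stations = pd.DataFrame(mock_metadata["stations"])
# Parse the validity timestamps explicitly instead of relying on astype
mock_stations["valid_from"] = pd.to_datetime(mock_stations["valid_from"])

# Select the stations that are still active
active = mock_stations.loc[mock_stations.is_active, ["id", "name"]]
print(active)
```

Parsing the date columns with `pd.to_datetime` is one possible alternative to passing a datetime dtype to `astype`.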

In [None]:
params.to_csv("synop_params.tsv", sep="\t", index=False)
stations.sort_values(["state", "id"]).to_csv("synop_stations.tsv", sep="\t", index=False)

In [None]:
stations.name.to_clipboard()

Query specific parameters for one or more stations.

In [None]:
station_ids = stations.loc[stations.name.isin(["GUMPOLDSKIRCHEN", "WIEN-INNERE STADT", "WIEN/HOHE WARTE", "RAX/SEILBAHN-BERGSTAT"]), "id"].sort_values()
parameters = ["T", "Tmax", "Tmin", "Td", "rel", "dd", "ff", "Pg", "Pp", "RR3", "RRR", "tr", "tr3", "sonne"]
dtypes = {"station": int}
dtypes.update({val: float for val in parameters})
frames = []
for station_id in station_ids:
    # Download one year at a time to keep each request small
    for year in range(2000, 2021):
        print(f"Loading year {year} for station {station_id}...")
        download_url = generate_url(f"{year}-01-01T00:00:00Z", f"{year}-12-31T23:50:00Z", [station_id], parameters)
        # Drop rows without a temperature reading
        df = pd.read_csv(download_url, sep=",", dtype=dtypes, parse_dates=["time"]).dropna(subset=["T"])
        frames.append(df)
# Concatenate once at the end and sort chronologically within each station
measurements = pd.concat(frames, ignore_index=True).sort_values(["station", "time"])

In [None]:
measurements[measurements.time.dt.hour == 18].set_index("time")["Tmax"].plot.line();
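
The plot above draws the 18:00 UTC Tmax values of all stations as a single line. One possible way to get a separate line per station is to pivot the table first. The sketch below assumes the `time`, `station` and `Tmax` columns used above and runs on made-up sample values.

```python
import pandas as pd

# Made-up sample: two stations, two evening reports each
sample = pd.DataFrame({
    "time": pd.to_datetime(["2020-07-01 18:00", "2020-07-02 18:00"] * 2, utc=True),
    "station": [1, 1, 2, 2],
    "Tmax": [28.4, 30.1, 17.9, 19.2],
})

# Keep the 18 UTC reports, then spread the stations into columns
evening = sample[sample.time.dt.hour == 18]
tmax_by_station = evening.pivot(index="time", columns="station", values="Tmax")
print(tmax_by_station)
# tmax_by_station.plot.line() would then draw one line per station
```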