# Sonnenstund Prototype implementation

This prototype shows how to use the [GeoSphere Austria Data Hub](https://data.hub.geosphere.at/) to query the number of hours the sun has shone at every weather station in Austria.
We use the API for [daily data](https://data.hub.geosphere.at/dataset/klima-v2-1d).

In [179]:
# import libraries
import requests
import pandas as pd
import geopandas as gpd

# some constants
HOST = "https://dataset.api.hub.geosphere.at"
VERSION = "v1"
TYPE = "station"
RESOURCE = "klima-v2-1d"

## Station Data

We use the data from weather stations all around Austria.
First, we look at the metadata provided by the daily station data API.

### Station Data Metadata

The (daily) station data metadata is exposed at the `/metadata` endpoint.
We inspect the response to see which parameters we need in our query and how to request them from the actual data endpoint.

In [180]:
# historical station metadata endpoint
r = requests.get(f"{HOST}/{VERSION}/{TYPE}/historical/{RESOURCE}/metadata").json()

In [181]:
# keys of the response
r.keys()

dict_keys(['stations', 'parameters', 'title', 'frequency', 'type', 'mode', 'response_formats', 'start_time', 'end_time', 'id_type', 'code_lists'])

In [182]:
# stations metadata
stations = pd.DataFrame(r["stations"])
stations.head()

Unnamed: 0,type,id,group_id,name,state,lat,lon,altitude,valid_from,valid_to,has_sunshine,has_global_radiation,is_active
0,COMBINED,1,,Aflenz,Steiermark,47.54594,15.24069,783.2,1983-05-01T00:00:00+00:00,2100-12-31T00:00:00+00:00,True,True,True
1,COMBINED,2,,Aigen im Ennstal,Steiermark,47.53278,14.13826,641.0,1939-03-01T00:00:00+00:00,2100-12-31T00:00:00+00:00,True,True,True
2,COMBINED,3,,Allentsteig,Niederösterreich,48.69083,15.36694,598.8,1983-10-01T00:00:00+00:00,2100-12-31T00:00:00+00:00,True,True,True
3,COMBINED,4,,Amstetten,Niederösterreich,48.10889,14.895,266.0,1936-01-01T00:00:00+00:00,2100-12-31T00:00:00+00:00,True,True,True
4,COMBINED,5,,Bad Aussee,Steiermark,47.6105,13.75844,743.1,1983-09-01T00:00:00+00:00,2100-12-31T00:00:00+00:00,True,False,True


In [183]:
# number of active stations
len(stations[stations["is_active"]])

496

In [184]:
r["response_formats"]

['geojson', 'csv']

In [185]:
parameters_df = pd.DataFrame(r["parameters"])
# there is no flag marking so_h, but its description contains "Sonnenscheindauer" (sunshine duration)
parameters_df[
    parameters_df["description"].str.contains("sonne", case=False, na=False)
]  # the data we want is so_h - Sonnenscheindauer

Unnamed: 0,name,long_name,description,unit,code_list_ref
102,so_h,Sonnenscheindauer,"Sonnenscheindauer, Summe aus den Stundenwerten...",h,
103,so_h_flag,Qualitätsflag für Sonnenscheindauer,"Qualitätsflag für Sonnenscheindauer, Summe aus...",code,q21


In [186]:
parameters_df[
    (parameters_df["name"] == "so_h") | (parameters_df["name"] == "so_h_flag")
].description.values

<StringArray>
['Sonnenscheindauer, Summe aus den Stundenwerten 0-24 Uhr MOZ (23 Vortag - 23 Tag UTC)', 'Qualitätsflag für Sonnenscheindauer, Summe aus den Stundenwerten 0-24 Uhr MOZ (23 Vortag - 23 Tag UTC)']
Length: 2, dtype: str
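The description translates roughly to "sunshine duration, sum of the hourly values 0-24 MOZ (23 previous day to 23 current day UTC)". A minimal sketch of that interval, assuming the literal reading that a daily value covers 23:00 UTC on the previous day up to 23:00 UTC on the day itself; the helper name `aggregation_window_utc` is hypothetical and not part of the API:

```python
from datetime import date, datetime, time, timedelta, timezone
from typing import Tuple


def aggregation_window_utc(day: date) -> Tuple[datetime, datetime]:
    """Return the UTC interval a daily so_h value for `day` is assumed to cover."""
    # "23 Vortag - 23 Tag UTC": 23:00 UTC on the previous day to 23:00 UTC on the day.
    end = datetime.combine(day, time(23, 0), tzinfo=timezone.utc)
    start = end - timedelta(days=1)
    return start, end


start, end = aggregation_window_utc(date(2026, 2, 2))
print(start.isoformat(), "->", end.isoformat())
```

This matters mostly when daily values are compared with sub-daily data, where the one-hour shift against UTC calendar days could otherwise go unnoticed.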

In [187]:
# quality flag code list
r["code_lists"]["q21"]

[{'key': None, 'value': 'undefiniert'},
 {'key': 0, 'value': 'ungeprüfte Daten'},
 {'key': 10, 'value': 'automatisch geprüft'},
 {'key': 11, 'value': 'automatisch geprüft (verändert)'},
 {'key': 12, 'value': 'automatisch geprüft (original)'},
 {'key': 20, 'value': 'manuell geprüft (unbekannt)'},
 {'key': 21, 'value': 'manuell geprüft (verändert)'},
 {'key': 22, 'value': 'manuell geprüft (original)'}]
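To make the flag values easier to read next to the data, the code list can be turned into a lookup table. The sketch below copies the `q21` entries from the output above; the English glosses and the `is_checked` rule (any code of 10 or higher counts as checked) are my own assumptions, not something the API defines.

```python
# Entries copied from the q21 code list shown above.
Q21 = [
    {"key": None, "value": "undefiniert"},
    {"key": 0, "value": "ungeprüfte Daten"},
    {"key": 10, "value": "automatisch geprüft"},
    {"key": 11, "value": "automatisch geprüft (verändert)"},
    {"key": 12, "value": "automatisch geprüft (original)"},
    {"key": 20, "value": "manuell geprüft (unbekannt)"},
    {"key": 21, "value": "manuell geprüft (verändert)"},
    {"key": 22, "value": "manuell geprüft (original)"},
]

# English glosses are translations added for this sketch, not API output.
Q21_ENGLISH = {
    None: "undefined",
    0: "unchecked data",
    10: "automatically checked",
    11: "automatically checked (modified)",
    12: "automatically checked (original)",
    20: "manually checked (unknown)",
    21: "manually checked (modified)",
    22: "manually checked (original)",
}


def build_flag_lookup(code_list):
    """Map each flag code to its label."""
    return {entry["key"]: entry["value"] for entry in code_list}


def is_checked(flag):
    """Hypothetical rule: codes of 10 and above went through some check."""
    return flag is not None and flag >= 10


lookup = build_flag_lookup(Q21)
flags = [10, 0, None, 22]
for flag in flags:
    print(flag, lookup[flag], Q21_ENGLISH[flag], is_checked(flag))
```

With the live metadata, `build_flag_lookup(r["code_lists"]["q21"])` should give the same mapping.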

### Example API call to get sunshine hours for a day

We show how to get the daily sunshine hours for a subset of the stations over a short date range.
We only query the data for active stations in Vienna.

In [210]:
# station ids in Vienna
stations[(stations["state"] == "Wien") & stations["is_active"]]

Unnamed: 0,type,id,group_id,name,state,lat,lon,altitude,valid_from,valid_to,has_sunshine,has_global_radiation,is_active
102,COMBINED,105,,Wien Hohe Warte,Wien,48.24861,16.35639,198.0,1775-01-01T00:00:00+00:00,2100-12-31T00:00:00+00:00,True,True,True
103,COMBINED,106,,Wien Mariabrunn,Wien,48.20694,16.22944,225.0,1936-01-01T00:00:00+00:00,2100-12-31T00:00:00+00:00,True,True,True
104,COMBINED,107,,Wien Unterlaa,Wien,48.125,16.41944,200.0,1963-10-01T00:00:00+00:00,2100-12-31T00:00:00+00:00,True,True,True
334,INDIVIDUAL,4115,,Wien Stammersdorf,Wien,48.30581,16.40556,190.7,2008-12-09T00:00:00+00:00,2100-12-31T00:00:00+00:00,True,True,True
408,INDIVIDUAL,5802,,Wien Jubiläumswarte,Wien,48.22111,16.26528,450.0,2009-09-01T00:00:00+00:00,2100-12-31T00:00:00+00:00,True,True,True
410,INDIVIDUAL,5805,106.0,Wien Mariabrunn,Wien,48.20694,16.22944,225.0,1997-01-01T00:00:00+00:00,2100-12-31T00:00:00+00:00,True,True,True
422,INDIVIDUAL,5904,105.0,Wien Hohe Warte,Wien,48.24861,16.35639,198.0,1934-07-01T00:00:00+00:00,2100-12-31T00:00:00+00:00,True,True,True
425,INDIVIDUAL,5917,107.0,Wien Unterlaa,Wien,48.125,16.41944,200.0,1996-01-01T00:00:00+00:00,2100-12-31T00:00:00+00:00,True,True,True
426,INDIVIDUAL,5925,,Wien Innere Stadt,Wien,48.19833,16.36694,177.0,1985-01-01T00:00:00+00:00,2100-12-31T00:00:00+00:00,True,True,True
427,INDIVIDUAL,5935,,Wien Donaufeld,Wien,48.25722,16.43139,160.0,1996-07-01T00:00:00+00:00,2100-12-31T00:00:00+00:00,True,True,True
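In the table above, three `INDIVIDUAL` stations (5805, 5904, 5917) carry a `group_id` that points at a `COMBINED` station with the same name and coordinates. If that reading is right, querying all ten ids returns some locations twice. The sketch below keeps the `COMBINED` entry and drops the grouped `INDIVIDUAL` ones; this is an assumption about what `group_id` means, and the helper `drop_grouped_individuals` is hypothetical. It uses plain dicts with `None` where pandas shows an empty `group_id`.

```python
# Subset of the columns from the Vienna table above.
vienna = [
    {"type": "COMBINED", "id": 105, "group_id": None, "name": "Wien Hohe Warte"},
    {"type": "COMBINED", "id": 106, "group_id": None, "name": "Wien Mariabrunn"},
    {"type": "COMBINED", "id": 107, "group_id": None, "name": "Wien Unterlaa"},
    {"type": "INDIVIDUAL", "id": 4115, "group_id": None, "name": "Wien Stammersdorf"},
    {"type": "INDIVIDUAL", "id": 5802, "group_id": None, "name": "Wien Jubiläumswarte"},
    {"type": "INDIVIDUAL", "id": 5805, "group_id": 106, "name": "Wien Mariabrunn"},
    {"type": "INDIVIDUAL", "id": 5904, "group_id": 105, "name": "Wien Hohe Warte"},
    {"type": "INDIVIDUAL", "id": 5917, "group_id": 107, "name": "Wien Unterlaa"},
    {"type": "INDIVIDUAL", "id": 5925, "group_id": None, "name": "Wien Innere Stadt"},
    {"type": "INDIVIDUAL", "id": 5935, "group_id": None, "name": "Wien Donaufeld"},
]


def drop_grouped_individuals(rows):
    """Drop INDIVIDUAL rows whose group_id refers to a COMBINED row in `rows`."""
    combined_ids = {row["id"] for row in rows if row["type"] == "COMBINED"}
    return [
        row
        for row in rows
        if not (row["type"] == "INDIVIDUAL" and row["group_id"] in combined_ids)
    ]


deduplicated = drop_grouped_individuals(vienna)
print([row["id"] for row in deduplicated])
```

Whether the `COMBINED` or the `INDIVIDUAL` series is the better choice depends on the analysis, so the opposite filter may be just as valid.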


In [211]:
r = requests.get(
    f"{HOST}/{VERSION}/{TYPE}/historical/{RESOURCE}",
    params={
        "station_ids": ",".join(
            stations[(stations["state"] == "Wien") & stations["is_active"]]["id"]
            .astype(str)
            .values
        ),
        "parameters": "so_h,so_h_flag",
        "start": "2026-02-02",
        "end": "2026-02-09",
    },
)

In [212]:
station_ids = ",".join(
    stations[(stations["state"] == "Wien") & stations["is_active"]]["id"]
    .astype(str)
    .values
)
query_url = f"{HOST}/{VERSION}/{TYPE}/historical/{RESOURCE}?station_ids={station_ids}&parameters=so_h,so_h_flag&start=2026-02-02&end=2026-02-09"
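The hand-written query string works because none of the values needs escaping. As a slightly more defensive alternative, the same URL can be assembled with the standard library's `urllib.parse.urlencode`. This is only a sketch: the helper `build_query_url` is hypothetical, the station ids are copied from the Vienna table, and `safe=","` keeps the commas unescaped so the result matches the URL built by hand.

```python
from urllib.parse import urlencode

HOST = "https://dataset.api.hub.geosphere.at"
VERSION = "v1"
TYPE = "station"
RESOURCE = "klima-v2-1d"


def build_query_url(station_ids, parameters, start, end):
    """Build the historical data URL with an escaped query string."""
    query = urlencode(
        {
            "station_ids": ",".join(str(station_id) for station_id in station_ids),
            "parameters": ",".join(parameters),
            "start": start,
            "end": end,
        },
        safe=",",  # leave the list separators readable
    )
    return f"{HOST}/{VERSION}/{TYPE}/historical/{RESOURCE}?{query}"


vienna_ids = [105, 106, 107, 4115, 5802, 5805, 5904, 5917, 5925, 5935]
url = build_query_url(vienna_ids, ["so_h", "so_h_flag"], "2026-02-02", "2026-02-09")
print(url)
```

Passing a `params` dict to `requests.get`, as in the previous cell, achieves the same escaping; a prebuilt URL is mainly useful for tools that only accept a URL string.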

In [213]:
requests.get(query_url).json()

{'media_type': 'application/json',
 'type': 'FeatureCollection',
 'version': 'v1',
 'timestamps': ['2026-02-02T00:00+00:00',
  '2026-02-03T00:00+00:00',
  '2026-02-04T00:00+00:00',
  '2026-02-05T00:00+00:00',
  '2026-02-06T00:00+00:00',
  '2026-02-07T00:00+00:00',
  '2026-02-08T00:00+00:00',
  '2026-02-09T00:00+00:00'],
 'features': [{'type': 'Feature',
   'geometry': {'type': 'Point', 'coordinates': [48.24861, 16.35639]},
   'properties': {'parameters': {'so_h': {'name': 'Sonnenscheindauer',
      'unit': 'h',
      'data': [0.0, 0.0, 0.0, 0.0, 0.0, 0.3, 0.0, 0.0]},
     'so_h_flag': {'name': 'Qualitätsflag für Sonnenscheindauer',
      'unit': 'code',
      'data': [10, 10, 10, 10, 10, 10, 10, 10]}},
    'station': 105}},
  {'type': 'Feature',
   'geometry': {'type': 'Point', 'coordinates': [48.20694, 16.22944]},
   'properties': {'parameters': {'so_h': {'name': 'Sonnenscheindauer',
      'unit': 'h',
      'data': [0.0, 0.0, 0.0, 0.0, 0.0, 0.3, 0.1, 0.0]},
     'so_h_flag': {'name':

In [None]:
r.json()

{'media_type': 'application/json',
 'type': 'FeatureCollection',
 'version': 'v1',
 'timestamps': ['2026-02-02T00:00+00:00',
  '2026-02-03T00:00+00:00',
  '2026-02-04T00:00+00:00',
  '2026-02-05T00:00+00:00',
  '2026-02-06T00:00+00:00',
  '2026-02-07T00:00+00:00',
  '2026-02-08T00:00+00:00',
  '2026-02-09T00:00+00:00'],
 'features': [{'type': 'Feature',
   'geometry': {'type': 'Point', 'coordinates': [48.24861, 16.35639]},
   'properties': {'parameters': {'so_h': {'name': 'Sonnenscheindauer',
      'unit': 'h',
      'data': [0.0, 0.0, 0.0, 0.0, 0.0, 0.3, 0.0, 0.0]},
     'so_h_flag': {'name': 'Qualitätsflag für Sonnenscheindauer',
      'unit': 'code',
      'data': [10, 10, 10, 10, 10, 10, 10, 10]}},
    'station': 105}},
  {'type': 'Feature',
   'geometry': {'type': 'Point', 'coordinates': [48.20694, 16.22944]},
   'properties': {'parameters': {'so_h': {'name': 'Sonnenscheindauer',
      'unit': 'h',
      'data': [0.0, 0.0, 0.0, 0.0, 0.0, 0.3, 0.1, 0.0]},
     'so_h_flag': {'name':

In [None]:
gdf = gpd.GeoDataFrame.from_file(
    query_url, columns=["timestamp", "station_id", "parameters"]
)  # this might be nice, but not our desired shape

In [None]:
gdf.dtypes  # TODO: parse parameters column to extract so_h and so_h_flag as separate columns, and convert timestamp to datetime

parameters      object
geometry      geometry
dtype: object

## TODO

* Wrangle the response into a GeoDataFrame.
* A GeoDataFrame is an index plus data columns plus a geometry column, so the geometry column can be used to plot the data on maps.

see: [geopandas documentation](https://geopandas.org/en/stable/docs/user_guide/data_structures.html)
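One possible shape for the wrangling step is a long table with one row per station and timestamp. The sketch below works on the first feature of the response shown earlier (station 105) and uses only the standard library; the function name `features_to_rows` and the output column names are my own choices. It also assumes that the coordinates arrive as latitude, longitude, which is what the sample suggests when compared with the `lat`/`lon` columns of the station table.

```python
from datetime import datetime

# First feature of the response shown earlier (station 105).
sample_response = {
    "type": "FeatureCollection",
    "timestamps": [
        "2026-02-02T00:00+00:00",
        "2026-02-03T00:00+00:00",
        "2026-02-04T00:00+00:00",
        "2026-02-05T00:00+00:00",
        "2026-02-06T00:00+00:00",
        "2026-02-07T00:00+00:00",
        "2026-02-08T00:00+00:00",
        "2026-02-09T00:00+00:00",
    ],
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [48.24861, 16.35639]},
            "properties": {
                "parameters": {
                    "so_h": {
                        "name": "Sonnenscheindauer",
                        "unit": "h",
                        "data": [0.0, 0.0, 0.0, 0.0, 0.0, 0.3, 0.0, 0.0],
                    },
                    "so_h_flag": {
                        "name": "Qualitätsflag für Sonnenscheindauer",
                        "unit": "code",
                        "data": [10, 10, 10, 10, 10, 10, 10, 10],
                    },
                },
                "station": 105,
            },
        }
    ],
}


def features_to_rows(response):
    """Flatten the response into one dict per (station, timestamp)."""
    timestamps = [datetime.fromisoformat(ts) for ts in response["timestamps"]]
    rows = []
    for feature in response["features"]:
        props = feature["properties"]
        # Assumption: coordinates are ordered (lat, lon), as in the sample.
        lat, lon = feature["geometry"]["coordinates"]
        for i, timestamp in enumerate(timestamps):
            row = {
                "timestamp": timestamp,
                "station_id": props["station"],
                "lat": lat,
                "lon": lon,
            }
            # Each parameter holds one value per timestamp, in the same order.
            for name, parameter in props["parameters"].items():
                row[name] = parameter["data"][i]
            rows.append(row)
    return rows


rows = features_to_rows(sample_response)
print(rows[5])
```

From here, `pd.DataFrame(rows)` gives a regular frame, and `gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df["lon"], df["lat"]), crs="EPSG:4326")` would add the geometry column. Note that `points_from_xy` takes x (longitude) first, and that EPSG:4326 is an assumption about the coordinate reference system.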