# Will it snow at the ESE 2024 Kongress?  

Copyright 2024 by [Doulos](https://www.doulos.com)

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at:\
http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

## Part 1 - Getting the Data

Install the required Python packages (*wetterdienst* and *tzdata*) in your virtual environment to work with the DWD observation data. \
This step is not needed if the packages are already installed.

In [3]:
#uncomment the next line if needed
#!pip install wetterdienst tzdata
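Whether the install step is needed can also be checked programmatically. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical and not part of the original notebook.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the names from `names` that cannot be imported in this environment."""
    return [name for name in names if find_spec(name) is None]


# Hypothetical helper, not part of the original notebook.
required = ["wetterdienst", "tzdata"]
missing = missing_packages(required)
if missing:
    print("Install first:", " ".join(missing))
else:
    print("All required packages are available.")
```

If the list is empty, the `pip install` line above can stay commented out.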

### Setting up Request

In [4]:
import polars as pl
from wetterdienst import (Settings, Resolution, Period)
from wetterdienst.provider.dwd.observation import DwdObservationRequest

In [5]:
settings = Settings(
    ts_shape="long",    # use long format (for teaching purposes)
    ts_humanize=True,   # use human-readable names for features instead of the technical DWD codes
    ts_si_units=False   # prefer Celsius and millimeters over Kelvin and meters
)

Request historical data for the Stuttgart-Echterdingen weather station (ID 4931):
- parameters: temperature, humidity, precipitation form and precipitation height
- resolution: hourly
- period: 1 Jan 2004 to 31 Dec 2023 (20 years)

In [6]:
stuttgart_airport_dwdstation = 4931 # station Stuttgart Echterdingen (=airport)
request = DwdObservationRequest(
    parameter=[
        "TEMPERATURE_AIR_MEAN_2M",  # degrees Celsius (°C)
        "HUMIDITY",                 # percentage (%)
        "PRECIPITATION_FORM",       # 0:no precipitation, 1:dew/frost, 2:rain only, 3:snow only, 6:rain+other, 7:snow+other, 8:rain+snow, 9:unknown
        "PRECIPITATION_HEIGHT",     # millimeters (mm)
    ],
    resolution="hourly",
    start_date="2004-01-01",
    end_date="2024-01-01",
    settings=settings
).filter_by_station_id(station_id=stuttgart_airport_dwdstation)
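Since the question is about snow, it can help to have the precipitation form codes available as data rather than only as a comment. The sketch below copies the code table from the comment in the request above; the names `PRECIPITATION_FORM`, `SNOW_CODES` and `involves_snow` are hypothetical, and treating codes 3, 7 and 8 as "snow" is an assumption based on that table.

```python
# Code table copied from the comment next to "PRECIPITATION_FORM" in the request.
PRECIPITATION_FORM = {
    0: "no precipitation",
    1: "dew/frost",
    2: "rain only",
    3: "snow only",
    6: "rain + other",
    7: "snow + other",
    8: "rain + snow",
    9: "unknown",
}

# Assumption: these are the codes that involve snow.
SNOW_CODES = {3, 7, 8}


def involves_snow(code):
    """Return True if the code includes snow; a missing value (None) counts as False."""
    return code is not None and int(code) in SNOW_CODES


print(involves_snow(3.0), involves_snow(2.0), involves_snow(None))  # True False False
```

The observation values are stored as floats, which is why the helper converts the code to `int` before the lookup.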

Quick check of the request: displaying it shows the metadata of the matching station.

In [7]:
request

StationsResult(df=shape: (1, 8)
┌────────────┬─────────────┬─────────────┬──────────┬───────────┬────────┬─────────────┬─────────────┐
│ station_id ┆ start_date  ┆ end_date    ┆ latitude ┆ longitude ┆ height ┆ name        ┆ state       │
│ ---        ┆ ---         ┆ ---         ┆ ---      ┆ ---       ┆ ---    ┆ ---         ┆ ---         │
│ str        ┆ datetime[μs ┆ datetime[μs ┆ f64      ┆ f64       ┆ f64    ┆ str         ┆ str         │
│            ┆ , UTC]      ┆ , UTC]      ┆          ┆           ┆        ┆             ┆             │
╞════════════╪═════════════╪═════════════╪══════════╪═══════════╪════════╪═════════════╪═════════════╡
│ 04931      ┆ 1988-01-01  ┆ 2024-11-28  ┆ 48.6883  ┆ 9.2235    ┆ 371.0  ┆ Stuttgart-E ┆ Baden-Württ │
│            ┆ 00:00:00    ┆ 00:00:00    ┆          ┆           ┆        ┆ chterdingen ┆ emberg      │
│            ┆ UTC         ┆ UTC         ┆          ┆           ┆        ┆             ┆             │
└────────────┴─────────────┴─────────────┴──────────┴───────────┴────────┴─────────────┴─────────────┘

### Fetch the data
Fetch the raw observation data and store it in a file so that we can work locally.
Polars and pandas can export a DataFrame to several formats, such as CSV and Excel, as well as formats optimized for large datasets (efficient storage, faster read/write operations) such as Parquet. pandas additionally supports HDF5.

In [8]:
raw_data = request.values.all().df
raw_data.write_parquet("stuttgart_dwd.parquet")

Take a quick look at the data structure. The same check can be used later to verify the data loaded from disk.

In [9]:
print(f"""
shape  : {raw_data.shape}
columns: {raw_data.columns},
null   : {raw_data.null_count()}
""")


shape  : (701284, 6)
columns: ['station_id', 'dataset', 'parameter', 'date', 'value', 'quality'],
null   : shape: (1, 6)
┌────────────┬─────────┬───────────┬──────┬───────┬─────────┐
│ station_id ┆ dataset ┆ parameter ┆ date ┆ value ┆ quality │
│ ---        ┆ ---     ┆ ---       ┆ ---  ┆ ---   ┆ ---     │
│ u32        ┆ u32     ┆ u32       ┆ u32  ┆ u32   ┆ u32     │
╞════════════╪═════════╪═══════════╪══════╪═══════╪═════════╡
│ 0          ┆ 0       ┆ 0         ┆ 0    ┆ 69273 ┆ 69273   │
└────────────┴─────────┴───────────┴──────┴───────┴─────────┘
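The row count can be sanity-checked with a little arithmetic. The sketch below assumes the result contains one row per hour and per parameter, with both the start and the end timestamp included; under that assumption the expected count matches the shape printed above.

```python
from datetime import datetime, timedelta

start = datetime(2004, 1, 1)
end = datetime(2024, 1, 1)
n_parameters = 4  # temperature, humidity, precipitation form, precipitation height

# Hourly steps between start and end, plus one because both endpoints are counted (assumption).
hours = (end - start) // timedelta(hours=1) + 1
expected_rows = hours * n_parameters
print(hours, expected_rows)  # 175321 701284

# Share of rows with a missing value, using the null count printed above.
null_share = 69273 / expected_rows
print(f"{null_share:.1%}")  # 9.9%
```

So roughly one row in ten has no value, which is worth keeping in mind for the later analysis.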



In [8]:
raw_data

| station_id | dataset | parameter | date | value | quality |
|---|---|---|---|---|---|
| str | str | str | datetime[μs, UTC] | f64 | f64 |
| "04931" | "precipitation" | "precipitation_form" | 2004-01-01 00:00:00 UTC | null | null |
| "04931" | "precipitation" | "precipitation_form" | 2004-01-01 01:00:00 UTC | 0.0 | 1.0 |
| "04931" | "precipitation" | "precipitation_form" | 2004-01-01 02:00:00 UTC | 0.0 | 1.0 |
| "04931" | "precipitation" | "precipitation_form" | 2004-01-01 03:00:00 UTC | null | null |
| "04931" | "precipitation" | "precipitation_form" | 2004-01-01 04:00:00 UTC | 0.0 | 1.0 |
| … | … | … | … | … | … |
| "04931" | "temperature_air" | "temperature_air_mean_2m" | 2023-12-31 20:00:00 UTC | 5.6 | 3.0 |
| "04931" | "temperature_air" | "temperature_air_mean_2m" | 2023-12-31 21:00:00 UTC | 5.4 | 3.0 |
| "04931" | "temperature_air" | "temperature_air_mean_2m" | 2023-12-31 22:00:00 UTC | 5.1 | 3.0 |
| "04931" | "temperature_air" | "temperature_air_mean_2m" | 2023-12-31 23:00:00 UTC | 4.4 | 3.0 |
