The API offers access to different data products, which are outlined in more detail in the data-coverage chapter. Complete examples of how to use the API can be found in the example folder. To explore all features interactively, you might want to try the cli.
Historical weather data is requested by parameter, time resolution and period type.
The options parameter, time resolution and period type can be used in three ways:
- by using the exact enumeration, e.g. Parameter.CLIMATE_SUMMARY
- by using the enumeration string, e.g. "climate_summary" or "CLIMATE_SUMMARY"
- by using the originally defined parameter string, e.g. "kl"
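To illustrate how the three spellings can resolve to the same option, here is a minimal sketch. It is not wetterdienst's actual implementation: the `Parameter` enumeration below is a small stand-in, the value of `SOLAR` is an assumption, and `resolve_parameter` is a hypothetical helper.

```python
from enum import Enum


class Parameter(Enum):
    """Stand-in enumeration; only the value "kl" is taken from the text above."""

    CLIMATE_SUMMARY = "kl"
    SOLAR = "solar"  # assumed value, for illustration only


def resolve_parameter(value):
    """Hypothetical helper: map any of the three spellings to an enumeration member."""
    # 1. The exact enumeration member is returned unchanged.
    if isinstance(value, Parameter):
        return value
    # 2. The enumeration string, case-insensitive ("climate_summary" or "CLIMATE_SUMMARY").
    try:
        return Parameter[value.upper()]
    except KeyError:
        pass
    # 3. The originally defined parameter string ("kl"); raises ValueError if unknown.
    return Parameter(value)
```

All of `resolve_parameter(Parameter.CLIMATE_SUMMARY)`, `resolve_parameter("climate_summary")` and `resolve_parameter("kl")` return the same member in this sketch.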
Use wetterdienst.discover_climate_observations() to discover the available combinations of time resolution, parameter and period type, and their subsets, based on the given filter arguments.
Get station information for a given set of parameter, time resolution and period type options.
import wetterdienst
from wetterdienst import Parameter, PeriodType, TimeResolution
metadata = wetterdienst.metadata_for_climate_observations(
    parameter=Parameter.PRECIPITATION_MORE,
    time_resolution=TimeResolution.DAILY,
    period_type=PeriodType.HISTORICAL
)
The function returns a pandas DataFrame with information about the available stations. The column HAS_FILE indicates whether the station actually has a file with data on the server. That might not always be the case for stations which have been phased out.
Passing create_new_file_index=True forces the function to retrieve a new list of files from the server. Otherwise, data will be served from the cache because this information rarely changes.
Use the DWDStationRequest class in order to get hold of measurement information.
from wetterdienst import DWDStationRequest
from wetterdienst import Parameter, PeriodType, TimeResolution
request = DWDStationRequest(
    station_ids=[3, 1048],
    parameter=[Parameter.CLIMATE_SUMMARY, Parameter.SOLAR],
    time_resolution=TimeResolution.DAILY,
    start_date="1990-01-01",
    end_date="2020-01-01",
    tidy_data=True,
    humanize_column_names=True,
    write_file=True,
    prefer_local=True
)
for df in request.collect_data():
    # Analyse the station data here, e.g. inspect the first rows.
    print(df.head())
This gives us the most options for working with the data: several parameters are fetched at once, parsed into a column structure with improved parameter names, and stored automatically on disk if wanted.
Inquire the list of stations by geographic coordinates.
- Calculate weather stations close to the given coordinates and set of parameters.
- Either select by rank (n stations) or by distance in km.
from datetime import datetime
from wetterdienst import get_nearby_stations, DWDStationRequest
from wetterdienst import Parameter, PeriodType, TimeResolution
stations = get_nearby_stations(
    50.0, 8.9,
    datetime(2020, 1, 1),
    datetime(2020, 1, 20),
    Parameter.TEMPERATURE_AIR,
    TimeResolution.HOURLY,
    PeriodType.RECENT,
    num_stations_nearby=1
)
The function returns a DataFrame with the list of stations with distances [in km] to the given coordinates.
The station ids within the DataFrame:
station_ids = stations.STATION_ID.unique()
can be used to download the observation data:
request = DWDStationRequest(
    station_ids=station_ids,
    parameter=[Parameter.TEMPERATURE_AIR, Parameter.SOLAR],
    time_resolution=TimeResolution.HOURLY,
    start_date="1990-01-01",
    end_date="2020-01-01",
    tidy_data=True,
    humanize_column_names=True,
    write_file=True,
    prefer_local=True
)
for df in request.collect_data():
    # Analyse the station data here, e.g. inspect the first rows.
    print(df.head())
Et voilà: we now have the data we wanted for our location and are ready to analyse how the temperature has developed historically.
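The choice between selecting by rank (n stations) and selecting by distance in km can be illustrated with a small standalone sketch. It uses the haversine formula as one common way to compute great-circle distances; whether wetterdienst computes distances this way is not stated here. The helper names and the idea of passing stations as (id, latitude, longitude) tuples are assumptions made for illustration.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearby(stations, lat, lon, num_stations=None, max_distance_km=None):
    """Hypothetical helper: rank (id, lat, lon) tuples by distance, then cut.

    Stations are sorted by distance to (lat, lon). The result is limited
    by radius if max_distance_km is given and by rank if num_stations is given.
    """
    ranked = sorted(
        (haversine_km(lat, lon, s_lat, s_lon), station_id)
        for station_id, s_lat, s_lon in stations
    )
    if max_distance_km is not None:
        ranked = [(d, sid) for d, sid in ranked if d <= max_distance_km]
    if num_stations is not None:
        ranked = ranked[:num_stations]
    # Return (station id, distance in km) pairs, nearest first.
    return [(sid, d) for d, sid in ranked]
```

Calling `nearby(stations, 50.0, 8.9, num_stations=1)` corresponds to selection by rank, while `nearby(stations, 50.0, 8.9, max_distance_km=20)` corresponds to selection by distance.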
Querying data using SQL is provided by an in-memory DuckDB database. In order to explore what is possible, please have a look at the DuckDB SQL introduction.
The result data is provided through a virtual table called data.
from wetterdienst import DWDStationRequest, DataPackage
from wetterdienst import Parameter, PeriodType, TimeResolution
request = DWDStationRequest(
    station_ids=[1048],
    parameter=[Parameter.TEMPERATURE_AIR],
    time_resolution=TimeResolution.HOURLY,
    start_date="2019-01-01",
    end_date="2020-01-01",
    tidy_data=True,
    humanize_column_names=True,
    prefer_local=True,
    write_file=True,
)
data = DataPackage(request=request)
data.lowercase_fieldnames()
df = data.filter_by_sql("SELECT * FROM data WHERE element='temperature_air_200' AND value < -7.0;")
print(df)
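To show what such a query does without downloading anything, the following sketch runs the same SQL statement against a hand-made table. It deliberately uses Python's built-in sqlite3 module as a stand-in for DuckDB. The rows are made up for illustration, and the column names station_id and date are assumptions; only element and value appear in the query above.

```python
import sqlite3

# Made-up rows in a tidy layout: one measurement per row.
rows = [
    (1048, "2019-01-21 06:00:00", "temperature_air_200", -8.1),
    (1048, "2019-01-21 12:00:00", "temperature_air_200", -2.4),
    (1048, "2019-01-21 06:00:00", "humidity", 91.0),
]

# In-memory database with a table called "data", mirroring the virtual table name.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE data (station_id INTEGER, date TEXT, element TEXT, value REAL)"
)
con.executemany("INSERT INTO data VALUES (?, ?, ?, ?)", rows)

# Same filter as above: air temperature readings below -7.0.
cold = con.execute(
    "SELECT * FROM data WHERE element = 'temperature_air_200' AND value < -7.0"
).fetchall()
con.close()

print(cold)
```

Only the first of the three made-up rows matches both conditions, so `cold` contains a single tuple.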
Data can be exported to SQLite, DuckDB, InfluxDB, CrateDB and more targets. A target is identified by a connection string.
Examples:
- sqlite:///dwd.sqlite?table=weather
- duckdb:///dwd.duckdb?table=weather
- influxdb://localhost/?database=dwd&table=weather
- crate://localhost/?database=dwd&table=weather
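The connection strings above follow the usual URL layout, so their parts can be taken apart with the standard library. The sketch below only illustrates how such a string decomposes; `parse_target` is a hypothetical helper and does not reflect how wetterdienst itself interprets the target.

```python
from urllib.parse import parse_qs, urlparse


def parse_target(target):
    """Hypothetical helper: split a connection string into its parts."""
    url = urlparse(target)
    # parse_qs returns lists of values; keep the first value of each option.
    options = {key: values[0] for key, values in parse_qs(url.query).items()}
    return {
        "protocol": url.scheme,
        "host": url.hostname,  # None for file-based targets such as sqlite
        "path": url.path,
        "database": options.get("database"),
        "table": options.get("table"),
    }


print(parse_target("sqlite:///dwd.sqlite?table=weather"))
print(parse_target("influxdb://localhost/?database=dwd&table=weather"))
```

For file-based targets the file name ends up in `path`, while server-based targets carry a host plus `database` and `table` options in the query string.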
from wetterdienst import DWDStationRequest, DataPackage
from wetterdienst import Parameter, PeriodType, TimeResolution
request = DWDStationRequest(
    station_ids=[1048],
    parameter=[Parameter.TEMPERATURE_AIR],
    time_resolution=TimeResolution.HOURLY,
    start_date="2019-01-01",
    end_date="2020-01-01",
    tidy_data=True,
    humanize_column_names=True,
    prefer_local=True,
    write_file=True,
)
data = DataPackage(request=request)
data.lowercase_fieldnames()
data.export("influxdb://localhost/?database=dwd&table=weather")
Yet to be implemented...
To use DWDRadolanRequest, you have to provide a time resolution (either hourly or daily) and either date_times (a list of datetimes or strings) or a start date and end date. Datetimes are rounded to HH:50, as the data is packaged for this minute step. Additionally, you can provide a folder to store RADOLAN data to, and restore it from, the local filesystem.
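As a rough illustration of the HH:50 alignment, the sketch below snaps a datetime to the most recent HH:50 at or before it. The rounding direction is an assumption, as the text above does not specify it, and `to_minute_50` is a hypothetical helper rather than part of wetterdienst.

```python
from datetime import datetime, timedelta


def to_minute_50(timestamp):
    """Assumed rule: snap to the most recent HH:50 at or before the timestamp."""
    snapped = timestamp.replace(minute=50, second=0, microsecond=0)
    # If HH:50 of the same hour lies in the future, step back one hour.
    if snapped > timestamp:
        snapped -= timedelta(hours=1)
    return snapped


print(to_minute_50(datetime(2020, 9, 4, 12, 0)))
```

Under this assumption, a request for 12:00 would map to the product packaged at 11:50 of the same day.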
This is a short snippet which should give you an idea how to use DWDRadolanRequest together with wradlib. For a more thorough example, please have a look at example/radolan.py.
from wetterdienst import DWDRadolanRequest, TimeResolution
import wradlib as wrl
radolan = DWDRadolanRequest(
    TimeResolution.DAILY,
    start_date="2020-09-04T12:00:00",
    end_date="2020-09-04T12:00:00"
)
for item in radolan.collect_data():
    # Decode item.
    timestamp, buffer = item
    # Decode data using wradlib.
    data, attributes = wrl.io.read_radolan_composite(buffer)
    # Do something with the data (numpy.ndarray) here.