We use the free Open-Meteo Historical Weather API, which provides weather data from 1940 to the present.
An API call takes required parameters (latitude, longitude, start_date, end_date) and optional ones: elevation (to improve accuracy), apikey (required only for commercial use), hourly (a list of hourly weather variables to return), and daily (a list of daily weather variable aggregations to return).
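
As a rough sketch of how these parameters fit together, the helper below assembles the query dictionary: the four required parameters are always present, and the optional ones are added only when given. The name build_archive_params is our own (hypothetical, not part of any library), and the variables passed in the example are the ones we plan to use in the regression.

```python
from typing import Optional, Sequence


def build_archive_params(
    latitude: float,
    longitude: float,
    start_date: str,
    end_date: str,
    hourly: Optional[Sequence[str]] = None,
    daily: Optional[Sequence[str]] = None,
    elevation: Optional[float] = None,
    apikey: Optional[str] = None,
) -> dict:
    """Assemble query parameters for the archive endpoint (hypothetical helper)."""
    # Required parameters are always sent.
    params = {
        "latitude": latitude,
        "longitude": longitude,
        "start_date": start_date,  # ISO date string, e.g. "2025-11-09"
        "end_date": end_date,
    }
    # Optional parameters are included only when the caller provides them.
    optional = {"hourly": hourly, "daily": daily, "elevation": elevation, "apikey": apikey}
    for name, value in optional.items():
        if value is not None:
            params[name] = list(value) if name in ("hourly", "daily") else value
    return params


params = build_archive_params(
    52.52, 13.41, "2025-11-09", "2025-11-23",
    hourly=["weather_code", "precipitation", "snowfall", "wind_speed_10m"],
    daily=["sunrise", "sunset"],
)
print(params)
```

The API may expect further parameters (for example a timezone) when daily variables are requested; we have not verified this here, so check the docs before relying on the sketch.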

## Linear regression
We plan to regress the number of accidents on:
- weather_code (WMO code): a standardized code between 0 and 99 describing the weather condition; broadly, the higher the code, the harsher the weather
- precipitation (mm): total precipitation (rain, showers, snow) summed over the preceding hour
- snowfall (cm): snowfall amount over the preceding hour, in centimeters. For the water equivalent, divide by 7: 7 cm of snow corresponds to about 1 cm, i.e. 10 mm, of precipitation water equivalent
- wind_speed_10m (km/h): wind speed at 10 meters above ground, which is the standard measurement level
- night (1 at night, 0 otherwise): a variable that we create from the two daily variables sunrise and sunset
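
The night variable can be derived in several ways; below is a minimal sketch of one of them, assuming an hourly table with a datetime column date and a daily table with datetime columns sunrise and sunset in the same timezone. The function name add_night_flag and this table layout are our own assumptions, not something the API provides.

```python
import pandas as pd


def add_night_flag(hourly: pd.DataFrame, daily: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of `hourly` with a 0/1 `night` column.

    Assumed (hypothetical) layout: `hourly` has a datetime column `date`;
    `daily` has one row per calendar day with datetime columns `sunrise`
    and `sunset`, expressed in the same timezone as `date`.
    """
    # Index sunrise and sunset by calendar day so each hour can look up its own day.
    sunrise_by_day = daily.set_index(daily["sunrise"].dt.date)["sunrise"]
    sunset_by_day = daily.set_index(daily["sunset"].dt.date)["sunset"]

    day = hourly["date"].dt.date
    sunrise = day.map(sunrise_by_day)
    sunset = day.map(sunset_by_day)

    # Daylight means sunrise <= t < sunset; everything else counts as night.
    # Hours whose day is missing from `daily` compare as False and are flagged as night.
    is_day = (hourly["date"] >= sunrise) & (hourly["date"] < sunset)
    return hourly.assign(night=(~is_day).astype(int))


# Made-up sunrise and sunset times for a single day, for illustration only.
hourly = pd.DataFrame(
    {"date": pd.date_range("2025-11-09 00:00", periods=24, freq=pd.Timedelta(hours=1))}
)
daily = pd.DataFrame({
    "sunrise": pd.to_datetime(["2025-11-09 07:15"]),
    "sunset": pd.to_datetime(["2025-11-09 16:20"]),
})
flagged = add_night_flag(hourly, daily)
print(flagged["night"].tolist())
```

Because each timestamp is compared against the sunrise and sunset of its own calendar day, the flag follows the seasonal change in day length without any hard-coded hours.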

## Focus on the WMO code
The conditions corresponding to the WMO code are described here: https://www.nodc.noaa.gov/archive/arc0021/0002199/1.1/data/0-data/HTML/WMO-CODE/WMO4677.HTM.

Weather icon illustrations for each code are provided here: https://gist.github.com/stellasphere/9490c195ed2b53c707087c8c2db4ec0c
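
Since the WMO code is a category label rather than a continuous measurement, one option we may consider is to bucket it into a few coarse groups before the regression. The sketch below does this with a hypothetical helper, wmo_group; the ranges reflect our reading of the WMO table (50s drizzle, 60s rain, 70s snow, 80s showers, 95 and above thunderstorm) and should be checked against the table linked above.

```python
def wmo_group(code: int) -> str:
    """Map a WMO weather code (0-99) to a coarse group (hypothetical grouping)."""
    if code == 0:
        return "clear"
    if code in (1, 2, 3):
        return "cloudy"
    if code in (45, 48):
        return "fog"
    if 51 <= code <= 57:
        return "drizzle"
    if 61 <= code <= 67:
        return "rain"
    if 71 <= code <= 77:
        return "snow"
    if 80 <= code <= 82:
        return "rain_showers"
    if code in (85, 86):
        return "snow_showers"
    if 95 <= code <= 99:
        return "thunderstorm"
    # Codes outside the ranges above are left in a catch-all group.
    return "other"


groups = [wmo_group(c) for c in (0, 3, 45, 63, 75, 81, 99, 10)]
print(groups)
```

The resulting labels could then be one-hot encoded, which avoids assuming that the effect on accidents grows linearly with the numeric code.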

## Open-Meteo docs
https://open-meteo.com/en/docs/historical-weather-api

In [10]:
import openmeteo_requests

import pandas as pd
import requests_cache
from retry_requests import retry

# Setup the Open-Meteo API client with cache and retry on error
cache_session = requests_cache.CachedSession('.cache', expire_after = -1)
retry_session = retry(cache_session, retries = 5, backoff_factor = 0.2)
openmeteo = openmeteo_requests.Client(session = retry_session)

# Make sure all required weather variables are listed here
# The order of variables in hourly or daily is important to assign them correctly below
url = "https://archive-api.open-meteo.com/v1/archive"
params = {
	"latitude": 52.52,
	"longitude": 13.41,
	"start_date": "2025-11-09",
	"end_date": "2025-11-23",
	"hourly": ["temperature_2m", "relative_humidity_2m", ]
}
responses = openmeteo.weather_api(url, params=params)

# Process first location. Add a for-loop for multiple locations or weather models
response = responses[0]
print(f"Coordinates: {response.Latitude()}°N {response.Longitude()}°E")
print(f"Elevation: {response.Elevation()} m asl")
print(f"Timezone difference to GMT+0: {response.UtcOffsetSeconds()}s")

# Process hourly data. The order of variables needs to be the same as requested.
hourly = response.Hourly()
hourly_temperature_2m = hourly.Variables(0).ValuesAsNumpy()
hourly_relative_humidity_2m = hourly.Variables(1).ValuesAsNumpy()

hourly_data = {"date": pd.date_range(
	start = pd.to_datetime(hourly.Time(), unit = "s", utc = True),
	end =  pd.to_datetime(hourly.TimeEnd(), unit = "s", utc = True),
	freq = pd.Timedelta(seconds = hourly.Interval()),
	inclusive = "left"
)}

hourly_data["temperature_2m"] = hourly_temperature_2m
hourly_data["relative_humidity_2m"] = hourly_relative_humidity_2m

hourly_dataframe = pd.DataFrame(data = hourly_data)
print("\nHourly data\n", hourly_dataframe)



Coordinates: 52.5483283996582°N 13.407821655273438°E
Elevation: 38.0 m asl
Timezone difference to GMT+0: 0s

Hourly data
                          date  temperature_2m  relative_humidity_2m
0   2025-11-09 00:00:00+00:00          2.3585            100.000000
1   2025-11-09 01:00:00+00:00          1.5585             99.286285
2   2025-11-09 02:00:00+00:00          1.6585             97.174431
3   2025-11-09 03:00:00+00:00          1.4085             99.642143
4   2025-11-09 04:00:00+00:00          1.3085             98.574203
..                        ...             ...                   ...
355 2025-11-23 19:00:00+00:00         -3.1915             80.981010
356 2025-11-23 20:00:00+00:00         -3.5415             78.489082
357 2025-11-23 21:00:00+00:00         -3.7915             74.040115
358 2025-11-23 22:00:00+00:00         -4.1415             73.402283
359 2025-11-23 23:00:00+00:00         -3.3415             67.281662

[360 rows x 3 columns]
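
The date-range construction in the cell above can be isolated in a small helper, which makes it easier to reuse once we request the regression variables instead of temperature and humidity. This is only a sketch: build_hourly_frame is our own hypothetical name, and the example feeds it synthetic zeros rather than real API output.

```python
import numpy as np
import pandas as pd


def build_hourly_frame(start_s: int, end_s: int, interval_s: int, columns: dict) -> pd.DataFrame:
    """Build a DataFrame with a UTC `date` column plus one column per variable.

    `start_s`, `end_s` (exclusive) and `interval_s` are in seconds, matching what
    hourly.Time(), hourly.TimeEnd() and hourly.Interval() return in the cell above.
    """
    dates = pd.date_range(
        start=pd.to_datetime(start_s, unit="s", utc=True),
        end=pd.to_datetime(end_s, unit="s", utc=True),
        freq=pd.Timedelta(seconds=interval_s),
        inclusive="left",
    )
    # Fail early if a variable does not line up with the time axis.
    for name, values in columns.items():
        if len(values) != len(dates):
            raise ValueError(f"{name}: expected {len(dates)} values, got {len(values)}")
    return pd.DataFrame({"date": dates, **columns})


# Synthetic stand-in for API output: 15 days of hourly values starting 2025-11-09 UTC.
start = int(pd.Timestamp("2025-11-09", tz="UTC").timestamp())
end = start + 15 * 24 * 3600
frame = build_hourly_frame(start, end, 3600, {"temperature_2m": np.zeros(360)})
print(frame.shape)
```

With a real response, the columns dictionary could be filled by looping over the requested names in order and calling hourly.Variables(i).ValuesAsNumpy() for each index i, which keeps the variable order and the column names in sync.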


In [9]:
pip install requests-cache retry-requests numpy pandas

Collecting requests-cache
  Downloading requests_cache-1.2.1-py3-none-any.whl (61 kB)
Collecting retry-requests
  Downloading retry_requests-2.0.0-py3-none-any.whl (15 kB)
Collecting url-normalize>=1.4
  Downloading url_normalize-2.2.1-py3-none-any.whl (14 kB)
Collecting cattrs>=22.2
  Downloading cattrs-25.3.0-py3-none-any.whl (70 kB)
Collecting attrs>=21.2
  Downloading attrs-25.4.0-py3-none-any.whl (67 kB)
Installing collected packages: url-normalize, attrs, retry-requests, cattrs, requests-cache
  Attempting uninstall: attrs
    Found existing installation: attrs 25.3.0
    Uninstalling attrs-25.3.0:
      Successfully uni