<div style="background-color:#FCE205; padding:10px; border-radius:5px; color:black; font-weight:bold;">
    <h2>Importing weather data from Open Meteo API</h2>
</div>

In [None]:
# import required libraries
import pandas as pd
import os
import numpy as np
import time
import sys

sys.path.append(os.path.abspath('../utils'))
import tinne_utils as tu

<div style="background-color:#FCE205; padding:5px; border-radius:10px; color:black; font-weight:bold;">
    <h3>Variable description</h3>
</div>

- **`temperature_2m_mean`** → Mean daily air temperature at **2 meters above ground** (°C).
- **`temperature_2m_min`** → Minimum daily air temperature at **2 meters above ground** (°C).
- **`temperature_2m_max`** → Maximum daily air temperature at **2 meters above ground** (°C).
- **`relative_humidity_2m_mean`** → Mean daily relative humidity at **2 meters above ground** (%).
- **`relative_humidity_2m_min`** → Minimum daily relative humidity at **2 meters above ground** (%).
- **`relative_humidity_2m_max`** → Maximum daily relative humidity at **2 meters above ground** (%).

- **`precipitation_hours`** → Number of hours with rain on a given day (**h**).
- **`wind_speed_10m_max`** → Maximum daily wind speed at **10 meters above ground** (**km/h**).

- **`weathercode`** → The most severe **weather condition** on a given day, reported as a WMO weather code. The table lists some of the codes.

| Weather Code | Description |
|-------------|------------|
| 0  | Clear sky |
| 1  | Mainly clear |
| 2  | Partly cloudy |
| 3  | Overcast |
| 51 | Light drizzle |
| 53 | Moderate drizzle |
| 55 | Heavy drizzle |
| 56 | Light freezing drizzle |
| 61 | Light rain |
| 63 | Moderate rain |
| 65 | Heavy rain |
| 71 | Light snow |
| 73 | Moderate snow |
| 75 | Heavy snow |
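
The table above can be turned into a lookup so that numeric codes become readable labels. The following is a minimal sketch: the names `WEATHER_CODE_DESCRIPTIONS` and `describe_weather_code` are illustrative and not part of this notebook, and only the codes listed in the table are covered.

```python
# Descriptions copied from the table above; codes outside the table are not covered.
WEATHER_CODE_DESCRIPTIONS = {
    0: "Clear sky",
    1: "Mainly clear",
    2: "Partly cloudy",
    3: "Overcast",
    51: "Light drizzle",
    53: "Moderate drizzle",
    55: "Heavy drizzle",
    56: "Light freezing drizzle",
    61: "Light rain",
    63: "Moderate rain",
    65: "Heavy rain",
    71: "Light snow",
    73: "Moderate snow",
    75: "Heavy snow",
}


def describe_weather_code(code):
    """Return the table description for a code, or a fallback for unlisted codes."""
    return WEATHER_CODE_DESCRIPTIONS.get(int(code), "Unknown (not in table)")


print(describe_weather_code(63))
```

If the fetched DataFrame has a `weathercode` column (an assumption about the helper's output), the same dictionary could be applied with `Series.map`.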

<div style="background-color:#FCE205; padding:10px; border-radius:5px; color:black; font-weight:bold;">
    <h3>Assigning US states longitude and latitude for API calls</h3>
</div>

The latitude and longitude used for each state were obtained from ChatGPT; each state is represented by a single point.
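
Because the coordinates were not taken from an authoritative source, a quick range check can catch obvious mistakes such as swapped latitude and longitude. This is only a sketch: the bounding box is an assumed, deliberately loose range covering all 50 states, `find_suspect_coordinates` is a hypothetical helper, and the last sample row is an invented bad entry for illustration.

```python
# Assumed loose bounding box for the 50 US states, including Alaska and Hawaii.
LAT_RANGE = (18.0, 72.0)
LON_RANGE = (-180.0, -66.0)


def find_suspect_coordinates(rows):
    """Return the names of rows whose (lat, lon) fall outside the assumed box."""
    suspects = []
    for name, lat, lon in rows:
        lat_ok = LAT_RANGE[0] <= lat <= LAT_RANGE[1]
        lon_ok = LON_RANGE[0] <= lon <= LON_RANGE[1]
        if not (lat_ok and lon_ok):
            suspects.append(name)
    return suspects


sample = [
    ("Alabama", 32.806671, -86.791130),
    ("Hawaii", 20.902977, -156.207483),
    ("Swapped example", -86.791130, 32.806671),  # invented row: lat/lon swapped
]
suspects = find_suspect_coordinates(sample)
print(suspects)
```

With the `states` DataFrame defined in the next cell, the same check could be run as `find_suspect_coordinates(states.itertuples(index=False))`. It only flags gross errors and does not confirm that a point lies inside its state.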

In [3]:
# Bee dataset contains US states information. Latitudes and longitudes are used to get weather data from Open Meteo API.
states = pd.DataFrame({
    "state": [
        "Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado", "Connecticut",
        "Delaware", "Florida", "Georgia", "Hawaii", "Idaho", "Illinois", "Indiana", "Iowa",
        "Kansas", "Kentucky", "Louisiana", "Maine", "Maryland", "Massachusetts", "Michigan",
        "Minnesota", "Mississippi", "Missouri", "Montana", "Nebraska", "Nevada", "New Hampshire",
        "New Jersey", "New Mexico", "New York", "North Carolina", "North Dakota", "Ohio",
        "Oklahoma", "Oregon", "Pennsylvania", "Rhode Island", "South Carolina", "South Dakota",
        "Tennessee", "Texas", "Utah", "Vermont", "Virginia", "Washington", "West Virginia",
        "Wisconsin", "Wyoming"
    ],
    "latitude": [
        32.806671, 61.370716, 33.729759, 34.969704, 36.116203, 39.059811, 41.597782,
        39.318523, 27.766279, 33.040619, 20.902977, 44.068202, 40.633125, 39.849426, 42.011539,
        38.526600, 37.668140, 31.169546, 45.367584, 39.045753, 42.407211, 44.182205,
        46.392410, 32.741646, 38.456085, 46.921925, 41.125370, 38.313515, 43.452492,
        40.298904, 34.840515, 42.165726, 35.630066, 47.528912, 40.388783,
        35.565342, 44.572021, 40.590752, 41.680893, 33.856892, 44.299782,
        35.747845, 31.054487, 39.320980, 44.045876, 37.769337, 47.400902, 38.491226,
        44.268543, 42.755966
    ],
    "longitude": [
        -86.791130, -152.404419, -111.431221, -92.373123, -119.681564, -105.311104, -72.755371,
        -75.507141, -81.686783, -83.643074, -156.207483, -114.742043, -89.398529, -86.258278, -93.210526,
        -96.726486, -84.670067, -91.867805, -68.972168, -76.641273, -71.382439, -84.506836,
        -94.636230, -89.678696, -92.288368, -110.454353, -98.268082, -117.055374, -71.563896,
        -74.521011, -106.248482, -74.948051, -79.806419, -99.784012, -82.764915,
        -96.928917, -122.070938, -77.209755, -71.511780, -80.945007, -99.438828,
        -86.692345, -97.563461, -111.093735, -72.710686, -78.169968, -121.490494, -80.954456,
        -89.616508, -107.302490
    ]
}
)

locations = states[["latitude", "longitude"]].drop_duplicates()

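# The API limits how many requests can be made at once, so split the locations into 6 smaller chunks.
# Splitting row positions instead of the DataFrame itself avoids the pandas FutureWarning raised by np.array_split.
locations_split = [locations.iloc[idx] for idx in np.array_split(np.arange(len(locations)), 6)]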



<div style="background-color:#FCE205; padding:10px; border-radius:5px; color:black; font-weight:bold;">
    <h3>Open Meteo API calls</h3>
</div>

The free tier of the Open Meteo API allows only a limited number of calls, so the requests are made in small chunks spaced 15 minutes apart.
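
The helper `tu.fetch_weather_data` used in the next cell lives in `tinne_utils` and is not shown in this notebook. As a rough illustration of what one chunked request could look like, the sketch below builds the query parameters for a single chunk. It rests on several assumptions: the endpoint is taken to be Open-Meteo's historical archive API, several coordinates are passed as comma-separated lists, the dates are placeholders, and `build_archive_params` is a hypothetical name, not the author's implementation.

```python
# Assumed endpoint: Open-Meteo's historical weather archive API.
ARCHIVE_URL = "https://archive-api.open-meteo.com/v1/archive"

# Daily variables listed in the variable description above.
DAILY_VARIABLES = [
    "temperature_2m_mean", "temperature_2m_min", "temperature_2m_max",
    "relative_humidity_2m_mean", "relative_humidity_2m_min",
    "relative_humidity_2m_max", "precipitation_hours",
    "wind_speed_10m_max", "weathercode",
]


def build_archive_params(coordinates, start_date, end_date):
    """Build query parameters for one chunk of (latitude, longitude) pairs."""
    coords = list(coordinates)
    return {
        "latitude": ",".join(f"{lat:.6f}" for lat, _ in coords),
        "longitude": ",".join(f"{lon:.6f}" for _, lon in coords),
        "start_date": start_date,  # placeholder date, not from the notebook
        "end_date": end_date,      # placeholder date, not from the notebook
        "daily": ",".join(DAILY_VARIABLES),
        "timezone": "auto",
    }


chunk = [(32.806671, -86.791130), (61.370716, -152.404419)]
params = build_archive_params(chunk, "2020-01-01", "2020-12-31")
print(params["latitude"], params["longitude"])
```

The parameters could then be sent with an HTTP client, for example `requests.get(ARCHIVE_URL, params=params, timeout=60)`. Keeping the parameter building separate from the network call makes it easy to inspect a request before spending any of the limited quota.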

In [None]:
# Fetch weather data for each part with a delay of 15 minutes between requests
weather_data_list = []
for i, loc in enumerate(locations_split):
    print(f"Fetching weather data for part {i + 1}...")
    
    # Fetch weather data for the current part
    weather_data = tu.fetch_weather_data(loc)
    weather_data_list.append(weather_data)
    
    # If not the last part, wait for 15 minutes before the next request
    if i < len(locations_split) - 1:
        print("Waiting for 15 minutes before the next request...")
        time.sleep(900)  # Wait for 15 minutes

# Combine all weather data into one DataFrame
weather_data_combined = pd.concat(weather_data_list, ignore_index=True)


In [None]:
# Save each chunk's DataFrame to its own CSV file (uncomment to run)
# for part, df in enumerate(weather_data_list, start=1):
#     df.to_csv(f"data/import/Hourly_Weather_Data_part{part}.csv", index=False)