# Read Me
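This file describes two weather datasets for Singapore covering January to June 2025. Both were downloaded as hourly observations with the meteostat Python library and resampled to 30-minute intervals by time-based linear interpolation, as shown in the code further down.
The column names are described here: https://dev.meteostat.net/formats.html#meteorological-parameters
You can find the two datasets in the weather_data folder.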

1. Changi Station Weather Data (weather_changi.csv)

* This dataset contains weather observations from the Changi meteorological station, which is the official and most representative station for Singapore’s overall weather.

* Changi Station data is widely used as the standard reference because of its long-term reliability and comprehensive coverage. Use this dataset for general analysis where a single, authoritative source is sufficient to represent Singapore’s climate.

2. Five-Region Weather Data (weather_region.csv)

* This dataset includes weather data from five representative stations across Singapore’s main regions: East (Changi), West (Jurong), South (Harbourfront/Marina South), North (Sembawang), and Central (Marina Centre).

* It captures local variations and microclimates across different parts of the city. Use this dataset when regional weather differences are important, or when a more detailed spatial analysis of Singapore’s climate is needed.

* The Region column marks the source area of each data record.
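
As an illustration of how the Region column might be used, here is a minimal sketch that reshapes the five-region data into one temperature column per region. It assumes the CSV exposes a `time` column next to the Meteostat columns (check the actual file header first). The inline sample stands in for the real file, and its West values are made up purely for illustration. `temp_by_region` is a hypothetical helper, not part of the original notebook.

```python
import io
import pandas as pd

# Hypothetical stand-in for weather_region.csv; the West rows are invented values.
SAMPLE_CSV = """time,temp,rhum,Region
2025-01-01 08:00:00,26.0,89.0,East
2025-01-01 08:00:00,25.5,91.0,West
2025-01-01 08:30:00,26.5,86.5,East
2025-01-01 08:30:00,26.1,90.0,West
"""

def temp_by_region(csv_source):
    """Return a table with one row per timestamp and one temp column per region."""
    df = pd.read_csv(csv_source, parse_dates=["time"])  # assumes a 'time' column exists
    return df.pivot_table(index="time", columns="Region", values="temp")

wide = temp_by_region(io.StringIO(SAMPLE_CSV))
# Spread between the warmest and coolest region at each timestamp
spread = wide.max(axis=1) - wide.min(axis=1)
print(wide)
print(spread)
```

With the real file, the same helper could be called with a path such as `weather_data/weather_region.csv`, provided the column assumption above holds.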

In [2]:
# pip install meteostat
from datetime import datetime
from meteostat import Stations, Hourly, Point
import matplotlib.pyplot as plt
import pandas as pd

In [8]:
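# Select Changi as the most representative weather station for Singapore.
# Note: (1.3521, 103.8198) is roughly the centre of Singapore, not Changi itself;
# inspect changi_station['name'] to confirm which station was actually selected.
lat, lon = 1.3521, 103.8198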
stations = Stations().nearby(lat, lon)
changi_station = stations.fetch(1)            # Find the nearest station
changi_station_id = changi_station.index[0]          # meteostat Changi station ID

# Download hourly data (2025-01-01 to 2025-06-30)
start = datetime(2025,1,1)
end   = datetime(2025,6,30,23,59)
changi_data = Hourly(changi_station_id, start, end)
changi_df_hour = changi_data.fetch()                 # DataFrame indexed by UTC time

# Convert to local time (SGT = Asia/Singapore) before resampling to 30-minute intervals (interpolated)
if changi_df_hour.index.tz is None:
    changi_df_hour = changi_df_hour.tz_localize('UTC').tz_convert('Asia/Singapore')
else:
    changi_df_hour = changi_df_hour.tz_convert('Asia/Singapore')

# Remove time zone information
changi_df_hour.index = changi_df_hour.index.tz_localize(None)

# Resample to 30 minutes, linear interpolation
changi_df_30 = changi_df_hour.resample('30min').interpolate(method='time')

In [9]:
print(changi_df_30.head())

                     temp   dwpt  rhum  prcp  snow   wdir  wspd  wpgt    pres  \
time                                                                            
2025-01-01 08:00:00  26.0   24.0  89.0   0.0  <NA>  360.0   5.4  <NA>  1010.0   
2025-01-01 08:30:00  26.5  24.05  86.5   0.0  <NA>  360.0   5.4  <NA>  1010.5   
2025-01-01 09:00:00  27.0   24.1  84.0   0.0  <NA>  360.0   5.4  <NA>  1011.0   
2025-01-01 09:30:00  27.5  24.05  81.5   0.0  <NA>  190.0   7.4  <NA>  1011.0   
2025-01-01 10:00:00  28.0   24.0  79.0   0.0  <NA>   20.0   9.4  <NA>  1011.0   

                     tsun  coco  
time                             
2025-01-01 08:00:00  <NA>   3.0  
2025-01-01 08:30:00  <NA>   3.0  
2025-01-01 09:00:00  <NA>   3.0  
2025-01-01 09:30:00  <NA>   3.0  
2025-01-01 10:00:00  <NA>   3.0  
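
In the output above, `wdir` at 09:30 is 190.0, the arithmetic midpoint of 360 and 20, even though both neighbouring readings point roughly north. Wind direction is a circular quantity, so plain linear interpolation can be misleading when it crosses north. One possible workaround, sketched here with a hypothetical helper `interpolate_wdir` that is not part of the original notebook, is to interpolate the sine and cosine components and convert back to degrees.

```python
import numpy as np
import pandas as pd

def interpolate_wdir(wdir_hourly: pd.Series, freq: str = "30min") -> pd.Series:
    """Upsample wind direction (degrees) by interpolating its unit-vector components."""
    rad = np.deg2rad(wdir_hourly.astype(float))
    # Interpolate the two components separately, then recombine into an angle
    u = np.sin(rad).resample(freq).interpolate(method="time")
    v = np.cos(rad).resample(freq).interpolate(method="time")
    return np.rad2deg(np.arctan2(u, v)) % 360

# Two hourly readings taken from the sample output: 360 degrees, then 20 degrees
idx = pd.to_datetime(["2025-01-01 09:00", "2025-01-01 10:00"])
wdir = pd.Series([360.0, 20.0], index=idx, name="wdir")
wdir_30 = interpolate_wdir(wdir)
print(wdir_30)  # the 09:30 value comes out near 10 degrees instead of 190
```

If this approach suits the analysis, the result could replace the interpolated column, for example `changi_df_30['wdir'] = interpolate_wdir(changi_df_hour['wdir'])`. When two consecutive readings are exactly opposite, the components cancel and the interpolated direction is not meaningful, so such gaps would need separate handling.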


In [10]:
# Export the weather data of Changi Station to csv
# index=True keeps the 'time' index as a column (index=False would drop the timestamps)
# changi_df_30.to_csv('weather_changi.csv', index=True, encoding='utf-8')

In [13]:
# Locating representative weather stations
region_coords = {
    'East': (1.3521, 103.8198),          # East: Changi
    'West': (1.3327, 103.7432),          # West: Jurong
    'South': (1.2734, 103.8198),         # South: Harbourfront / Marina South
    'North': (1.4496, 103.8205),         # North: Sembawang
    'Central': (1.2790, 103.8545)        # Central: Marina Centre
}

# Time Period
start = datetime(2025, 1, 1)
end = datetime(2025, 6, 30, 23, 59)

all_data = []

for region, (lat, lon) in region_coords.items():
    # Find the nearest weather station ID for each region
    stations = Stations().nearby(lat, lon)
    station = stations.fetch(1)
    station_id = station.index[0]

    # Download hourly data
    data = Hourly(station_id, start, end)
    df_hour = data.fetch()

    # Convert time zone from UTC to Singapore time
    if df_hour.index.tz is None:
        df_hour = df_hour.tz_localize('UTC').tz_convert('Asia/Singapore')
    else:
        df_hour = df_hour.tz_convert('Asia/Singapore')

    # Remove time zone information
    df_hour.index = df_hour.index.tz_localize(None)

    # Resample to 30 minutes, linear interpolation
    df_30 = df_hour.resample('30min').interpolate(method='time')

    # Add a new column to label the region
    df_30['Region'] = region

    all_data.append(df_30)

# Merge all regions data
df_all = pd.concat(all_data)

In [14]:
print(df_all.head())

                     temp   dwpt  rhum  prcp  snow   wdir  wspd  wpgt    pres  \
time                                                                            
2025-01-01 08:00:00  26.0   24.0  89.0   0.0  <NA>  360.0   5.4  <NA>  1010.0   
2025-01-01 08:30:00  26.5  24.05  86.5   0.0  <NA>  360.0   5.4  <NA>  1010.5   
2025-01-01 09:00:00  27.0   24.1  84.0   0.0  <NA>  360.0   5.4  <NA>  1011.0   
2025-01-01 09:30:00  27.5  24.05  81.5   0.0  <NA>  190.0   7.4  <NA>  1011.0   
2025-01-01 10:00:00  28.0   24.0  79.0   0.0  <NA>   20.0   9.4  <NA>  1011.0   

                     tsun  coco Region  
time                                    
2025-01-01 08:00:00  <NA>   3.0   East  
2025-01-01 08:30:00  <NA>   3.0   East  
2025-01-01 09:00:00  <NA>   3.0   East  
2025-01-01 09:30:00  <NA>   3.0   East  
2025-01-01 10:00:00  <NA>   3.0   East  
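
The `coco` column holds Meteostat's weather condition code, which is categorical rather than continuous, so time interpolation between two different codes can yield values that are not valid codes. A minimal sketch of one alternative follows: interpolate the continuous columns and forward-fill the categorical ones. `resample_mixed` and `CATEGORICAL_COLS` are hypothetical names introduced for illustration, and treating only `coco` as categorical is an assumption.

```python
import pandas as pd

CATEGORICAL_COLS = ["coco"]  # assumption: the only code-valued column in the data

def resample_mixed(df_hour: pd.DataFrame, freq: str = "30min") -> pd.DataFrame:
    """Interpolate continuous columns in time and forward-fill categorical ones."""
    num_cols = [c for c in df_hour.columns if c not in CATEGORICAL_COLS]
    cat_cols = [c for c in df_hour.columns if c in CATEGORICAL_COLS]
    numeric = df_hour[num_cols].resample(freq).interpolate(method="time")
    categorical = df_hour[cat_cols].resample(freq).ffill()
    # Restore the original column order after joining on the time index
    return numeric.join(categorical)[list(df_hour.columns)]

# Small synthetic frame: the condition code changes from 3 to 7 within the hour
idx = pd.to_datetime(["2025-01-01 08:00", "2025-01-01 09:00"])
demo = pd.DataFrame({"temp": [26.0, 27.0], "coco": [3.0, 7.0]}, index=idx)
out = resample_mixed(demo)
print(out)
```

In this sketch the 08:30 row keeps the earlier code 3 while `temp` is interpolated to 26.5. Forward-fill is only one choice; nearest-neighbour filling would be equally defensible.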


In [5]:
# Export the weather data to csv
# index=True keeps the 'time' index as a column (index=False would drop the timestamps)
# df_all.to_csv('weather_region.csv', index=True, encoding='utf-8')