# Wind Energy Analysis for Bangladesh  
## ERA5 Data Acquisition & Preprocessing (100 m Wind)

**Thesis:** Statistical and Machine Learning Approaches for Wind, Solar, and Hybrid Renewable Energy Analysis in Bangladesh  
**Stage:** Wind Energy – Data Preparation  
**Platform:** Google Colab

**Prepared by: Md Abid Hassan Mitul**

In [None]:
# Install necessary libraries for ERA5 data access and processing
!pip install cdsapi xarray netCDF4 pandas numpy



In [None]:
# Configure Copernicus Climate Data Store (CDS) API credentials

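import os
from getpass import getpass

# Prompt for the personal CDS API key instead of hard-coding it in the notebook
cds_key = getpass("Enter your CDS API key: ")

with open('/root/.cdsapirc', 'w') as f:
    f.write(f"url: https://cds.climate.copernicus.eu/api\nkey: {cds_key}\n")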

print("CDS API configured successfully")


CDS API configured successfully


In [None]:
# Import Python libraries for data download and processing

import cdsapi
import xarray as xr
import pandas as pd
import numpy as np
import os


In [None]:
# Define spatial and temporal scope for wind data extraction

START_YEAR = 2015
END_YEAR = 2015   # Start with one year (storage-efficient strategy)

# Cox's Bazar / Kutubdia bounding box
AREA = [
    22.6, 91.8,   # North, West
    20.5, 92.4    # South, East
]


In [None]:
# Download ERA5 hourly wind data (100 m) for a single month

c = cdsapi.Client()

c.retrieve(
    'reanalysis-era5-single-levels',
    {
        'product_type': 'reanalysis',
        'variable': [
            '100m_u_component_of_wind',
            '100m_v_component_of_wind',
            '2m_temperature',
            'surface_pressure',
        ],
        'year': '2015',
        'month': '01',
        'day': [f"{d:02d}" for d in range(1, 32)],
        'time': [f"{h:02d}:00" for h in range(0, 24)],
        'area': AREA,
        'format': 'netcdf',
    },
    'era5_wind_2015_01.nc'
)


2026-02-10 15:14:03,253 INFO [2025-12-11T00:00:00] Please note that a dedicated catalogue entry for this dataset, post-processed and stored in Analysis Ready Cloud Optimized (ARCO) format (Zarr), is available for optimised time-series retrievals (i.e. for retrieving data from selected variables for a single point over an extended period of time in an efficient way). You can discover it [here](https://cds.climate.copernicus.eu/datasets/reanalysis-era5-single-levels-timeseries?tab=overview)

7b8ae44428cdf5ab91f084414ac86060.nc:   0%|          | 0.00/292k [00:00<?, ?B/s]

'era5_wind_2015_01.nc'

In [None]:
# Load downloaded NetCDF file and inspect contents

ds = xr.open_dataset('era5_wind_2015_01.nc')
ds


In [None]:
# Compute wind speed and direction from u and v components

u = ds['u100']
v = ds['v100']

wind_speed = np.sqrt(u**2 + v**2)
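# Note: arctan2(v, u) is the mathematical angle of the wind vector (counterclockwise
# from east, pointing where the wind blows TO), not the meteorological "from" bearing.
wind_direction = (np.degrees(np.arctan2(v, u)) + 360) % 360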
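Wind roses and most wind-energy tools expect the meteorological convention: the direction the wind blows from, measured clockwise from north. A minimal sketch of that conversion is given below. The helper name `meteorological_direction` is hypothetical and is not used elsewhere in this notebook.

```python
import numpy as np

def meteorological_direction(u, v):
    """Direction the wind blows FROM, in degrees clockwise from north.

    u is the eastward component and v the northward component (m/s).
    """
    return (180.0 + np.degrees(np.arctan2(u, v))) % 360.0

# Wind blowing toward the east (u > 0, v = 0) is a westerly wind
print(meteorological_direction(5.0, 0.0))
# Wind blowing toward the north (u = 0, v > 0) is a southerly wind
print(meteorological_direction(0.0, 5.0))
```

The function also accepts xarray DataArrays such as `ds['u100']` and `ds['v100']`, because NumPy ufuncs operate element-wise on them.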


In [None]:
wind_speed = wind_speed.rename("wind_speed_100m")
wind_direction = wind_direction.rename("wind_direction")
print("Named OK")


Named OK


In [None]:
# Spatially average over latitude and longitude (area mean)

wind_speed_1d = wind_speed.mean(dim=["latitude", "longitude"])
# Caution: an arithmetic mean of angles is unreliable when values straddle the 0/360 wrap
wind_direction_1d = wind_direction.mean(dim=["latitude", "longitude"])
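Averaging angles arithmetically can give misleading results: the mean of 350° and 10° is 180°, although both directions lie close to 0°. One common alternative is a circular mean, sketched below with plain NumPy. The function name `circular_mean_deg` is an assumption for illustration; averaging `u` and `v` over the grid before computing the direction is another reasonable option.

```python
import numpy as np

def circular_mean_deg(angles_deg):
    """Circular mean of angles in degrees, returned in the range [0, 360)."""
    a = np.radians(np.asarray(angles_deg, dtype=float))
    # Average the unit vectors, then convert the mean vector back to an angle
    mean_angle = np.arctan2(np.sin(a).mean(), np.cos(a).mean())
    return np.degrees(mean_angle) % 360.0

angles = [350.0, 10.0]
print("arithmetic mean:", np.mean(angles))
print("circular mean:  ", circular_mean_deg(angles))
```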


In [None]:
# Convert to pandas DataFrame (time series)

df = xr.Dataset(
    {
        "wind_speed_100m": wind_speed_1d,
        "wind_direction": wind_direction_1d,
        "t2m": ds["t2m"].mean(dim=["latitude", "longitude"]),
        "sp": ds["sp"].mean(dim=["latitude", "longitude"]),
    }
).to_dataframe().reset_index()

df.head()


Unnamed: 0,valid_time,number,expver,wind_speed_100m,wind_direction,t2m,sp
0,2015-01-01 00:00:00,0,1,6.537967,240.28653,295.286682,100856.265625
1,2015-01-01 01:00:00,0,1,6.903614,232.82164,295.156586,100922.3125
2,2015-01-01 02:00:00,0,1,6.771497,224.101486,295.493744,100982.789062
3,2015-01-01 03:00:00,0,1,6.230996,219.548828,295.380981,101037.28125
4,2015-01-01 04:00:00,0,1,5.27108,219.518341,295.237274,101047.4375


In [None]:
# Clean and rename columns for thesis use

df = df.rename(columns={"valid_time": "datetime"})
df = df.drop(columns=["number", "expver"])

df.head()


Unnamed: 0,datetime,wind_speed_100m,wind_direction,t2m,sp
0,2015-01-01 00:00:00,6.537967,240.28653,295.286682,100856.265625
1,2015-01-01 01:00:00,6.903614,232.82164,295.156586,100922.3125
2,2015-01-01 02:00:00,6.771497,224.101486,295.493744,100982.789062
3,2015-01-01 03:00:00,6.230996,219.548828,295.380981,101037.28125
4,2015-01-01 04:00:00,5.27108,219.518341,295.237274,101047.4375
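The `t2m` column is in kelvin and `sp` is in pascals. If these two variables are intended for an air-density correction (an assumption; the notebook does not state their purpose), the ideal gas law for dry air, ρ = p / (R_d · T), gives a first estimate. The sketch below uses R_d = 287.05 J/(kg·K); the column names `t2m_c`, `sp_hpa`, and `air_density` are hypothetical. The result is a near-surface density, not a 100 m hub-height value, and it ignores humidity.

```python
import pandas as pd

R_DRY_AIR = 287.05  # specific gas constant of dry air, J/(kg*K)

def add_derived_columns(df):
    """Return a copy of df with Celsius, hPa, and dry-air density columns."""
    out = df.copy()
    out["t2m_c"] = out["t2m"] - 273.15                      # K -> deg C
    out["sp_hpa"] = out["sp"] / 100.0                       # Pa -> hPa
    out["air_density"] = out["sp"] / (R_DRY_AIR * out["t2m"])  # kg/m^3
    return out

# First row of the January 2015 table shown above
sample = pd.DataFrame({"t2m": [295.286682], "sp": [100856.265625]})
print(add_derived_columns(sample))
```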


In [None]:
# Save processed January 2015 data to CSV

csv_file = "wind_100m_cxb_2015_01.csv"
df.to_csv(csv_file, index=False)

print("Saved:", csv_file)


Saved: wind_100m_cxb_2015_01.csv


In [None]:
import os
import shutil

# Create folders
os.makedirs("Wind_ERA5/nc_raw", exist_ok=True)
os.makedirs("Wind_ERA5/csv_processed", exist_ok=True)

# Move files to proper folders
shutil.move("era5_wind_2015_01.nc", "Wind_ERA5/nc_raw/era5_wind_2015_01.nc")
shutil.move("wind_100m_cxb_2015_01.csv", "Wind_ERA5/csv_processed/wind_100m_cxb_2015_01.csv")

print("Files organized successfully")


Files organized successfully


In [None]:
from google.colab import drive
drive.mount('/content/drive')


Mounted at /content/drive


In [None]:
import shutil
import os

# Source (Colab temporary storage)
source = "/content/Wind_ERA5"

# Destination (Google Drive permanent storage)
destination = "/content/drive/MyDrive/Wind_ERA5"

# Copy folder to Drive
if not os.path.exists(destination):
    shutil.copytree(source, destination)
    print("Folder copied to Google Drive")
else:
    print("Folder already exists in Drive")


Folder copied to Google Drive


In [None]:
# Reusable function: download, process, and save one month of ERA5 wind data

def process_wind_month(year, month):
    nc_file = f"era5_wind_{year}_{month}.nc"
    csv_file = f"wind_100m_cxb_{year}_{month}.csv"

    c = cdsapi.Client()
    c.retrieve(
        "reanalysis-era5-single-levels",
        {
            "product_type": "reanalysis",
            "variable": [
                "100m_u_component_of_wind",
                "100m_v_component_of_wind",
                "2m_temperature",
                "surface_pressure",
            ],
            "year": str(year),
            "month": str(month),
            "day": [f"{d:02d}" for d in range(1, 32)],
            "time": [f"{h:02d}:00" for h in range(24)],
            "area": [22.6, 91.8, 20.5, 92.4],
            "format": "netcdf",
        },
        nc_file,
    )

    ds = xr.open_dataset(nc_file)

    u = ds["u100"]
    v = ds["v100"]

    wind_speed = np.sqrt(u**2 + v**2).mean(dim=["latitude", "longitude"])
    # Wrap to [0, 360) before averaging, consistent with the January processing above
    wind_dir = ((np.degrees(np.arctan2(v, u)) + 360) % 360).mean(dim=["latitude", "longitude"])

    df = xr.Dataset(
        {
            "wind_speed_100m": wind_speed,
            "wind_direction": wind_dir,
            "t2m": ds["t2m"].mean(dim=["latitude", "longitude"]),
            "sp": ds["sp"].mean(dim=["latitude", "longitude"]),
        }
    ).to_dataframe().reset_index()

    df = df.rename(columns={"valid_time": "datetime"})
    df = df.drop(columns=[c for c in ["number", "expver"] if c in df.columns])

    os.makedirs("Wind_ERA5/nc_raw", exist_ok=True)
    os.makedirs("Wind_ERA5/csv_processed", exist_ok=True)

    df.to_csv(f"Wind_ERA5/csv_processed/{csv_file}", index=False)
    shutil.move(nc_file, f"Wind_ERA5/nc_raw/{nc_file}")

    print(f"Completed: {year}-{month}")


In [None]:
process_wind_month(2015, "02")



Completed: 2015-02


In [None]:
import os
import shutil
import glob

# Source and destination folders
src_folder = "/content"
dst_folder = "/content/drive/MyDrive/Wind_ERA5/csv_processed"

os.makedirs(dst_folder, exist_ok=True)

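# Find all wind CSVs in the Colab root. Note: process_wind_month() writes its CSVs to
# Wind_ERA5/csv_processed, so this pattern does not match files created by that function.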
csv_files = glob.glob(os.path.join(src_folder, "wind_100m_cxb_*.csv"))

copied = 0
for f in csv_files:
    dst = os.path.join(dst_folder, os.path.basename(f))
    if not os.path.exists(dst):
        shutil.copy(f, dst)
        copied += 1

print(f"CSV sync complete. New files copied: {copied}")


CSV sync complete. New files copied: 0


In [None]:
process_wind_month(2015, "02")



Completed: 2015-02


In [None]:
import shutil
import os

src = "/content/Wind_ERA5/csv_processed/wind_100m_cxb_2015_02.csv"
dst_dir = "/content/drive/MyDrive/Wind_ERA5/csv_processed"

os.makedirs(dst_dir, exist_ok=True)

shutil.copy(src, dst_dir)

print("February CSV copied to Drive")


February CSV copied to Drive


In [None]:
import os
import shutil
import glob

# Source: Colab nc files (both possible locations)
src_candidates = [
    "/content/era5_wind_*.nc",
    "/content/Wind_ERA5/nc_raw/era5_wind_*.nc"
]

# Destination: Google Drive
dst_folder = "/content/drive/MyDrive/Wind_ERA5/nc_raw"
os.makedirs(dst_folder, exist_ok=True)

copied = 0
for pattern in src_candidates:
    for f in glob.glob(pattern):
        dst = os.path.join(dst_folder, os.path.basename(f))
        if not os.path.exists(dst):
            shutil.copy(f, dst)
            copied += 1

print(f"NC sync complete. New files copied: {copied}")


NC sync complete. New files copied: 1


In [2]:
# ================== SETUP + REUSABLE FUNCTION ==================

# Install libs
!pip install -q cdsapi xarray netCDF4 pandas numpy

# Configure CDS API (prompt for the key instead of hard-coding it)
from getpass import getpass

cds_key = getpass("Enter your CDS API key: ")
with open('/root/.cdsapirc', 'w') as f:
    f.write(f"url: https://cds.climate.copernicus.eu/api\nkey: {cds_key}\n")

# Imports
import cdsapi
import xarray as xr
import numpy as np
import pandas as pd
import os, shutil
from google.colab import drive

# Mount Drive
drive.mount('/content/drive')

# Reusable function
def process_wind_month(year, month):
    nc_file = f"era5_wind_{year}_{month}.nc"
    csv_file = f"wind_100m_cxb_{year}_{month}.csv"

    c = cdsapi.Client()
    c.retrieve(
        "reanalysis-era5-single-levels",
        {
            "product_type": "reanalysis",
            "variable": [
                "100m_u_component_of_wind",
                "100m_v_component_of_wind",
                "2m_temperature",
                "surface_pressure",
            ],
            "year": str(year),
            "month": str(month),
            "day": [f"{d:02d}" for d in range(1, 32)],
            "time": [f"{h:02d}:00" for h in range(24)],
            "area": [22.6, 91.8, 20.5, 92.4],
            "format": "netcdf",
        },
        nc_file,
    )

    ds = xr.open_dataset(nc_file)

    u = ds["u100"]
    v = ds["v100"]

    wind_speed = np.sqrt(u**2 + v**2).mean(dim=["latitude", "longitude"])
    # Wrap to [0, 360) before averaging, consistent with the January processing above
    wind_dir = ((np.degrees(np.arctan2(v, u)) + 360) % 360).mean(dim=["latitude", "longitude"])

    df = xr.Dataset(
        {
            "wind_speed_100m": wind_speed,
            "wind_direction": wind_dir,
            "t2m": ds["t2m"].mean(dim=["latitude", "longitude"]),
            "sp": ds["sp"].mean(dim=["latitude", "longitude"]),
        }
    ).to_dataframe().reset_index()

    df = df.rename(columns={"valid_time": "datetime"})
    df = df.drop(columns=[c for c in ["number", "expver"] if c in df.columns])

    base = "/content/drive/MyDrive/Wind_ERA5"
    os.makedirs(f"{base}/csv_processed", exist_ok=True)
    os.makedirs(f"{base}/nc_raw", exist_ok=True)

    df.to_csv(f"{base}/csv_processed/{csv_file}", index=False)
    shutil.move(nc_file, f"{base}/nc_raw/{nc_file}")

    print(f"COMPLETED → {year}-{month}")

# ==============================================================


Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
process_wind_month(2015, "03")



COMPLETED → 2015-03


In [None]:
process_wind_month(2015, "04")



COMPLETED → 2015-04


In [None]:
process_wind_month(2015, "05")


COMPLETED → 2015-05


In [None]:
process_wind_month(2015, "06")


COMPLETED → 2015-06


In [None]:
# ===== USER INPUT =====
YEAR = 2015

# Choose ONE option:
MODE = "batch"   # "single" or "batch"

# If MODE == "single"
MONTH = "07"

# If MODE == "batch"
START_MONTH = 7
END_MONTH = 12
# ======================
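The settings above are defined but not read by any later cell in this notebook. A minimal sketch of how they could drive the download is given below. The helper `months_to_process` is hypothetical, and the final loop only prints the calls it would make to `process_wind_month` so that the cell can be run without contacting the CDS.

```python
def months_to_process(mode, month="07", start_month=7, end_month=12):
    """Translate the MODE settings into a list of zero-padded month strings."""
    if mode == "single":
        return [f"{int(month):02d}"]
    if mode == "batch":
        if not 1 <= start_month <= end_month <= 12:
            raise ValueError("start_month and end_month must satisfy 1 <= start <= end <= 12")
        return [f"{m:02d}" for m in range(start_month, end_month + 1)]
    raise ValueError(f"Unknown MODE: {mode!r}")

# Dry run with the values from the USER INPUT cell
for month in months_to_process("batch", start_month=7, end_month=12):
    print(f"would call process_wind_month(2015, {month!r})")
```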


In [None]:
process_wind_month(2015, "07")


COMPLETED → 2015-07


In [None]:
# ===== SINGLE MONTH INPUT =====
year = int(input("Enter year (e.g., 2015): "))
month = input("Enter month (MM, e.g., 07): ").zfill(2)

process_wind_month(year, month)


Enter year (e.g., 2015): 2015
Enter month (MM, e.g., 07): 08



COMPLETED → 2015-08


In [None]:
# ===== SINGLE MONTH INPUT =====
year = int(input("Enter year (e.g., 2015): "))
month = input("Enter month (MM, e.g., 07): ").zfill(2)

process_wind_month(year, month)

Enter year (e.g., 2015): 2015
Enter month (MM, e.g., 07): 12



COMPLETED → 2015-12


In [5]:
# ===== 6-MONTH BATCH INPUT =====

year = int(input("Enter year (e.g., 2015): "))
start_month = int(input("Enter start month (1-12): "))
end_month = start_month + 5

if end_month > 12:
    raise ValueError("6-month batch exceeds December. Adjust start month.")

print(f"\nProcessing {year} months {start_month:02d} to {end_month:02d}\n")

for m in range(start_month, end_month + 1):
    month = f"{m:02d}"
    print(f"--- Starting {year}-{month} ---")
    process_wind_month(year, month)
    print(f"--- Finished {year}-{month} ---\n")


print("6-month batch completed.")
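Colab sessions can disconnect partway through a long batch, so it may help to skip months whose CSV already exists before requesting them again. The sketch below is one way to list the months still outstanding. The helper `pending_months` is hypothetical, and the folder path is assumed to be the Drive location used by `process_wind_month` in this notebook.

```python
import os

def pending_months(start_year, end_year, existing_files):
    """Return (year, "MM") pairs whose processed CSV is not in existing_files."""
    pending = []
    for year in range(start_year, end_year + 1):
        for m in range(1, 13):
            month = f"{m:02d}"
            if f"wind_100m_cxb_{year}_{month}.csv" not in existing_files:
                pending.append((year, month))
    return pending

# Assumed location of the processed CSVs (the Drive folder used in this notebook)
csv_dir = "/content/drive/MyDrive/Wind_ERA5/csv_processed"
existing = set(os.listdir(csv_dir)) if os.path.isdir(csv_dir) else set()

for year, month in pending_months(2015, 2015, existing):
    print(f"pending: {year}-{month}")  # call process_wind_month(year, month) here
```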


Enter year (e.g., 2015): 2024
Enter start month (1-12): 7

Processing 2024 months 07 to 12

--- Starting 2024-07 ---



COMPLETED → 2024-07
--- Finished 2024-07 ---

--- Starting 2024-08 ---



COMPLETED → 2024-08
--- Finished 2024-08 ---

--- Starting 2024-09 ---



COMPLETED → 2024-09
--- Finished 2024-09 ---

--- Starting 2024-10 ---



COMPLETED → 2024-10
--- Finished 2024-10 ---

--- Starting 2024-11 ---



COMPLETED → 2024-11
--- Finished 2024-11 ---

--- Starting 2024-12 ---



COMPLETED → 2024-12
--- Finished 2024-12 ---

6-month batch completed.


In [6]:
import pandas as pd
import glob

# Path to Drive CSV folder
path = "/content/drive/MyDrive/Wind_ERA5/csv_processed/"

# Get all CSV files
all_files = sorted(glob.glob(path + "wind_100m_cxb_*.csv"))

# Merge
df_list = [pd.read_csv(f) for f in all_files]
wind_master = pd.concat(df_list, ignore_index=True)

# Convert datetime
wind_master["datetime"] = pd.to_datetime(wind_master["datetime"])

# Sort
wind_master = wind_master.sort_values("datetime").reset_index(drop=True)

print("Merged rows:", len(wind_master))
wind_master.head()


Merged rows: 87672


Unnamed: 0,datetime,wind_speed_100m,wind_direction,t2m,sp
0,2015-01-01 00:00:00,6.537967,240.28653,295.28668,100856.266
1,2015-01-01 01:00:00,6.903614,232.82164,295.1566,100922.31
2,2015-01-01 02:00:00,6.771497,224.10149,295.49374,100982.79
3,2015-01-01 03:00:00,6.230996,219.54883,295.38098,101037.28
4,2015-01-01 04:00:00,5.271079,219.51834,295.23727,101047.44
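The merged row count matches a complete hourly record for 2015 to 2024: ten years with three leap years (2016, 2020, 2024) give 3653 days, and 3653 × 24 = 87672. A row count alone does not rule out a missing hour offset by a duplicated one, so a timestamp-level check can be useful. The sketch below is one way to do this; the function name `check_hourly_coverage` is hypothetical, and the small demo frame stands in for `wind_master`.

```python
import pandas as pd

def check_hourly_coverage(df, time_col="datetime"):
    """Count expected, actual, missing, and duplicated hourly timestamps."""
    times = pd.to_datetime(df[time_col])
    # Every hour between the first and last timestamp, inclusive
    expected = pd.date_range(times.min(), times.max(), freq=pd.Timedelta(hours=1))
    missing = expected.difference(pd.DatetimeIndex(times))
    return {
        "expected": len(expected),
        "actual": len(times),
        "missing": len(missing),
        "duplicates": int(times.duplicated().sum()),
    }

# Demo: hour 02:00 is missing and hour 01:00 appears twice
demo = pd.DataFrame({"datetime": [
    "2015-01-01 00:00:00", "2015-01-01 01:00:00",
    "2015-01-01 01:00:00", "2015-01-01 03:00:00",
]})
print(check_hourly_coverage(demo))
```

For the merged dataset, `check_hourly_coverage(wind_master)` would be expected to report zero missing and zero duplicated timestamps if every month was downloaded exactly once.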


In [7]:
# Save merged master dataset to Google Drive

save_path = "/content/drive/MyDrive/Wind_ERA5/wind_master_2015_2024.csv"

wind_master.to_csv(save_path, index=False)

print("Master dataset saved to Drive.")


Master dataset saved to Drive.
