## DATA FETCH FROM ERA5

### Time Period
Data is fetched for a 20-year period so that averaging smooths out recurring climate variability such as El Niño.

### Data needed for solar

- Surface Solar Radiation Downwards (SSRD): used to compute how much solar energy reaches the Earth's surface.
- 2m Temperature (t2m): the higher the temperature, the lower the efficiency of a solar panel.
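
As a rough sketch of how these two variables might be used: hourly ERA5 reports SSRD as energy accumulated over the hour (J/m²) and t2m in Kelvin, so both usually need a unit conversion first. The helper names below, the linear derating model, and the coefficient of -0.4 % per °C are illustrative assumptions, not values taken from this notebook. A real panel model would also use cell temperature rather than air temperature.

```python
import numpy as np

SECONDS_PER_HOUR = 3600.0


def ssrd_to_irradiance(ssrd_j_per_m2, accumulation_seconds=SECONDS_PER_HOUR):
    """Convert accumulated SSRD (J/m^2) to mean irradiance (W/m^2)."""
    return np.asarray(ssrd_j_per_m2, dtype=float) / accumulation_seconds


def kelvin_to_celsius(t2m_kelvin):
    """Convert 2 m temperature from Kelvin to degrees Celsius."""
    return np.asarray(t2m_kelvin, dtype=float) - 273.15


def temperature_derating(temp_c, gamma=-0.004, reference_c=25.0):
    """Linear efficiency factor relative to the reference temperature.

    gamma is an ASSUMED temperature coefficient (-0.4 % per degree C);
    check the datasheet of the actual panel.
    """
    return 1.0 + gamma * (np.asarray(temp_c, dtype=float) - reference_c)


# Example: 3.6 MJ/m^2 accumulated in one hour at 308.15 K (35 degrees C)
irradiance = ssrd_to_irradiance(3_600_000.0)
factor = temperature_derating(kelvin_to_celsius(308.15))
print(irradiance, factor)
```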


### Data needed for wind

- u-component of wind
- v-component of wind

Both components are used to compute the wind velocity vector.
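
A minimal sketch of that calculation, assuming `u` is the eastward component and `v` the northward component, both in m/s (the ERA5 convention). The function names are illustrative. The direction follows the meteorological convention: the compass bearing the wind blows from.

```python
import numpy as np


def wind_speed(u, v):
    """Magnitude of the horizontal wind vector (same units as u and v)."""
    return np.hypot(u, v)


def wind_direction_from(u, v):
    """Bearing in degrees that the wind blows FROM (0 = north, 90 = east)."""
    return (270.0 - np.degrees(np.arctan2(v, u))) % 360.0


# u and v taken from the first row of the coarsened output shown later in this notebook
speed = wind_speed(-8.483521, -2.083054)
direction = wind_direction_from(-8.483521, -2.083054)
print(speed, direction)
```

The speed computed here agrees with the `wind_speed` value printed for that row (about 8.7355 m/s).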

In [2]:
%pip install cdsapi
%load_ext autoreload
%autoreload 2

Note: you may need to restart the kernel to use updated packages.


In [None]:
# Params

START_YEAR = 2024
END_YEAR = 2024

In [None]:
import cdsapi

dataset = "derived-era5-pressure-levels-daily-statistics"
request = {
    "product_type": "reanalysis",
    "variable": [
        "u_component_of_wind",
        "v_component_of_wind"
    ],
    "year": "2024",
    "month": ["01"],
    "day": [f"{d:02d}" for d in range(1, 32)],  # "01" through "31"
    "pressure_level": ["975"],
    "daily_statistic": "daily_mean",
    "time_zone": "utc+07:00",
    "frequency": "6_hourly",
    "area": [6, 95, -11, 141],  # North, West, South, East
}

client = cdsapi.Client()
client.retrieve(dataset, request).download()


2025-07-27 17:02:23,391 INFO [2024-09-26T00:00:00] Watch our [Forum](https://forum.ecmwf.int/) for Announcements, news and other discussed topics.
2025-07-27 17:02:23,909 INFO Request ID is c63b2335-db0a-43b8-8343-ae654b08ca43
2025-07-27 17:02:24,125 INFO status has been updated to accepted
2025-07-27 17:02:46,481 INFO status has been updated to successful


28ee8c76e1b59830b3bef39aec6d6dd1.grib:   0%|          | 0.00/24.7M [00:00<?, ?B/s]

'28ee8c76e1b59830b3bef39aec6d6dd1.grib'

In [5]:
import cdsapi
import os
import pandas as pd

# --- Configuration ---
YEAR_TO_DOWNLOAD = 2023  # The year you want data for
OUTPUT_FOLDER = "../data/wind_data_975hpa/"  # Created below if it does not exist
DATASET = "reanalysis-era5-pressure-levels"  # Hourly ERA5 data on pressure levels

# --- Script Starts Here ---
# Ensure the output directory exists
os.makedirs(OUTPUT_FOLDER, exist_ok=True)

# Initialize the API client
client = cdsapi.Client()

# Generate a list of the first day of each month for the specified year
date_range = pd.date_range(start=f'{YEAR_TO_DOWNLOAD}-01-01', end=f'{YEAR_TO_DOWNLOAD}-12-31', freq='MS')

# Loop through each month
for date in date_range:
    year = date.strftime("%Y")
    month = date.strftime("%m")
    
    # Define a unique output filename for this month
    output_filename = f"wind_data_{year}_{month}.grib"
    output_file_path = os.path.join(OUTPUT_FOLDER, output_filename)
    
    # Check if the file already exists to avoid re-downloading
    if os.path.exists(output_file_path):
        print(f"Skipping {output_filename}, file already exists.")
        continue

    print(f"--- Downloading data for {year}-{month} ---")

    try:
        # Create the API request for the current month
        request = {
            "product_type": "reanalysis",
            "format": "grib",
            "variable": [
                "u_component_of_wind",
                "v_component_of_wind"
            ],
            "pressure_level": "975",
            "year": year,
            "month": month,
            "day": [f"{d:02d}" for d in range(1, 32)],  # API handles non-existent days
            "time": [f"{h:02d}:00" for h in range(24)],  # All 24 hours
            "area": [6, 95, -11, 141],  # North, West, South, East
        }

        # Retrieve and download the data
        client.retrieve(DATASET, request, output_file_path)
        
        print(f"Successfully downloaded to: {output_file_path}")

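    except Exception as e:
        print(f"!!! FAILED to download data for {year}-{month}. Error: {e}")
        # Remove any partial file so the exists-check above does not skip this month on a rerun
        if os.path.exists(output_file_path):
            os.remove(output_file_path)
        # The loop then continues to the next month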

print("\n--- Automation complete. ---")

2025-07-27 19:24:07,083 INFO [2024-09-26T00:00:00] Watch our [Forum](https://forum.ecmwf.int/) for Announcements, news and other discussed topics.


--- Downloading data for 2023-01 ---


2025-07-27 19:24:07,670 INFO Request ID is 07cceb7b-21ed-48ab-ac43-d17943b0317c
2025-07-27 19:24:07,915 INFO status has been updated to accepted
2025-07-27 19:24:17,278 INFO status has been updated to running
2025-07-27 19:26:04,431 INFO status has been updated to successful


fbf3d81b3a25bb766336b5607d712de7.grib:   0%|          | 0.00/36.4M [00:00<?, ?B/s]

Successfully downloaded to: ../data/wind_data_975hpa/wind_data_2023_01.grib
--- Downloading data for 2023-02 ---


2025-07-27 19:26:16,174 INFO Request ID is eee152b6-e786-4da9-adec-ce4ab84ff399
2025-07-27 19:26:16,416 INFO status has been updated to accepted
2025-07-27 19:26:25,705 INFO status has been updated to running
2025-07-27 19:28:12,799 INFO status has been updated to successful


4cf90ffebfa75390a4594e75eab97251.grib:   0%|          | 0.00/32.9M [00:00<?, ?B/s]

Successfully downloaded to: ../data/wind_data_975hpa/wind_data_2023_02.grib
--- Downloading data for 2023-03 ---


2025-07-27 19:28:24,479 INFO Request ID is 4feec453-004a-4912-a7fa-59831331b0fd
2025-07-27 19:28:24,715 INFO status has been updated to accepted
2025-07-27 19:28:34,284 INFO status has been updated to running
2025-07-27 19:30:21,113 INFO status has been updated to successful


971672f33f7cc7694a2240abc4e0b95f.grib:   0%|          | 0.00/36.4M [00:00<?, ?B/s]

Successfully downloaded to: ../data/wind_data_975hpa/wind_data_2023_03.grib
--- Downloading data for 2023-04 ---


2025-07-27 19:30:32,756 INFO Request ID is 1ee88180-2420-4cf0-9ce2-3a9d1c2dca8b
2025-07-27 19:30:33,008 INFO status has been updated to accepted
2025-07-27 19:30:47,547 INFO status has been updated to running
2025-07-27 19:32:29,174 INFO status has been updated to successful


c040b35fd982a872664f8522a4b6d585.grib:   0%|          | 0.00/35.2M [00:00<?, ?B/s]

Successfully downloaded to: ../data/wind_data_975hpa/wind_data_2023_04.grib
--- Downloading data for 2023-05 ---


2025-07-27 19:32:44,002 INFO Request ID is 7be9032d-bf38-4d2f-a0c3-7365bd3d6c6a
2025-07-27 19:32:44,250 INFO status has been updated to accepted
2025-07-27 19:32:53,419 INFO status has been updated to running
2025-07-27 19:34:40,284 INFO status has been updated to successful


70a67aa762b4aad4784329eeca5746b4.grib:   0%|          | 0.00/36.4M [00:00<?, ?B/s]

Successfully downloaded to: ../data/wind_data_975hpa/wind_data_2023_05.grib
--- Downloading data for 2023-06 ---


2025-07-27 19:35:37,342 INFO Request ID is 5d122aa7-d495-4a85-8488-034a419a3cb9
2025-07-27 19:35:37,635 INFO status has been updated to accepted
2025-07-27 19:35:52,163 INFO status has been updated to running
2025-07-27 19:37:33,750 INFO status has been updated to successful


1df14c6a6b4836692b822330c4387b49.grib:   0%|          | 0.00/35.2M [00:00<?, ?B/s]

Successfully downloaded to: ../data/wind_data_975hpa/wind_data_2023_06.grib
--- Downloading data for 2023-07 ---


2025-07-27 19:37:46,293 INFO Request ID is 261c2ca2-1ed6-4200-80e9-2b9e459967d8
2025-07-27 19:37:46,543 INFO status has been updated to accepted
2025-07-27 19:37:55,864 INFO status has been updated to running
2025-07-27 19:39:43,077 INFO status has been updated to successful


80cdd8848bfac48882d09f383c59019c.grib:   0%|          | 0.00/36.4M [00:00<?, ?B/s]

Successfully downloaded to: ../data/wind_data_975hpa/wind_data_2023_07.grib
--- Downloading data for 2023-08 ---


2025-07-27 19:40:10,828 INFO Request ID is 162291cf-ebaa-488c-96cb-a1fd923657cd
2025-07-27 19:40:11,152 INFO status has been updated to accepted
2025-07-27 19:40:25,571 INFO status has been updated to running
2025-07-27 19:42:07,967 INFO status has been updated to successful


76100b1ebf16999ea996577105a1816a.grib:   0%|          | 0.00/36.4M [00:00<?, ?B/s]

Successfully downloaded to: ../data/wind_data_975hpa/wind_data_2023_08.grib
--- Downloading data for 2023-09 ---


2025-07-27 19:42:36,859 INFO Request ID is 6c830629-7609-48cc-8d0c-be85e75d512e
2025-07-27 19:42:37,160 INFO status has been updated to accepted
2025-07-27 19:42:46,886 INFO status has been updated to running
2025-07-27 19:42:52,414 INFO status has been updated to accepted
2025-07-27 19:43:00,300 INFO status has been updated to running
2025-07-27 19:44:34,304 INFO status has been updated to successful


e49e843ee07e0aab1f3f3b4268e9f9b3.grib:   0%|          | 0.00/35.2M [00:00<?, ?B/s]

Successfully downloaded to: ../data/wind_data_975hpa/wind_data_2023_09.grib
--- Downloading data for 2023-10 ---


2025-07-27 19:44:59,298 INFO Request ID is f8bc234a-a4ec-4d01-8569-901af7e164c5
2025-07-27 19:44:59,597 INFO status has been updated to accepted
2025-07-27 19:45:09,015 INFO status has been updated to running
2025-07-27 19:47:55,008 INFO status has been updated to successful


6626eb47bca489a24974c8d97f465829.grib:   0%|          | 0.00/36.4M [00:00<?, ?B/s]

Successfully downloaded to: ../data/wind_data_975hpa/wind_data_2023_10.grib
--- Downloading data for 2023-11 ---


2025-07-27 19:48:09,751 INFO Request ID is a081995f-ce01-4a00-a261-a19b41bef937
2025-07-27 19:48:10,058 INFO status has been updated to accepted
2025-07-27 19:48:15,692 INFO status has been updated to running
2025-07-27 19:50:06,386 INFO status has been updated to successful


715a01ab3c3d1f0ebff08b915d215554.grib:   0%|          | 0.00/35.2M [00:00<?, ?B/s]

Successfully downloaded to: ../data/wind_data_975hpa/wind_data_2023_11.grib
--- Downloading data for 2023-12 ---


2025-07-27 19:50:51,233 INFO Request ID is b24e4f8d-0ab6-416f-88cd-b1fcbdd8c6b1
2025-07-27 19:50:51,648 INFO status has been updated to accepted
2025-07-27 19:51:01,282 INFO status has been updated to running
2025-07-27 19:52:09,880 INFO status has been updated to successful


2b97efcce9c0dbd10e6170a320cbbdd9.grib:   0%|          | 0.00/36.4M [00:00<?, ?B/s]

Successfully downloaded to: ../data/wind_data_975hpa/wind_data_2023_12.grib

--- Automation complete. ---
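
The run above covers a single year, while the stated goal is a 20-year record. One way to extend it, sketched here under assumptions, is to build the monthly requests for a range of years up front and then pass each one to `client.retrieve`. The helper name `build_monthly_requests` and the 2004 to 2023 range are hypothetical. The request fields mirror the cell above.

```python
import os


def build_monthly_requests(start_year, end_year, output_folder):
    """Return (output_path, request) pairs, one per month, years inclusive."""
    jobs = []
    for year in range(start_year, end_year + 1):
        for month in range(1, 13):
            request = {
                "product_type": "reanalysis",
                "format": "grib",
                "variable": ["u_component_of_wind", "v_component_of_wind"],
                "pressure_level": "975",
                "year": str(year),
                "month": f"{month:02d}",
                "day": [f"{d:02d}" for d in range(1, 32)],
                "time": [f"{h:02d}:00" for h in range(24)],
                "area": [6, 95, -11, 141],  # North, West, South, East
            }
            filename = f"wind_data_{year}_{month:02d}.grib"
            jobs.append((os.path.join(output_folder, filename), request))
    return jobs


# ASSUMED 20-year window; adjust to the period actually needed
jobs = build_monthly_requests(2004, 2023, "../data/wind_data_975hpa/")
print(len(jobs))  # 240 (20 years x 12 months)
```

Each pair could then go through the same skip-if-exists and try/except logic used in the single-year loop.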


In [6]:
%pip install pandas xarray cfgrib numpy


Collecting cfgrib
  Downloading cfgrib-0.9.15.0-py3-none-any.whl (48 kB)
     ---------------------------------------- 48.9/48.9 kB 1.2 MB/s eta 0:00:00
Collecting eccodes>=0.9.8
  Downloading eccodes-2.43.0-cp39-cp39-win_amd64.whl (7.0 MB)
     ---------------------------------------- 7.0/7.0 MB 4.1 MB/s eta 0:00:00
Collecting findlibs
  Downloading findlibs-0.1.1-py3-none-any.whl (10 kB)
Installing collected packages: findlibs, eccodes, cfgrib
Successfully installed cfgrib-0.9.15.0 eccodes-2.43.0 findlibs-0.1.1


In [21]:
import os
import glob
import xarray as xr
import pandas as pd
import numpy as np

# --- Configuration ---
INPUT_FOLDER = "../data/wind_data_975hpa/"
VARIABLE_NAMES = ['u', 'v']

# Find all .grib files
grib_files = sorted(glob.glob(os.path.join(INPUT_FOLDER, '*.grib')))

if not grib_files:
    print(f"Error: No .grib files found in '{INPUT_FOLDER}'")
else:
    print(f"Found {len(grib_files)} GRIB files to process.")
    
    all_coarse_dataframes = []
    for file_path in grib_files:
        print(f"Processing and coarsening {os.path.basename(file_path)}...")
        try:
            with xr.open_dataset(file_path, engine='cfgrib') as ds:
                # Aggregate to a 0.5-degree grid (2x2 blocks of the native 0.25-degree ERA5 grid) using coarsen().mean() FIRST
                coarse_ds = ds.coarsen(latitude=2, longitude=2, boundary='trim').mean()
                
                # Convert the much smaller, coarsened dataset to a DataFrame
                df = coarse_ds.to_dataframe()
                all_coarse_dataframes.append(df)
        except Exception as e:
            print(f"!!! Could not process {file_path}. Error: {e}")

    if all_coarse_dataframes:
        # Combine all the coarse DataFrames
        combined_df = pd.concat(all_coarse_dataframes).reset_index()
        
        # Calculate wind speed from the spatially averaged u and v components
        # (this is not the same as averaging the per-cell wind speeds)
        combined_df['wind_speed'] = np.sqrt(combined_df['u']**2 + combined_df['v']**2)

        print("\n--- Processing Complete --- ✅")
        print("Final Coarse DataFrame Head:")
        print(combined_df.head())



Ignoring index file '../data/wind_data_975hpa\\wind_data_2023_01.grib.5b7b6.idx' incompatible with GRIB file


Found 6 GRIB files to process.
Processing and coarsening wind_data_2023_01.grib...
Processing and coarsening wind_data_2023_02.grib...
Processing and coarsening wind_data_2023_03.grib...
Processing and coarsening wind_data_2023_04.grib...
Processing and coarsening wind_data_2023_05.grib...
Processing and coarsening wind_data_2023_06.grib...

--- Processing Complete --- ✅
Final Coarse DataFrame Head:
        time  latitude  longitude         u         v  number   step  \
0 2023-01-01     5.875     95.125 -8.483521 -2.083054       0 0 days   
1 2023-01-01     5.875     95.625 -6.798218 -0.075485       0 0 days   
2 2023-01-01     5.875     96.125 -6.045288 -0.768845       0 0 days   
3 2023-01-01     5.875     96.625 -5.609253 -1.209763       0 0 days   
4 2023-01-01     5.875     97.125 -5.634644 -1.775925       0 0 days   

   isobaricInhPa valid_time  wind_speed  
0          975.0 2023-01-01    8.735516  
1          975.0 2023-01-01    6.798637  
2          975.0 2023-01-01    6.09398
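
To illustrate what `coarsen(latitude=2, longitude=2)` does to the grid, here is a small synthetic sketch. The 4x4 dataset is made up for illustration, with coordinates spaced at 0.25 degrees like the native ERA5 grid. Each 2x2 block is averaged, and the new coordinates are the block centers, which is why the output above shows values such as 5.875 and 95.125.

```python
import numpy as np
import xarray as xr

# Synthetic 4x4 grid at 0.25-degree spacing (illustrative values only)
lat = np.arange(6.0, 5.0, -0.25)   # 6.0, 5.75, 5.5, 5.25
lon = np.arange(95.0, 96.0, 0.25)  # 95.0, 95.25, 95.5, 95.75
u = xr.DataArray(
    np.arange(16, dtype=float).reshape(4, 4),
    coords={"latitude": lat, "longitude": lon},
    dims=("latitude", "longitude"),
)
ds = xr.Dataset({"u": u})

# Average non-overlapping 2x2 blocks; 'trim' drops leftover edge cells
coarse = ds.coarsen(latitude=2, longitude=2, boundary="trim").mean()

print(coarse.latitude.values)   # block-center latitudes
print(coarse.longitude.values)  # block-center longitudes
print(coarse.u.values)          # mean of each 2x2 block
```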

In [20]:
final_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 9497160 entries, 0 to 9497159
Data columns (total 6 columns):
 #   Column      Dtype         
---  ------      -----         
 0   time        datetime64[ns]
 1   latitude    float64       
 2   longitude   float64       
 3   u           float32       
 4   v           float32       
 5   wind_speed  float32       
dtypes: datetime64[ns](1), float32(3), float64(2)
memory usage: 326.1 MB
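
`final_df` is not defined in the cells shown here. Its six columns are a subset of those in `combined_df`, so one plausible way to obtain such a frame is to keep only those columns. This is a guess at the missing step, not the author's confirmed code, and `select_wind_columns` is a hypothetical helper.

```python
import pandas as pd

KEEP_COLUMNS = ["time", "latitude", "longitude", "u", "v", "wind_speed"]


def select_wind_columns(df):
    """Keep only the columns listed in the info() output above."""
    return df[KEEP_COLUMNS].reset_index(drop=True)


# Tiny stand-in for combined_df, with the extra GRIB metadata columns
demo = pd.DataFrame({
    "time": pd.to_datetime(["2023-01-01"]),
    "latitude": [5.875],
    "longitude": [95.125],
    "u": [-8.483521],
    "v": [-2.083054],
    "number": [0],
    "step": [pd.Timedelta(0)],
    "isobaricInhPa": [975.0],
    "valid_time": pd.to_datetime(["2023-01-01"]),
    "wind_speed": [8.735516],
})
demo_final = select_wind_columns(demo)
print(list(demo_final.columns))
```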
