In [1]:
import datetime as dt

import eranest

In [2]:
# eranest.era5ify(request_id, variables, start_date, end_date, json_file, frequency, resolution)
# request_id : str, unique identifier for the request (also used as the prefix of the downloaded files)
# variables : list of str, names of the ERA5 variables to download
# start_date : datetime, first day of the requested period (inclusive)
# end_date : datetime, last day of the requested period (inclusive)
# json_file : str, path to the GeoJSON file describing the area of interest
# frequency : str, one of "hourly", "daily", "weekly", "monthly", "yearly"; optional, defaults to "hourly"
# resolution : float, grid resolution in degrees (e.g. 0.1 or 0.25); optional, defaults to 0.25
#              (the calls below pass it as a string, which also passes input validation)
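
The parameter list above can be turned into a small pre-flight check before any request is sent. The sketch below is hypothetical: `check_era5ify_args` is not part of eranest, and the allowed frequency values are taken only from the comment above. eranest runs its own input validation, whose rules may differ.

```python
import datetime as dt

# Frequencies listed in the comment above; eranest's own list may differ.
ALLOWED_FREQUENCIES = {"hourly", "daily", "weekly", "monthly", "yearly"}


def check_era5ify_args(request_id, variables, start_date, end_date,
                       json_file, frequency="hourly", resolution=0.25):
    """Hypothetical helper: return a list of problems found in the arguments."""
    problems = []
    if not isinstance(request_id, str) or not request_id:
        problems.append("request_id must be a non-empty string")
    if not variables or not all(isinstance(v, str) for v in variables):
        problems.append("variables must be a non-empty list of strings")
    if not (isinstance(start_date, dt.datetime) and isinstance(end_date, dt.datetime)):
        problems.append("start_date and end_date must be datetime objects")
    elif start_date > end_date:
        problems.append("start_date must not be after end_date")
    if not str(json_file).lower().endswith((".json", ".geojson")):
        problems.append("json_file should point to a .json or .geojson file")
    if frequency not in ALLOWED_FREQUENCIES:
        problems.append(f"unknown frequency: {frequency!r}")
    try:
        # float() accepts both 0.25 and "0.25", matching the calls in this notebook.
        if float(resolution) <= 0:
            problems.append("resolution must be positive")
    except (TypeError, ValueError):
        problems.append("resolution must be a number")
    return problems
```

An empty list means the arguments look plausible; it does not guarantee that the CDS API will accept the request.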

In [3]:
df = eranest.era5ify(
    request_id="test",
    variables=[
        "2m_temperature",
        "total_precipitation",
        "surface_pressure",
        "2m_dewpoint_temperature",
    ],
    start_date=dt.datetime(2023, 1, 1),
    end_date=dt.datetime(2023, 1, 31),
    json_file="../data/india.json",
    frequency="daily",
    resolution="0.25",
)

Successfully loaded JSON file with utf-8 encoding
Valid GeoJSON detected: ../data/india.json
✓ CDS API configuration is already set up and valid.

STARTING ERA5 DATA PROCESSING
Request ID: test
Variables: ['2m_temperature', 'total_precipitation', 'surface_pressure', '2m_dewpoint_temperature']
Date Range: 2023-01-01 to 2023-01-31
Frequency: daily
Resolution: 0.25°
GeoJSON File: ../data/india.json

--- Input Validation ---
✓ All inputs validated successfully

--- Loading GeoJSON File ---
Attempting to load: ../data/india.json
  Trying encoding 1/4: utf-8
✓ Successfully loaded GeoJSON file with utf-8 encoding
✓ GeoJSON contains 1 feature(s)

--- Calculating Bounding Box ---
✓ Bounding Box calculated:
  North: 35.4940°
  South: 7.9655°
  East:  97.4026°
  West:  68.1766°
  Area:  29.2259° × 27.5285°
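
The bounding box reported above is the minimum and maximum of all longitudes and latitudes in the GeoJSON, and the "Area" line is consistent with longitude span × latitude span (97.4026 − 68.1766 ≈ 29.226 and 35.4940 − 7.9655 = 27.5285). A minimal sketch of that calculation follows, assuming a FeatureCollection, Feature, or bare geometry; the function names are hypothetical, GeometryCollection is not handled, and eranest's own code may work differently.

```python
def iter_positions(coords):
    """Yield (lon, lat) pairs from arbitrarily nested GeoJSON coordinate arrays."""
    if coords and isinstance(coords[0], (int, float)):
        yield coords[0], coords[1]
    else:
        for part in coords:
            yield from iter_positions(part)


def bounding_box(geojson):
    """Return north/south/east/west for a FeatureCollection, Feature, or geometry."""
    if geojson.get("type") == "FeatureCollection":
        geometries = [feature["geometry"] for feature in geojson["features"]]
    elif geojson.get("type") == "Feature":
        geometries = [geojson["geometry"]]
    else:
        geometries = [geojson]
    lons, lats = [], []
    for geometry in geometries:
        for lon, lat in iter_positions(geometry["coordinates"]):
            lons.append(lon)
            lats.append(lat)
    return {"north": max(lats), "south": min(lats),
            "east": max(lons), "west": min(lons)}


# Toy rectangle standing in for a real boundary file (illustrative values only).
toy = {
    "type": "Feature",
    "geometry": {
        "type": "Polygon",
        "coordinates": [[[68.0, 8.0], [97.0, 8.0], [97.0, 35.0],
                         [68.0, 35.0], [68.0, 8.0]]],
    },
}
box = bounding_box(toy)
print(box)
print(f"Area: {box['east'] - box['west']:.4f}° × {box['north'] - box['south']:.4f}°")
```

For a file on disk, the dictionary would come from `json.load` on the opened GeoJSON file.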

--- Determining Processing Strategy ---
Using monthly dataset: False
Total days to process: 31
Max days per chunk: 14
Needs chunking: True
Will process 3 chunks
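
The strategy above (31 days, at most 14 days per chunk, 3 chunks) can be reproduced with a short date-splitting routine. This is a sketch under assumptions: `split_into_day_chunks` is a hypothetical name, and the exact chunk boundaries eranest uses are not visible in this output.

```python
import datetime as dt


def split_into_day_chunks(start, end, max_days=14):
    """Split the inclusive range [start, end] into chunks of at most max_days days."""
    chunks = []
    chunk_start = start
    while chunk_start <= end:
        chunk_end = min(chunk_start + dt.timedelta(days=max_days - 1), end)
        chunks.append((chunk_start, chunk_end))
        chunk_start = chunk_end + dt.timedelta(days=1)
    return chunks


chunks = split_into_day_chunks(dt.datetime(2023, 1, 1), dt.datetime(2023, 1, 31))
for i, (first, last) in enumerate(chunks, 1):
    days = (last - first).days + 1
    print(f"chunk {i}: {first:%Y-%m-%d} to {last:%Y-%m-%d} "
          f"({days} days, {days * 24} hourly steps)")
```

The split yields chunks of 14, 14, and 3 days, that is 336, 336, and 72 hourly steps, which matches the `valid_time` sizes logged for the three chunks in this run.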

CHUNK 1/3
Processing: 2023-

2025-05-26 22:56:10,913 INFO [2024-09-26T00:00:00] Watch our [Forum](https://forum.ecmwf.int/) for Announcements, news and other discussed topics.
2025-05-26 22:56:11,661 INFO Request ID is e5328113-a911-4497-8003-de28a64298b3
2025-05-26 22:56:11,953 INFO status has been updated to accepted
2025-05-26 22:56:34,431 INFO status has been updated to successful

Download complete: test_chunk1.zip
  ✓ Download completed: test_chunk1.zip
  → Extracting files...
Extracting zip file: test_chunk1.zip
Extracted NetCDF files:
  - /Users/saket/github/eranest/notebooks/test_chunk1/data_stream-oper_stepType-accum.nc
  - /Users/saket/github/eranest/notebooks/test_chunk1/data_stream-oper_stepType-instant.nc
  ✓ Extracted 2 files
  → Processing NetCDF files...
    Processing file 1/2: data_stream-oper_stepType-accum.nc
    ✓ Loaded dataset with shape: {'valid_time': 336, 'latitude': 111, 'longitude': 117}
    Processing file 2/2: data_stream-oper_stepType-instant.nc
    ✓ Loaded dataset with shape: {'valid_time': 336, 'latitude': 111, 'longitude': 117}
  → Merging datasets...
  ✓ Merged dataset shape: {'valid_time': 336, 'latitude': 111, 'longitude': 117}
  → Filtering by shapefile...
Starting optimized filtering process...
→ Extracting unique lat/lon coordinates from dataset...
✓ Found 12987 unique lat/lon combinations
→ Filtering unique coordinates against polygon...
✓ Coordinate filtering completed in 0.16 seconds
  - Points inside: 4446
  - Points outside: 8541
  - Percentage inside: 34.23%
→ Filtering original dataset using inside coordinates...
  Converting dataset to DataFrame...
  ✓ Converted to DataFrame with 4363632 rows
  ✓ Created lookup set with 4446 coordinate pairs
  Filtering DataFrame rows...
  ✓ Filtered from 4363632 to 1493856 rows
✓ Dataset filtering completed in 2.87 seconds
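
The filtering log above describes a two-step approach: test each unique grid point against the polygon once, then keep rows by set lookup. The numbers are consistent with that: 111 × 117 = 12987 unique points, 12987 × 336 time steps = 4363632 rows, and 4446 inside points × 336 = 1493856 rows. The sketch below illustrates the idea with a plain-Python ray-casting test; it is not eranest's implementation, the function names are hypothetical, and holes and multi-part polygons are ignored.

```python
def point_in_ring(lon, lat, ring):
    """Ray-casting test: is (lon, lat) inside the ring of (lon, lat) vertices?"""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            # Count crossings that lie to the east of the point.
            if lon < x_cross:
                inside = not inside
    return inside


def filter_rows(rows, ring):
    """Keep rows whose (latitude, longitude) lies inside the ring."""
    # Step 1: test every unique coordinate once, not once per time step.
    unique_coords = {(row["latitude"], row["longitude"]) for row in rows}
    inside = {(lat, lon) for lat, lon in unique_coords
              if point_in_ring(lon, lat, ring)}
    # Step 2: cheap set lookup for each row.
    return [row for row in rows if (row["latitude"], row["longitude"]) in inside]


# Toy data: a 3 x 3 grid over 2 time steps and a square polygon (illustrative only).
ring = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
rows = [{"valid_time": t, "latitude": lat, "longitude": lon}
        for t in range(2)
        for lat in (-1.0, 1.0, 3.0)
        for lon in (-1.0, 1.0, 3.0)]
kept = filter_rows(rows, ring)
print(f"Filtered from {len(rows)} to {len(kept)} rows")
```

Testing unique coordinates first keeps the expensive geometric test proportional to the grid size rather than to grid size times the number of time steps.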

---

2025-05-26 22:56:49,261 INFO [2024-09-26T00:00:00] Watch our [Forum](https://forum.ecmwf.int/) for Announcements, news and other discussed topics.
2025-05-26 22:56:50,555 INFO Request ID is 0ba79ae4-1c0d-428d-8936-f3ec7a1ed50f
2025-05-26 22:56:50,757 INFO status has been updated to accepted
2025-05-26 22:57:13,270 INFO status has been updated to successful

Download complete: test_chunk2.zip
  ✓ Download completed: test_chunk2.zip
  → Extracting files...
Extracting zip file: test_chunk2.zip
Extracted NetCDF files:
  - /Users/saket/github/eranest/notebooks/test_chunk2/data_stream-oper_stepType-accum.nc
  - /Users/saket/github/eranest/notebooks/test_chunk2/data_stream-oper_stepType-instant.nc
  ✓ Extracted 2 files
  → Processing NetCDF files...
    Processing file 1/2: data_stream-oper_stepType-accum.nc
    ✓ Loaded dataset with shape: {'valid_time': 336, 'latitude': 111, 'longitude': 117}
    Processing file 2/2: data_stream-oper_stepType-instant.nc
    ✓ Loaded dataset with shape: {'valid_time': 336, 'latitude': 111, 'longitude': 117}
  → Merging datasets...
  ✓ Merged dataset shape: {'valid_time': 336, 'latitude': 111, 'longitude': 117}
  → Filtering by shapefile...
Starting optimized filtering process...
→ Extracting unique lat/lon coordinates from dataset...
✓ Found 12987 unique lat/lon combinations
→ Filtering unique coordinates against polygon...

2025-05-26 22:57:28,286 INFO [2024-09-26T00:00:00] Watch our [Forum](https://forum.ecmwf.int/) for Announcements, news and other discussed topics.
2025-05-26 22:57:29,264 INFO Request ID is a9250abb-5f95-4200-8e0e-a3c3feb0c531
2025-05-26 22:57:29,585 INFO status has been updated to accepted
2025-05-26 22:57:38,796 INFO status has been updated to running
2025-05-26 22:57:44,144 INFO status has been updated to successful

Download complete: test_chunk3.zip
  ✓ Download completed: test_chunk3.zip
  → Extracting files...
Extracting zip file: test_chunk3.zip
Extracted NetCDF files:
  - /Users/saket/github/eranest/notebooks/test_chunk3/data_stream-oper_stepType-accum.nc
  - /Users/saket/github/eranest/notebooks/test_chunk3/data_stream-oper_stepType-instant.nc
  ✓ Extracted 2 files
  → Processing NetCDF files...
    Processing file 1/2: data_stream-oper_stepType-accum.nc
    ✓ Loaded dataset with shape: {'valid_time': 72, 'latitude': 111, 'longitude': 117}
    Processing file 2/2: data_stream-oper_stepType-instant.nc
    ✓ Loaded dataset with shape: {'valid_time': 72, 'latitude': 111, 'longitude': 117}
  → Merging datasets...
  ✓ Merged dataset shape: {'valid_time': 72, 'latitude': 111, 'longitude': 117}
  → Filtering by shapefile...
Starting optimized filtering process...
→ Extracting unique lat/lon coordinates from dataset...
✓ Found 12987 unique lat/lon combinations
→ Filtering unique coordinates against polygon...

In [4]:
df = eranest.era5ify(
    request_id="test2",
    variables=["2m_temperature", "total_precipitation"],
    start_date=dt.datetime(2024, 1, 1),
    end_date=dt.datetime(2024, 12, 31),
    json_file="../data/latvia.geojson",
    frequency="monthly",
    resolution="0.1",
)

Successfully loaded JSON file with utf-8 encoding
Valid GeoJSON detected: ../data/latvia.geojson
✓ CDS API configuration is already set up and valid.

STARTING ERA5 DATA PROCESSING
Request ID: test2
Variables: ['2m_temperature', 'total_precipitation']
Date Range: 2024-01-01 to 2024-12-31
Frequency: monthly
Resolution: 0.1°
GeoJSON File: ../data/latvia.geojson

--- Input Validation ---
✓ All inputs validated successfully

--- Loading GeoJSON File ---
Attempting to load: ../data/latvia.geojson
  Trying encoding 1/4: utf-8
✓ Successfully loaded GeoJSON file with utf-8 encoding
✓ GeoJSON contains 1 feature(s)

--- Calculating Bounding Box ---
✓ Bounding Box calculated:
  North: 58.0856°
  South: 55.6776°
  East:  28.2431°
  West:  20.9537°
  Area:  7.2894° × 2.4080°

--- Determining Processing Strategy ---
Using monthly dataset: True
Total months to process: 12
Max months per chunk: 10
Needs chunking: True
Will process 2 chunks
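
For the monthly dataset the same idea applies per calendar month: 12 months with at most 10 months per chunk gives 2 chunks. A minimal sketch, assuming chunks snap to whole months (the day of month in `start` and `end` is ignored); `split_into_month_chunks` is a hypothetical name, not an eranest function.

```python
import calendar
import datetime as dt


def split_into_month_chunks(start, end, max_months=10):
    """Split [start, end] into chunks of at most max_months whole calendar months."""
    # Enumerate the (year, month) pairs covered by the range.
    months = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        months.append((year, month))
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)

    chunks = []
    for i in range(0, len(months), max_months):
        first_year, first_month = months[i]
        last_year, last_month = months[min(i + max_months, len(months)) - 1]
        # monthrange returns (weekday of the 1st, number of days in the month).
        last_day = calendar.monthrange(last_year, last_month)[1]
        chunks.append((dt.datetime(first_year, first_month, 1),
                       dt.datetime(last_year, last_month, last_day)))
    return chunks


month_chunks = split_into_month_chunks(dt.datetime(2024, 1, 1), dt.datetime(2024, 12, 31))
for i, (first, last) in enumerate(month_chunks, 1):
    print(f"chunk {i}: {first:%Y-%m-%d} to {last:%Y-%m-%d}")
```

The first chunk comes out as 2024-01-01 to 2024-10-31, matching the range logged for chunk 1; the second chunk covers the remaining two months, consistent with the `valid_time` size of 2 per file reported for chunk 2.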

CHUNK 1/2
Processing: 2024-01-01 to 2024-10-31
  → Downloa

2025-05-26 22:58:40,968 INFO [2024-09-26T00:00:00] Watch our [Forum](https://forum.ecmwf.int/) for Announcements, news and other discussed topics.
2025-05-26 22:58:42,098 INFO Request ID is d75e9b45-bd62-4400-9937-5c229e8a4ac3
2025-05-26 22:58:42,481 INFO status has been updated to accepted
2025-05-26 22:58:56,946 INFO status has been updated to running
2025-05-26 22:59:16,666 INFO status has been updated to successful

Download complete: test2_chunk1.zip
  ✓ Download completed: test2_chunk1.zip
  → Extracting files...
Extracting zip file: test2_chunk1.zip
Extracted NetCDF files:
  - /Users/saket/github/eranest/notebooks/test2_chunk1/data_0.nc
  - /Users/saket/github/eranest/notebooks/test2_chunk1/data_1.nc
  ✓ Extracted 2 files
  → Processing NetCDF files...
    Processing file 1/2: data_0.nc
    ✓ Loaded dataset with shape: {'valid_time': 10, 'latitude': 25, 'longitude': 73}
    Processing file 2/2: data_1.nc
    ✓ Loaded dataset with shape: {'valid_time': 10, 'latitude': 25, 'longitude': 73}
  → Merging datasets...
  ✓ Merged dataset shape: {'valid_time': 20, 'latitude': 25, 'longitude': 73}
  → Filtering by shapefile...
Starting optimized filtering process...
→ Extracting unique lat/lon coordinates from dataset...
✓ Found 1825 unique lat/lon combinations
→ Filtering unique coordinates against polygon...
✓ Coordinate filtering completed in 0.01 seconds
  - Points inside: 962
  - Points outside: 863

2025-05-26 22:59:24,498 INFO [2024-09-26T00:00:00] Watch our [Forum](https://forum.ecmwf.int/) for Announcements, news and other discussed topics.
2025-05-26 22:59:25,387 INFO Request ID is 99db9761-26b3-44db-aebe-60c2f0dd5f72
2025-05-26 22:59:25,594 INFO status has been updated to accepted
2025-05-26 22:59:40,095 INFO status has been updated to running
2025-05-26 22:59:47,898 INFO status has been updated to successful

Download complete: test2_chunk2.zip
  ✓ Download completed: test2_chunk2.zip
  → Extracting files...
Extracting zip file: test2_chunk2.zip
Extracted NetCDF files:
  - /Users/saket/github/eranest/notebooks/test2_chunk2/data_0.nc
  - /Users/saket/github/eranest/notebooks/test2_chunk2/data_1.nc
  ✓ Extracted 2 files
  → Processing NetCDF files...
    Processing file 1/2: data_0.nc
    ✓ Loaded dataset with shape: {'valid_time': 2, 'latitude': 25, 'longitude': 73}
    Processing file 2/2: data_1.nc
    ✓ Loaded dataset with shape: {'valid_time': 2, 'latitude': 25, 'longitude': 73}
  → Merging datasets...
  ✓ Merged dataset shape: {'valid_time': 4, 'latitude': 25, 'longitude': 73}
  → Filtering by shapefile...
Starting optimized filtering process...
→ Extracting unique lat/lon coordinates from dataset...
✓ Found 1825 unique lat/lon combinations
→ Filtering unique coordinates against polygon...
✓ Coordinate filtering completed in 0.02 seconds
  - Points inside: 962
  - Points outside: 863
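
As a cross-check on both runs, the logged grid shapes can be estimated from the bounding box and the resolution. The formula below assumes a regular grid that starts at one edge of the bounding box and steps by the resolution; that assumption reproduces the shapes logged here, but the grid the CDS actually returns may be aligned differently for other areas.

```python
import math


def grid_points(low, high, resolution):
    """Points on a regular grid that starts at one edge and steps by `resolution`."""
    return math.floor((high - low) / resolution) + 1


# Bounding boxes and resolutions copied from the two logs in this notebook.
india = (grid_points(7.9655, 35.4940, 0.25), grid_points(68.1766, 97.4026, 0.25))
latvia = (grid_points(55.6776, 58.0856, 0.1), grid_points(20.9537, 28.2431, 0.1))

print("India  (lat, lon):", india, "unique points:", india[0] * india[1])
print("Latvia (lat, lon):", latvia, "unique points:", latvia[0] * latvia[1])
```

This gives 111 × 117 = 12987 points for India and 25 × 73 = 1825 points for Latvia, the same counts as the "unique lat/lon combinations" lines; for Latvia, 962 inside plus 863 outside also sums to 1825.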
  

