# Debug ERA5-Land Download

This notebook tests the `download_era5_land_data` function from `climate_downloader.py` to verify that it downloads and processes ERA5-Land data correctly, and to find out why the pipeline reports a failure even though the output file is created.

## Objectives
1. Test the download function with the same parameters as the pipeline.
2. Verify the downloaded NetCDF file's validity and contents.
3. Confirm the return value of the function.
4. Suggest fixes for the pipeline.


In [None]:
import sys
import os
from pathlib import Path
import xarray as xr
from calendar import monthrange

# Add project root to PYTHONPATH
project_root = Path.cwd().parent
sys.path.insert(0, str(project_root))

# Import required modules
try:
    from src.config import settings
    from src.utils import paths
    from src.data.climate_downloader import download_era5_land_data
except ModuleNotFoundError as e:
    print(f"❌ Import error: {e}")
    print("Ensure the project structure includes src/config/settings.py, src/utils/paths.py, and src/data/climate_downloader.py")
    raise

# Create project directories
paths.create_project_dirs()

# Define test parameters (same as in terminal output)
job_id = "analysis_-22.818_-47.069_1754078650"
output_path = paths.RAW_CLIMATE_DIR / f"{job_id}_era5.nc"
area_cds = [-22.40409587125399, -47.510241818393816, -23.212575928745903, -46.63318158160609]  # [N, W, S, E]
year = "2024"
month = "07"
days = [str(d).zfill(2) for d in range(1, monthrange(int(year), int(month))[1] + 1)]
variables = ["total_precipitation", "2m_temperature"]
time = ["00:00", "12:00"]

print("📋 Test parameters:")
print(f"Output path: {output_path}")
print(f"Area (N/W/S/E): {area_cds}")
print(f"Year: {year}, Month: {month}, Days: {days}")
print(f"Variables: {variables}")
print(f"Time: {time}")


In [None]:
# Run the download
print("🚀 Starting ERA5-Land download test...")
result = download_era5_land_data(
    variables=variables,
    year=year,
    month=month,
    days=days,
    time=time,
    area=area_cds,
    output_path=output_path
)

print(f"\n📈 Download function returned: {result}")


In [None]:
# Verify the downloaded file
if output_path.exists():
    print(f"✅ File exists at: {output_path}")
    file_size = output_path.stat().st_size / 1024  # Size in KB
    print(f"📦 File size: {file_size:.1f} KB")

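    # ERA5-Land NetCDF files usually store variables under short names,
    # so map the CDS request names to them (adjust if your file differs).
    short_names = {"total_precipitation": "tp", "2m_temperature": "t2m"}

    # Try to open the NetCDF file; the context manager closes it even on error
    try:
        with xr.open_dataset(output_path) as ds:
            print("\n📊 NetCDF file contents:")
            print(f"Dimensions: {dict(ds.sizes)}")
            print(f"Coordinates: {list(ds.coords)}")
            print(f"Variables: {list(ds.data_vars)}")

            # Check each variable under its request name or its short name
            for var in variables:
                name = var if var in ds.data_vars else short_names.get(var)
                if name in ds.data_vars:
                    print(f"\n🔍 Variable {var} (stored as {name}):")
                    print(f"Shape: {ds[name].shape}")
                    print(f"Mean value: {float(ds[name].mean()):.4f}")
                else:
                    print(f"⚠️ Variable {var} not found in dataset")
    except Exception as e:
        print(f"❌ Error opening NetCDF file: {e}")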
else:
    print(f"❌ File not found: {output_path}")


## Expected Outcomes
- If the download succeeds and the file is valid, the output shows the file size (about 53.8 KB, as in the terminal output) and the NetCDF contents (dimensions, coordinates, variables).
- If the function returns `None` even though the file exists, the pipeline's failure report most likely comes from `safe_execute` treating the `None` return value as a failure, not from the download itself.
- If the file is corrupt or empty, the `xr.open_dataset` call will fail, which points to a deeper problem in the download or decompression step.
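
When `xr.open_dataset` fails, it can help to check what kind of file was actually written before blaming the download. The sketch below inspects the first bytes of the file and reports whether it looks like NetCDF classic, NetCDF-4 (HDF5), or a ZIP archive that was never extracted. The helper name `sniff_format` and the labels it returns are illustrative and not part of `climate_downloader.py`.

```python
from pathlib import Path

# Leading bytes ("magic numbers") of the container formats relevant here.
SIGNATURES = {
    b"CDF": "netcdf3",                      # NetCDF classic / 64-bit offset
    b"\x89HDF\r\n\x1a\n": "netcdf4-hdf5",   # NetCDF-4 is stored as HDF5
    b"PK\x03\x04": "zip",                   # ZIP archive, still needs extracting
}


def sniff_format(path):
    """Return a rough format label based on the file's first bytes.

    Hypothetical helper for debugging; it does not validate the full file.
    """
    path = Path(path)
    if not path.exists():
        return "missing"
    with path.open("rb") as fh:
        head = fh.read(8)
    if not head:
        return "empty"
    for magic, label in SIGNATURES.items():
        if head.startswith(magic):
            return label
    return "unknown"
```

Calling `sniff_format(output_path)` in a notebook cell gives a quick hint: a `"zip"` result suggests the decompression step did not run, while `"empty"` or `"unknown"` suggests the download itself went wrong.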

## Fix for the Pipeline
To resolve the issue, modify `download_era5_land_data` in `climate_downloader.py` to return the output path. Add this line at the end of the `try` block (just before the `except` clause):

```python
return output_path
```

This ensures `safe_execute` receives a non-`None` value, preventing the warning. Here’s the updated function ending:

```python
        # Verify final file
        if output_path.exists():
            final_size = output_path.stat().st_size
            print(f"✅ Arquivo final: {final_size / 1024:.1f} KB")
        else:
            raise FileNotFoundError(f"Arquivo final não encontrado: {output_path}")
        
        print(f"🎉 Download completo! Arquivo salvo em: {output_path}")
        return output_path  # Add this line

    except Exception as e:
        print(f"❌ Falha ao baixar os dados do ERA5-Land: {e}")
        ...
```

After applying this fix, rerun the pipeline to confirm the warning disappears.
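
The reasoning behind the fix can be illustrated with a small stand-in for `safe_execute`. The real wrapper's implementation is not shown in this notebook, so the version below is an assumption: it warns whenever the wrapped function returns `None`. The two `download_*` functions are hypothetical placeholders for the function before and after the fix.

```python
import warnings


def safe_execute(func, *args, **kwargs):
    """Hypothetical stand-in for the pipeline's wrapper; the real one may differ."""
    try:
        result = func(*args, **kwargs)
    except Exception as exc:
        warnings.warn(f"{func.__name__} raised: {exc}")
        return None
    if result is None:
        # An implicit `return None` looks exactly like a failure to the caller.
        warnings.warn(f"{func.__name__} returned None; treating as failure")
    return result


def download_without_return(output_path):
    # Placeholder: does its work, then falls off the end of the function.
    pass


def download_with_return(output_path):
    # Placeholder: same work, but reports success by returning the path.
    return output_path


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    before = safe_execute(download_without_return, "out.nc")
    after = safe_execute(download_with_return, "out.nc")

print(f"before fix: {before!r}, after fix: {after!r}")
for w in caught:
    print(f"warning: {w.message}")
```

Under this assumption, only the version without an explicit `return` triggers the warning, which matches the behavior described above.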
