# Creating a Cutout with the SARAH-2 dataset

This walkthrough describes the process of creating a cutout using the [SARAH-2 dataset by EUMETSAT](https://wui.cmsaf.eu/safira/action/viewDoiDetails?acronym=SARAH_V002_01).

The SARAH-2 dataset contains extensive information on solar radiation variables, such as surface incoming direct radiation (SID) or surface incoming shortwave radiation (SIS).
It supplements the ERA5 dataset, so `cdsapi` must be set up properly as well.
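
As a quick pre-flight check, the sketch below looks for CDS API credentials without contacting the server. It assumes the usual `cdsapi` conventions (a `~/.cdsapirc` file containing `url:` and `key:` lines, or the `CDSAPI_URL` and `CDSAPI_KEY` environment variables, with `CDSAPI_RC` pointing to an alternative file). The function name `cdsapi_credentials_found` is hypothetical, and the check only confirms that credentials are present, not that they are valid.

```python
import os
from pathlib import Path


def cdsapi_credentials_found(rc_path=None, environ=None):
    """Return True if CDS API credentials appear to be configured.

    This is only a presence check; it does not validate the key.
    """
    environ = os.environ if environ is None else environ

    # Credentials supplied through environment variables
    if environ.get("CDSAPI_URL") and environ.get("CDSAPI_KEY"):
        return True

    # Otherwise look for the configuration file (assumed default: ~/.cdsapirc)
    if rc_path is None:
        rc_path = environ.get("CDSAPI_RC") or Path.home() / ".cdsapirc"
    rc_path = Path(rc_path)
    if not rc_path.is_file():
        return False

    # Collect the keys of all "name: value" lines in the file
    keys = set()
    for line in rc_path.read_text().splitlines():
        if ":" in line:
            keys.add(line.split(":", 1)[0].strip())
    return {"url", "key"} <= keys
```

Calling `cdsapi_credentials_found()` with no arguments checks the current environment and the default file location.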

> **Recommendation**
>
> This is a condensed walkthrough of cutout creation. Creating cutouts with ERA5 is simpler and explained in more detail.
> We therefore recommend you have a look at [this example first](https://atlite.readthedocs.io/en/latest/examples/create_cutout.html).

> **Note**:
>
> Creating a cutout from this dataset requires downloading large files, and your computer's memory needs to be able to handle them.

## Downloading the data set

To download the dataset, head to the EUMETSAT website (the link points to the current 2.1 edition):

https://wui.cmsaf.eu/safira/action/viewDoiDetails?acronym=SARAH_V002_01 

At the bottom of the page, select the products you want to include in the cutout. For this walkthrough:

| Variable | Time span | Time resolution |
| --- | --- | --- |
| Surface incoming direct radiation (SID) | 2013 | Instantaneous |
| Surface incoming shortwave radiation (SIS) | 2013 | Instantaneous |

* Add each product to your cart and register with the website.
* Follow the instructions to activate your account, confirm your order and wait for the download to be ready.
* You will be notified by email with the download instructions.
* Download the ordered files into a directory, e.g. `sarah-2`.
* Extract the `tar` files into the same folder (e.g. on Linux with `for f in *.tar; do tar -xvf "$f"; done`, since `tar -xvf *` reads only the first archive when several are present, or on Windows with `7zip`).
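
The extraction step can also be done from Python, which works the same way on Linux and Windows. The following is a minimal sketch using only the standard library; the helper name `extract_archives` is hypothetical, and it assumes the downloaded archives end in `.tar` and come from a trusted source.

```python
import tarfile
from pathlib import Path


def extract_archives(directory):
    """Extract every .tar archive found in `directory` into that same directory.

    Returns the names of the archives that were extracted.
    """
    directory = Path(directory)
    extracted = []
    for archive in sorted(directory.glob("*.tar")):
        # Each archive is opened and unpacked on its own
        with tarfile.open(archive) as tar:
            tar.extractall(path=directory)
        extracted.append(archive.name)
    return extracted
```

For the directory used above, the call would be `extract_archives("sarah-2")`.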

You are now ready to create cutouts using the SARAH-2 dataset.
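
Since both SID and SIS were ordered, it can be worth checking that the extracted files cover the same days for both products. The sketch below is an illustration only: it assumes file names follow the pattern seen in this download (for example `SIDin201301010000004UD1000101UD.nc`, i.e. product code, `in`, then the date as `YYYYMMDD`), and the helper names are hypothetical.

```python
import re
from collections import defaultdict

# Assumed naming scheme: product code (SID or SIS), "in", then YYYYMMDD
FILENAME_PATTERN = re.compile(r"^(SID|SIS)in(\d{8})")


def days_by_product(filenames):
    """Group the days found in `filenames` by product code."""
    days = defaultdict(set)
    for name in filenames:
        match = FILENAME_PATTERN.match(name)
        if match:  # files that do not follow the pattern are ignored
            product, day = match.groups()
            days[product].add(day)
    return dict(days)


def missing_days(filenames):
    """Return, for each product, the days that only the other product has."""
    days = days_by_product(filenames)
    sid = days.get("SID", set())
    sis = days.get("SIS", set())
    return {"SID": sorted(sis - sid), "SIS": sorted(sid - sis)}
```

Passing the plain file names of the data directory, e.g. `missing_days(os.listdir("sarah-2"))` after `import os`, returns two empty lists when both products cover the same days.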

## Specifying the cutout

Import the package and set recommended logging settings:

In [None]:
import logging

import atlite
import cdsapi
import xarray as xr

logging.basicConfig(level=logging.INFO)

# Instantiating the client fails early if the CDS API configuration is missing
c = cdsapi.Client()

2025-02-05 09:51:35,048 INFO [2024-09-26T00:00:00] Watch our [Forum](https://forum.ecmwf.int/) for Announcements, news and other discussed topics.
INFO:datapi.legacy_api_client:[2024-09-26T00:00:00] Watch our [Forum](https://forum.ecmwf.int/) for Announcements, news and other discussed topics.


In [13]:
import os
import glob
import xarray as xr

# Original directory containing your SARAH NetCDF files
sarah_dir = r"C:/Users/marta/Desktop/Thesis/Climate-Change-Impacted-Solar-Energy-Generation/atlite examples/sarah2"

# New directory where fixed files will be saved
sarah_dir_fixed = r"C:/Users/marta/Desktop/Thesis/Climate-Change-Impacted-Solar-Energy-Generation/atlite examples/sarah2_fixed"

# Ensure the new directory exists
os.makedirs(sarah_dir_fixed, exist_ok=True)

# Pattern to find all .nc files in the original directory
file_pattern = os.path.join(sarah_dir, "*.nc")

# Loop over each NetCDF file in the directory
for file_path in glob.glob(file_pattern):
    print(f"Processing: {file_path}")
    
    # Open the dataset
    ds = xr.open_dataset(file_path)
    
    # Update or set the 'timefreq' attribute to "1h"
    ds.attrs["timefreq"] = "1h"
    
    # Optional resampling if needed
    # ds = ds.resample(time="1H").nearest()

    # Build a new filename in the 'sarah2_fixed' directory
    # e.g. if file_path = ".../sarah2/file1.nc", 
    # we create ".../sarah2_fixed/file1_fixed.nc"
    base_name = os.path.splitext(os.path.basename(file_path))[0]
    new_file_path = os.path.join(sarah_dir_fixed, base_name + "_fixed.nc")

    # Save the modified dataset
    ds.to_netcdf(new_file_path)
    
    # Close the dataset
    ds.close()
    
    print(f"Saved fixed file to: {new_file_path}")

print("All NetCDF files processed!")


Processing: C:/Users/marta/Desktop/Thesis/Climate-Change-Impacted-Solar-Energy-Generation/atlite examples/sarah2\SIDin201301010000004UD1000101UD.nc
Saved fixed file to: C:/Users/marta/Desktop/Thesis/Climate-Change-Impacted-Solar-Energy-Generation/atlite examples/sarah2_fixed\SIDin201301010000004UD1000101UD_fixed.nc
Processing: C:/Users/marta/Desktop/Thesis/Climate-Change-Impacted-Solar-Energy-Generation/atlite examples/sarah2\SIDin201301020000004UD1000101UD.nc
Saved fixed file to: C:/Users/marta/Desktop/Thesis/Climate-Change-Impacted-Solar-Energy-Generation/atlite examples/sarah2_fixed\SIDin201301020000004UD1000101UD_fixed.nc
Processing: C:/Users/marta/Desktop/Thesis/Climate-Change-Impacted-Solar-Energy-Generation/atlite examples/sarah2\SIDin201301030000004UD1000101UD.nc
Saved fixed file to: C:/Users/marta/Desktop/Thesis/Climate-Change-Impacted-Solar-Energy-Generation/atlite examples/sarah2_fixed\SIDin201301030000004UD1000101UD_fixed.nc
Processing: C:/Users/marta/Desktop/Thesis/Climate

In [15]:
ds

In [19]:
cutout = atlite.Cutout(
    path="europe-2013-01.nc",
    module=["sarah", "era5"],
    sarah_dir="C:/Users/marta/Desktop/Thesis/Climate-Change-Impacted-Solar-Energy-Generation/atlite examples/sarah2_fixed",
    x=slice(-13.6913, 1.7712),
    y=slice(49.9096, 60.8479),
    time="2013-01",
)
print(f"Time resolution (dt): {cutout.dt}")

INFO:atlite.cutout:Building new cutout europe-2013-01.nc


Time resolution (dt): H


AttributeError: 'Cutout' object has no attribute 'tmpdir'

In [17]:
cutout.prepare()

INFO:atlite.data:Storing temporary files in C:\Users\marta\AppData\Local\Temp\tmpqkmc3xg2
INFO:atlite.data:Calculating and writing with module sarah:


AssertionError: 

In [7]:
# Open the NetCDF file
file_path = "europe-2013-01.nc"
ds = xr.open_dataset(file_path)

# Check current attributes
print(ds.attrs)

# Modify the time resolution attribute
ds.attrs["dt"] = "h"  

# Save the updated dataset
updated_file_path = "europe-2013-01-updated.nc"
ds.to_netcdf(updated_file_path)
ds.close()

print("Updated NetCDF file saved.")

FileNotFoundError: [Errno 2] No such file or directory: b'c:\\Users\\marta\\Desktop\\Thesis\\Climate-Change-Impacted-Solar-Energy-Generation\\atlite examples\\europe-2013-01.nc'

Let's see which features, that is, which weather data variables, are available.

## Preparing the Cutout

No matter which dataset you use, this is where all the work actually happens.
Depending on your computer's resources (especially memory for SARAH-2), this step
can be fast or take considerable time.
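
To get a feeling for the memory involved, the raw size of one variable can be estimated from the bounding box and the number of time steps. This is a back-of-the-envelope sketch, not something atlite computes: it assumes the SARAH-2 grid spacing of 0.05°, 30-minute instantaneous time steps and 4-byte floating point values, and it ignores compression and any regridding. The function name `estimate_raw_size_bytes` is hypothetical.

```python
import math


def estimate_raw_size_bytes(x_min, x_max, y_min, y_max, n_steps,
                            resolution_deg=0.05, bytes_per_value=4):
    """Estimate the uncompressed size of one variable over a bounding box."""
    # Number of grid cells along each axis, rounded up
    n_x = math.ceil((x_max - x_min) / resolution_deg)
    n_y = math.ceil((y_max - y_min) / resolution_deg)
    return n_x * n_y * n_steps * bytes_per_value


# Bounding box of the cutout above; January has 31 days with 48 half-hourly steps each
size = estimate_raw_size_bytes(-13.6913, 1.7712, 49.9096, 60.8479, n_steps=31 * 48)
print(f"Estimated raw size per variable: {size / 1e6:.0f} MB")
```

Multiplying by the number of variables (here SID and SIS) gives a lower bound on the data volume that has to be read.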

In [5]:
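# Note: `dt` is a read-only property of the cutout, so this assignment
# raises the AttributeError shown in the output
cutout.dt = "h"
cutout.prepare()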

AttributeError: property 'dt' of 'Cutout' object has no setter

Querying the cutout gives us basic information on which data is contained and can already be used.

## Inspecting the Cutout

In [None]:
cutout  # basic information

In [None]:
cutout.data.attrs  # cutout meta data

In [None]:
cutout.prepared_features  # included weather variables

In [None]:
cutout.data  # access to underlying xarray data