**<center>Georeferencing CMIP6 data and extracting an area of interest (AOI)</center>**

*This script was developed on Python 3.9.19.*

*Ensure that the following packages are installed: pandas, xarray, numpy. xarray also needs a NetCDF backend (for example netCDF4) to read and write .nc files.*

*Feel free to optimise this script and share your improvements.*

**1. Importing libraries**

In [1]:
import xarray as xr
import numpy as np

**2. Defining the function to be used for georeferencing and extracting the AOI**

*The function below georeferences the dataset (it converts the longitudes from the 0 to 360 format to the -180 to 180 format and sorts them), then extracts the area of interest (AOI). It takes the following parameters:*
    
*1. input_file: the path to the CMIP6 NetCDF file you want to process*
    
*2. aoi: the list containing the geographic coordinates of your AOI, in decimal degrees. The list must contain the values in the following order: [lat_max, lat_min, lon_max, lon_min]*
    
*3. output_file: the path where the processed file will be saved*
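
*Because the order of the values in aoi is easy to get wrong, a small check before calling the function can help. The sketch below is only an illustration: validate_aoi is a hypothetical helper that is not part of the original script, and it assumes the AOI does not cross the 180° meridian.*

```python
def validate_aoi(aoi):
    """Hypothetical helper: check that aoi follows [lat_max, lat_min, lon_max, lon_min]."""
    if len(aoi) != 4:
        raise ValueError("aoi must contain exactly four values")

    lat_max, lat_min, lon_max, lon_min = aoi

    # Latitudes must be ordered and stay within -90 to 90
    if not (-90 <= lat_min <= lat_max <= 90):
        raise ValueError("expected -90 <= lat_min <= lat_max <= 90")

    # Longitudes must be ordered and use the -180 to 180 format
    # (assumption: the AOI does not cross the 180 degree meridian)
    if not (-180 <= lon_min <= lon_max <= 180):
        raise ValueError("expected -180 <= lon_min <= lon_max <= 180")

    return lat_max, lat_min, lon_max, lon_min


# The AOI used later in this notebook passes the check
validate_aoi([15, -3, 3, -14])
```

*A list given in the wrong order, such as [-3, 15, 3, -14], raises a ValueError instead of silently producing an empty clip.*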

In [2]:

def georef_clip(input_file, aoi, output_file):

    dataset = xr.open_dataset(input_file) # Opening the CMIP6 dataset

    # Georeferencing step

    # Changing the 0-360 longitude format to the -180 to 180 format

    georef_data = dataset.assign_coords(lon=np.where(dataset["lon"] > 180, dataset["lon"] - 360, dataset["lon"]))

    georef_data = georef_data.sortby("lon") # Sorting the data by longitude values

    # Extracting the area of interest (AOI)

    # Sorting by latitude so that the slice below also works when latitudes are stored from north to south

    georef_data = georef_data.sortby("lat")

    # Finding the indices of the grid coordinates closest to the provided AOI

    north = np.argmin(np.abs(georef_data["lat"].values - aoi[0]))

    south = np.argmin(np.abs(georef_data["lat"].values - aoi[1]))

    east = np.argmin(np.abs(georef_data["lon"].values - aoi[2]))

    west = np.argmin(np.abs(georef_data["lon"].values - aoi[3]))

    # Getting the coordinate values at those indices

    lon_min = georef_data["lon"].values[west]

    lon_max = georef_data["lon"].values[east]

    lat_min = georef_data["lat"].values[south]

    lat_max = georef_data["lat"].values[north]

    # Clipping the data using the grid coordinates that are closest to the provided AOI

    clipped_data = georef_data.sel(lon=slice(lon_min, lon_max), lat=slice(lat_min, lat_max))

    # Saving the result

    clipped_data.to_netcdf(output_file)

    print(f"Process completed. Data saved to {output_file}")
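
*To make the two main steps of the function easier to follow, the sketch below applies them to a plain NumPy array. The 2.5-degree longitude axis is a hypothetical grid chosen for illustration and is not read from any CMIP6 file.*

```python
import numpy as np

# Hypothetical 2.5-degree longitude axis in the 0-360 format
lon = np.arange(0.0, 360.0, 2.5)

# Same conversion as in georef_clip: values above 180 are shifted by -360,
# then the axis is sorted from west to east
lon_wrapped = np.sort(np.where(lon > 180, lon - 360, lon))

# Nearest grid longitudes to the requested western (-14) and eastern (3) limits
west = lon_wrapped[np.argmin(np.abs(lon_wrapped - (-14)))]
east = lon_wrapped[np.argmin(np.abs(lon_wrapped - 3))]

print(west, east)  # nearest grid points on this grid: -15.0 and 2.5
```

*Because the limits are snapped to the nearest grid points, the clipped area can be slightly larger or smaller than the requested AOI. On this grid the western limit moves from -14 to -15.*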


**3. Georeferencing and extracting the AOI**

In [5]:
# Setting the parameters to be used by the function

cmip6_file = "C:/Users/ilung/Documents/georef_cmip6/uas_day_IPSL-CM6A-LR_ssp245_r1i1p1f1_gr_20150101-21001231.nc" # CMIP6 file

aoi = [15, -3, 3, -14] # The list containing the AOI bounding box. Must use the following format [lat_max, lat_min, lon_max, lon_min]

output_file = "C:/Users/ilung/Documents/georef_cmip6/uas_day_clip.nc" # Output file

# Launching the function to georeference and extract the AOI

georef_clip(cmip6_file, aoi, output_file)
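
*To try the logic without a large CMIP6 file, a small synthetic dataset can be used. The sketch below is not the function above but an in-memory variant under stated assumptions: the 10-degree grid and the zero-filled "uas" variable are made up for illustration, the longitudes are wrapped with a modulo expression through assign_coords, and the data are clipped directly to the AOI limits rather than to the nearest grid points.*

```python
import numpy as np
import xarray as xr

# Synthetic 10-degree grid with longitudes in the 0-360 format (assumed layout)
lat = np.arange(-90, 91, 10)
lon = np.arange(0, 360, 10)
ds = xr.Dataset(
    {"uas": (("lat", "lon"), np.zeros((lat.size, lon.size)))},  # placeholder values
    coords={"lat": lat, "lon": lon},
)

# Wrapping the longitudes to the -180 to 180 format, then sorting them
# (note: with this expression a longitude of exactly 180 becomes -180)
ds = ds.assign_coords(lon=((ds["lon"] + 180) % 360) - 180).sortby("lon")

# Clipping with the AOI used above: [lat_max, lat_min, lon_max, lon_min]
aoi = [15, -3, 3, -14]
clipped = ds.sel(lat=slice(aoi[1], aoi[0]), lon=slice(aoi[3], aoi[2]))

print(clipped["lat"].values, clipped["lon"].values)
```

*With label-based slices only the grid points that fall inside the AOI are kept, so the result never extends beyond the requested box.*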