# CHELSA Climate Data Processing

This script processes monthly climate data from the CHELSA dataset. It extracts the data for specific geographic regions and reprojects them using the arcpy library. The script performs the following tasks:

1. **Importing Packages**: Import necessary packages including arcpy for geospatial processing.

2. **Setting Up Workspace**: Set up the arcpy workspace and configure overwrite behavior for outputs.

3. **Folders and Paths**: Specify the folders and paths for input and output data, as well as mask shapefiles for the regions of interest (Europe and Switzerland).

4. **Data Elaboration - Europe Extraction and Projection**:
   - Iterate through the specified climate data folders (e.g., "pr", "rsds") for Europe.
   - Extract and project data for each file using the `ExtractByMask` function and the `ProjectRaster` tool. The data are projected from WGS84 to the ETRS 1989 LAEA (EPSG:3035) coordinate system.
   - Save processed data in the corresponding output folder for Europe.

5. **Data Elaboration - Switzerland Extraction and Projection**:
   - Iterate through the processed data folders for Europe.
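   - Extract data for Switzerland with `ExtractByMask`. Projection from ETRS 1989 LAEA (EPSG:3035) to the CH1903+ / LV95 coordinate system is the intended final step, but the Switzerland cell as written only extracts and saves the rasters; it does not call `ProjectRaster`.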
   - Save processed data in the corresponding output folder for Switzerland.

6. **Execution and Timing**: The script provides progress updates and measures the time taken for processing.
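
The folder and file naming used in steps 4 and 5 can be sketched without arcpy. The helper names `europe_output_path` and `switzerland_output_path` and the file name `example.tif` are hypothetical, introduced only for illustration; the folder layout follows the notebook cells.

```python
import os


def europe_output_path(workspace, folder, file_name):
    """Build the Europe output path for one original CHELSA file.

    Mirrors the notebook layout: <workspace>/Chelsa_V2_Monthly/
    Output_Europe_<folder>/Europe_<file_name>.
    """
    return os.path.join(workspace, "Chelsa_V2_Monthly",
                        f"Output_Europe_{folder}", f"Europe_{file_name}")


def switzerland_output_path(workspace, folder, europe_file_name):
    """Build the Switzerland output path from a Europe output file name.

    Only the file name is rewritten ("Europe" -> "Switzerland"),
    as in the Switzerland cell of the notebook.
    """
    swiss_name = europe_file_name.replace("Europe", "Switzerland")
    return os.path.join(workspace, "Chelsa_V2_Monthly",
                        f"Output_Switzerland_{folder}", swiss_name)
```

Keeping the naming in small functions like these makes it easier to check where a given input file will end up before any raster is processed.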

## Prerequisites

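1. **ArcGIS Software**: This script requires arcpy, which is part of the ArcGIS software suite. Ensure you have a compatible version of ArcGIS installed. The script also checks out the Spatial Analyst extension (`arcpy.CheckOutExtension("Spatial")`), which `arcpy.sa.ExtractByMask` needs, so a Spatial Analyst license must be available.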
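
A quick way to confirm that arcpy is importable in the active Python environment, before running any cell, might look like the sketch below. The helper name `module_available` is hypothetical; it relies only on the standard library and does not check licensing.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


# Assumption: the notebook kernel is the ArcGIS Python environment.
if not module_available("arcpy"):
    print("arcpy was not found; start Jupyter from the ArcGIS Python environment.")
```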

## Usage

1. **Open the Jupyter Notebook**: Launch your Jupyter Notebook environment.

2. **Configuration**: Modify the script's parameters, such as workspace paths and mask shapefile paths, to match your setup.

3. **Run the Notebook Cells**: Execute the notebook cells sequentially by clicking on each cell and pressing Shift + Enter. Make sure to run the cells in the correct order.

4. **Output**: Processed climate data files will be saved in the specified output folders for both Europe and Switzerland regions.
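
Before running the processing cells, it can help to confirm that the configured paths exist. The following is a minimal sketch under the assumption that each configured location is a plain filesystem path; the function name `missing_paths` and the dictionary labels are hypothetical.

```python
import os


def missing_paths(named_paths):
    """Return the labels of all configured paths that do not exist on disk."""
    return [label for label, path in named_paths.items()
            if not os.path.exists(path)]
```

In the notebook this could be called with the workspace and the two mask shapefile paths, for example `missing_paths({"workspace": env.workspace, "Europe mask": path_mask_Europe, "Switzerland mask": path_mask_Switzerland})`, stopping early if the returned list is not empty.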

## Notes

- This script assumes that you have the necessary data, including input climate data files and mask shapefiles, in the specified directories.
- Make sure to review and adjust the geographic extent and coordinate systems in the `analysis_extent` and `out_coor_system` parameters based on your dataset and requirements.
- Always verify the output data's accuracy and correctness after running the script.
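
One simple completeness check, as a sketch only, is to compare the input file names against the names found in the output folder. The function `find_missing_outputs` is hypothetical and checks only that an output file exists for each input `.tif`; it does not validate raster values, extent, or coordinate system.

```python
def find_missing_outputs(input_names, output_names, prefix="Europe_"):
    """List input .tif files that have no '<prefix><name>' file among the outputs."""
    produced = set(output_names)
    return [name for name in input_names
            if name.lower().endswith(".tif")
            and f"{prefix}{name}" not in produced]
```

The two name lists could be collected with `os.scandir`, in the same way the processing cells list their input files.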

## Author

Script written by Luca Ferrari

Contact: luca.ferrari@usys.ethz.ch

For inquiries or assistance, please contact the author.

**Note:** Ensure compliance with data usage agreements and copyrights when processing and using external datasets.

This README content was generated with the assistance of an AI language model from OpenAI. The provided content is based on user input and has been tailored to the specific requirements of the project.

In [None]:
# %% Import packages
import arcpy
from arcpy import env
import os
import time
# Check out the Spatial Analyst extension, required by arcpy.sa.ExtractByMask
arcpy.CheckOutExtension("Spatial")

In [None]:
# %% Define workspace
env.workspace = r"N:\Luca_data"
env.overwriteOutput = True  # allow existing outputs to be overwritten

folders = ["pr", "rsds", "tasmax", "tasmin", "vpd"]

# Specify the mask path
path_mask_Europe = r"\\Code\Extract_Europe_Data\EuropeMask\EuropeMask.shp"
path_mask_Switzerland = r"\\Code\Extract_Europe_Data\SwissMask\SwissMask.shp"

In [None]:
# %% Elaborate data: Europe extraction and projection
start = time.time()

for folder in folders:
    # Specify the input path
    input_path = os.path.join(env.workspace, "Chelsa_V2_Monthly", "Original data", folder)

    # Specify the output path
    output_path = os.path.join(env.workspace, "Chelsa_V2_Monthly", f"Output_Europe_{folder}")
    
    # Create output folder if it does not exist
    os.makedirs(output_path, exist_ok=True)
    
    # Get the list of files in the directory using os.scandir()
    with os.scandir(input_path) as entries:
        # Filter out directories and get only file names
        file_names = [entry.name for entry in entries if entry.is_file()]

    # Process data for each file name and output file path
    for file_name in file_names:
        # start = time.time()
        # start_mask = time.time()
        
        # Construct full paths to the file and output
        file_path = os.path.join(input_path, file_name)
        output_file_path =  os.path.join(output_path, f"Europe_{file_name}")

        # Perform processing using ExtractByMask function
        extract_raster = arcpy.sa.ExtractByMask(in_raster = file_path, 
                                                in_mask_data = path_mask_Europe, 
                                                extraction_area = "INSIDE",
                                                analysis_extent='-63.088253117 -21.390765218 55.838081786 71.118136504 GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",SPHEROID["WGS_1984",6378137.0,298.257223563]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]]'
                                            )
        
        # end_mask = time.time()
        # print('Extract by mask took', end_mask - start_mask, 's to be processed')
        
        # start_project = time.time()
                
        # Project the output raster
        arcpy.management.ProjectRaster(
            in_raster = extract_raster,
            out_raster = output_file_path,
            out_coor_system = 'PROJCS["ETRS_1989_LAEA",GEOGCS["GCS_ETRS_1989",DATUM["D_ETRS_1989",SPHEROID["GRS_1980",6378137.0,298.257222101]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]],PROJECTION["Lambert_Azimuthal_Equal_Area"],PARAMETER["False_Easting",4321000.0],PARAMETER["False_Northing",3210000.0],PARAMETER["Central_Meridian",10.0],PARAMETER["Latitude_Of_Origin",52.0],UNIT["Meter",1.0]]',
            resampling_type = "NEAREST",
            cell_size = "848.515796439947 848.515796439947",
            geographic_transform = "ETRS_1989_To_WGS_1984",
            Registration_Point = None,
            in_coor_system = 'GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",SPHEROID["WGS_1984",6378137.0,298.257223563]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]]',
            vertical = "NO_VERTICAL"
        )
        
        # end_project = time.time()
        # print('Projecting took', end_project - start_project, 's to be processed')
        
        # end = time.time()
        # print('The file took', end - start, 's to be processed')

        # Print progress or other relevant information
        print(f"Processed: {file_path} -> {output_file_path}\n")

end = time.time()
print('The files took', end - start, 's to be processed')

In [None]:
# %% Elaborate data: Switzerland extraction
start = time.time()

for folder in folders:
    # Specify the input path
    input_path = os.path.join(env.workspace, "Chelsa_V2_Monthly", f"Output_Europe_{folder}")

    # Specify the output path
    output_path = os.path.join(env.workspace, "Chelsa_V2_Monthly", f"Output_Switzerland_{folder}")
    
    # Create output folder if it does not exist
    os.makedirs(output_path, exist_ok=True)
    
    # Get the list of files in the directory using os.scandir()
    with os.scandir(input_path) as entries:
        # Filter out directories and get only file names
        file_names = [entry.name for entry in entries if entry.is_file()]
        # Filter the list to only include .tif files
        file_names = [f for f in file_names if f.endswith('.tif')]

    # Process data for each file name and output file path
    for file_name in file_names:
        #start = time.time()
        #start_mask = time.time()
        
        # Construct full paths to the file and output
        file_path = os.path.join(input_path, file_name)
        file_name = file_name.replace("Europe", "Switzerland")
        output_file_path =  os.path.join(output_path, file_name)

        # Perform processing using ExtractByMask function
        extract_raster = arcpy.sa.ExtractByMask(in_raster=file_path,
                                                in_mask_data=path_mask_Switzerland,
                                                extraction_area="INSIDE",
                                                analysis_extent='5.95606700000008 45.81836186 10.4920640000001 47.807378701 GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",SPHEROID["WGS_1984",6378137.0,298.257223563]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]]'
                                                )
        extract_raster.save(output_file_path)
        
        #end_mask = time.time()
        #print('Extract by mask took', end_mask - start_mask, 's to be processed')

        #end = time.time()
        #print('The file took', end - start, 's to be processed')

        # Print progress or other relevant information
        print(f"Processed: {file_path} -> {output_file_path}\n")

end = time.time()
print('The files took', end - start, 's to be processed')
    