# Pre-Process Global Rainfall Data

This notebook pre-processes global rainfall data for specific areas of interest (AOIs). The rainfall data is first cropped to the AOI, and then a yearly sum and a daily average are computed from the cropped files.

The source data is available globally [here](https://data.chc.ucsb.edu/products/CHIRPS-2.0/global_monthly/tifs/) as monthly rainfall totals in mm. It is also available [here](https://dataexchange.gatesfoundation.org/dataset/chirps_rainfall_dataset_2022/resource/c9a40f6d-09d3-4b00-894b-8864573b82f8) for 2022.

Set the configurations below to define the location of the source GeoTIFF files, the output folder for the cropped files, and the output folder for the sum and average files. The AOI must also be configured; it defines the region to which the data is cropped before the sum and average are computed.

## UTM Zones for Reference

```
EPSG:32628  28N  18°W to 12°W
EPSG:32629  29N  12°W to 6°W
EPSG:32630  30N  6°W  to 0°
EPSG:32631  31N  0°   to 6°E
EPSG:32632  32N  6°E  to 12°E
EPSG:32633  33N  12°E to 18°E
EPSG:32634  34N  18°E to 24°E
EPSG:32635  35N  24°E to 30°E
EPSG:32636  36N  30°E to 36°E
EPSG:32637  37N  36°E to 42°E
EPSG:32638  38N  42°E to 48°E
EPSG:32639  39N  48°E to 54°E
EPSG:32640  40N  54°E to 60°E
EPSG:32641  41N  60°E to 66°E
EPSG:32642  42N  66°E to 72°E
EPSG:32643  43N  72°E to 78°E
EPSG:32644  44N  78°E to 84°E
EPSG:32645  45N  84°E to 90°E
```
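
The table above follows a regular pattern: for northern-hemisphere WGS 84 / UTM zones the EPSG code is 32600 plus the zone number, and each zone spans 6° of longitude counted eastward from 180°W. The following is a minimal sketch of that arithmetic. The function names `epsg_from_utm_zone` and `utm_zone_lon_bounds` are hypothetical; they are not the `gist_utils` helpers used later in this notebook, whose implementation is not shown here.

```python
from typing import Tuple


def _parse_zone(utm_zone: str) -> Tuple[int, str]:
    """Split a zone string such as '42N' into (42, 'N') and validate it."""
    zone = int(utm_zone[:-1])
    hemisphere = utm_zone[-1].upper()
    if not 1 <= zone <= 60 or hemisphere not in ("N", "S"):
        raise ValueError(f"Invalid UTM zone: {utm_zone!r}")
    return zone, hemisphere


def epsg_from_utm_zone(utm_zone: str) -> str:
    """Return the WGS 84 / UTM EPSG code for a zone string such as '42N'."""
    zone, hemisphere = _parse_zone(utm_zone)
    # Northern zones are EPSG:326xx, southern zones are EPSG:327xx
    base = 32600 if hemisphere == "N" else 32700
    return f"EPSG:{base + zone}"


def utm_zone_lon_bounds(utm_zone: str) -> Tuple[float, float]:
    """Return (west, east) longitude bounds in degrees; west is negative."""
    zone, _ = _parse_zone(utm_zone)
    west = (zone - 1) * 6 - 180
    return float(west), float(west + 6)
```

For example, `epsg_from_utm_zone('42N')` gives `'EPSG:32642'` and `utm_zone_lon_bounds('42N')` gives `(66.0, 72.0)`, matching the 42N row of the table.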


In [1]:
import os
from dataclasses import dataclass

# Import module that contains several convenience functions (e.g., gdal wrappers)
from gist_utils import *

# Adding path to gdal commands for local system
os.environ['PATH'] += ':/Users/billk/miniforge3/envs/py39-pt/bin/' 

## 1 Set Country Code and UTM Zone

The only inputs required in this notebook are the two-letter country code, the UTM zone, and the latitude range (`LAT_NORTH` and `LAT_SOUTH` in `AOIConfig`) that covers the AOI within the specified UTM zone.

In [2]:
country_code = 'PK'
utm_zone     = '42N'
utm_code = get_epsg_code(utm_zone)

In [3]:
@dataclass(frozen=True)
class AOIConfig:
    UTM_CODE:    str
    UTM_BUF_DEG: float = 1.0
    LAT_NORTH:   float = 38.0    # Specify for the specific AOI (country and UTM zone)
    LAT_SOUTH:   float = 23.0    # Specify for the specific AOI (country and UTM zone)
        
@dataclass(frozen=True)
class DatasetConfig:
    COUNTRY_CODE:  str
    UTM_ZONE:      str
    DATA_DIR:      str = './Rainfall/Chirps_2023/'
    OUT_DIR_CROP:  str = './Rainfall/Chirps_2023/{country_code}_{utm_zone}/output_crop'  
    OUT_DIR:       str = './Rainfall/Chirps_2023/{country_code}_{utm_zone}/output' 
    OUT_BASE:      str = 'chirps-v2.0.2023'
    GDAL_INFO:    bool = False

    def get_out_dir_crop(self):
        return self.OUT_DIR_CROP.format(country_code=self.COUNTRY_CODE, utm_zone=self.UTM_ZONE)
    
    def get_out_dir(self):
        return self.OUT_DIR.format(country_code=self.COUNTRY_CODE, utm_zone=self.UTM_ZONE)

# Instantiate data classes
data_config = DatasetConfig(COUNTRY_CODE=country_code, UTM_ZONE=utm_zone)
aoi_config  = AOIConfig(UTM_CODE=utm_code)

case = country_code + '_' + utm_zone 

In [4]:
print(data_config.get_out_dir_crop())
print(data_config.get_out_dir())
print(aoi_config.UTM_CODE)

./Rainfall/Chirps_2023/PK_42N/output_crop
./Rainfall/Chirps_2023/PK_42N/output
EPSG:32642


In [5]:
# Set output filenames
output_sum = data_config.get_out_dir() + "/" + data_config.OUT_BASE + "_" + case + "_sum.tif"
output_avg = data_config.get_out_dir() + "/" + data_config.OUT_BASE + "_" + case + "_avg.tif"

print("output_sum will be saved here: ", output_sum)
print("output_avg will be saved here: ", output_avg)

output_sum will be saved here:  ./Rainfall/Chirps_2023/PK_42N/output/chirps-v2.0.2023_PK_42N_sum.tif
output_avg will be saved here:  ./Rainfall/Chirps_2023/PK_42N/output/chirps-v2.0.2023_PK_42N_avg.tif


In [6]:
# Create output folders if they do not already exist
os.makedirs(data_config.get_out_dir_crop(), exist_ok=True)
os.makedirs(data_config.get_out_dir(), exist_ok=True)

## 2 Define the Cropped Region
The cropped region is defined by the UTM zone specified above and the desired latitude bands that span the region of interest.

In [7]:
# Define the longitude boundaries for the specified UTM zone
utm_west_lon, utm_east_lon = utm_zone_longitude_bounds(aoi_config.UTM_CODE)

# Define AOI to encompass local UTM zone (+/- small buffer). Choose latitude to cover data for region
ul_lat, ul_lon = aoi_config.LAT_NORTH, utm_west_lon - aoi_config.UTM_BUF_DEG
lr_lat, lr_lon = aoi_config.LAT_SOUTH, utm_east_lon + aoi_config.UTM_BUF_DEG

# Print the results
print(f"Upper Left Lat: {ul_lat}")
print(f"Upper Left Lon: {ul_lon}")
print(f"Lower Right Lat: {lr_lat}")
print(f"Lower Right Lon: {lr_lon}")

epsg_code:  EPSG:32642
Upper Left Lat: 38.0
Upper Left Lon: 65.0
Lower Right Lat: 23.0
Lower Right Lon: 73.0


### Confirm Input Files

In [8]:
# Create a list of all files in the directory
files_in_directory = os.listdir(data_config.DATA_DIR)

# Filter the list to include only TIFF files
tiff_files = sorted([file for file in files_in_directory if file.endswith('.tif')])

for file in tiff_files:
    print(file)

chirps-v2.0.2023.01.tif
chirps-v2.0.2023.02.tif
chirps-v2.0.2023.03.tif
chirps-v2.0.2023.04.tif
chirps-v2.0.2023.05.tif
chirps-v2.0.2023.06.tif
chirps-v2.0.2023.07.tif
chirps-v2.0.2023.08.tif
chirps-v2.0.2023.09.tif
chirps-v2.0.2023.10.tif
chirps-v2.0.2023.11.tif
chirps-v2.0.2023.12.tif
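
Because the yearly sum assumes one file per month, it can help to verify that all twelve months are present before cropping. The sketch below parses the year and month from file names that follow the `chirps-v2.0.YYYY.MM.tif` pattern listed above. The helper `missing_months` is hypothetical and not part of `gist_utils`.

```python
import re

# Matches monthly CHIRPS file names such as 'chirps-v2.0.2023.06.tif'
CHIRPS_MONTHLY = re.compile(r"^chirps-v2\.0\.(\d{4})\.(\d{2})\.tif$")


def missing_months(file_names, year):
    """Return the sorted list of month numbers (1-12) with no file for `year`."""
    found = set()
    for name in file_names:
        match = CHIRPS_MONTHLY.match(name)
        if match and int(match.group(1)) == year:
            found.add(int(match.group(2)))
    return sorted(set(range(1, 13)) - found)
```

Calling `missing_months(tiff_files, 2023)` on the listing above would return an empty list; a non-empty result names the months that still need to be downloaded.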


## 3 Crop the Source Files 

In [9]:
# Loop through each TIFF file
for file_name in tiff_files:
    
    input_tif = os.path.join(data_config.DATA_DIR, file_name)
    
    # Construct the output file name based on the input file name
    temp = '_' + case + "_crop.tif"
    intermediate_tif = os.path.join(data_config.get_out_dir_crop(), os.path.splitext(file_name)[0] + temp)
    print(intermediate_tif)
    # Crop the data
    gdal_crop(input_tif, intermediate_tif, ul_lon, ul_lat, lr_lon, lr_lat, False)

./Rainfall/Chirps_2023/PK_42N/output_crop/chirps-v2.0.2023.01_PK_42N_crop.tif
Input file size is 7200, 2000
0...10...20...30...40...50...60...70...80...90...100 - done.

./Rainfall/Chirps_2023/PK_42N/output_crop/chirps-v2.0.2023.02_PK_42N_crop.tif
Input file size is 7200, 2000
0...10...20...30...40...50...60...70...80...90...100 - done.

./Rainfall/Chirps_2023/PK_42N/output_crop/chirps-v2.0.2023.03_PK_42N_crop.tif
Input file size is 7200, 2000
0...10...20...30...40...50...60...70...80...90...100 - done.

./Rainfall/Chirps_2023/PK_42N/output_crop/chirps-v2.0.2023.04_PK_42N_crop.tif
Input file size is 7200, 2000
0...10...20...30...40...50...60...70...80...90...100 - done.

./Rainfall/Chirps_2023/PK_42N/output_crop/chirps-v2.0.2023.05_PK_42N_crop.tif
Input file size is 7200, 2000
0...10...20...30...40...50...60...70...80...90...100 - done.

./Rainfall/Chirps_2023/PK_42N/output_crop/chirps-v2.0.2023.06_PK_42N_crop.tif
Input file size is 7200, 2000
0...10...20...30...40...50...60...70...80.
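
The implementation of `gdal_crop` lives in `gist_utils` and is not shown in this notebook. As an assumption about what such a wrapper could look like, the sketch below builds a `gdal_translate` call with its `-projwin` option, which takes the window as upper-left x, upper-left y, lower-right x, lower-right y in the raster's georeferenced coordinates (assumed here to be longitude and latitude in degrees). The function names `build_crop_command` and `run_crop` are hypothetical.

```python
import subprocess


def build_crop_command(input_tif, output_tif, ul_lon, ul_lat, lr_lon, lr_lat):
    """Build a gdal_translate command that crops to the given lon/lat window."""
    return [
        "gdal_translate",
        # -projwin expects: ulx uly lrx lry
        "-projwin", str(ul_lon), str(ul_lat), str(lr_lon), str(lr_lat),
        input_tif,
        output_tif,
    ]


def run_crop(input_tif, output_tif, ul_lon, ul_lat, lr_lon, lr_lat):
    """Run the crop; requires the GDAL command-line tools on PATH."""
    cmd = build_crop_command(input_tif, output_tif, ul_lon, ul_lat, lr_lon, lr_lat)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
```

Keeping the command construction separate from the `subprocess` call makes it possible to inspect or log the exact command before running it.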

### Confirm the Cropped Files

In [10]:
# List all the cropped TIFF files
crop_dir = data_config.get_out_dir_crop()
cropped_files = [
    os.path.join(crop_dir, file)
    for file in os.listdir(crop_dir)
    if file.endswith(('.tif', '.tiff'))
]

for file in cropped_files:
    print(file)

./Rainfall/Chirps_2023/PK_42N/output_crop/chirps-v2.0.2023.03_PK_42N_crop.tif
./Rainfall/Chirps_2023/PK_42N/output_crop/chirps-v2.0.2023.12_PK_42N_crop.tif
./Rainfall/Chirps_2023/PK_42N/output_crop/chirps-v2.0.2023.04_PK_42N_crop.tif
./Rainfall/Chirps_2023/PK_42N/output_crop/chirps-v2.0.2023.05_PK_42N_crop.tif
./Rainfall/Chirps_2023/PK_42N/output_crop/chirps-v2.0.2023.02_PK_42N_crop.tif
./Rainfall/Chirps_2023/PK_42N/output_crop/chirps-v2.0.2023.11_PK_42N_crop.tif
./Rainfall/Chirps_2023/PK_42N/output_crop/chirps-v2.0.2023.08_PK_42N_crop.tif
./Rainfall/Chirps_2023/PK_42N/output_crop/chirps-v2.0.2023.07_PK_42N_crop.tif
./Rainfall/Chirps_2023/PK_42N/output_crop/chirps-v2.0.2023.01_PK_42N_crop.tif
./Rainfall/Chirps_2023/PK_42N/output_crop/chirps-v2.0.2023.10_PK_42N_crop.tif
./Rainfall/Chirps_2023/PK_42N/output_crop/chirps-v2.0.2023.09_PK_42N_crop.tif
./Rainfall/Chirps_2023/PK_42N/output_crop/chirps-v2.0.2023.06_PK_42N_crop.tif


## 4 Compute the Yearly Sum

In [11]:
# Compute the yearly sum
sum_rasters(cropped_files, output_sum)

Sum of rasters successfully saved to: ./Rainfall/Chirps_2023/PK_42N/output/chirps-v2.0.2023_PK_42N_sum.tif
Output from command:
0...10...20...30...40...50...60...70...80...90...100 - done.
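
How `sum_rasters` computes the total is not shown in this notebook. One possible approach, given purely as an assumption, is to call `gdal_calc.py`, which assigns each input raster a letter (`-A`, `-B`, ...) and evaluates an expression over them. The sketch below builds such a command for up to 26 inputs; the function name `build_sum_command` is hypothetical.

```python
import string


def build_sum_command(input_files, output_file):
    """Build a gdal_calc.py command that adds all input rasters pixel by pixel."""
    if not 1 <= len(input_files) <= 26:
        raise ValueError("gdal_calc.py letters A-Z allow 1 to 26 inputs")
    letters = string.ascii_uppercase[:len(input_files)]
    cmd = ["gdal_calc.py"]
    for letter, path in zip(letters, input_files):
        cmd += [f"-{letter}", path]  # e.g. -A jan.tif -B feb.tif
    expression = "+".join(letters)   # e.g. A+B+C
    cmd += [f"--outfile={output_file}", f"--calc={expression}"]
    return cmd
```

Since addition is commutative, the order of the input files does not affect the result, so the unsorted listing of cropped files above is not a problem for the sum.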



## 5 Compute the Daily Average

In [12]:
# Set the input file to the sum computed above
input_file = output_sum

# Compute the daily average
average_raster(input_file, output_avg, divisor=365)

Average operation successfully saved to: ./Rainfall/Chirps_2023/PK_42N/output/chirps-v2.0.2023_PK_42N_avg.tif
Output from command:
0...10...20...30...40...50...60...70...80...90...100 - done.
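
The divisor of 365 matches 2023, which is not a leap year. When reusing the notebook for another year, the divisor could be derived from the year instead of being hard-coded. The sketch below shows the arithmetic for a single pixel value; `days_in_year` and `daily_average` are hypothetical helpers, not part of `gist_utils`.

```python
import calendar


def days_in_year(year):
    """Return 366 for leap years and 365 otherwise."""
    return 366 if calendar.isleap(year) else 365


def daily_average(yearly_total_mm, year):
    """Convert a yearly rainfall total (mm) into a daily average (mm/day)."""
    return yearly_total_mm / days_in_year(year)
```

The result of `days_in_year(year)` could then be passed as the `divisor` argument of `average_raster`.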

