# Cross Validation for IDW & RK Interpolation 
## Task 2 (continuous & discrete) cross-year for the four seasons

This document includes Python codes that conduct cross validation (CV) for Inverse Distance Weighting (IDW) Interpolation and RK on water quality parameters, including 6 water quality parameters in arcpy environment:
- Dissolved oxygen (DO_mgl)
- Salinity (Sal_ppt)
- Turbidity (Turb_ntu)
- Temperature (T_c)
- Secchi (Secc_m)
- Total Nitrogen (TN_mgl) 

The analysis is conducted in the separate water bodies:
- Guana Tolomato Matanzas (GTM)
- Estero Bay (EB)
- Charlotte Harbor (CH)
- Biscayne Bay (BB)
- Big Bend Seagrasses (BBS)

**Tasks:**  

**Calculate the RMSE and Mean Error (ME) for IDW and RK results using both continuous and discrete data across-year for four seasons**


<br>
<div style="text-align: left;">
    <img src="../misc/CrossYear.png" style="display: block; margin-left: 0; margin-right: auto; width: 600px;"/>
</div>


**Contents:**
* [1. Data Preprocess](#reg_preprocessing)
    * [1.1 Load csv files](#reg_subset)
    * [1.2 Subsetting data](#reg_preview)
    * [1.3 Filter the data](#reg_studied)
    * [1.4 Calculating average values](#reg_average)
    * [1.5 Convert coordinate system](#reg_coordinate)
* [2. Prepare for batch interpolation](#reg_batch)
    * [2.1 Preset abbreviation](#reg_preset)
    * [2.2 Define the barrier files](#reg_barrier)
    * [2.3 Define waterbody boundary](#reg_boundary)
    * [2.4 Load the table of study periods,  parameters, and seasons](#reg_study)
    * [2.5 Define output folders](#reg_output)
    * [2.6 Fill NaN RowID with unique ID](#reg_id)
* [3. Create Shapefiles](#reg_create_shp)
* [4. Cross Validation for IDW](#reg_cv_idw)
* [5. RK Interpolation](#reg_rk)

## 1. Loading packages

In [1]:
import pandas as pd
import numpy as np
import arcpy
from arcpy.sa import *
import os
import math
import csv
import random

import importlib
import sys
# path = r'C:/Users/cong1/WQ/IDW/git/misc'
path = r'E:\Projects\SEACAR_WQ_2024\git\misc'

sys.path.insert(0, path)
import idw_rk
importlib.reload(idw_rk)

import pyproj

# define scratch folder to avoid overwritting from parallel threats
arcpy.env.scratchWorkspace = r"E:\Projects\SEACAR_WQ_2024\scratch/IDW_crossyear"

arcpy.env.overwriteOutput = True


## 1. Data Preprocessing <a class="anchor" id="reg_preprocessing"></a>
### 1.1 Load csv files

In [2]:
gis_path = r'E:/Projects/SEACAR_WQ_2024/GIS_Data/'

dfDis = pd.read_csv(gis_path + 'OEAT_Discrete_WQ-2024-May-06.csv', low_memory=False)
dfCon = pd.read_csv(gis_path + 'OEAT_Continuous_WQ-2024-Feb-21.csv', low_memory=False)

dfAll = pd.concat([dfDis, dfCon], ignore_index=True)

## 1.2 Subsetting Data <a class="anchor" id="reg_subset"></a>
### Selecting data from 8 am to 18 pm (daytime)

In [3]:
# Convert string to datetime
dfCon['SampleDate'] = pd.to_datetime(dfCon['SampleDate'], format='%Y-%m-%d %H:%M:%S.%f')
dfDis['SampleDate'] = pd.to_datetime(dfDis['SampleDate'], format='%Y-%m-%d %H:%M:%S.%f')

# Include date from 8:00 am to 18:00 pm
start_time = '08:00'
end_time = '18:00'

dfCon = dfCon[dfCon['SampleDate'].dt.time.between(pd.to_datetime(start_time).time(), pd.to_datetime(end_time).time())]

dfAll = pd.concat([dfDis, dfCon], ignore_index=True)

dfAll.head()

Unnamed: 0,RowID,ProgramID,ParameterName,ParameterUnits,ProgramLocationID,ActivityType,SampleDate,Year,Month,RelativeDepth,ResultValue,Latitude_DD,Longitude_DD,ManagedAreaName,AreaID,SEACAR_QAQCFlagCode,WaterBody,WbodyAcronym,Season
0,1,69,Secchi Depth,m,CKM2017100405,Field,2017-10-06,2017,10,,0.3,29.3221,-83.129866,Big Bend Seagrasses Aquatic Preserve,5,6Q,Big Bend Seagrasses,BBS,Fall
1,2,69,Secchi Depth,m,CKM2017080401,Field,2017-08-08,2017,8,Surface,0.6,29.145966,-83.07225,Big Bend Seagrasses Aquatic Preserve,5,6Q/9Q,Big Bend Seagrasses,BBS,Summer
2,3,69,Secchi Depth,m,CKM2017060703,Field,2017-06-19,2017,6,Surface,0.4,29.294516,-83.155316,Big Bend Seagrasses Aquatic Preserve,5,9Q/6Q,Big Bend Seagrasses,BBS,Summer
3,4,69,Secchi Depth,m,CKM2017060202,Field,2017-06-06,2017,6,Surface,0.4,29.1404,-83.01705,Big Bend Seagrasses Aquatic Preserve,5,6Q/9Q,Big Bend Seagrasses,BBS,Spring
4,5,69,Secchi Depth,m,CKM2017110804,Field,2017-11-14,2017,11,Surface,0.4,29.269566,-83.107283,Big Bend Seagrasses Aquatic Preserve,5,9Q/6Q,Big Bend Seagrasses,BBS,Fall


### 1.3 Filter the data<a class="anchor" id="reg_studied"></a>

In [4]:
# Load the table of cross-year seasons definitions
cross_year = pd.read_csv(gis_path + 'season_def/CrossYear.csv', low_memory=False)
cross_year

Unnamed: 0,WaterBody,Season,Year1,Year2,Year3
0,Charlotte Harbor,Spring,2017,2018,
1,Charlotte Harbor,Summer,2016,2017,
2,Charlotte Harbor,Fall,2016,2017,
3,Charlotte Harbor,Winter,2016,2017,2018.0
4,Big Bend Seagrasses,Spring,2021,2022,
5,Big Bend Seagrasses,Summer,2020,2021,
6,Big Bend Seagrasses,Fall,2020,2021,
7,Big Bend Seagrasses,Winter,2020,2021,2022.0
8,Estero Bay,Spring,2017,2018,
9,Estero Bay,Summer,2016,2017,


In [5]:
filtered_dfAllTime = idw_rk.filter_data_crossyear(cross_year, dfAll)
filtered_dfAllTime.head()

Unnamed: 0,RowID,ProgramID,ParameterName,ParameterUnits,ProgramLocationID,ActivityType,SampleDate,Year,Month,RelativeDepth,ResultValue,Latitude_DD,Longitude_DD,ManagedAreaName,AreaID,SEACAR_QAQCFlagCode,WaterBody,WbodyAcronym,Season
0,663,69,Secchi Depth,m,CHM2017050903,Field,2017-05-08,2017,5,Surface,0.7,26.711183,-82.134966,Gasparilla Sound-Charlotte Harbor Aquatic Pres...,18,9Q/6Q,Charlotte Harbor,CH,Spring
1,692,69,Secchi Depth,m,CHM2017041307,Field,2017-04-13,2017,4,Surface,2.4,26.700716,-82.243783,Pine Island Sound Aquatic Preserve,34,6Q/9Q,Charlotte Harbor,CH,Spring
2,724,69,Secchi Depth,m,CHM2017031502,Field,2017-03-20,2017,3,Surface,0.8,26.614233,-82.167866,Pine Island Sound Aquatic Preserve,34,6Q/9Q,Charlotte Harbor,CH,Spring
3,756,69,Secchi Depth,m,CHM2017040211,Field,2017-04-03,2017,4,Surface,0.5,26.969566,-82.11525,Gasparilla Sound-Charlotte Harbor Aquatic Pres...,18,6Q/9Q,Charlotte Harbor,CH,Spring
4,768,69,Secchi Depth,m,CHM2017040407,Field,2017-04-17,2017,4,Surface,1.0,26.904766,-82.1797,Gasparilla Sound-Charlotte Harbor Aquatic Pres...,18,6Q/9Q,Charlotte Harbor,CH,Spring


In [6]:
# Check the filtered results
CH_Winter = filtered_dfAllTime[(filtered_dfAllTime['WaterBody'] == 'Charlotte Harbor') & (filtered_dfAllTime['Season'] == 'Winter')]['Year'].unique()
CH_Winter

array([2016, 2017, 2018], dtype=int64)

In [7]:
GTM_Fall = filtered_dfAllTime[(filtered_dfAllTime['WaterBody'] == 'Guana Tolomato Matanzas') & (filtered_dfAllTime['Season'] == 'Fall')]['Year'].unique()
GTM_Fall

array([2015, 2016], dtype=int64)

### 1.4 Calculating average values at unique observation points<a class="anchor" id="reg_average"></a>

In [8]:
dfAll_Mean = filtered_dfAllTime.groupby(['WaterBody','ParameterName','ParameterUnits', 'Season','Latitude_DD','Longitude_DD','WbodyAcronym'])["ResultValue"].agg("mean").reset_index()
dfAll = dfAll_Mean

### 1.5 Convert coordinate system to EPSG: 3086<a class="anchor" id="reg_coordinate"></a>

In [9]:
# Define the EPSG codes for source (EPSG:4326) and target (EPSG:3086) coordinate systems
source_epsg = 'EPSG:4326'
target_epsg = 'EPSG:3086'

# Create a PyProj Transformer for the conversion
transformer = pyproj.Transformer.from_crs(source_epsg, target_epsg, always_xy=True)

# Define a function to apply the transformation to each row of the DataFrame
def transform_coordinates(row):
    x, y = transformer.transform(row['Longitude_DD'], row['Latitude_DD'])
    return pd.Series({'x': x, 'y': y})

# Apply the transformation function to the DataFrame and create new columns for the converted coordinates
dfAll[['x', 'y']] = dfAll.apply(transform_coordinates, axis=1)

#### Save aggregated data to csv file

In [10]:
dfAll.to_csv(gis_path + 'OEAT_CrossYear_All_WQ-2024-May-2.csv', index=False)

## 2. Prepare for batch interpolation<a class="anchor" id="reg_batch"></a>
### 2.1 Preset abbreviation for waterbody and parameter name<a class="anchor" id="reg_preset"></a>

In [11]:
area_shortnames = {
    'Guana Tolomato Matanzas': 'GTM',
    'Estero Bay': 'EB',
    'Charlotte Harbor': 'CH',
    'Biscayne Bay': 'BB',
    'Big Bend Seagrasses':'BBS'
}

param_shortnames = {
    'Salinity': 'Sal_ppt',
    'Total Nitrogen': 'TN_mgl',
    'Dissolved Oxygen': 'DO_mgl',
    'Turbidity':'Turb_ntu',
    'Secchi Depth':'Secc_m',
    'Water Temperature':'T_c'
}

covariates_dict = {
    "GTM":"LDI",
    "EB":"bathymetry+LDI+popden",
    "CH":"bathymetry+LDI+popden+water_flow_wet",
    "BB":"bathymetry+LDI+popden",
    "BBS":"bathymetry+LDI"
}

### 2.2 Define the barrier files<a class="anchor" id="reg_barrier"></a>

In [12]:
barrier_folder = os.path.join(gis_path, 'Barriers')
barrier_folder

barriers = []
for file in os.listdir(barrier_folder):
    if file.endswith(".shp"):
        barriers.append(os.path.join(barrier_folder, file))

for barrier in barriers:
    print(barrier)

E:/Projects/SEACAR_WQ_2024/GIS_Data/Barriers\BBS_Barriers.shp
E:/Projects/SEACAR_WQ_2024/GIS_Data/Barriers\BB_Barriers.shp
E:/Projects/SEACAR_WQ_2024/GIS_Data/Barriers\CH_Barriers.shp
E:/Projects/SEACAR_WQ_2024/GIS_Data/Barriers\EB_Barriers.shp
E:/Projects/SEACAR_WQ_2024/GIS_Data/Barriers\GTM_Barriers.shp


### 2.3 Define waterbody boundary for spatial extent and masking<a class="anchor" id="reg_boundary"></a>

In [13]:
waterbody_extent = os.path.join(gis_path, 'OEAT_Waterbody_Boundaries', 'OEAT_Waterbody_Boundary.shp')

unique_waterbodies = []
with arcpy.da.SearchCursor(waterbody_extent, ['WaterbodyA']) as cursor:
    for row in cursor:
        unique_waterbodies.append(row[0])

print("Unique Waterbodies:", unique_waterbodies)

Unique Waterbodies: ['BBS', 'BB', 'CH', 'EB', 'GTM']


### 2.4 Load the table of study periods,  parameters, and seasons<a class="anchor" id="reg_study"></a>

In [14]:
crossyear_all = pd.read_csv(gis_path + 'season_def/CrossYear_all.csv', low_memory=False)
crossyear_all

Unnamed: 0,WaterBody,Season,Year1,Year2,Year3,Parameter,Filename,NumDataPoints,RMSE,ME
0,Charlotte Harbor,Spring,2017,2018,,Total Nitrogen,,,,
1,Charlotte Harbor,Summer,2016,2017,,Total Nitrogen,,,,
2,Charlotte Harbor,Fall,2016,2017,,Total Nitrogen,,,,
3,Charlotte Harbor,Winter,2016,2017,2018.0,Total Nitrogen,,,,
4,Charlotte Harbor,Spring,2017,2018,,Salinity,,,,
...,...,...,...,...,...,...,...,...,...,...
115,Biscayne Bay,Winter,2021,2022,2023.0,Secchi Depth,,,,
116,Biscayne Bay,Spring,2022,2023,,Water Temperature,,,,
117,Biscayne Bay,Summer,2021,2022,,Water Temperature,,,,
118,Biscayne Bay,Fall,2021,2022,,Water Temperature,,,,


### 2.5 Define output folders<a class="anchor" id="reg_output"></a>

In [15]:
shpAll_folder = gis_path + r"shapefiles/CrossYear_shapefiles_All/"
idwAll_folder = gis_path + r"raster_output/CrossYear_IDW_All/"

# Preview dataset
dfAll

Unnamed: 0,WaterBody,ParameterName,ParameterUnits,Season,Latitude_DD,Longitude_DD,WbodyAcronym,ResultValue,x,y
0,Big Bend Seagrasses,Dissolved Oxygen,mg/L,Fall,29.008300,-82.825250,BBS,6.873333,514236.421562,556316.395208
1,Big Bend Seagrasses,Dissolved Oxygen,mg/L,Fall,29.125000,-82.841666,BBS,6.976000,512518.355037,569259.880703
2,Big Bend Seagrasses,Dissolved Oxygen,mg/L,Fall,29.149500,-83.079500,BBS,7.225000,489395.665621,571785.712589
3,Big Bend Seagrasses,Dissolved Oxygen,mg/L,Fall,29.161167,-83.047333,BBS,7.110000,492509.553281,573104.900093
4,Big Bend Seagrasses,Dissolved Oxygen,mg/L,Fall,29.162167,-82.810500,BBS,6.350000,515505.865155,573415.656435
...,...,...,...,...,...,...,...,...,...,...
17399,Guana Tolomato Matanzas,Water Temperature,Degrees C,Winter,30.062183,-81.369175,GTM,19.712500,653318.271561,675486.194857
17400,Guana Tolomato Matanzas,Water Temperature,Degrees C,Winter,30.083020,-81.342860,GTM,12.800000,655801.884635,677852.911263
17401,Guana Tolomato Matanzas,Water Temperature,Degrees C,Winter,30.116880,-81.344440,GTM,17.000000,655568.871648,681607.260977
17402,Guana Tolomato Matanzas,Water Temperature,Degrees C,Winter,30.160736,-81.360278,GTM,12.300000,653940.692803,686441.390786


### 2.6 Fill NaN RowID with unique ID, IDW function needs unique ID <a class="anchor" id="reg_id"></a>

In [16]:
idw_rk.fill_nan_rowids(dfAll, 'RowID')

# Keep RowID as integer
dfAll['RowID'] = dfAll['RowID'].astype(int)

## 3. Create Shapefiles <a class="anchor" id="reg_create_shp"></a>

In [17]:
# Merge interested with latitude and longitude columns
crossyear_all_coord = idw_rk.merge_with_lat_long_new(crossyear_all, dfAll, "Season")
crossyear_all_coord

Unnamed: 0,WaterBody,Season,Year1,Year2,Year3,Parameter,Filename,NumDataPoints,RMSE,ME,x,y,RowID,ResultValue
0,Charlotte Harbor,Spring,2017,2018,,Total Nitrogen,,,,,591267.325151,272548.455825,11341,0.790000
1,Charlotte Harbor,Spring,2017,2018,,Total Nitrogen,,,,,591484.455148,272735.013444,11342,0.870000
2,Charlotte Harbor,Spring,2017,2018,,Total Nitrogen,,,,,589465.414295,275500.131085,11343,0.780000
3,Charlotte Harbor,Spring,2017,2018,,Total Nitrogen,,,,,589338.150366,275567.884388,11344,0.780000
4,Charlotte Harbor,Spring,2017,2018,,Total Nitrogen,,,,,591931.122901,275878.152266,11345,0.800000
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
17402,Biscayne Bay,Winter,2021,2022,2023.0,Water Temperature,,,,,784697.277475,213473.642774,3035,23.895922
17403,Biscayne Bay,Winter,2021,2022,2023.0,Water Temperature,,,,,785871.933987,216879.419467,3036,24.480743
17404,Biscayne Bay,Winter,2021,2022,2023.0,Water Temperature,,,,,787466.319703,218700.069505,3037,24.634917
17405,Biscayne Bay,Winter,2021,2022,2023.0,Water Temperature,,,,,784434.115885,221871.575426,3038,23.509827


In [18]:
idw_rk.create_shp_season_new(crossyear_all_coord, "Season", shpAll_folder, start_year_included=False)

Number of data rows for BBS, DO_mgl, None, Fall: 42
Shapefile for BBS, DO_mgl for season Fall has been saved as SHP_BBS_DO_mgl_Fall.shp
Number of data rows for BBS, Sal_ppt, None, Fall: 34
Shapefile for BBS, Sal_ppt for season Fall has been saved as SHP_BBS_Sal_ppt_Fall.shp
Number of data rows for BBS, Secc_m, None, Fall: 33
Shapefile for BBS, Secc_m for season Fall has been saved as SHP_BBS_Secc_m_Fall.shp
Number of data rows for BBS, TN_mgl, None, Fall: 37
Shapefile for BBS, TN_mgl for season Fall has been saved as SHP_BBS_TN_mgl_Fall.shp
Number of data rows for BBS, Turb_ntu, None, Fall: 39
Shapefile for BBS, Turb_ntu for season Fall has been saved as SHP_BBS_Turb_ntu_Fall.shp
Number of data rows for BBS, T_c, None, Fall: 42
Shapefile for BBS, T_c for season Fall has been saved as SHP_BBS_T_c_Fall.shp
Number of data rows for BBS, DO_mgl, None, Spring: 55
Shapefile for BBS, DO_mgl for season Spring has been saved as SHP_BBS_DO_mgl_Spring.shp
Number of data rows for BBS, Sal_ppt, None

Shapefile for CH, DO_mgl for season Summer has been saved as SHP_CH_DO_mgl_Summer.shp
Number of data rows for CH, Sal_ppt, None, Summer: 837
Shapefile for CH, Sal_ppt for season Summer has been saved as SHP_CH_Sal_ppt_Summer.shp
Number of data rows for CH, Secc_m, None, Summer: 726
Shapefile for CH, Secc_m for season Summer has been saved as SHP_CH_Secc_m_Summer.shp
Number of data rows for CH, TN_mgl, None, Summer: 100
Shapefile for CH, TN_mgl for season Summer has been saved as SHP_CH_TN_mgl_Summer.shp
Number of data rows for CH, Turb_ntu, None, Summer: 88
Shapefile for CH, Turb_ntu for season Summer has been saved as SHP_CH_Turb_ntu_Summer.shp
Number of data rows for CH, T_c, None, Summer: 901
Shapefile for CH, T_c for season Summer has been saved as SHP_CH_T_c_Summer.shp
Number of data rows for CH, DO_mgl, None, Winter: 766
Shapefile for CH, DO_mgl for season Winter has been saved as SHP_CH_DO_mgl_Winter.shp
Number of data rows for CH, Sal_ppt, None, Winter: 743
Shapefile for CH, Sa

Shapefile for GTM, T_c for season Winter has been saved as SHP_GTM_T_c_Winter.shp


## 4. Cross Validation for IDW <a class="anchor" id="reg_cv_idw"></a>

In [None]:
# Empty the shapefile folder
# idw_rk.delete_all_files(idwAll_folder)

In [None]:
# Select a section of table to process
seasons_slct = crossyear_all.iloc[:]
seasons_slct.head()

In [None]:
# If the number of data points is less than 3，skipping calculate IDW
idw_rk.idw_interpolation_new2(seasons_slct, shpAll_folder, idwAll_folder, waterbody_extent, barrier_folder, "Season", include_start_year=False, percentage=10)

In [None]:
# If the number of data points is less than 3，skipping calculate IDW
# idw_rk.idw_interpolation_new(seasons_slct, shpAll_folder, idwAll_folder, waterbody_extent, barrier_folder, "Season", include_start_year=False)

## 5. RK Interpolation<a class="anchor" id="reg_rk"></a>

### Define output folder

In [18]:
out_raster_folder = gis_path + r"raster_output/CrossYear_RK_All/"
out_ga_folder     = gis_path + r"ga_output_rk/CrossYear_RK_All/"
diagnostic_folder = gis_path + r"diagnostic_rk/CrossYear_RK_All/"
std_error_folder  = gis_path + r"std_error_pred/CrossYear_RK_All/"

# Clean existing files in folders
# idw_rk.delete_all_files(out_raster_folder)
# idw_rk.delete_all_files(out_ga_folder)
# idw_rk.delete_all_files(diagnostic_folder)
# idw_rk.delete_all_files(std_error_folder)

In [22]:
rk_crossyear_all = crossyear_all.copy()
rk_crossyear_all['covariates'] = rk_crossyear_all['WaterBody'].apply(lambda x: covariates_dict.get(x, 'default_covariate'))
# rk_crossyear_all.drop('Select_NumDataPoints', axis=1, inplace=True)

rk_csv = gis_path + "rk_crs.csv" 
rk_crossyear_all.to_csv(rk_csv, index=False, encoding='utf-8-sig') 
rk_crossyear_all.head()

Unnamed: 0,WaterBody,Season,Year1,Year2,Year3,Parameter,Filename,NumDataPoints,RMSE,ME,covariates
0,Charlotte Harbor,Spring,2017,2018,,Total Nitrogen,,,,,default_covariate
1,Charlotte Harbor,Summer,2016,2017,,Total Nitrogen,,,,,default_covariate
2,Charlotte Harbor,Fall,2016,2017,,Total Nitrogen,,,,,default_covariate
3,Charlotte Harbor,Winter,2016,2017,2018.0,Total Nitrogen,,,,,default_covariate
4,Charlotte Harbor,Spring,2017,2018,,Salinity,,,,,default_covariate


### Select rows to process

In [20]:
crossyear_slct = rk_crossyear_all.iloc[:]
crossyear_slct.drop(crossyear_slct[crossyear_slct['WaterBody'] != 'Big Bend Seagrasses'].index, inplace=True)
crossyear_slct.head()

Unnamed: 0,WaterBody,Season,Year1,Year2,Year3,Parameter,Filename,NumDataPoints,RMSE,ME,covariates
24,Big Bend Seagrasses,Spring,2021,2022,,Total Nitrogen,,,,,default_covariate
25,Big Bend Seagrasses,Summer,2020,2021,,Total Nitrogen,,,,,default_covariate
26,Big Bend Seagrasses,Fall,2020,2021,,Total Nitrogen,,,,,default_covariate
27,Big Bend Seagrasses,Winter,2020,2021,2022.0,Total Nitrogen,,,,,default_covariate
28,Big Bend Seagrasses,Spring,2021,2022,,Salinity,,,,,default_covariate


In [21]:
importlib.reload(idw_rk)

with open(gis_path + "rk_crs.csv", 'w', newline='') as csvfile:
    csv_writer = csv.writer(csvfile)
    # Determine if year should be included in the output based on a condition
    start_year_included = False

    # Write the header line based on whether the year is included
    cols = list(crossyear_slct.columns)
    csv_writer.writerow(cols)
    
    for i in crossyear_slct.index:
        s_time = time.time()
        process, rmse, me, count, file_loc = idw_rk.rk_interpolation_new(
            method="rk",
            radius=50000,
            folder_path=gis_path,
            waterbody=area_shortnames[crossyear_slct.loc[i]["WaterBody"]],
            parameter=param_shortnames[crossyear_slct.loc[i]["Parameter"]],
            year=crossyear_slct.loc[i]["Start Year"] if start_year_included else None,
            season=crossyear_slct.loc[i]['Season'],
            covariates=covariates_dict[area_shortnames[crossyear_slct.loc[i]["WaterBody"]]],
            out_raster_folder=out_raster_folder,
            out_ga_folder=out_ga_folder,
            std_error_folder=std_error_folder,
            diagnostic_folder=diagnostic_folder,
            shapefile_folder_name="shapefiles/CrossYear_shapefiles_All",
            start_year_included=start_year_included  # Pass the variable to the function
        )
        e_time = time.time()

        # Write data row, conditionally include year based on the setting
        data_row = [
            crossyear_slct.loc[i]["WaterBody"], 
            crossyear_slct.loc[i]['Season'],
            crossyear_slct.loc[i]["Parameter"],
            file_loc, count, rmse, me,
            covariates_dict[area_shortnames[rk_crossyear_all.loc[i]["WaterBody"]]]
        ]

        print(f"{int(e_time - s_time)} seconds elapsed for processing {count} points in {i}th row: RMSE: {rmse}, ME: {me}, file exported to {file_loc}")
        csv_writer.writerow(data_row)
        if i % 10 == 0:
            csvfile.flush()  # Flush the csv file every 10 rows.

Processing file: SHP_BBS_TN_mgl_Spring.shp
--- Time lapse: 2836.783664226532 seconds ---
2837 seconds elapsed for processing 45 points in 24th row: RMSE: 0.163567188855, ME: 0.00223321449112, file exported to E:/Projects/SEACAR_WQ_2024/GIS_Data/raster_output/CrossYear_RK_All/BBS_TN_mgl_Spring_RK.tif
Processing file: SHP_BBS_TN_mgl_Summer.shp
--- Time lapse: 2309.689077615738 seconds ---
2309 seconds elapsed for processing 39 points in 25th row: RMSE: 0.277711893993, ME: -0.00245978525315, file exported to E:/Projects/SEACAR_WQ_2024/GIS_Data/raster_output/CrossYear_RK_All/BBS_TN_mgl_Summer_RK.tif
Processing file: SHP_BBS_TN_mgl_Fall.shp
--- Time lapse: 2601.6203858852386 seconds ---
2601 seconds elapsed for processing 37 points in 26th row: RMSE: 0.157620849573, ME: -0.0180831910025, file exported to E:/Projects/SEACAR_WQ_2024/GIS_Data/raster_output/CrossYear_RK_All/BBS_TN_mgl_Fall_RK.tif
Processing file: SHP_BBS_TN_mgl_Winter.shp
--- Time lapse: 2870.2168202400208 seconds ---
2870 seco